A Full RNS Variant Of FV Like Somewhat Homomorphic .

Transcription

A Full RNS Variant of FV like SomewhatHomomorphic Encryption SchemesJean-Claude Bajard1 , Julien Eynard2 , M. Anwar Hasan2 , and Vincent Zucca11Sorbonne Universités, UPMC, CNRS, LIP6, Paris, France.{jean-claude.bajard, vincent.zucca}@lip6.fr2Department of Electrical and Computer Engineering, University of Waterloo.{jeynard, ahasan}@uwaterloo.caAbstract. Since Gentry’s breakthrough work in 2009, homomorphiccryptography has received a widespread attention. Implementation of afully homomorphic cryptographic scheme is however still highly expensive. Somewhat Homomorphic Encryption (SHE) schemes, on the otherhand, allow only a limited number of arithmetical operations in the encrypted domain, but are more practical. Many SHE schemes have beenproposed, among which the most competitive ones rely on Ring LearningWith Errors (RLWE) and operations occur on high-degree polynomialswith large coefficients. This work focuses in particular on the ChineseRemainder Theorem representation (a.k.a. Residue Number Systems)applied to the large coefficients. In SHE schemes like that of Fan andVercauteren (FV), such a representation remains hardly compatible withprocedures involving coefficient-wise division and rounding required indecryption and homomorphic multiplication. This paper suggests a wayto entirely eliminate the need for multi-precision arithmetic, and presentstechniques to enable a full RNS implementation of FV-like schemes. Fordimensions between 211 and 215 , we report speed-ups from 5 to 20 for decryption, and from 2 to 4 for multiplication.Keywords: Lattice-based Cryptography, Homomorphic Encryption, FV,Residue Number Systems, Software Implementation1IntroductionCryptographers’ deep interests in lattices are for multiple reasons. Besides possessing highly desirable post-quantum security features, lattice-based cryptography relies on simple structures, allowing efficient asymptotic complexities, andis quite flexible in practice. In addition to encryption/signature schemes ([15, 25,18, 7, 21, 22]), identity-based encryption [8], multilinear maps [10, 16], lattices arealso involved in homomorphic encryption (HE). The discovery of this propertyby Gentry in 2009 [12], through the use of ideal rings, is a major breakthroughwhich has opened the door to many opportunities in terms of applications, especially when coupled with cloud computing.HE is generally composed of a basic layer, which is a Somewhat Homomorphic Encryption scheme (SHE). Such a scheme allows us to compute a limited

2Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, and Vincent Zuccanumber of additions and multiplications on ciphertexts. This can be explainedby the fact that any ciphertext contains an inherent noise which increases after each homomorphic operation. Beyond a certain limit, the noise becomes toolarge to allow a correct decryption. This drawback may be tackled by usingbootstrapping, which however constitutes a bottleneck in terms of efficiency.Further improvements of noise management [6, 5] have been suggested so that,in practice, and given an applicative context, it may be wiser to select an efficientSHE with parameters enabling a sufficient number of operations. For instance,schemes like FV [9] and YASHE [4] have been implemented and tested for evaluating the SIMON Feistel Cipher [17]. Among the today’s more practical SHEschemes, FV is arguably one of the most competitive. This scheme is being currently considered by major stakeholders, such as the European H2020 HEATconsortium [26], as a viable candidate for practical homomorphic encryption.Our Contribution This work focuses on practical improvement of SHE schemes,in particular FV. Despite the fact that the security of YASHE has been called intoquestion recently [1], this scheme can also benefit from the present work. Theseschemes handle elements of a polynomial ring Zq [X]/(X n 1). The modulus qcan be chosen as the product of some small moduli fitting with practical hardware requirements (machine word, etc.). This enables us to avoid the need ofmulti-precision arithmetic in almost the whole scheme. However, this CRT representation (a.k.a. Residue Number Systems, or RNS) is hardly compatible witha couple of core operations: coefficient-wise division and rounding, occurring inmultiplication and decryption, and a noise management technique within homomorphic multiplication, relying on the access to a positional number system.We show how to efficiently avoid any switch between RNS and the positionalsystem for performing these operations. We present a full RNS variant of FV andanalyse the new bounds on noise growth. A software implementation highlightsthe practical benefits of the new RNS variant.It is important to note that this work is related to the arithmetic at the coefficient level. Thus, the security features of the original scheme are not modified.Outline Section 2 provides some preliminaries about FV and RNS. Section 3provides a full RNS variant of decryption. Section 4 gives a full RNS variant ofhomomorphic multiplication. Results of a software implementation are presentedin Section 5. Finally, some conclusions are drawn.2PreliminariesFor the sake of clarity, proofs of lemmas, propositions and theorems within Sect.3 and 4 are provided in an extended version [2].Context High-level operations occur in a polynomial ring R Z[X] /(X n 1)with n a power of 2. R is one-to-one mapped to integer polynomials of degree

A Full RNS Variant of FV like Somewhat Homomorphic Encryption Schemes3strictly smaller than n. Most of the time, elements of R are denoted by lowercase boldface letters and identified by their coefficients. Polynomial arithmeticis done modulo (X n 1). The infinity norm of a (a0 , . . . , an 1 ) R is definedby kak max06i6n 1 ( ai ). Ciphertexts will be managed as polynomials (ofdegree 1) in R[Y ]. For ct R[Y ], we note kctk maxi kct[i]k , ct[i] beingthe coefficient of degree i in Y . The multiplicative law of R[Y ] is denoted by ?.Behind lattice-based cryptosystems in general, and FV in particular, lies theprinciple of noisy encryption. Additionally to the plaintext, a ciphertext containsa noise (revealed by using the secret key) which grows after each homomorphicoperation. Since the homomorphic multiplication involves multiplications in R,it is crucial that the size of a product in R does not increase too much. Thisincrease is related to the ring constant δ sup{kf · gk /kf k · kgk : (f , g) (R \ {0})2 }. It means that kf · gk 6 δkf k · kgk . For the specific ring Rused here, δ is equal to n.For our subsequent discussions on decryption and homomorphic multiplication, we denote the “Division and Rounding” in R[Y ] (depending on parameterst, q which are defined thereafter) by:DRi : ct iXj 0jct[j]Y R[Y ] 7 i Xtj 0q ct[j] Y j R[Y ] .(1)The notation b qt ce, for any c R (e.g. ct[j] in (1)), means a coefficient-wisedivision-and-rounding.Plaintext and Ciphertext Spaces The plaintext space is determined by aninteger parameter t (t 2). A message is an element of Rt R/(tR), i.e. apolynomial of degree at most n 1 with coefficients in Zt . The notation [m]t(resp. m t ) means that coefficients lie in [ t/2, t/2) (resp. [0, t)). Ciphertexts willlie in Rq [Y ] with q a parameter of the scheme. On one side, some considerationsabout security imply a relationship between q and n which, for a given degreen, establish an upper bound for log2 (q) (cf. (6) in [9]). On the other side, theratio b qt c will basically determine the maximal number of homomorphicoperations which can be done in a row to ensure a correct decryption.RNS Representation Beyond the upper bound on log2 (q) due to securityrequirements, the composition of q has no restriction. So, q can be chosen as aproduct of small pairwise coprime moduli q1 . . . qk . The reason for such a choice is the Chinese Remainder Theorem (CRT) which offers a ring isomorphism Zq Qki 1 Zqi . Thus, the CRT implies the existence of a non-positional number system(RNS) in which large integers (mod q) are mapped to sets of small residues.Furthermore, the arithmetic modulo q over large integers can be substitutedby k independent arithmetics in the small rings Zqi . The isomorphism can benaturally extended to polynomials: Rq ' Rq1 . . . Rqk . It means that RNScan be used at the coefficient level to accelerate the arithmetic in Rq .

4Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, and Vincent ZuccaIn the rest of the paper, the letter q may refer either to the product q1 . . . qkor to the “RNS base” {q1 , . . . , qk }. Symbol ν denotes the “width” of the moduli.In other words, from now on, any modulus m (should it belong to q or to anyother RNS base) is assumed to verify m 2ν .Asymmetric Keys The secret key s is picked up in R according to a discretedistribution χkey on R (in practice, bounded by Bkey 1, i.e. ksk 6 1).For creating the public key, an “error” distribution χerr over R is used. Inpractice, this is a discrete distribution statistically close to a Gaussian (withmean 0 and standard deviation σerr ) truncated at Berr (e.g. Berr 6σerr ). χerris related to the hardness of the underlying (search version of) RLWE problem(for which the purpose is, given samples ([ (ai s ei )]q , ai ) with ei χerr anda U(Rq ), to find s; U(Rq ) is the uniform distribution on Rq ). The publickey pk is created as follows: sample a U(Rq ) and e χkey , then outputpk (p0 , p1 ) ([ (as e)]q , a).Encryption, Addition, Inherent Noise of a Ciphertext Encryption andhomomorphic addition are already fully compliant with RNS arithmetic. Theyare recalled hereafter:– EncFV ([m]t ): from samples e1 , e2 χerr , u χkey , outputct (ct[0], ct[1]) ([ [m]t p0 u e1 ]q , [p1 u e2 ]q ).– AddFV (ct1 , ct2 ): output ([ct1 [0] ct2 [0]]q , [ct1 [1] ct2 [1]]q ).By definition, the inherent noise of ct (encrypting [m]t ) is the polynomial vsuch that [ct(s)]q [ct[0] ct[1]s]q [ [m]t v]q . Thus, it is revealed byevaluating ct Rq [Y ] on the secret key s.Elementary Operations A basic word will fit in ν bits. In RNS, an “innermodular multiplication” (IMM) in a small ring like Zm is a core operation. IfEM stands for an elementary multiplication of two words, in practice an IMMis costlier than an EM. But it can be well controlled. For instance, the moduliprovided in NFLlib library [19] (cf. Sect. 5) enable a modular reduction whichreduces to one EM followed by a multiplication modulo 2ν . Furthermore, the costof an inner reduction can be limited by using lazy reduction, e.g. during RNSbase conversions used throughout this paper. NTT and invNTT denote the NumberTheoretic Transform and its inverse in a ring Rm for a modulus m. They enablean efficient polynomial multiplication (NTT, invNTT O(n log2 (n))).3Towards a Full RNS DecryptionThis section deals with the creation of a variant of the original decryption function DecFV , which will only involve RNS representation. The definition of DecFVis recalled hereafter.

A Full RNS Variant of FV like Somewhat Homomorphic Encryption Schemes– DecFV (ct (c0 , c1 ) Rq [Y ]): compute [DR0 ([ct(s)]q )]t hjtq [c0 c1 s]q5mi.tThe idea is that computing [c0 c1 s]q [ [m]t v]q reveals the noise. If thisnoise is small enough, and given that [m]t has been scaled by , the function DR0allows to cancel the noise while scaling down [m]t to recover [m]t . Concretely,decryption is correct as long as kvk ( q t )/2, i.e. the size of the noiseshould not go further this bound after homomorphic operations.The division-and-rounding operation makes DecFV hardly compatible withRNS at a first sight. Because RNS is of non positional nature, only exact integerdivision can be naturally performed (by multiplying by a modular inverse). Butit is not the case here. And the rounding operation involves comparisons whichrequire to switch from RNS to another positional system anyway, should it be aclassical binary system or a mixed-radix one [11]. To get an efficient RNS variantof DecFV , we use an idea of [3]. To this end, we introduce relevant RNS tools.3.1Fast RNS Base ConversionAt some point, the decryption requires, among others, a polynomial to be converted from Rq to Rt . To achieve such kind of operations as efficiently as possible, we suggest to use a “fast base conversion”. In order to convert residues ofx [0, q) from base q to a base B (e.g. {t}) coprime to q, we compute:!kXqqi mod m.(2)xiFastBconv(x, q, B) q qi qii 1m BThis conversion is relatively fast. This is because the sum should ideally bereduced modulo q in order to provide the exact value x; instead, (2) providesx αx q for some integer αx [0, k 1]. Computing αx requires costly operationsin RNS. So this step is by-passed, at the cost of an approximate result.FastBconv naturally extends to polynomials of R by applying it coefficientwise.3.2Approximate RNS RoundingThe above mentioned fast conversion allows us to efficiently compute an approximation of b qt [c0 c1 s]q e modulo t. The next step consists in correcting thisapproximation.First, we remark that ct(s) q can be used instead of [ct(s)]q . Indeed, thedifference between these two polynomials is a multiple of q. So, the divisionand-rounding turns it into a polynomial multiple of t, which is cancelled by thelast reduction modulo t. Second, a rounding would involve, at some point, acomparison. This is hardly compatible with RNS, so it is avoided. Therefore, wepropose to simplify the computation, albeit at the price of possible errors, byreplacing rounding by flooring. To this end, we use the following formula: t ct(s) q t · ct(s) qt ct(s) q .qq

6Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, and Vincent ZuccaThe division is now exact, so it can be done in RNS. Since this computationhas to be done modulo t, the term t ct(s) q cancels. Furthermore, the term( t · ct(s) q mod t) is obtained through a fast conversion.Lemma 1 sums up the strategy by replacing ct(s) q by γ ct(s) q , where γis an integer which will help in correcting the approximation error.Lemma 1. Let ct be such that [ct(s)]q [m]t v qr, and let vc : tv [m]t q t . Let γ be an integer coprime to q. Then, for m {t, γ}, the followingequalities hold modulo m: t 1FastBconv( γt · ct(s) q , q, {t, γ}) q m γ [ct(s)]q eq vc γ ([m]t tr) γ eq(3)where each integer coefficient of the error polynomial e R lies in [0, k].The error e is due to the fast conversion and the replacement of rounding byflooring. It is the same error for residues modulo t and γ. The residues moduloγ will enable a fast correction of it and of the term bγ vqc e at a same time. Also,note that r vanishes since it is multiplied by both t and γ.3.3Correcting the Approximate RNS RoundingThe next step is to show how γ in (3) can be used to correct the term (bγ vqc e e).It can be done efficiently when the polynomial vc is such that kvc k 6 q( 21 ε),for some real number ε (0, 1/2].Lemma 2. Let kvc k 6 q( 12 ε), e R with coefficients in [0, k], and γ aninteger. Then, vcvcγε k γ e γ e .(4)qqγLemma 2 enables an efficient and correct RNS rounding as long as k( 21 kvc k 1) γ has the size of a modulus. Concretely, one computes (3) andquses the centered remainder modulo γ to obtain γ ([m]t tr) modulo t, whichreduces to γ[m]t mod t. And it remains to multiply by γ 1 t to recover [m]t .3.4A Full RNS Variant of DecFVThe new variant of the decryption is detailed in Alg. 1. The main modificationfor the proposed RNS decryption is due to Lem. 2. As stated by Thm. 1, givena γ, the correctness of rounding requires a new bound on the noise to make theγ-correction technique successful.

A Full RNS Variant of FV like Somewhat Homomorphic Encryption Schemes7Theorem 1. Let ct(s) [m]t v (mod q). Let γ be a positive integer cotprime to t and q such that γ 2k/(1 t q q ). For Alg. 1 returning [m]t , itsuffices that v satisfies the following bound: q 1 k q tkvk 6 .(5)t 2 γ2There is a trade-off between the size of γ and the bound in (5). Ideally, γ 2kat the price of a (a priori ) quite small bound on the noise. But by choosingγ 2p 1 k for p ν 1 dlog2 (k)e (i.e. γ 2ν is a standard modulus), thebound ( (1 2 p ) q t )/2 for a correct decryption should be close to theoriginal bound ( q t )/2 for practical values of ν. A concrete estimation of γin Sect. 5.1 will show that γ can be chosen very close to 2k in practice, and thusfitting on a basic word by far.Algorithm 1 DecRNS (ct, s, γ)Require: ct an encryption of [m]t , and s the secret key, both in base q; an integer γcoprime to t and qEnsure: [m]t1: for m {t, γ} do2:s(m) FastBconv( γt · ct(s) q , q, {m}) q 1 m mod m3: end for4: s̃(γ) [s(γ) ]γ5: m(t) [(s(t) s̃(γ) ) γ 1 t ]t6: return m(t)3.5Staying in RNS Is Asymptotically BetterIn any decryption technique, (ct(s) mod q) has to be computed first. To optimizethis polynomial product, one basically performs kNTT knIMM kinvNTT.For next steps, a simple strategy is to compute (b qt [ct(s)]q e mod t) by doingan RNS-to-binary conversion in order to Pperform the division and rounding. Bykdenoting xi ct(s) qqi qi , one computes i 1 xi qqi mod q, compares it to q/2 soas to center the result, and performs division and rounding. Hence, the divisionand-rounding requires O(k 2 n)EM. In practice, security analysis (cf. e.g. [9, 4, 17])requires that kν dlog2 (q)e O(n). So, the cost of leaving RNS to access apositional system is dominant in the asymptotic computational complexity.Staying in RNS enables to get a better asymptotic complexity. Indeed, it iseasy to see that Alg. 1 requires O(kn) operations (excluding the polynomial product). Thus, the cost of NTT is dominant in this case. By considering k O(n),we deduce C(DecFV ) O(n3 ), while C(DecRNS ) O(n2 log2 (n)). But the hiddenconstant in “k O(n)” is small, and the NTT, common to both variants, shouldavoid any noticeable divergence (cf. 5.4) for practical ranges for parameters.

8Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, and Vincent ZuccaIn order to provide optimized RNS variants of decryption, we make tworemarks. First, thePreduction modulo q is unnecessary. Indeed, any extra multiplekqtof q in the sumi 1 xi qi is multiplied by q , making the resulting term amultiple of t, which is not affected by the rounding and is finally cancelledmodulo t. Second, it is possible to precompute qt as a multiprecision floatingpoint number in order to avoid a costly integer division. But given the firstremark, it suffices to precompute the floating point numbers Qi qti with 2ν log2 (k) log2 (t) bits ( 2 words) of precision. In this case, using standard doubleor quadruple (depending on ν) precision is sufficient. Finally, it is sufficient toPkcompute b i 1 xi Qi e mod t. This represents about 2knEM. Reducing modulo tis nearly free of cost when t is a power of 2.A second optimized RNS variant, with only integer arithmetic, is based onAlg. 1, in which γ is assumed to be coprime to t. It is possible to be slightlymore efficient by noticing that the coprimality assumption can be avoided. Thisis because the division by γ is exact. To do it, the for loop can be done moduloγ t. For instance, even if t is a power of 2, one can choose γ as being a powerof 2 too, and use the following lemma to finish the decryption efficiently.Lemma 3. Let γ be a power of 2. Let z : γ[m]t bγ vqc e e γt coming from(3) when computed modulo γt. If γ satisfies (4), then ( denotes the right bitshifting, and v1 is the polynomial with all its coefficients being equal to 1) (z γ2 v1 ) log2 (γ) t [m]t .(6)Lemma 3 can be adapted to other values for γ. The important remark is that[m]t is contained in the blog2 (t)c 1 most significant bits of (z γ2 v1 ) mod γt.So, by choosing γ as a power of 2, a simple bit shifting enables to recover thesebits. Finally, as soon as γt fits in 1 word, the co

Residue Number Systems, Software Implementation 1 Introduction Cryptographers’ deep interests in lattices are for multiple reasons. Besides pos-sessing highly desirable post-quantum security features, lattice-based cryptogra-phy relies on simple structures, allowing e cient asymptotic complexities, and is quite exible in practice.