A Hybrid Deduplication for Secure and Efficient Data Outsourcing in Fog Computing

Dongyoung Koo (Department of Computer Science and Engineering, Korea University, Seoul, South Korea; dykoo@korea.ac.kr)
Youngjoo Shin (National Security Research Institute, Daejeon, South Korea; yjshin@nsr.re.kr)
Joobeom Yun (Department of Computer and Information Security, Sejong University, Seoul, South Korea; jbyun@sejong.ac.kr)
Junbeom Hur (Department of Computer Science and Engineering, Korea University, Seoul, South Korea; jbhur@korea.ac.kr)

2016 IEEE 8th International Conference on Cloud Computing Technology and Science (CloudCom'16). DOI 10.1109/CloudCom.2016.51

Abstract—With the prevalence of remote storage services, data privacy issues become more serious owing to the loss of control over outsourced data. Meanwhile, service providers tend to minimize storage utility costs. To minimize storage costs while preserving data privacy, secure deduplication techniques have been proposed, which are categorized into client-side and server-side approaches. The client-side approach achieves storage and bandwidth savings at the same time, but allows external adversaries to learn of the existence of duplicates in the remote storage. On the contrary, the server-side approach prevents the adversaries from learning this, but sacrifices network bandwidth savings. In fog computing, however, which is a new computing paradigm extending cloud computing by outsourcing the centralized workload of the cloud to geographically distributed fog devices located at the edge of the network, the previous deduplication schemes cannot guarantee efficiency improvement and privacy preservation simultaneously. In this paper, we present a simple but nontrivial solution to these contradictory issues in fog storage. The proposed hybrid secure deduplication protocol combines client- and server-side deduplication, taking untrustworthy fog storage environments into account. Client-side deduplication is applied in inter-network (i.e., cloud-fog) communications to prevent network congestion at the network core, while server-side deduplication is adopted in intra-network (i.e., user-fog) communications to prevent information leakage via side channels for maximal data privacy. Performance and security analyses demonstrate the comparable efficiency of the proposed scheme with security enhancement.

Keywords—Data outsourcing, client-side deduplication, server-side deduplication, fog computing, data privacy, efficiency

I. INTRODUCTION

Fog computing, as an extension of cloud computing from the core to the edge of the network, is a promising next-generation paradigm. Due to the rise of IoT devices with limited computing resources, cloud-based solutions have been extensively researched. However, forecasts based on the recent growth of the IoT market [1], [2] indicate that centralized clouds are unlikely to be able to provide satisfactory services to end users in the near future. On the other hand, fog computing efficiently handles concentrated service requests by outsourcing the centralized workloads, in terms of storage space and network bandwidth, to fog devices distributed over a wide geographical range [3], [4], [5]. While the central cloud provides the overall computing services, it also manages decentralized, heterogeneous fog devices such as set-top boxes, access points, and home gateways. Individual fog devices located near IoT devices provide faster services to end users based on their own computation, storage, and network capabilities. In short, fog computing has the following attractive attributes: (1) low latency, (2) enhanced user experience (i.e., high-quality service), and (3) context awareness based on locational proximity to end users [6], [7].

Centralized cloud storage is unable to handle enormous volumes of data in a timely manner given finite network bandwidth. Distributed storage, especially fog devices, is incapable of providing full computing services to users owing to its limited resources and field of vision. Therefore, efficient resource management can be seen as one of the most important goals of commercial cloud storage services. As regards space utilization, deduplication is an attractive data compression technique which stores only a single copy of duplicate data and provides owners with a link to it. Compared to cloud storage, fog devices located at the user side with temporal storage (owing to limited storage capacity) can perform deduplication and provide data outsourcing services to data owners faster than a central cloud architecture. At the same time, the central cloud can efficiently utilize storage space from a global perspective by receiving and maintaining only unique data from fog devices.

As regards bandwidth utilization, fog computing can be seen as a three-tier (cloud-fog-end user) layered network. Service delays are more likely to happen in communications between the central cloud and the distributed fogs, which are (possibly multi-hop) inter-network communications. Contrariwise, intra-network communications (generally a single hop or a few hops) between the fog and end users have relatively low latency. Thus, client-side deduplication, which allows end users to upload only a small and unique instance of data, is adequate on the inter-network due to its bandwidth efficiency, preventing repetitive uploads of whole duplicate data.

Despite the compelling benefits of deduplication, privacy issues surrounding outsourced data have also received close attention. Data owners cannot guarantee secure management because they lose control over their outsourced data in remote storage systems after outsourcing [8], [9]. Thus, end users attempt to outsource encryptions of their contents while still supporting deduplication, e.g., via convergent encryption (CE) [10]. In this context, server-side deduplication, which allows repetitive uploads of duplicate data but eliminates them at the server side, is more appropriate for communications between the fog and end users (on the intra-network), since it prohibits illegitimate users from learning side information such as whether duplicate content resides in the remote storage.

Considering secure deduplication together with storage and bandwidth efficiency in fog storage systems, we present a hybrid secure deduplication protocol. By applying client-side deduplication at the inter-network level and its server-side counterpart at the intra-network level, the proposed scheme achieves best-effort bandwidth with desirable security guarantees. Specifically, our protocol satisfies the following properties:
1) End users can upload their contents without prior key agreement while preventing side information leakage.
2) Fog storage and cloud storage cannot learn any information about outsourced data except occurrences of deduplication.
3) Throughput increases by adaptively adopting client- and server-side deduplication according to network conditions.
4) The storage utilization ratio increases in fog storage and cloud storage by exploiting the multi-tier hierarchy of the fog storage system.

Our solution is nontrivial because previous server-side deduplication schemes with interactive key agreement require the strong assumption that at least one end user who has previously uploaded the same content is always online. Server-side deduplication without key agreement generally cannot allow the semi-honest fog storage to perform deduplication, because it is difficult to identify duplication without knowledge of the encryption key.

The rest of this paper is organized as follows. In Section II, previous secure deduplication studies are briefly reviewed. In Section III, our goals for fog storage systems are defined. In Section IV, the proposed deduplication protocol is presented. In Sections V and VI, our scheme is compared with and analyzed against previous deduplication approaches. In Section VII, the paper concludes with a summary of the proposed scheme.

II. RELATED WORK

Extensive research on secure deduplication has been conducted recently under the consideration of cloud storage environments, not specific to fog storage environments. Note that there have been numerous secure deduplication protocols, including ones that allow end users to outsource plaintext [11], [12], [13], [14], under the assumption that the remote storage service provider is fully trusted. In the rest of this paper, however, we assume that remote storage cannot be fully trusted [8]. Thus, we focus on previous secure deduplication schemes which exploit cryptographic primitives at the end-user side for data privacy. In this section, we briefly summarize some representative works and point out their limitations and the difficulties in directly applying them to the upcoming fog storage architecture.

A. Client-side secure deduplication

Client-side deduplication occurs on the side of the data-uploading entity (i.e., end users rendered by IoT devices). Specifically, a client who attempts to upload data computes and sends a duplicate-identification term (e.g., a hash value) of the data to the remote storage before actual outsourcing. When a duplicate copy is discovered in the remote storage, the upload requestor proves his ownership to the storage with a modest amount of communication instead of uploading the entire content. By allowing only unique content to be outsourced to the remote storage, bandwidth consumption can diminish significantly and the effects of storage savings appear immediately.

Douceur et al. [10] introduced a cryptographic primitive, namely CE, attempting to link data confidentiality via encryption with data deduplication. In this scheme, an encryption key is derived from the outsourced data in a deterministic way (i.e., as a hash value) so that ciphertexts generated under this key become identical. As a result, clients having the same content produce the same ciphertext without prior key agreement, and the remote storage is allowed to identify duplicates via a simple equality test without knowledge of the encryption keys (see the sketch at the end of this subsection). Following this concept of deterministic key derivation from the data itself, Bellare et al. [15] presented a generalized framework called message-locked encryption (MLE). MLE is categorized into four particular constructions and assessed rigorously according to levels of integrity and security guarantees.

However, some vulnerabilities arise in client-side deduplication approaches, such as confirmation-of-file (CoF). CoF is identified in [16], [17] as side information whereby the knowledge of duplicates in the remote storage is revealed to adversaries. This can have a serious impact on data privacy when the message space is restricted to a predictable space. Unfortunately, this side channel is inevitable because the identification of duplicate content in the remote storage is a requisite component of client-side deduplication.
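To make the CE mechanics concrete, the following is a minimal sketch in Python (the language of the paper's prototypes). It is an illustration, not the authors' code: the function name ce_encrypt is invented, and the deterministic AES-GCM nonce is acceptable here only because each key encrypts exactly one fixed content; a production design would need more care.

```python
import hashlib

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def ce_encrypt(data: bytes):
    """Convergent encryption: key and nonce are derived from the content."""
    key = hashlib.sha256(data).digest()          # convergent key K = H(M)
    nonce = hashlib.sha256(key).digest()[:12]    # deterministic, illustration only
    ciphertext = AESGCM(key).encrypt(nonce, data, None)
    tag = hashlib.sha256(ciphertext).hexdigest()  # duplicate-identification term
    return tag, ciphertext

# Equal contents yield equal tags and ciphertexts without prior key
# agreement, so the server can deduplicate via a simple equality test:
t1, c1 = ce_encrypt(b"same content")
t2, c2 = ce_encrypt(b"same content")
assert t1 == t2 and c1 == c2
```

The equality of tags is exactly the CoF side channel discussed above: anyone who can upload a guess of the content learns whether it already resides in storage.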

B. Server-side secure deduplication

As opposed to client-side deduplication, server-side deduplication occurs on the side of the data-storing entity (i.e., cloud and fog storage). The storage server receives all of the uploaded ciphertexts from clients, and then performs deduplication off-line or in the background during service provisioning. In the sense that the uploading client cannot become aware of duplicate copies in the remote storage, this approach can be seen as more secure than the client-side one.

Following the security vulnerabilities via side channels identified by Harnik et al. [11], Bellare et al. presented a server-aided deduplication scheme, DupLESS [18]. This protocol adopts an oblivious transfer protocol so that data owners of the same content can agree on a pseudorandom encryption key. With the support of an additional independent key server, it prevents adversaries from guessing plain content by exploiting the common hash value as side information. In addition, online brute-force attacks can be effectively mitigated by rendering the key server a rate-limiting authority.

Recently, Liu et al. proposed a server-side deduplication scheme without additional independent servers [19] by exploiting a password-authenticated key exchange (PAKE) protocol. In this scheme, the data-uploading user engages in a key agreement protocol with online users who previously uploaded data with the same short hash, which intentionally permits collisions. During the PAKE protocol, communications pass through the cloud while secret values risking the data privacy are not transmitted. Once an agreed key is produced through the PAKE protocol, the data-uploading user retrieves the encryption key and produces the same ciphertext stored in the cloud. This protocol is desirable in that neither side information is leaked to illegitimate users nor additional overheads are required to maintain extra servers. However, network traffic is concentrated on the cloud, which is likely to cause a single point of failure. In addition, the assumption that there should be at least one online user who has previously outsourced the same content is too strong in pragmatic cloud storage environments.

Although server-side deduplication achieves a higher level of security than client-side deduplication in terms of blocking the leakage of side information, it requires more communication overhead than its client-side counterpart. This can be seen as a trade-off between data privacy and bandwidth efficiency, but efficient utilization of the network bandwidth becomes a crucial goal as the volume of outsourced data increases in the era of fog computing.

Besides, Stanek et al. [20] proposed an interesting idea considering both client- and server-side deduplication based on the popularity of outsourced content. For popular outsourced data, the existence of the outsourced data can be leaked to adversaries, which brings about the same side-channel vulnerabilities, because the encryption of popular content converges to convergent encryption. The existence of a trusted third party incurs additional maintenance overhead as well in this scheme.

III. FOG STORAGE DESCRIPTION AND OUR GOALS

In this section, we describe the architecture of fog storage systems and then define the goals of this paper.

A. Fog storage architecture

[Figure 1: Architecture of a fog storage system. Centralized storage (cloud) connects to distributed storage (fog) over the inter-network; fog devices serve IoT devices (end users) over the intra-network; workload is outsourced from the cloud to the fogs and from the fogs toward the edge.]

We introduce the three system entities in a typical fog storage system: the cloud, the fog, and the end user (Fig. 1). A role sketch follows this list.
1) The cloud is a centralized service provider which provides long-term data storage and retrieval services. The cloud maintains deduplicated data and metadata in the form of (data owner's ID, physical link to the data). In order to handle enormous volumes of data and to avoid network congestion caused by its finite storage and network resources, its data storage workload is outsourced to widely distributed fog devices.
2) The fog is a distributed entity which is located at the edge of the network and provides data storage services in place of the cloud with its own limited resources. Fog devices are connected to the central cloud and possibly with each other (via the inter-network), while each is connected to end-user devices in a spatially restricted domain (via the intra-network). It is responsible for temporal storage services and relays services from the cloud when temporal/spatial workloads exceed its capability.
3) The end user is the data outsourcing/retrieving entity (e.g., IoT devices). The principal goal of end users is to receive data storage/retrieval services with elasticity and scalability from the cloud storage through the fog storage. End users have very limited storage space relative to the cloud and fog storage; thus, we assume that they remove the data content from their local storage after outsourcing it.
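The division of labor among the three tiers can be sketched as follows. All class and method names (Cloud.put, Fog.upload, capacity) are hypothetical, and duplicate detection is abstracted into an opaque content tag rather than the pairing-based test introduced in Section IV.

```python
class Cloud:
    """Centralized long-term storage: keeps a single copy per content tag."""
    def __init__(self):
        self.store = {}    # tag -> ciphertext
        self.owners = {}   # tag -> set of data-owner IDs

    def put(self, tag, ciphertext, owner):
        self.owners.setdefault(tag, set()).add(owner)
        if tag not in self.store:       # deduplicated: store unique data only
            self.store[tag] = ciphertext

class Fog:
    """Edge node: temporal storage; flushes to the cloud on overload."""
    def __init__(self, cloud, capacity=64):
        self.cloud = cloud
        self.capacity = capacity
        self.temp = {}     # tag -> (ciphertext, set of owners)

    def upload(self, tag, ciphertext, owner):
        # Server-side deduplication happens locally: one copy per tag is
        # kept in the temporal store while every valid owner is registered.
        _, owners = self.temp.get(tag, (None, set()))
        owners.add(owner)
        self.temp[tag] = (ciphertext, owners)
        if len(self.temp) >= self.capacity:
            self.flush()

    def flush(self):
        # Client-side deduplication toward the cloud: only data the cloud
        # does not already hold needs to cross the inter-network.
        for tag, (ct, owners) in self.temp.items():
            for owner in owners:
                self.cloud.put(tag, ct, owner)
        self.temp.clear()
```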

B. Efficiency requirements

Taking into account the fog storage architecture explained above, bandwidth consumption needs to be minimized in inter-network communication. Heavy network traffic is likely to be concentrated at the central cloud, because it must receive uploaded data from multiple fog storages and transmit stored data to them in order to maintain the overall storage service. Therefore, to control network bandwidth effectively, it is desirable to perform client-side deduplication between the cloud storage and the fog storage. In contrast, shifting heavy network load to distributed small-area domains is natural because the network condition between the fog storage and end users is expected to be better owing to physical proximity. As long as the impact of server-side deduplication is negligible, it is admissible to perform server-side deduplication between the fog storage and end users.

C. Adversarial model

In a fog storage system, we assume that the cloud and fog devices are honest-but-curious entities. To clarify an adversary's location and capabilities, adversaries can be broadly categorized into two groups: inside and outside.
- The inside adversary participates in the data outsourcing services and follows the prescribed protocols. At the same time, it attempts to learn information about the underlying plain content of the outsourced data by exploiting transcripts legitimately given to it. The cloud and fog storage belong to this kind of adversary, and end users who have not outsourced data can be considered potential inside adversaries. It can therefore be regarded as a passive attacker.
- The outside adversary attempts to acquire the underlying plain data without obeying the given protocol. It can exploit side channels such as eavesdropping, and may access and possibly modify physical storage in an inappropriate manner. It can therefore be considered an active attacker.

Without loss of generality, we assume that the cloud and fog do not collude with other adversaries, so that the latter cannot learn about the existence of duplicates in remote storage.

D. Security requirements

Taking the aforementioned adversaries into account, the following properties should be satisfied in the proposed scheme:
1) Confidentiality: Unauthorized access to plain data should be prevented to deter unintended exposure of outsourced data.
2) Integrity: End users should be able to verify the integrity of their outsourced content after retrieval.
3) Leakage resilience: Information leakage should be minimized during the data outsourcing process. This differs from confidentiality in that no side information should be given even to valid end users who have outsourced the data content to the remote storage.

IV. PROPOSED HYBRID SECURE DEDUPLICATION

In this section, the proposed hybrid deduplication scheme in the fog storage architecture is described.

A. Preliminaries

Prior to a detailed description of the proposed scheme, the cryptographic primitives on which it is based are briefly summarized.

1) Bilinear maps: Let G and G_T be multiplicative cyclic groups of large prime order p, where g is a generator of G. Then there is an efficiently computable bilinear map e : G × G → G_T with the following properties:
- Bilinearity: e(u^a, v^b) = e(u, v)^{ab} holds for any u, v ∈ G and any a, b ∈ Z_p.
- Non-degeneracy: e(g, g) ≠ 1.

2) Bilinear Diffie-Hellman assumption: The bilinear Diffie-Hellman (BDH) problem is to compute e(g, g)^{abc} ∈ G_T given (g, g^a, g^b, g^c) for randomly chosen values a, b, c ∈_R Z_p. Let Adv_A^{BDH} be the advantage with which any probabilistic polynomial-time (PPT) adversary A solves the BDH problem:

Adv_A^{BDH} = Pr[A(e, g, g^a, g^b, g^c) = e(g, g)^{abc}].

If Adv_A^{BDH} is a negligible function in the security parameter λ, then we say the BDH assumption holds and call ⟨p, G, G_T, g, e⟩ a BDH group.

3) Identity-based encryption (IBE): Waters introduced an efficient IBE [21] without random oracles. Waters' IBE comprises three probabilistic algorithms and one deterministic algorithm (Setup, KeyGen, Enc, Dec).
- (μ, mk) ← Setup(1^λ): Given the security parameter λ, choose a random BDH group ⟨p, G, G_T, g, e⟩ as defined above. In addition, choose random values α ∈_R Z_p, g_2 ∈_R G, and u', u_1, ..., u_N ∈_R G for some integer N ∈ ℕ. Define a pseudorandom function F(v) = u' ∏_{i∈V} u_i, where v is an N-bit identity string and V is the set of indices at which the bit of v is 1. The public parameter then becomes μ = (g, g_1 = g^α, g_2, F(·)) and the master secret key becomes mk = g_2^α.
- d_v ← KeyGen(μ, mk, v): Given an N-bit identity v, choose a random value r ∈_R Z_p. The decryption key for identity v then becomes d_v = (mk · F(v)^r, g^r) = (g_2^α · F(v)^r, g^r).
- C ← Enc(μ, v, M): For a message M and identity bitstring v, choose a random value t ∈_R Z_p. The ciphertext then becomes C = (M · e(g_1, g_2)^t, g^t, F(v)^t).
- M ← Dec(μ, d_v, C): For a decryption key d_v = (d_1, d_2) and ciphertext C = (c_1, c_2, c_3), the plaintext becomes M = c_1 · e(d_2, c_3)/e(d_1, c_2).

Its security is based on the decisional BDH assumption, and the detailed proof can be found in [21]. In the proposed scheme, a slightly modified version of Waters' IBE is used, where the identity is represented as a pseudorandom value generated from the message to be outsourced. (Footnote 1: Specifically, the master secret mk = g_2^α and g_1 in the public parameter spp are removed from the proposed protocol. This removes the necessity of a trusted key generation center (note that the cloud and fog cannot be fully trusted). Fortunately, the secret key can be generated by valid data owners from the message, in a way similar to the Boneh-Boyen IBE protocol [22].)
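The two bilinear-map properties above can be checked directly with the Charm framework used for the prototypes in Section V. This is a sanity-check sketch only, not part of the protocol; 'SS512' is the supersingular curve with a 512-bit base field mentioned there, for which the pairing is symmetric (G = G1).

```python
from charm.toolbox.pairinggroup import PairingGroup, ZR, G1, pair

group = PairingGroup('SS512')
g = group.random(G1)
u, v = group.random(G1), group.random(G1)
a, b = group.random(ZR), group.random(ZR)

# Bilinearity: e(u^a, v^b) == e(u, v)^(a*b)
assert pair(u ** a, v ** b) == pair(u, v) ** (a * b)
# Non-degeneracy: e(g, g) is not the identity of GT
assert pair(g, g) != pair(g, g) ** 0
```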

B. Overview

Once the system public parameters are established, end users are allowed to transfer their content to the fog storage in charge and to retrieve the content from it on demand.

Each end user outsources a randomized encryption of his data content by choosing a random exponent, and deletes both the plaintext and the ciphertext from his local storage. The fog storage, located on the user side, then stores the received ciphertext for a certain period of time. During this period, the fog storage performs server-side deduplication locally whenever an encryption of the same content is uploaded by end users. Upon time excess or storage overload, the fog storage and the central cloud storage engage in client-side deduplication. If encryptions of the same content are stored in distributed fog storages, the central cloud receives just one (random) copy of them and all of the fog storages remove the encryption from their temporal storage.

To access the outsourced data, the end user receives the possibly deduplicated ciphertext from the fog device in charge. If the ciphertext is not stored in the fog's local storage, the fog device relays the ciphertext belonging to the end user from the central cloud; otherwise, the fog directly transfers the corresponding ciphertext to the end user.

C. Main construction

We now describe the entire process of our deduplication protocol. The procedure is composed of four phases (System setup, Data outsourcing, Retrieval, Deduplication). It is based on Waters' IBE scheme, in a version modified to enable secure deduplication. The specific algorithms are described in Algorithm 1 and illustrated in the code sketch at the end of this subsection.

1) System setup: The trusted initializer runs the Setup algorithm to obtain the system public parameters spp. spp is disclosed to the public so as to allow any entity engaging in data outsourcing to perform the prescribed protocol.

2) Data outsourcing: When an end user tries to upload content to the fog storage, he first generates two pseudorandom values derived from the content, which are used in encryption (as an identity of the IBE scheme). Using these values, he generates a random encryption and a decryption key in the Encrypt algorithm. He then transmits the ciphertext to the fog storage, which is located in the same or a nearby network. The decryption key dk and the hash digest α are kept secret by the end user and used for access to the outsourced content and for integrity verification, respectively. Without loss of generality, he removes all of the parameters except the decryption key to save storage space. It is notable that the end user who outsources his encrypted content cannot learn any information, including the existence of the same content in the fog storage.

3) Retrieval: When an end user wants to access his previously outsourced content, he sends a retrieval request to the fog storage. If the fog storage still stores an encryption of the user's content, it simply sends the ciphertext to the end user. Otherwise, the fog storage forwards the retrieval request to the central cloud storage and conveys the received ciphertext to the end user. The central cloud storage always stores a single copy (as an encryption) of each data content outsourced by end users, so it is able to deliver the requested encryption to the fog storage. The end user then runs the Decrypt algorithm to obtain the plain content.

4) Deduplication: Once a fog storage receives a ciphertext C from an end user, it checks for duplicates by running the Dedup algorithm against ciphertexts C' stored in its local storage. If a duplicate copy is found, the fog storage replaces the previously stored ciphertext with the newly received one for the same content and registers the end user as a valid data owner (i.e., server-side deduplication). (Footnote 2: This duplicate check can be done by the fog storage in the background at any time. The selection criterion for which copy should be kept depends on the policy of the fog storage, but it has no effect on the security of the scheme.) Otherwise, the fog storage requests (periodically or on demand) a duplicate check from the central cloud storage by sending only the last two components of the received ciphertext. The central cloud runs the Dedup algorithm as the fog did, but returns the result of the algorithm to the fog. If the result is False (i.e., unique data not stored in the cloud), the fog forwards the first component c_1 and the cloud stores the ciphertext in the form C = (c_1, c_2, c_3) (i.e., client-side deduplication). Regardless of the result, the fog removes the ciphertext from its local storage at the end of the Dedup algorithm.
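The following Python sketch, using the Charm framework from Section V, walks through SigGen, Encrypt, Decrypt, and Dedup. It is a simplified illustration under stated assumptions, not the authors' implementation: the Waters hash F(·) is replaced by a hash-to-group function (a random-oracle-style shortcut), SE/SD is realized with AES-GCM under a fixed nonce (acceptable only because ek is a one-time key), and all function names are ours.

```python
import hashlib

from charm.toolbox.pairinggroup import PairingGroup, ZR, G1, pair
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

group = PairingGroup('SS512')
g = group.random(G1)    # public generator g in spp
g2 = group.random(G1)   # public g_2 in spp

def F(s: bytes):
    # Stand-in for the Waters hash F(.) of Algorithm 1 (a simplification).
    return group.hash(s, G1)

def H3(gt_element) -> bytes:
    # H3 : GT -> {0,1}^m, realized as SHA-256 over the serialized element.
    return hashlib.sha256(group.serialize(gt_element)).digest()

def sig_gen(M: bytes):
    alpha = group.hash(M, ZR)                  # alpha = H1(M)
    sigma1 = g2 ** alpha                       # sigma_{M,1} = g_2^alpha
    sigma2 = F(hashlib.sha256(M).digest())     # sigma_{M,2} = F(H2(M))
    return alpha, sigma1, sigma2

NONCE = b"\x00" * 12  # fixed nonce; safe here only because ek is one-time

def encrypt(M: bytes):
    _, sigma1, sigma2 = sig_gen(M)
    r, t = group.random(ZR), group.random(ZR)
    dk = (sigma1 * sigma2 ** r, g ** r)        # (g_2^alpha * F(.)^r, g^r)
    ek = H3(pair(sigma1, g) ** t)              # ek = H3(e(sigma_{M,1}, g)^t)
    c1 = AESGCM(ek).encrypt(NONCE, M, None)    # SE(ek, M)
    return dk, (c1, g ** t, sigma2 ** t)

def decrypt(dk, C):
    (d1, d2), (c1, c2, c3) = dk, C
    ek = H3(pair(d1, c2) / pair(d2, c3))       # ratio = e(g_2^alpha, g)^t
    return AESGCM(ek).decrypt(NONCE, c1, None)

def dedup(C, C_prime) -> bool:
    # Duplicate iff e(c2, c3') == e(c2', c3): both sides equal
    # e(g, F(.))^(t*t') exactly when the identities F(H2(M)) coincide.
    (_, c2, c3), (_, c2p, c3p) = C, C_prime
    return pair(c2, c3p) == pair(c2p, c3)

dk, C = encrypt(b"sensor reading")
assert decrypt(dk, C) == b"sensor reading"
_, C2 = encrypt(b"sensor reading")             # re-encrypted with fresh r, t
assert dedup(C, C2)                            # detected without plaintext
```

Note that dedup uses only the last two ciphertext components, which is exactly what the fog forwards to the cloud in the deduplication phase above.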
V. PERFORMANCE ANALYSIS

In this section, the proposed scheme is analyzed and compared with previous secure deduplication approaches in terms of efficiency.

A. Implementation

In order to evaluate the efficiency of the proposed scheme, prototypes of ours, Bellare et al.'s scheme [18], and Liu et al.'s scheme [19] are implemented in Python (version 3.4.3) upon the Charm framework [23]. We then evaluate the performance with average values over 100 repetitions on a desktop PC running Ubuntu 12.04 LTS with a 3.4 GHz CPU and 16 GB RAM. In order to follow the NIST recommendation for 256-bit security [24], secure symmetric encryption/decryption is rendered by AES-256, universal one-way hash functions are implemented with SHA-512 and SHA-256, and a supersingular elliptic curve with a 512-bit base field is used in the prototypes.
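A rough harness for this kind of measurement might look as follows. It assumes the encrypt sketch from Section IV is in scope, the 1 MiB payload is an arbitrary choice of ours, and absolute figures will of course differ from the paper's testbed.

```python
import time

def bench_ms(fn, reps=100):
    """Average wall-clock time per call in milliseconds over `reps` runs."""
    start = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - start) * 1000.0 / reps

message = b"x" * (1 << 20)   # 1 MiB sample payload (an assumption)
print("Encrypt: %.2f ms" % bench_ms(lambda: encrypt(message)))
```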

Algorithm 1: Algorithms in the proposed scheme

procedure Setup(1^λ)   ▷ System public parameter selection
  Select a random BDH group ⟨p, G, G_T, g, e⟩
  Select uniform random values g_2, u', u_1, u_2, ..., u_N from G for some N ∈ ℕ
  Define a function F(s) = u' ∏_{i∈V} u_i, given an N-bit string s with a bitmap V marking the indices at which the i-th bit of s is 1
  Select a semantically secure symmetric encryption/decryption with m-bit key ek for some m ∈ ℕ such that C ← SE(ek, M) and M ← SD(ek, C)
  Select universal one-way hash functions H_1, H_2, and H_3 such that H_1 : {0,1}* → Z_p, H_2 : {0,1}* → {0,1}^N, H_3 : G_T → {0,1}^m
  return spp = ⟨p, G, G_T, g, g_2, e, F(·), H_1(·), H_2(·), H_3(·)⟩
end procedure

procedure SigGen(spp, M)   ▷ Pseudorandom value generation
  α ← H_1(M)
  σ_{M,1} ← g_2^α
  σ_{M,2} ← F(H_2(M))
  return σ = ⟨α, σ_{M,1}, σ_{M,2}⟩
end procedure

procedure Encrypt(spp, σ, M)   ▷ Probabilistic encryption
  Select uniform random values r, t ∈_R Z_p
  dk ← (g_2^α · σ_{M,2}^r, g^r)
  ek ← H_3(e(σ_{M,1}, g)^t)
  C ← (SE(ek, M), g^t, σ_{M,2}^t)
  return ⟨dk, C⟩
end procedure

procedure Decrypt(spp, dk, C)   ▷ Deterministic decryption
  (dk_1, dk_2) ← dk
  (c_1, c_2, c_3) ← C
  ek ← H_3(e(dk_1, c_2)/e(dk_2, c_3))
  M ← SD(ek, c_1)
  return M
end procedure

procedure Dedup(spp, C̃ = (·, c_2, c_3), C̃' = (·, c'_2, c'_3))   ▷ (Client-side) Duplicate identification
  if e(c_2, c'_3) = e(c'_2, c_3) then
    return True
  end if
  return False
end procedure

B. Computation overheads

Fig. 2 demonstrates the computation time (ms) on the client side. Per-algorithm time consumption is depicted in the bar graph, and the total time for a single data outsourcing is drawn as a solid line. In all evaluations, computation times for file I/O, encryption, and decryption are almost the same, with a standard deviation below 0.001, owing to the same cryptographic operations running on the same machine. In addition, all schemes have similar computation times dominated by the encryption.

Oblivious pseudorandom key generation (denoted by OPRF) in Bellare et al.'s scheme is performed on the hash value of the outsourced content and requires 1 exponentiation and 2 multiplications, while the SigGen algorithm in the proposed scheme uses two universal hash functions and a constant number of multiplications (i.e., 257 multiplications).

The other constant times are summarized in TABLE I. Although the proposed scheme requires almost 3 times more computation time for system setup, this is done only once before service provisioning. Liu et al.'s setup time is minimal because the system requires no initial parameter setup for either elliptic curves or RSA.

Unlike the proposed scheme, Bellare et al.'s and Liu et al.'s schemes require computation on the server side during data outsourcing. In Bellare et al.'s DupLESS, the key server needs to engage in oblivious encryption key generation. In Liu et al.'s scheme, the central server is supposed to send the data-outsourcing entity a homomorphically encrypted masked encryption key; when there is no duplicate encryption in the server, the server randomly chooses the key without homomorphic addition, which reduces key generation time by 167 μs. Additionally, Liu et al.'s scheme requires multiple online checkers to participate in the PAKE protocol, so its cost is proportional to the number of online checkers who have the same short hash value. Thus, computation time increases as the number of checkers increases.
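To see where the constant multiplication count of SigGen comes from, here is a sketch of the Waters hash F(s) = u' ∏_{i∈V} u_i from Algorithm 1 over a 256-bit string; the names u_prime, u, and waters_hash are illustrative, not from the paper.

```python
from charm.toolbox.pairinggroup import PairingGroup, G1

group = PairingGroup('SS512')
N = 256
u_prime = group.random(G1)                 # u'
u = [group.random(G1) for _ in range(N)]   # u_1, ..., u_N

def waters_hash(s: bytes):
    """F(s): multiply u_i into u' for every set bit of the N-bit string s."""
    bits = int.from_bytes(s[: N // 8], "big")
    acc = u_prime
    for i in range(N):
        if (bits >> i) & 1:
            acc = acc * u[i]   # one group multiplication per set bit
    return acc                 # at most N + 1 = 257 factors, matching the count above
```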
