Differentially Private Trajectory Analysis for Points-of-Interest Recommendation


Chao Li
School of Information Sciences, University of Pittsburgh, Pittsburgh, USA
Email: chl205@pitt.edu

Balaji Palanisamy
School of Information Sciences, University of Pittsburgh, Pittsburgh, USA
Email: bpalan@pitt.edu

Abstract—Ubiquitous deployment of low-cost mobile positioning devices and the widespread use of high-speed wireless networks enable massive collection of large-scale trajectory data of individuals moving on road networks. Trajectory data mining finds numerous applications, including understanding users' historical travel preferences and recommending places of interest to new visitors. Privacy-preserving trajectory mining is an important and challenging problem, as exposure of sensitive location information in the trajectories can directly invade the location privacy of the users associated with the trajectories. In this paper, we propose a differentially private trajectory analysis algorithm for points-of-interest recommendation that aims at maximizing the accuracy of the recommendation results while protecting the privacy of the exposed trajectories with differential privacy guarantees. Our algorithm first transforms the raw trajectory dataset into a bipartite graph with nodes representing the users and the points-of-interest and edges representing the visits made by the users to the locations, and then extracts the association matrix representing the bipartite graph to inject carefully calibrated noise to meet ε-differential privacy guarantees. A post-processing of the perturbed association matrix is performed to suppress noise prior to performing a Hyperlink-Induced Topic Search (HITS) on the transformed data that generates an ordered list of recommended points-of-interest. Extensive experiments on a real trajectory dataset show that our algorithm is efficient, scalable and demonstrates high recommendation accuracy while meeting the required differential privacy guarantees.
James Joshi
School of Information Sciences, University of Pittsburgh, Pittsburgh, USA
Email: jjoshi@pitt.edu

I. INTRODUCTION

Ubiquitous deployment of mobile positioning devices and the widespread use of high-speed wireless networks enable massive collection of large-scale trajectory data of individuals moving on road networks. The rapid proliferation of low-cost GPS-supported mobile devices enables a wide range of location-based service (LBS) applications, including location-based social networks [10], [30], location-based advertising [2], [22], location-based information sharing [9], [24] and navigation applications [28], [32]. A trajectory represents a sequence of location information formed by a series of (latitude, longitude, timestamp) triples that captures a variety of travel information of the users, including the user's movement pattern [17], travel paths [36] and travel destinations [34], [35]. Each travel destination in a trajectory reveals that the user has made a visit to the place. In addition to this information, a trajectory may also include temporal information such as the visiting times of the users. While the trajectories of an individual mobile user can be analyzed to understand her personal travel preferences comprehensively, aggregate analysis of historical trajectory data belonging to different mobile users can provide more generalized travel recommendations, answering questions such as 'Where are the top-10 points-of-interest in a given city?', 'Which shopping mall is the most popular in this area?' and 'Which users have frequently visited this restaurant?'. Although historical personal trajectory data provide immense information to generate accurate and useful points-of-interest recommendations, the exposure of the sensitive trajectory information can pose significant privacy risks that can invade the location privacy of the users.
In particular, the location information of a travel destination, represented as a two-dimensional geographical region, is often associated with a semantic meaning, such as a university, a shopping mall or a hospital. The disclosure of the association between a mobile user and such a location may reveal private information about the health conditions, lifestyle and social and political beliefs of the user. For example, if an adversary infers the association between a user and a treatment center, the health state of the user may be revealed.

Privacy-preserving trajectory mining is an important and challenging problem, as exposure of sensitive location information in the trajectories can directly invade the location privacy of the users associated with the trajectories. Differential privacy [12], [13], as a state-of-the-art privacy paradigm, provides a model to quantify the disclosure risks by ensuring that the published statistical data does not depend on the presence or absence of an individual record in the dataset. By carefully applying differential privacy mechanisms [12], [13], [25] to trajectory data, the personal trajectory information, such as the travel destinations of the users, can be protected from malicious or curious inference from the recommendation results, thus protecting the location privacy of the mobile users.

In this paper, we propose a differentially private trajectory analysis algorithm for points-of-interest recommendation that aims at maximizing the accuracy of the recommendation results while protecting the privacy of the exposed trajectories with differential privacy guarantees. Our algorithm first transforms the raw trajectory dataset into a bipartite graph with nodes representing the users and the points-of-interest

Fig. 1: User-location bipartite graph construction

and the edges representing the visits made by the users to the locations, and then extracts the association matrix representing the bipartite graph to inject carefully calibrated noise to meet ε-differential privacy guarantees. A post-processing of the perturbed association matrix is performed to suppress noise prior to performing a Hyperlink-Induced Topic Search (HITS) on the transformed data that generates an ordered list of recommended points-of-interest. We perform extensive experiments on the Geolife GPS trajectory dataset [34], [35], [36], which contains 17621 trajectories collected from 182 users over five years. Our results show that the proposed algorithm is efficient, scalable and demonstrates good recommendation accuracy while guaranteeing differential privacy.

The rest of the paper is organized as follows: We first discuss the related work in Section II. Then, in Section III, we present the definitions of trajectory processing and differential privacy and the model to transform a raw trajectory dataset into a user-location bipartite graph. In Section IV, we introduce our privacy goal and explain the proposed differentially private trajectory analysis algorithm for travel recommendation. We experimentally evaluate our algorithm in Section V under varying differential privacy budgets, global sensitivity levels and database scales. Finally, we conclude in Section VI.

II. RELATED WORK

Location privacy has been an active research area for a long time. In the past, location privacy protection mechanisms mainly focused on prevention by protectively processing and perturbing the location information prior to disclosure. Depending on whether the location data is processed discretely or continuously, location privacy protection mechanisms can be roughly classified into location data perturbation techniques, represented by [16], [23], [27], [31], and trajectory inference prevention techniques, represented by [3], [4], [5], [6].
The latter can be further broken down into trajectory perturbation [6] and Mix-zone [3], [4], [5] techniques. However, all these techniques rely on assumptions that restrict the background knowledge of the adversary, and thus fail to provide a strong and quantifiable privacy guarantee.

Differential privacy [12], [13], as a state-of-the-art privacy paradigm, provides a model to quantify the disclosure risks by ensuring that the published statistical data does not depend on the presence or absence of an individual record in the dataset. Differential privacy dispenses with restrictions on the adversary's background knowledge and quantifies the privacy in a mathematically provable manner. By carefully applying differential privacy mechanisms [12], [13], [25] to the trajectory data, the personal trajectory information can be protected from malicious inference from the statistical outputs. Usually, the raw trajectory dataset is first transformed into a special data structure, such as a prefix tree [8], [19] or N-grams [7]. Then, the differential privacy protection mechanisms (e.g., the Laplace Mechanism [12] or the Exponential Mechanism [25]) inject noise into the data structures before releasing them for further processing. To the best of our knowledge, the work presented in this paper is the first differential privacy protection mechanism aimed at processing trajectories modeled as bipartite graphs to generate accurate travel recommendations while protecting the location privacy of the users in the trajectories.

III. CONCEPTS AND MODEL

In this section, we first present the background concepts and the bipartite graph model used to model raw trajectory datasets and the user-location associations in the dataset. We then discuss the differential privacy model and mechanisms for achieving differential privacy.

A. User-location bipartite graph representation

Trajectory dataset analysis typically consists of two major components, namely trajectory preprocessing and trajectory analysis [33].
Depending on the objective of the analysis, the output of trajectory preprocessing can be organized as graphs [34], matrices [35] or tensors [29]. For the points-of-interest recommendation considered in this work, the raw trajectory dataset is transformed into a bipartite graph [33]. A bipartite graph can be represented as a graph G = (U, L, E), containing m = |U| nodes on the left side, n = |L| nodes on the right side and a set of edges E ⊆ U × L between the two sets of nodes. This structure naturally meets the objectives of the points-of-interest recommendation problem, as the left nodes and right nodes can respectively represent the users and the point-of-interest (POI) locations to be recommended. Here, an edge e_ij = (u_i, l_j) ∈ E indicates that user u_i visited the POI location l_j. In addition, we want to capture the frequency of visits between user u_i and location l_j, modeled as the edge weight w_ij of the edge e_ij.

To construct such a user-location bipartite graph from the raw trajectory dataset, we follow a sequence of three steps, as shown in Figure 1. We start from the raw trajectory dataset:

Definition 1 (RAW TRAJECTORY DATASET). The raw trajectory dataset RTD = {(u_i, TD_i) | 1 ≤ i ≤ m} contains the trajectory data TD_i for m users u_i. The trajectory data TD_i = {(x_j, y_j, t_j) | 1 ≤ j ≤ k_i} of user u_i is formed by k_i triples, each consisting of a latitude x_j, a longitude y_j and a timestamp t_j (t_j < t_{j+1}).

In the example shown in Figure 1, the trajectory data of users u_1 to u_6 appear as six dotted lines, as trajectory data is always discretely captured by GPS devices. In the first step, we identify the stop points for all the users. A stop point is a spatial region within which the trajectory data fluctuates within a distance threshold D_t for at least a time threshold T_t.

Definition 2 (STOP POINTS SET). The stop points set SPS = {(u_i, SP_i) | 1 ≤ i ≤ m} contains the stop points information SP_i for the m users u_i. The stop points information SP_i = {sp_j = (x_j, y_j) | 1 ≤ j ≤ p_i} for user u_i is formed by p_i stop points. A stop point sp is detected when a subset of sequential triples {(x_j, y_j, t_j) | a ≤ j ≤ b} of TD_i satisfies √((x_j − x_a)² + (y_j − y_a)²) ≤ D_t for a ≤ j ≤ b, √((x_{b+1} − x_a)² + (y_{b+1} − y_a)²) > D_t, and t_b − t_a ≥ T_t.

In Figure 1, the stop points are identified as triangles. Subsequently, in the second step, these stop points (triangles) are clustered through well-known clustering techniques such as the k-means [15], DBSCAN [14] or OPTICS [1] clustering algorithms. These clustered stop points implicitly recommend the regions covered by the clusters as attractive places, since multiple users in the RTD dataset have historically visited (stopped at) them. As can be seen in Figure 1, the stop points of the six users form four clusters, represented by l_1 to l_4. We denote each cluster as a location, as we need to assign a geographically semantic meaning to the cluster for travel recommendation. In practice, these locations can be represented by the landmarks (e.g. tourist attractions, shopping malls) within the clusters.

Finally, in the third step, we construct the associations between the users and the locations to build the user-location bipartite graph. We connect each stop point with its associated location by an arrow pointing to the location, which denotes that the user of this stop point has visited the location once. Note that a user may have more than one stop point within one cluster, which indicates that this user visited this location multiple times. This information, called the frequency of visit, is important for travel recommendation, since a more frequent visit implicitly represents a stronger recommendation. Therefore, if an edge in the user-location bipartite graph denotes that the user visited the location, we can use the frequency of visit as the weight of the edge to indicate that the user has visited the location multiple times. Based on this, we define the user-location bipartite graph as:

Definition 3 (USER-LOCATION BIPARTITE GRAPH). The user-location bipartite graph ULBG = (U, L, E) consists of the left set of user nodes U = {u_i | 1 ≤ i ≤ m}, the right set of location nodes L = {l_j | 1 ≤ j ≤ n} and the set of visits represented as edges E = {e_ij = (u_i, l_j, w_ij) | 1 ≤ i ≤ m, 1 ≤ j ≤ n} ⊆ U × L, where w_ij is the frequency of visit.

We next introduce the notion of differential privacy and the mechanisms required to achieve differential privacy guarantees in a dataset.

B. Differential privacy

Differential privacy is a classical privacy definition [12] that makes conservative assumptions about the adversary's background knowledge and bounds the allowable error in a quantified manner. In general, differential privacy is designed to protect a single individual's privacy by considering adjacent datasets which differ in only one record. Before presenting the formal definition of ε-differential privacy, we first define the notion of adjacent datasets in the context of differential privacy.
A dataset D can be considered as a multiset of records from the universe U, represented by D ∈ N^|U|, where N stands for the set of non-negative integers and D_i is the number of copies of element i in D. For example, if U = {a, b, c}, then D = {a, b, c} can be represented as (1, 1, 1), as it contains each element of U once. Similarly, D' = {a, c} can be represented as (1, 0, 1), as it does not contain b. Based on this representation, it is appropriate to use the l1 distance (Manhattan distance) to measure the distance between datasets.

Definition 4 (DIFFERENTIAL PRIVACY [12]). A randomized algorithm A guarantees ε-differential privacy if, for all adjacent datasets D1 and D2 differing by at most one record and for all possible results S ⊆ Range(A),

Pr[A(D1) ∈ S] ≤ e^ε · Pr[A(D2) ∈ S]

where the probability space is over the randomness of A.

In other words, the possible results of the randomized algorithm, given a dataset and a query, form a distribution, and differential privacy guarantees that the change of the distribution between two input databases differing in one record is bounded by a threshold. Many randomized algorithms have been proposed to guarantee differential privacy [12], [25]. In our work, we use the most commonly used differential privacy mechanism, namely the Laplace Mechanism [12]. Given a dataset D, a function f and a budget ε, the Laplace Mechanism first calculates the actual f(D) and then perturbs this true answer by adding carefully calibrated noise [12]. The noise is a Laplace random variable with scale λ = Δf/ε, where Δf is the l1 sensitivity, defined as follows.

Definition 5 (l1 SENSITIVITY [13]). Given a function f : N^|U| → R^d, the l1 sensitivity is measured as:

Δf = max_{D1, D2 ∈ N^|U|, ||D1 − D2||_1 ≤ 1} ||f(D1) − f(D2)||_1

where ||f(D1) − f(D2)||_1 is the Manhattan distance between f(D1) and f(D2) and U stands for the record universe.

In other words, the l1 sensitivity measures the maximum impact that can be caused by changing a single record in a dataset. It is only related to the function f itself and is independent of the datasets.

Definition 6 (LAPLACE MECHANISM [12]). Given a function f : N^|U| → R^d, a budget ε and a dataset D, for each output,

A_LM(D, f, ε) = f(D) + Lap(Δf/ε)

where Lap(Δf/ε) is a random variable sampled from the Laplace distribution with mean 0 and scale Δf/ε.

We next discuss the proposed differentially private bipartite graph analysis technique for trajectory analysis that employs the Laplace Mechanism to achieve differential privacy guarantees in the exposed points-of-interest recommendation.

Fig. 2: Differentially private bipartite graph mining

IV. DIFFERENTIALLY PRIVATE TRAJECTORY ANALYSIS

The proposed differentially private trajectory analysis technique analyzes the user-location bipartite graph obtained from the raw trajectories to generate a list of ordered recommended locations (points-of-interest) and another list of ordered recommended users based on their frequency of visits to the locations. We start by presenting the privacy goal, namely what is identified as a 'record' in Definition 4 that needs to be protected by the differential privacy mechanism. We then present the proposed differentially private algorithm for bipartite graph analysis that generates the two recommendation lists from the user-location bipartite graph.

A. Privacy goal

By considering the user-location bipartite graph as a dataset, a 'record' in Definition 4 has the form ⟨u_i, l_j⟩, representing that user u_i visited location l_j once, namely a stop point as defined in Definition 2. Therefore, an edge e_ij = (u_i, l_j, w_ij) in the bipartite graph, which can be considered as a group of w_ij copies of (u_i, l_j, 1), results in w_ij identical ⟨u_i, l_j⟩ records in the dataset. The entire bipartite graph can thus be transformed into a dataset with Σ_{i=1}^{m} Σ_{j=1}^{n} w_ij records. Then, by setting the global l1 sensitivity Δf in Definition 5 to different values, we can protect different levels of differential privacy.
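As an illustrative sketch of Definition 6 (the paper's experiments are implemented in Java; this Python sketch is ours), the Laplace Mechanism can be implemented by exploiting the fact that the difference of two independent exponential draws is Laplace-distributed:

```python
import random

def laplace_noise(scale):
    # Lap(0, scale): the difference of two independent Exponential
    # variables with mean `scale` follows a Laplace distribution.
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def laplace_mechanism(true_answer, sensitivity, epsilon):
    # A_LM(D, f, eps) = f(D) + Lap(sensitivity / eps), per Definition 6.
    return true_answer + laplace_noise(sensitivity / epsilon)
```

For example, with sensitivity Δf = 1 and budget ε = 1, a true count of 10 is released as 10 plus noise of scale 1.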
We define Δf ∈ [1, w_max], where w_max represents the maximum weight among the edges. The proposed differentially private trajectory analysis algorithm with sensitivity Δf ∈ [1, w_max] can thus protect the differential privacy of a group of Δf records. In other words, the differential privacy of any edge e_ij = (u_i, l_j, w_ij) in the bipartite graph with w_ij ≤ Δf can be protected. When Δf = 1, the edges with w_ij = 1, namely the visits that happened only once, can be protected. When Δf = w_max, all the edges in the bipartite graph, corresponding to the association between each pair of user and location, can be protected. To sum up, a larger Δf allows more edges in the bipartite graph to be protected, but lowers the accuracy of the recommendation results due to the higher injected noise. We evaluate varying Δf later in Section V.

B. Differentially private Points-of-Interest Recommendation

Among the recommendation algorithms [11], [20], [26], [35], the one that best fits the bipartite graph structure is the HITS-based algorithm [35]. Hyperlink-Induced Topic Search (HITS) is a link analysis algorithm originally designed for rating web pages [21]. It defines a hub as a web page with many links pointing to other web pages and an authority as a web page pointed to by many other web pages. It assumes that a good hub points to many good authorities and that a good authority is pointed to by many good hubs. In the user-location bipartite graph, edges point from users to locations. Therefore, by considering users and locations as hubs and authorities respectively, we can apply the HITS algorithm to score every user and location. A user with a higher score represents a more experienced user who has more knowledge about the given city, and a location with a higher score indicates a more popular place that is worth visiting. Therefore, with the user-location bipartite graph as input, we expect the algorithm to output the user and location lists ordered by the scores while preserving differential privacy.
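The bookkeeping of the privacy goal in Section IV-A — each edge e_ij expands into w_ij identical ⟨u_i, l_j⟩ records, and Δf may range over [1, w_max] — can be sketched as follows (record_stats is a hypothetical helper, not part of the paper's implementation):

```python
def record_stats(weights):
    # weights: {(user, location): w_ij} for the bipartite graph's edges.
    # The graph induces sum(w_ij) records of the form <u_i, l_j>; the
    # global l1 sensitivity may be chosen anywhere in [1, w_max].
    total_records = sum(weights.values())
    w_max = max(weights.values())
    return total_records, (1, w_max)
```

Choosing Δf = w_max protects every user-location association, while Δf = 1 protects only the weight-1 edges at a much lower noise cost.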
We present the pseudocode of the differentially private mining algorithm in Algorithm 1 and illustrate the process in Figure 2.

Algorithm 1: Differentially private bipartite graph mining

Input: User-location bipartite graph ULBG = (U, L, E), expected privacy level lv ∈ [1, w_max], budget ε.
Output: Recommended authority (locations) list A, recommended hub (users) list H.
1: Initialize matrices M[m = |U|][n = |L|], M̃[m][n], M̂[m][n] with 0s;
2: Transfer ULBG to matrix M by filling M with {w_ij | 1 ≤ i ≤ m, 1 ≤ j ≤ n};
3: Δf ← lv;
4: δ ← Δf/ε;
5: for i = 1 to m do
6:   for j = 1 to n do
7:     M̃[i][j] ← M[i][j] + Lap(δ);
8:   end
9: end
10: M̂ ← Sup(M̃);
11: A ← powerIte(M̂ᵀ · M̂);
12: H ← powerIte(M̂ · M̂ᵀ);

The differentially private mining algorithm consists of four steps, namely matrix construction (lines 1-2), noise addition (lines 3-9), noise suppression (line 10) and HITS (lines 11-12). In the first step, the bipartite graph structure is transformed into an association matrix M. The matrix M has m = |U| rows and n = |L| columns. Each entry in M is denoted by the weight w_ij of the edge between user u_i and location l_j. If user u_i never visited location l_j, w_ij is set to 0.

Then, in the second step, each entry w_ij is perturbed using noise calculated by the Laplace Mechanism [12] to become w̃_ij = w_ij + Lap(Δf/ε), so that M becomes M̃. Each noise term is a random variable sampled from the Laplace distribution with scale sensitivity/budget = Δf/ε. The privacy budget ε is typically considered to be 1 in many differential privacy settings [12], [13]. A smaller ε indicates that the difference between the statistical query replies caused by the change of one record in the dataset is smaller, thus providing better privacy protection. We evaluate the impact of different values of ε in Section V. In our algorithm, all the entries share the same privacy budget ε because of their independence and the composition property of differential privacy:

Theorem 1 (COMPOSITION THEOREM [13]). Let A_i be ε_i-differentially private algorithms applied to independent datasets D_i for i ∈ [1, k]. Then their combination (A_1, ..., A_k) is max(ε_i)-differentially private.

The sensitivity Δf can be selected from the range [1, w_max]. However, we note that the noise can be significantly high without further post-processing. Due to the higher noise, the accuracy of the recommendation can become too low to be acceptable. Therefore, in the third step, we apply consistency constraints to post-process the matrix from M̃ to M̂ to suppress noise and improve the recommendation accuracy.

Theorem 2 (POST-PROCESSING [13]). Let A be an ε-differentially private algorithm and g be an arbitrary function. Then g(A) is also ε-differentially private.

We propose three consistency constraints, named the zero consistency constraint (zero-CC), the up consistency constraint (up-CC) and the down consistency constraint (down-CC). In zero-CC, since all the entries w_ij in M are non-negative, each negative w̃_ij in M̃ after perturbation is constrained to be ŵ_ij = 0 in M̂ to ensure consistency, while each non-negative w̃_ij is kept the same in M̂ as ŵ_ij = w̃_ij.
The up-CC and down-CC follow isotonic regression [18]. Specifically, in up-CC, the entries w_ij in M are sorted from 0 to w_max in non-decreasing order. After adding noise to the non-decreasing list of entries, the perturbed entries w̃_ij should also keep the non-decreasing order to ensure consistency. If one perturbed entry is smaller than the one before it in the list, this perturbed entry is adjusted to be equal to the one before it, thus making the list non-decreasing. After adjusting all w̃_ij to ŵ_ij, the non-decreasing list is transferred back to the matrix, namely M̂. The down-CC is similar to up-CC, but the list is constrained to follow a non-increasing order from w_max to 0.

Finally, after generating M̂, we apply the HITS algorithm to calculate the recommended authority (locations) list A and the recommended hub (users) list H. Precisely, we initialize A and H to be vectors of 1s of size n and m respectively. Then, by applying the power iteration method [35], we obtain the eigenvectors of M̂ᵀ·M̂ and M̂·M̂ᵀ as the final A and H respectively. We refer the interested readers to [21] for more details on HITS.

V. EXPERIMENTAL EVALUATION

In this section, we experimentally evaluate the performance offered by the proposed differentially private trajectory analysis algorithm. Before reporting our results, we first present our experimental setup.

A. Experimental setup

Our experiments were programmed in Java and run on an Intel Core i7 2.70GHz PC with 16GB RAM. In our experiments, we use the Geolife GPS trajectory dataset [34], [35], [36], which contains 17621 trajectories collected from 182 users over five years. We first follow the user-location bipartite graph construction scheme shown in Figure 1 to process the raw trajectory dataset and generate the user-location bipartite graph.

Fig. 3: Edge weight distribution
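The core of Algorithm 1 — noise addition, suppression and HITS by power iteration — can be sketched in pure Python as follows. This is an illustrative sketch of ours (the paper's implementation is in Java), using zero-CC as the suppression function Sup:

```python
import random

def dp_hits(M, sensitivity, epsilon, iters=50):
    # M: m x n association matrix of visit frequencies w_ij.
    # Returns (authority scores for the n locations, hub scores for
    # the m users), computed on the perturbed matrix.
    lap = lambda b: random.expovariate(1 / b) - random.expovariate(1 / b)
    scale = sensitivity / epsilon
    m, n = len(M), len(M[0])
    # Noise addition: perturb each entry with Lap(sensitivity/epsilon);
    # suppression (zero-CC): clamp negative entries to 0, which stays
    # differentially private by the post-processing theorem.
    Mp = [[max(0.0, M[i][j] + lap(scale)) for j in range(n)] for i in range(m)]
    a, h = [1.0] * n, [1.0] * m
    # HITS power iteration: a = M^T h, h = M a, normalized each round.
    for _ in range(iters):
        a = [sum(Mp[i][j] * h[i] for i in range(m)) for j in range(n)]
        h = [sum(Mp[i][j] * a[j] for j in range(n)) for i in range(m)]
        za, zh = sum(a) or 1.0, sum(h) or 1.0
        a = [x / za for x in a]
        h = [x / zh for x in h]
    return a, h
```

Sorting locations by a in descending order yields the recommended authority list A; sorting users by h yields the hub list H.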
The sizes of the node sets, edge set and stop point set are shown in Table I (some users are abandoned during clustering). Each edge in the user-location bipartite graph is assigned a weight w representing the number of visits. The distribution of weights among the 316 edges is shown in Figure 3 (we show only the part for w ≤ 100) and the statistics of the weights are shown in Table II. As can be seen, most edges have small weights, indicating that users have many rarely visited locations. This supports the intuition behind our privacy goal: we can inject little noise to protect most of the edges in the bipartite graph. Typically, these protected edges have lower weights but higher sensitivity.
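The up-CC adjustment of Section IV-B — raising any perturbed entry that falls below its predecessor in the sorted list — reduces to a running maximum; a minimal sketch (up_cc is a hypothetical helper, assuming the true entries were sorted non-decreasingly before noise was added):

```python
def up_cc(noisy_sorted):
    # noisy_sorted: perturbed entries whose true values were sorted in
    # non-decreasing order. Any entry smaller than its predecessor is
    # raised to equal it, restoring the non-decreasing order.
    out, prev = [], float('-inf')
    for v in noisy_sorted:
        prev = max(prev, v)
        out.append(prev)
    return out
```

down-CC is symmetric, using a running minimum over the non-increasing list.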

Fig. 4: Recommended authority list A with varying privacy budget ε: (a) ε = 1, (b) ε = 0.7, (c) ε = 0.4, (d) ε = 0.1

Fig. 5: Recommended hub list H with varying privacy budget ε: (a) ε = 1, (b) ε = 0.7, (c) ε = 0.4, (d) ε = 0.1

TABLE I: No. of users, locations, stop points and edges
Users: 143; Locations: 44; Stop points: 8017; Edges: 316

TABLE II: w statistics
First quartile: 2; Median: 5; Third quartile: 18; Max: 591

B. Experimental results

The goal of our experiments is to evaluate the performance of the proposed differentially private user-location bipartite graph analysis algorithm under various privacy budgets and sensitivity values. That is, we evaluate the accuracy of the recommendation results while simultaneously meeting the differential privacy guarantees. We define the top-k match rate to measure the recommendation results. If we denote the recommended authority (location) list as A = {a_1, a_2, ..., a_n} and the recommended hub (user) list as H = {h_1, h_2, ..., h_m}, where elements with a smaller index have a higher score, and represent the top-k elements of the lists as A_k = {a_1, a_2, ..., a_k} and H_k = {h_1, h_2, ..., h_k} respectively, the top-k match rate for A is MR_k(A) = |org(A_k) ∩ noised(A_k)| / |org(A_k)| and the top-k match rate for H is MR_k(H) = |org(H_k) ∩ noised(H_k)| / |org(H_k)|, where org() denotes the original lists without the noise injected to protect differential privacy and noised() stands for the differentially private lists with noise. In other words, for a query like 'show me the top-5 recommended locations in this city', we expect the replied 5 locations not to change after adding noise. Our experiments consist of three components. The noise calibrated by the Laplace mechanism Lap(Δf/ε) is affected by two parameters, the privacy budget ε and the global sensitivity Δf. Therefore, in the first part, we adjust the privacy budget ε to observe the change of MR_k(A) and MR_k(H).
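The top-k match rate defined above can be computed directly; a minimal sketch:

```python
def top_k_match_rate(org_list, noised_list, k):
    # MR_k = |org(top-k) ∩ noised(top-k)| / |org(top-k)|, comparing the
    # noise-free ranking with the differentially private one.
    org_k, noised_k = set(org_list[:k]), set(noised_list[:k])
    return len(org_k & noised_k) / len(org_k)
```

Note that the metric is order-insensitive within the top k: swapping the first two recommendations leaves MR_2 at 1.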
Then, in the second part, we adjust the global sensitivity Δf to evaluate the algorithm performance. Finally, to evaluate the scalability of the algorithm, we reduce the raw dataset scale by using only the first 100 days of trajectory data and only the first 90 users' trajectory data respectively. In all the experiments, we evaluate the three consistency constraint schemes, denoted by 'zero-CC', 'up-CC' and 'down-CC'. We repeat each experiment 1000 times and report the average.

In the first set of experiments, we change the value of the privacy budget ε from 1 to 0.7, 0.4 and 0.1 and show the results with varying k for MR_k(A) and MR_k(H) in Figure 4 and Figure 5 respectively. In differential privacy, a smaller privacy budget means the difference between query replies from two datasets differing in at most one record is smaller, which implies a higher privacy requirement and requires more noise to guarantee better privacy protection. For this part, we fix the global sensitivity to 1 to protect the differential privacy of individual stop points, i.e., the 76 weight-1 edges. When ε = 1, Figure 4(a) and Figure 5(a) show that for most values of k, MR_k(A) and MR_k(H) for all three consistency constraint schemes are larger than 80%. If we keep choosing the consistency constraint scheme giving the best results, the match rates can even be larger than 90%. As ε = 1 is typical in most differential privacy settings [12], [13], the 90% accuracy shows that our differentially private bipartite graph analysis algorithm for travel recommendation is practical and effective. In Figure 4(a), we can see that zero-CC provides the best match rate when k ≤ 20, which is defeated by down-CC
