DeTor: Provably Avoiding Geographic Regions In Tor


DeTor: Provably Avoiding GeographicRegions in TorZhihao Li, Stephen Herwig, and Dave Levin, University of curity17/technical-sessions/presentation/liThis paper is included in the Proceedings of the26th USENIX Security SymposiumAugust 16–18, 2017 Vancouver, BC, CanadaISBN 978-1-931971-40-9Open access to the Proceedings of the26th USENIX Security Symposiumis sponsored by USENIX

DeTor: Provably Avoiding Geographic Regions in TorZhihao Li, Stephen Herwig, and Dave LevinUniversity of MarylandAbstractLarge, routing-capable adversaries such as nationstates have the ability to censor and launch powerfuldeanonymization attacks against Tor circuits that traversetheir borders. Tor allows users to specify a set of countries to exclude from circuit selection, but this providesmerely the illusion of control, as it does not precludethose countries from being on the path between nodesin a circuit. For instance, we find that circuits excludingUS Tor nodes definitively avoid the US 12% of the time.This paper presents DeTor, a set of techniques forproving when a Tor circuit has avoided user-specified geographic regions. DeTor extends recent work on usingspeed-of-light constraints to prove that a round-trip ofcommunication physically could not have traversed certain geographic regions. As such, DeTor does not requiremodifications to the Tor protocol, nor does it require amap of the Internet’s topology. We show how DeTor canbe used to avoid censors (by never transiting the censor once) and to avoid timing-based deanonymizationattacks (by never transiting a geographic region twice).We analyze DeTor’s success at finding avoidance circuitsthrough simulation using real latencies from Tor.1IntroductionTor [8] has proven to be an effective tool at providinganonymous communication and combating online censorship. Over time, Tor’s threat model has had to adapt toaccount for powerful nation-states who are capable of influencing routes into and out of their borders—so-calledrouting-capable adversaries [34].We consider two key threats that the presence ofrouting-capable adversaries now makes a practical reality. First, routing-capable adversaries can (and regularlydo) censor Tor traffic. While it is well-known that somecountries block Tor traffic beginning or ending withinborders, recent studies have shown that some also blockany Tor traffic that happens to transit through their borders [4]. Second, routing-capable adversaries can launchdeanonymization attacks against Tor. If an adversarialnetwork is on the path of the circuit between source andentry, and between exit and destination, then it can introduce small, detectable jitter between packets to correlatethe two connections and therefore uncover the source andUSENIX Associationdestination [19].In light of increasingly powerful attacks like these, Torhas added the ability for users to specify a set of countries to exclude when selecting circuits. However, aswe will demonstrate, this offers users only the illusionof control over where their traffic does not go. Amongthe circuits that Tor uses to ostensibly ignore the US, wecould identify only 12% of them as definitively avoidingthe US. Alternative schemes have been proposed thatinvolve using traceroute to construct a map of the Internet’s topology. However, routing-capable adversariescan easily (and regularly do [35]) provide incompleteresponses to traceroute, precluding provable securityfrom mapping-based approaches.In this paper, we present a set of techniques that canprove that a circuit has avoided a geographic region. Oneof the most powerful features of these techniques is howlittle they require compared to many prior approaches:they do not require modifying the hardware [3] or routing protocols [30] of the Internet, nor do they requirea map of the Internet’s routing topology [12]. Instead,our work extends recent work on “provable avoidancerouting” [24] that uses geographic distances and speedof-light constraints to prove where packets physicallycould not have traversed. Users can specify arbitrary geographic regions (our techniques do not rely on any notion of network topology or ownership), and we returnper-packet proofs of avoidance, when available.We construct avoidance in Tor in two applications:Never-once proves that packets forwarded along a circuit never traversed a given geographic region, evenonce. With this, users can avoid website fingerprintingattacks [23] and censoring regimes [4].Never-twice proves that packets forwarded along a circuit do not reveal more information to a geographicallyconstrained adversary than is strictly necessary by ensuring that they do not appear on two non-contiguous legsof the Tor circuit. With this, users can prevent certaindeanonymization attacks [2, 17, 29, 10, 15].In sum, this paper makes the following contributions: We introduce the notion of Tor circuits that provablyavoid regions of the world. Unlike prior approaches,our proofs do not depend on any model of network or26th USENIX Security Symposium343

AS-level topologies, and are instead based on roundtrip time measurements. Therefore, they are easy tocollect, do not require modifications to Tor, and donot depend on Internet measurements that are manipulable by a powerful adversary. We present the design, analysis, and evaluation of twonovel forms of avoidance: never-once to avoid censorsand website fingerprinting attacks, and never-twice toavoid various traffic deanonymization attacks. We build these techniques in a system we call DeTor,and evaluate it using real Tor latencies collected by theTing measurement tool [6]. We show that provable,never-once avoidance is possible even when avoiding routing-central countries like the US, and thatprovable never-twice avoidance works for 98.6% ofsource-destination pairs not in the same country.Collectively, our results show that, with client-sidetechniques alone, it is possible to achieve greater control over where Tor data does not go. We believe this tobe a powerful building block in future defenses.the same subnet are chosen to be in the same circuit,and (3) no nodes are chosen from a user-specified listof countries to ignore.Circuit construction is done in such a way that the onlyhost who knows all hops on the circuit is the source: eachother host knows only the hop immediately precedingand succeeding it. By the end of the circuit construction protocol, the source has established a pairwise secret(symmetric) key with each hop on the circuit.The salient feature of Tor is the manner in which itperforms “onion routing.” When sending a packet p tothe destination, the source encrypts p with the symmetric key it shares with the exit node; it then encrypts thisciphertext with the key shared with the middle node; andin turn encrypts this doubly-encrypted ciphertext withthe key shared with the entry node. Each hop on thecircuit “peels off” its layer of encryption, thereby ensuring that anyone overhearing communication betweenany two consecutive Tor routers learns nothing about theother Tor routers on the circuit.2.22Background and Related WorkIn this section, we describe some of the attacks that arepossible against Tor from a powerful routing-capable adversary. We also discuss prior work that has sought tomitigate these attacks. First, we begin by reviewing therelevant details of the Tor protocol.2.1A Brief Review of TorTor [8] is a peer-to-peer overlay routing system thatachieves a particular type of anonymity known as unlinkable communication. A source-destination pair is unlinkable if no one other than the two endpoints can identifyboth the source and destination. That is, an observer mayknow the source (or destination) is communicating withsomeone, but cannot identify with whom.Tor achieves unlinkable communication by routingtraffic through a circuit: a sequence of overlay hosts.There are typically three hosts in a circuit: an entry node1(who communicates with the source), a middle node, andan exit node (who communicates with the destination).The source node is responsible for choosing which Torrouters to include in a circuit, and for constructing thecircuit. Tor’s default circuit selection algorithm choosesnodes almost uniformly at random to be in a circuit, withthree notable exceptions: (1) nodes with greater bandwidth are chosen more frequently, (2) no two nodes from1 Alternatively, clients can make use of so-called bridge nodes,which are in essence non-public entry nodes. Because they serve thesame purpose as traditional entry nodes, they pose no difference inDeTor, and so we refer to them collectively as “entry nodes.”34426th USENIX Security SymposiumThreat ModelWe assume a powerful routing-capable adversary [34],e.g., a nation-state. Such an attacker has the abilityto make (potentially false) routing advertisements andcan therefore attract routes to its administrative domain.Thus, routing-capable adversaries are able to insert themselves onto the path between two communicating endpoints. Once on the path, they can launch various manin-the-middle attacks by injecting, dropping, delaying, orreordering packets.Routing-capable adversaries can also mislead or obfuscate attempts to map their networks. For example,one common approach for mapping a network is touse traceroute, but even benign networks sometimesrefuse to respond to ICMP packets, tunnel their packetsthrough their internal network, or simply do not decrement TTLs. These efforts effectively hide routers froma traceroute measurement, and could allow a nationstate adversary to hide its presence on a path. It is because of these kinds of attacks that we choose not to employ traceroute-based measurements in our system.Because we are mainly focused on nation-state adversaries, we assume that the attacker can be geographicallylocalized. For example, to avoid the United States, weassume that a user can download the geographic information (GPS coordinates) that succinctly describe wherethe US is (including its territories, such as Guam) andthat these constitute all of the locations from which thecountry could launch attacks. This was the same assumption made by Levin et al. [24]. In practice, it may be possible that an adversary could infiltrate other countries’networks, but there are many instances where a nation-USENIX Association

state deploys its censorship mechanisms within its borders [7, 13].This attack model extends naturally to colluding countries, such as the Five Eyes: one can simply considerthem as one large “nation-state” that is constrained toits (potentially noncontiguous) borders. As we willdemonstrate, because our techniques apply to noncontiguous geographic regions, they are not restricted to single nation-states, and can be applied to arbitrary sets ofcountries.The attacker can also run its own Tor routers or collude with some Tor routers, but, as per the previous assumption, only within its own (or its fellow colluders’)borders.Finally, we make several assumptions about what anattacker cannot do. We assume the attacker cannot violate standard cryptographic assumptions, particularlythat it cannot invert cryptographically secure hash functions, infer others’ private keys, or forge digital signatures or MACs. Also, we note that, while an attackercan lie about having larger latencies (by delaying its ownpackets), it is unable to lie about having lower latenciesthan its links actually permit.2.3AttacksThis paper considers three very powerful attacks that areat the disposal of a routing-capable adversary. We reviewthe attacks here, and then describe how prior work hassought to mitigate them.Censorship A routing-capable adversary can censor traffic that enters its borders. Commonly, with Tor, this involves identifying the set of active Tor routers and simplydropping traffic to or from these hosts. The Tor Metricsproject monitors several countries who appear to performthis kind of censorship [37].Traffic deanonymization Consider an attacker who isable to observe the traffic on a circuit between the sourceand the entry node and between the exit node and the destination. The attacker can correlate these two seeminglyindependent flows of traffic in a handful of ways. Forinstance, a routing-capable adversary operating a routeron the path between source and entry could introduce jitter between packets that it could detect in the packetsbetween exit and destination. This works because Torrouters do not delay or shuffle their packets, but rathersend them immediately in order to provide low latencies [6].Website fingerprinting Even an attacker limited to observing only the traffic between the source and entrynode can be capable of deanonymizing traffic. In particular, if the destination’s traffic patterns (e.g., the number of bytes transferred in response to apparent requests)USENIX Associationare well-known and unique, then an attacker may be ableto infer the destination by observing the traffic on anyleg of the circuit [23]. Such attacks run into challengeswhen there is sufficient cover traffic, but unfortunatelyTor users have little control over how much cover trafficthere is.2.4Related WorkSneaking through censors The traditional way ofmitigating censoring nation-states is to sneak throughthem by making would-be-censored traffic look benign.For example, decoy routing [21, 41] uses participatingrouters that are outside the censoring regime but on a benign path to effectively hijack traffic and redirect it toa destination that would be censored. To the censoringregime, the traffic appears to be going to a destination itpermits.Other approaches employ protocol obfuscation techniques to make one protocol look like another. A slewof systems [26, 40, 27, 38, 39, 18] has explored makingTor traffic appear to be other, innocuous traffic, notablySkype (many censors permit video chat applications, soas to allow their citizens to keep in touch with friendsand family abroad). We seek an altogether different approach of avoiding these nefarious regions altogether,rather than trying to sneak through them. However, theseare somewhat orthogonal approaches, and may be complementary in practice.AS-aware Tor variants The work perhaps closest toours in terms of overall goals is a series of systemsthat try to avoid traversing particular networks once (ortwice). To the best of our knowledge, these have focusedalmost exclusively on using autonomous system (AS)level maps of the Internet [2, 19, 10]. Like DeTor, theidea is that if we can reason about and enforce where ourpackets may (or may not) go between hops in the circuit,we can address attacks such as censorship and certainforms of traffic deanonymization.As described in §2.2, however, we assume in this paper an adversary who has the ability to manipulate an observer’s map of the Internet, for instance by withholdingsome routing advertisements, withholding tracerouteresponses, and so on. Instead of relying on these manipulable data sources, we base our proofs on physical, natural limitations: the fact that information cannot travelfaster than the speed of light. As an additional departure from this line of work, we focus predominately onnation-state adversaries, which are easier to locate andgeographically reason about than networks (which mayhave points of presence throughout the world).DeTor’s proofs of avoidance come at a cost that systems that do not offer provable security do not have to26th USENIX Security Symposium345

pay. DeTor discards any circuit for which it cannot obtain proof of avoidance, but it is possible that there arecircuits that achieve avoidance that do not meet the requirements of the proof. As a result, DeTor clients havefewer circuits to choose from than more permissive systems that rely only on AS maps, potentially leading togreater load imbalance in DeTor. We believe this to bea fundamental cost of security with provable properties;however, we also believe that future work can reduceDeTor’s “false negatives.”Avoidance routing Recent work has proposed not tosneak through attackers’ networks, but to avoid them altogether. Edmundson et al. [11] propose to use maps ofthe Internet’s routing topology to infer through whichcountries packets traverse, and to proxy traffic throughthose who appear to avoid potential attackers. However,as with AS-aware Tor variants, this approach relies ondata that can be significantly manipulated by the kind ofpowerful routing-capable adversaries that we consider.In this paper, we seek techniques that yield provable security, even in the presence of such adversaries.Alibi Routing [24] uses round-trip time measurementsand speed of light constraints to provably avoid userspecified, “forbidden” geographic regions. The AlibiRouting protocol only uses a single relay; we generalize this approach to apply to Tor’s three-hop circuits.Moreover, our never-twice application avoids doublytraversing any region of the world, and does not requirean a priori definition of a forbidden region. We reviewAlibi Routing next.3Background: Alibi RoutingWe build upon techniques introduced in Alibi Routing [24] to achieve provable avoidance in Tor. In thissection, we briefly review how Alibi Routing achievesits proofs of avoidance, and we outline the challenges weaddress in translating it to Tor.3.1(1)When this inequality holds, it means that the RTT fors forwarding packets through a to t is significantly lessthan the smallest round-trip time that would also includea host in the forbidden region. In other words, if s canverify that its traffic is going through a and t, then thetraffic could not also go through F without inducing anoticeable increase in end-to-end round-trip time. Withsuch a relay, s can prove that his packets avoided F withtwo pieces of evidence:1. A MAC (or digital signature) from a attesting to thefact that it did indeed forward the packet, and2. A measured end-to-end round-trip time that satisfiesEq. (1).These two pieces of evidence form an “alibi”: the packetswent through a and could not also have gone through F,therefore it avoided F. As a result, Levin et al. [24] referto a relay a who provides such a proof an alibi peer.These proofs of avoidance must be obtained for eachround-trip of communication. The factor of δ acts as anadditional buffer against variable latencies. The larger δis, the fewer potential alibis there will be, but they willbe able to provide proofs of avoidance even when packetssuffer an uncharacteristically high delay, for instance dueto congestion. DeTor makes use of δ in the same manner.One technical detail in Alibi Routing’s proof thatwe will make use of is the process of computingmin f F [R(s, f ) R( f , a)] and min f F [R(a, f ) R( f ,t)].After all, how can one compute the shortest RTT througha host in an untrusted region of the world? The insight isthat, if we know the geographic locations of the hosts inquestion, then we can obtain a lower bound on the roundtrip time between them. In particular, if D(x, y) denotesthe great-circle distance between hosts at locations x andy, then we have the following bound:Proofs of AvoidanceAlibi Routing [24] is a system that provides proof thatpackets have avoided a user-specified geographic region.Specifically, a source node s specifies both a destinationt and a forbidden region F. Node s trusts all nodes thatare provably outside F (we return to this point at the endof this subsection). Alibi Routing then seeks to identifya relay a that is not in F and that satisfies the followingproperty. Let R(x, y) denote the round-trip time (RTT)between hosts x and y, and let Re2e denote the end-to-endRTT that s observes; then for a user-configurable δ 0:346(1 δ ) · Re2e (min f F [R(s, f ) R( f , a)] R(a,t)minR(s, a) min f F [R(a, f ) R( f ,t)]26th USENIX Security Symposiummin [R(s, f ) R( f , a)] R(a,t) f F (2)3· min [D(s, f ) D( f , a)] D(a,t)f Fcwhere c denotes the speed of light. In general, information cann

Regions in Tor Zhihao Li, Stephen Herwig, and Dave Levin, . anonymous communication and combating online cen-sorship. Over time, Tor’s threat model has had to adapt to . that it cannot invert cryptographically secure hash func-tions, infer others’ private keys, or forge digital s