SoK: Secure Messaging - Massachusetts Institute Of Technology


SoK: Secure Messaging

Nik Unger*, Sergej Dechand†, Joseph Bonneau‡§, Sascha Fahl¶, Henning Perl¶, Ian Goldberg*, Matthew Smith†
* University of Waterloo, † University of Bonn, ‡ Stanford University, § Electronic Frontier Foundation, ¶ Fraunhofer FKIE

This is an extended version of our paper in the 2015 IEEE Symposium on Security and Privacy [1]. This document was last updated on 2015-04-14.

Abstract—Motivated by recent revelations of widespread state surveillance of personal communication, many products now claim to offer secure and private messaging. This includes both a large number of new projects and many widely adopted tools that have added security features. The intense pressure in the past two years to deliver solutions quickly has resulted in varying threat models, incomplete objectives, dubious security claims, and a lack of broad perspective on the existing cryptographic literature on secure communication.

In this paper, we evaluate and systematize current secure messaging solutions and propose an evaluation framework for their security, usability, and ease-of-adoption properties. We consider solutions from academia, but also identify innovative and promising approaches used “in the wild” that are not considered by the academic literature. We identify three key challenges and map the design landscape for each: trust establishment, conversation security, and transport privacy. Trust establishment approaches offering strong security and privacy features perform poorly from a usability and adoption perspective, whereas some hybrid approaches that have not been well studied in the academic literature might provide better trade-offs in practice. In contrast, once trust is established, conversation security can be achieved without any user involvement in most two-party conversations, though conversations between larger groups still lack a good solution. Finally, transport privacy appears to be the most difficult problem to solve without paying significant performance penalties.

I. INTRODUCTION

Most popular messaging tools used on the Internet do not offer end-to-end security. Even though protocols such as OpenPGP and S/MIME have been available for decades, they have failed to achieve widespread adoption and have been plagued by usability issues [2]–[5]. However, recent revelations about mass surveillance by intelligence services have highlighted the lack of security and privacy in messaging tools and spurred demand for better solutions. A recent Pew Research poll found that 80% of Americans are now concerned about government monitoring of their electronic communications. A combined 68% of respondents reported feeling “not very secure” or “not at all secure” when using online chat and 57% felt similarly insecure using email [6]. Consequently, many new applications claiming to offer secure communication are being developed and adopted by end users.

Despite the publication of a large number of secure messaging protocols in the academic literature, tools are being released with new designs that fail to draw upon this knowledge, repeat known design mistakes, or use cryptography in insecure ways. However, as will become clear over the course of this paper, the academic research community is also failing to learn some lessons from tools in the wild.

Furthermore, there is a lack of coherent vision for the future of secure messaging. Most solutions focus on specific issues and have different goals and threat models. This is compounded by differing security vocabularies and the absence of a unified evaluation of prior work. Outside of academia, many products mislead users by advertising with grandiose claims of “military grade encryption” or by promising impossible features such as self-destructing messages [7]–[10]. The recent EFF Secure Messaging Scorecard evaluated tools for basic indicators of security and project health [11] and found many purportedly “secure” tools do not even attempt end-to-end encryption.

We are motivated to systematize knowledge on secure messaging due to the lack of a clear winner in the race for widespread deployment and the persistence of many lingering unsolved research problems. Our primary goal is to identify where problems lie and create a guide for the research community to help move forward on this important topic. A further goal in this work is to establish evaluation criteria for measuring security features of messaging systems, as well as their usability and adoption implications. We aim to provide a broad perspective on secure messaging and its challenges, as well as a comparative evaluation of existing approaches, in order to provide context that informs future efforts. Our primary contributions are: (1) establishing a set of common security and privacy feature definitions for secure messaging; (2) systematization of secure messaging approaches based both on academic work and “in-the-wild” projects; (3) comparative evaluation of these approaches; and (4) identification and discussion of current research challenges, indicating future research directions.

After defining terminology in Section II, we present our systematization methodology in Section III. In subsequent sections (Sections IV–VI), we evaluate each of the proposed problem areas (namely trust establishment, conversation security and transport privacy) in secure messaging. Our findings are discussed and concluded in Section VII.

II. BACKGROUND AND DEFINITIONS

Secure messaging systems vary widely in their goals and corresponding design decisions. Additionally, their target audiences often influence how they are defined. In this section, we

define terminology to differentiate these designs and provide a foundation for our discussion of secure messaging.

A. Types of specification

Secure messaging systems can be specified at three different broad levels of abstraction:

Chat protocols: At the most abstract level, chat protocols can be defined as sequences of values exchanged between participants. This mode of specification deals with high-level data flows and often omits details as significant as the choice of cryptographic protocols (e.g., key exchanges) to use. Academic publications typically specify protocols this way.

Wire protocols: Complete wire protocols aim to specify a binary-level representation of message formats. A wire protocol should be complete enough that multiple parties can implement it separately and interoperate successfully. Often these are specific enough that they have versions to ensure compatibility as changes are made. Implicitly, a wire protocol implements some higher-level chat protocol, though extracting it may be non-trivial.

Tools: Tools are concrete software implementations that can be used for secure messaging. Implicitly, a tool contains a wire protocol, though it may be difficult and error-prone to derive it, even from an open-source tool.

B. Synchronicity

A chat protocol can be synchronous or asynchronous. Synchronous protocols require all participants to be online and connected at the same time in order for messages to be transmitted. Systems with a peer-to-peer architecture, where the sender directly connects to the recipient for message transmission, are examples of synchronous protocols. Asynchronous protocols, such as SMS (text messaging) or email, do not require participants to be online when messages are sent, utilizing a third party to cache messages for later delivery.

Due to social and technical constraints, such as switched-off devices, limited reception, and limited battery life, synchronous protocols are not feasible for many users. Mobile environments are also particularly prone to various transmission errors and network interruptions that preclude the use of synchronous protocols. Most popular instant messaging (IM) solutions today provide asynchronicity in these environments by using a store-and-forward model: a central server is used to buffer messages when the recipient is offline. Secure messaging protocols designed for these environments need to consider, and possibly extend, this store-and-forward model.

C. Deniability

Deniability, also called repudiability, is a common goal for secure messaging systems. Consider a scenario where Bob accuses Alice of sending a specific message. Justin, a judge, must decide whether or not he believes that Alice actually did so. If Bob can provide evidence that Alice sent that message, such as a valid cryptographic signature of the message under Alice’s long-term key, then we say that the action is non-repudiable. Otherwise, the action is repudiable or deniable.

We can distinguish between message repudiation, in which Alice denies sending a specific message, and participation repudiation in which Alice denies communicating with Bob at all. The high-level goal of repudiable messaging systems is to achieve deniability similar to real-world conversations.

A fundamental problem of deniability is that Justin may simply trust Bob even with no technical evidence due to Bob’s reputation or perceived indifference. In a group chat, this problem may be even worse as Alice may need to convince Justin that a number of accusers are all colluding to frame her. It is not possible to construct a messaging system that overcomes this fundamental social problem; the best that can be done is to provide no stronger evidence than the word of the accusers. Some technical systems clearly offer more evidence; for example, signed PGP emails offer strong evidence that Alice really was the sender.

The cryptographic literature has produced many definitions of “deniability” since deniable encryption was first formally proposed [12]. For example, we can draw a distinction between an offline and online judge: in the offline case, the accuser attempts to convince the judge of an event after the conversation has already concluded; in the online case, the judge exchanges private communications with the accuser while the conversation is still taking place. Existing work defines online repudiation in incompatible ways, and very few protocols attempt to achieve meaningful online repudiation [13], [14]. Thus, in this work we only consider the offline setting.

D. Forward/Backward Secrecy

In systems that use the same static keys for all messages, a key compromise allows an attacker to decrypt the entire message exchange. A protocol provides forward secrecy if the compromise of a long-term key does not allow ciphertexts encrypted with previous session keys to be decrypted (Figure 1a). If the compromise of a long-term key does not allow subsequent ciphertexts to be decrypted by passive attackers, then the protocol is said to have backward secrecy (Figure 1b). However, tools with backward secrecy are still vulnerable to active attackers that have compromised long-term keys. In this context, the “self-healing” aspect of backward secrecy has also been called future secrecy. The terms are controversial and vague in the literature [15]–[17].

Fig. 1. Session keys are protected from long-term key compromise. Panels: (a) Forward Secrecy, (b) Backward Secrecy. Each panel is a timeline t marking the point of compromise, the secure region, and the vulnerable window.
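The forward and backward secrecy definitions in this section can be made concrete with a small sketch. The code below is not taken from the paper: it is a minimal symmetric key ratchet, assuming HMAC-SHA256 as the one-way function, and the names `ratchet_step` and `advance` as well as the labels `b"message-key"` and `b"chain-key"` are hypothetical. It illustrates the one-way derivation that forward secrecy typically relies on once old keys are erased.

```python
import hashlib
import hmac
from typing import List, Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Derive a per-message key and the next chain key from the current one.

    Both outputs come from a one-way function (HMAC-SHA256), so earlier
    chain keys and message keys cannot be computed from the new chain key.
    """
    message_key = hmac.new(chain_key, b"message-key", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain-key", hashlib.sha256).digest()
    return message_key, next_chain_key


def advance(chain_key: bytes, steps: int) -> Tuple[List[bytes], bytes]:
    """Run the ratchet `steps` times.

    Returns the message keys produced and the final chain key. A real
    client would erase each old chain key as soon as it has been advanced.
    """
    message_keys = []
    for _ in range(steps):
        message_key, chain_key = ratchet_step(chain_key)
        message_keys.append(message_key)
    return message_keys, chain_key


if __name__ == "__main__":
    # Illustrative all-zero starting key; a real session key would come
    # from an authenticated key exchange.
    keys, state = advance(bytes(32), 3)
    print([k.hex()[:8] for k in keys])
```

In this sketch, an attacker who obtains the chain key after three steps cannot recover the three earlier message keys, but can call `advance` to derive every later one. The construction alone therefore gives no backward secrecy; protocols that aim for it would need to mix fresh key-exchange output into the chain.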

III. SYSTEMATIZATION METHODOLOGY

Over the years, hundreds of secure messaging systems have been proposed and developed in both academia and industry. An exhaustive analysis of all solutions is both infeasible and undesirable. Instead, we extract recurring secure messaging techniques from the literature and publicly available messaging tools, focusing on systematization and evaluation of the underlying concepts and the desirable secure messaging properties. In this section, we explain our precise methodology.

A. Problem Areas

While most secure messaging solutions try to deal with all possible security aspects, in our systematization, we divide secure messaging into three nearly orthogonal problem areas addressed in dedicated sections: the trust establishment problem (Section IV), ensuring the distribution of cryptographic long-term keys and proof of association with the owning entity; the conversation security problem (Section V), ensuring the protection of exchanged messages during conversations; and the transport privacy problem (Section VI), hiding the communication metadata.

While any concrete tool must decide on an approach for each problem area, abstractly defined protocols may only address some of them. Additionally, the distinction between these three problem areas is sometimes blurred since techniques used by secure messaging systems may be part of their approach for multiple problem areas.

B. Threat Model

When evaluating the security and privacy properties in secure messaging, we must consider a variety of adversaries. Our threat model includes the following attackers:

Local Adversary (active/passive): An attacker controlling local networks (e.g., owners of open wireless access points).

Global Adversary (active/passive): An attacker controlling large segments of the Internet, such as powerful nation states or large Internet service providers.

Service providers: For messaging systems that require centralized infrastructure (e.g., public-key directories), the service operators should be considered as potential adversaries.

Note that our adversary classes are not necessarily exclusive. In some cases, adversaries of different types might collude. We also assume that all adversaries are participants in the messaging system, allowing them to start conversations, send messages, or perform other normal participant actions. We assume that the endpoints in a secure messaging system are secure (i.e., malware and hardware attacks are out of scope).

C. Systematization Structure

Sections IV–VI evaluate trust establishment, conversation security, and transport privacy approaches, respectively. For each problem area, we identify desirable properties divided into three main groups: security and privacy features, usability features, and adoption considerations. Each section starts by defining these properties, followed by the extraction of generic approaches used to address the problem area from existing secure messaging systems. Each section then defines and evaluates these approaches, as well as several possible variations, in terms of the already-defined properties. Concrete examples of protocols or tools making use of each approach are given whenever possible. The sections then conclude by discussing the implications of these evaluations.

In each section, we include a table (Tables I, II, and III) visualizing our evaluation of approaches within that problem area. Columns in the tables represent the identified properties, while rows represent the approaches. Groups of rows begin with a generic concept, specified as a combination of cryptographic protocols, followed by extension rows that add or modify components of the base concept. Whenever possible, rows include the name of a representative protocol or tool that uses the combination of concepts. Representatives may not achieve all of the features that are possible using the approach; they are merely included to indicate where approaches are used in practice. Each row is rated as providing or not providing the desired properties. In some cases, a row might only partially provide a property, which is explained in the associated description.

For each problem area, we identify desirable properties in three main categories:

1) Security and Privacy Properties: Most secure messaging systems are designed using standard cryptographic primitives such as hash functions, symmetric encryption ciphers, and digital signature schemes. When evaluating the security and privacy features of a scheme, we assume cryptographic primitives are securely chosen and correctly implemented. We do not attempt to audit for software exploits which may compromise users’ security. However, if systems allow end users to misuse these cryptographic primitives, the scheme is penalized.

2) Usability Properties: Usability is crucial for the use and adoption of secure messaging services. Human end users need to understand how to use the system securely and the effort required to do so must be acceptable for the perceived benefits.

In previous research, various secure messaging tools have been evaluated and weaknesses in the HCI portion of their design have been revealed. The seminal paper “Why Johnny Can’t Encrypt” [2] along with follow-up studies evaluating PGP tools [3], [4] and other messaging protocols [18]–[22] have also shown users encountering severe problems using encryption securely. However, these studies focused on UI issues unique to specific implementations. This approach results in few generic insights regarding secure messenger protocol and application design. Given the huge number of secure messaging implementations and academic approaches considered in our systematization, we opted to extract generic concepts. Because we focus on usability consequences imposed by generic concepts, our results hold for any tool that implements these concepts.

To evaluate the usability of secure messaging approaches, we examine the additional user effort (and decisions), security-related errors, and reduction in reliability and flexibility that they introduce. Our usability metrics compare this extra effort to a baseline approach with minimal security or privacy features. This is a challenging task and conventional user studies are not well suited to extract such high-level usability comparisons between disparate tools. We opted to employ expert reviews to measure these usability properties, which is consistent with previous systematization efforts for security schemes in other areas [23], [24]. To consider usability and adoption hurdles in practice, we combined these expert reviews with cognitive walkthroughs of actual implementations based on Nielsen’s usability principles [25]–[27] and already known end-user issues discovered in previous work [2]–[5], [19]–[21], [28]. These usability results supplement our technical systematization and highlight potential trade-offs between security and usability.

3) Ease of Adoption: Adoption of secure messaging schemes is not only affected by their usability and security claims, but also by requirements imposed by the underlying technology. Protocols might introduce adoption issues by requiring additional resources or infrastructure from end users or service operators. When evaluating the adoption properties of an approach, we award a good score if the system does not exceed the resources or infrastructure requirements of a baseline approach that lacks any security or privacy features.

IV. TRUST ESTABLISHMENT

One of the most challenging aspects of messaging security is trust establishment, the process of users verifying that they are actually communicating with the parties they intend. Long-term key exchange refers to the process where users send cryptographic key material to each other. Long-term key authentication (also called key validation and key verification) is the mechanism allowing users to ensure that cryptographic long-term keys are associated with the correct real-world entities. We use trust establishment to refer to the combination of long-term key exchange and long-term key authentication in the remainder of this paper. After contact discovery (the process of locating contact details for friends using the messaging service), end users first have to perform trust establishment in order to enable secure communication.

A. Security and Privacy Features

A trust establishment protocol can provide the following security and privacy features:

Network MitM Prevention: Prevents Man-in-the-Middle (MitM) attacks by local and global network adversaries.

Operator MitM Prevention: Prevents MitM attacks executed by infrastructure operators.

Operator MitM Detection: Allows the detection of MitM attacks performed by operators after they have occurred.

Operator Accountability: It is possible to verify that operators behaved correctly during trust establishment.

Key Revocation Possible: Users can revoke and renew keys (e.g., to recover from key loss or compromise).

Privacy Preserving: The approach leaks no conversation metadata to other participants or even service operators.

B. Usability Properties

Most trust establishment schemes require key management: user agents must generate, exchange, and verify other participants’ keys. For some approaches, users may be confronted with additional tasks, as well as possible warnings and errors, compared to classic tools without end-to-end security. If a concept requires little user effort and introduces no new error types, we award a mark for the property to denote good usability. We only consider the minimum user interaction required by the protocol instead of rating specific implementations.

Automatic Key Initialization: No additional user effort is required to create a long-term key pair.

Low Key Maintenance: Key maintenance encompasses recurring effort users have to invest into maintaining keys. Some systems require that users sign other keys or renew expired keys. Usable systems require no key maintenance tasks.

Easy Key Discovery: When new contacts are added, no additional effort is needed to retrieve key material.

Easy Key Recovery: When users lose long-term key material, it is easy to revoke old keys and initialize new keys (e.g., simply reinstalling the app or regenerating keys is sufficient).

In-band: No out-of-band channels are needed that require users to invest additional effort to establish.

No Shared Secrets: Shared secrets require existing social relationships. This limits the usability of a system, as not all communication partners are able to devise shared secrets.

Alert-less Key Renewal: If other participants renew their long-term keys, a user can proceed without errors or warnings.

Immediate Enrollment: When keys are (re-)initialized, other participants are able to verify and use them immediately.

Inattentive User Resistant: Users do not need to carefully inspect information (e.g., key fingerprints) to achieve security.

C. Adoption Properties

Multiple Key Support: Users should not have to invest additional effort if they or their conversation partners use multiple public keys, making the use of multiple devices with separate keys transparent. While it is always possible to share one key on all devices and synchronize the key between them, this can lead to usability problems.

No Service Provider Required: Trust establishment does not require additional infrastructure (e.g., key servers).

No Auditing Required: The approach does not require auditors to verify correct behavior of infrastructure operators.

No Name Squatting: Users can choose their names and can be prevented from reserving a large number of popular names.

Asynchronous: Trust establishment can occur asynchronously without all conversation participants online.

Scalable: Trust establishment is efficient, with resource requirements growing logarithmically (or smaller) with the total number of participants in the system.

D. Evaluation

1) Opportunistic Encryption (Baseline): We consider opportunistic encryption, in which an encrypted session is established without any key verification, as a baseline. For instance, this could be an OTR encryption session without any authentication. The main goal of opportunistic encryption is to counter passive adversaries; active attackers can easily execute MitM attacks. From a usability perspective, this approach is the baseline since it neither places any burden on the user nor generates any new error or warning messages.

TABLE I
TRADE-OFFS FOR COMBINATIONS OF TRUST ESTABLISHMENT APPROACHES. SECURE APPROACHES OFTEN SACRIFICE USABILITY AND ADOPTION.
Columns (Security Features): Network MitM Prevented, Operator MitM Prevented, Operator MitM Detected, Operator Accountability, Key Revocation Possible, Privacy Preserving.
Columns (Usability): Automatic Key Initialization, Low Key Maintenance, Easy Key Discovery, Easy Key Recovery, In-Band, No Shared Secrets, Alert-less Key Renewal, Immediate Enrollment, Inattentive User Resistant.
Columns (Adoption): Multiple Key Support, No Service Provider, No Auditing Required, No Name Squatting, Asynchronous, Scalable.
Rows (scheme, with example where given): Opportunistic Encryption†* (TCPCrypt); TOFU (Strict)†; TOFU†* (TextSecure); Key Fingerprint Verification†* (Threema); Short Auth Strings (Out-of-Band)†* (SilentText); Short Auth Strings (In-Band/Voice/Video)†* (ZRTP); Socialist Millionaire (SMP)†* (OTR); Mandatory Verification†* (SafeSlinger); Key Directory†* (iMessage); Certificate Authority†* (S/MIME); Transparency Log; Extended Transparency Log†; Self-Auditable Log† (CONIKS); Web-of-Trust†* (PGP); Trust Delegation†* (GnuNS); Tracking* (Keybase); Pure IBC† (SIM-IBC-KMS); Revocable IBC†; Blockchains* (Namecoin); Key Directory + TOFU + Optional Verification†* (TextSecure); Opportunistic Encryption + SMP†* (OTR).
Legend: each cell is rated as provides property, partially provides property, or does not provide property (-); † has academic publication; * end-user tool available.
[Per-cell ratings not reproduced.]

2) TOFU: Trust-On-First-Use (TOFU) extends opportunistic encryption by remembering previously seen key material [29]. The network MitM prevented and infrastructure MitM prevented properties are only partially provided due to the requirement that no attacker is present during the initial connection. TOFU requires no service provider since keys can be exchanged by the conversation participants directly. TOFU does not define a mechanism for key revocation. TOFU can be implemented in strict and non-strict forms. The strict form fails when the key changes, providing inattentive user resilience but preventing easy key recovery. The non-strict form prompts users to accept key changes, providing easy key recovery at the expense of inattentive user resilience.

TOFU-based approaches, like the baseline, do not require any user interaction during the initial contact discovery. This yields good scores for all user-effort properties except for the key revocation property, which is not defined, and alert-less key renewal, since users cannot distinguish benign key changes from MitM attacks without additional verification methods. For instance, TextSecure shows a warning that a user’s key has changed and the user must either confirm the new key or apply manual verification to proceed (shown in Figure 2). If the user chooses to accept the new key immediately, it is possible to perform the verification later. The motivation behind this approach is to provide more transparency for more experienced or high-risk users, while still offering an “acceptable” solution for novice end users. Critically, previous work in the related domain of TLS warnings has shown that frequent warning messages lead to higher click-through rates in dangerous situations, even with experienced users [30].

Fig. 2. TextSecure warning for key changes: the user must either accept the new key by selecting “complete”, or perform manual verification [31].

From an adoption perspective, TOFU performs similarly to the baseline, except for key recovery in the strict version and multiple key support in both versions. The multiple key support problem arises from the fact that if multiple keys are used, the protocol cannot distinguish between devices. An attacker can claim that a new device, with the attacker’s key, is being used.

3) Key Fingerprint Verification: Manual verification requires users to compare some representation of a cryptographic hash of their partners’ public keys out of band (e.g., in person or via a separate secure channel).

Assuming the fingerprint check is performed correctly by end users, manual verification provides all desirable security properties with the exception of only partial key revocation support, as this requires contacting each communication partner out-of-band. The approaches differ only in their usability and adoption features.

Fingerprint verification approaches introduce severe usability and adoption limitations: users have to perform manual verification before communicating with a new partner (and get them to do the same) to ensure strong authentication. Thus, manual verification does not offer automatic key initialization, easy key discovery, or immediate enrollment. In addition, new keys introduce an alert on key renewal, resulting in a key maintenance effort. Fingerprints complicate multiple key support since each device might use a different key.

While it is possible to improve the usability of key fingerprint verification by making it optional and combining it with other approaches, we postpone discussion of this strategy until the discussion.

4) Short Authentication Strings: To ease fingerprint verification, shorter strings can be provided to the users for comparison. A short authentication string (SAS) is a truncated cryptographic hash (e.g., 20–30 bits long) of all public parts of the key exchange. It is often represented in a format aimed to be human friendly, such as a short sequence of words. All participants compute the SAS based on the key exchange they observed, and then compare the resulting value with each other. The method used for comparison of the SAS must authenticate the entities using some underlying trust establishment mechanism.

Several encrypted voice channels, including the ZRTP protocol and applications like RedPhone, Signal, and SilentPhone, use the SAS method by requiring participants to read strings aloud [32], [33]. Figure 3 shows an example of SAS verification during establishment of a voice channel in RedPhone. For usability reasons, RedPhone and SilentPhone use random dictionary words to represent the hash. Because these tools require the user to end the established call manually if the verification fails, they are not inattentive user resistant.

Fig. 3. Users read random words during SAS verification in RedPhone [31].

SAS systems based on voice channels anchor trust in the ability of participants to recognize each other’s voices. Users who have never heard each other’s voices cannot authenticate using this method. Even for users that are familiar with each other, the security provided by voice identification has been the subject of controversy [34], [35]. Recent work [36] suggests that, with even a small number of samples of a target user’s speaking voice, audio samples can be synthesized that are indistinguishable from the genuine
