Privacy-Preserving Machine Learning In TensorFlow With TF . - O'Reilly

Transcription

Privacy-Preserving Machine Learning in TensorFlow with TF Encrypted
Morten Dahl
O’Reilly AI Conference, New York, April 2019

Why?
Privacy in machine learning

Machine Learning Process
data set -> training -> prediction service

clinical photos -> transfer learning -> prediction service
machine learning positioned to have huge impact on health care

Potential Bottlenecks
data set -> training -> prediction service
data access (liability and controlled use)
incentive (accuracy and exposure)
leakage (model and training data)
risk management (store and process)

Sanitization
sanitised data set (Differential Privacy) -> training -> sanitised prediction
data access (liability and controlled use)
incentive (accuracy and exposure)
leakage (model and training data)
risk management (store and process)

Encryption
encrypted data set (Secure Computation) -> encrypted training -> encrypted prediction
data access (liability and controlled use)
incentive (accuracy and exposure)
leakage (model and training data)
risk management (store and process)

Hybrid
privacy mitigates bottlenecks
encrypted data set -> encrypted training -> encrypted sanitised prediction
data access (liability and controlled use)
incentive (accuracy and exposure)
leakage (model and training data)
risk management (store and process)

How?
Computing on encrypted data

Prediction with Linear Model
w, x -> dot(x, w)
dot([x1, x2, x3], [w1, w2, w3]) = x1*w1 + x2*w2 + x3*w3

using Homomorphic Encryption
x -> Enc(x)
w, Enc(x) -> Enc(dot(x, w))
homomorphic encryption scheme: public multiplication, private addition
dot([Enc(x1), Enc(x2), Enc(x3)], [w1, w2, w3])
  = Enc(x1)*w1 + Enc(x2)*w2 + Enc(x3)*w3
  = Enc(x1*w1) + Enc(x2*w2) + Enc(x3*w3)
  = Enc(x1*w1 + x2*w2 + x3*w3)

Paillier Homomorphic Encryption
public encryption key
typically 4000 bits: computation is significantly more expensive
c = Enc(x, r) = g^x * r^n mod n^2
g = 36
n = 35
n^2 = 1225
Enc(5, 2) = 36^5 * 2^35 mod 1225 = 718
Enc(5, 4) = 36^5 * 4^35 mod 1225 = 674
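The encryption formula on the slide above can be checked with a few lines of Python. This is a minimal sketch that reuses the slide's toy parameters (g = 36, n = 35); the function name `encrypt` is an assumption made for illustration, and parameters this small offer no security.

```python
# Toy Paillier encryption with the slide's parameters (far too small to be secure).
N = 35              # public modulus n
N_SQUARED = N * N   # n^2 = 1225
G = N + 1           # g = 36


def encrypt(x: int, r: int) -> int:
    """Return Enc(x, r) = g^x * r^n mod n^2."""
    return (pow(G, x, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


print(encrypt(5, 2))  # 718
print(encrypt(5, 4))  # 674
```

The same plaintext 5 yields two different ciphertexts because the randomness r differs.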

Private Addition in Paillier
Enc(x, r) * Enc(y, s)
  = (g^x * r^n mod n^2) * (g^y * s^n mod n^2)
  = g^(x + y) * (r * s)^n mod n^2
  = Enc(x + y, r*s)
Enc(5, 2) * Enc(5, 4) = 718 * 674 = 57 = 36^10 * 8^35 = Enc(10, 8)   (all mod 1225)
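The worked example above can be reproduced directly: multiplying two ciphertexts modulo n^2 gives an encryption of the sum of the plaintexts. The sketch below again uses the toy parameters from the slides and a hypothetical helper named `encrypt`.

```python
# Toy Paillier parameters from the slides (not secure).
N = 35
N_SQUARED = N * N
G = N + 1


def encrypt(x: int, r: int) -> int:
    """Return Enc(x, r) = g^x * r^n mod n^2."""
    return (pow(G, x, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


c1 = encrypt(5, 2)  # 718
c2 = encrypt(5, 4)  # 674

# Multiplying ciphertexts adds the plaintexts and multiplies the randomness.
c_sum = (c1 * c2) % N_SQUARED
print(c_sum)                      # 57
print(c_sum == encrypt(10, 8))    # True
```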

Public Multiplication in Paillier
Enc(x, r)^w
  = (g^x * r^n mod n^2)^w
  = g^(x*w) * (r^w)^n mod n^2
  = Enc(x*w, r^w)
Enc(5, 2)^2 = 718 * 718 = 1024 = 36^10 * 4^35 = Enc(10, 4)   (all mod 1225)
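Combining public multiplication with private addition gives the encrypted dot product from the earlier slide. The sketch below is one way to put the pieces together with the toy parameters. Decryption is not shown on the slides, so the `decrypt` function, the factors p = 5 and q = 7, and all helper names are assumptions added so the result can be checked. `pow(x, -1, n)` needs Python 3.8 or later.

```python
from math import gcd

# Toy parameters (not secure). The factors of n are the private key.
P, Q = 5, 7
N = P * Q                     # 35
N_SQUARED = N * N             # 1225
G = N + 1                     # 36
LAMBDA = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # lcm(p-1, q-1) = 12
MU = pow(LAMBDA, -1, N)       # inverse of lambda mod n (valid because g = n + 1)


def encrypt(x: int, r: int) -> int:
    """Enc(x, r) = g^x * r^n mod n^2; r must be coprime to n."""
    return (pow(G, x, N_SQUARED) * pow(r, N, N_SQUARED)) % N_SQUARED


def decrypt(c: int) -> int:
    """Standard Paillier decryption for g = n + 1 (an addition to the slides)."""
    u = pow(c, LAMBDA, N_SQUARED)
    return ((u - 1) // N * MU) % N


def encrypted_dot(ciphertexts, weights):
    """Compute Enc(dot(x, w)) from Enc(x_i) and public weights w_i."""
    acc = encrypt(0, 1)  # encryption of zero
    for c, w in zip(ciphertexts, weights):
        # Exponentiation multiplies the plaintext by w; the product adds terms.
        acc = (acc * pow(c, w, N_SQUARED)) % N_SQUARED
    return acc


# The slide's example: Enc(5, 2)^2 = Enc(10, 4).
squared = pow(encrypt(5, 2), 2, N_SQUARED)
print(squared)  # 1024

x = [1, 2, 3]
w = [4, 5, 2]
ciphertexts = [encrypt(xi, ri) for xi, ri in zip(x, [2, 3, 4])]
print(decrypt(encrypted_dot(ciphertexts, w)))  # 20
```

Plaintexts live modulo n, so with n = 35 the dot product must stay below 35 for the decrypted value to be meaningful.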

using Secret Sharing
x
w, Share1(x) -> Share1(dot(x, w))
w, Share2(x) -> Share2(dot(x, w))

Secret Sharing
Share1(x, r) = r mod m
Share2(x, r) = x - r mod m
x = Share1(x, r) + Share2(x, r) mod m
m = 10
Share1(5, 7) = 7 mod 10 = 7
Share2(5, 7) = 5 - 7 mod 10 = 8
7 + 8 = 15 = 5 mod 10
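The sharing scheme above is short enough to write out in full. This is a minimal sketch with m = 10 as on the slide; the function names `share` and `reconstruct`, and the use of the `secrets` module for randomness, are assumptions for illustration.

```python
import secrets

M = 10  # modulus from the slide; real deployments use a much larger one


def share(x, r=None, m=M):
    """Split x into (Share1, Share2) = (r mod m, x - r mod m)."""
    if r is None:
        r = secrets.randbelow(m)  # fresh randomness when none is supplied
    return r % m, (x - r) % m


def reconstruct(share1, share2, m=M):
    """Recover x = Share1 + Share2 mod m."""
    return (share1 + share2) % m


print(share(5, 7))               # (7, 8)
print(reconstruct(*share(5, 7))) # 5
```

Either share on its own is a uniformly random value modulo m when r is chosen at random, so a single share reveals nothing about x.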

Private Addition with Secret Sharing
x = x1 + x2
y = y1 + y2
z1 = x1 + y1
z2 = x2 + y2
x + y = (x1 + x2) + (y1 + y2) = (x1 + y1) + (x2 + y2) = z1 + z2

Public Multiplication with Secret Sharing
x = x1 + x2
z1 = x1 * w
z2 = x2 * w
x*w = (x1 + x2) * w = (x1 * w) + (x2 * w) = z1 + z2
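The two slides above are enough to evaluate a linear model with public weights on a secret-shared input, since each server only needs local additions and multiplications by public values. A sketch under assumed names follows; the modulus `M = 2**16` and the helpers `share`, `reconstruct`, and `local_dot` are hypothetical.

```python
import secrets

M = 2**16  # hypothetical modulus, large enough for the example values


def share(x):
    """Split x into two additive shares modulo M."""
    r = secrets.randbelow(M)
    return r, (x - r) % M


def reconstruct(share1, share2):
    return (share1 + share2) % M


def local_dot(x_shares, w):
    """Run by one server: public multiplication per term, then local addition."""
    return sum(s * wi for s, wi in zip(x_shares, w)) % M


x = [1, 2, 3]
w = [4, 5, 6]  # public weights, known to both servers

pairs = [share(xi) for xi in x]
x1 = [p[0] for p in pairs]  # held by the first server
x2 = [p[1] for p in pairs]  # held by the second server

z1 = local_dot(x1, w)  # Share1(dot(x, w))
z2 = local_dot(x2, w)  # Share2(dot(x, w))
print(reconstruct(z1, z2))  # 32
```

No communication between the servers is needed here; that changes once the weights are shared too.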

using Secret Sharing, with Private Model
x
Share1(w), Share1(x) -> Share1(dot(x, w))
Share2(w), Share2(x) -> Share2(dot(x, w))

using Secret Sharing, with Private Model
x
Share1(w), Share1(x) -> Share1(dot(x, w))
Share2(w), Share2(x) -> Share2(dot(x, w))
private multiplication
dot([Share1(x0), Share1(x1), Share1(x2)], [Share(w0), Share(w1), Share(w2)])
  = Share1(x0)*Share(w0) + ...
  = Share1(x0*w0) + ...
  = Share1(x0*w0 + x1*w1 + x2*w2)

Private Multiplication with Secret Sharing
(a1, b1, c1), (a2, b2, c2)
a = a1 + a2
b = b1 + b2
c = a*b = c1 + c2
x = x1 + x2
y = y1 + y2
alpha = x - a
beta = y - b
z1 = alpha*beta + alpha*b1 + beta*a1 + c1
z2 = alpha*b2 + beta*a2 + c2
x*y = z1 + z2
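The triple-based multiplication above can be simulated in a single process. In this sketch the triple (a, b, c) comes from a trusted dealer, which is an assumption since the slide does not say how triples are produced; the modulus and all function names are likewise hypothetical.

```python
import secrets

M = 2**16  # hypothetical modulus


def share(x):
    """Split x into two additive shares modulo M."""
    r = secrets.randbelow(M)
    return r, (x - r) % M


def reconstruct(share1, share2):
    return (share1 + share2) % M


def make_triple():
    """Assumed trusted dealer: shares of random a, b and of c = a*b."""
    a = secrets.randbelow(M)
    b = secrets.randbelow(M)
    return share(a), share(b), share((a * b) % M)


def multiply(x_shares, y_shares):
    """Return shares (z1, z2) of x*y given shares of x and y."""
    (a1, a2), (b1, b2), (c1, c2) = make_triple()
    x1, x2 = x_shares
    y1, y2 = y_shares
    # Each server masks its shares locally; the masked values are exchanged
    # and opened, so alpha and beta become public while x and y stay hidden.
    alpha = ((x1 - a1) + (x2 - a2)) % M   # alpha = x - a
    beta = ((y1 - b1) + (y2 - b2)) % M    # beta = y - b
    z1 = (alpha * beta + alpha * b1 + beta * a1 + c1) % M
    z2 = (alpha * b2 + beta * a2 + c2) % M
    return z1, z2


z1, z2 = multiply(share(6), share(7))
print(reconstruct(z1, z2))  # 42
```

Expanding z1 + z2 gives alpha*beta + alpha*b + beta*a + c, which equals x*y once alpha = x - a, beta = y - b and c = a*b are substituted.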

Multidisciplinary Challenge
Data science (use-cases, workflow, monitoring)
Cryptography (techniques, protocols, trust)
Machine learning (models, approx, precision)
Engineering (distributed, multi-core, readability)
need common language

TF Encrypted
Making it accessible

TensorFlow
platform for research and production-level training and deployment
popular and backed by Google

TF Encrypted Architecture
diagram labels: App, ML, TF Encrypted, MPC, HE, Tensor, Dist, TensorFlow
standard operations (matmul, relu, sigmoid, tanh, etc)
easily mix ordinary and encrypted computations
secure computation directly using TensorFlow
ordinary TensorFlow
third party libraries for secure computation

Prediction
Encouraging use

Participants
x
w0, b0, ...
Share1(x), Share1(w0, b0, ...) -> Share1(logits)
Share2(x), Share2(w0, b0, ...) -> Share2(logits)

Private Prediction with TF Encrypted

Overall Computation
compute servers
prediction client
model owner

Local Processing
TF Data pipeline

Joint Prediction
Combining knowledge for nuance

Participants
x_age
x_gender
x_income
w0, b0, ...
Share1(x_age), Share1(x_gender), Share1(x_income), Share1(w0, b0, ...) -> Share1(res)
Share2(x_age), Share2(x_gender), Share2(x_income), Share2(w0, b0, ...) -> Share2(res)

Private Joint Prediction with TF Encrypted

Training
Learning without seeing

Participants
x, y
Share1(x, y), Share1(w)
Share2(x, y), Share2(w)

Private Training with TF Encrypted

Overall Computation
data owner
model owner

Joint Training
Combining insights for better models

Participants
x_0, y_0
x_1, y_1
Share1(x_0, y_0), Share1(x_1, y_1), Share1(w)
Share2(x_0, y_0), Share2(x_1, y_1), Share2(w)

Private Joint Training with TF Encrypted

Federated Learning
Keeping data decentralized

Participants
weights
x_0, y_0
x_1, y_1
x_2, y_2
Share1(update_0), Share1(update_1), Share1(update_2) -> Share1(aggregated-update)
Share2(update_0), Share2(update_1), Share2(update_2) -> Share2(aggregated-update)
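The aggregation step in the participants diagram above only needs private addition, so it can be sketched with plain additive sharing. The code below is a single-process simulation, not TF Encrypted code: the modulus, the integer-valued updates, and the helper names are assumptions, and real model updates would first need a fixed-point encoding.

```python
import secrets

M = 2**32  # hypothetical modulus


def share_vector(values):
    """Split each entry of an integer vector into two additive shares mod M."""
    first = [secrets.randbelow(M) for _ in values]
    second = [(v - r) % M for v, r in zip(values, first)]
    return first, second


def add_vectors(vectors):
    """Run by one compute server: element-wise sum of the shares it holds."""
    return [sum(column) % M for column in zip(*vectors)]


# Locally computed updates from three data owners (illustrative integers).
updates = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
shared = [share_vector(u) for u in updates]

aggregate_1 = add_vectors([s[0] for s in shared])  # Share1(aggregated-update)
aggregate_2 = add_vectors([s[1] for s in shared])  # Share2(aggregated-update)

# Only the model owner combines the two aggregated shares.
aggregated_update = [(a + b) % M for a, b in zip(aggregate_1, aggregate_2)]
print(aggregated_update)  # [12, 15, 18]
```

Neither compute server sees an individual update, and the model owner only learns the sum.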

Secure Federated Learning in TF Encrypted

Overall Computation
compute servers
data owners
model owner

Local Optimization
TF optimization

Roadmap
High-level API (Private Keras, Pre-trained Models, Owned Data)
Tighter integration (TF Data, TF 2.0, TF Privacy, TF Federated)
Third-party cryptographic libraries (HE, MPC)
Improved performance

Wrap-Up
You can compute on encrypted data, without the ability to decrypt
Privacy-preserving ML mitigates bottlenecks and enables access to sensitive information
Secure computation distributes trust and control, and is complementary to e.g. differential privacy
Privacy-preserving ML is a multidisciplinary field benefitting from adaptations on both sides
TF Encrypted focuses on usability and integration
Thank you!
github.com/tf-encrypted/
@mortendahlcs
@dropoutlabsai
