TensorFlow: A Framework for Scalable Machine Learning

Transcription

TensorFlow: a Framework for Scalable Machine Learning. ACM Learning Center, 2016.

You probably want to know: What is TensorFlow? Why did we create TensorFlow? How does TensorFlow work? Code: Linear Regression. Code: Convolutional Deep Neural Network. Advanced Topics: Queues and Devices.

Fast, flexible, and scalable open-source machine learning library. One system for research and production. Runs on CPU, GPU, TPU, and mobile. Apache 2.0 license.

Machine learning gets complex quickly: modeling complexity.

Machine learning gets complex quickly: distributed systems, heterogeneous systems.

TensorFlow handles complexity: modeling complexity, distributed systems, heterogeneous systems.

What's in a Graph? Edges are Tensors. Nodes are Ops: Constants, Variables, Computation, Debug code (Print, Assert), Control Flow. [Figure: under the hood, nodes a and b feed an add op producing c.]

A Tensor: a multidimensional array. A Graph: a graph of operations.

The TensorFlow Graph. Computation is defined as a graph. The graph is defined in a high-level language (Python). The graph is compiled and optimized. The graph is executed (in parts or fully) on available low-level devices (CPU, GPU, TPU). Nodes represent computations and state. Data (tensors) flow along the edges.

Build a graph; then run it. [Figure: a and b feed an add op producing c.]

    c = tf.add(a, b)

    session = tf.Session()
    value_of_c = session.run(c, {a: 1, b: 2})
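As a self-contained sketch of the same idea (making a and b placeholders so they can be fed is an assumption; the slide doesn't define them):

    import tensorflow as tf

    # Build the graph: two feedable inputs and an add op.
    a = tf.placeholder(tf.float32, name="a")
    b = tf.placeholder(tf.float32, name="b")
    c = tf.add(a, b)

    # Run it: the session executes only the subgraph needed for c.
    session = tf.Session()
    print(session.run(c, {a: 1, b: 2}))  # 3.0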

Any Computation is a TensorFlow Graph. [Figure: examples and weights feed MatMul; biases feed Add; Relu follows; labels feed Xent.]


Automatic Differentiation. Automatically add ops which compute gradients for variables. [Figure: a grad op is added, flowing from Xent back to biases.]
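A minimal sketch of the same mechanism through tf.gradients (the toy graph here is illustrative, not from the slides):

    import tensorflow as tf

    x = tf.Variable(3.0)
    y = x * x
    # Adds gradient ops to the graph; grad computes dy/dx = 2x.
    grad = tf.gradients(y, [x])[0]

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())
        print(sess.run(grad))  # 6.0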

Any Computation is a TensorFlow Graph. Simple gradient descent: [Figure: the gradient is multiplied (Mul) by the learning rate and used to update biases.]

Any Computation is a TensorFlow Graph: distributed. [Figure: the graph is split across Device A and Device B.] Devices: processes, machines, CPUs, GPUs, TPUs, etc.

Send and Receive Nodes. [Figure: when the graph is split across Device A and Device B, Send and Recv node pairs are inserted automatically at the device boundaries.] Devices: processes, machines, CPUs, GPUs, TPUs, etc.
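A rough single-machine sketch of explicit placement (shapes and names invented for illustration); TensorFlow inserts the Send/Recv pairs at the device boundary itself:

    import tensorflow as tf

    examples = tf.placeholder(tf.float32, shape=[None, 784])
    # Keep the state on the CPU...
    with tf.device("/cpu:0"):
        weights = tf.Variable(tf.zeros([784, 10]))
        biases = tf.Variable(tf.zeros([10]))
    # ...and run the math on the first GPU.
    with tf.device("/gpu:0"):
        logits = tf.matmul(examples, weights) + biases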

Linear Regression

Linear Regression. [Figure: y = Wx + b, with x the input, y the result, and W, b the parameters.]

What are we trying to do? Mystery equation: y = 0.1 * x + 0.3 + noise. Model: y = W * x + b. Objective: given enough (x, y) value samples, figure out the values of W and b.
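For reference, the sample data can be generated outside TensorFlow; a sketch with NumPy (the noise scale 0.01 is an assumption):

    import numpy as np

    # Mystery equation: y = 0.1 * x + 0.3 + noise.
    x_in = np.random.rand(100).astype(np.float32)
    noise = np.random.normal(scale=0.01, size=100).astype(np.float32)
    y_in = 0.1 * x_in + 0.3 + noise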

y = Wx + b in TensorFlow

    import tensorflow as tf

y = Wx + b in TensorFlow

    import tensorflow as tf

    x = tf.placeholder(shape=[None], dtype=tf.float32, name="x")

y = Wx + b in TensorFlow

    import tensorflow as tf

    x = tf.placeholder(shape=[None], dtype=tf.float32, name="x")
    W = tf.get_variable(shape=[], name="W")

y = Wx + b in TensorFlow

    import tensorflow as tf

    x = tf.placeholder(shape=[None], dtype=tf.float32, name="x")
    W = tf.get_variable(shape=[], name="W")
    b = tf.get_variable(shape=[], name="b")

y = Wx + b in TensorFlow

    import tensorflow as tf

    x = tf.placeholder(shape=[None], dtype=tf.float32, name="x")
    W = tf.get_variable(shape=[], name="W")
    b = tf.get_variable(shape=[], name="b")
    y = W * x + b

[Figure: W and x feed matmul, b feeds add, producing y.]

Variables Must be Initialized

    # Collects all variable initializers.
    init_op = tf.initialize_all_variables()

    # Makes an execution environment.
    sess = tf.Session()

    # Actually initialize the variables.
    sess.run(init_op)

Running the Computation

    x_in = [3]
    sess.run(y, feed_dict={x: x_in})

Only what's used to compute a fetch will be evaluated. All Tensors can be fed, but all placeholders must be fed. [Figure: y is the fetch at the top of the graph; x is the feed at the bottom.]
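Two runs that make the fetch/feed rules concrete; a small sketch against the graph above:

    # Fetching b touches no placeholder, so nothing needs to be fed.
    print(sess.run(b))
    # Fetching y depends on the placeholder x, so x must be fed.
    print(sess.run(y, feed_dict={x: [1.0, 2.0]}))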

Putting it all together

    import tensorflow as tf

    # Build the graph.
    x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
    W = tf.get_variable(shape=[], name='W')
    b = tf.get_variable(shape=[], name='b')
    y = W * x + b

    # Prepare execution environment.
    with tf.Session() as sess:
        # Initialize variables.
        sess.run(tf.initialize_all_variables())
        # Run the computation (usually often).
        print(sess.run(y, feed_dict={x: x_in}))

Define a Loss. Given x and y_label, compute a loss, for instance:

    # Create an operation that calculates loss.
    loss = tf.reduce_mean(tf.square(y - y_label))
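For this loss to be runnable, the targets need their own placeholder; a minimal sketch (the name y_label matches the later slides):

    # Placeholder for observed target values, one per input value.
    y_label = tf.placeholder(shape=[None], dtype=tf.float32, name="y_label")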

Minimize the loss with an optimizer, e.g. tf.train.AdamOptimizer. [Figure: the loss as a function of the parameters (weights, biases); the optimizer moves the parameters toward the function minimum.]

Train. Feed (x, y_label) pairs and adjust W and b to decrease the loss:

    W ← W − α · (dL/dW)
    b ← b − α · (dL/db)

TensorFlow computes the gradients automatically.

    # Create an optimizer (0.5 is the learning rate).
    optimizer = tf.train.GradientDescentOptimizer(0.5)

    # Create an operation that minimizes loss.
    train = optimizer.minimize(loss)
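minimize(loss) is shorthand for the two-step form below, which mirrors the update rule explicitly; a sketch:

    # Equivalent to optimizer.minimize(loss):
    grads_and_vars = optimizer.compute_gradients(loss)  # [(dL/dW, W), (dL/db, b)]
    train = optimizer.apply_gradients(grads_and_vars)   # W ← W − 0.5 · dL/dW, etc.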

Putting it all together

    # Define a loss.
    loss = tf.reduce_mean(tf.square(y - y_label))

    # Create an optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.5)

    # Op to minimize the loss.
    train = optimizer.minimize(loss)

    with tf.Session() as sess:
        # Initialize variables.
        sess.run(tf.initialize_all_variables())
        # Iteratively run the training op.
        for i in range(1000):
            sess.run(train, feed_dict={x: [x_in[i]], y_label: [y_in[i]]})

TensorBoard
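The talk demos TensorBoard on the model above; a hedged sketch of wiring it up with the same 0.x-era API used on these slides (the log directory /tmp/linear is arbitrary):

    # Export the loss so TensorBoard can plot it over time.
    tf.scalar_summary("loss", loss)
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/linear", sess.graph)

    # Inside the training loop:
    summary, _ = sess.run([merged, train],
                          feed_dict={x: [x_in[i]], y_label: [y_in[i]]})
    writer.add_summary(summary, i)

Then run tensorboard --logdir=/tmp/linear and open the printed URL.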

Deep Neural Network

Remember linear regression?

    import tensorflow as tf

    # Build the graph.
    x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
    W = tf.get_variable(shape=[], name='W')
    b = tf.get_variable(shape=[], name='b')
    y = W * x + b

    loss = tf.reduce_mean(tf.square(y - y_label))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)
    ...

Convolutional DNN

    x = tf.contrib.layers.conv2d(x, kernel_size=[5, 5], ...)
    x = tf.contrib.layers.max_pool2d(x, kernel_size=[2, 2], ...)
    x = tf.contrib.layers.conv2d(x, kernel_size=[5, 5], ...)
    x = tf.contrib.layers.max_pool2d(x, kernel_size=[2, 2], ...)
    x = tf.contrib.layers.fully_connected(x, activation_fn=tf.nn.relu)
    x = tf.contrib.layers.dropout(x, 0.5)
    logits = tf.contrib.layers.linear(x)

[Figure: x → conv 5x5 (relu) → maxpool 2x2 → conv 5x5 (relu) → maxpool 2x2 → fully connected (relu) → dropout 0.5 → fully connected (linear).]
Notebook: …cke/tensorflow-tutorial/blob/master/2_mnist.ipynb
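To train this network, the logits are typically combined with a cross-entropy loss; a sketch using the same era's API (labels assumed to be integer class ids; the Adam learning rate is an assumption):

    # Cross-entropy between logits and integer labels, averaged over the batch.
    loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels))
    train = tf.train.AdamOptimizer(1e-4).minimize(loss)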

Defining Complex Networks. [Figure: training graph with learning rate.]

Distributed TensorFlow

Data Parallelism. [Figure: model replicas each process a shard of the data, send parameter updates Δp to the parameter servers, and receive updated parameters p'.]

Describe a cluster: ClusterSpec

    tf.train.ClusterSpec({
        "worker": ["worker0.example.com:2222",
                   "worker1.example.com:2222",
                   "worker2.example.com:2222"],
        "ps": ["ps0.example.com:2222",
               "ps1.example.com:2222"]
    })
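Each process in the cluster then starts a server for its own task; a brief sketch (the job name and task index are chosen for illustration):

    # The ClusterSpec from the slide above (abbreviated).
    cluster = tf.train.ClusterSpec({
        "worker": ["worker0.example.com:2222"],
        "ps": ["ps0.example.com:2222"]})

    # Start the gRPC server for this process: task 0 of job "worker".
    server = tf.train.Server(cluster, job_name="worker", task_index=0)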

Share the graph across devices

    with tf.device("/job:ps/task:0"):
        weights_1 = tf.Variable(...)
        biases_1 = tf.Variable(...)

    with tf.device("/job:ps/task:1"):
        weights_2 = tf.Variable(...)
        biases_2 = tf.Variable(...)

    with tf.device("/job:worker/task:7"):
        input, labels = ...
        layer_1 = tf.nn.relu(tf.matmul(input, weights_1) + biases_1)
        logits = tf.nn.relu(tf.matmul(layer_1, weights_2) + biases_2)
        train_op = ...

    with tf.Session("grpc://worker7.example.com:2222") as sess:
        for _ in range(10000):
            sess.run(train_op)

Input Pipelines with Queues. [Figure: a queue of filenames feeds readers producing raw examples; preprocessing threads turn these into examples consumed by the worker.]
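A hedged sketch of such a pipeline with the queue-based input API of that era (file names and feature spec invented for illustration):

    import tensorflow as tf

    # Queue of input file names; readers dequeue names as they go.
    filename_queue = tf.train.string_input_producer(
        ["data-0.tfrecord", "data-1.tfrecord"])

    # Read raw examples and parse them into tensors.
    reader = tf.TFRecordReader()
    _, raw_example = reader.read(filename_queue)
    features = tf.parse_single_example(
        raw_example,
        features={"x": tf.FixedLenFeature([], tf.float32),
                  "y": tf.FixedLenFeature([], tf.float32)})

    # Background threads fill a shuffling queue; the worker dequeues batches.
    x_batch, y_batch = tf.train.shuffle_batch(
        [features["x"], features["y"]],
        batch_size=32, capacity=1000, min_after_dequeue=100)

At run time the queue runners must be started, e.g. with tf.train.start_queue_runners(sess), before the first sess.run on a batch.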

Tutorials & Courses. Tutorials on tensorflow.org:
Image recognition: https://www.tensorflow.org/tutorials/image_recognition
Word embeddings: …
Language Modeling: …
Translation: https://www.tensorflow.org/versions/seq2seq
Deep Dream: …les/tutorials/deepdream/deepdream.ipynb

Thank you and have fun! Martin Wicke (@martin_wicke), Rajat Monga (@rajatmonga)

Extras

Inception. An Alaskan Malamute (left) and a Siberian Husky (right). Images from …improving-inception-and-image.html

Show and Tell. …nd-tell-image-captioning-open.html

Parsey McParseface (SyntaxNet). …/announcing-syntaxnet-worlds-most.html

Text Summarization.
Original text: Alice and Bob took the train to visit the zoo. They saw a baby giraffe, a lion, and a flock of colorful tropical birds.
Abstractive summary: Alice and Bob visited the zoo and saw animals and …
…-summarization-with-tensorflow.html

Claude Monet, Bouquet of Sunflowers. Images from the Metropolitan Museum of Art (with permission). Image by @random_forests.

Architecture. [Figure: Python front end, C++ front end, and other language bindings; compound ops; the TensorFlow distributed execution system; kernels for CPU, GPU, Android, iOS, etc.]
