Modelling Data Networks – Stochastic Processes and Markov Chains – Richard Clegg


Modelling data networks – stochastic processes and Markov chains
Richard G. Clegg (richard@richardclegg.org) – November 2016
Available online at http://www.richardclegg.org/lectures – accompanying printed notes provide full bibliography.
(Prepared using LaTeX and beamer.)

Introduction to stochastic processes and Markov chains

Stochastic processes: a stochastic process describes how a system behaves over time – an arrival process describes how things arrive to a system.

Markov chains: Markov chains describe the evolution of a system in time – in particular they are useful for queuing theory. (A Markov chain is a stochastic process.)

Stochastic processes

Stochastic process: let X(t) be some value (or vector of values) which varies in time t. Think of the stochastic process as the rules for how X(t) changes with t. Note: t may be discrete (t = 0, 1, 2, ...) or continuous.

Poisson process: a process where the change in X(t) from time t1 to t2 follows a Poisson distribution, that is, X(t2) - X(t1) follows a Poisson distribution.

A simple stochastic process – the drunkard's walk

Random walk – or drunkard's walk: a student walks home from a party. They start at a distance X(0) from some point. At every step they (randomly) get either one unit closer (probability p) or one unit further away:

X(t+1) = X(t) - 1 with probability p,
X(t+1) = X(t) + 1 with probability 1 - p.

Can answer questions like "where will they be at time t?"

Drunkard's walk – p = 0.5, X(0) = 0

Drunkard's walk – p = 0.5, X(0) = 0

What is the average (or expected value) of X(t), that is, E[X(t)]? Note – if you've never come across "expected value" or "expectation" before, you can think of it as an average (see notes).

E[X(t)] = 0.5(X(t-1) + 1) + 0.5(X(t-1) - 1) = 0.5(X(t-1) + X(t-1)) + 0.5(1 - 1) = X(t-1).

Therefore E[X(t)] = X(0) = 0 – the poor drunk makes no progress towards his house (on average).

E[X(t)^2] = 0.5(X(t-1) + 1)^2 + 0.5(X(t-1) - 1)^2 = X(t-1)^2 + 1.

Therefore E[X(t)^2] = t – on average the drunk does get further from the starting pub.

This silly example has many uses in physics and chemistry (Brownian motion) – not to mention gambling (coin tosses).
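The two expectations above can be checked empirically. A minimal simulation sketch (not from the original slides; the step and run counts are arbitrary choices):

```python
import random

# Simulate many drunkard's walks with p = 0.5 and X(0) = 0, then check
# that E[X(t)] stays near 0 while E[X(t)^2] is near t.
random.seed(0)
t, runs = 100, 20_000

finals = []
for _ in range(runs):
    x = 0
    for _ in range(t):
        x += random.choice((-1, 1))   # one unit closer or further, each with probability 0.5
    finals.append(x)

mean = sum(finals) / runs                      # estimate of E[X(t)]
mean_sq = sum(x * x for x in finals) / runs    # estimate of E[X(t)^2]
print(mean, mean_sq)                           # mean near 0, mean_sq near t = 100
```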

The Poisson process

Let X(t) with t >= 0 and X(0) = 0 be a Poisson process with rate λ. Let t2, t1 be two times such that t2 > t1, and let τ = t2 - t1. Then

P[X(t2) - X(t1) = n] = ((λτ)^n / n!) exp(-λτ), for n = 0, 1, 2, ...

In other words, the number of arrivals in some time period τ follows a Poisson distribution with mean λτ.

The special nature of the Poisson process

The Poisson process is in many ways the simplest stochastic process of all. This is why the Poisson process is so commonly used. Imagine your system has the following properties:

The number of arrivals does not depend on the number of arrivals so far.
No two arrivals occur at exactly the same instant in time.
The number of arrivals in time period τ depends only on the length of τ.

The Poisson process is the only process satisfying these conditions (see notes for proof).

Some remarkable things about Poisson processes

The mean number of arrivals in a period τ is λτ (see notes).
If two Poisson processes with rates λ1 and λ2 are merged, the combined arrival process is a Poisson process with rate λ1 + λ2. In fact this is a general result for n Poisson processes.
If you randomly "sample" a Poisson process – e.g. pick arrivals with probability p – the sampled process is Poisson with rate pλ.
This makes Poisson processes easy to deal with.
Many things in computer networks really are Poisson processes (e.g. people logging onto a computer or requesting web pages).
The Poisson process is also "memoryless", as the next section explains.
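The merging and sampling properties lend themselves to a quick empirical check. A sketch (not from the slides; the rates and the time horizon are arbitrary choices), generating arrivals from exponential interarrival times:

```python
import random

# Empirically check two Poisson-process properties: merging two processes
# and randomly sampling ("thinning") one.
random.seed(1)

def poisson_arrivals(rate, horizon):
    """Arrival times of a Poisson process on [0, horizon]."""
    arrivals, t = [], 0.0
    while True:
        t += random.expovariate(rate)   # interarrival times are exponential
        if t > horizon:
            return arrivals
        arrivals.append(t)

horizon = 10_000.0
a = poisson_arrivals(1.0, horizon)      # rate λ1 = 1.0
b = poisson_arrivals(0.5, horizon)      # rate λ2 = 0.5

# Merged stream: should have rate λ1 + λ2 = 1.5.
merged_rate = (len(a) + len(b)) / horizon

# Sampled stream: keep each arrival of `a` with probability p = 0.3,
# giving a Poisson process of rate p*λ1 = 0.3.
sampled_rate = len([t for t in a if random.random() < 0.3]) / horizon

print(merged_rate, sampled_rate)        # near 1.5 and 0.3
```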

Siméon Denis Poisson (1781-1840)

The interarrival time – the exponential distribution

The exponential distribution: an exponential distribution for a variable T takes this form:

P[T <= t] = 1 - exp(-λt) for t >= 0, and P[T <= t] = 0 for t < 0.

The time between packets is called the interarrival time – the time between arrivals.
For a Poisson process this follows the exponential distribution (above).
This is easily shown – the probability of an arrival occurring before time t is one minus the probability of no arrivals occurring up until time t.
The probability of no arrivals occurring during a time period t is (λt)^0 exp(-λt)/0! = exp(-λt).
The mean interarrival time is 1/λ.

The memoryless nature of the Poisson process

There is something strange to be noticed here – the distribution of our interarrival time T was given by P[T <= t] = 1 - exp(-λt) for t >= 0, with mean 1/λ.
However, if we look at the Poisson process at any instant and ask "how long must we wait for the next arrival?", the answer is just the same: 1/λ on average.
Exactly the same argument can be made for any arrival time.
The probability of no arrivals in the next t seconds does not change because an arrival has just happened.
The expected waiting time for the next arrival does not change if you have been waiting for just one second, or for an hour, or for many years – the average time to the next arrival is still the same 1/λ.

The Poisson bus dilemma

Consider that you arrive at the bus stop at a random time. Buses arrive as a Poisson process with a given rate λ. Buses are (on average) 30 minutes apart: 1/λ = 30 minutes. How long do you wait for the bus on average?

Bus passenger 1: obviously 15 minutes – the buses are 30 minutes apart; on average I arrive half way through that period.
Bus passenger 2: obviously 30 minutes – the buses are a Poisson process and memoryless. The average waiting time is 30 minutes no matter when the last bus was or when I arrive.
So, who is correct?

The Poisson bus dilemma – solution

So, is the answer 15 minutes, 30 minutes or something else?
30 minutes is the correct answer (as the Poisson process results show us).
To see why the 15-minute answer is wrong, consider the diagram: along the timeline, some gaps between buses are long and some are short.
The average gap between buses is 30 minutes.
The average passenger does wait for half of the interarrival gap he or she arrives during.
However, the average passenger is likely to arrive in a larger than average gap (see diagram).
We do not need to prove that the answer is 30 minutes – the proof is already there for the Poisson process.
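Passenger 2's answer can also be verified by simulation. A sketch (not from the slides; the horizon and passenger count are arbitrary choices):

```python
import bisect
import random

# Buses arrive as a Poisson process with mean gap 30 minutes; passengers
# arrive uniformly at random and wait for the next bus.
random.seed(2)
rate = 1 / 30.0                       # λ = one bus per 30 minutes on average
horizon = 1_000_000.0

buses, t = [], 0.0
while t < horizon:
    t += random.expovariate(rate)
    buses.append(t)                   # last bus lands beyond the horizon

waits = []
for _ in range(100_000):
    arrive = random.uniform(0.0, horizon - 1_000.0)  # stay clear of the end
    nxt = bisect.bisect_right(buses, arrive)         # index of the next bus
    waits.append(buses[nxt] - arrive)

avg_wait = sum(waits) / len(waits)
print(avg_wait)                       # near 30 minutes, not 15
```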

Introducing Markov chains

Markov chains are an elegant and useful mathematical tool used in many applied areas of mathematics and engineering, but particularly in queuing theory.
Useful when a system can be in a countable number of "states" (e.g. number of people in a queue, number of packets in a buffer and so on).
Useful when transitions between "states" can be considered as a probabilistic process.
Helps us analyse queues.

Andrey Andreyevich Markov (1856-1922)

Introducing Markov chains – the puzzled professor

(Diagram: lecture theatres A, B and C with movement probabilities on the arrows: A to B 3/4, A to C 1/4, B to A 1/3, B to C 2/3, C to A 1/2, C to B 1/2.)

The professor has lost his class. He tries lecture theatres A, B and C.
Every day he moves to a different lecture theatre to look.
He moves with probabilities as shown on the diagram.

The puzzled professor (2)

We want to answer questions such as: what is the probability he is in room A on day n? Where is he most likely to "end up"?

First step – make the system formal. Number the states 0, 1, 2 for rooms A, B, C.
Let pij be the probability of moving from room i to room j on a day (pii = 0).
Let λi,j be the probability he is in room j on day i.
Let λi = (λi,0, λi,1, λi,2) be the vector of probabilities for day i.
For example λ0 = (1, 0, 0) means definitely in room A (= room 0) on day 0.

The puzzled prof (3)

Define the probability transition matrix P. Write down the equation for day n in terms of day n-1. We have:

λn,j = Σi λn-1,i pij.

Transition matrix:

P = [ p00 p01 p02 ]
    [ p10 p11 p12 ]
    [ p20 p21 p22 ]

The matrix equation is λi = λi-1 P.

Equilibrium probabilities

The matrix equation lets us calculate probabilities on a given day, but where does the prof "end up"?
Define "equilibrium probabilities" for states: πi = lim n→∞ λn,i.
Think of this as the probability the prof is in room i as time goes on.
Define the equilibrium vector π = (π0, π1, π2).
It can be shown that for a finite, connected, aperiodic chain this vector exists, is unique and does not depend on the start.
From λi = λi-1 P we get π = πP.
This vector equation and the requirement that the probabilities sum to one uniquely define πi for all i.

Equilibrium probabilities – balance equations

The matrix equation for π can also be thought of as "balance equations".
That is, in equilibrium, the probability of each state is the sum of the flow going into it:
πj = Σi pij πi for all j (in matrix terms, π = πP).
This and Σi πi = 1 are enough to solve the equations for πi.

π0 + π1 + π2 = 1    (probabilities sum to one)
π1 p10 + π2 p20 = π0    (balance for room 0)
π0 p01 + π2 p21 = π1    (balance for room 1)
π0 p02 + π1 p12 = π2    (balance for room 2)

This solves as π0 = 16/55, π1 = 21/55 and π2 = 18/55 for the prof.
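The solution can be checked in a few lines of exact arithmetic. The transition probabilities used below (A to B 3/4, A to C 1/4, B to A 1/3, B to C 2/3, C to A 1/2, C to B 1/2) are reconstructed from the diagram to be consistent with the stated answer, so treat them as an assumption:

```python
from fractions import Fraction as F

# Transition matrix for rooms A, B, C (states 0, 1, 2); each row sums to 1.
P = [[F(0),    F(3, 4), F(1, 4)],
     [F(1, 3), F(0),    F(2, 3)],
     [F(1, 2), F(1, 2), F(0)]]

pi = [F(16, 55), F(21, 55), F(18, 55)]   # the quoted equilibrium

# Balance: the probability of each room equals the total flow into it.
flow_in = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

print(flow_in == pi, sum(pi) == 1)       # True True
```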

Markov chain summary

A Markov chain is defined by a set of states and the probability of moving between them.
This type of Markov chain is a discrete time homogeneous Markov chain.
Continuous time Markov chains allow transitions at any time, not just once per "day".
Heterogeneous Markov chains allow the transition probabilities to vary as time changes.
Like the Poisson process, the Markov chain is "memoryless".
Markov chains can be used in many types of problem solving, particularly queues.

Markov recap

Before going on to do some examples, a recap.
pij is the transition probability – the probability of moving from state i to state j on the next iteration of the chain.
The transition matrix P is the matrix of the pij.
πi is the equilibrium probability – the probability that after a "long time" the chain will be in state i.
The sum of the πi must be one (the chain must be in some state).
Each state has a balance equation: πi = Σj πj pji.
The balance equations together with the sum of the πi will solve the chain (one balance equation is redundant – why?).

The Google PageRank example

Did you know Google owes part of its success to Markov chains?
"PageRank" (named after Larry Page) was how Google originally ranked search queries.
PageRank tries to work out which web page matching a search term is the most important.
Pages with many links to them are very "important", but the "importance" of the linking pages also counts.
Here we consider a very simplified version.
(Note that Larry Page is now a multi-billionaire thanks to Markov chains.)

Kittenweb – pagerank example

Imagine these four web pages are every web page about kittens and cats on the web.
An arrow indicates a link from one page to another – e.g. "Lol cats" and "Cat web" link to each other.

Kittenweb – pagerank example

Now think of a user randomly clicking on "cats/kittens" links. What page will the user visit most often? This is a Markov chain.
"Lol cats" links to two other pages, so there is probability 1/2 of visiting "Cat web" next.
"Cat web" only links to "Lol cats", so there is probability 1 of visiting that next.

Kittenweb – pagerank example

P = [ 0    0    0    1  ]
    [ 1/2  0    0   1/2 ]
    [ 0    0    0    1  ]
    [ 0   1/2  1/2   0  ]

Kittenweb – pagerank example

π0 = π1/2
π1 = π3/2
π2 = π3/2
π0 + π1 + π2 + π3 = 1    (we miss out the balance equation for π3)

π1 = π2 from lines 2 and 3.
π1 = 2π0 and π1 = π3/2 from lines 1 and 2.
π1/2 + π1 + π1 + 2π1 = 1 from line 4 and the above lines.
π1 = 2/9, π0 = 1/9, π2 = 2/9, π3 = 4/9.

Kittenweb – pagerank example

π1 = 2/9, π0 = 1/9, π2 = 2/9, π3 = 4/9.

So this shows "Lol cats" is the most important page, followed by "Cat web" and "Kitten pics", which are equally important.
Note that pages 0, 1 and 2 all have only one incoming link but are not equally important.
Nowadays Google has made many optimisations to their algorithm (and this is a simplified version anyway).
Nonetheless this "random walk on a graph" principle remains important in many network models.
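The same answer drops out of power iteration – repeatedly applying π ← πP from any starting distribution, which is close in spirit to how PageRank is computed at scale. A sketch (the iteration count is an arbitrary choice; no damping factor is used, unlike full PageRank):

```python
# Kittenweb transition matrix (states 0-3, with page 3 the most linked-to page).
P = [[0.0, 0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.0, 0.0, 1.0],
     [0.0, 0.5, 0.5, 0.0]]

pi = [0.25, 0.25, 0.25, 0.25]          # any starting distribution works
for _ in range(500):
    pi = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]

print([round(p, 3) for p in pi])       # [0.111, 0.222, 0.222, 0.444] = (1/9, 2/9, 2/9, 4/9)
```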

Queuing analysis of the leaky bucket model

A "leaky bucket" is a mechanism for managing buffers and smoothing downstream flow.
What is described here is sometimes known as a "token bucket".
A queue holds a stock of "permits" generated at a rate r (one permit every 1/r seconds) up to a maximum of W.
A packet cannot leave the queue without a permit – each packet takes one permit.
The idea is that a short burst of traffic can be accommodated but a longer burst is smoothed to ensure that downstream can cope.
Assume that packets arrive as a Poisson process at rate λ.
A Markov model will be used [Bertsekas and Gallager, page 515].

Modelling the leaky bucket

Use a discrete time Markov chain where we stay in each state for 1/r seconds (the time taken to generate one permit). Let ak be the probability that k packets arrive in one time period. Since arrivals are Poisson,

ak = e^(-λ/r) (λ/r)^k / k!.

(Diagram: permits arrive to a queue of permits every 1/r seconds; Poisson packet arrivals from the rest of the internet queue for permits; packets with permits leave through the exit queue.)

A Markov chain model of the situation

In one time period (length 1/r secs) one permit is generated (unless W permits already exist) and some may be used sending packets.
States i in {0, 1, ..., W} represent no packets waiting and W - i permits available. States i in {W+1, W+2, ...} represent 0 permits and i - W packets waiting.
If k packets arrive we move from state i to state i + k - 1 (except from state 0).
The transition probabilities pi,j from i to j are given by:

pi,j = a0 + a1    if i = j = 0,
pi,j = aj-i+1     if j >= i - 1 (excluding i = j = 0),
pi,j = 0          otherwise.
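The transition probabilities above can be written down directly in code. A sketch (the rates λ and r and the truncation point are illustrative assumptions; the real chain has infinitely many states):

```python
import math

lam, r = 0.5, 1.0           # packet arrival rate and permit rate (assumed values)
rho = lam / r

def a(k):
    """Probability a_k of k Poisson arrivals in one permit period of 1/r seconds."""
    return math.exp(-rho) * rho ** k / math.factorial(k)

N = 40                      # truncate the infinite state space for illustration
P = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(N):
        if i == 0 and j == 0:
            P[i][j] = a(0) + a(1)     # 0 or 1 arrivals both leave state 0 unchanged
        elif j >= i - 1:
            P[i][j] = a(j - i + 1)    # k = j - i + 1 arrivals move i to j

# Each row should sum to 1, up to the (tiny) truncation error.
print(sum(P[0]), sum(P[10]))
```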

Continuous time Markov chains

The Markov chains we looked at are "discrete" time – we assume that one transition occurs every time unit.
What if we want to drop this assumption?
We need to study "continuous time" Markov chains.
As it turns out this is quite easy: if the maths is treated carefully, the answers are nearly the same.

Continuous time Markov chains

Consider a chain with states numbered from 0.
The time step is some small δt, and the transition probabilities from i to j are given by pij δt.

P(δt) = [ 1 - p00 δt    p01 δt        p02 δt      ... ]
        [ p10 δt        1 - p11 δt    p12 δt      ... ]
        [ p20 δt        p21 δt        1 - p22 δt  ... ]
        [ ...           ...           ...             ]

Note the slightly "strange" definition of p00 (why?)

Continuous time Markov chains

Define the following (assuming that the states of the chain are numbered 0, 1, 2, ...):
X(t) is the state of the chain at some time t >= 0.
f(t) = (f0(t), f1(t), ...) is the vector of probabilities at time t; formally fi(t) = P[X(t) = i].
qij(t1, t2), where t1 < t2, is P[X(t2) = j | X(t1) = i].
Since the context is still homogeneous chains, these probabilities are just a function of τ = t2 - t1. Hence, define for i ≠ j

qij(τ) = qij(t2 - t1) = qij(t1, t2) = P[X(τ) = j | X(0) = i].

Define the limit

qij = lim τ→0 qij(τ)/τ.

Continuous time Markov chains

We can show in the limit, if transitions are Poisson (see notes),

dfi(t)/dt = -fi(t) Σj≠i qij + Σj≠i fj(t) qji.

It is handy to define

qii = -Σj≠i qij.

Now, define the matrix

Q = [ q00 q01 q02 ... ]
    [ q10 q11 q12 ... ]
    [ q20 q21 q22 ... ]
    [ ...             ]

It can now be seen that

df(t)/dt = f(t)Q.
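The equation df(t)/dt = f(t)Q can be integrated numerically to watch f(t) settle down. A sketch with a small two-state chain (the rates 2 and 1, the step size and the time span are arbitrary assumptions):

```python
# Two-state chain: leave state 0 at rate 2, leave state 1 at rate 1.
# Note the diagonal q_ii is minus the sum of the other rates in its row.
Q = [[-2.0, 2.0],
     [1.0, -1.0]]

f = [1.0, 0.0]                 # start definitely in state 0
dt = 0.001                     # Euler step
for _ in range(20_000):        # integrate up to t = 20
    dfdt = [sum(f[i] * Q[i][j] for i in range(2)) for j in range(2)]
    f = [f[j] + dt * dfdt[j] for j in range(2)]

print([round(x, 3) for x in f])   # settles at the equilibrium (1/3, 2/3)
```

Note the equilibrium satisfies the balance of rates: 2·(1/3) out of state 0 equals 1·(2/3) into it.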

Completing the continuous Markov Chain (1)

Notice that

Q = P(1) - I,

where I is the identity matrix.
Assume the chain is finite and there are no disconnected states. Now the equilibrium probabilities can be calculated. In this case

π = lim t→∞ f(t).

Therefore

πQ = lim t→∞ [f(t)Q] = lim t→∞ df(t)/dt = dπ/dt = 0.

Completing the continuous Markov Chain (2)

This gives a new version of our balance equations. For all i,

Σj πj qji = 0.

Expanding the qii term from its definition and multiplying by -1 gives

πi Σj≠i qij - Σj≠i πj qji = 0.

This can also be seen as a balance of flows into and out of the state. For all i:

πi Σj≠i qij = Σj≠i πj qji    (output) = (input)

Also, as usual, Σi πi = 1.

The "talking on the phone" example

If I am talking on the phone, I will hang up as a Poisson process with rate h (for hang up).
If I am not talking on the phone, I will decide to start a new call as a Poisson process with rate t (for talk).
At a given time, what is the probability I am talking on the phone?
Unsurprisingly this can be modelled as a Markov chain.
This example may seem "trivial" but several such chains could be used to model how occupied the phone network is.

The "talking on the phone" example

(Diagram: state 0 with an arrow to state 1 at rate h, and an arrow back at rate t.)

Our chain has two states, 0 (talking) and 1 (not talking), and the transition matrix:

Q = [ -h  h ]
    [  t -t ]

We need our new balance equations:
State 0 – (output) hπ0 = tπ1 (input)
State 1 – (output) tπ1 = hπ0 (input)

The "talking on the phone" example

State 0 – (output) hπ0 = tπ1 (input)
State 1 – (output) tπ1 = hπ0 (input)
We also need π0 + π1 = 1, which gives from state 0

hπ0 = t(1 - π0).

Rearrange to get π0 = t/(h + t) and π1 = h/(h + t).
Interpretation – the proportion of time talking (state 0) is proportional to the talking rate t, and the proportion hung up is proportional to the hangup rate h.
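A quick exact-arithmetic check of this result, with the (h, t) rate pairs chosen arbitrarily for illustration:

```python
from fractions import Fraction as F

# Verify pi0 = t/(h+t) and pi1 = h/(h+t) against the state-0 balance
# equation h*pi0 = t*pi1, for a few assumed rate pairs.
for h, t in [(F(1, 2), F(1, 10)), (F(2), F(1)), (F(3, 7), F(5, 11))]:
    pi0 = t / (h + t)
    pi1 = h / (h + t)
    assert h * pi0 == t * pi1      # flow out of state 0 equals flow in
    assert pi0 + pi1 == 1

print("balance equations hold exactly")
```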

Markov Chain reminder

Balance equations: N equations for the N states of the chain – but only N - 1 are independent.

Discrete – for all i,
Σj πj pji = πi    (input) = (original probability)

Continuous – for all i,
Σj≠i πj qji = πi Σj≠i qij    (input) = (output)

Probabilities sum to one: for both types, Σi πi = 1.

Lecture summary

Stochastic processes: stochastic processes are processes describing how a system evolves; they are in some way "random" but can lead to interesting behaviour.
Poisson processes: Poisson processes are stochastic processes which can be thought of as describing arrivals. They are memoryless and have many useful mathematical properties.
Markov chains: Markov chains are stochastic processes which describe the evolution of a system between states. They remember only their current state and can be used to model a wide variety of situations.
