Fault Tolerant Service Function Chaining

Transcription

Fault Tolerant Service Function Chaining
M. Ghaznavi, E. Jalalpour, B. Wong, R. Boutaba, A. Mashtizadeh
University of Waterloo

Middleboxes and Service Function Chains (slide 1)
[Figure: a chain of Firewall, IDS, and NAT middleboxes connected to the Internet]

Middlebox Failures (slide 2)
[Figure: bar charts of percent contribution to high severity incidents and to device population, for L2 switches, L3 routers, middleboxes, and others. Source shown on the slide: "Demystifying the dark side of the middle: A field study of middlebox failures in datacenters", IMC 2013 (Navendu Jain, Microsoft Research).]

Middlebox Fault Tolerance (slide 3)
[Figure: Alice and Bob connect to the Internet through a NAT; a second NAT instance is also shown]

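Consistent State Replication (slide 4)
[Figure: Alice and Bob connect to the Internet through a NAT that holds the entries "Bob Bing" and "Alice Apple"; the second NAT holds the same two entries]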

Consistent State Replication (slide 5)
[Figure: the same setup, but the second NAT holds only the entry "Bob Bing" while the first NAT holds both "Bob Bing" and "Alice Apple"]

Previous Approaches (slide 6)
Externalized state:
- StatelessNF, NSDI 2017
- CHC, NSDI 2019
Snapshot based:
- Pico Replication, SoCC 2013
- FTMB, SIGCOMM 2015
- REINFORCE, CoNEXT 2018

Externalized State Approach (slide 7)
[Figure: a NAT on the path to the Internet reads and writes its state in a fault tolerant data store]

Snapshot Based Approaches (slide 8)
[Figure: Alice and Bob connect to the Internet through a NAT holding the primary state ("Bob Bing", "Alice Apple"); a second NAT holds the replicated state with the same entries]

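Snapshot Based Approaches for a Chain (slide 9)
[Figure: a chain of FW, IDS/IDPS, and NAT on the path to the Internet, where each middlebox holding primary state is paired with an instance holding replicated state]
- High latency
- Low throughput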

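Our Approach (slide 10)
[Figure: a fault tolerant chain of Firewall (F), IDS (I), and NAT (N); the legend distinguishes primary state from replicated state, with replicated copies labeled F' and I']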

Goals (slide 11)
- Consistent state replication to tolerate a given number of middlebox failures
- Minimizing performance overhead during normal operation
- Minimizing disruption during middlebox failures

Fault Tolerant Chaining (FTC) (slide 12)
In-chain replication
- Replicates a chain's state instead of the state of individual middleboxes
- Each middlebox's state is replicated to as many subsequent middlebox servers as there are failures to tolerate
Transactional packet processing
- Simplifies the development of multi-threaded middleboxes
- Improves scalability and performance
Data dependency vectors
- Enable concurrent state replication
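As a rough illustration of in-chain replication, the sketch below computes which servers in a chain would hold replicas of each middlebox's state. The function name `replica_servers`, the parameter `f` (the number of failures to tolerate), and the wrap-around at the end of the chain are assumptions made for this example and are not stated on the slide.

```python
from typing import List


def replica_servers(chain_length: int, index: int, f: int) -> List[int]:
    """Return the 0-based positions of the servers replicating middlebox `index`.

    Assumption for this sketch: the state of the middlebox at position `index`
    is copied to the next `f` servers in the chain, wrapping around at the end.
    """
    if not 0 <= index < chain_length:
        raise ValueError("index out of range")
    if not 0 <= f < chain_length:
        raise ValueError("f must be non-negative and smaller than the chain length")
    return [(index + k) % chain_length for k in range(1, f + 1)]


if __name__ == "__main__":
    # A chain of three middleboxes that tolerates one failure.
    for i in range(3):
        holders = [s + 1 for s in replica_servers(3, i, 1)]
        print(f"state of m{i + 1} is replicated on server(s) {holders}")
```

Because replicas live on servers that the chain already uses, no dedicated standby server per middlebox is needed in this sketch.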

Normal Operation (slides 13-15)
[Figure, shown across three successive slides: a chain of middleboxes m1, m2, and m3 with replicated state r1, r2, and r3, a forwarder (Forw.), and a buffer]

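Failure Recovery (slide 16)
[Figure: the same chain of m1, m2, and m3 with replicated state r1, r2, and r3, a forwarder (Forw.), and a buffer; a new instance m2' appears together with r2. The legend distinguishes primary state from replicated state.]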

Transactional Packet Processing (slide 17)
Existing approaches
- Single-threaded or batched packet processing
- FTMB: multi-threaded packet processing
  - Tracks state changes at the granularity of each state variable read/write
  - Frequent periodic state snapshots
Our approach
- Packet transaction model for concurrent packet processing
- Uses the isolation property to track state changes at the granularity of packet transactions
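The packet transaction idea can be sketched as follows. This is a minimal illustration, not FTC's implementation: the slide does not say how isolation is enforced, so ordered locking over state partitions is a technique swapped in for the example, and the names `TransactionalState`, `run`, and `count_packet` are hypothetical.

```python
import threading
from typing import Callable, Dict, Iterable, List


class TransactionalState:
    """Partitioned middlebox state accessed only through packet transactions."""

    def __init__(self, num_partitions: int) -> None:
        self.values: List[int] = [0] * num_partitions
        self._locks = [threading.Lock() for _ in range(num_partitions)]

    def run(self, partitions: Iterable[int],
            body: Callable[[List[int]], None]) -> Dict[int, int]:
        """Run `body` in isolation over `partitions`; return one log per transaction."""
        ordered = sorted(set(partitions))
        for p in ordered:  # a fixed acquisition order avoids deadlock
            self._locks[p].acquire()
        try:
            body(self.values)
            # State changes are captured once per transaction,
            # not once per variable read or write.
            return {p: self.values[p] for p in ordered}
        finally:
            for p in reversed(ordered):
                self._locks[p].release()


def count_packet(values: List[int]) -> None:
    """Hypothetical per-packet logic: bump a counter stored in partition 0."""
    values[0] += 1
```

Several threads can call `run` concurrently. Transactions that touch disjoint partitions proceed in parallel, while transactions that share a partition are serialized.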

Data Dependency Vectors (slide 18)
- Tracking data changes instead of thread operations
- Enabling a different number of threads at the middlebox and its replicas
  - Fail over to a smaller machine
  - Scale up to a larger machine

Middlebox       Product               Throughput   CPU cores
IPSec           HP VSR1001            268 Mbps     1
IPSec           HP VSR1008            926 Mbps     8
WAN Optimizer   SteelHead CCX770M     10 Mbps      2
WAN Optimizer   SteelHead CCX1555M    50 Mbps      4
WAF             Barracuda Level 1     100 Mbps     1
WAF             Barracuda Level 5     200 Mbps     2

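Data Dependency Vectors Example (slide 19)
[Figure: a middlebox and a replica, each starting with dependency vector ⟨0,3,4⟩]
Middlebox
1. Transaction W(1) is logged with vector ⟨0,x,x⟩; the middlebox's dependency vector becomes ⟨1,3,4⟩
2. Transaction R(1), W(3) is logged with vector ⟨1,x,4⟩; the middlebox's dependency vector becomes ⟨2,3,5⟩
Replica
3. Log ⟨1,x,4⟩ arrives while the replica's dependency vector is ⟨0,3,4⟩: hold
4. Log ⟨0,x,x⟩ arrives and is applied; the replica's dependency vector becomes ⟨1,3,4⟩
5. The held log ⟨1,x,4⟩ is applied; the replica's dependency vector becomes ⟨2,3,5⟩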
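The vectors on this slide can be replayed with a short sketch. It assumes one sequence number per state partition, that a transaction's log records the sequence numbers of the partitions it touched (with `x` for the rest), that every touched partition is incremented, and that a replica applies a log only when the recorded entries equal its own. The class and method names are hypothetical, and partitions are 0-based here while the slide counts from 1.

```python
from typing import Iterable, List, Optional, Tuple

X = None  # "don't care" entry, written as x on the slide
Vector = List[Optional[int]]


class Middlebox:
    def __init__(self, vector: Iterable[int]) -> None:
        self.vector = list(vector)

    def run(self, partitions: Iterable[int]) -> Vector:
        """Process one transaction and return the dependency vector of its log."""
        touched = set(partitions)
        dep: Vector = [X] * len(self.vector)
        for p in touched:
            dep[p] = self.vector[p]  # sequence number seen by the transaction
        for p in touched:
            self.vector[p] += 1      # advance every touched partition
        return dep


class Replica:
    def __init__(self, vector: Iterable[int]) -> None:
        self.vector = list(vector)
        self.held: List[Tuple[str, Vector]] = []
        self.applied: List[str] = []

    def _ready(self, dep: Vector) -> bool:
        return all(d is X or d == v for d, v in zip(dep, self.vector))

    def receive(self, name: str, dep: Vector) -> None:
        """Hold the log, then apply every held log whose dependencies are met."""
        self.held.append((name, dep))
        progress = True
        while progress:
            progress = False
            for entry in list(self.held):
                if self._ready(entry[1]):
                    self.held.remove(entry)
                    for i, d in enumerate(entry[1]):
                        if d is not X:
                            self.vector[i] += 1
                    self.applied.append(entry[0])
                    progress = True


if __name__ == "__main__":
    m = Middlebox([0, 3, 4])
    log1 = m.run([0])        # W(1)
    log2 = m.run([0, 2])     # R(1), W(3)
    r = Replica([0, 3, 4])
    r.receive("t2", log2)    # arrives first and is held
    r.receive("t1", log1)    # applied, which then releases t2
    print(m.vector, r.vector, r.applied)
```

Because ordering is derived from the partitions a transaction touched rather than from which thread ran it, the replica in this sketch can apply logs with any number of threads.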

Evaluation (slide 20)
Method: comparing FTC with
- NF, a non-fault-tolerant system
  - Ideal performance
- FTMB (SIGCOMM 2015)
  - State logging
  - Snapshots
- FTMB Snapshot (SIGCOMM 2015)
  - State logging
  - Snapshots
Environments
- A cluster of 12 servers
  - 40 Gbps network
- SAVI Cloud environment
  - A virtual network of OVS switches
- MoonGen and pktGen traffic generators
  - UDP traffic
  - Packet size: 256 B

Fault Tolerant NATs (slide 21)
[Figure: throughput (Mpps) versus number of threads (1, 2, 4, 8) for NF, FTC, and FTMB. Annotation: "2x higher throughput".]

Fault Tolerant Chains – Throughput (slide 22)
[Figure: throughput (Mpps) versus chain length (2, 3, 4, 5) for NF, FTMB Snapshot, FTMB, and FTC. Annotations: "1.8x higher throughput", "3.5x", and "39% drop due to snapshots".]

Fault Tolerant Chains – Latency (slide 23)
[Figure: packets versus latency (µs); only the NF label survives in the transcript]

Conclusion (slide 24)
Keep a chain of middleboxes operating after a given number of its middleboxes fail, using:
- In-chain replication
- Transactional packet processing
- Data dependency vectors
