Icne Baidu's Best Practice With Low Latency Networks

Transcription

Baidu’s Best Practice with Low Latency Networks
Feng Gao
IEEE 802 IC NEND
Orlando, FL
November 2017
Presented by Huawei

01 Low Latency Network Solutions
1. Background Introduction
2. Network Latency Analysis
3. Low Latency Network Solutions
4. Best Practice

Background Introduction
Application areas: Artificial Intelligence, High Performance Computing, Cloud, Real Time Big Data Analysis
- Latency-sensitive applications are being developed and deployed in data centers, shifting the goal from the simple pursuit of high bandwidth and non-blocking to the pursuit of low latency and no packet loss.
- Bandwidth-centric network design is giving way to latency-centric design.
- Reduce latency jitter.

Network Latency Analysis
Components of network latency (L):
- P: photoelectric propagation delay (fixed)
- S: serialization delay
- N: node forwarding delay
- D: retransmission
Ways to reduce latency:
- Low latency chip: reduce the device forwarding delay
- Large capacity chip: reduce serialization delay
- No packet loss: reduce packet re-transmission delay
- Host acceleration: reduce host processing delay
[Technology labels on the slide: np, pipeline; 10G, 40G, 100G; RoCE, iWARP, Infiniband, PCIE; DCB, IB, ECN]
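
To make the serialization term concrete, here is a minimal sketch of how serialization delay shrinks as the link rate moves from 10G to 40G to 100G. The 1500-byte frame size and the function name are assumptions chosen for illustration; they do not come from the slides.

```python
def serialization_delay_us(frame_bytes: int, link_gbps: float) -> float:
    """Time to clock one frame onto the wire, in microseconds."""
    bits = frame_bytes * 8
    # 1 Gbit/s equals 1e3 bits per microsecond.
    return bits / (link_gbps * 1e3)


FRAME_BYTES = 1500  # assumed frame size, not taken from the slides

delays = {rate: serialization_delay_us(FRAME_BYTES, rate) for rate in (10, 40, 100)}
for rate, delay in delays.items():
    print(f"{rate:>3}G: {delay:.2f} us")
```

Under these assumptions a tenfold increase in link rate cuts the per-hop serialization delay tenfold, while the fixed propagation term is unaffected.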

Low Latency Solution: Host Acceleration
[Figure labels: Credit, CNP, PFC, QCN, PFC, ECN]
RDMA vs TCP/IP
- Kernel bypass brought by RDMA reduces the latency on the host.
RoCEv2
- Compatible with current Ethernet-based DCN.
- Low CAPEX/OPEX.
- Easy to deploy; easy to reuse the existing operation capability.

Low Latency Solution: PFC + ECN
1. PFC (Priority Flow Control) is a back-pressure protocol based on priority queues. A congested node sends a Pause frame to notify the upstream node to stop sending, which prevents buffer overflow and packet loss.
   PFC problems:
   - HOL blocking
   - PFC deadlock
   - PFC unfairness
   - Pause storm
2. ECN (Explicit Congestion Notification) is a flow-based, end-to-end congestion control mechanism.
   Issues noted in the figure (labels: NIC, SW, CE, CNP):
   - Long control loop
   - Complexity of setting the threshold
   - Randomness
   - Different congestion control mechanisms on NICs
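
The two mechanisms can be contrasted with a small sketch. It assumes a RED-style ECN marking curve (no marking below a minimum threshold, linear ramp up to a maximum probability, always mark above a maximum threshold) and a single PFC pause threshold. The parameter names `kmin_kb`, `kmax_kb`, `pmax`, and `xoff_kb` and all numeric values are hypothetical; the slides do not specify them.

```python
def ecn_mark_probability(queue_kb: float, kmin_kb: float, kmax_kb: float, pmax: float) -> float:
    """Probability of setting the CE mark for a given egress queue depth (assumed RED-style curve)."""
    if queue_kb <= kmin_kb:
        return 0.0
    if queue_kb >= kmax_kb:
        return 1.0
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


def pfc_pause_needed(queue_kb: float, xoff_kb: float) -> bool:
    """True when the queue has reached the (assumed) pause threshold."""
    return queue_kb >= xoff_kb


# Hypothetical thresholds: ECN reacts well before PFC would pause the link.
KMIN_KB, KMAX_KB, PMAX, XOFF_KB = 100, 400, 0.2, 800

for depth in (50, 250, 500, 900):
    p = ecn_mark_probability(depth, KMIN_KB, KMAX_KB, PMAX)
    print(f"queue={depth} KB  mark_prob={p:.2f}  pause={pfc_pause_needed(depth, XOFF_KB)}")
```

With thresholds ordered this way, senders are slowed per flow by ECN first, and the per-priority Pause only acts as a last resort against loss.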

Low Latency Solution: Network Architecture
[Figure label: 2*100GE]
1. Non-blocking when scaling out: speedup 1:1
2. Speedup > 1 when fanning out: speedup 4:3

Best Practice - 1
Evaluation objects:
1. PFC only and ECN + PFC
2. Under different network utilization and speedup ratio
[Figure panels: Speedup 1:1, Speedup 4:3]
Conclusions:
1. ECN + PFC outperforms PFC only across the different levels of network utilization.
2. A higher speedup ratio improves the efficiency of the network: the higher, the better.
3. Thresholds should be configured properly: provided the headroom is reserved, the PFC threshold should be set as high as possible. The ECN threshold should be set based on the traffic pattern.

Best Practice - 2
Evaluation objects:
1. DCQCN: PFC + ECN
2. New solution: ECN enabled on TOR downlinks, per-packet load balancing
Conclusions (with incast flows introduced, 75%):
1. An ideal load balancing algorithm is needed: increasing the speedup ratio can mitigate the congestion of the fabric's internal ports, but the packet loss caused by uneven distribution of traffic still exists.

02 Innovations on Low Latency Network Technology
1. Control Plane – Feedback Mechanism
2. Data Plane – Multipath Load Balancing
3. Management Plane – Self Adaptive Network
4. Function enhancement: Queuing Optimization

Control Plane – Feedback Mechanism Optimization
Traditional Congestion Notification
- Feedback info is simple: it only marks congested/uncongested and carries no quantized congestion information.
- Notification loop is long: the NIC generates the congestion notification, so the control loop is long. The congestion notification packet is mixed with normal traffic, with no prioritization.
Congestion Notification / Packet Loss Notification
- Notification message improvement: introduce a congestion notification mechanism with more quantized levels instead of two states.
- Multiple ways to accelerate:
  - The switch feeds back congestion/packet loss directly, shortening the control loop.
  - Set a higher priority for the notification message.
  - TCP fast retransmission.
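
The difference between two-state and quantized feedback can be sketched as follows. This is only an illustration: the function names, the 8-level default, and the queue sizes are assumptions, and the slides do not define an encoding.

```python
def binary_feedback(queue_len: int, threshold: int) -> int:
    """Traditional notification: 1 if congested, 0 otherwise."""
    return 1 if queue_len > threshold else 0


def quantized_feedback(queue_len: int, capacity: int, levels: int = 8) -> int:
    """Map queue occupancy onto an integer level in 0..levels-1 (assumed encoding)."""
    occupancy = min(max(queue_len / capacity, 0.0), 1.0)
    return min(int(occupancy * levels), levels - 1)


CAPACITY, THRESHOLD = 1000, 250  # hypothetical queue size and marking threshold

for depth in (300, 900):
    print(depth, binary_feedback(depth, THRESHOLD), quantized_feedback(depth, CAPACITY))
```

Both queue depths look identical to the binary scheme, whereas the quantized level lets the sender tell mild congestion from severe congestion and scale its rate reduction accordingly.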

Data Plane – Multipath Load Balancing
Dynamic load balancing
The traditional hash algorithm distributes traffic unevenly
- In a multi-path scenario, a hash algorithm based on the flow 5-tuple may map elephant flows to the same link, introducing persistent congestion on that link.
New multi-path load balancing
- Select an idle path based on the measured load of the multiple paths.
- Use the length of the egress queue as a hash key of the load balancing algorithm.
- Cut elephant flows into flowlets and schedule them onto different paths while making sure packets do not arrive out of order.
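
A minimal sketch of flowlet-based path selection is shown below. It assumes that a flowlet boundary is detected by an idle gap between packets of the same flow and that the egress queue length is the load signal; the class name, the `gap` parameter, and the packet-count queue model are hypothetical and not taken from the slides.

```python
class FlowletBalancer:
    """Keep a flow on its path within a flowlet; pick the least-loaded path for a new flowlet."""

    def __init__(self, num_paths: int, gap: float):
        self.gap = gap                      # idle time (seconds) that starts a new flowlet
        self.queue_len = [0] * num_paths    # simplified per-path egress queue length
        self._state = {}                    # flow id -> (path, time of last packet)

    def route(self, flow: str, now: float) -> int:
        entry = self._state.get(flow)
        if entry is not None and now - entry[1] < self.gap:
            path = entry[0]                 # same flowlet: stay on the path to preserve order
        else:
            # New flowlet: choose the path with the shortest queue (first one on ties).
            path = min(range(len(self.queue_len)), key=self.queue_len.__getitem__)
        self._state[flow] = (path, now)
        self.queue_len[path] += 1
        return path


lb = FlowletBalancer(num_paths=2, gap=0.001)
p1 = lb.route("A", 0.0)      # first packet of the flow
p2 = lb.route("A", 0.0001)   # within the gap, same flowlet
p3 = lb.route("A", 0.01)     # gap exceeded, new flowlet may move
print(p1, p2, p3)
```

Reordering is avoided only if the gap is larger than the delay difference between paths, so in practice the gap would have to be tuned to the fabric.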

Management Plane – Adaptive Network
[Figure label: Analyzer]
A low latency network places higher requirements on the automation of operation and maintenance management
- Given the strict requirements on packet loss and latency, the network configuration needs to be adapted dynamically so that the online configuration is always the best one.
Effect of the Adaptive Network
1. Detection and discovery
   - Traffic measurement: mark the information along the network nodes (timestamp, ingress port, egress port, queue).
2. Computing and characteristic analysis
   - Analyze real-time service characteristics and calculate the optimal scheduling strategy.
3. Instruction distribution and continuous optimization
   - According to the traffic pattern, self-configure and dynamically tune the parameters.
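
As an illustration of the detection step, the sketch below derives per-hop latency from records marked along the path. The record fields follow the slide (timestamp, ingress port, egress port, queue); the `HopRecord` type, the node names, and all values are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class HopRecord(NamedTuple):
    node: str
    timestamp_us: float
    ingress_port: int
    egress_port: int
    queue: int


def hop_latencies(path: List[HopRecord]) -> List[Tuple[str, str, float]]:
    """Latency between each pair of consecutive nodes, from their timestamps."""
    return [(a.node, b.node, b.timestamp_us - a.timestamp_us) for a, b in zip(path, path[1:])]


# Hypothetical records collected for one packet.
path = [
    HopRecord("tor1", 0.0, 1, 49, 3),
    HopRecord("spine1", 2.5, 3, 7, 3),
    HopRecord("tor2", 9.0, 50, 2, 3),
]

lat = hop_latencies(path)
worst = max(lat, key=lambda hop: hop[2])
print(lat, worst)
```

An analyzer could feed the slowest hop and its queue into the tuning step, for example to adjust thresholds on that port.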

Function enhancement: Queuing Optimization
[Figure labels: Back-pressure, Back-pressure]
Traffic characteristics
- Elephant flows: contribute 80 percent of total traffic. Packet loss has little influence on overall performance. Latency-insensitive.
- Mice flows: contribute 20 percent of the data traffic load. Packet loss has a serious influence on overall performance. Latency-sensitive.
Advantages of the technical solution
- Low latency: isolate the congested flows so that non-congested flows keep low latency.
- High throughput: buffer congested flows along the path and fully utilize the link capacity without slowing down the mice flows.
- Quick response: zero hop response.
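
One way to picture flow isolation is a port with a normal queue and an isolated queue. The sketch below is an assumption-laden illustration, not the mechanism from the slides: it classifies a flow as an elephant by a hypothetical byte threshold and serves the normal queue with strict priority.

```python
from collections import defaultdict, deque


class IsolatingPort:
    """Move flows that exceed a byte budget into a separate queue (hypothetical policy)."""

    def __init__(self, elephant_bytes: int):
        self.elephant_bytes = elephant_bytes
        self.bytes_seen = defaultdict(int)
        self.normal = deque()
        self.isolated = deque()

    def enqueue(self, flow: str, size: int) -> None:
        self.bytes_seen[flow] += size
        # Once a flow crosses the threshold it stays isolated, so its packets keep their order.
        queue = self.isolated if self.bytes_seen[flow] > self.elephant_bytes else self.normal
        queue.append((flow, size))

    def dequeue(self):
        if self.normal:
            return self.normal.popleft()
        if self.isolated:
            return self.isolated.popleft()
        return None


port = IsolatingPort(elephant_bytes=3000)
for _ in range(4):
    port.enqueue("bulk", 1500)
port.enqueue("rpc", 200)

order = [port.dequeue()[0] for _ in range(5)]
print(order)
```

The short "rpc" packet leaves ahead of the isolated bulk packets even though it arrived last. Strict priority can starve the isolated queue, so a real scheduler would likely use weights instead.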

03 Summary

Summary
- Business Orientation: internal requirements from Baidu Cloud & Artificial Intelligence applications.
- Network Orientation: under the overall layout of the network, achieve network acceleration within part of the data center network.
- Product Orientation: promote industrial development; products need to be optimized.
- Architecture Evolution: invest at small scale; optimize and iterate gradually.

THANKS
