WHITE PAPER

Mellanox Seamlessly Integrates With OpenStack, Increasing Efficiency and Reducing Operational Costs
High-speed Ethernet with hardware offloads delivers total infrastructure efficiency for NFV and Cloud Data Centers

Key Benefits
- Achieve 10X better performance than vanilla OVS with ASAP2 OVS offloads
- Free up 100% of CPU cores with ASAP2
- The best industry throughput and 50% CapEx savings for DPDK
- Improve latency by 20X with SR-IOV
- Improve CPU utilization by 80% with overlay network offloads
- Deliver 6X throughput with RDMA
- Improve TCO by marrying bare-metal and virtualized OpenStack clouds with Mellanox Spectrum switches and ConnectX intelligent network adapters

Executive Summary
OpenStack, commonly referred to as "the Linux of the Cloud," allows companies to use open-source initiatives and a transparent, collaborative approach to implement public and private cloud solutions that achieve business agility, infrastructure elasticity and operational simplicity. While OpenStack provides a flexible framework, it is particularly important that the cloud infrastructure, composed of compute, network and storage resources, runs at maximum performance and efficiency to guarantee overall application performance. This paper discusses the requirements for a cloud network infrastructure that properly supports web-scale IT, with Red Hat OpenStack as the cloud management platform tightly integrated with Mellanox end-to-end networking solutions.

Challenges to Efficient Cloud Deployment
To achieve multi-tenancy and automation goals, cloud deployments often leverage disaggregation and virtualization technologies. However, this comes at the cost of significant performance penalties, which manifest as low data communication and storage access performance and heightened CPU utilization. As a result, organizations compensate by over-provisioning CPU cores, resulting in a larger hardware footprint and higher capital expenditures, thus reducing total infrastructure efficiency.

Understanding Compute Virtualization Penalties
In virtualized environments, multiple virtual machine (VM) instances run simultaneously on physical server hardware. This has necessitated virtual switch software, typically residing alongside the hypervisor in the OS kernel, to handle network I/O traffic to and from VMs. While virtualization brings flexibility, it also results in significantly degraded I/O performance due to the increased layers of processing in software, with the CPU burdened by the majority of virtual network I/O processing.

Avoid Network Virtualization Penalty
Open vSwitch (OVS), vRouter, VPP and Linux Bridge are some of the most popular virtual switch platforms used in OpenStack cloud deployments today. OVS hardware offloads accelerate traditionally slow virtual switch packet performance by an order of magnitude, essentially offering the best of both worlds: hardware acceleration of the data path (fast path) for high-throughput flows, along with an unmodified standard OVS control path for flexibility and programming of match-action rules.

Overlay tunneling protocols such as VXLAN, NVGRE or GENEVE are not recognized by every server NIC, and processing these packet formats without NIC hardware offload must be done by the OS kernel on the CPU, resulting in lower, nondeterministic I/O performance and increased CPU load.

Solving Storage Virtualization Penalties
The TCP/IP protocol stack is not the most efficient way to power storage networks. The protocols were designed with many handshakes between endpoints, and it is almost impossible to offload all protocol handling into NIC hardware. As a result, complex protocol software must run on the CPU, which can result in low storage access bandwidth, low IOPS, and high CPU overhead.

To overcome these penalties and achieve ultimate infrastructure efficiency and application performance, cloud operators are looking to implement efficient virtual network solutions that provide excellent virtualization and bypass the TCP/IP stack for storage I/O, achieving acceleration and efficiency in cloud networks.

Mellanox OpenStack Cloud Network Solution
Through an end-to-end suite of interconnect products (adapters, switches, cables/optics, and the associated network driver and management software), Mellanox enables cloud data centers to achieve the highest efficiency through a high-performance, low-latency cloud network with rich network offload, acceleration and automation features. Mellanox mitigates the above-mentioned penalties, delivering cloud networks that handle line-rate processing at 10, 25, 40, 50, and 100Gb/s speeds and support high-throughput, high-IOPS storage operations with minimal CPU overhead, so that infrastructure resources can be dedicated to the actual application workload.

Overcoming Penalties
Mellanox provides the foundation for efficient cloud infrastructure through disaggregation and virtualization solutions that mitigate the performance penalties associated with compute, network, virtualization and storage, enabling cloud applications to run at the highest performance and efficiency. Mellanox achieves higher cloud efficiency through the following solutions: Open vSwitch (OVS) Offloads, OVS over DPDK, Network Overlay Virtualization, SR-IOV, and RDMA.

Increase OVS Efficiency with ASAP2
Mellanox offers an open-source, high-performance OVS offload solution called Accelerated Switching and Packet Processing (ASAP2). ASAP2 fully and transparently offloads networking functions such as overlays, routing, security and load balancing to the adapter's embedded switch (e-switch). ASAP2 provides a throughput of 66Mpps for small packets and line-rate performance for large packets while completely freeing up the CPU cores.

ASAP2 is a significant improvement over traditional approaches because it ensures that SDN and network programmability capabilities are maintained while network I/O achieves the highest performance on compute nodes. With a virtual switch/router used in almost all production OpenStack deployments, it makes sense to use ASAP2 with any virtual switch/router implementation to give a tremendous performance boost in terms of higher packet throughput and lower latencies, and to improve cloud efficiency.
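For illustration, the following is a minimal sketch of how OVS hardware offload is typically enabled on a ConnectX-class adapter in switchdev mode; the interface name and PCI address are placeholders, and the exact, supported procedure for a given RHEL/OSP release should be taken from the Mellanox and Red Hat documentation.

```python
# Hypothetical helper: enable OVS hardware offload on one uplink.
# The interface name and PCI address below are placeholders for this sketch.
import subprocess

PF_IFACE = "ens1f0"        # assumed physical-function netdev name
PF_PCI = "0000:03:00.0"    # assumed PCI address of the adapter

def run(cmd):
    """Run a command, echo it, and fail loudly if it returns non-zero."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Put the NIC e-switch into switchdev mode so VF representor ports appear.
run(["devlink", "dev", "eswitch", "set", f"pci/{PF_PCI}", "mode", "switchdev"])

# 2. Enable TC flower offload on the uplink so datapath rules can be pushed
#    into the e-switch.
run(["ethtool", "-K", PF_IFACE, "hw-tc-offload", "on"])

# 3. Tell Open vSwitch to offload datapath flows to hardware, then restart it.
run(["ovs-vsctl", "set", "Open_vSwitch", ".", "other_config:hw-offload=true"])
run(["systemctl", "restart", "openvswitch"])
```

With offload enabled, the first packet of a flow is still handled by the standard OVS slow path, while subsequent packets of that flow are matched directly in the e-switch.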
Mellanox ASAP2 is fully integrated with RHEL 7.5 and Red Hat OSP 13.

The Best Performance for DPDK
The Data Plane Development Kit (DPDK) reduces the overhead caused by the interrupts issued each time a new packet arrives for processing. DPDK instead polls for new packets, significantly improving processing performance while eliminating interrupt overhead and maintaining hardware independence. Although DPDK consumes CPU cycles, Mellanox ConnectX-5 adapters offer the industry's highest bare-metal packet rate of 139 million packets per second for running OVS or VNF cloud applications over DPDK. Mellanox DPDK is fully supported by Red Hat for RHEL 7.5 and OSP 13.
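As a purely conceptual illustration (not the DPDK API), the difference between interrupt-driven and poll-mode packet processing can be pictured as a dedicated core pulling packets in bursts instead of waiting for per-packet interrupts; the receive ring below is a toy stand-in for a NIC queue.

```python
# Conceptual illustration only (not the DPDK API): a poll-mode receive loop
# pulls packets in bursts instead of waiting for per-packet interrupts.
from collections import deque

rx_ring = deque(f"pkt-{i}" for i in range(96))   # toy NIC receive ring

def poll_burst(ring, burst=32):
    """Grab up to `burst` packets without blocking; an empty list is an idle poll."""
    return [ring.popleft() for _ in range(min(burst, len(ring)))]

processed = 0
while rx_ring:                       # a dedicated core would spin here indefinitely
    for pkt in poll_burst(rx_ring):
        processed += 1               # real packet processing would happen here
print(processed)                     # -> 96 packets handled with zero interrupts
```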

Figure 1. Comprehensive OS Integration for Mellanox Switch and Adapter

Network Virtualization with VXLAN Offload, and VTEP Gateway
Mellanox has offloaded VXLAN protocol processing to the NIC since the ConnectX-3 generation. The VXLAN Offload feature enables the NIC to handle stateless processing of VXLAN packets, such as checksum calculation, Receive Side Scaling (RSS) and Large Segmentation Offload (LSO), significantly improving throughput and latency while reducing the CPU overhead associated with overlay packet processing. In addition to VXLAN, Mellanox NICs also support offload of other overlay encapsulation protocols such as NVGRE and GENEVE.

Oftentimes, VXLAN networks need to communicate with other networks, such as VLAN networks that support bare-metal servers, or wide area networks for data center interconnect and north-south user traffic. This necessitates a VXLAN Tunnel Endpoint (VTEP) gateway, which connects VXLAN networks to other types of networks (i.e., VLAN). Mellanox Spectrum switches support VTEP gateway functionality in hardware, ensuring the highest performance when heterogeneous networks in the cloud communicate with each other.
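To make the overlay-processing burden concrete, the sketch below uses Scapy (assumed to be available) to build a VXLAN-encapsulated frame. The addresses and VNI are illustrative; the point is the extra outer Ethernet/IP/UDP/VXLAN headers that, without NIC offload, the kernel must parse and checksum for every packet.

```python
# Sketch of a VXLAN-encapsulated frame using Scapy (illustrative values only).
from scapy.layers.l2 import Ether
from scapy.layers.inet import IP, UDP
from scapy.contrib.vxlan import VXLAN

inner = (Ether(src="52:54:00:aa:00:01", dst="52:54:00:aa:00:02") /
         IP(src="192.168.10.11", dst="192.168.10.12") /
         UDP(sport=12345, dport=80))            # the original VM-to-VM packet

outer = (Ether() /
         IP(src="10.0.0.1", dst="10.0.0.2") /   # VTEP-to-VTEP underlay addresses
         UDP(sport=49152, dport=4789) /         # 4789 = IANA-assigned VXLAN port
         VXLAN(vni=5001, flags=0x08))           # 24-bit virtual network identifier

frame = outer / inner
frame.show2()                                   # full header stack after encapsulation
print(len(bytes(frame)), "bytes on the wire")   # the overlay adds roughly 50 bytes
```

Without offload, the kernel has to compute the outer and inner checksums and cannot use the inner headers for RSS; this is exactly the stateless work that VXLAN offload pushes into the NIC.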

Overcome Compute Virtualization Penalty with SR-IOV
Single Root I/O Virtualization (SR-IOV) allows a device, such as a network adapter, to separate access to its resources among various PCIe hardware functions. This allows traffic streams to be delivered directly between the virtual machines and their associated PCIe partitions, giving applications direct access to the I/O hardware and eliminating the overhead of the software emulation layer. SR-IOV enables VMs to achieve network performance that is nearly the same as in non-virtualized environments.

Mellanox NICs support basic SR-IOV as well as advanced features such as SR-IOV High Availability (HA) and Quality of Service (QoS). SR-IOV HA provides a redundancy mechanism for VFs by using a Link Aggregation Group (LAG) to bond two VFs from two different ports on the same NIC and expose the bundle as one VF to the VM. When one VF in the bundle fails, the other VF continues forwarding traffic without affecting VM I/O operations.
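In OpenStack, an SR-IOV virtual function is requested by creating a Neutron port whose vnic_type is "direct" and attaching it to the instance. Below is a minimal sketch using the openstacksdk library; the cloud name, network, image and flavor are placeholders, and SR-IOV must already be enabled on the compute hosts.

```python
# Sketch: request an SR-IOV virtual function for a VM by creating a Neutron
# port with vnic_type=direct. Cloud, network, image and flavor names are
# placeholders for this example.
import openstack

conn = openstack.connect(cloud="mycloud")          # entry in clouds.yaml (assumed)

net = conn.network.find_network("tenant-net")      # assumed existing network
port = conn.network.create_port(
    name="vm1-sriov-port",
    network_id=net.id,
    binding_vnic_type="direct",                    # ask Neutron for an SR-IOV VF
)

image = conn.compute.find_image("rhel-7.5")        # placeholder image name
flavor = conn.compute.find_flavor("m1.large")      # placeholder flavor name
server = conn.compute.create_server(
    name="sriov-vm1",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"port": port.id}],                  # attach the VF-backed port
)
print("Requested server", server.id, "with SR-IOV port", port.id)
```

The same vnic_type=direct port is also how an RDMA-capable VF is exposed to a guest, which matters for the RoCE storage path discussed in the next section.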

Overcome Storage Virtualization Penalty with RDMA/RoCE

Mellanox provides Remote Direct Memory Access (RDMA) capabilities to improve storage performance by up to 6X as compared to conventional adapters without RDMA support.

The large overhead associated with stateful protocols such as TCP means that TCP is not an ideal transport for software-defined, scale-out storage applications, especially as storage media gets faster in the transition from hard disks to solid-state drives (SSDs) to non-volatile memory (NVM). Remote Direct Memory Access (RDMA), on the other hand, is a protocol designed for high-speed links within the data center that overcomes the inefficiencies of TCP. RDMA can run over InfiniBand (IB) or over Converged Ethernet (RoCE). RDMA's kernel bypass, read/write network semantics, and full transport offload to RDMA-capable NICs guarantee the highest possible throughput, lowest latency, and minimal CPU overhead, making it ideal for storage access.

Typically, RDMA over Converged Ethernet (RoCE) requires the network to be configured for lossless operation; however, Mellanox has recently enhanced RoCE with built-in error-recovery mechanisms. While a lossless network has never been a strict requirement, customers typically configure their networks to prevent packet loss and ensure the best performance. With this new version, RoCE can also be deployed on ordinary Ethernet networks. By utilizing RDMA or RoCE, virtual servers can achieve much higher I/O performance because the majority of packet processing is offloaded to the NIC. This further enables increased performance, improved latencies, and significantly reduced CPU overhead. The net effect is an improvement in overall server and application efficiency.

Open Composable Networks
The goal of Mellanox is to enable OpenStack deployments with a fully integrated suite of high-performance, highly programmable networking components, including switches, network adapters, optical modules and cables. The key to being open and composable is supporting open APIs and standard interfaces, as well as disaggregating hardware from software and allowing a choice of network operating systems. Mellanox embraces this open philosophy, which completely frees organizations from vendor lock-in, all the way down to the switch silicon level. Mellanox Spectrum switching silicon offers a choice of popular open network operating systems such as Cumulus, SONiC, and Mellanox Onyx, while providing best-in-class hardware that delivers zero packet loss, fair traffic distribution and more predictable application performance compared to the merchant silicon in other switch offerings. Mellanox NEO, networking orchestration and management software, is a powerful platform for management, monitoring, and visualization of scale-out networks.

Figure 2. Mellanox's Comprehensive Cloud Partner Ecosystem

OpenFlow On Spectrum Switches
Mellanox Spectrum switches are fully open, without locking of cables or features. All features are available, including IP unnumbered BGP for the underlay and VTEP for the overlay, with controller integration or EVPN. Spectrum is Mellanox's 10/25/40/50 and 100Gb/s Ethernet switch, optimized for SDN to enable flexible and efficient data center fabrics with leading port density, low latency, zero packet loss, and non-blocking traffic flow.

From the ground up, starting at the switch silicon level, Spectrum is designed with flexible processing capacity so that it can accommodate a programmable OpenFlow pipeline that enables packets to be sent to subsequent tables for further processing and allows metadata to be communicated between OpenFlow tables. In addition, Spectrum is an OpenFlow-hybrid switch that supports both OpenFlow operation and normal Ethernet switching simultaneously. Users can configure OpenFlow at the port level, assigning some Spectrum ports to perform OpenFlow-based packet processing and others to perform normal Ethernet switching. Spectrum can even mix and match the two modes on the same switch port by using a classification mechanism to direct traffic to either the OpenFlow pipeline or traditional Ethernet processing.
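As an illustration of what a multi-table pipeline with metadata looks like from a controller's point of view, here is a minimal application for Ryu, an open-source OpenFlow 1.3 controller used here only as an example (any OF1.3 controller would do). Table IDs and the metadata value are arbitrary; the table sizes and match fields actually supported by a given Spectrum configuration would come from the switch documentation.

```python
# Minimal Ryu app sketch: program a two-table OpenFlow 1.3 pipeline in which
# table 0 writes metadata and hands packets to table 1. Run with:
#   ryu-manager two_table_pipeline.py
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class TwoTablePipeline(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp = dp.ofproto
        parser = dp.ofproto_parser

        # Table 0: match everything, attach metadata, continue in table 1.
        inst0 = [
            parser.OFPInstructionWriteMetadata(metadata=0x5001, metadata_mask=0xFFFF),
            parser.OFPInstructionGotoTable(1),
        ]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, table_id=0, priority=0,
                                      match=parser.OFPMatch(), instructions=inst0))

        # Table 1: act on the metadata written by table 0 (flood as a placeholder).
        actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
        inst1 = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        match1 = parser.OFPMatch(metadata=(0x5001, 0xFFFF))
        dp.send_msg(parser.OFPFlowMod(datapath=dp, table_id=1, priority=10,
                                      match=match1, instructions=inst1))
```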

Figure 3. Mellanox's OpenStack VTEP Integration for Virtualized and Bare-Metal Clouds

VTEP Support In Spectrum Switches
A VXLAN Tunnel Endpoint (VTEP) is needed to connect bare-metal servers to the virtual networks in OpenStack. This is best handled by the switch hardware rather than by the hypervisor, where it can create a bottleneck in the OVS. A VTEP can be implemented on Spectrum to offload overlay network technologies such as VXLAN, NVGRE or GENEVE. Hardware VTEPs, like those deployed on Spectrum, achieve higher performance but add complexity on a ToR switch, since the switch must be VM-aware and maintain a large forwarding table for VM MAC address or VLAN-to-VXLAN translations. For this reason, a high-performance switch like Spectrum is required. Mellanox Spectrum supports VTEP gateway functionality, which makes it ideal to be deployed as:
- A Layer 2 VTEP gateway between virtualized networks using VXLAN and non-virtualized networks using VLAN, in the same data center or between data centers.
- A Layer 2 VTEP gateway that provides a high-performance connection to virtualized servers across Layer 3 networks and enables Layer 2 features such as VM live migration (vMotion). On virtualized server hosts where the NIC does not have VTEP capability, or where a software VTEP cannot meet the network I/O performance requirement, the VTEP can be implemented on the Mellanox Spectrum ToR.
- In some cases, the application running in the VM may want to use advanced networking features such as Remote Direct Memory Access (RDMA) for inter-VM communication or access to storage. RDMA needs to run in SR-IOV mode on virtualized servers, and when a Mellanox NIC is not present, the VTEP is best implemented in the ToR. Mellanox Spectrum is the ideal switch for building an Ethernet storage fabric: it leverages the speed, flexibility, and cost efficiencies of Ethernet with the best switching hardware and software, packaged in ideal form factors, to provide performance, scalability, intelligence, high availability, and simplified management for storage.
- A Layer 3 VTEP gateway that provides VXLAN routing capability for traffic between different VXLAN virtual networks, or for north-south traffic between a VXLAN network and a VPN network or the Internet. This feature is supported in Spectrum hardware with the Cumulus network operating system.

Overlay SDN can be deployed to achieve network virtualization and automation without requiring upgrades of physical networking equipment, more specifically, of network devices that are not VTEPs. Beyond the virtualized environment with VXLAN/NVGRE/GENEVE, there are often bare-metal servers (BMS) or legacy networks that can only use VLAN, or north-south traffic that goes out to a VPN network or the Internet. In those cases, using a software VTEP gateway adds an extra hop and can create a performance bottleneck. Best practice is to use the ToR that the BMS is connected to as a hardware VTEP, which achieves line-rate performance while saving costs. Mellanox NEO, an Ethernet fabric monitoring and provisioning tool for Mellanox NICs and switches, is pre-integrated with OpenStack Horizon and supports VTEP provisioning of ConnectX NICs through Neutron ML2 mechanism drivers and of Spectrum switches through Ironic conductors.

Marrying Bare-Metal and Virtualized OpenStack Clouds
Servers equipped with Mellanox ConnectX-4/ConnectX-5 adapters provide stateless VXLAN offloads with a simple driver configuration. Further, ConnectX-5 is the best network adapter to provision a high-performance software VTEP per server for inter-VM or VM-to-public-network communication. Mellanox NICs are integrated with popular commercial SDN controllers such as the Nuage Virtualized Services Platform (VSP), and with open-source controllers such as OpenDaylight. Thus, with ConnectX adapters, customers can easily build an efficient and non-blocking virtualized OpenStack cloud with ASAP2 OVS offload technology.

Mellanox Spectrum is an ideal switch to terminate VXLANs for bare-metal servers in a multi-tenant cloud. With SDN overlay controller solutions such as Nuage VSP, OpenContrail and VMware NSX, VXLANs can be terminated on the ToR for bare-metal servers and traditional VLAN segments. In environments where an SDN controller isn't needed, Spectrum switches can support a controller-less VXLAN overlay network using the standard BGP EVPN protocol. Such a solution eliminates the cost of controller licenses and is interoperable with other switches that support standards-based BGP EVPN VXLANs.

With a common SDN control layer across ConnectX software VTEPs and Spectrum hardware VTEPs, customers can easily unify the deployment and operations of bare-metal and virtualized OpenStack clouds. Further, Mellanox's integration with SDN ecosystem partners enables a turnkey OpenStack cloud solution that achieves a lower total cost of ownership.
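To sketch what "marrying" the two worlds looks like at the OpenStack API level, the snippet below uses openstacksdk to define the two Neutron provider networks that a hardware VTEP on the ToR would stitch together: a VXLAN network for virtualized workloads and a VLAN network for bare-metal servers. The names, segment IDs, CIDRs and physical network label are placeholders, admin credentials and matching ML2 configuration are assumed, and the actual VNI-to-VLAN stitching on the Spectrum switch would be provisioned by the SDN controller, EVPN configuration, or NEO, which is outside this sketch.

```python
# Sketch: define the Neutron provider networks bridged by a hardware VTEP,
# a VXLAN network for VMs and a VLAN network for bare-metal servers.
# All names, IDs and CIDRs are placeholders.
import openstack

conn = openstack.connect(cloud="mycloud")          # entry in clouds.yaml (assumed)

vxlan_net = conn.network.create_network(
    name="overlay-vms",
    provider_network_type="vxlan",
    provider_segmentation_id=5001,                 # VNI carried in the VXLAN header
)
conn.network.create_subnet(
    network_id=vxlan_net.id, ip_version=4, cidr="192.168.10.0/24")

vlan_net = conn.network.create_network(
    name="baremetal-vlan",
    provider_network_type="vlan",
    provider_physical_network="physnet1",          # assumed physnet mapped to the ToR
    provider_segmentation_id=101,                  # VLAN ID used by bare-metal ports
)
conn.network.create_subnet(
    network_id=vlan_net.id, ip_version=4, cidr="192.168.20.0/24")

print("VXLAN VNI", vxlan_net.provider_segmentation_id,
      "<-> VLAN", vlan_net.provider_segmentation_id,
      "bridged by the Spectrum L2 VTEP gateway")
```

The mapping between the VNI and the VLAN in this sketch is exactly what the Spectrum VTEP, whether controller-driven or EVPN-based, enforces in hardware.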

Conclusion
As an industry leader in high-performance networking technologies, Mellanox understands the risks and rewards of transforming a data center. As IT organizations transition to cloud-based, service-centric infrastructures, gaining network and server efficiency is paramount to moving beyond 10Gb server I/O. By combining key technologies in the adapter and the switch, Mellanox accelerates virtual and bare-metal networks and reduces CPU utilization through hardware-based offloads, delivering increased scalability, greater flexibility and the highest efficiency in modern software-defined data centers, and the highest return on investment to its customers.

Learn more about Mellanox and OpenStack:
- Mellanox OpenStack Reference: /mellanox-openstacksolution.pdf
- Mellanox Red Hat OpenStack Reference: .pd
- Ceph White Paper: ers/WP Deploying Ceph over High Performance Networks.pdf
- Mellanox Scale-Out Open Ethernet Products: http://www.mellanox.com/page/ethernet switch overview

350 Oakmead Parkway, Suite 100, Sunnyvale, CA 94085
Tel: 408-970-3400  Fax: 408-970-3403
www.mellanox.com

Copyright 2018. Mellanox Technologies. All rights reserved. Mellanox, the Mellanox logo, and ConnectX are registered trademarks of Mellanox Technologies, Ltd. Mellanox NEO is a trademark of Mellanox Technologies, Ltd. All other trademarks are property of their respective owners.

v8.04.17