White Paper - Intel Builders

Transcription

White Paper

Unlocking Edge Performance with the Capgemini Engineering ENSCONCE/Intel Smart Edge Open Platform

In this white paper we document the increased performance that results when the Intel Smart Edge Open toolkit is integrated into the Capgemini ENSCONCE multi-edge compute platform. Benchmarking data shows increased throughput and reduced latency and jitter in typical edge computing use cases when utilizing Intel Smart Edge Open optimizations that leverage CPU pinning and I/O acceleration features found within Intel Xeon Scalable processors.

Authors
Nilanjan Samajdar, Capgemini Engineering
Anurag Ranjan, Intel

Table of Contents
Introduction
Capgemini ENSCONCE Platform in Emerging Use Cases
ENSCONCE Platform Overview
Intel Smart Edge Open
Performance Optimization Case Study
  Key Metrics for Observation
  System under Test
  Network Configurations
  Test Applications
Test Scenario
  Base Configurations
  Optimizations on Base Configuration
Benchmarking Observations and Analysis
  Throughput
  Round Trip Time
Conclusions

Introduction
The adoption of new technologies such as machine learning and artificial intelligence is driving advances across industries and transforming business models. These technologies are also driving changes in network architecture. Visual computing and machine learning require large data sets to be processed quickly, which has typically been addressed by cloud computing. Robotics and automation, however, require low-latency communications and processing to enable real-time decision making, which limits the effectiveness of cloud computing as a solution. The emergence of Multi-access Edge Computing (MEC), which brings the benefits of cloud computing to the edge of the network, delivers lower latency and reduced bandwidth demands while retaining the flexibility and scalability of a cloud architecture.

The Capgemini ENSCONCE platform was developed to deliver a cloud-native MEC platform that interfaces with multiple access networks. Capgemini and Intel have collaborated on a new version of the ENSCONCE platform that integrates the Intel Smart Edge Open toolkit. The combination allows service providers to rapidly develop and deploy edge services that increase computing and I/O performance while also reducing latency to support emerging solutions like automation and machine learning.

This white paper reviews the architecture of the new Capgemini ENSCONCE platform with the Intel Smart Edge Open toolkit. It presents a sample workload requiring high throughput, low latency, and low jitter, while running other computing workloads. The study uses Intel Smart Edge Open building blocks to apply workload pinning to specific cores and to apply SR-IOV to the networking stack.
The results show the Capgemini ENSCONCE platform with Intel Smart Edge Open delivers significant improvements in throughput and latency even under increased CPU compute loads.

Capgemini ENSCONCE Platform in Emerging Use Cases
The ENSCONCE platform addresses a variety of use cases that require reliable, low-latency communication between an end device and an edge application, and high compute density at the edge to reduce communication to a central cloud. These use cases are found in a variety of domains, including manufacturing, government, and transportation, among others. Robotics, automation, machine learning, and visual computing require large amounts of data to be processed quickly to allow real-time solutions. The rollout of 5G, with its higher bandwidth and lower latency than previous wireless standards, is increasing the need for edge computing solutions that can take advantage of the speed and latency benefits of 5G.

The ENSCONCE platform can be located either at the network edge, such as at a wireless base station, or on-premises at a customer location. By locating computing capabilities at the edge of the network, the ENSCONCE platform avoids the long latencies and high bandwidth costs of sending data to a faraway datacenter for processing. By leveraging Kubernetes containerization, applications can be easily deployed and quickly scaled to meet demand.

The ENSCONCE platform is also well suited to serve Vehicle-to-Everything (V2X) and other use cases where devices change location, often at the high speeds associated with transportation. To provide consistent quality of service, the ENSCONCE platform monitors the end user location and the network load, and migrates the edge application to edge platforms at the closest location as necessary.

ENSCONCE Platform Overview
The ENSCONCE platform consists of two key functions:

ENSCONCE-Central provides orchestration functions both for applications running locally and for applications on remote edge platforms. ENSCONCE-Central utilizes control plane components from Intel Smart Edge Open to enable resource discovery and application scheduling. This allows ENSCONCE to run applications where it is most efficient.

ENSCONCE-Edge provides local platform and application management, including application scheduling and hardware accelerator management.

The ENSCONCE platform can run on bare-metal servers as well as virtual machines, depending on customer preference and design decisions. The ENSCONCE platform supports standard services, such as:
• Firewall, NAT, DNS
• Traffic management and load balancing
• Kubernetes and Docker for standardized application containerization and orchestration
• Host management services including logging, application tracing, and inventory
• Identity and Access Management
• Prometheus monitoring
• Logging framework

A high-level visualization of the ENSCONCE platform architecture is shown in Figure 1.

Figure 1. ENSCONCE platform architecture

Intel Smart Edge Open
Intel Smart Edge Open is a royalty-free edge computing software toolkit that enables highly optimized and performant edge platforms to on-board and manage applications and network functions with cloud-like agility across any type of network. It is built on standardized APIs and open source software tools. Intel Smart Edge Open enables developers using the ENSCONCE platform to:
• easily migrate applications from the cloud to the edge by abstracting network complexity;
• securely on-board and manage applications with an intuitive web-based GUI;
• leverage standards-based building blocks for functions such as access termination, traffic steering, multi-tenancy, authentication, telemetry, and appliance discovery and control.

The ENSCONCE platform leverages performance-enhancing capabilities available in Intel Smart Edge Open, including:

Single Root I/O Virtualization (SR-IOV) for Accelerated Networking
SR-IOV allows a single physical PCI resource, such as an Ethernet controller, to be virtualized and allocated to specific application and network functions. This allows traffic to bypass the software switch layer of the hypervisor and be delivered directly to the appropriate virtual environment, reducing software overhead and delivering network performance comparable to non-virtualized environments.

CPU Manager for Kubernetes (CMK) and Huge Pages
CMK allows Kubernetes containers to be pinned to specific processor cores on Intel Xeon Scalable processors, which improves performance by keeping recently used data in the on-chip cache and reducing cache misses that degrade performance. Huge Pages improves memory allocation and management in virtualized applications, resulting in better performance for memory-intensive edge applications.
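To make the core-pinning and huge-page mechanisms more concrete, the following is a minimal, hypothetical Python sketch of the underlying Linux facilities that CMK and the Kubernetes huge-pages feature automate at container granularity. It is not part of CMK or the ENSCONCE platform; the core IDs and file paths are illustrative assumptions.

    # Illustrative sketch only: the OS-level mechanisms (CPU affinity, huge pages)
    # that CMK and Kubernetes automate per container. Core IDs are hypothetical;
    # Linux only.
    import os

    PINNED_CORES = {2, 3}  # hypothetical cores reserved for the packet-processing workload

    def pin_current_process(cores):
        """Restrict the calling process to the given CPU cores."""
        os.sched_setaffinity(0, cores)   # 0 = the calling process
        return os.sched_getaffinity(0)   # confirm the resulting affinity mask

    def hugepages_provisioned():
        """Return the number of default-size huge pages configured on the host."""
        with open("/proc/sys/vm/nr_hugepages") as f:
            return int(f.read())

    if __name__ == "__main__":
        print("pinned to cores:", pin_current_process(PINNED_CORES))
        print("huge pages provisioned:", hugepages_provisioned())

In the benchmarks described later, the equivalent pinning is requested declaratively through CMK rather than by the application itself.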

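Similarly, the SR-IOV capability described above is ultimately exposed by the Linux kernel through sysfs. The sketch below, assuming a hypothetical physical function named ens1f0, root privileges, and an SR-IOV capable NIC, shows how virtual functions could be queried and provisioned by hand; in the ENSCONCE deployment this is handled by the Intel Smart Edge Open SR-IOV device plugin and CNI rather than manually.

    # Illustrative sketch only: query and provision SR-IOV virtual functions (VFs)
    # through sysfs. The interface name is a hypothetical assumption; requires root
    # and an SR-IOV capable NIC. The Smart Edge Open SR-IOV plugin/CNI automates this.
    from pathlib import Path

    PF_INTERFACE = "ens1f0"  # hypothetical physical function (e.g., one X710 port)
    DEVICE_DIR = Path("/sys/class/net") / PF_INTERFACE / "device"

    def max_vfs():
        """Maximum number of VFs the device supports."""
        return int((DEVICE_DIR / "sriov_totalvfs").read_text())

    def provision_vfs(count):
        """Reset to zero, then create `count` VFs on the physical function."""
        numvfs = DEVICE_DIR / "sriov_numvfs"
        numvfs.write_text("0")
        numvfs.write_text(str(count))

    if __name__ == "__main__":
        wanted = min(4, max_vfs())
        provision_vfs(wanted)
        print(f"provisioned {wanted} VFs on {PF_INTERFACE}")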
Open Virtual Switch (OVS) for Accelerated Networking
Intel Smart Edge Open uses the Data Plane Development Kit (DPDK), which provides a set of libraries and drivers for offloading packet processing from the kernel. By leveraging hardware features in Intel Xeon Scalable processors, DPDK results in higher computing efficiency and higher packet throughput when running OVS.

Hardware Accelerators
Intel Smart Edge Open includes plug-ins to support the Intel FPGA Programmable Acceleration Card N3000 (Intel FPGA PAC), the High Density Deep Learning (HDDL) accelerator, and the Visual Cloud Accelerator Card for Analytics (VCAC-A). These plug-ins allow a Kube-native environment in which Kubernetes manages the hardware accelerators along with other hardware resources.

OpenVINO Toolkit
Intel Smart Edge Open enables tight integration between hardware accelerators and the OpenVINO toolkit, allowing high-performance visual analytics and natural language processing on the ENSCONCE platform.

Mobile Network Integration
Intel Smart Edge Open provides reference implementations for certain 5G core network functions. The Network Exposure Function (NEF) and Application Function (AF) help steer traffic from an end user to an edge application. Intel Smart Edge Open provides a Core Network Configuration Agent (CNCA) function to interact with the NEF and AF. Capgemini Engineering works with mobile network operators to integrate these reference implementations into their core network.

Performance Optimization Case Study
Edge applications can vary from use case to use case; however, they can be represented by a simplified model: transfer data, process data, and return a response or forward the data to a different service. The crucial factors of this use case are:
1. The delay of the network path between the device and the edge platform.
2. The delay caused by traffic routing within the edge platform.
3. The delay caused by processing the data in the edge application.
4. The delay caused by other system loads in the edge platform, when multiple applications compete for hardware resources. Because an edge platform must provide high compute density to be economical, a heavy system load is to be expected during normal operation. Applications that are sensitive to latency and jitter will be negatively impacted during heavy system loading without appropriate solutions.

Factor 1 is determined by the access network architecture and will not be covered in this analysis. Factors 2-4 are influenced by the Capgemini ENSCONCE platform architecture, and their performance impact will be analyzed.

Key Metrics for Observation
Round Trip Time (RTT)
RTT (in milliseconds) refers to the average time for a message from the client device to travel to the edge platform and back. The test uses a simple echo application so that data processing time is negligible. A minimal sketch of such a measurement is shown below.

Throughput
Throughput (in Mbps) refers to the rate at which data passes through the system from client to server. Because data processing time is negligible in this test, throughput measures the end-to-end performance of the system.

System under Test
We evaluate the performance of the ENSCONCE platform using an edge cluster deployment as described below and depicted in Figure 2 and Table 1.

Figure 2. Reference deployment of ENSCONCE Platform as the System Under Test
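As referenced under Key Metrics for Observation, the sketch below is a minimal, hypothetical illustration of how RTT can be measured against a simple TCP echo application, so that data processing time stays negligible. The address, port, and payload size are assumptions; the paper's benchmarks used iPerf and a packet generator as described in the following sections.

    # Illustrative sketch only: measure round-trip time (RTT) for a TCP echo
    # exchange between a client and an edge echo application. Address, port, and
    # payload size are hypothetical; the actual benchmarks used iPerf and a
    # packet generator.
    import socket
    import time

    ECHO_HOST = "192.0.2.10"  # hypothetical edge platform address
    ECHO_PORT = 9000          # hypothetical echo application port
    PAYLOAD = b"x" * 1024     # 1 KiB message

    def measure_rtt(samples=100):
        """Return (min, mean, max) RTT in milliseconds over `samples` echo exchanges."""
        rtts_ms = []
        with socket.create_connection((ECHO_HOST, ECHO_PORT)) as sock:
            for _ in range(samples):
                start = time.perf_counter()
                sock.sendall(PAYLOAD)
                received = b""
                while len(received) < len(PAYLOAD):   # read until the full echo returns
                    received += sock.recv(4096)
                rtts_ms.append((time.perf_counter() - start) * 1000.0)
        return min(rtts_ms), sum(rtts_ms) / len(rtts_ms), max(rtts_ms)

    if __name__ == "__main__":
        lo, mean, hi = measure_rtt()
        print(f"RTT (ms): min={lo:.2f} mean={mean:.2f} max={hi:.2f}")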

Hardware
  Platform: 2nd Gen Intel Xeon Scalable processor family
  Processors: 2 x Intel Xeon Gold 6230 (24 cores, 48 threads, 2.1 GHz per CPU, microcode 0x5003003)
  Memory: 192 GB
  Networking: 2 x Intel Ethernet Network Adapter X710
  Local Storage: 1 x 1 TB SATA SSD or equivalent boot drive

Software
  Host OS: Ubuntu 18.04 (kernel 4.15.0)
  OpenStack (Rocky)
  Docker 18.06.3-ce
  Kubernetes v1.18
  ENSCONCE Platform 5.6.2.0
  OpenNESS 20.06
  Intel CPU Manager for Kubernetes v1.4.1
  DPDK 19.11
  OVS-DPDK v2.13.0 (1024 MB RAM)
  SR-IOV CNI v2.0.0
  SR-IOV plugin v3.1
  Multus CNI v3.3
  Userspace CNI v1.2
  HostDevice CNI 0.8.6
  KubeOVN CNI 1.2.1

Table 1. Reference Hardware and Software Configuration for ENSCONCE Platform System Under Test

The ENSCONCE Platform is deployed on an OpenStack VM running Ubuntu, configured with:
• 8 virtual cores, with cores 0-4 reserved for CPU pinning
• 16 GB DRAM
• ENSCONCE Platform 5.6, incorporating OpenNESS 20.06
• Ubuntu VMs on servers running Intel Xeon Gold processors

The edge client connects to the edge application running on the ENSCONCE Platform through a connected edge router.

Network Configurations
Delays caused by packet routing within the edge platform (Factor 2) can be improved via multiple approaches. The following network variations were tested in this analysis:
• OVS in kernel mode
• OVS in DPDK mode
• SR-IOV

Test Applications
iPerf server and client applications, an L3 forwarder, and a packet generator were used to measure performance. These test applications were provided the following resources:
• 1 vCPU core

Tests were conducted over a span of five minutes, with multiple iterations of iPerf performing TCP upload and download tests with 128 kB and larger packet sizes.

Delays caused by other applications on the platform competing for resources (Factor 4) can be tested by pinning the test application to sequestered cores and by varying the number and compute intensity of "noisy neighbor" applications. Noisy neighbor workloads are generated via the Linux stress tool, which loads all available cores by running multiple threads performing sqrt() and malloc() operations. A rough approximation of this workload is sketched below.

Delays caused by application processing (Factor 3) vary widely depending on the application. In this analysis, we used a simple application that echoes the input to the output.

Test Scenario
We evaluated platform performance with traffic representing a client connecting to an edge application. This scenario represents common dataflows occurring on an edge platform. The following configurations were tested to measure the impact of delays caused by packet processing within the platform and delays caused by noisy neighbor applications:
1. Kernel mode OVS Layer-2 switch as the edge networking layer
2. Core pinning via Intel CPU Manager for Kubernetes
3. Kernel mode OVS with DPDK/OVS, using the Kube-OVN plugin
4. Kernel mode OVS with SR-IOV

Base Configurations
"Quiet" consists of an ENSCONCE Platform configured as a vanilla OpenStack VM based edge deployment, with the VMs connected through kernel mode OVS bridging. iPerf client traffic is sent from an external client to an iPerf server edge application.

"Noisy" consists of the "Quiet" configuration with additional workloads competing for CPU, memory, and network resources in a noisy neighbor scenario.

Optimizations on Base Configuration
"Noisy" Core Pinning modifies "Noisy" by adding exclusive cores to the edge application using the CMK scheduler built into Intel Smart Edge Open.
This scenario is implemented with extra CPU workloads generating a noisy neighbor scenario.

"Noisy" SR-IOV & Core Pinning augments Core Pinning by adding SR-IOV Virtual Functions (VFs) to the edge application using the Intel Smart Edge Open SR-IOV plugin and CNI. This scenario is also implemented with extra CPU workloads generating a noisy neighbor scenario.

These different configurations were benchmarked with the iPerf traffic test, and relative performance was measured.
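As referenced in the noisy neighbor description above, the sketch below is a rough, hypothetical approximation of that load: it keeps every available core busy with sqrt() loops and repeated memory allocations, similar in spirit to the Linux stress tool. It is not the exact workload used in these benchmarks, and the allocation size is an arbitrary assumption.

    # Illustrative sketch only: a rough approximation of the noisy-neighbor load,
    # keeping all cores busy with sqrt() and memory allocations, similar in spirit
    # to the Linux `stress` tool. Not the exact workload used in the benchmarks.
    import math
    import multiprocessing as mp

    ALLOC_BYTES = 64 * 1024 * 1024  # hypothetical 64 MiB allocation per iteration

    def burn():
        """Alternate CPU pressure (sqrt loop) with memory allocation pressure."""
        x = 0.0
        while True:
            for i in range(100_000):
                x += math.sqrt(i)            # CPU pressure
            buf = bytearray(ALLOC_BYTES)     # allocation pressure
            buf[::4096] = b"\x01" * len(buf[::4096])  # touch pages so they are committed
            del buf

    if __name__ == "__main__":
        workers = [mp.Process(target=burn, daemon=True) for _ in range(mp.cpu_count())]
        for w in workers:
            w.start()
        for w in workers:
            w.join()  # runs until interrupted (Ctrl-C)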

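Before looking at the results, the sketch below shows one hedged way the relative-performance numbers in the next section could be produced: run an iPerf 3 TCP test in JSON mode and express the measured throughput as a percentage of the "Quiet" baseline. The server address, test duration, and baseline value are illustrative assumptions, not the harness actually used for these benchmarks.

    # Illustrative sketch only: run an iperf3 TCP test in JSON mode and normalize
    # the measured throughput against a "Quiet" baseline, mirroring how the results
    # below are expressed. Server address, duration, and baseline are hypothetical.
    import json
    import subprocess

    IPERF_SERVER = "192.0.2.10"    # hypothetical iPerf server (edge application)
    DURATION_S = 300               # five-minute run, matching the test methodology
    QUIET_BASELINE_MBPS = 9400.0   # hypothetical throughput of the "Quiet" run

    def run_iperf_mbps():
        """Run an iperf3 client test and return the received throughput in Mbps."""
        out = subprocess.run(
            ["iperf3", "-c", IPERF_SERVER, "-t", str(DURATION_S), "-J"],
            capture_output=True, text=True, check=True,
        ).stdout
        result = json.loads(out)
        return result["end"]["sum_received"]["bits_per_second"] / 1e6

    if __name__ == "__main__":
        mbps = run_iperf_mbps()
        pct = 100.0 * mbps / QUIET_BASELINE_MBPS
        print(f"throughput: {mbps:.0f} Mbps ({pct:.0f}% of the Quiet baseline)")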
Test Scenario: Test Configuration
"Quiet": KubeOVN with kernel mode OVS; without noisy neighbor
"Noisy": KubeOVN with kernel mode OVS; with noisy neighbor
"Noisy" Core Pinning: KubeOVN with kernel mode OVS; with CPU pinning; with noisy neighbor
"Noisy" SR-IOV & Core Pinning: SR-IOV based interface; with CPU pinning; with noisy neighbor

Table 2. Test Scenarios

Benchmarking Observations and Analysis

Throughput
Figure 3. Impact of Intel optimizations on Throughput

The "Quiet" configuration, with no competing applications running, is normalized to 100%.

When competing applications are introduced, "Noisy" throughput drops to about 60% of baseline as competition for resources limits packet processing performance.

When core pinning is introduced via Intel CMK, throughput increases to 68% of the baseline. This is due to efficiencies gained by allowing packet processing data to remain in the cache of the pinned core, rather than being flushed when other applications try to use the same core.

When SR-IOV and core pinning are used together, throughput increases to 118% of the baseline. By combining core pinning with the I/O virtualization options, packet processing efficiency in the presence of noisy neighbors is even better than that of an unoptimized system without any other applications competing for resources.

Figure 4. Throughput improvement as a percentage value of baseline

Further testing revealed that this test's throughput was limited by the 10 Gb network interface. When running the SR-IOV and core pinning scenario both with and without noisy neighbors, throughput was the same, indicating that with the Intel optimizations packet processing throughput is not impacted by other applications on the platform.

Round Trip Time
In Figure 5, Round Trip Time (RTT) for "Quiet" is normalized to 100%.

When competing applications are added, "Noisy" RTT increases to 160% of baseline as packet processing is negatively impacted by competition for resources.

Core pinning reduces the impact on RTT somewhat, reducing RTT to 120% of baseline.

Adding SR-IOV combined with core pinning reduces RTT to 98-99% of "Quiet" despite other applications competing for platform resources.

Figure 5. Packet RTT (Round Trip Time) as a percentage value of baseline

Not only does the combination of core pinning and SR-IOV reduce average RTT, it also reduces RTT variability, as shown in Figure 6.

Figure 6. Minimum, mean, and maximum RTT across configurations

Conclusions
Intel Smart Edge Open performance optimizations, including core pinning and SR-IOV, significantly boost performance for edge applications, especially when the edge platform is heavily loaded. By integrating these optimizations, the Capgemini ENSCONCE platform delivers more deterministic behavior, with better throughput and lower RTT, to help edge applications meet expected QoS levels even when heavily loaded with other applications. This combination enhances the ability to deploy and manage applications in an edge network, and to optimize the performance of those applications for emerging technology solutions.

Notices & Disclaimers
Performance varies by use, configuration, and other factors. Learn more at www.Intel.com/PerformanceIndex.
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.
Your costs and results may vary.
Intel technologies may require enabled hardware, software, or service activation.
Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
