Migrating To A Cloud-Native Architecture - Accenture


Shifting your appliance to a cloud native architecture
By Aneel Kumar, Badri Narayan RD and Ram Ramalingam

Contents

- Introduction
- What does it mean to be cloud native?
- The cloud native landscape
- Why use a Kubernetes platform architecture?
- What is the new way of service development and deployment in a cloud native world?
- What are the distinguished complexities that need to be considered when designing a microservices or platform architecture?
- Why is there a need for a service mesh?
- What is the recommended gateway for microservices?
- What are recommendations for CI/CD, DevSecOps and NFR in pipelines?
- What are other best practices to follow?
- Conclusions

GREAT MIGRATION

Introduction

Monolithic legacy applications can be cumbersome and face issues that more nimble, modern cloud native platforms do not. Cost is a challenge, as hardware and resources must be pre-provisioned. A bigger issue, however, is that the individual components and services of a monolithic application cannot be scaled and deployed separately. Many legacy applications were designed using an integrated technology stack such as Java or .NET. These stacks include a user interface (UI), backend, middleware and database, all integrated as one deployable component within the application server. Many were designed for traditional hardware-based devices or appliances. Hardware appliances (e.g., those providing networking features such as network traffic management, security and load balancing) and application stacks delivered as appliances do not allow application components to be scaled independently. Therefore, they cannot take advantage of a major benefit of the cloud: scaling based on need.

With enterprise boundaries blurring every day, a platform ecosystem is becoming imperative, which is driving a growing need to overcome the hurdles above and migrate legacy apps and appliances to a cloud native architecture. By leveraging modern cloud native platforms, enterprises can indeed make elephants (aka legacy apps) fly. Here we discuss several important aspects of the platform and the migration.

What does it mean to be cloud native?

To be cloud native, applications need to have the following architecture elements:

Microservices
Service-oriented architecture has evolved into a more loosely coupled microservices architecture. Modern architecture is microservices-oriented and based on the twelve-factor app principles. Microservices enable greater agility and speed, experimentation, innovation and the ability to pick the right tool for a service. Support for polyglot services is another new-age requirement.

Container packaged
Containers are required for velocity, portability, reliability, efficiency, self-service and isolation. Because containers are standalone, each microservice can be deployed in isolation from the others. This leads to higher levels of compute resource isolation, portability and scalability.

Dynamically managed
Orchestration platforms such as Kubernetes support dynamic scheduling and management of all the containers and corresponding resources deployed to the underlying infrastructure platform.

Cloud agnostic
Typically, product companies deliver their product on all forms of cloud infrastructure, including on-premises (vSphere). Being cloud native makes the product available in different marketplaces (AWS, Azure, GCP etc.) and ensures customers are not locked into any particular vendor.
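One of the twelve-factor app principles mentioned above is to keep configuration in the environment rather than in the build, so the same container image can run unchanged in any cloud. The following is a minimal sketch of that idea only. The variable names (DATABASE_URL, PORT, LOG_LEVEL) and the ServiceConfig type are illustrative assumptions, not names taken from the paper.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceConfig:
    """Settings a hypothetical microservice needs at startup."""
    database_url: str
    listen_port: int
    log_level: str


def load_config(env: Optional[Mapping[str, str]] = None) -> ServiceConfig:
    """Build the config from environment variables (names are assumptions).

    DATABASE_URL is required; PORT and LOG_LEVEL fall back to defaults.
    """
    env = os.environ if env is None else env
    return ServiceConfig(
        database_url=env["DATABASE_URL"],  # raises KeyError if missing
        listen_port=int(env.get("PORT", "8080")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Because the function accepts any mapping, the same code can be exercised in a test with a plain dictionary and in a container with the real process environment.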

The cloud native landscape

To support new-age apps and services, cloud is a natural choice, whether private or public. With regard to selecting platforms and tools (building blocks), the cloud native landscape is very large.

[Figure: Various Cloud Building Blocks, spanning the IaaS, PaaS and SaaS layers: Enterprise Applications, IoT & VR/AR Services, Artificial Intelligence Services, Analytics and Big Data, Mobile Services, App Services, Development Services, Management Automation Services, Security Identity and Access Services, Enterprise Integration, Database Services, Compute Services, Persistency & Storage, Networking.]

Why use a Kubernetes platform architecture?

In general, a platform should work seamlessly in a legacy datacenter, private cloud or public cloud, but in a native way that takes advantage of cloud features such as autoscaling, resource optimization, managed services and serverless functions.

A platform should use the same code base and microservices everywhere. Although several options can be picked from the above landscape to design a cloud native architecture, Kubernetes stands out as the de facto platform for a multi-cloud / cloud neutral architecture.

A declarative platform: Legacy platforms with web servers are imperative platforms, as actions need to be defined and then adjusted or reprogrammed when something changes. Declarative platforms are state driven. Given a desired state, the platform continually tries to adjust until it achieves and maintains that state. This is similar to a thermostat: once it is set to a certain temperature (the desired state), it tries to keep the room at that temperature all the time. Kubernetes is a giant declarative, or state-driven, machine.

"Kubernetes is the Linux of the cloud": This statement, made by Kelsey Hightower at KubeCon 2017, describes Kubernetes well. Kubernetes was first released in mid-2015 and was the first project to graduate from the Cloud Native Computing Foundation (CNCF). It is an open source cluster management tool to automate, deploy, manage and scale applications. It can run on bare metal offered by various cloud providers. It is a natural choice for legacy apps that need to be supported on multiple cloud providers and datacenters. Although it is primarily labelled as a container orchestration platform, it is, in general, a declarative microservices platform.
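The declarative, state-driven behaviour described above (the thermostat analogy) can be illustrated with a small control loop. This is a simplified sketch of the general idea, not Kubernetes code: ClusterState, reconcile and converge are hypothetical names, and a real controller observes the cluster through the API server rather than mutating a local object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Observed state: how many replicas are currently running."""
    running_replicas: int


def reconcile(desired_replicas: int, state: ClusterState) -> str:
    """One control-loop pass: compare desired and observed state, act on the gap."""
    if state.running_replicas < desired_replicas:
        state.running_replicas += 1  # stands in for "start one more pod"
        return "started one replica"
    if state.running_replicas > desired_replicas:
        state.running_replicas -= 1  # stands in for "stop one pod"
        return "stopped one replica"
    return "in sync"


def converge(desired_replicas: int, state: ClusterState, max_passes: int = 100) -> List[str]:
    """Repeat reconcile until the observed state matches the desired state."""
    actions = []
    for _ in range(max_passes):
        action = reconcile(desired_replicas, state)
        actions.append(action)
        if action == "in sync":
            break
    return actions
```

The caller only declares the target (for example, three replicas). If a replica later disappears, running the loop again restores it without anyone issuing an explicit "start" command, which is the difference from an imperative platform.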

Some of the main features of the Kubernetes platform include the following:

- Ability to build twelve-factor apps
- Rolling update or rollback
- Container grouping using pods
- Resource monitoring and logging
- Security
  – Authentication and authorization
  – Auth token support (static token file, service account tokens, bootstrap tokens, OpenID Connect tokens, webhook token authentication)
  – Authorization modes (ABAC, RBAC, webhook, custom modules)
  – Role binding and cluster role binding
  – Auditing and audit logs
  – Secrets management
  – Security context (at pod / container level)
  – Network policy for pod communication
  – Encryption at rest
- Alpha/beta features
- Replication controller
- Storage management
- Resource monitoring
- Health checking
- Service discovery
- ConfigMap and secret
- Networking
  – Inter-pod, intra-pod, pod to service, external to service
- Self-healing
- Rolling deployment and rollback
- Autoscaling, mainly horizontal autoscaling
- CI/CD integration, canary and blue/green deployments
- High availability and multiple zones
- Logging and distributed tracing
- DNS management
- Monitoring
- Load balancing

What is the new way of service development and deployment in a cloud native world?

In a cloud native world, declarative APIs offer many advantages as a primary means of service development and deployment. Although both declarative and standalone APIs can use the Kubernetes platform, declarative APIs are especially advantageous in greenfield environments, particularly when legacy apps are being migrated from the ground up. All resource types are readable by kubectl and can be viewed in the Kubernetes dashboard UI. Resources are also naturally scoped to the cluster, and all Kubernetes API support features are available. An Operator Framework is available from CoreOS for this kind of API development. In fact, the new way of delivering software is not as a zip file, tarball or InstallShield package but as an operator. For example, Kafka is available as an operator that can run as a service in Kubernetes.

What are the distinguished complexities that need to be considered when designing a microservices or platform architecture?

Out of the standard complexities, the highlighted ones should be given additional attention based on Accenture's experience.

- API Management
- Auto Scaling
- Automated Deployment
- Circuit Breakers*
- Distributed Logging
- Distributed Tracing
- Distributed Transactions
- Health Check
- Load Balancing
- Metrics Collection
- Monitoring
- Network
- Observability
- Security
- Service Discovery
- Service-to-Service Communication
- Testability
- Versioning

*Traditionally a circuit breaker is provided by application libraries and APIs, which are coded by the developers.

Why is there a need for a service mesh?

Next-generation microservices platforms need a service mesh architecture to manage the many complexities previously identified.

A service mesh is a dedicated infrastructure layer for handling service-to-service communication, which in the legacy world is achieved using appliances. The mesh is a layer of services across all environments that containerized applications and microservices can connect to as needed. The service mesh is responsible for the reliable delivery of requests through the complex topology of services that make up a modern, cloud native application. It decouples application networking, reliability, observability and security from the service code, and it does this in a programming-language-agnostic way.

In practice, the service mesh is typically implemented as an array of lightweight network proxies that are deployed alongside application code, without the application needing to be aware of them. A central controller orchestrates the connections. Service traffic flows directly between proxies, and the control plane is aware of the interactions. The controller delivers access control policies and collects performance metrics. The controller integrates easily with platforms like Kubernetes.

Below are some of the advantages of using a service mesh:

- A network for services, not bytes
- Resiliency and efficiency
- Timeouts
- Retries
- Circuit breakers
- Traffic control
- Health checks
- Visibility
- Security
- Policy enforcement
- Load balancing with automatic failover
- Systematic fault injection
- Adds fault tolerance to the application (no code changes required)
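To make the circuit breaker item in the list above concrete, the sketch below shows the state machine a proxy might apply on behalf of a service: after a number of consecutive failures it fails fast for a cooling-off period, then lets one trial call through. It is written in-process purely for illustration. In a service mesh this logic lives in the sidecar proxy, not in application code, and the class name, thresholds and injected clock here are assumptions rather than any mesh's actual API.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock              # injectable so tests can control time
        self.consecutive_failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cooling-off period elapsed: half-open, allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.consecutive_failures += 1
            half_open_trial_failed = self.opened_at is not None
            if half_open_trial_failed or self.consecutive_failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.consecutive_failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open protects both the caller (no threads stuck waiting on timeouts) and the struggling downstream service (no additional load while it recovers).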

Istio Service Mesh

Istio is an open-source service mesh project that was introduced in May 2017. It was formed through a partnership between Google, IBM and Lyft. Istio is one of the key building blocks of the new Knative serverless platform being built by Google, Pivotal, IBM, Red Hat and SAP. It can be deployed to any Kubernetes-based platform (on-prem and public cloud). Aspen Mesh is F5 Networks' enterprise-ready version of Istio for multicluster and multicloud environments such as AKS (Azure Kubernetes Service), EKS (Elastic Kubernetes Service), GKE (Google Kubernetes Engine) and PKS (Pivotal Container Service). The Istio service mesh architecture primarily consists of a data plane and a control plane.

Data Plane: This consists of a fleet of intelligent Envoy proxies, which are deployed as sidecars alongside each microservice. These sidecars intercept and control all the network traffic between services using iptables. The circuit breaker in Istio operates in a more black-box way (unlike the white-box approach of Hystrix in legacy apps): it is implemented in the Envoy proxy and is native to the Kubernetes ecosystem, running inside a Kubernetes cluster. The Envoy proxies are where the following capabilities are implemented in a service mesh:

- Service discovery
- Health checks
- Load balancing
- Fault injection
- Circuit breaker
- Rich metrics and tracing
- Traffic routing
- TLS termination
- HTTP/2 and gRPC
- L7 filters
- Resiliency

Control Plane: This manages and configures all the runtime components across Istio with their corresponding rules and policies. It consists of three primary components:

- Pilot: the configuration source for all the Envoy sidecars. It provides service discovery details, routing rules, resiliency configurations and authorization policies to all the Envoy proxies.
- Mixer: enforces various policies across the service mesh and collects telemetry data from the Envoy proxies and other services. Mixer includes a flexible plugin model, which allows Istio to plug in to different infrastructure backends if desired (e.g. metrics aggregation and visualization).
- Citadel: provides security features such as strong service-to-service and end-user authentication with built-in identity and credential management. It facilitates mutual Transport Layer Security (mTLS) across the entire service mesh without touching the services. Service-level authorization with support for JSON Web Tokens (JWT) and role-based access control (RBAC) is also available.

Istio out-of-the-box metrics and distributed tracing solution:

Istio comes packaged with a Prometheus backend for metrics aggregation. For metrics visualization, Istio provides Grafana with a pre-built dashboard, and Servicegraph for visualizing mesh call graphs. It supports both Jaeger and Zipkin for collecting and visualizing distributed traces. All of these are optional, and an in-house solution can be plugged in, if desired.

[Figure: A typical "Kube Istio" implementation architecture depicting "North-South" and "East-West" traffic management. Labels: Users, API Server (North), API Gateway, ETCD, Master, Control Manager, Control Plane, Istio Control Plane and Istio REST API, Istio Ingress, Traffic Rules, Caller MicroService (South), "Call me" MicroServices-1 (West), "Call me" MicroServices-2 (East), Kubernetes Clusters.]

What is the recommended gateway for microservices?

There are many API gateways available in the market, such as Apigee, WSO2, MuleSoft and Axway. However, a microservices gateway such as "Ambassador" that is native to Kubernetes has its own advantages. The service development team registers the API declaratively as part of the deployment process. It is developer-centric, with automated integration testing and gated API deployment. It allows client-driven API versioning for compatibility and stability, and it facilitates canary routing for dynamic testing.

A common question is whether Istio can be used as an API gateway, since both Istio and Ambassador use Envoy proxies. The main difference is that Istio handles "east-west" traffic, whereas Ambassador handles "north-south" traffic, i.e. traffic into a customer's data center or cloud deployment. The two are managed with different control planes.

What are recommendations for CI/CD, DevSecOps and NFR in pipelines?

Security and other non-functional requirements (NFR) tests can be integrated into the DevOps pipeline itself. For CI/CD, a GitLab or Jenkins pipeline can be used. GitLab has direct integration with Kubernetes clusters, and its runners can execute tests directly on the Kubernetes cluster. Various tests, such as unit tests, integration tests, end-to-end (E2E) tests and NFR tests, are recommended. Most functional tests can be executed in the CI part of the pipeline, and the remaining E2E scenarios and NFR tests can be executed in the CD part of the pipeline on a warm cluster using AKS, EKS, GKE or PKS.

For appliance models, some providers may like to sell their product as a virtual appliance, even though it is based on a microservices architecture and is available in the AWS or Azure marketplaces. In this case the pipeline would consist of the following elements, using a warm Kubernetes cluster and Kubernetes namespaces:

- The CI phase of the pipeline would consist of B → Cqc → Tu → Tc → Ti, where B stands for build, Cqc for code quality check, Tu for unit testing, Tc for contract testing and Ti for integration testing.
- The CD phase of the pipeline would consist of Pami → Tprov → Tpack → Te2e, where Pami stands for package AMI or Azure VM, Tprov for provisioning testing, Tpack for package testing and Te2e for end-to-end scenario testing.

If it is a "true" microservices deployment, it would have the following pipeline elements:

- The CI phase would remain B → Cqc → Tu → Tc → Ti.
- The CD phase would have blue/green deployment of microservices (or a similar deployment strategy) → Te2e → Tcanary testing on a live cluster.
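The CI phase described above (B, Cqc, Tu, Tc, Ti) is an ordered sequence of gates in which a later stage only runs if every earlier one passed. The sketch below models that ordering in plain Python. It is not GitLab or Jenkins configuration; the run_pipeline function and the placeholder step functions are hypothetical stand-ins for real build and test jobs.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure.

    Returns (overall_success, names_of_stages_that_were_executed).
    """
    executed = []
    for name, step in stages:
        executed.append(name)
        if not step():
            return False, executed  # later stages are skipped
    return True, executed


def passing_step() -> bool:
    """Placeholder for a real job such as a build or a test run."""
    return True


CI_STAGES: List[Stage] = [
    ("B", passing_step),    # build
    ("Cqc", passing_step),  # code quality check
    ("Tu", passing_step),   # unit testing
    ("Tc", passing_step),   # contract testing
    ("Ti", passing_step),   # integration testing
]
```

Ordering the cheap, fast checks first means a broken build or failing unit test is reported before the slower contract and integration suites consume cluster time.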

Non-Functional Requirements (NFR):

When a product is delivered using this framework, several NFR dimensions need to be addressed, such as:

1) Control plane performance testing of the APIs, and scalability
2) Data plane performance testing and scalability
3) Resilience and high availability (Istio can be used for resiliency testing as well)
4) Security. There are several security products and tools that need to be used with the Kube ecosystem
5) Sizing, costing and total cost of ownership
6) Longevity

Branching Strategy:

There are two main strategies adopted today, GitFlow and GitHub Flow. GitHub Flow is a lightweight process, whereas GitFlow is more complex in order to handle scenarios such as:

- Discrete named or numbered releases
- Multiple versions of the software that need to be supported and maintained independently
- A need to freeze development on a release candidate while still working on features for a subsequent release

Deployment Strategy:

Typically, organizations have a custom deployment strategy based on individual product requirements. However, below are some of the commonly adopted Kubernetes deployment strategies:

- RECREATE: Version A is terminated, then version B is rolled out. Use this strategy if downtime is not a problem. No extra step is needed in Kubernetes.
- RAMPED (also known as rolling update or incremental): Version B is slowly rolled out, replacing version A. No extra step is needed in Kubernetes. Use this strategy for stateful applications with fewer releases across instances.
- BLUE/GREEN: Version B is released alongside version A, then the traffic is switched to version B. Use this strategy for instant release and rollback, but at a cost.
- CANARY: Version B is released to a subset of users, then proceeds to a full rollout, using the service mesh.
- A/B TESTING: Similar to blue/green, but this strategy can be used when version B needs to be released to a subset of users under a specific condition, using the service mesh.
- SHADOW: Version B receives real-world traffic alongside version A and does not impact the response. Use this strategy to test performance, at a cost.

Any of the above deployment strategies can be used for Kubernetes and Istio based deployments. For automatic rollbacks, Helm and the Helm monitor plugin can help: the plugin can recognize common failure patterns and roll back to the previous release.
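The canary strategy above releases version B to a subset of users before a full rollout. One common way to choose that subset is to hash a stable user identifier into a bucket, so that a given user consistently sees the same version while the percentage is ramped up. The sketch below illustrates that selection rule only. It is an assumption about how such a split could be made, not a description of how Istio or any particular mesh assigns traffic, and pick_version is a hypothetical name.

```python
import hashlib


def pick_version(user_id: str, canary_percent: int) -> str:
    """Route a user to version "B" (canary) or "A" (stable).

    The user id is hashed into one of 100 buckets; buckets below
    canary_percent go to the canary. The mapping is deterministic, so a
    user stays on the same version for a given percentage, and users
    already on "B" remain on "B" as the percentage increases.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "B" if bucket < canary_percent else "A"
```

Setting the percentage to 0 sends everyone to version A and 100 sends everyone to version B, so the same rule covers the start of a rollout, the ramp, and the final cut-over.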

Resiliency testing could be achieved by measuring the following:

- Node restart recovery time
- Node death recovery time
- API stress recovery time
- Data plane stress recovery time
- Container death recovery time

Tools that can be used are chaoskube, PowerfulSeal, Pumba and kube-monkey. Chaoskube is the more widely used, with active contributors.

Chaos testing and service resiliency testing can also be done using Istio (service mesh) fault injection.

Similarly, performance testing of the control plane API and the data plane can be automated using JMeter (or similar) tests.

How to handle security?

Many security tests and tools can be employed, covering any of these:

- Injection
- Cross Site Request Forgery (CSRF)
- Cross Site Scripting (XSS)
- Cross Site Tracing (XST)
- Broken Authentication and Session Management
- Security Misconfiguration
- Insecure Direct Object References
- Sensitive Data Exposure
- Using Components with Known Vulnerabilities
- Insecure HTTP Redirect
- HTTP Parameter Pollution
- Denial of Service (DoS)
- Penetration testing (kube-hunter, nmap, ZAP proxy, Wireshark, Kali Linux, IBM AppScan, Nessus)
- Container security (Clair, GoSec, kube-bench and AquaSec)
- API security (Burp Suite Pro, Peach API Security)

Operations and SRE:

From an SRE perspective, a service mesh (Istio) provides three main things: abstraction, intelligence and extensibility. Operational intelligence supports automation for the SRE team. The mesh provides golden metrics such as traffic, latency, errors and saturation for monitoring distributed systems. Istio abstracts away the different policy and telemetry backend systems, which makes services agnostic of those backends. Telemetry from the service mesh is reported to Mixer, from where it can be routed to various monitoring backend systems using adapters. Istio also has policy enforcement for assigning quotas, securing access to endpoints etc., which is useful for SRE and operations.
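The four golden metrics named above (traffic, latency, errors and saturation) can be derived from basic per-request records. The sketch below shows one simple way to compute them over a time window. The RequestRecord fields, the choice of the 95th percentile, counting only 5xx responses as errors, and expressing saturation as traffic divided by an assumed capacity are all illustrative assumptions. A real mesh exports these through its telemetry backend rather than through code like this.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestRecord:
    latency_ms: float
    status_code: int


def golden_signals(records: List[RequestRecord], window_s: float,
                   capacity_rps: float) -> Dict[str, float]:
    """Summarise a window of requests into the four golden metrics."""
    if not records:
        raise ValueError("at least one request record is required")
    latencies = sorted(r.latency_ms for r in records)
    # Nearest-rank 95th percentile, using integer arithmetic for the rank.
    rank = (95 * len(latencies) + 99) // 100
    traffic_rps = len(records) / window_s
    server_errors = sum(1 for r in records if r.status_code >= 500)
    return {
        "traffic_rps": traffic_rps,
        "latency_p95_ms": latencies[rank - 1],
        "error_rate": server_errors / len(records),
        # Simplification: load relative to an assumed maximum throughput.
        "saturation": traffic_rps / capacity_rps,
    }
```

A percentile is used for latency instead of the mean because a small number of very slow requests, which is often the first sign of trouble, barely moves an average.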

Here is an example of how security can be integrated into the pipeline itself (an example of "DevSecOps"):

[Figure: DevSecOps pipeline. Lifecycle: Governance, Requirement, Design, Build, Deploy, Test, Ongoing Operations. Requirement inputs: stakeholders, scope, security runway, standard user stories, risk rating, checklist and job aides. Design: SD Elements, threat modeling, security review. Self-service capabilities: training availability, security team engagement, templates/job aides. Backlog: user stories, requirements, test cases, priority. Pipeline steps: commit, compile and package, code analysis, run unit tests (static); create ST env, deploy code, load test data, run test harness, tear down ST env (dynamic); create cluster env, deploy code, run perf test, run security test, run ops test, tear down ST env. Ongoing operations: malware and HIPS, firewall management, pentest and vulnerability scanning, logging, runtime simulation, tool monitoring, vulnerability management, BDD security. Security team (onshore or offshore) interprets report results, assists with remediation direction and provides on-demand capabilities to CI/CD teams and platform teams. Note: Tools reflected are examples only.]

What are other best practices to follow?

- When architects are engineering platforms, it is possible for them to end up in "resume driven development," which tends to make them custom engineer many of the components. For example, engineers tend to write their own scaling mechanisms when scaling is already provided by the Kubernetes platform itself. Supportability of an in-house developed platform is a challenge. It is not possible to fix or refactor everything. Be cautious of this trap and leverage, as much as possible, the platform features provided by the Kubernetes and Istio ecosystem.
- A risk-based approach to testing and to end-to-end testing is important. Proper health checks, liveness probes and readiness probes need to be configured on good-to-go services.
- It is very common for developers to focus too much on making things better as they learn more during the course of the project. It is good to get every engineer and developer involved, and everyone should be aware of the full stack including deployment; otherwise it is not possible to scale.
- It is better to have a live stack and an integrated master instead of keeping different components in separate branches. A single pipeline is risky, but separate deployment pipelines with different stacks also have disadvantages. It is important to determine the tradeoffs early on and ensure that a live cluster is running.
- In many cases, a complete "rip and replace" may not be required when creating microservices; it is all about ensuring that correct decoupling happens between components so that they can scale and be deployed individually. It is important to define high-level scope and accountabilities that are functionally and business independent for each microservice. Functional and technical decoupling of microservices is a key maturity measure. Avoid an overly prescriptive model, so as not to prioritize "design elegance" over efficiency and effectiveness. Ensure reuse of microservices as much as possible, with proper governance.
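The liveness and readiness probes recommended above answer two different questions: "is the process still healthy, or should it be restarted?" and "can it take traffic right now?". The sketch below separates the two using Python's standard http.server. The endpoint paths /healthz and /ready, the ServiceState fields and the serve helper are assumptions chosen for illustration; Kubernetes HTTP probes treat any status from 200 to 399 as success, so 503 signals failure.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class ServiceState:
    """Flags a hypothetical service flips as it starts up or degrades."""

    def __init__(self) -> None:
        self.alive = True                 # False would ask for a restart
        self.dependencies_ready = False   # e.g. database connection established


def probe_status(path: str, state: ServiceState) -> int:
    """Map a probe path to an HTTP status code."""
    if path == "/healthz":   # liveness: is the process itself healthy?
        return 200 if state.alive else 503
    if path == "/ready":     # readiness: may traffic be routed here yet?
        return 200 if (state.alive and state.dependencies_ready) else 503
    return 404


STATE = ServiceState()


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        self.send_response(probe_status(self.path, STATE))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Start the probe server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Keeping the two probes separate matters during startup: a service that is alive but still warming up should be left running and simply kept out of load balancing until it reports ready.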

Conclusions

By leveraging modern Cloud Native Computing Foundation platforms like Kubernetes and the Istio ecosystem, enterprises can move away from monolithic legacy application environments. This new architecture will not only be cloud agnostic, but will also be robust and able to flex to meet customer needs. Using new technologies requires a change in mindset, and organizations' cultures should facilitate this.

Those willing to embark on the great migration can successfully move from legacy environments to new cloud native architectures and capture the advantages such architectures offer.

Authors

Aneel Kumar, aneel.a.kumar@accenture.com
Badri Narayan RD, badrinarayan.rd@accenture.com
Ram Ramalingam, ramadurai.ramalingam@accenture.com

This document makes descriptive reference to trademarks that may be owned by others. The use of such trademarks herein is not an assertion of ownership of such trademarks by Accenture and is not intended to represent or imply the existence of an association between Accenture and the lawful owners of such trademarks.

ABOUT ACCENTURE

Accenture is a leading global professional services company, providing a broad range of services and solutions in strategy, consulting, digital, technology and operations. Combining unmatched experience and specialized skills across more than 40 industries and all business functions, underpinned by the world's largest delivery network, Accenture works at the intersection of business and technology to help clients improve their performance and create sustainable value for their stakeholders. With approximately 425,000 people serving clients in more than 120 countries, Accenture drives innovation to improve the way the world works and lives. Visit us at www.accenture.com
