Cloud Resource Orchestration Programming - WordPress


Cloud Computing

Cloud Resource Orchestration Programming: Overview, Issues, and Directions

The pervasiveness and power of cloud computing alleviate some of the problems application administrators face in their existing hardware and locally managed software environments. However, the rapid increase in scale, dynamicity, heterogeneity, and diversity of cloud resources necessitates having expert knowledge about programming complex orchestration operations (for example, selection, deployment, monitoring, and runtime control) on those resources to achieve the desired quality of service. This article provides an overview of the key cloud resource types and resource orchestration operations, with special focus on research issues involved in programming those operations.

Rajiv Ranjan, CSIRO and University of New South Wales
Boualem Benatallah, University of New South Wales
Schahram Dustdar, Vienna University of Technology, Austria
Michael P. Papazoglou, Tilburg University, the Netherlands

Over the last few years, cloud computing has emerged as the new model of distributed computing by offering hardware and software resources as virtualization-enabled services. Cloud computing [1] providers such as Amazon Web Services (AWS) and Microsoft Azure give application owners the option to deploy their application over a network with a virtually infinite resource pool with practically no upfront capital investment and with modest operating costs.
Today, cloud computing systems (see /SP800-145.pdf) follow a service-driven, layered software architecture model (see Figure 1), with Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS).

Key to exploiting the potential of cloud computing is the issue of resource orchestration (RO) [2,3]. Based on the analysis of several research papers, commercial products, and analyst reports, we define RO as the set of operations that cloud providers (such as AWS) and application owners (such as Netflix) undertake (either manually or automatically via computer programs) for selecting, deploying, monitoring, and dynamically controlling the configuration of hardware and software resources as a system of quality of service (QoS)-assured components that can be seamlessly delivered to end users. As Figure 1a shows, RO operations span all the layers of a cloud computing stack. The overall goal of RO is to ensure successful hosting and delivery of applications (SaaS) by meeting the QoS objectives of cloud application owners (for example, maximizing availability and throughput, while minimizing latency and avoiding an overload) and resource providers (maximizing usage, energy efficiency, profit, and so on), respectively. A recent report from the Open Data Center Alliance defines 19 usage scenarios of RO, spanning all three layers of the cloud stack (see www.opendatacenteralliance.org/docs/ODCA Service Orch MasterUM v1.0 Nov2012.pdf).

Figure 1. Overview of a cloud computing system. (a) Reference cloud resource stack. The architecture provides a layered approach to characterizing resources based on their attributes and granularity. (b) High-level architecture of a multilayered enterprise application consisting of clients, a load balancer, Web servers, application servers, and database management servers. The flow of requests between these layers is often complex. Each layer might instantiate multiple software resources, and each software resource might need to be replicated on multiple hardware resources (for example, CPUs), while load balancers distribute requests across instances of software resources. (c) Abstract resource orchestration (RO) operations in the lifecycle of an enterprise application. (BLOB = Binary Large Object; EBS = Elastic Block Store; ERP = enterprise resource planning; IaaS = Infrastructure as a Service; PaaS = Platform as a Service; SaaS = Software as a Service; and VPN = virtual private network.)

Published by the IEEE Computer Society. 1089-7801/15/$31.00 © 2015 IEEE. IEEE Internet Computing, May/June 2015. www.computer.org/internet/

Programming RO is challenging, because cloud applications are composed of heterogeneous software and hardware resources that are deployed across the cloud stack and might have complex integration and interoperation dependencies. Currently, orchestrating cloud resources requires human familiarity with the various providers and extensive manual programming. This is inadequate, given the dynamic variation of application resource requirements, and the proliferation of autonomous and heterogeneous cloud service providers offering resources at different layers (IaaS, PaaS, and SaaS). Dynamic variation of application resource requirements [4,5] arises from a number of factors, including resource capacity demand (such as bandwidth, memory, and processing power), failures (of a network link or resource), end-user access patterns (number of users, request arrival pattern burstiness, request service time distribution, and user location), and variations in resource prices. Modern configuration management solutions such as Amazon OpsWorks and Puppet provide support for describing resource configuration over cloud services. However, even sophisticated professional programmers and system administrators must regularly work with different low-level cloud service APIs, command-line languages, Web interfaces, and procedural programming to create and maintain complex cloud resource configurations. Given the importance of resource orchestration to cloud service consumers, major cloud service providers are rapidly improving their cloud resource-management capabilities.
Recent offerings such as CloudSwitch (see https://home.cloudswitch.com), Azure Fabric Controller (see http://fabriccontroller.net), and AWS CloudFormation (see http://aws.amazon.com/cloudformation) exemplify such trends.

To help navigate this terrain, here we characterize cloud resource orchestration in a multilayered stack and highlight the main research challenges involved with programming orchestration operations for different cloud resource types.

RO Operations for Hosting Enterprise Applications on the Cloud

The application architecture (such as content delivery networks, streaming Big Data analytics applications, and high-performance computing applications) determines how, when, and which orchestration operations should be effected on cloud resources. Though lack of space doesn’t permit discussion about all application architectures, here we discuss some orchestration operations for managing typical enterprise applications (see https://media.amazonwebservices.com/AWS Web Hosting Best Practices.pdf). Figure 1b depicts the high-level architecture of an enterprise application, which consists of multiple software resource layers, including the presentation, business logic, and data layers. Across each layer, we must program a number of orchestration operations to control the resources at design time, as well as at runtime, to fulfill the QoS objectives. We detail the operations in the following paragraphs (see also Figure 1c).

Selecting resources (at design time and runtime). An application owner analyzes candidate software resources to determine whether we can select them for realizing the required functionality while satisfying certain resource requirements and constraints (for example, interoperability with other software resources, compatibility with target hardware resources, cost, availability, and so on). Next, we select the compatible hardware resources that we can allocate to software resources.

Deploying resources (at design time and runtime).
This operation involves instantiating software resources on cloud services and configuring them for communication and interoperation with other software resources. Integrating an application server with the database server (see Figure 1b) is a salient example of this orchestration operation.

Monitoring resources (runtime). Monitoring QoS attributes of cloud applications involves detecting event patterns (such as a load spike) from information produced by deployed resources (for example, application usage statistics).

Controlling resources (runtime). Based on detected event patterns, a resource orchestrator can react to deviations in application behaviors and initiate (policy-based) corrective actions, ideally without disrupting the runtime system. An example resource control operation could be to vertically scale a database server by migrating it from a small CPU resource configuration to an extra-large CPU resource in AWS Elastic Compute Cloud (EC2) for improving throughput.

Cloud Resource Types and Orchestration Challenges

Now, let’s look at each resource type through examples (see Figure 1a) and analyze the core research challenges involved with programming orchestration operations.

IaaS

The CPU, storage, and network resources in cloud environments are supplied by a collection of datacenters installed with hundreds to thousands of physical resources such as cloud servers, storage repositories, and network backbone. These resources expose configuration attributes (see Table 1) that define consumable features and functions that are available from hardware resources. Providers manage these physical resources through hardware virtualization technologies, such as Xen [6], Citrix, and VMware (see www.vmware.com/au/virtualization).

A CPU resource is essentially a piece of virtualization software running on the physical cloud server. It’s the most common method of exposing computational power to software resources, and it gives finer-granularity accessibility and flexibility at the super-user level, which can help customize the placement of software resources for QoS. The CPU resource emulates the properties of a physical CPU resource by providing a virtual CPU, network card, physical memory, and hard disk. Table 1 shows orchestration operations relevant to IaaS resources.

The second type of IaaS-level hardware resource is Binary Large Object (BLOB) data storage, which lets users store raw application data on virtualized disks and access it anytime from any point on the Internet. BLOB storage (such as AWS Simple Storage Service) can hold video, audio, photos, and archived email messages. This storage type aims to enforce fault-tolerant behavior through redundancy. For example, Azure provides different redundancy options [7,8] for its BLOB and other types of storage resources (queues and tables), including locally redundant storage, geo-redundant storage, and read-access geo-redundant storage.

A CPU resource has access to its local hard disk.
However, by default, the local disk is nonpersistent; once the instance of a CPU resource is terminated, its local storage contents are purged. To overcome this issue, cloud providers offer off-instance storage resources that persist independently from the life of a CPU resource. These off-instance storage resources are referred to as the Elastic Block Store (EBS) and XDrive in AWS EC2 and Microsoft Azure, respectively. Principal advantages of designing applications using off-instance storage include automatic data replication, which prevents data loss due to a single point of failure, and point-in-time data snapshot creation and backup to cloud-specific BLOB storage resources.

As the need for high-volume data transfer and communication across network boundaries grows for applications, networking resources (for example, routers, switches and communication bandwidth, AWS elastic IP, OpenFlow, and the AWS security group) become vital components at the IaaS level. Network resources provide a variety of functionality, including bandwidth, virtual overlays for isolating traffic, guaranteed message delivery delay, encrypted communication channels, and network monitoring.

Programming IaaS-Layer Orchestration Operations

Here we discuss research issues in programming orchestration operations at the IaaS layer.

Selecting optimal IaaS resources. The diversity of offerings at this layer leads to complex decision-making problems of optimal comparison and selection of IaaS resources from multiple cloud providers. For example, how does an application engineer compare the cost and performance features of hardware resources offered by different providers such as AWS and Azure? Similarly, an engineer can choose one provider for storage-intensive applications and another for computation-intensive applications. During the selection process, an engineer must consider many attributes (see Table 1), including goals, comparison benchmarks, and resource type alternatives.
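As a rough illustration of the comparison problem just described, the following Python sketch drops candidate offerings that violate a hard requirement and ranks the rest by a weighted cost and performance score. The provider names, prices, benchmark scores, and weights are invented for illustration and aren’t real AWS or Azure figures; weighted scoring is only one simple technique and isn’t claimed to be the method used in the work this article cites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offering:
    """A hypothetical IaaS CPU offering; all values used below are made up."""
    provider: str
    name: str
    hourly_cost: float      # renting cost in USD per hour (lower is better)
    benchmark_score: float  # synthetic performance score (higher is better)
    location: str           # cloud location, treated as a hard requirement


def rank_offerings(offerings, required_location, cost_weight=0.5, perf_weight=0.5):
    """Drop offerings that violate the location requirement, then rank the rest."""
    feasible = [o for o in offerings if o.location == required_location]
    if not feasible:
        return []
    # Assumes strictly positive costs and benchmark scores.
    max_cost = max(o.hourly_cost for o in feasible)
    max_perf = max(o.benchmark_score for o in feasible)

    def score(o):
        # Normalize both criteria to [0, 1]; cost is inverted so cheaper scores higher.
        cost_term = 1.0 - o.hourly_cost / max_cost
        perf_term = o.benchmark_score / max_perf
        return cost_weight * cost_term + perf_weight * perf_term

    return sorted(feasible, key=score, reverse=True)


candidates = [
    Offering("provider-a", "small", 0.05, 100.0, "eu"),
    Offering("provider-a", "xlarge", 0.40, 700.0, "eu"),
    Offering("provider-b", "medium", 0.10, 260.0, "eu"),
    Offering("provider-b", "large", 0.20, 400.0, "us"),
]

ranked = rank_offerings(candidates, required_location="eu")
for offering in ranked:
    print(offering.provider, offering.name)
```

Changing the weights shifts the ranking toward cheaper or faster offerings, which is where the choice of selection criteria becomes a modeling decision rather than a coding one.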
The main research challenges include how to identify and formulate selection criteria and solve qualitative (that is, the virtualization format and cloud location) and quantitative (for example, minimizing response time and cost) QoS constraints while considering a large number of IaaS resource alternatives and application use cases. Existing approaches have focused on applying combinatorial optimization [9], evolutionary optimization [10], and multicriteria decision-making [11] techniques for solving the selection problem.

Table 1. IaaS, PaaS, and SaaS resource types, their attributes, and list of supported orchestration operations.

IaaS: CPU
  Attributes: Cores, speed, family, physical memory capacity, storage capacity, addressing bits, I/O performance, renting cost, type (single or cluster of templates), resource sharing (multitenant or dedicated), physical location of cloud, availability zone, availability, performance statistics, service level agreement (SLA), security, privacy, and integrity
  Supported orchestration operations: Start, stop, restart, select, mount off-instance storage, monitor, reconfigure, assign IP, select cloud location, select availability zone, scale-in, scale-out, authorize, and authenticate

IaaS: BLOB storage
  Attributes: Type (persistent or nonpersistent), storage size, storage format, renting cost, location of host cloud, availability zone, availability, performance statistics, SLA, security, privacy, and integrity
  Supported orchestration operations: Create new buckets, upload file, download file, scale-in, scale-out, monitor, encrypt, decrypt, authorize, and authenticate

IaaS: Network
  Attributes: IP type (static or dynamic), version (IPv4 or IPv6), renting cost, message encryption cost, URL, data transfer-in cost, data transfer-out cost, connection hour, availability, performance statistics, SLA, security, privacy, and integrity
  Supported orchestration operations: Allocation of IP addresses, URL, ports, availability zone, VPN to CPU resources, and monitor

PaaS
  Attributes: Feature (Web server, database server, load balancer, authorization server, and so on), virtualization format (such as Xen and VMware), environment (host operating system, implementation language such as Java, .Net, PHP, or Ruby on Rails), legal and regulatory issues, security, reliability, integrity, licensing terms and costs, initialization scripts, availability, performance statistics, and SLA
  Supported orchestration operations: Start, stop, restart, select, allocate hardware resources, integrate with other appliances, install script, monitor, create, migrate, scale-in, scale-out, login, log-out, install software, replicate, synchronize, backup, delete, encrypt data, decrypt data, authorize, and authenticate

SaaS
  Attributes: Feature (email, customer relationship management, ERP, social networking, document management, and crowdsourcing), legal and regulatory issues, security, privacy, integrity, reliability, licensing terms and costs, availability, performance statistics, SLA, and data portability
  Supported orchestration operations: Customize, accounting, billing, select, data porting, authentication, and authorization

Controlling concurrency. An orchestration operation on a particular class of hardware resources (such as a CPU resource) is enforced by invoking its respective (provider-specific) Web service API. Programming applications that can be hosted across distributed IaaS resources requires a developer to orchestrate concurrent computation and communication across heterogeneous cloud services, in a manner that’s robust to delays and failures. For example, consider a multistep orchestration operation of allocating a CPU resource to a software resource, followed by assigning an elastic network IP and mounting an EBS resource. If one of the intermediate operations fails or throws an unexpected error, a trivial implementation would fail-stop, leaving the system in an inconsistent state. To ensure deadlock-free orchestration that can deal with a high level of concurrency and network traffic arising from potentially large numbers of overlapping requests, recent efforts [2,12] have advocated programming resource orchestration based on declarative programming languages.

Configuring dynamic resources. The impetus behind cloud computing is the ever-increasing demand to manage growth and increase computing flexibility by dynamically scaling up or down resources based on demand [4,5]. However, existing cloud resource provisioning techniques don’t effectively support dynamic resource configuration. For instance, applications or workloads can’t be dynamically and automatically partitioned or migrated arbitrarily from one cloud service to another if demand cycles increase. Moreover, dynamic configuration of resources is a complex issue because of lack of visibility and control across heterogeneous services at different layers.
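Returning to the multistep operation described under "Controlling concurrency" (allocate a CPU resource, assign an elastic IP, mount an EBS resource), one way to avoid the fail-stop outcome is to pair each step with a compensating action and undo the completed steps in reverse order when a later step fails. The sketch below is a minimal imperative illustration of that idea in Python. It isn’t the declarative approach advocated in the cited work, and the step functions are hypothetical in-memory stand-ins rather than calls to any real provider API.

```python
class OrchestrationError(Exception):
    """Raised when a multistep operation fails and has been rolled back."""


def run_with_rollback(steps):
    """Run (name, do, undo) steps in order; on failure, undo completed steps in reverse."""
    completed = []
    for name, do, undo in steps:
        try:
            do()
        except Exception as exc:
            for _done_name, done_undo in reversed(completed):
                done_undo()  # compensating action for an already completed step
            raise OrchestrationError(f"step '{name}' failed; rolled back") from exc
        completed.append((name, undo))


# Hypothetical stand-ins for provider calls; a real orchestrator would invoke
# the provider-specific Web service API at these points.
state = set()


def mount_volume_fails():
    raise RuntimeError("volume unavailable")  # simulated provider error


steps = [
    ("allocate_cpu", lambda: state.add("cpu"), lambda: state.discard("cpu")),
    ("assign_ip", lambda: state.add("ip"), lambda: state.discard("ip")),
    ("mount_volume", mount_volume_fails, lambda: state.discard("volume")),
]

try:
    run_with_rollback(steps)
except OrchestrationError as err:
    print(err)

print(sorted(state))
```

This sketch assumes that compensating actions themselves succeed and that no other request touches the same resources concurrently; handling those cases is part of what makes the problem hard.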
Advanced cloud resource orchestration techniques [13] have focused on developing an analytical application workload-prediction model for forecasting application resource requirements, and developing adaptive resource management techniques that can dynamically configure resources to meet requirements and constraints. Initial research results are promising, and in many cases we can leverage research from the field of autonomic computing to a certain extent. However, designing effective dynamic cloud resource orchestration techniques that cope with large-scale heterogeneous cloud environments remains a deeply challenging problem.

Allocating cloud resources energy-efficiently. In recent years, energy-efficient allocation [12] of hardware resources to applications has emerged as a critical requirement, due to the worldwide focus on minimizing the carbon footprint. Efforts have focused on fabricating energy-efficient hardware, such as low-power CPUs and solid-state drives, to minimize energy consumption. The research community has also focused on software-based approaches to minimize energy consumption, such as resource allocation and task consolidation. That said, the development of energy-efficient IaaS resource orchestration techniques that take into account application-specific service-level agreements (SLAs) while making resource allocation decisions for software resources remains a difficult and open research problem.

Data security and privacy. The most significant difference between cloud security and traditional security controls stems from the fact that users spanning different corporations and trust levels often interact with the same set of computing resources. The security and availability of general cloud resources depend on the security of basic APIs. From authentication and access control to encryption and activity monitoring, we must design these interfaces to protect against both accidental and malicious attempts to circumvent policy. For example, consider BLOB storage resources that have limited data security and privacy features, such as simple access control based on trusted credentials. BLOBs only support fine-grained security and privacy features to protect their end users from the following risks: data exposure (confidentiality), data tampering (integrity), and denial of access to data (availability).
Recent research efforts have focused on developing additional third-party security infrastructures [14] to ensure the security, privacy, and integrity of data, not only while being transmitted over network links but also while at rest on BLOB resources.

Interoperability. To improve resilience, an intuitive solution is to deploy applications across multiple IaaS providers. Unfortunately, most of the existing providers aren’t compatible with each other. They tend to have proprietary APIs, which aren’t explicitly designed for cross-cloud interoperability. To tackle such heterogeneities, there’s a requirement to enforce standardization across layers of the cloud resource stack. Recent developments, including Delta Cloud, jclouds, and Dasein Cloud (see http://dasein-cloud.sourceforge.net), simplify this task by implementing a single API that abstracts APIs related to multiple clouds such as AWS EC2 and GoGrid. We can orchestrate fundamental cloud resources such as CPU, appliances, and storage via SOAP/RESTful APIs. However, orchestrating monitoring, load balancing, and auto-scaling RO operations to handle uncertainties in application and resource behaviors across clouds via a unified API still isn’t viable, and hence remains an open research problem. The Topology and Orchestration Specification for Cloud Applications (TOSCA; see www.oasis-open.org/committees/tc home.php?wg abbrev tosca#overview) is an interoperability specification that provides building blocks to support cross-stack orchestration of cloud resources.

PaaS

The PaaS layer features a rich pool of software appliances that facilitate the end-to-end lifecycle of developing, testing, deploying, and hosting applications. The following software resource categories are relevant at this layer.

Appliances.
Appliances [15] are preconfigured, self-contained, virtualization-enabled, and prebuilt software resource units (database, Web server, application server, Apache Hadoop, Apache Storm, load balancers, and so on) that we can integrate with other compatible appliances for designing complex applications. Primarily, it’s the goal of the resource orchestrator to select, assemble, deploy, and manage a set of appliances (refer to https://solutionexchange.vmware.com/store/category groups/virtual-appliances) delivering a particular application functionality.

For instance, several reusable appliances have emerged in the area of Big Data processing (refer to n-big-data), including SQL and NoSQL appliances [16]. SQL appliances (see http://aws.amazon.com/rds) provide traditional relational database systems (such as MySQL, SQL Server, PostgreSQL, and Oracle). NoSQL appliances (for example, Neo4j, CouchDB, MongoDB, Cassandra, and Amazon Dynamo) offer efficient support for unstructured data management and limited-to-no support for the atomicity, consistency, isolation, and durability (ACID) transaction principles of SQL-like database systems.

In addition, to process Big Data produced by social media, mobile devices, the Internet of Things, business transactions, and content distribution, there has been a paradigm change from the traditional “one shot” machine-learning approach to elastic and virtualized cloud-based machine learning (ML) and data-processing appliances that are able to mine continuous, high-volume, open-ended data streams.

Distributed ML appliances [17] (such as Apache Mahout, MLBase, GraphLab, R, FlexGP, Vowpal Wabbit, MOA, and Pegasus) implement a wide range of ML algorithms (for example, clustering, decision trees, latent Dirichlet allocation, regression, and Bayesian) that are capable of mining datasets in parallel by leveraging a distributed set of machines.

Special data processing appliances, such as Apache S4 (see http://incubator.apache.org/s4), Twitter Storm, Amazon Kinesis, StreamBase, and Apache Hadoop, enable programming of applications that rapidly process massive amounts of data in parallel on large sets of machines. To speed up the ML algorithms, these data processing appliances simplify the process of distributing the training and learning tasks across a parallel set of resources.

We distinguish between basic and composite appliances. A basic appliance (such as AWS Relational Database Service [RDS], Apache Mahout, Apache Storm, and SQL Azure) delivers a single abstract functionality that might not be sufficient to design a fully functional application stack. Examples include Web server, database, and monitoring appliances. Consequently, multiple basic appliances might need to be integrated to create a functional application. On the other hand, a composite appliance encapsulates a number of software resource units to support a standalone, fully functional application.
For instance, Bitnami’s Redmine composite appliance encapsulates multiple software resource units, including MySQL and Ruby on Rails.

We can classify appliances as customizable and noncustomizable. Bitnami and rPath offer appliances that we can customize in terms of their mapping to hardware resources. For instance, we can map a Bitnami appliance (see http://wiki.bitnami.com/Applications) to one of the AWS CPU resource types, depending on anticipated QoS targets. Similarly, with customizable appliances, users have the flexibility to mount EBS volumes, if persistence of application data is a requirement. On the other hand, providers such as AWS EC2 offer noncustomizable appliances that we can integrate directly (without any further modification to their hardware resource configuration) into an application. For instance, AWS offers a load balancer and a monitoring appliance such as AWS CloudWatch (see http://aws.amazon.com/cloudwatch) for integration with other appliances (such as an app server appliance) to be hosted on AWS EC2.

Programming PaaS-Layer Orchestration Operations

Now, let’s discuss research issues in programming orchestration operations at the PaaS layer.

Selecting optimal appliances. This operation requires an understanding of the technical details, features, and interoperation ability of competing appliances. In particular, the orchestrator needs to evaluate whether an appliance can deliver the requested functionality (such as stream data processing, database server, and source code management server). If a group of appliances is going to be selected, then they must meet integration constraints. Finally, an appliance’s compatibility with the virtualization technology of the target cloud must be considered during the selection process. Other important selection criteria include the appliance’s host operating system and its programming environment.
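The selection criteria above can be read as a set of hard filters. The following Python sketch assumes that each appliance is described by a small, uniform metadata record, and keeps only the appliances that deliver the requested functionality, match the target cloud’s virtualization format and host operating system, and can integrate with the other required appliance types. The field names and catalog entries are hypothetical; real appliance metadata is far less uniform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appliance:
    """Hypothetical appliance metadata record; field names are illustrative only."""
    name: str
    feature: str                # for example "database server"
    virtualization_format: str  # for example "xen" or "vmware"
    host_os: str
    integrates_with: frozenset  # features this appliance can interoperate with


def select_appliances(catalog, feature, target_format, host_os, must_integrate_with=()):
    """Return the appliances that satisfy every hard constraint."""
    required = set(must_integrate_with)
    return [
        a for a in catalog
        if a.feature == feature
        and a.virtualization_format == target_format
        and a.host_os == host_os
        and required <= a.integrates_with  # integration constraints
    ]


catalog = [
    Appliance("db-a", "database server", "xen", "linux", frozenset({"app server"})),
    Appliance("db-b", "database server", "vmware", "linux", frozenset({"app server"})),
    Appliance("db-c", "database server", "xen", "linux", frozenset()),
    Appliance("web-a", "web server", "xen", "linux", frozenset({"app server"})),
]

matches = select_appliances(
    catalog, "database server", "xen", "linux", must_integrate_with=["app server"]
)
print([a.name for a in matches])
```

Hard filtering of this kind only narrows the candidate set; ranking the survivors against QoS targets is a separate step.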
To solve the appliance selection problem, research efforts have focused on applying multicriteria decision making [11] and semantic-based service discovery techniques [18].

Integrating appliances. An application can be composed of several appliances. The deployment procedures and the order of their execution are unique to each application and cloud environment. Dependencies between various appliances in an application must be taken into account to ensure correct deployments. In today’s interoperability solutions, templates are used as an interoperation mechanism to combine appliances. These templates capture unstructured information about appliances, which is notoriously difficult to use as a means of combining them with other compatible templates to render composite offerings at the PaaS level. Today, no description language exists to describe and combine metadata descriptions of PaaS appliances in a uniform manner. Instead, a plethora of such formalisms exists, with varying types of concerns and different capabilities. Nor is there a consistent way to model the dependencies between operational and deployment dimensions to create end-to-end combinations of modular cloud stack offerings to meet application or consumer demands.

Comprehensive monitoring [19]. Although cloud providers offer proprietary monitoring appliances, such as CloudWatch by AWS and Fabric Controller by Azure, these have the following limitations: an inability to monitor application components deployed across multiple cloud providers; and an inability to support QoS (such as latency and availability) monitoring for individual software resources (for a Web, application, or database server, for example). To improve this situation, recent research efforts have focused on developing monitoring techniques that can monitor both hardware and software resources and support a comprehensive list of QoS parameters for each resource type.

Managing Big Data. While research groups have focused on large-scale data management in traditional enterprise settings, cloud computing and its available NoSQL and SQL appliances have their own research challenges with regard to programming orchestration operations (such as selection, scale-in, scale-out, synchronize, replicate, and backup). Supporting ad hoc querying on top of NoSQL appliances and providing hard data-consistency guarantees remain open research problems. Further, it’s not clear how NoSQL appliances will perform for different classes of applications (for enterprises or streaming Big Data, for example) and workloads (decision support, I/O intensive, and so on).
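Returning briefly to the dependency problem raised under "Integrating appliances": once dependencies are written down explicitly, a valid deployment order follows from a topological sort. The Python sketch below uses the standard library’s graphlib module (Python 3.9 or later). The appliance names and the dependency map are hypothetical, and the sketch assumes the dependencies are already known, which is precisely the information that the templates discussed above don’t capture in a uniform way.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency map: each appliance maps to the set of appliances
# that must be deployed before it.
dependencies = {
    "load_balancer": {"web_server"},
    "web_server": {"app_server"},
    "app_server": {"database"},
    "database": set(),
}


def deployment_order(deps):
    """Return appliance names so that each appears after everything it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = deployment_order(dependencies)
print(order)

# A circular dependency has no valid deployment order and is reported as an error.
try:
    deployment_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("circular dependency detected")
```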
Developing techniques that can impart the intelligence [16] of characterizing the data density (density and distribution of data; or composition of queries) to a cloud-based load-balancing appliance (such as AWS Elastic Load Balancer) for improving the QoS (including query latency and database ser
