Introduction To Cloud Computing


Grid Computing
Definition
  – combination of computer resources from multiple administrative domains applied to a common task*
Core idea
  – distributed parallel computation
    – super virtual computer
* http://en.wikipedia.org/wiki/Grid_computing

Utility Computing
Definition
  – “The packaging of computing resources (computation, storage etc.) as a metered service similar to a traditional public utility”*
Observation
  – not a new concept
    – "If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility. The computer utility could become the basis of a new and important industry." - John McCarthy, MIT Centennial in 1961
* http://en.wikipedia.org/wiki/Utility_computing

Cloud Computing
Is cloud computing
  – grid computing?
  – utility computing?
  – difficult to define
    – means different things to different parties
Various definitions
  – NIST – National Institute of Standards and Technology
    – “universally” accepted definition

Cloud Computing – NIST
Definition
  – “Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.”*
* oud-def-v15.doc


NIST Essential Characteristics
On-demand self-service
  – a consumer can unilaterally provision computing capabilities without human interaction with the service provider
  – computing capabilities
    – server time, network storage, number of servers etc.

NIST Essential Characteristics
Broad network access
  – capabilities are
    – available over the network
    – accessed through standard mechanisms
  – promote use by
    – heterogeneous thin or thick client platforms

NIST Essential Characteristics
Multi-tenancy / Resource pooling
  – provider’s computing resources are pooled to serve multiple consumers
  – computing resources
    – storage, processing, memory, network bandwidth and virtual machines
  – location independence
    – no control over the exact location of the resources
  – has major implications
    – performance, scalability, security

NIST Essential Characteristics
Rapid elasticity
  – capabilities can be rapidly and elastically provisioned
  – unlimited virtual resources
  – predicting a ceiling is difficult

NIST Essential Characteristics
Measured service
  – metering capability of service/resource abstractions
    – storage
    – processing
    – bandwidth
    – active user accounts
OK, so what happened to utility computing – pay-as-you-go model?
  – more on this later when we discuss deployment models

Relevant Technologies
Access
  – heterogeneous set of thick & thin clients
    – PCs (enterprise, home), mobile devices, hand-held devices
  – high speed broadband access
    – wired & wireless
  – data centres
    – large computing capacity
    – distributed
    – direct access storage devices vs. storage area networks

Relevant Technologies
Virtualization
  – decoupling from the physical computing resources
Virtualization types
  – hardware
    – emulation - VM emulates/simulates complete hardware
      – QEMU
    – paravirtualization - software interface to virtual machines
      – Xen
    – full virtualization - complete simulation of the underlying hardware
      – VMware, Parallels

Relevant Technologies
Virtualization types
  – memory virtualization
    – decouples volatile random access memory (RAM) resources from individual systems
    – aggregates these resources into a virtualized memory pool available to any computer in the cluster
  – storage virtualization
    – abstracting logical storage from physical storage
    – NAS - network attached storage
  – data virtualization
    – data as an abstract layer, independent of underlying database systems, structures and storage

Relevant Technologies
Virtualization types
  – network virtualization
    – virtualized network addressing space within or across network subnets
    – VPNs
Question?
  – how do we measure virtual resources
    – Amazon ECU (elastic compute unit)
      – EC2 Compute Unit equals
        – 1.0-1.2 GHz 2007 Opteron or
        – 2007 Xeon processor

Relevant Technologies
APIs
  – required for various operations and applications
    – administration
    – application development
    – resource migration
  – no standards

SPI Services
SaaS (Software-as-a-Service)
  – vendor/provider controlled applications accessed over the network
  – characteristics
    – network based access
    – multi-tenancy
    – single software release for all
SaaS Examples
  – Salesforce.com, Google Docs

SPI Services
SaaS & Multi-tenancy
  – SaaS applications are multi-tenant applications
  – application data
    – Google Docs
SaaS Application Design
  – SaaS applications are 'net native'
  – configurability, efficiency, and scalability
  – SOA & SaaS

Net Native Application
Characteristics
  – cloud specific design, development & deployment
  – multi-tenant data
  – built-in metering & management
  – browser based client & client tools
  – customization via configuration

SPI Services
SaaS Disadvantages
  – dependency on
    – network, cloud service provider
  – performance
    – limited client bandwidth
  – security
    – good: better security than personal computers
    – bad: CSP is in charge of the data
    – ugly: user privacy

SPI Services
PaaS (Platform-as-a-Service)
  – vendor provided development environment
    – tools & technology selected by vendor
    – control over data life-cycle
Advantages
  – rapid development & deployment
  – small startup cost
    – required skill set
    – money

SPI Services
PaaS – Architectural Characteristics
  – multi-tenancy
    – data
  – native scalability
    – load balancing & fail-over
  – native integrated management
    – performance
    – resource consumption/utilization
    – load

SPI Services
PaaS Disadvantages
  – inherits all from SaaS
  – choice of development technology is limited to vendor provided/supported tools and services
PaaS Examples
  – Google App Engine
    – Google Site, Google Docs

SPI Services
IaaS (Infrastructure-as-a-Service)
  – vendor provided and consumer provisioned computing resources
    – processing, storage, network, etc.
  – consumer is provided customized virtual machines
  – consumer has control over
    – OS, memory
    – storage
    – servers & deployment configurations
    – limited control over network resources

SPI Services
IaaS
  – utility computing?
    – maybe – NIST does not talk about it
Advantages
  – infrastructure scalability
  – native integrated management
    – performance, resource consumption/utilization, load
  – economical cost
    – hardware, IT support

SPI Services
IaaS Examples
  – Amazon Elastic Compute Cloud (EC2)


SPI Services
[Figure: boundaries of control; recoverable legend text: "anization & service provider share control", "Service Provider controlled"]
[1] Visualizing the Boundaries of Control in the Cloud. Dec ing-the-boundaries-of-control-in-the-cloud/

XaaS
XaaS (Everything-as-a-Service)
  – composite second level services
    – Security-as-a-Service
      – McAfee*
        – McAfee SaaS Email Archiving
        – McAfee SaaS Email Inbound Filtering
        – McAfee Vulnerability Assessment SaaS (PEN Tests)
    – CaaS – Communication-as-a-Service
      – VoIP, private
* osted security/

A Simple Reference Model
[Figure; recoverable labels: service, cloud, applications, PaaS, IaaS]

Amazon Web Services
http://aws.amazon.com/

NIST Cloud Deployment Models
4 Deployment Models
  – private cloud
    – infrastructure is operated solely for an organization
    – managed by the organization or by a third party
  – community cloud
    – supports a specific community
    – infrastructure is shared by several organizations

NIST Cloud Deployment Models
4 Deployment Models
  – public cloud
    – infrastructure is made available to the general public
    – owned by an organization selling cloud services
  – hybrid cloud
    – infrastructure is a composition of two or more cloud deployment models
    – enables data and application portability

NIST Cloud
[Figure: ube.png]

Cloud Distributed Storage
Distributed Storage
  – Two approaches to scaling
    – vertical – bigger hardware
    – horizontal – more hardware
  – functional partitioning
  – horizontal partitioning
    – sharding*
* http://queue.acm.org/detail.cfm?id=1394128
* http://en.wikipedia.org/wiki/Shard_%28database_architecture%29
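Horizontal partitioning (sharding) can be sketched roughly as follows: each record key is hashed and routed to one of several shards. The shard count, the `shard_for`, `put` and `get` helpers, and the in-memory dictionaries standing in for database servers are assumptions made for illustration; they do not come from the slides.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of database servers


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# In-memory dictionaries stand in for separate database servers.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(key: str, value: str) -> None:
    """Store the value on the shard that owns the key."""
    shards[shard_for(key)][key] = value


def get(key: str):
    """Look the key up on the shard that owns it."""
    return shards[shard_for(key)].get(key)


put("user:alice", "alice@example.org")
put("user:bob", "bob@example.org")
print(get("user:alice"))
```

A plain modulo scheme like this moves most keys when the number of shards changes, which is one reason real systems often use a directory or consistent hashing instead.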

Cloud Distributed Storage
CAP Theorem*
  – web services cannot ensure all three of the following properties at once
    – consistency
      – set of operations has occurred all at once
    – availability
      – an operation must terminate in an intended response
    – partition tolerance
      – operations will complete, even if individual components are unavailable
* Eric Brewer, University of California, Berkeley

Cloud Distributed Storage
Horizontal Storage Scaling
  – “any horizontal scaling strategy is based on data partitioning”*
    – forced to decide between consistency & availability
  – ACID provides strong data consistency guarantees
    – at the cost of availability
    – 2PC availability = product of the availability of each participating database
* http://queue.acm.org/detail.cfm?id=1394128
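The 2PC point can be made concrete with a small calculation. The sketch assumes participants fail independently and that a transaction succeeds only if every participant is up. The function name and the 99.9% figure are hypothetical.

```python
from math import prod


def two_phase_commit_availability(availabilities):
    """Availability of a transaction that needs every participant to be up.

    Assumes the participants fail independently of one another.
    """
    return prod(availabilities)


# Hypothetical example: two databases, each available 99.9% of the time.
combined = two_phase_commit_availability([0.999, 0.999])
print(f"{combined:.4%}")
```

With two participants at 99.9% each, the combined availability drops to roughly 99.8%, and it keeps shrinking as more participants join the transaction.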

Cloud Distributed Storage
BASE – an ACID alternative
  – basically available, soft state, eventual consistency
  – characteristic
    – “optimistic and accepts that the database consistency will be in a state of flux”*
    – supports partial failures
  – scalability promise
    – “leads to levels of scalability that cannot be obtained with ACID”*
* http://queue.acm.org/detail.cfm?id=1394128

Cloud Distributed Storage
Eventual Consistency
  – consistency across functional groups is easy to relax
  – we encounter this on a daily basis
  – some scenarios
    – update of online user profile
    – online MasterCard payment
    – ATM cheque deposit
  – idempotent operations
    – permit partial failures
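One common way to make an operation idempotent, so that a retry after a partial failure is harmless, is to tag each request with an identifier and apply it at most once. The sketch below illustrates that idea only; the `Account` class and the request IDs are hypothetical and not taken from the slides.

```python
class Account:
    """Toy account whose deposits are idempotent per request ID."""

    def __init__(self):
        self.balance = 0
        self.applied = set()  # request IDs that have already been processed

    def deposit(self, request_id: str, amount: int) -> int:
        """Apply a deposit at most once per request ID and return the balance."""
        if request_id not in self.applied:
            self.balance += amount
            self.applied.add(request_id)
        return self.balance


account = Account()
account.deposit("req-1", 100)
account.deposit("req-1", 100)  # retry after a suspected failure: no double credit
print(account.balance)
```

Because a repeated request has no further effect, the caller can simply resend it whenever it is unsure whether the first attempt succeeded.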

Cloud Distributed Storage
General Characteristics
  – simplified data model
  – built on distributed file systems
    – GFS - Google File System
    – HDFS - Hadoop Distributed File System
  – highly available
    – relaxed consistency
  – fault-tolerant
    – replication

Cloud Distributed Storage
General Characteristics
  – eventual consistency
    – all replicas will be updated at different times and in different order
  – examples
    – Google BigTable
    – Yahoo PNUTS
    – Amazon S3
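A small simulation can show replicas receiving the same updates at different times and in different order, yet converging. It assumes a last-writer-wins rule based on timestamps, which is only one possible reconciliation strategy and is not claimed to be what BigTable, PNUTS or S3 do. The `Replica` class is hypothetical.

```python
class Replica:
    """Toy replica that reconciles updates with a last-writer-wins rule."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Keep the update with the highest timestamp for each key."""
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        entry = self.store.get(key)
        return entry[1] if entry else None


updates = [("profile", 1, "v1"), ("profile", 2, "v2")]
a, b = Replica(), Replica()

a.apply(*updates[0])             # replica a has seen only the first update so far
for update in reversed(updates):
    b.apply(*update)             # replica b receives both, in reverse order

stale_view = (a.read("profile"), b.read("profile"))  # replicas disagree for a while
a.apply(*updates[1])             # the second update finally reaches replica a
print(stale_view, a.read("profile") == b.read("profile"))
```

Between the two deliveries a reader may see either version depending on which replica answers; once every update has arrived everywhere, the replicas agree.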

Cloud Distributed Computation
Motivation
  – distributed computing
    – many thousands of computers
  – large datasets
  – fault-tolerant
  – easy to configure & manage

Cloud Distributed Computation
Basic Idea
  – functional programming
  – functional decomposition
    – large problem broken into a set of small problems
    – each small problem can be solved by a functional transformation of input data
      – remember pipes & filters?
    – can be executed in complete isolation
      – parallel computing
  – server (task) farm
    – to solve the big problem

Cloud Distributed Computation
Distributed grep
[Figure; recoverable labels: grep, matches, concat, solution]

Cloud Distributed Computation
Distributed wc
[Figure; recoverable labels: count, counts, merge, solution]

[Figure; recoverable labels: DATA, PARTITIONING, MAP, REDUCE, merge]

MapReduce
Map
  – input: key/value pair
  – output: intermediate key/value pair
Reduce
  – input: intermediate key/value pair
  – output: final key/value pair

MapReduce
Examples
  – distributed grep
    – map: if match(value,pattern) emit(value,1)
    – reduce: emit(key,sum(value*))
  – distributed wc
    – map: for all w in value do emit(w,1)
    – reduce: emit(key,sum(value*))
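The grep and wc examples can be sketched in a single process as below. This illustrates the map, group-by-key and reduce steps only; it is not a distributed framework. The `map_reduce` driver and the helper names are assumptions introduced for the example.

```python
import re
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run mapper over (key, value) records, group by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    return {key: reducer(key, values) for key, values in groups.items()}


def grep_mapper(pattern):
    """Build a mapper that emits (line, 1) for every line matching pattern."""
    def mapper(key, value):
        if re.search(pattern, value):
            yield value, 1
    return mapper


def wc_mapper(key, value):
    """Emit (word, 1) for every word in the line."""
    for word in value.split():
        yield word, 1


def sum_reducer(key, values):
    """Sum all intermediate values collected for one key."""
    return sum(values)


lines = [(1, "the cloud is elastic"), (2, "the grid is not the cloud")]
grep_result = map_reduce(lines, grep_mapper("grid"), sum_reducer)
wc_result = map_reduce(lines, wc_mapper, sum_reducer)
print(grep_result)
print(wc_result)
```

In a real deployment the input would be partitioned across many machines, with each mapper call running in isolation and the grouping step performed by the framework's shuffle phase.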

Security in Cloud
Security
  – Technology, provides assurance
    – confidentiality
    – integrity, authenticity
Privacy
  – Right, provides control
    – anonymity
    – primary & secondary use

Information Security Concerns
Confidentiality
  – safe from prying eyes
    – communication, persistence
Authenticity
  – data is from a known source
Integrity
  – data has not been tampered with
    – provenance (computation)
    – persistence

Information Security Concerns
Non-repudiation
  – assurance against deniability
Access control
  – access & modification by privileged users
    – individual vs. group access
    – multi-tenancy (PaaS, SaaS)

Information Security Concerns
Long term security
  – change in authentication/authorization
  – proof of possession
  – confidentiality
    – crypto systems do not provide long term guarantees
  – intersection attacks

Security Enhancing Techniques
Encryption
  – symmetric encryption (data)
  – public key cryptography (identity, authentication)
    – secret private key, published public key
  – hash / Message Authentication Code (integrity)
  – digital signatures (authentication, non-repudiation)
  – TLS/SSL (communication)

Security Enhancing Techniques
Encryption
  – homomorphic encryption*
    – allows for arbitrary computing over encrypted data
      – if E(p) = c then D(2c) = 2p (multiplication operation)
    – allows for data processing without decryption
    – promising but not practical so far**
  – key management challenges
    – increase as the access control granularity increases
* Gentry, C. 2009. Fully homomorphic encryption using ideal lattices. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing (Bethesda, MD, USA, May 31 - June 02, 2009). STOC '09. ACM, New York, NY, 169-178.
** Bruce Schneier. Schneier on Security. morphic enc.html
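Unpadded "textbook" RSA gives a feel for computing on ciphertexts, because multiplying two ciphertexts yields an encryption of the product of the plaintexts. The sketch below uses this to double a value without decrypting it. It is only partially homomorphic (multiplication only), it is not Gentry's fully homomorphic scheme, and the tiny primes make it insecure. Note that the ciphertext is multiplied by an encryption of 2 rather than by the plain number 2. All names and parameters are illustrative assumptions.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
P, Q = 61, 53
N = P * Q   # modulus, 3233
E = 17      # public exponent
D = 2753    # private exponent, chosen so that (E * D) % ((P - 1) * (Q - 1)) == 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


p = 21
c = encrypt(p)

# Multiply the ciphertext by an encryption of 2, without ever decrypting c.
c_doubled = (c * encrypt(2)) % N

doubled = decrypt(c_doubled)  # equals 2 * p as long as 2 * p < N
print(doubled)
```

The party doing the multiplication needs only the public key and the ciphertext, which is the property that makes processing data without decryption attractive for cloud use.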

Security Enhancing Techniques
Secure query & search
  – PIR/SPIR (Private Information Retrieval)
    – “allows a user to retrieve an item from the server without revealing the item to the database”*
    – under research
  – more effort required to be adopted by mainstream
* Chor, B., Kushilevitz, E., Goldreich, O., and Sudan, M. 1998. Private information retrieval. J. ACM 45, 6 (Nov. 1998), 965-981.

Security Enhancing Techniques
Secure query & search
  – encrypted data search
    – matching with encrypted keywords
      – meta-data driven
    – secure anonymous database search (SADS)*
      – single party query
      – multi party queries
  – not easy, may require trusted third parties
* Raykova, M., Vo, B., Bellovin, S. M., and Malkin, T. 2009. Secure anonymous database search. In Proceedings of the 2009 ACM Workshop on Cloud Computing Security (Chicago, Illinois, USA, November 13 - 13, 2009). CCSW '09. ACM, New York, NY, 115-126.

Security Enhancing Techniques
Remote data checking
  – client side preprocessing
    – data in chunks along with MAC for each chunk
    – server stores data chunk + MAC combinations
    – forward error correction
  – long term recoverability
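The client-side preprocessing step might look roughly like this: the data is split into chunks, each chunk gets an HMAC under a key only the client holds, and the client can later spot-check any chunk the server returns. The chunk size, key, and the `preprocess` and `verify` helpers are assumptions for illustration. Forward error correction is left out.

```python
import hashlib
import hmac

CHUNK_SIZE = 4  # bytes; unrealistically small, for illustration only


def preprocess(data: bytes, key: bytes):
    """Client side: split data into chunks and tag each chunk with an HMAC."""
    tagged = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        # Bind the tag to the chunk offset so chunks cannot be swapped around.
        message = offset.to_bytes(8, "big") + chunk
        tag = hmac.new(key, message, hashlib.sha256).digest()
        tagged.append((chunk, tag))
    return tagged


def verify(position: int, chunk: bytes, tag: bytes, key: bytes) -> bool:
    """Client side: check one chunk (by position) returned by the server."""
    message = (position * CHUNK_SIZE).to_bytes(8, "big") + chunk
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


key = b"client-secret-key"                                # never sent to the server
server_store = preprocess(b"cloud storage data", key)     # what the server stores
chunk, tag = server_store[2]                              # spot-check chunk 2
intact = verify(2, chunk, tag, key)
tampered = verify(2, b"XXXX", tag, key)
print(intact, tampered)
```

Since the server never sees the key, it cannot forge a valid tag for a modified or substituted chunk.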

Security Enhancing Techniques
Data Remanence
  – “Residual representation of data after purge”
  – How to purge data in cloud?
    – risk at all levels (SaaS, PaaS, and IaaS)
Secure deletion
  – encrypt the data in the cloud
  – data deletion = key destruction
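The "deletion = key destruction" idea can be sketched with a one-time pad, used here only because it needs nothing beyond the standard library. A real system would more likely use an authenticated cipher from a vetted cryptography library. The variable names and the sample record are assumptions, and setting a Python variable to `None` merely stands in for securely destroying the key.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings (one-time pad encrypt/decrypt)."""
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"customer record"
key = secrets.token_bytes(len(plaintext))  # kept by the client, never uploaded
ciphertext = xor_bytes(plaintext, key)     # only this is stored in the cloud

# While the key exists, the data can be recovered from the cloud copy.
recovered = xor_bytes(ciphertext, key)

# "Deleting" the data: destroy the key. Any residual ciphertext copies left
# at the provider can no longer be decrypted.
key = None
print(recovered)
```

The provider may still hold backups or remnants of the ciphertext, but without the key they reveal nothing about the plaintext beyond its length.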

Security in Cloud
CSA (Cloud Security Alliance)
  – http://www.cloudsecurityalliance.org/
  – various introductory publications
    – CSA Guide ver 2.0
    – in line with NIST
