CS349D Cloud Computing - Stanford University

Transcription

CS349D Cloud Computing
Christos Kozyrakis & Matei Zaharia
Fall 2017, 10:30–12:00, 380-380W
http://cs349d.stanford.edu

Class Staff
Christos Kozyrakis: http://www.stanford.edu/~kozyraki
Matei Zaharia: https://cs.stanford.edu/~matei
James Thomas (TA): http://cs.stanford.edu/~jjthomas

Topics
Cloud computing overview
Cloud economics (2)
Storage
Databases
Serverless computing
Analytics & streaming systems
Security & privacy
CAP theorem
Debugging & monitoring
Resource allocation
Operations
Serving systems
Programming models
ML as a service
Hardware acceleration

Class Format
One topic per class meeting
We all read the paper ahead of time
Submit answers to 1-2 questions before meeting
1-2 students summarize paper & lead discussion
We all participate actively in the discussion
1 student keeps notes
A few guest lectures
See schedule online

What to Look for in a Paper
The challenge addressed by the paper
The key insights & original contributions
» Real or claimed, you have to check
Critique: the major strengths & weaknesses
» Look at the claims and assumptions, the methodology, the analysis of data, and the presentation style
Future work: extensions or improvements
» Can we apply a similar methodology to other problems?
» What are the broader implications?

Tips for Reading Papers
Read the abstract, intro, & conclusions sections first
Read the rest of the paper twice
» First a quick pass to get a rough idea, then a detailed reading
Underline/highlight the important parts of the paper
Keep notes in the margins about issues/questions
» Important insights, questionable claims, relevance to other topics, ways to improve some technique, etc.
Look up references that seem important or are missing
» You may also want to check who references this paper and how

Research Project
Groups of 2-3 students
Topic
» Address an open question in cloud computing
» Suggested by staff, or suggest your own
Timeline
» Project proposal – October 9th
» Mid-quarter checkpoint – November 6th
» Presentation/paper – week of December 3rd

Reminders
Make sure you are registered on Axess
» Contact instructors for access code
Sign up to lead a discussion topic
We will assign topics for note taking
Start talking about projects
» Form a team

Cloud Computing Overview
Christos Kozyrakis & Matei Zaharia
http://cs349d.stanford.edu

What is Cloud Computing?
Informal: computing with large datacenters

What is Cloud Computing?
Informal: computing with large datacenters
Our focus: computing as a utility
» Outsourced to a third party or internal org

Types of Cloud Services
Infrastructure as a Service (IaaS): VMs, disks
Platform as a Service (PaaS): Web, MapReduce
Software as a Service (SaaS): Email, GitHub
Public vs private clouds: shared across arbitrary orgs/customers vs internal to one organization

Example
AWS Lambda functions-as-a-service
» Runs functions in a Linux container on events
» Used for web apps, stream processing, highly parallel MapReduce, and video encoding
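To make the event-driven model concrete, here is a minimal sketch of a Lambda-style handler in Python. The event fields (`Records`, `body`) and the work it does are illustrative assumptions; the actual event shape depends on the trigger (S3, API Gateway, Kinesis, etc.).

```python
import json

def handler(event, context):
    # Lambda invokes this function once per event; the container may be reused
    # across invocations, so avoid relying on local state.
    record = event.get("Records", [{}])[0]          # hypothetical event shape
    payload = record.get("body", "{}")
    data = json.loads(payload) if isinstance(payload, str) else payload

    # Do a small, stateless piece of work and return a response.
    result = {"processed_keys": sorted(data.keys())}
    return {"statusCode": 200, "body": json.dumps(result)}
```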

Cloud Economics: For Users
Pay-as-you-go (usage-based) pricing:
» Most services charge per minute, per byte, etc.
» No minimum or up-front fee
» Helpful when apps have variable utilization

Cloud Economics: For Users
Elasticity:
» Using 1000 servers for 1 hour costs the same as 1 server for 1000 hours
» Same price to get a result faster!
[Figure: resources vs. time for the two options]
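A back-of-the-envelope check of the elasticity claim, assuming purely usage-based pricing; the $0.10/server-hour rate is a made-up number for illustration.

```python
# Pure pay-per-use pricing: cost depends only on total server-hours consumed.
price_per_server_hour = 0.10  # dollars (hypothetical rate)

wide   = 1000 * 1 * price_per_server_hour    # 1000 servers for 1 hour
narrow = 1 * 1000 * price_per_server_hour    # 1 server for 1000 hours

assert wide == narrow  # same cost, but the wide run finishes ~1000x sooner
print(f"Both cost ${wide:.2f}; the 1000-server run returns the answer in 1 hour.")
```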

Cloud Economics: For Providers
Economies of scale:
» Purchasing, powering, and managing machines at scale gives lower per-unit costs than customers can achieve on their own

Other Interesting Features
Spot market for preemptible machines
Reserved instances and RI market
Ability to quickly try exotic hardware

Common Cloud Applications
1. Web/mobile applications
2. Data analytics (MapReduce, SQL, ML, etc.)
3. Stream processing
4. Batch computation (HPC, video, etc.)

Cloud Software Stack
[Stack diagram. Layers and example systems: Analytics UIs (Hive, Pig, HiPal); Web Server (Java, PHP, JS); Cache (memcached, TAO); Operational Stores (SQL, Spanner, Dynamo, Cassandra, BigTable); Coordination (Chubby, ZK); Other Services (model serving, search, Unicorn, Druid); Analytics Engines (MapReduce, Dryad, Pregel, Spark, Hive); Message Bus (Kafka, Kinesis); Metadata Catalog (AWS Catalog); Distributed Storage (Amazon S3, GFS, Hadoop FS); Resource Manager (EC2, Borg, Mesos, Kubernetes); Security (e.g. IAM); Metering & Billing]

Example: Web Application
[Same stack diagram as above, highlighting the components used by a web application]

Example: Analytics Warehouse
[Same stack diagram as above, highlighting the components used by an analytics warehouse]

Components Offered as PaaS
[Same stack diagram as above, highlighting the components commonly offered as PaaS]

Datacenter Hardware
[Component photos: 2-socket server, 10GbE NIC, Flash storage, JBOD disk array, GPU/accelerators, 10GbE switch]

Datacenter Hardware
Rows of rack-mounted servers
Datacenters hold 50K – 200K servers and burn 10 – 100 MW
Storage: distributed with compute or NAS systems
Remote storage access for many use cases (why?)

Hardware Heterogeneity [Facebook server configurations]
Custom-designed servers
Configurations optimized for major app classes
Few configurations, to allow reuse across many apps
Roughly constant power budget per volume

Useful Latency Numbers (initial list from Jeff Dean, Google)
L1 cache reference                        0.5 ns
Branch mispredict                           5 ns
L3 cache reference                         20 ns
Mutex lock/unlock                          25 ns
Main memory reference                     100 ns
Compress 1K bytes with Snappy           3,000 ns
Send 2K bytes over 10GbE                2,000 ns
Read 1 MB sequentially from memory    100,000 ns
Read 4KB from NVMe Flash               50,000 ns
Round trip within same datacenter     500,000 ns
Disk seek                          10,000,000 ns
Read 1 MB sequentially from disk   20,000,000 ns
Send packet CA → Europe → CA      150,000,000 ns
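These numbers are most useful for back-of-the-envelope estimates. As a sketch (the request pattern is hypothetical), compare serving 1 MB from a remote server's memory versus from its disk:

```python
# Latency estimate in ns, using the numbers above.
DC_ROUND_TRIP   = 500_000
READ_1MB_MEMORY = 100_000
DISK_SEEK       = 10_000_000
READ_1MB_DISK   = 20_000_000

# One round trip to a storage server that returns 1 MB.
from_memory = DC_ROUND_TRIP + READ_1MB_MEMORY             # ~0.6 ms
from_disk   = DC_ROUND_TRIP + DISK_SEEK + READ_1MB_DISK   # ~30.5 ms

print(f"serve from memory: {from_memory / 1e6:.1f} ms")
print(f"serve from disk:   {from_disk / 1e6:.1f} ms")
```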

Useful Throughput Numbers
DDR4 channel bandwidth        20 GB/sec
PCIe gen3 x16 channel         15 GB/sec
NVMe Flash bandwidth           2 GB/sec
GbE link bandwidth       10 – 100 Gbps
Disk bandwidth                 6 Gbps
NVMe Flash 4KB IOPS       500K – 1M
Disk 4K IOPS               100 – 200
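A quick sizing exercise with these numbers (the scenario is hypothetical): how long does a 1 TB scan take on one device, and how many devices are needed to sustain one million random 4KB IOPS?

```python
TB = 1e12  # bytes

nvme_bw   = 2e9        # 2 GB/sec per NVMe device
disk_bw   = 6e9 / 8    # 6 Gbps link, roughly 0.75 GB/sec per disk
nvme_iops = 500_000    # low end of the 500K - 1M range
disk_iops = 150        # middle of the 100 - 200 range

print(f"scan 1 TB: NVMe {TB / nvme_bw:.0f} s, disk {TB / disk_bw:.0f} s")
print(f"devices for 1M random 4KB IOPS: NVMe {1_000_000 / nvme_iops:.0f}, "
      f"disks {1_000_000 / disk_iops:.0f}")
```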

Performance Metrics
Throughput: requests per second, concurrent users, GBytes/sec processed
Latency: execution time, per-request latency

Tail Latency [Dean & Barroso, '13]
The 95th or 99th percentile request latency
End-to-end, with all tiers included
Larger scale → more prone to high tail latency
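A small sketch of how tail latency is measured, using synthetic latencies (the workload and distribution are made up for illustration):

```python
import random

# Synthetic per-request latencies in ms; a real system would log these.
latencies_ms = [random.expovariate(1 / 20) for _ in range(10_000)]

def percentile(samples, p):
    """p-th percentile (0-100) using the nearest-rank method."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

print(f"p50 = {percentile(latencies_ms, 50):.1f} ms")
print(f"p95 = {percentile(latencies_ms, 95):.1f} ms")
print(f"p99 = {percentile(latencies_ms, 99):.1f} ms")
```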

Total Cost of Ownership (TCO)
TCO = capital (CapEx) + operational (OpEx) expenses
Operator's perspective
» CapEx: building, generators, A/C, compute/storage/net HW, including spares, amortized over 3 – 15 years
» OpEx: electricity (5-7 c/kWh), repairs, people, WAN, insurance, ...
User's perspective
» CapEx: cost of long-term leases on HW and services
» OpEx: pay-per-use cost of HW and services, people
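A rough operator-side TCO sketch following the formula above; every figure here is a hypothetical placeholder, not a real price.

```python
servers             = 10_000
server_capex        = 5_000      # dollars per server (hypothetical)
amortization_years  = 4
power_per_server_w  = 300
electricity_per_kwh = 0.06       # within the 5-7 c/kWh range from the slide
hours_per_year      = 24 * 365

capex_per_year = servers * server_capex / amortization_years
energy_kwh     = servers * (power_per_server_w / 1000) * hours_per_year
opex_per_year  = energy_kwh * electricity_per_kwh   # electricity only; repairs,
                                                    # people, WAN, etc. omitted

print(f"CapEx/yr: ${capex_per_year:,.0f}   OpEx(power)/yr: ${opex_per_year:,.0f}")
print(f"TCO/yr:   ${capex_per_year + opex_per_year:,.0f}")
```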

Operator's TCO [Source: James Hamilton]
[Chart: TCO breakdown by cost category]
Hardware dominates TCO; make it cheap
Must utilize it as well as possible

Reliability
Failure in time (FIT): failures per billion hours of operation = 10^9 / MTTF
Mean time to failure (MTTF): time to produce first incorrect output
Mean time to repair (MTTR): time to detect and repair a failure

Availability
[Timeline: alternating periods of correct operation and failure]
Steady-state availability = MTTF / (MTTF + MTTR)
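Putting the two formulas together, with illustrative MTTF/MTTR values chosen for this example:

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

def fit(mttf_hours):
    """Failures in time: failures per 10^9 hours of operation."""
    return 1e9 / mttf_hours

mttf, mttr = 10_000, 1   # fails roughly once a year, repaired in an hour (illustrative)
print(f"availability = {availability(mttf, mttr):.5f}")    # ~0.9999, about four nines
print(f"FIT          = {fit(mttf):,.0f}")                  # 100,000 failures per 1e9 hours
```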

Yearly Datacenter Flakiness
~0.5 overheating (power down most machines in <5 mins, 1-2 days to recover)
~1 PDU failure (~500-1000 machines suddenly disappear, ~6 hrs to come back)
~1 rack-move (plenty of warning, ~500-1000 machines powered down, ~6 hrs)
~1 network rewiring (rolling ~5% of machines down over 2-day span)
~20 rack failures (40-80 machines instantly disappear, 1-6 hours to get back)
~5 racks go wonky (40-80 machines see 50% packet loss)
~8 network maintenances (4 might cause ~30-minute random connectivity losses)
~12 router reloads (takes out DNS and external vIPs for a couple minutes)
~3 router failures (have to immediately pull traffic for an hour)
Dozens of minor 30-second blips for DNS
~1000 individual machine failures (2-4% failure rate, machines crash at least twice)
Thousands of hard drive failures (1-5% of all disks will die)
Add to these: SW bugs, config errors, human errors, ...

Key Availability Techniques
Replication
Partitioning (sharding)
Load-balancing
Watchdog timers
Integrity checks
Canaries
Eventual consistency
Make apps do something reasonable when not all is right
» Better to give users limited functionality than an error page
» Aggressive load balancing or request dropping
» Better to satisfy 80% of the users rather than none
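As a minimal sketch of partitioning (sharding) combined with replication; the node names and replication factor are hypothetical:

```python
import hashlib

NODES = ["node-0", "node-1", "node-2", "node-3"]
REPLICAS = 2  # each key is stored on REPLICAS consecutive nodes

def owners(key: str) -> list[str]:
    """Map a key to the nodes responsible for storing it."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    start = h % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]

print(owners("user:42"))   # e.g. ['node-1', 'node-2']
```

When one node fails, its keys can still be served from the replica on the next node, which is why sharding and replication usually appear together.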

The CAP Theorem
In distributed systems, choose 2 out of 3:
Consistency: every read returns data from the most recent write
Availability: every request executes & receives a (non-error) response
Partition-tolerance: the system continues to function when network partitions occur (messages dropped or delayed)

Useful Tips
Check for single points of failure
Keep it simple, stupid (KISS)
» The reason many systems use centralized control
If it's not tested, do not rely on it
» Question: how do you test availability techniques with hundreds of loosely coupled services running on thousands of machines?
