SQL Server In AWS - Brent Ozar

Transcription

SQL ServerIn AWSA practical guide to deploying SQL Serverin the Amazon cloudIt isn’t always blue skies in the cloud; it takes work to get things rightWho Is Deploying SQLServer In EC2?A question I hear a lot is, “Who isputting SQL Server into EC2?” Sometimesthere’s a hint of incredulity in that question,hinting that people can’t seriously bedeploying SQL Server into EC2. That’s farfrom the truth: many companies, large andsmall, are deploying SQL Server intoAmazon’s cloud, with varying degrees ofsuccess.Back to the question of who’s deployingSQL Server in EC2 I work with a start up building a platformon a Microsoft stack. The front end is hostedon mobile devices, the middle tier is writtenin .NET, and the database is SQL Server.Everything in here iscorrect(ish) as of September2011They’re just starting out and they don’t havethe time to spend speccing out servers,ordering them, and waiting for delivery. It’seasy to spin up servers to test new featuresand functionality and if a feature doesn’tcatch on the servers are turned back off.Traditional hosting facilities didn’t offer thethe flexibility or response times that they werelooking for, so they chose Amazon EC2.Another company I work with is a ISVtransitioning to a Software as a Service(SaaS) model. The ISV works in an industrywhere there is a lot of specialized knowledgeand technical ability, but that knowledge andability doesn’t have anything to do withkeeping a server up and running. In the pastthey’ve added additional maintenanceroutines to the software that they sell tocustomers, but sometimes customers don’thave anywhere to host this mission criticalsoftware. The ISV has hosted a few customerson their own servers, but they don’t want tokeep buying new hardware in response tocustomer demands – what happens when aA Quick Overview:Downsides to the CloudStorage Test ResultsHigh Availability OptionsBackups25910Things change fast in the AWS ecosystem. If you want to keep up to date on new features,products, functionality, and pricing I suggest that you follow the Amazon Web Services blog:http://aws.typepad.com/. This is one of the best places to get up to the second informationabout the AWS universe.

FACT: You can save money inthe cloud.FACT: You can also wastemoney in the cloud.Look out!There are many thingsto worry about, hereare a few:1. CostsWith AWS, you get billed for theresources you use. When the VM ison, you’re getting charged. Whenyour storage is provisioned, you’regetting charged.2. Noisy NeighborsAWS is built around virtualization. Becareful, other VMs could be usingmore than their share of resources.3. FailuresEquipment failures are inevitable. Youmust plan for them.customer leaves? The ISV hasn’t made themove to cloud computing yet, they’re feelingout the different options available, but EC2provides a level of flexibility that they won’tget through traditional IT solutions.What Are Some OfThe Problems?There are, of course, problems withevery way that you can possibly deploy SQLServer. When you deploy on premises, youhave to purchase hardware, wait for it toarrive, set it up, and then you have tomaintain it. When you host your databasewith another hosting provider, you’re limitedby the amount of server space they haveavailable. When you host in EC2 there are adifferent set of problems.CostOne of the perceived problems withdeploying into the cloud is the cost. You’rerenting something by the hour (like that“friend” your uncle brought to Thanksgivinglast year); and renting something by the hourcan get really expensive when you want thatsomething to be up and running all the time.How much would it cost to keep areasonable SQL Server running 365 days ayear? 10,862.40 assuming that you’re usinga Double Extra Large Hi-Memory instance(that’s 4x 2.61GHz Xeons with 34.2 GB ofmemory). You can really kick out the jamsand get yourself a Quadruple Extra Large Hi-Memory instance (8x 2.66 GHz Xeons and68.4GB of memory) for twice the price. Yeah,that’s expensive. Of course, you can alsoreserve the instance and just buy it outrightfor a fraction of that cost, but who wants todo that?What would a similar server cost youfrom HP? You can get an HP ProLiant DL380G7 with a pair of fancy pants Intel E5640CPUs (that’s right around the same speed asthe EC2 instance), 32GB of RAM, and 1.2TBof in chassis storage for about 8,600. Thatprice doesn’t include an OS or any otherlicensing costs. It also doesn’t include power,cooling, or even a spare server sitting aroundready to pick up the load if your primaryserver fails.Storage isn’t tremendously expensive inAmazon’s cloud – 10 cents per GB per monthof provisioned storage. Over the course of ayear, 1 terabyte of storage is only 102.40 amonth, and that storage is fairly redundantwithin a single data center.Despite the cost, Amazon’s cloud is stillincredibly popular with many businesses.Why? Simple: it’s easy to start with a smallamount of resources and rapidly scale inresponse to customer demands. If somethingdoesn’t pan out, you just stop the servers anddelete the storage and you don’t have toworry about it anymore. The costs go awayas soon as you stop using the compute cycles.Noisy NeighborsI used to live in an apartment. It wasn’t agreat apartment. In fact, it wasn’t even a niceapartment. I could hear everything that myneighbors did. After a while, I knew about4. Drive limitsThere is a finite limit on the amount ofstorage a single server can have. Planaccordingly and split the load.5. Bandwidth limitsYou can use ten gigabit ethernetlocally, but Windows instances in AWSare limited to one gigabit connectivity.6. Instance SizesThere’s a hard limit on the biggest sizeof an EC2 instance. You can’t get a 64core monster server in AWS.How much?You can use the AWS calculator tofigure out just how much your existinginfrastructure will cost to move to thecloud.

You may not be alone.their girlfriends, their love of Super TecmoBowl, and I learned that they liked to listen toloud music on week nights when they didn’thave to work the next day.My noisy neighbors made it difficult forme to get things done. When you’re sharing ahost, noisy neighbors can make it difficult toget things done, too. This sort of thing canhappen on a server right now – when oneprogram is aggressive writing to disk, readperformance will suffer. If you’re sharing aphysical server with another guest operatingsystem, you have no idea what that other OSis doing. It could be sitting there doingnothing, it could be chugging along doing 3drendering, or it could be a BitTorrent server.You have no idea what’s going on in the nextroom and how it could be affecting yourperformance.Unfortunately, there’s not a lot that youcan do about noisy neighbors apart frommoving. In the real world, you move to a newapartment or buy a house. It takes time andmoney to get a new place, but it’s feasible. InEC2 it’s a lot easier to get a new place: youjust upgrade your instance. You pay more bythe hour, but it’s easy to get a bigger place.Crashes In GeneralTo be fair, you can’t do much aboutcrashes if you’re using someone else’s hostingservice; when the data center loses poweryour server is going to go down. So, how doyou protect yourself from crashes?Redundancy, of course.Unfortunately, redundancy getsexpensive fast. If you want a redundant mirrorfor that 10k a year server, you need a second10k a year server. If you want multiple serversspread across the country, you’re paying foreach one. Thankfully the cost of the serversincludes the Windows license, but it’s still alarge cost for many people to stomach.SQL Server has a number of options tohelp you keep your servers up and running incase something horrible happens. Traditionaldatabase mirroring will work in EC2. You arefree to implement whatever Rube Goldbergmachine you want. It’s important to keep inmind that you have little control over yourstorage (there are no SAN snapshots or SANmirroring that you can control) and there isno dedicatednetwork that youcan rely on. Allof yourconnectivity goesover the openinfrastructurewithin Amazon’sdata centers.You’re fightingfor bandwidthwith Amazon,Netflix, Heroku, and everyone else in thecloud.that server with a lot of other servers. If thereare 4 other SQL Servers sharing the samegigabit connection, you could end up withsome very poorperformance.Amazon recentlyadded instances with10 gigabit networkcards, but they’relimited to Linuxmachines in the USEast region right now.In the future theremight be instanceswith 10 gigconnections that run Windows, but there’snothing in EC2 right now to make the pathbetween a Windows server and the storageany faster. To keep up to date on whichinstances have 10 gigabit ethernet, visit thelist of EC2 Instance Types and search for “I/OPerformance: Very High”.In short, if your applications are chatty,you’re going to need to learn to break themapart into many smaller pieces to keepthroughput high.You pay moreby the hour, butit’s easy to get abigger placeLimited Number Of DrivesHow many drives can you connect to aninstallation of Windows locally? 26 or so,right? That’s how many letters there are in thealphabet. You can go beyond that usingmount points to boost the amount of storageyou have. For most purposes, though, youeffectively can attach an unlimited number ofdrives to a Windows machine. Certainly morethan you’ll need. SANs make it easy to createhuge volumes, after all.In EC2, that’s not quite the case. You canonly attach 16 drives (Amazon call themvolumes) to a server. It’s important to keepthis limitation in mind when designing yourstrategy to move into Amazon’s cloud. If youneed to store more than 16 terabytes of data,you’ll need to find a way to spread that dataout across multiple EC2 instances.Considering some of what we’ll cover lateron, this is going to be a really good idea.Limited Instance SizesWe just briefly touched on this in the lastsection – not every size of instance isavailable in every region. Typically, the USEast region gets the newest fanciest servers,at least as far as I can tell. Other regionsslowly get new hardware over time. The nameof the instance type (m1.small, m2.4xlarge)will stay the same over time, but it may not beLimited Network ThroughputWhat Now?Did you know that there’s a pretty solidlimit on network throughput in EC2? I didn’teither until I start performing some storagebenchmarks (more on this soon). It turns outthat there’s a single active gigabit connectioninto each host. That’s right: a gigabitconnection per host. You might be sharingPlan carefully. All of these pitfalls canbe easily overcome by examining yourworkload and finding ways to workwithin the limitations of AWS.

Terminology:One of the biggest bottlenecksyou’ll find in AWS is going tobe the network.Volupat est ipsum quisest eu pede laoreetelementum lectusAWSAmazon Web Services. This is theoverarching banner of Amazon’scloud offering.EC2Elastic Compute Cloud. This is whereyou create and administer virtualmachines.EBSElastic Block Storage. Your SAN in thesky.S3Simple Storage Service. Cheap,durable, and slow. It’s a great place tokeep data around for a long time.RegionAWS is split into geographic regions(US East, US West, EU, AsiaSingapore, Asia Tokyo).Availability ZoneAvailability Zones are data centersinside of regions. Availability Zoneshave independent power andconnectivity and should be used forfailover.possible to bring up instances of the samesize in every region.Storage PerformanceIf you want to keep data around for anylength of time, it’s important to persist itsomewhere. Persisting your data means thatyou have to write it and, if you aren’t writingit to your users’ computers you have to write itto the database. Hiding beneath all of thatfancy database software is something assimple as a pile of disks for storing data inrows and tables. Making sure the disks areworking as fast as they can is critical tokeeping a database running at peakperformance.Server PerformanceStorage PerformanceGigabit ethernet sucks. If disk I/O isyour absolute bottleneck, you onlyhave two options – you need to writeto as many drives as possible. We’llfind out why that isn’t entirely possiblein EC2. The other option is to make theconnection between your databaseserver and the storage as fast aspossible. Since you can’t get super faststorage in EC2, you’ll have to findanother way to make your databasefast.Just like any other computer, there are afinite number of resources available on anyhost. Likewise, the VMs that you spin up havea finite amount of resources available tothem. Unlike your local data center, you haveno control over which VMs share the samehost.Normally you could put differentmachines together and stagger theirfunctionality around the clock so that gueststhat did bulk processing at night would be onthe same host as machines that record ordersduring business hours. When everything isunder your control, it’s very easy to makesure that any host isn’t overcommitted duringa certain time window.The only way that you can guaranteeperformance is to allocate larger instancesthat take up all of the physical resources ofthe host but cost more. By doing this you’reable to eliminate the effect of noisyneighbors. It’s important to understand howto get the best performance out of yourresources.Just How Fast is GigabitEthernet?The question isn’t rhetorical. It’simportant to understand how fast gigabitethernet is. Gigabit ethernet can really onlytransfer, at most, 120 megabytes per second.That’s the theoretical maximum assuming thateverything along the line is operatingperfectly and there is no latency or additionalhops to make.We’re not likely to see this in the realworld. You’re not likely to see this in yourown server room and you’re sure as hell notgoing to see it in EC2.Why is your storage going to be slow?You have keep in mind that not everythinginvolved may be able to push data at 120MB/s; the drives could be slow, fragment, oryou could be sharing space with someoneelse who is doing a lot of

every way that you can possibly deploy SQL Server. When you deploy on premises, you have to purchase hardware, wait for it to arrive, set it up, and then you have to maintain it. When you host your database with another hosting provider, you’re limited by the amount of server space they have available. When you host in EC2 there are a