Guide To Scaling Global Game Servers On AWS - Amazon Web Services, Inc.


Guide to Scaling Global Game Servers on AWS

To give your players the best experience possible, even during peak hours, you need compute resources that can ramp up and down quickly to accommodate fluctuating player usage.

Quick jump
1.0 Reference architecture
2.0 Games as REST APIs
    HTTP load balancing
    Application Load Balancer
    Custom load balancer
    HTTP Auto Scaling
    Installing application code
3.0 Game servers
    Matchmaking
    Routing messages
    Mobile push notifications
    Last thoughts on game servers

Flexibility is key, and AWS gives you game server compute options to build it yourself, integrate existing tools, or even move to a fully managed service.

You can run your own orchestration for game servers that use Amazon Elastic Compute Cloud (EC2) or Amazon EC2 Spot Instances. You can deploy multiplayer game services using container-based orchestration with managed services, such as Amazon Elastic Container Service (ECS) or Amazon Elastic Kubernetes Service (EKS). Or, you can use a managed service with Amazon GameLift, which helps you deploy, operate, and scale dedicated game servers for session-based multiplayer games. GameLift includes Amazon GameLift FleetIQ, which provides an extra logic layer for using low-cost Spot Instances for game hosting while managing hosting tasks, and which can be used independently of GameLift. Give players a great experience and save costs no matter which compute option you choose.

This guide focuses on RESTful server and stateful game server use cases and best practices for load balancing RESTful EC2 servers. It begins with a reference architecture for a horizontally scalable game backend.

1.0 Reference architecture for a scalable game backend

Our reference architecture depicts a game backend that supports a wide set of game features, including login, leaderboards, challenges, chat, binary game data, user-generated content, analytics, and online multiplayer capabilities. Keep in mind that not all games have all of these components, but the following diagram provides a visualization of how they would fit together. In this guide, we’ll cover HTTP/JSON servers and stateful game servers (labeled 2 and 4 in the diagram). You can find more information about the other pieces of this reference architecture in the Introduction to Scalable Game Development Patterns on AWS.

[Figure: A fully production-ready game backend on AWS. Components shown within a single Region (Oregon, Singapore, etc.) across Availability Zone A and Availability Zone B: client; CloudFront CDN (GET); S3 for binary game assets (server PUT); Elastic Load Balancing (HTTP/S); HTTP/JSON servers (Auto Scaling group); stateful game servers (security group, stateful TCP socket); ElastiCache for Redis (security group, cached reads); RDS MySQL (game data reads and writes); SQS for job queues (queue async job, job results); job workers (Auto Scaling group, run job); SNS for push messages (broadcast message for game, mobile push notifications).]

Learn how Behaviour Interactive uses our managed service GameLift to scale its game servers.

2.0 Games as REST APIs

To make use of horizontal scalability, implement most of your game’s features using an HTTP/JSON API, which typically follows the REST architectural pattern.

Game clients (whether on mobile devices, tablets, PCs, or consoles) send HTTP requests to your servers for data, such as logins, sessions, friends, leaderboards, and trophies. Clients don’t maintain long-lived server connections, which makes it easy to scale horizontally by adding HTTP server instances. Clients can recover from network issues by simply retrying the HTTP request.

When properly designed, a REST API can scale to hundreds of thousands of concurrent players. RESTful servers are simple to deploy on AWS, and they benefit from the wide variety of HTTP development, debugging, and analysis tools available on AWS.

Nevertheless, some modes of gameplay, like real-time online multiplayer games, chat, and game invites, benefit from a stateful two-way socket that can receive server-initiated messages. If your game doesn’t have these features, you can implement all your functionality using a REST API. We’ll discuss stateful servers later in this guide. First, let’s focus on our REST layer.

Deploying a REST layer to Amazon EC2 typically consists of an HTTP server, such as Nginx or Apache, plus a language-specific application server. The following table lists some of the popular packages game developers use to build REST APIs:

Language    Package
Node.js     Express, Restify, Sails
Python      Eve, Flask, Bottle
Java        Spring, Jersey
Go          Gorilla Mux, Gin
PHP         Slim, Silex
Ruby        Rails, Sinatra, Grape

This is just a sampling. You can build a REST API in any web-friendly programming language. Amazon EC2 gives you complete root access to the instance, so you can deploy any of these packages. There are some restrictions on supported packages for AWS Elastic Beanstalk. For details, see the AWS Elastic Beanstalk FAQs.

RESTful servers benefit from medium-sized instances because more can be deployed horizontally at the same price point. General purpose medium-sized instances (for example, M5) or compute-optimized instances (for example, C5) are a good match for RESTful servers.
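The client-side retry mentioned in this section can be sketched roughly as follows. This is a minimal illustration, not part of the original guide: the function name `retry_request`, its parameters, and the backoff values are assumptions, and `send` stands for whatever function performs the actual HTTP call. Retrying blindly is generally only safe for requests that are idempotent.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_request(
    send: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send() until it succeeds or the attempts run out.

    Hypothetical helper: network failures are assumed to surface as OSError
    (urllib.error.URLError, ConnectionError, and TimeoutError all derive from it).
    """
    for attempt in range(attempts):
        try:
            return send()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the failure
            # Exponential backoff with jitter so clients do not retry in lockstep.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

Because no connection state lives on the server, the retried request can land on any HTTP instance behind the load balancer.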

HTTP load balancing

Because HTTP connections are stateless, load balancing RESTful servers is straightforward. AWS offers Elastic Load Balancing, which is the easiest approach to HTTP load balancing for games on Amazon EC2. AWS Elastic Beanstalk automatically deploys Elastic Load Balancing to load balance your EC2 instances. If you use Elastic Beanstalk to get started, you’ll already have Elastic Load Balancing running.

Follow these guidelines to get the most out of Elastic Load Balancing:

- Always configure Elastic Load Balancing to balance between at least two Availability Zones for redundancy and fault tolerance. Elastic Load Balancing balances traffic between the EC2 instances in the Availability Zones you specify. If you want an equal distribution of traffic on servers, enable cross-zone load balancing, even if the number of servers differs between Availability Zones. This ensures optimal usage of servers in your fleet.

- Configure Elastic Load Balancing to handle SSL encryption and decryption. This offloads SSL from your HTTP servers, meaning there’s more CPU for your application code. For more information, see Create a Classic Load Balancer with an HTTPS Listener in the Classic Load Balancers Guide.

- Elastic Load Balancing automatically removes any failed EC2 instances from the load balancing pool. To ensure the health of your HTTP EC2 instances is closely monitored, configure your load balancer with a custom health check URL. Then, write server code that responds to that URL and performs a check on your application’s health. For example, you can set up a simple health check that verifies you have database connectivity. The health check returns “200 OK” if your instance passes the check or “500 Server Error” if your instance is unhealthy.

- Each load balancer you deploy must have a unique Domain Name System (DNS) name. To set up a custom DNS name for your game, you can use a DNS alias (CNAME) to point your game’s domain name to the load balancer. For detailed instructions, see Configure a custom domain name for your Classic Load Balancer in the Classic Load Balancers Guide. Note that when your load balancer scales up or down, the IP addresses that the load balancer uses change. So, it’s important to use a DNS CNAME alias and to avoid referencing the load balancer’s current IP addresses in your DNS domain.

For more information, see What is Elastic Load Balancing?
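As an illustration of the custom health check guideline above, here is a minimal sketch using only the Python standard library. The `/health` path, port 8080, and the `check_database` placeholder are assumptions made for this example; in practice the check would live inside whichever application server you already run.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Tuple


def check_database() -> bool:
    """Hypothetical placeholder: replace with a cheap query against your database."""
    return True


def health_status(check: Callable[[], bool] = check_database) -> Tuple[int, str]:
    """Translate the application check into the status code the load balancer reads."""
    try:
        healthy = check()
    except Exception:
        healthy = False  # a failing check counts as unhealthy, it must not crash the handler
    return (200, "OK") if healthy else (500, "Server Error")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path != "/health":  # assumed health check URL
            self.send_error(404)
            return
        code, body = health_status()
        payload = body.encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    # Assumed port; point the load balancer health check at this port and path.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

Keeping the check cheap matters, because the load balancer calls it repeatedly on every instance.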

Application Load Balancer

Our Application Load Balancer is a second-generation load balancer that provides more granular control over traffic routing at the HTTP/HTTPS layer. The following features that come with the Application Load Balancer can be highly beneficial for a gaming workload:

EXPLICIT SUPPORT FOR AMAZON ECS
The Application Load Balancer can be configured to load balance containers across multiple ports on a single EC2 instance. Dynamic ports can be specified in an ECS task definition, giving the container an unused port when scheduled on EC2 instances.

HTTP/2 SUPPORT
HTTP/2 (a revised edition of the older HTTP/1.1 protocol) together with the Application Load Balancer delivers additional network performance as a binary protocol instead of a textual one. Binary protocols can improve stability, as they’re inherently more efficient to process and are much less prone to errors than textual protocols. And HTTP/2 supports multiplexing, which enables the reuse of TCP connections for downloading content from multiple origins, cutting down on network overhead.

NATIVE IPV6 SUPPORT
With the near exhaustion of IPv4 addresses, many application providers are changing to a model that rejects applications without IPv6 support. The Application Load Balancer natively supports IPv6 endpoints and routing to virtual private cloud (VPC) IPv6 addresses.

WEBSOCKETS SUPPORT
In addition to HTTP/2, the Application Load Balancer supports the WebSocket protocol, enabling you to set up a long-lived TCP connection between a client and server. This is a much more efficient method than standard HTTP connections, which are usually held open with a sort of heartbeat that contributes to network traffic. Delivering dynamic data, like updated leaderboards, is a great use case for WebSocket because it minimizes traffic and power usage on mobile devices.

Elastic Load Balancing enables the support of WebSockets by changing the listener from HTTP to TCP. However, in TCP mode, Elastic Load Balancing allows the Upgrade header when a connection is established. Then, the load balancer terminates any connection that’s idle for more than 60 seconds (that is, when no packet is sent within that time frame). This means the client has to reestablish the connection. WebSocket negotiations fail if the load balancer sends an upgrade request and establishes a WebSocket connection to other backend instances.

Learn how Gearbox Software responds to player traffic within minutes using AWS.

Custom load balancer

If you need specific features or metrics that Elastic Load Balancing doesn’t provide, you can deploy your own load balancer to Amazon EC2. Popular choices for games include HAProxy and F5’s BIG-IP Virtual Edition, both of which can run on EC2. If you decide to use a custom load balancer, follow these recommendations:

- Deploy the load balancer software (such as HAProxy) to a pair of EC2 instances, each in a different Availability Zone for redundancy.

- Assign an Elastic IP address to each instance. Then, create a DNS record containing both of those Elastic IP addresses as your entry point, allowing DNS to round robin between your load balancer instances.

- If you’re using Amazon Route 53, our highly available and scalable cloud DNS web service, use Route 53 health checks to monitor your load balancer EC2 instances to detect failure. This ensures traffic doesn’t get routed to a load balancer that’s down.

- In order for HAProxy to handle SSL traffic, use version 1.5 or later. For more information, see Simple SPDY and NPN Negotiation with HAProxy at Ilya Grigorik’s blog.

If you decide to deploy your own load balancer, keep in mind that there are several aspects you need to handle on your own. First, if your load surpasses what your load balancer instances can handle, you need to launch additional EC2 instances. In addition, new auto-scaled application instances aren’t automatically registered with your load balancer instances. So, you need to write a script that updates the load balancer configuration files and restarts the load balancers.

If you’re interested in HAProxy as a managed service, consider AWS OpsWorks, which uses Chef Automate to manage EC2 instances and can deploy HAProxy as an alternative to Elastic Load Balancing.
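The configuration-update script described above might start from something like the following sketch, which renders an HAProxy backend section from a list of application server addresses. The backend name, port, and `/health` path are assumptions for illustration; discovering the current instance addresses and reloading HAProxy are left out.

```python
from typing import Iterable


def render_backend(
    name: str,
    addresses: Iterable[str],
    port: int = 8080,
    health_path: str = "/health",
) -> str:
    """Render an HAProxy backend section listing one server line per address.

    Hypothetical helper: the caller supplies the private IP addresses of the
    application instances that are currently in service.
    """
    lines = [
        f"backend {name}",
        "    balance roundrobin",
        f"    option httpchk GET {health_path}",
    ]
    # Sort and de-duplicate so repeated runs produce an identical file.
    for index, address in enumerate(sorted(set(addresses)), start=1):
        lines.append(f"    server app{index} {address}:{port} check")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_backend("game_http", ["10.0.2.15", "10.0.1.10"]))
```

A deterministic rendering makes it easy to compare the new text with the file on disk and reload the load balancer only when the set of servers actually changed.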

HTTP Auto Scaling

The ability to dynamically grow and shrink server resources in response to user patterns is a primary benefit of running on AWS. Auto Scaling enables you to scale the number of EC2 instances in one or more Availability Zones based on system metrics like CPU utilization or network throughput.

For an overview of Auto Scaling functionality, see What Is Amazon EC2 Auto Scaling? Then, walk through Getting Started with Amazon EC2 Auto Scaling. You can use Auto Scaling with any type of EC2 instance, including an HTTP server, a game server, or a background worker. HTTP servers are the easiest to scale because they sit behind a load balancer that distributes requests across server instances. Auto Scaling dynamically handles the registration or deregistration of HTTP-based instances from Elastic Load Balancing. This means traffic will be routed to a new instance as soon as it’s available.

When configuring your Auto Scaling group, choose two Availability Zones and a minimum of two servers. This will ensure your game server instances are properly distributed across multiple Availability Zones for high availability. Elastic Load Balancing takes care of balancing the load between multiple Availability Zones.

For details on configuring scaling policies, see Dynamic Scaling for Amazon EC2 Auto Scaling in the Amazon EC2 Auto Scaling User Guide.

To use Auto Scaling effectively, choose good metrics to trigger scale-up and scale-down activities. Use the following guidelines to determine your metrics:

- Monitor CPUUtilization, which is often a good CloudWatch metric. Web servers tend to be CPU limited, whereas memory remains fairly constant when the server processes are running. A higher percentage of CPUUtilization tends to indicate the server is becoming overloaded with requests. For finer granularity, pair CPUUtilization with NetworkIn or NetworkOut.

- Benchmark your servers to determine good values to scale on. For HTTP servers, you can use a tool like Apache Bench or HTTPerf to measure server response times. Increase the load on your servers while monitoring CPU or other metrics. Then, make note of the point at which your server response times degrade, and see how it correlates to your system metrics.

If you’re interested in a managed service, check out how to use auto-scaling with GameLift.
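The benchmarking guideline above can be turned into a number with a small calculation. The sketch below assumes you have recorded pairs of CPU utilization and response time during a load test; the function name, the sample figures, and the 10-point headroom are made up for illustration.

```python
from typing import Iterable, Optional, Tuple


def scale_out_threshold(
    samples: Iterable[Tuple[float, float]],
    max_latency_ms: float,
    headroom: float = 10.0,
) -> Optional[float]:
    """Suggest a CPUUtilization value at which to add instances.

    samples holds (cpu_percent, response_time_ms) pairs from a load test.
    Returns the CPU level where latency first exceeded the target, minus some
    headroom, or None if latency never degraded during the test.
    """
    for cpu, latency in sorted(samples):
        if latency > max_latency_ms:
            return max(cpu - headroom, 0.0)
    return None


if __name__ == "__main__":
    # Illustrative measurements only, not real benchmark results.
    load_test = [(20, 35), (40, 38), (60, 45), (75, 120), (90, 400)]
    print(scale_out_threshold(load_test, max_latency_ms=100))  # prints 65.0
```

The headroom accounts for the time a new instance needs to launch and register before it can take traffic.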

Installing application code

When you use Auto Scaling with AWS Elastic Beanstalk, Elastic Beanstalk takes care of installing your application code on new EC2 instances as they scale up. This is one of the advantages of the managed container that Elastic Beanstalk provides.

However, if you’re using Auto Scaling without Elastic Beanstalk, you need to get your application code onto your EC2 instances to implement automatic scaling. If you’re already using Chef or Puppet, consider using them to deploy application code on your instances. AWS OpsWorks Auto Scaling uses Chef to configure instances and offers a variant of Auto Scaling that provides both time-based and load-based automatic scaling. With OpsWorks, you can set up custom start-up and shut-down steps for your instances as they scale. OpsWorks is a great alternative to managing automatic scaling when you’re already using Chef or if you’re interested in using Chef to manage your AWS resources. For more information, see Managing Load with Time-based and Load-based Instances in the OpsWorks User Guide.

If you’re not using any of these packages, you can use the Ubuntu cloud-init package as a simple way to pass shell commands directly to EC2 instances. With cloud-init, you can run a simple shell script that fetches the latest application code and starts up the appropriate services. This solution is supported by the official Amazon Linux AMI and the Canonical Ubuntu AMIs.

For more details on these approaches, see the AWS Architecture Center.
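To make the cloud-init approach more concrete, the sketch below builds the text of a shell script that could be supplied as EC2 user data. Everything specific in it is an assumption for illustration: the bucket and archive name, the install directory, and the `game-api` service are hypothetical, and the script presumes that the AMI has the AWS CLI installed, that the instance role can read the bucket, and that a matching systemd unit already exists.

```python
import shlex


def build_user_data(artifact_url: str, install_dir: str, service: str) -> str:
    """Return a shell script for cloud-init to run on an instance's first boot."""
    url = shlex.quote(artifact_url)
    target = shlex.quote(install_dir)
    lines = [
        "#!/bin/bash",
        "set -euo pipefail",  # stop at the first failing step
        f"mkdir -p {target}",
        f"aws s3 cp {url} /tmp/app.tar.gz",  # fetch the latest application code
        f"tar -xzf /tmp/app.tar.gz -C {target}",
        f"systemctl start {shlex.quote(service)}",  # start the application service
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; substitute your own artifact location and service.
    print(build_user_data(
        "s3://example-game-builds/api/latest.tar.gz",
        "/opt/game-api",
        "game-api",
    ))
```

The resulting text would go into the user data of the launch template or launch configuration your Auto Scaling group uses, so every new instance runs it as it boots.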

“AWS has really enabled us to grow. Fortnite has grown more than 100 times in the last 12 months alone, so scalability has been really key for us here. We run our game servers in 26 Availability Zones around the world right now, and we have an almost 10x difference in workloads between peak and low peak in any particular region.”
Chris Dyl, Director of Platform, Epic Games

Learn how Epic Games uses EC2 and other AWS services to scale up and down to meet demand.

3.0 Game servers

There are some gameplay scenarios that work well with an event-driven RESTful model. For example, turn-based play and appointment games that don’t require constant real-time updates can be built as stateless game servers using the techniques mentioned in the previous section.

However, sometimes a game server’s approach needs to be the opposite of a RESTful approach. Clients establish a stateful two-way connection to the game server via UDP, TCP, or WebSockets, enabling both the client and server to initiate messages. If the network connection is interrupted, the client must perform reconnect logic and possibly logic to reset its state. Because clients can’t simply be round-robin load balanced across a pool of servers, stateful game servers introduce challenges for automatic scaling.

Historically, many games have used stateful connections and long-running server processes for game functionality, especially in the case of larger AAA and massively multiplayer online (MMO) games. If you have a game that’s architected in this manner, you can run it on AWS. For new games, however, we encourage you to use HTTP as much as possible and only use stateful sockets for aspects of your game that really need it, such as online multiplayer.

The following table lists several packages that allow you to build event-driven servers:

Language    Package
Python      gevent, Twisted
Node.js     Core, Socket.io, Async
Erlang      Core
Java        JBoss, Netty
Ruby        EventMachine
Go          Socket.io

C++ isn’t listed in the above table because it tends to be the language of choice for multiplayer game servers. Many commercial game engines, such as Amazon Lumberyard and Unreal Engine, are written in C++. This enables you to take existing game code from the client and reuse it on the server. It’s particularly valuable when running physics or other frameworks on the server, such as Havok, which frequently support only C++.

Regardless of programming language, stateful socket servers generally benefit from as large an instance as possible because they’re more sensitive to issues like network latency. The largest instances in the Amazon EC2 compute-optimized instance family (for example, C5) are often the best options. This new generation of instances uses enhanced networking via single root I/O virtualization (SR-IOV). SR-IOV provides high packets per second, low latency, and low jitter, making this an ideal solution for game servers.

Matchmaking

Matchmaking is the feature that gets players into games. The following steps outline the typical matchmaking process:

1. Ask the user about the type of game they would like to join (one-on-one or teams, for example).
2. Look at what game modes are currently being played online.
3. Factor in variables like the user’s geolocation (for latency) or ping time, language, and overall ranking.
4. Place the user on a game server that contains a matching game.

Game servers require long-lived processes, and they can’t be round-robin load balanced the way HTTP requests can. After a player is on a given server, they remain on that server until the game is over, which could be minutes or hours.

In a modern cloud architecture, you should limit your usage of long-running game server processes to the gameplay elements that require them. For example, imagine an open-world or MMO game. Some of the functionality, such as running around the world and interacting with other players, requires long-running game server processes. However, the rest of the API operations, like listing friends, altering inventory, updating stats, and finding games to play, can be easily mapped to a REST web API.

In this approach, game clients would first connect to your REST API and request a stateful game server. Next, the REST API performs matchmaking logic and gives clients an IP address and server port to connect to. The game client then connects directly to that game server’s IP address.

This hybrid approach gives you the best performance for your socket servers because clients can directly connect to the EC2 instances. And you still get the benefits of using HTTP-based calls for your main entry point.

For most matchmaking needs, Amazon GameLift provides a matchmaking system called FlexMatch. You can control FlexMatch via your REST API and make calls to the GameLift API to initiate matching and return results. You can learn more about FlexMatch in the GameLift Developer Guide.

If FlexMatch doesn’t suit your matchmaking needs, you can find more information about implementing matchmaking in a serverless custom environment in Fitting the Pattern: Serverless Custom Matchmaking with Amazon GameLift on the Game Tech Blog.
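For teams that write their own matchmaking logic behind the REST API, the selection step in the process described in this section could look roughly like this. It is a deliberately simplified sketch, not how FlexMatch works: the `GameServer` fields, the latency table, and the tie-breaking rule are assumptions, and ranking or language filters are omitted.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class GameServer:
    ip: str
    port: int
    mode: str      # game type, for example "duel" or "teams"
    region: str
    players: int
    capacity: int


def find_server(
    servers: Iterable[GameServer],
    mode: str,
    latency_ms: Dict[str, float],
) -> Optional[GameServer]:
    """Pick a server running the requested mode with a free slot.

    latency_ms maps each region to the ping time measured by the player's
    client. The lowest latency wins; ties go to the fullest server so that
    partly filled games fill up first.
    """
    candidates = [
        s for s in servers
        if s.mode == mode and s.players < s.capacity and s.region in latency_ms
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (latency_ms[s.region], -s.players))


if __name__ == "__main__":
    fleet = [
        GameServer("10.0.0.1", 7777, "teams", "us-west-2", 6, 8),
        GameServer("10.0.1.1", 7777, "teams", "ap-southeast-1", 2, 8),
    ]
    chosen = find_server(fleet, "teams", {"us-west-2": 40, "ap-southeast-1": 180})
    if chosen is not None:
        # The REST API would return this pair for the client to connect to.
        print({"ip": chosen.ip, "port": chosen.port})
```

When no server qualifies, the REST API could queue the player or start a new game server, which is where a managed matchmaker saves the most work.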

Routing messages

There are two main categories of messages in gaming: messages targeted at a specific user (like private chat or trade requests) and group messages (such as chat or gameplay packets).

A common strategy for sending and receiving messages is to use a socket server with a stateful connection. If your player base is small enough that everyone can connect to a single server, you can route messages between players by selecting different sockets. However, in most cases, you need multiple servers, which means those servers also need a way to route messages.

Amazon Simple Notification Service (SNS) can help route messages between EC2 server instances. For example, let’s assume player 1 on server A wants to send a message to player 2 on server C (as shown in the following figure). In this scenario, server A can look at locally connected players. When server A can’t find player 2, it can forward the message to an SNS topic to propagate the message to other servers.

[Figure: SNS-backed player-to-player communication between two servers. Player 1 and Player 2 are connected to socket server instances (A, B, C), with an SNS topic between the servers.]

In this scenario, Amazon SNS fills a role similar to a message queue like RabbitMQ or Apache ActiveMQ. Instead of using Amazon SNS, you can run RabbitMQ, Apache ActiveMQ, or a similar package on Amazon EC2. The advantage of Amazon SNS is that you don’t have to spend time administering and maintaining queue servers and software on your own. For more information about Amazon SNS, see What is Amazon SNS? and Creating an Amazon SNS topic in the Amazon Simple Notification Service Developer Guide.

Mobile push notifications

Unlike the previous use case, which is designed to handle near-real-time in-game messaging, mobile push is the best choice for sending a message to draw a user back in when they’re out of a game. An example might be a user-specific event (such as a friend beating your high score) or a broader game event (like a Double XP Weekend).

Although Amazon SNS supports the ability to send push notifications directly to mobile clients, Amazon Pinpoint is a better option. Pinpoint provides more than just mobile push notifications. It’s a player-pleasing, multichannel notification solution that includes email, voice messages, and SNS messaging.
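The routing decision in the player 1 and player 2 example can be sketched without any AWS dependency. In the illustration below, `InMemoryTopic` is a stand-in for the SNS topic, and each player’s “socket” is just a list of delivered messages; all class and field names are assumptions for this example, and a real deployment would publish to and receive from the topic through the AWS SDK.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class InMemoryTopic:
    """Stand-in for the SNS topic: fans each message out to every subscriber."""

    def __init__(self) -> None:
        self.subscribers: List[Callable[[Message], None]] = []

    def subscribe(self, handler: Callable[[Message], None]) -> None:
        self.subscribers.append(handler)

    def publish(self, message: Message) -> None:
        for handler in list(self.subscribers):
            handler(message)


class SocketServer:
    def __init__(self, name: str, topic: InMemoryTopic) -> None:
        self.name = name
        self.topic = topic
        # player id -> messages delivered over that player's (simulated) socket
        self.local_players: Dict[str, List[str]] = {}
        topic.subscribe(self.on_topic_message)

    def connect(self, player: str) -> None:
        self.local_players[player] = []

    def send(self, to_player: str, text: str) -> None:
        if to_player in self.local_players:
            self.local_players[to_player].append(text)  # recipient is connected here
        else:
            # Not local: hand the message to the topic so other servers can deliver it.
            self.topic.publish({"to": to_player, "text": text, "origin": self.name})

    def on_topic_message(self, message: Message) -> None:
        if message["origin"] == self.name:
            return  # ignore our own forwarded messages
        inbox = self.local_players.get(message["to"])
        if inbox is not None:
            inbox.append(message["text"])


if __name__ == "__main__":
    topic = InMemoryTopic()
    a, b, c = (SocketServer(n, topic) for n in "ABC")
    a.connect("player1")
    c.connect("player2")
    a.send("player2", "gg")
    print(c.local_players["player2"])
```

Every server receives the forwarded message, and only the one holding the recipient’s connection delivers it; the others drop it.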

Last thoughts on game servers

It’s easy to become obsessed with finding the perfect programming framework or pattern. Both RESTful and stateful game servers have their place. And any of the languages discussed in this guide work well when programmed thoughtfully. When making your choice, consider your overall game data architecture: where data lives, how to query it, and how to efficiently update it.

For more information about scalable data patterns, read our Guide to Scalable Data on Games for AWS.

