5 Signs You Should Consider Aerospike Over Amazon DynamoDB



Table of Contents

- Introduction
- Consideration 1: Will on-demand pricing accelerate TCO as you hit scale?
  - Cost of Strategic Provisioning: Migrating data from another data source into DynamoDB
  - Cost of Capacity: Higher read performance costs more. A lot more.
  - Cost of Consistency: Avoiding stale reads will incur additional cost in DynamoDB
- Consideration 2: Will variable performance and fluctuating latency be an issue?
  - Noisy Neighbors will drive inconsistent performance
  - Benchmark results for DynamoDB and DynamoDB with DAX expose variability in performance when compared to Aerospike
  - Summary findings
- Consideration 3: Will regulations influence cloud or non-cloud platform requirements?
- Consideration 4: Will rapidly rising DevOps and testing costs be a concern?
- Consideration 5: Will vendor lock-in be an issue?
- Call to Action
- Appendix A: Benchmark results for DynamoDB and DynamoDB with DAX vs. Aerospike
- Appendix B: Costing Detail
  - DynamoDB Cost Detail
  - DynamoDB Accelerator (DAX) Cost Detail
  - Aerospike Cross AZ Data Cost Detail
  - Aerospike Instance Cost Detail
- About Aerospike

Introduction

The right database is an essential element for applications that are mission critical and core to a business. A database may not be sexy, but the right database can improve end-user experience, ensure 24/7 uptime, reduce TCO, and allow IT-Ops teams to sleep soundly at night. Yet initial database selection is often based on in-house familiarity, the path of least resistance, IT politics, or the "approved vendor list". Ultimately, the most important consideration is a successful business outcome, but this is often overlooked upfront. The next most important consideration is total cost, all things being equal. This, too, can be overlooked at the outset of a project. With those objectives in mind, a sound technology choice can finally be made.

At Aerospike, we are often asked how Aerospike compares to other databases and where and when Aerospike is the better choice. This is a complex topic because there are many good databases. However, they are often very different, and evaluating their respective characteristics and suitability for the project at hand can be challenging. This paper offers considerations that we believe are relevant when deciding between Aerospike and Amazon DynamoDB.

We often encounter this question from enterprises that have chosen Amazon Web Services (AWS) for bare-metal hosting. DynamoDB, one of the many AWS database offerings, is a solid product that appears to be a sound choice for companies that already leverage the AWS platform. However, it is not a one-size-fits-all product. There are areas of considerable differentiation that may indicate Aerospike is the preferable choice. The dilemma is that a company cannot change its business requirements to fit the limitations of a technology, and, conversely, it is unlikely that the technology can change either.

What signs should you look for when deciding whether Aerospike might be the better choice? Here are five that we cover in this paper:

Sign 1: On-demand pricing accelerates TCO as you hit scale
Sign 2: Variable performance and fluctuating latency can be an issue
Sign 3: Regulatory influences on cloud or non-cloud platform requirements
Sign 4: Rising DevOps and testing costs are now a concern
Sign 5: Vendor lock-in can be an issue

If these considerations matter to you as you evaluate a database technology for a new project or a new phase of a project, we hope to address them in this paper. We will look specifically at DynamoDB and Aerospike and compare their relative strengths and weaknesses as they relate to the above considerations.

Throughout this paper, we will provide an objective view of the information that is currently available. Hopefully this will aid decision making and help any project team make appropriate choices, both for the business and for the respective technology.

Sign 1: On-demand pricing accelerates TCO as you hit scale

While this was a question of CapEx versus OpEx at the beginning of the cloud computing discussion, many enterprises find that on-demand pricing is not as advantageous as they once hoped. Research shows that enterprises are receiving cloud bills that exhibit 40-50% CapEx and OpEx waste, given the hidden costs that often lurk within cloud environments.

There are a couple of key attributes that drive higher pricing when it comes to Amazon DynamoDB. One is the migration of data when onboarding, and the other is the pricing basis, which relies heavily upon usage capacity. Many of these factors are not considered when initially comparing one database to another. For DynamoDB, the true costs are uncovered a few months after deployment, when the usage and cost patterns are better understood.

Cost of Strategic Provisioning: Migrating data from another data source into DynamoDB

When a few existing database customers moved from Aerospike to DynamoDB (and then later returned, hence our reporting of this issue), they found that they needed an unusually high number of capacity units to migrate from Aerospike to DynamoDB in a timely manner. This ended up costing them 10 times the amount they expected for the migration.

Since DynamoDB pricing is based upon Read Capacity Units (RCUs) and Write Capacity Units (WCUs), customers pay a premium to migrate data onto the DynamoDB platform at a faster rate. If you increase the capacity requirement for the sake of a faster data load, AWS creates many internal partitions. However, after the migration, the platform does not allow you to reduce the number of partitions for steady-state usage.

After their initial migration from Aerospike to DynamoDB, these users were able to decrease the number of capacity units to a steady-state value to reduce costs, but they could not decrease the number of partitions needed for DynamoDB. In other words, if one increases the capacity requirement for a faster data load during a migration, the DynamoDB system will create a proportional number of internal partitions. (There is a cap of 1,000 write units per partition.)

So, in a way, customers pay for excess capacity they never really use. Further, if you have a read/write pattern with affinity to a particular partition, you will be limited by the per-partition cap of 1,000 write units, even though you have more overall capacity. It can be difficult to anticipate the additional costs associated with overprovisioning for migration.

What options remain? Either customers continue to pay tens of thousands of dollars a month, or they reduce the capacity and thus reduce throughput. The result is that customers pay for more capacity than they actually use when leveraging DynamoDB. This is one of the primary considerations when cost-optimizing DynamoDB usage, and it argues for lean, strategic provisioning.
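To make the provisioning mechanics concrete, here is a minimal sketch using the AWS boto3 SDK for Python (the table name and capacity figures are hypothetical). It raises provisioned write capacity for a bulk load and dials it back afterwards; note that while the capacity units can be reduced, the internal partitions created for the higher setting remain.

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Hypothetical table, provisioned high for a fast bulk load.
# High WCU settings cause DynamoDB to create many internal partitions.
dynamodb.create_table(
    TableName="migrated_profiles",          # hypothetical name
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    ProvisionedThroughput={
        "ReadCapacityUnits": 1000,
        "WriteCapacityUnits": 40000,        # sized for the migration burst
    },
)

# ... bulk load runs here ...

# After migration, capacity can be dialed back to a steady-state value,
# but the partitions created for the 40,000 WCU setting are not merged,
# so per-partition throughput limits still reflect the original split.
dynamodb.update_table(
    TableName="migrated_profiles",
    ProvisionedThroughput={
        "ReadCapacityUnits": 1000,
        "WriteCapacityUnits": 2000,         # steady-state workload
    },
)
```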

Cost of Capacity: Higher read performance costs more. A lot more.

Looking at Figure 1, let's examine the test case of one billion records with a record size of 4 KB, 150,000 reads per second and 150,000 writes per second, and a replication factor of three. It is easy to compare the yearly costs of capacity between the hourly rate for a year, the hourly rate with a 1-year upfront purchase, and the hourly rate with a 3-year upfront purchase. All things are never equal, but it is helpful to understand the basic metrics for cost evaluation for each database. This data set is representative of the cost differences you would typically experience on AWS between DynamoDB and Aerospike.

Figure 1: 1-Year Operational Charges on AWS, DynamoDB vs. Aerospike (1B records, 4 KB record size, 75K TPS reads, 75K TPS writes, RF3), comparing the hourly rate, 1-year upfront rate, and 3-year upfront rate for DynamoDB, DynamoDB with DAX, and Aerospike. DynamoDB with DynamoDB Accelerator (DAX) is included because benchmark results (see Appendix) indicate DAX is required to approximate Aerospike read performance. (For cost detail, see Table 1.)

There is a 3-year upfront pricing option for DynamoDB with significant discounting (on the order of 70-80%) that reverts to the 1-year upfront pricing rate thereafter. Customers also have the option of the more flexible hourly/annual rate without paying upfront (albeit at a considerably higher rate). Paying upfront to leverage pricing discounts is one of the factors enterprises are prone to neglect.

When using DynamoDB, you pay a flat, hourly rate based upon how much capacity you have provisioned in Read Capacity Units (RCUs) and Write Capacity Units (WCUs). Pricing also varies by region (though this is indicated clearly). Capacity planning is obviously a significant aspect of ensuring cost efficiency for DynamoDB.

Recently, Amazon introduced DynamoDB Accelerator (DAX) to be used on top of DynamoDB to reduce query latency. DAX is a caching mechanism and is priced independently, per memory-hour. For example, if you want half your records on DAX for faster queries, costs will increase accordingly.
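To illustrate the capacity-pricing arithmetic, here is a rough back-of-the-envelope estimator for a 4 KB workload like the one above. The per-unit hourly rates are placeholders of our choosing, not current AWS prices; actual rates vary by region and change over time.

```python
# Rough, illustrative estimator for provisioned DynamoDB capacity cost.
# The hourly rates below are hypothetical placeholders; check current
# AWS pricing for your region before relying on any number here.

HOURS_PER_YEAR = 24 * 365

def yearly_capacity_cost(rcus, wcus, rcu_rate_hr, wcu_rate_hr):
    """Flat hourly charge for provisioned capacity, extended to a year."""
    return (rcus * rcu_rate_hr + wcus * wcu_rate_hr) * HOURS_PER_YEAR

# 4 KB records: one strongly consistent read of up to 4 KB costs 1 RCU,
# while a 4 KB write costs 4 WCUs (1 WCU covers a 1 KB write).
reads_per_sec = 75_000
writes_per_sec = 75_000
rcus = reads_per_sec * 1      # 4 KB / 4 KB per RCU
wcus = writes_per_sec * 4     # 4 KB / 1 KB per WCU

cost = yearly_capacity_cost(rcus, wcus,
                            rcu_rate_hr=0.00013,   # hypothetical $/RCU-hour
                            wcu_rate_hr=0.00065)   # hypothetical $/WCU-hour
print(f"Estimated yearly capacity cost: ${cost:,.0f}")
```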

Table 1 (below) exhibits an application with a mix of reads and writes at very high transactions-per-second (TPS) throughput; it provides additional detail behind Figure 1. The operational charges from Amazon for DynamoDB and DynamoDB with DAX are approximately 85% more than for a comparable Aerospike database. (See Appendix B for further pricing detail.)

Table 1: When looking at usage patterns, cost over time as well as growth is important to understand. While the on-demand model is often attractive, it can cost several times that of other database options.

These operational costs do not include costs for any scaling or growth that may be encountered in subsequent years. Moreover, to achieve latency similar to Aerospike's (e.g., 1 ms), DAX will be required (see Appendix A for benchmark comparisons), which drives a significant and unexpected cost penalty, even with 1- and 3-year upfront discounting (modeled here; not typically available in practice). If latency requirements are less critical (per the Appendix B benchmark, the latency for DynamoDB without DAX can be 10x that of Aerospike), then using DynamoDB without DAX will save money. However, as the business requirements change and grow, project teams will be forced to revisit the economics of this decision. (For DynamoDB and DAX performance dynamics, see Consideration 2.)

Cost of Consistency: Avoiding stale reads will incur additional cost in DynamoDB

DynamoDB has two modes for reads: eventually consistent and strongly consistent. It allows twice as many eventually consistent reads as strongly consistent reads per capacity unit (see https://aws.amazon.com/dynamodb/faqs/). For an eventually consistent read, the response might not reflect the results of a recently completed write operation; the response might therefore include stale data. A strongly consistent read, on the other hand, will return the most up-to-date copy of the data, no matter what. Whether the needs of the application will be sufficiently served by one or the other depends upon the business.

It is important to note that strong consistency has a price. Strongly consistent reads cost twice as much as eventually consistent reads, hence development teams should choose wisely.

Conclusion: The published costs do not make DynamoDB more compelling when looking at growth over time. Upfront costs for migration, costs for high performance, and costs beyond three years will result in much greater total costs. Buying specific reserved instances, while reducing total cost, can certainly burden cash flow and budget upfront, increase the lock-in effect (see Consideration 5), and impact DevOps flexibility.
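The consistency mode is chosen per request. As a minimal sketch with boto3 (the table and key names are hypothetical), the same GetItem call toggles between the two modes via the ConsistentRead flag, with the strongly consistent variant consuming twice the read capacity:

```python
import boto3

# Hypothetical table with a string partition key named "pk".
table = boto3.resource("dynamodb", region_name="us-east-1").Table("profiles")

# Eventually consistent read (the default): cheaper, but may return
# stale data if a write completed moments ago on another replica.
cheap = table.get_item(Key={"pk": "user#42"})

# Strongly consistent read: always reflects completed writes,
# but consumes twice the read capacity units.
fresh = table.get_item(Key={"pk": "user#42"}, ConsistentRead=True)
```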

Sign 2: Variable performance and fluctuating latency can be an issue

Running workloads on a public cloud has advantages, but predictability and low latency are unfortunately not among them. The impact of deploying on an on-demand platform with unpredictable network latency and the noisy neighbor effect can be systemic, degrading performance.

Noisy Neighbors will drive inconsistent performance

The term noisy neighbors, also known as stolen CPU, implies negative intent from your virtual neighbors. However, intent has nothing to do with how performance can be adversely affected. In reality, stolen CPU is a measure of the cycles a CPU should have delivered to your database but could not, because other tenants on the same infrastructure diverted those resources. Databases are especially sensitive to noisy neighbors because of their CPU-intensive tasks. The result is bursty behavior, with drops and rises in performance based upon how the public cloud allocates CPU time to your specific process, in this case a database.

Some of these "stolen" cycles come from the hypervisor enforcing a quota tied to the type of subscription you have in place, which represents an upper limit that is hit during spikes in CPU utilization. In other cases, such as the one shown below (Figure 2), the amount of diverted CPU cycles varies over time, presumably because other tenants on the same physical hardware are also requesting CPU cycles. In short, the database is competing for the same hardware resources and will not "win" all of the time.

Figure 2 shows a graph of CPU usage on a host, with 'stolen' CPU in yellow. Light blue denotes 'idle' or available cycles, purple denotes 'user' or cycles spent executing application code, and dark blue denotes 'system' or cycles spent executing kernel code. In this case, we can see that the amount of 'stolen' CPU (yellow) is significant and at a couple of points in time pushes CPU utilization to 100%, threatening processes (purple) and headroom (light blue).

Figure 2: Stolen CPU (yellow) in a typical AWS performance monitoring graph.

Though many databases run on the AWS public cloud, including Aerospike, inconsistent performance should be viewed as a strong consideration when deciding to deploy on a public cloud. Further to this consideration, it should be noted that Aerospike can run either on-premise or on AWS, as well as on other clouds.
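Steal time is visible from inside the guest. The following sketch (Linux only, and assuming the standard /proc/stat field layout) samples the aggregate CPU counters and reports the share of cycles diverted away from the VM, i.e. the 'stolen' band in graphs like Figure 2:

```python
import time

def cpu_counters():
    """Read the aggregate 'cpu' line from /proc/stat (Linux).

    Fields: user nice system idle iowait irq softirq steal ...
    """
    with open("/proc/stat") as f:
        fields = f.readline().split()[1:]
    return [int(x) for x in fields]

before = cpu_counters()
time.sleep(5)                       # sampling interval
after = cpu_counters()

deltas = [b - a for a, b in zip(before, after)]
total = sum(deltas)
steal = deltas[7]                   # 8th field is 'steal'

print(f"stolen CPU over interval: {100.0 * steal / total:.1f}%")
```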

Since public clouds such as AWS are multi-tenant systems, they allocate resources among the various tenant workloads as needed. Dealing with this can be a challenge. However, there are a few ways to mitigate the issue:

1. Consider a database technology that can utilize cloud-native platform features in ways that remove multi-tenant-induced latency, whether from CPU or storage I/O cycles taken by other tenants. Since Aerospike is not an AWS cloud service but rather an AWS tenant, its latencies benefit from the higher priority AWS allocates to customer processes above those of its own native platform services.
2. Consider a database technology that runs both on a public cloud provider and on-premise. (See Consideration 3.) This choice provides the option to work around latency and cost issues that cannot be easily or inexpensively resolved if you are locked into a public-cloud-only database.
3. Consider using a database that utilizes the CPU highly efficiently, which can further mitigate the noisy neighbor problem.

Benchmark results for DynamoDB and DynamoDB with DAX expose variability in performance when compared to Aerospike

In benchmark analysis (see Figure 3 below and Appendix A), Aerospike delivers the lowest latency and highest query throughput compared to DynamoDB under scenarios with heavy read workloads and a 50-50 read/write workload. Performance is about more than completing database tasks the fastest. It is also about productivity and TCO, which often cost much more than the database itself and depend on other factors such as the cost of Dev and IT Ops.

Internal Aerospike benchmark tests used the following scenario:

- 1 billion records at 4 KB data size
- Heavy read workloads
- Heavy write workloads
- 3 DAX instances per cluster (DAX is a separate DynamoDB clustering layer to improve performance)
- A cache hit ratio of 52% (roughly 50-50 between DynamoDB usage and DynamoDB Accelerator usage)

Summary findings:

Figure 3: Latency benchmark (ms) comparing average, 95th-percentile, and 99.9th-percentile latencies for Aerospike, DynamoDB, and DynamoDB with DAX under 100% read and 50/50 read/write workloads. Note that both DynamoDB and DynamoDB with DAX have latency numbers much higher than Aerospike in all cases.
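For reference, a workload of this shape is straightforward to drive against an Aerospike cluster. Here is a minimal sketch using the Aerospike Python client (host, namespace, and set names are hypothetical; the benchmarks above were produced with dedicated benchmark tooling, not this loop):

```python
import random
import time

import aerospike
from aerospike import exception as aex

# Hypothetical connection details.
client = aerospike.client({"hosts": [("10.0.0.10", 3000)]}).connect()

NAMESPACE, SET = "test", "bench"
PAYLOAD = "x" * 4096          # ~4 KB record, matching the benchmark shape

latencies = []
for _ in range(10_000):
    key = (NAMESPACE, SET, f"user{random.randrange(1_000_000)}")
    start = time.perf_counter()
    if random.random() < 0.5:                 # 50/50 read/write mix
        client.put(key, {"payload": PAYLOAD})
    else:
        try:
            _, _, record = client.get(key)
        except aex.RecordNotFound:
            pass                              # key not yet written
    latencies.append((time.perf_counter() - start) * 1000)

latencies.sort()
print(f"avg {sum(latencies)/len(latencies):.2f} ms, "
      f"p95 {latencies[int(0.95 * len(latencies))]:.2f} ms")

client.close()
```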

Aerospike delivers <1 ms latency 99% of the time at 200K-500K TPS (see Appendix A). Similar workloads are common for transactional analysis applications. Though DynamoDB with DAX does improve latency, the costs for DAX can be prohibitively high.

For DynamoDB with DAX, improving latency requires instance upgrades, an increased number of clusters with sharding, and using hot keys to allow data subsets to fit onto a DAX cluster to improve cache hit ratios. For example, with a cache hit ratio of 52%, we observed average latencies of 2.5 ms to 3.5 ms (see Appendix A). For sub-millisecond (<1 ms) performance, we observed that a cache hit ratio of 90% would be required, which would incur considerably higher costs.

Further, there is throttling between the DAX servers and the DynamoDB servers. Trying to warm the cache can be a long (and tedious) process. Warming a DAX cache may take as long as 5 or 6 hours, depending upon the target cache hit ratio; a 50% cache hit ratio will most likely take 2 to 3 hours. With Aerospike, which does not require a cache, this never becomes an issue or a cost.

Trying to improve DynamoDB latency with DAX injects other costs:

- DynamoDB applications need additional code to use DAX: the applications need to know which keys are in which DAX cluster (see the sketch after this list).
- When using DAX, write latency actually goes up as the price of good reads; you essentially have to "write through" the cache to DynamoDB.
- With DAX deployments, performance is hampered by having hot/warm/cold data: some data is in the DAX cache and some in DynamoDB. When the needed data is in the DAX cache, latency improves; when it is not in the cache and must be retrieved from DynamoDB, the result is much higher latency and thus sizable performance variation.
- Aerospike's high-performance range, by contrast, is very consistent. For instance, Aerospike can deliver 700K TPS without any incremental cost, at the same low latency levels (see Appendix A). Since DynamoDB is priced by TPS, its costs would rise quickly at that throughput.
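As referenced in the first bullet above, using DAX means adopting a separate client and knowing which DAX cluster endpoint fronts which keys. A minimal sketch with the amazondax Python client (the cluster endpoint and table name are hypothetical):

```python
import botocore.session

from amazondax import AmazonDaxClient

# DAX uses its own client; reads go to the cache cluster, and writes
# are written through the cache to DynamoDB (raising write latency).
session = botocore.session.get_session()
dax = AmazonDaxClient(
    session,
    region_name="us-east-1",
    # Hypothetical cluster endpoint; the application must know which
    # DAX cluster holds which subset of keys.
    endpoints=["mydaxcluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111"],
)

# Cache hit: served from DAX memory at low latency.
# Cache miss: DAX fetches from DynamoDB, so latency spikes toward
# the uncached DynamoDB figures in Figure 3.
resp = dax.get_item(
    TableName="profiles",
    Key={"pk": {"S": "user#42"}},
)
```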
Conclusion: While latency issues within multi-tenant public clouds are well documented, it is worth considering the impact they might have on application performance at scale. Trends indicate that most deployments either run within a public cloud, such as AWS, or run on-premise. It is helpful to remember that most database requirements, including those around performance and cost, will change over time. The more affordable and performant the database, including support for both public cloud and on-premise deployment, the more likely the long-term success of the project.

Sign 3: Regulatory influences on cloud or non-cloud platform requirements

Many enterprises deploying new applications need the operational flexibility to use both cloud and non-cloud platforms to deal with real-world business issues. Examples include laws that require data to remain on-premise, or latency issues that can only be addressed by co-locating data with the applications, whether cloud or non-cloud. With DynamoDB, the only option is to run within Amazon Web Services, whereas Aerospike can run in both cloud and non-cloud environments. This adds optionality and prevents lock-in.

Business continuity is also worth considering. Many enterprises adopt hybrid deployment models for exactly that purpose: some data is deemed acceptable for the public cloud, while other data is not and is kept in separate data centers. If an enterprise needs high-performance database technology in its public cloud as well as in its own private cloud, then choosing DynamoDB severely limits the options. With Aerospike, the enterprise can run on-premise or in a public cloud (AWS, Azure, GCP).

The General Data Protection Regulation (GDPR) (Regulation (EU) 2016/679) is a regulation by which the European Parliament, the Council of the European Union, and the European Commission intend to unify data protection for all individuals in the European Union (EU). Moreover, this regulation also addresses the export of personal data outside the EU. GDPR mandates strict rules regarding what data is housed and where, how it is secured, and how users access and use it. As enterprises embrace public, private, and hybrid cloud strategies, and thus store data in diverse environments, it is appropriate to give serious consideration to the regulatory and legal requirements of each location, including cross-border and security sensitivities.

In addition, the Payment Services Directive 2 (PSD2) is an EU directive, administered by the European Commission, that regulates payment services and payment service providers throughout the European Union (EU) and European Economic Area (EEA). PSD2 sets the stage for open banking by providing standardized access to customer data, enhancing payment security, and lowering the barriers to entry for Trusted Third Party (TTP) services (see https://en.wikipedia.org/wiki/Trusted_third_party) and FinTech. The value nexus between financial institutions and payment providers lies in the specific ability to transact using customer data. Simply put, customer data needs to be exposed to many players across many services at a much faster rate, with all of it done securely.

Emerging regulations warrant close consideration when making deployment decisions:

First, PII (personally identifiable information) may only be stored within the country where the EU citizen resides.

Second, the ability to transport data back and forth between an enterprise's own data center and the public cloud becomes a matter of compliance. Indeed, many public cloud providers do not guarantee that data will remain in-country, since they leverage backup services that may replicate the data to storage systems outside the country.

So, there is much to consider when selecting database technology. The public cloud offloads opportunity cost and mitigates early-stage risk. But in the long run, deployment flexibility might be of paramount importance. Without the choice of both cloud and non-cloud, enterprises could find themselves migrating off of a "closed" database, such as DynamoDB, at great risk and expense.

Conclusion: This section is about leveraging both cloud and non-cloud technology to deal with changing operational needs. As privacy, protection, and compliance issues continue to arise, the ability to leverage both cloud and non-cloud at the same time is becoming increasingly attractive.

Sign 4: Rising DevOps and testing costs now a concern

Considering Dev and Test in the cloud? If so, that will require a dev and test budget for AWS instances as part of any project plan. Paying on-demand pricing for those instances may very well prove to be cost prohibitive. "Sticker shock" for dev and test with the required database instances drives many enterprises to keep DevOps resources on-premise. This is worth taking into account when considering DynamoDB:

1. While on-demand costs appear to provide better cost metrics for dev and test use of databases, the resources needed across dev, test, staging, and deployment are often far greater than expected.

2. Latency issues encountered with remote public cloud hosting are not typically considered. This means that even minor performance testing can cost thousands of dollars a day.
3. Security testing becomes more complex, considering that more security approaches and enabling technologies come into play to handle native security on a public cloud as well as native security within on-premise systems.

If the DevOps team does a lot of validation with dynamic data sets, there is a cost premium every time it must test its product or solution. Customers often try to mitigate cost issues with reserved instances, paying upfront for cloud service, or buying spot instances (being there at the right time), but this quickly becomes a manageability issue.

Dedicated resources on the DevOps team who hunt for the cheapest way to get the job done must constantly monitor, track, and analyze service usage charges and "look for a deal" to reduce costs. This can become a distraction and reduce the effectiveness of DevOps processes and teams. For a typical DevOps organization with 100 employees doing Dev and Test, a mere 5 percent loss in productivity can cost the business over $5,000 per day, or $1.3 million per year.
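The figures above follow from simple arithmetic. A quick sanity check, where the loaded cost per engineer-day is our assumption:

```python
# Back-of-the-envelope check of the productivity-loss claim above.
# The loaded cost per engineer-day is an assumed round figure.

engineers = 100
productivity_loss = 0.05            # 5% of capacity lost to cost-hunting
cost_per_engineer_day = 1_000       # assumed loaded $/day per engineer
workdays_per_year = 260

daily_loss = engineers * productivity_loss * cost_per_engineer_day
print(f"${daily_loss:,.0f} per day")                       # $5,000 per day
print(f"${daily_loss * workdays_per_year:,.0f} per year")  # $1,300,000 per year
```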

It is helpful to compare Dev and Test costs for a typical DevOps process or chain. Figure 4 below shows how Dev and Test costs compare when using Aerospike or DynamoDB. While the on-demand aspect of AWS's DynamoDB does reduce some costs, its use within a DevOps process not only creates a cost disadvantage but can also be a disruption that carries hidden costs.

Figure 4: Implementing DevOps, DynamoDB vs. Aerospike: comparative costs of Dev, Test, and Staging. When running a DevOps organization, it is worth considering the additional cost of the database for Dev, Test, and Staging. (Source: Cloud Technology Partners)

It also makes sense to consider the cost of staging in the public cloud. This is often different from test, considering that a database is being deployed and, once again, the organization is paying on-demand prices for DynamoDB instances. These invariably cost a great deal more than typically anticipated. In short, on-demand databases are expensive when it comes to DevOps, and DynamoDB shares that limitation.

Conclusion: On-demand database pricing may be fine for production. However, it makes good sense to consider the cost of Dev and Test, and the fact that any project will need a database instance for each, as well as for staging. Those costs are often surprisingly prohibitive when compared to on-premise databases such as Aerospike, or Aerospike in the cloud.

Sign 5: Vendor lock-in can be an issue

We all fear lock-in. Why? While there is rarely a technology that cannot be abandoned, the core issue is the cost of changing technologies, as well as the risk and disruption to the business. Years ago, hardware vendors also offered databases. These databases were proprietary and ran only on the vendor's proprietary hardware. Until the advent of "open" relational databases, customers were locked in and could not move from one hardware vendor to another. Proprietary databases in the cloud that preclude a move to an alternate deployment present the same problem once again. If a business wants to move from one cloud provider to another, deploy a hybrid cloud, or even deploy on-premise or at a colo, that will not be possible.

In the immortal words of Yogi Berra, the famous American baseball player, this vendor lock-in is "Déjà Vu All Over Again".

DynamoDB is an AWS IaaS cloud-only offering.
