
COMPARISON GUIDE
Top Cloud Data Warehouses for the Enterprise
Amazon vs. Azure vs. Google vs. Snowflake

The Rise of Cloud Data Warehousing

Data warehouses have been staples of enterprise analytics and reporting for decades. But they weren’t designed to handle today’s explosive data growth or keep pace with end users’ ever-changing needs. All that changed when cloud data warehousing emerged.

Cloud data warehousing provides businesses of all sizes with benefits and flexibility they couldn’t enjoy before. No longer constrained by physical data centers, companies can now dynamically grow or shrink their data warehouses to rapidly meet changing business budgets and requirements.

Modern cloud architectures combine three essentials: the power of data warehousing, flexibility of Big Data platforms, and elasticity of cloud at a fraction of the cost to traditional solution users.

This eBook describes leading cloud data warehouses with noteworthy differences and a proven approach to make them accessible, effective, and efficient for all your data users.

MASSIVELY PARALLEL PROCESSING (MPP)
Data warehouses that support big data projects use massively parallel processing (MPP) architectures to provide high-performance queries on large data volumes. MPP architectures consist of many servers running in parallel to distribute processing and input/output (I/O) loads.

COLUMNAR DATA STORES
MPP data warehouses are typically columnar stores, the most flexible and economical for analytics. Columnar databases store and process data by columns instead of rows and make aggregate queries, the type often used for reporting, run dramatically faster.
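To make the two sidebar ideas concrete, here is a minimal Python sketch. It is an illustration only, not how any of the profiled warehouses is implemented. It pivots a small, hypothetical sales table from rows into columns, then computes a total the MPP way: each simulated node sums its own slice of one column, and a coordinator combines the partial results. The table contents, `NUM_NODES`, and all function names are assumptions made for the example, and threads merely stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sales table stored row-wise: one tuple per record.
ROWS = [
    ("2020-01-01", "east", 120.0),
    ("2020-01-01", "west", 80.0),
    ("2020-01-02", "east", 200.0),
    ("2020-01-02", "west", 50.0),
]


def to_columns(rows):
    """Pivot row tuples into a column store: one list per column."""
    dates, regions, amounts = zip(*rows)
    return {"date": list(dates), "region": list(regions), "amount": list(amounts)}


def row_store_total(rows):
    """Row layout: every full record is visited just to use one field."""
    return sum(record[2] for record in rows)


def mpp_total(column, num_nodes):
    """Scatter one column across simulated nodes, sum each slice, then gather."""
    slices = [column[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_sums = list(pool.map(sum, slices))
    return sum(partial_sums)


NUM_NODES = 2  # assumption: a two-node cluster
columns = to_columns(ROWS)
print(row_store_total(ROWS))                    # 450.0
print(mpp_total(columns["amount"], NUM_NODES))  # 450.0
```

Both calls return the same total. The difference is in how much data is touched: the columnar version reads only the `amount` list, and each node handles only its share of it.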

Amazon Redshift

Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with as little as a few gigabytes of data and scale to petabytes. This empowers you to acquire new insights from your business and customer data.

The first step to creating a Redshift data warehouse is to launch a set of nodes, called an Amazon Redshift cluster. After you provision your cluster, you upload your data set and then perform data analysis queries. Regardless of the size of your data set, Amazon Redshift delivers fast query performance using familiar SQL-based tools and business intelligence applications.

THE FIRST WIDELY ADOPTED CLOUD DATA WAREHOUSE
For many years, data warehousing was only available as an on-premises solution. Then in November 2012 Amazon Web Services (AWS) launched Redshift. Although not the first cloud data warehouse, it was the first to gain market share through adoption. Redshift’s SQL dialect is based on PostgreSQL, which is well understood by analysts worldwide, and uses an architecture familiar to many on-premises data warehouse users.

Microsoft Azure Synapse Analytics

Azure Synapse Analytics is a newer analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data using either serverless on-demand or provisioned resources. Azure Synapse offers a unified experience to ingest, prepare, manage, and serve data for your business intelligence (BI) and machine learning (ML) needs.

At the heart of Azure Synapse is a cloud-native, distributed SQL processing engine. It’s built on the foundation of SQL Server to drive your most demanding enterprise data warehousing workloads. Similar to other cloud MPP solutions, Azure SQL Data Warehouse (SQL DW) separates storage and compute, billing for each separately. Azure Synapse stores relational table data in columnar storage and abstracts physical machines by representing compute power in the form of data warehouse units (DWUs). This allows your users to easily and seamlessly scale compute resources at will.

TAKING SQL BEYOND DATA WAREHOUSING
Synapse Analytics aims to unify a range of analytics workloads, such as data warehouses, data lakes, and ML, in a single user interface (UI).

The combination of an SQL Engine, Apache Spark with Azure Data Lake Storage (ADLS), and Azure Data Factory gives users the option to control both data warehouse/data lakes and data preparation for ML tasks.

Azure Synapse allows for both vertical and horizontal scaling of the data warehouse. It scales vertically by changing the service tier or placing the database in an elastic pool, and horizontally by adding more data warehouse units.

Google BigQuery

BigQuery is a fully managed, serverless data warehouse that automatically scales to match storage and computing power needs. With BigQuery, you get a columnar and ANSI SQL database that can analyze terabytes to petabytes of data at incredible speeds. BigQuery also lets you do geospatial data analysis using familiar SQL with BigQuery GIS. In addition, you can quickly build and operationalize ML models on large-scale structured or semi-structured data using simple SQL with BigQuery ML. And you can support real-time interactive dashboarding with BigQuery BI Engine.

The BigQuery architecture is composed of several components. Borg is the compute. Colossus is the distributed storage. Jupiter is the network. And Dremel is the execution engine.

A SERVERLESS SOLUTION
Google doesn’t expect you to manage your data warehouse infrastructure, which is why BigQuery hides many of the underlying hardware, database, nodes, and configuration details. Its elasticity automatically works out of the box. And getting started is simply a matter of creating an account with Google Cloud Platform (GCP), loading a table, and running a query. Google takes care of the rest.

Snowflake Cloud Data Platform

Snowflake is a fully managed MPP cloud data warehouse that runs on AWS, GCP, and Azure. When you’re a Snowflake user, you can spin up as many virtual warehouses as you need to parallelize and isolate the performance of individual queries. Snowflake enables very high concurrency by separating storage and compute to ensure that many warehouses can simultaneously access the same data source.

You interact with Snowflake’s data warehouse through a web browser, the command line, an analytics platform, or via Snowflake’s ODBC, JDBC, or other supported drivers. The platform supports ACID-compliant relational processing and has native support for document store formats such as JSON, Avro, ORC (Optimized Row Columnar), Parquet, and XML.

Snowflake’s hybrid architecture is separated into three distinct layers.

THE FIRST MULTI-CLOUD DATA WAREHOUSE
Snowflake, unlike the other data warehouses we’ve profiled, is the only solution that doesn’t run on its own cloud. It’s the first multi-cloud data warehouse available globally on AWS, GCP, and Azure. With a common and interchangeable code base, Snowflake features global data replication, which means you can move your data to any cloud, in any region, without having to re-code your applications or learn new skills.
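The separation of storage and compute described above can be pictured with a short Python sketch. This is a toy model under stated assumptions, not Snowflake’s implementation or API: `SharedStorage`, `VirtualWarehouse`, the `orders` table, and the warehouse names are all hypothetical. It shows two independently sized compute pools reading the same single copy of the data at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedStorage:
    """Storage layer: one read-only copy of the data, shared by all compute."""

    def __init__(self, tables):
        self._tables = tables

    def scan(self, table, column):
        return self._tables[table][column]


class VirtualWarehouse:
    """Compute layer: an independently sized worker pool that holds no data."""

    def __init__(self, name, size, storage):
        self.name = name
        self.size = size  # number of workers; resizing never touches storage
        self.storage = storage

    def total(self, table, column):
        values = self.storage.scan(table, column)
        chunks = [values[i::self.size] for i in range(self.size)]
        with ThreadPoolExecutor(max_workers=self.size) as pool:
            return sum(pool.map(sum, chunks))


storage = SharedStorage({"orders": {"amount": [10, 20, 30, 40]}})
reporting = VirtualWarehouse("reporting", size=1, storage=storage)
data_science = VirtualWarehouse("data_science", size=4, storage=storage)

# Two warehouses query the same table concurrently without copying it.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(
        pool.map(lambda wh: wh.total("orders", "amount"), [reporting, data_science])
    )
print(results)  # [100, 100]
```

Because neither warehouse owns the data, one can be resized or added without moving anything in storage, which is the property the concurrency claim rests on.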

Top Cloud Data Warehouses at a Glance

                               Amazon Redshift      Microsoft Azure Synapse     Google BigQuery                   Snowflake Cloud Data Platform
Initial Release                2012                 2016                        2010                              2014
Separates Storage and Compute  No                   Yes                         Yes                               Yes
Multi-Cloud                    No                   No                          No                                Yes
Query Language                 Amazon Redshift SQL  TSQL                        Standard SQL 2011 & BigQuery SQL  Snowflake SQL
Elasticity                     Yes - Manual         Yes - Manual and Automatic  Yes - Automatic                   Yes
Transaction
Website                        Link                 Link                        Link                              Link
Free Trial                     Yes                  Yes                         Yes                               Yes

Achieve an Agile Data Warehouse

Our Qlik Data Integration Platform (formerly Attunity) automates the entire data warehouse lifecycle to accelerate the availability of your analytics-ready data. Our model-driven approach helps your data engineers to design, deploy, manage, and catalog purpose-built, cloud data warehouses faster than traditional solutions. Add Qlik to any cloud data warehouse you choose and achieve the cost and efficiency promises of agile data warehousing.

OUR QLIK PRODUCTIVITY DRIVERS
- Real-time data ingestion and updates – A simple and universal solution for continually ingesting your enterprise data into popular cloud data warehouses in real time.
- Automated workflow – A model-driven approach for continually refining your data warehouse operations.
- Trusted, enterprise-ready data – A smart, enterprise-scale data catalog to securely share your data marts.

Choose Cloud Data Warehousing and Innovate with Qlik

The cloud is now the go-to platform for modern analytics. That’s why your enterprise needs approaches and technologies that enable analytics in the cloud, in a solution that delivers more value with faster iterations and fewer resources. With Qlik, you automate your data warehouse, optimize your data pipeline, deliver a secure data catalog, and cap it all off with industry-leading analytics.

Qlik is a robust, comprehensive, and innovative solution for enabling world-class data architecture, integration, delivery, and analytics in your business.

Ready to move to agile data warehousing? We’re ready to help.

ABOUT QLIK
Qlik’s vision is a data-literate world, one where everyone can use data to improve decision-making and solve their most challenging problems. Only Qlik offers end-to-end, real-time data integration and analytics solutions that help organizations access and transform all their data into value. Qlik helps companies lead with data to see more deeply into customer behavior, reinvent business processes, discover new revenue streams, and balance risk and reward. Qlik does business in more than 100 countries and serves over 50,000 customers around the world.

© 2020 QlikTech International AB. All rights reserved. All company and/or product names may be trade names, trademarks and/or registered trademarks of the respective owners with which they are associated.
