FOR APPLICATION DEVELOPMENT & DELIVERY PROFESSIONALS

Data Warehouse Automation Helps Close The Data-To-Insight Gap

by Boris Evelson and Nasry Angel
August 16, 2016

Why Read This Report

Agile business intelligence (BI) platforms only partially support the iterative development of BI applications. Parts of the process, like data integration and modeling, still follow a less flexible waterfall development life cycle. In this report, AD&D pros can learn about a landscape of vendors that bring Agile options to all phases of BI application development. These data warehouse automation (DWA) platforms facilitate shorter, more iterative development cycles, foster collaboration between the business and technology management, and require fewer expensive human resources with specialized data modeling skills.

Key Takeaways

Agile BI Requires More Than Just Agile Dashboards
True agility means prototyping data models (not just dashboards and reports) quickly so business users can continuously iterate on them. AD&D pros working on BI initiatives should consider adding DWA platforms to their BI toolbox.

DWA Enables BI Self-Service For Use Cases Requiring A Data Warehouse (DW)
DWA is not a data warehouse appliance, nor data-warehouse-as-a-service (DWaaS). It’s software that automatically generates a data warehouse by analyzing the data itself and applying best practices for DW design embedded in the technology. Another name for this type of technology is “metadata-generated analytics.”

by Boris Evelson and Nasry Angel
with Gene Leganza, Brian Hopkins, Noel Yuhanna, and Shreyas Warrier

Table Of Contents

Traditional BI Development Delivers Insights Through A Dropper
    Agile BI Tools Address Development Agility For Only Parts Of The Full BI Stack
DWA Tools Crack Open The Floodgates
Deploy DWA Only When Benefits Outweigh Concerns
What It Means
    Consider A DWA Platform When All The Stars Align Just Right
Supplemental Material

Notes & Resources

Forrester interviewed five vendor and user companies: Attunity, Birst, Magnitude Software, TimeXtender, and WhereScape.

Related Research Documents

The Forrester Wave: Agile Business Intelligence Platforms, Q3 2015
Insight Platforms Accelerate Digital Transformation
It’s Time For A User-Driven Enterprise BI Strategy
TechRadar: Business Intelligence, Q1 2015

Forrester Research, Inc., 60 Acorn Park Drive, Cambridge, MA 02140 USA. +1 617-613-6000. Fax: +1 617-613-5000. forrester.com
© 2016 Forrester Research, Inc. Opinions reflect judgment at the time and are subject to change. Forrester, Technographics, Forrester Wave, RoleView, TechRadar, and Total Economic Impact are trademarks of Forrester Research, Inc. All other trademarks are the property of their respective companies. Unauthorized copying or distributing is a violation of copyright law. Citations@forrester.com or +1 866-367-7378

Traditional BI Development Delivers Insights Through A Dropper

The famous designer and teacher Inge Druckrey said, “You can’t come up with ideas if you don’t see first.” This principle applies perfectly to business users’ relationship with data. You don’t know what you don’t know, and until business users can “see” or “play” with the data, they may not be able to fully articulate their business requirements. But most BI shops follow a waterfall system development life cycle (SDLC) that takes too long, is too inflexible to keep up with digitally empowered customers, and limits business involvement. There is no room in waterfall for the trial and error, exploration, or discovery essential to creating valuable business insights. One-third of enterprise users report that fast-changing analytic and reporting requirements are their firm’s top challenge when orchestrating their BI strategy.[1] In response, Forrester sees application development and delivery (AD&D) pros working on BI initiatives evolving their approaches to be more agile by:

›› Empowering business users to self-author the majority of BI content. AD&D pros can help foster BI agility by deploying highly visual and intuitive Agile BI tools. Business users (executives, managers, and individual contributors) who are already proficient in Excel can learn and start authoring content using BI tools in a matter of minutes or hours. But Forrester’s data shows there’s room for improvement: Only 53% of individual contributors are able to apply insights to operational processes and actions in a timely manner, compared with 61% of managers and 76% of C-suite executives.[2]

›› Embracing rapid and iterative prototyping to replace slow waterfall techniques. No one gets BI requirements right on the first attempt. Strong anecdotal evidence shows that, at best, business users can guess no more than a quarter to a third of the data sources, metrics, and ways they are going to utilize the information before they actually see a prototype. Luckily, modern BI tools can prototype a report or a dashboard in hours or even minutes, which is much faster than putting requirements on paper or using outdated whiteboarding or other manual prototyping techniques.

Agile BI Tools Address Development Agility For Only Parts Of The Full BI Stack

The days of technology-management-centric BI application development are numbered. Modern BI technologies cater directly to business users. However, these tools only address developing BI prototypes and applications for (see Figure 1):

›› The top layer of the BI stack: reporting, analytics, and dashboards. Leading Agile BI vendors like IBM, Information Builders, Microsoft, MicroStrategy, Oracle, Panorama, Qlik, SAP, SAS, Tableau Software, and TIBCO Software provide platforms where business users can self-author the majority of BI content.[3] These tools provide intuitive point-and-click and/or NLP GUIs that let users connect to a variety of data sources, automatically model the data, perform a few simple data integration functions, create metrics and KPIs, and visualize data in dashboards.[4] However, some of these tools offer all of the above functionality only for data sets that are small enough to be loaded into memory, usually under 100 GB. Larger data sets require a disk-based data model and storage, technology and processes that are still the realm of technology management professionals like data modelers and database administrators.

›› The data preparation stage of the BI process. The same business focus revolution that already happened in BI is also happening in data management via technology referred to as “data prep.” AD&D pros can still use powerful, large, enterprise-grade ETL tools, like Informatica or IBM DataStage.[5] Additionally, most BI tools provide a few basic data prep features that business users can leverage. Now there’s also a third option: Vendors like Alteryx, Paxata, and Trifacta provide ETL-like products that focus on business users, who can develop data prep processes feeding applications built in multiple BI tools.[6]

›› The source data discovery phase. ETL and data curation processes require source-to-target mapping. But given the explosion of data sources, manual mapping is too slow; this is the realm of data profiling tools. These tools programmatically scan data sources and display information about them, such as most common values, data sparsity, data ranges, and outliers. Most ETL tools, such as those from Ab Initio Software, IBM (DataStage), and Informatica, come with robust data profiling capabilities, as do the data prep tools. Newly emerging data catalog products like Alation or Waterline Data also specialize in data profiling, especially on big data sources.

FIGURE 1 Agile BI, Data Preparation, And Profiling Capabilities Of Selected Vendors
[Table comparing three platform types (Agile BI; user-focused data preparation; Agile BI with built-in data profiling) on three capabilities (Agile BI, user-focused data preparation, and data profiling), rated Yes, No, Light, or Comprehensive. Representative vendors for user-focused data preparation: Alteryx, Datawatch, IBM (DataWorks), Informatica (Rev), Lavastorm Analytics, Paxata, Tamr, Trifacta. Representative vendors for Agile BI with built-in data profiling: Attivio, Oracle (Big Data Discovery).]
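To make the data profiling idea above concrete, here is a minimal sketch in Python of the kind of per-column scan such tools run: sparsity, most common values, range, and outliers. It is an illustration only, not how any of the named products work. The function name `profile_column`, the use of `None` for missing values, and the 1.5 × IQR outlier rule are all assumptions made for the example.

```python
from collections import Counter
from statistics import quantiles


def profile_column(values, top_n=3):
    """Summarize one column: sparsity, most common values, range, and outliers."""
    present = [v for v in values if v is not None]
    summary = {
        # Share of missing (None) entries in the column.
        "sparsity": 1 - len(present) / len(values) if values else 0.0,
        "most_common": Counter(present).most_common(top_n),
        "range": None,
        "outliers": [],
    }
    numeric = sorted(
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )
    if numeric:
        summary["range"] = (numeric[0], numeric[-1])
    if len(numeric) >= 2:
        # Assumed outlier rule: values more than 1.5 * IQR beyond the quartiles.
        q1, _, q3 = quantiles(numeric, n=4)
        spread = 1.5 * (q3 - q1)
        summary["outliers"] = [
            v for v in numeric if v < q1 - spread or v > q3 + spread
        ]
    return summary


if __name__ == "__main__":
    # Hypothetical order amounts with one missing value and one suspicious entry.
    print(profile_column([10, 12, 11, 13, 12, 500, None, 12]))
```

A source-to-target mapping exercise could run a scan like this over every column of every candidate source before any ETL is designed, which is the "pre-ETL" role profiling plays in the process described above.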

DWA Tools Crack Open The Floodgates

For larger data sets and where complex transformations are required, you need a more comprehensive end-to-end BI architectural stack (see Figure 2). This typically requires integration of at least three types of development tools (ETL, data modeling, and BI) and includes nine different steps (see Figure 3). Even with Agile BI and Agile data prep tools you address only a portion of the end-to-end cycle; the rest still requires the slow waterfall approach. What if AD&D pros and their BI business colleagues could run the entire cycle from a single platform and apply Agile principles? Forrester tracks five vendors in a BI category we call DWA that can automate most, not just one or two, steps of the BI development cycle (see Figure 4).[7] They are:

›› Attunity (Compose). Attunity’s Compose platform provides a model-driven approach for end-to-end DWA, where the data warehouse design and ETL are automatically generated from a logical data model. The architect connects the model to data sources, and Compose automatically generates ETL and all the physical data models from landing areas to the data warehouse to data marts, supporting a variety of DW design patterns. Once Attunity Compose is architected and deployed, BI center of excellence/competency center staff can use Attunity Visibility, a data usage analytics platform, to monitor and analyze performance of the DW and BI platforms for various usage patterns. As a last step in the DWA process, Attunity Compose can programmatically generate semantic layers or input files for Microsoft PowerPivot, Qlik, and Tableau Software.

›› Birst. First and foremost, Birst is a complete insights platform that is based on a DWA architecture. Birst’s DWA takes a data-driven approach, where the data source definition programmatically generates the target schema. In addition to covering most requirements for self-service Agile BI and data visualization, Birst offers a few truly unique features. Its underlying ROLAP capabilities, which are not unique in the enterprise BI landscape but seldom seen in Agile BI platforms, provide declarative capabilities that support metadata and data reuse versus building siloed applications. The platform also includes large enterprise performance-enhancing capabilities like multitiered caching and aggregate awareness. Additionally, unlike other cloud-based solutions, Birst does not require all data to be moved to the cloud. Instead, the platform can directly query data on-premises and combine it with analytic-ready data in the Birst cloud via query federation that supports a combination of hybrid cloud plus on-premises deployment scenarios.

›› Magnitude Software (Kalido Information Engine). The platform offers a top-down, requirements-driven business information modeling solution, coupled with automation in all phases of DW design, build, and ongoing operation. This approach helps bridge the communications gap between technology management and business pros to ensure your DW meets the analytical needs of the business and remains current. In addition to rapid design, build, and deployment, Kalido Information Engine includes a master data management capability to enable managing hierarchies and reference data. This capability ensures that only curated data is loaded into the warehouse. Kalido extends its automation out to the BI layer and automatically generates BI semantic layers for popular BI tools, including SAP BusinessObjects, IBM Cognos, Microsoft Analysis Services, and Qlik. Magnitude also offers expertise and prebuilt capabilities to extract data from Oracle E-Business Suite, PeopleSoft, and JD Edwards operational applications.

›› TimeXtender (Data Discovery Hub). The product is composed of an ODS, a DW, and a series of templatized models for several industry verticals and popular ERP data sources including Infor, Microsoft, and SAP.[8] TimeXtender specializes in providing DWA for Microsoft SQL Server. It automates the tasks required to model and govern data. In addition to building and maintaining a warehouse, you can use it to create a data discovery hub that enables business users to access data on their own time; maintain governance, security, and control; and reduce the backlog of technology requests. TimeXtender also extends its automation out to the BI layer and automatically generates BI semantic layers for popular BI tools, including Qlik, Microsoft Power BI and SSAS, and Tableau Software.

›› WhereScape (3D and RED). WhereScape RED DWA is based on both top-down and data-driven approaches; the latter, where data source definitions programmatically generate the target schema, is more popular with WhereScape clients. It also offers a unique capability: Its 3D product can profile and analyze data sources, a “pre-ETL” step. The vendor boasts the broadest support for database engines, including Azure SQL Data Warehouse, EMC Greenplum, IBM DB2 and Netezza, Microsoft Analytics Platform, Microsoft SQL Server, Oracle Exadata, and Teradata. Additionally, WhereScape is spearheading the big data trend and is the only DWA vendor that will generate the schema on Apache Hive or the Pivotal commercial distribution.
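The data-driven approach mentioned above, where a data source definition programmatically generates the target schema, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a description of any vendor's product: the type map, the `dw` and `src` schema names, the `load_ts` audit column, and both function names are hypothetical.

```python
# Hypothetical mapping from simple source type labels to target SQL types.
TYPE_MAP = {
    "int": "INTEGER",
    "float": "DOUBLE PRECISION",
    "str": "VARCHAR(255)",
    "date": "DATE",
}


def generate_target_ddl(source_name, columns, schema="dw"):
    """Build CREATE TABLE text for a target table that mirrors a source definition."""
    col_lines = [f"    {name} {TYPE_MAP[kind]}" for name, kind in columns]
    # Assumed convention: every generated table gets a load timestamp for auditing.
    col_lines.append("    load_ts TIMESTAMP")
    body = ",\n".join(col_lines)
    return f"CREATE TABLE {schema}.{source_name} (\n{body}\n);"


def generate_load_sql(source_name, columns, schema="dw"):
    """Build the matching load statement, so schema and ETL stay in sync."""
    names = ", ".join(name for name, _ in columns)
    return (
        f"INSERT INTO {schema}.{source_name} ({names}, load_ts)\n"
        f"SELECT {names}, CURRENT_TIMESTAMP FROM src.{source_name};"
    )


if __name__ == "__main__":
    # Hypothetical source definition discovered from a customer table.
    customer = [("customer_id", "int"), ("name", "str"), ("signup_date", "date")]
    print(generate_target_ddl("customer", customer))
    print(generate_load_sql("customer", customer))
```

Because both statements are derived from the same metadata, a change to the source definition regenerates the schema and the load together, which is the property that lets these platforms iterate on data models quickly.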

FIGURE 2 Big Data And Business Intelligence Hub-And-Spoke Architecture
[Diagram: sample hub-and-spoke BI architecture with three stages: ingest, move, and consume. Labeled elements: hub; spoke; data exploration; data discovery accelerators; BI; DM; EDW; virtual data warehouse; data sources; ad hoc interfaces; internal data sources; local lakes; data lake/data hub (distributed). Supporting layers: data management and integration; knowledge management (search portal); metadata for data lineage/impact analysis; data warehouse automation; data governance, BI on BI. Storage tiers: “cold” area (slow, less expensive Hadoop or Hadoop-like platform); “warm” area (faster, more expensive DBMS); “hot” area (fastest, most expensive in-memory).]

FIGURE 3 Multiple Platforms Are Required To Address Developing All Components Of A Full BI Stack

BI components:
1. Data sourcing: Source data profile and discovery.
2. Data curation: Data transformation, integration, cleansing, reconciliation, and aggregation.
3. Staging area/ODS: A single place to stage transactional and reference data from multiple sources. Requires building and maintaining a logical and a physical data model.
4. Enterprise data warehouse (EDW): Optimizes cross-enterprise transactional data for reporting and analysis. Keeps history by adding a time dimension. Needs functionality to handle slowly changing dimensions (customer/product name changes). Requires building and maintaining a logical and a physical data model.
5. Data marts: Specific subject matter extension of a data warehouse. Requires building and maintaining a logical and a physical data model.
6. OLAP cubes: Models and optimizes data for instantaneous slicing and dicing (analyzing data by various attributes). Requires building and maintaining a logical and a physical data model.
7. Semantic layer: Creates business-friendly definitions (versus cryptic database object names) for tables, columns, metrics, and key performance indicators (KPIs).
8. Metrics and KPIs: Precalculated, pre-aggregated values.
9. Reports and dashboards: Summary or detailed level data organized in tabular reports, banded reports, or data visualizations.

Platforms for each BI component (IT-pros-focused platforms; business user self-service and agility addressed by):
1. Data sourcing: Data profiling or extract, transform, load (ETL) with built-in data profiling; data profiling platforms.
2. Data curation: ETL, master data management (MDM), data quality; data preparation platforms and BI platforms’ data preparation features.
3. Staging area/ODS: Data modeling; not addressed unless a DWA platform is used.
4. Enterprise data warehouse (EDW): Data modeling; not addressed unless a DWA platform is used.
5. Data marts: Data modeling; not addressed unless a DWA platform is used.
6. OLAP cubes: Data modeling and BI with OLAP engines; partially addressed unless a DWA platform is used.
7. Semantic layer: BI; BI platforms or a DWA platform.
8. Metrics and KPIs: BI; BI platforms or a DWA platform.
9. Reports and dashboards: BI; BI platforms.

Percentage of total effort: 80% and 20%.
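The EDW row above notes the need to handle slowly changing dimensions, such as customer or product name changes. A minimal sketch of the Type 2 pattern, which keeps history by closing the old row and adding a new version, might look like the following in Python. The row layout (`key`, `valid_from`, `valid_to`) and the function name are assumptions for illustration. A real warehouse would do this in SQL or generated ETL rather than in in-memory dictionaries.

```python
from datetime import date


def apply_type2_change(dimension, key, new_attrs, as_of):
    """Apply a Type 2 change in place: close the current row, append a new version."""
    current = next(
        (row for row in dimension if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if all(current.get(name) == value for name, value in new_attrs.items()):
            return dimension  # No attribute changed, so no new version is needed.
        current["valid_to"] = as_of  # Close the old version; its history is kept.
        carried = {
            k: v for k, v in current.items() if k not in ("valid_from", "valid_to")
        }
    else:
        carried = {"key": key}
    # The new version starts where the old one ended and stays open (valid_to=None).
    dimension.append({**carried, **new_attrs, "valid_from": as_of, "valid_to": None})
    return dimension


if __name__ == "__main__":
    customers = []
    apply_type2_change(customers, 1, {"name": "Acme"}, date(2016, 1, 1))
    apply_type2_change(customers, 1, {"name": "Acme Corp"}, date(2016, 8, 16))
    for row in customers:
        print(row)
```

A Type 1 change, by contrast, would simply overwrite `name` on the existing row and lose the earlier value. The choice between the two determines whether reports can show the customer under the name it had at the time of each transaction.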

FIGURE 4 Data Warehouse Automation Capabilities Of Five Key Vendors
[Table comparing DWA platforms by approach (top-down, model-driven* or bottom-up, data-driven*) and by capability: data profiling; extract, transform, load (ETL); logical data model; physical data model; slowly changing dimensions; build ODS/EDW/DM; build aggregates/cubes; and integration with BI platforms (generate BI semantic layer). Recoverable entries include support for slowly changing dimension types ranging from Type 1 and 2 up to Type 1, 2, 3, and hybrid/Type 6 (1+2+3); star and snowflake schemas‡; and aggregates and cubes, where Birst native BI is aggregate-aware and aggregate awareness otherwise depends on the BI platform.†]

*Model-driven approach: Create a conceptual or a logical model first, then connect it to data sources. Data-driven approach: First identify data sources, then create a logical model that best fits the source data requirements.
†An aggregate-aware BI platform automatically optimizes SQL by redirecting a query with a “GROUP BY” statement to a table with precalculated aggregates.
‡Star and snowflake schemas are data models optimized for analysis; 3NF (third normal form) is a data model optimized for transaction processing.
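The aggregate awareness footnote above describes redirecting a “GROUP BY” query to a table of precalculated aggregates. The sketch below shows that routing decision in Python with an in-memory SQLite database. The table names (`sales`, `sales_by_region`), the `AGGREGATES` catalog, and the function names are hypothetical. Real BI platforms make this choice inside their query generators with far richer metadata.

```python
import sqlite3

# Hypothetical catalog: aggregate table -> the dimension columns it is grouped by.
AGGREGATES = {"sales_by_region": {"region"}}


def route_query(group_by, measure="amount", fact="sales"):
    """Return SUM SQL for the grouping, using an aggregate table when one covers it."""
    if not group_by:
        raise ValueError("group_by must name at least one column")
    cols = ", ".join(group_by)
    source = fact
    for table, dims in AGGREGATES.items():
        # An aggregate can answer the query if it keeps every requested column.
        if set(group_by) <= dims:
            source = table
            break
    return f"SELECT {cols}, SUM({measure}) FROM {source} GROUP BY {cols} ORDER BY {cols}"


def build_demo_db():
    """Create a small fact table and one precalculated aggregate over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER);
        INSERT INTO sales VALUES ('East', 'A', 10), ('East', 'B', 5), ('West', 'A', 7);
        CREATE TABLE sales_by_region AS
            SELECT region, SUM(amount) AS amount FROM sales GROUP BY region;
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for grouping in (["region"], ["product"]):
        sql = route_query(grouping)
        print(sql, "->", conn.execute(sql).fetchall())
```

Grouping by `region` is served from the small aggregate table, while grouping by `product` falls back to the fact table because no aggregate keeps that column. Both paths return the same totals the fact table would give. Only the amount of data scanned differs.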
