Intelligent Data Hub - Semarchy

Intelligent Data Hub
The Next Generation of Smart Data Management

Dave Wells
October 2019

Research Sponsored by

This publication may not be reproduced or distributed without prior permission from Eckerson Group.

About the Author

Dave Wells is the Data Management Practice Director at Eckerson Group, a business intelligence and analytics research and consulting organization. He brings a unique perspective to data management based on five decades of working with data in both technical and business roles. Dave works at the intersection of information management and business management, where real value is derived from data assets. He is an industry analyst, consultant, and educator dedicated to building meaningful and enduring connections throughout the path from data to business value. Knowledge sharing and skills development are Dave’s passions, carried out through consulting, speaking, teaching, and writing. He is a continuous learner, fascinated with understanding how we think, and a student and practitioner of systems thinking, critical thinking, design thinking, divergent thinking, and innovation.

About Eckerson Group

Eckerson Group helps organizations get more value from data and analytics. Our experts each have more than 25 years of experience in the field. Data and analytics is all we do, and we’re good at it! Our goal is to provide organizations with a cocoon of support on their data journeys. We do this through online content (thought leadership), expert onsite assistance (full-service consulting), and 30 courses on data and analytics topics (educational workshops).

Get more value from your data. Put an expert on your side. Learn what Eckerson Group can do for you!

Eckerson Group 2019
www.eckerson.com

Table of Contents

Executive Summary
The Trouble with Data Management
  The Fragmentation of Data
  The Fragmentation of Data Management Tools
  The Urgency: Why You Should Care Now
What Is the Intelligent Data Hub?
  Intelligent Data Hub Defined
  Advantages of the Intelligent Data Hub
Data Management in the Hands of the Business
  Business Taking Ownership of Data
  Technology to Support Business Ownership of Data
Intelligent Data Hub Use Cases
  Customer B2B, Legal Entities, Business Partners, and Suppliers
  Customer B2C, Citizens, and Households
  Employees, HR, Business Units
  Products
  Financial Instruments, Services, Hierarchies, and Securities
  Assets
  Places
Intelligent Data Hub as a Single Platform
Intelligent Data Hub in the Ecosystem
Migrating to the Intelligent Data Hub
Conclusion
About Eckerson Group
About the Sponsor

Executive Summary

Smart data management must become a core competency for smart business. Every business today depends on data. High-value opportunities are abundant when data is well managed, but the risks that accompany poor data management are equally high. To maximize value and minimize risk, we need to take a new approach to data management: an approach that overcomes the difficulties created by fragmented data and technology, and that puts data management in the hands of the business.

The business needs to manage its data assets with the same discipline and rigor as its financial assets, and it needs the tools to manage data without deep technical knowledge. Intelligent data management is a key component when it offers a single platform for governance, master data, reference data, data quality, data enrichment, and workflows. Intelligent data management uses artificial intelligence and machine learning algorithms to discover and understand data and to graph relationships among data sources, an important capability in shifting from data silos to a data hub. The Intelligent Data Hub is a software platform with a robust set of data management capabilities to discover, integrate, manage, and govern data. It spans multiple applications to measure and monitor data quality, process efficiency, and other data management outcomes.

This report describes the challenges of modern data management, the urgency of finding a new approach, and the capabilities of an Intelligent Data Hub to meet the challenges and become a catalyst for a new approach to data management.

The Trouble with Data Management

Data has become a critical asset in the age of data-driven business. Data management is a modern strategic imperative that is at least as critical as financial management. Yet data management has become complex and challenging. New sources and ever-expanding volumes lead to increasingly scattered data. At the same time, continuous technology evolution has created a world of fragmented data management tools. Connecting the dots, breaking down barriers, and reducing fragmentation: these are today’s pressing data management challenges. And they are certainly not abstract. They are real and very tangible: Incomplete, incorrect, and poorly integrated data has a high cost. For example, IBM estimated the cost of poor data at $3.1 trillion in 2016 (three years ago!) in the United States alone.

The Fragmentation of Data

Data management complexity increases as the variety of data types, databases, deployment platforms, and data use cases grows. Enterprise data that once resided entirely in on-premises relational databases is now widely distributed across networks and stored at multiple locations and in many kinds of databases. The data warehousing and master data management (MDM) objectives of the past (breaking down silos) have been displaced as new kinds of data and systems have emerged. When business engages in self-service business intelligence (BI) and takes ownership of applications and databases, the work of data management becomes more complex. Business units may acquire and implement applications without IT participation (as when a marketing department implements Marketo, for example). As software-as-a-service (SaaS) applications are implemented and data moves to the cloud, data silos proliferate. Each SaaS application manages its own data. ERP systems, CRM systems, and specialty applications for marketing, sales, and other business functions each maintain their own databases.

Figure 1. A Complex Modern Data Landscape

A typical data landscape today includes on-premises data as well as data in multiple cloud platforms. (See figure 1.) Integrating data and applications across today’s multi-cloud and on-premises/hybrid data landscape brings new challenges and requires a new approach to data management.

Data fragmentation is compounded by redundant, inconsistent, and sometimes conflicting data stored in data warehouses, data lakes, and MDM repositories. (See figure 2.) Each of these independent efforts to integrate data leads to new data silos and new kinds of disintegration.

Figure 2. Data Fragmentation and Redundancy

The problem of data fragmentation and redundancy is frequently severe. A subtle but important point in figure 2 is the label “Data Warehouses”: note that this is plural, not singular. Our recent surveys found that nearly 60% of organizations operate between two and five data warehouses. Fewer than 10% of respondents had zero or just one data warehouse. This means that more than 30% have six or more data warehouses.

The Fragmentation of Data Management Tools

Fragmentation in data management extends beyond data fragmentation to include the tools and technologies of data management. Data management processes are inefficient and error-prone when supported by a collection of disconnected tools that lack interoperability. The lifecycle of data from acquisition to consumption includes many processes and activities. (See figure 3.) Data discovery, data refinement, data quality management, data integration, data governance, data access, data analysis, metadata management, master data management, and many other functions depend on technology. Many of these functions, such as governance and metadata management, are threaded through multiple proprietary tools. Tool fragmentation leads directly to management fragmentation: disconnected processes that are direct causes of oversights, errors, and inefficiencies.

Figure 3. The Many Processes of Data Management

In this complex data management world, old-style MDM is clearly inadequate. Data-dependent organizations need a single, integrated tool to provide comprehensive standardization and governance of all core data assets, including master data, reference data, and widely shared transactional data.

The Urgency: Why You Should Care Now

Data management has become a critical responsibility, with the stakes rising rapidly as organizations become more data dependent. Modernizing data management is strategic and it is urgent, with impacts on three fronts: risk, revenue, and costs. (See figure 4.)

Figure 4. Business Benefits of Intelligent Data Management

Regulatory compliance requirements increase the risk profile of data management as more regulations, changing regulations, and greater complexity of regulations add challenges to compliance and risk management. New revenue opportunities emerge when well-managed data offers new insights, drives innovation, and helps the business to understand consumer expectations and meet new market demands. Operational costs are reduced when modernization decreases demand for and lowers the cost of middleware and complex integration systems and processes.

What Is the Intelligent Data Hub?

Intelligent Data Hub Defined

The Intelligent Data Hub is a software platform with a robust set of data management capabilities to discover, integrate, manage, and govern data. It spans multiple applications to measure and monitor data quality, process efficiency, and other data management outcomes. Semarchy describes the Intelligent Data Hub as follows: “The Intelligent Data Hub brings together the critical information that lives across applications such that it can be governed, mastered, and managed in a centrally understood, non-disruptive way.”

Bringing together all of the critical information is a significant shift from the single-domain concept of legacy MDM. Typical old-style MDM solutions manage a master hub for customer data, another hub for product data, yet another for account data, and so on. This approach creates new kinds of data silos: domain silos instead of application silos. The Intelligent Data Hub manages a single master hub for all core data assets (see figure 5) encompassing customers, organizations, products, accounts, locations, assets, addresses, employees, facilities, machines and devices, suppliers, business partners, and much more: all of the parties, places, and things for which data is shared across applications, databases, and business processes. Consistent design, central management, and recognition of relationships among master data domains bring substantial advantages for data discovery and data governance.

Figure 5. A Single Data Hub for All Core Data Assets

Advantages of the Intelligent Data Hub

The Intelligent Data Hub is clearly a more robust solution than traditional MDM, as it manages multiple domains of master and reference data in a single hub. Beyond centralization and consistency, the hub offers several distinct advantages:

• Core data assets are managed and governed without disrupting business workflows, processes, and applications. With hub intelligence, the application of rules and policies is woven into the activities of working with data instead of being executed as separate parallel processes.
• The hub acts as a mediator between applications, helping to resolve conflicting views of data assets and the things that they represent. In addition to data integration, it connects applications through data with a modern approach using real-time REST APIs to achieve integration speed and reliability that are impossible with traditional data and application integration techniques. The traditional hard line between analytical and operational data is removed, driving greater consistency among analytics and operations information resources.
• Central implementation and enforcement of data quality rules improves data quality and increases the efficiency of data collection and update processes.
• Central implementation and enforcement of data governance policies reduces risk, protects data from corruption and loss, helps to ensure regulatory compliance, and reduces the cost and complexity of governance.
• Standardization of core data improves consistency and traceability, simplifies data integration, and reduces the amount of data preparation that is needed for analytics.
• Enrichment of core data creates more robust master data, enables true 360-degree views, increases the analytic value of data, and better informs business people and business processes.
• Providing a single location for business glossaries and data definitions makes the definitions more accessible, increases their use, and improves data understanding throughout the organization.

Data Management in the Hands of the Business

Business Taking Ownership of Data

One of the positive trends in data management today is the shift from IT-centric to business-centric data ownership and management. Business organizations are recognizing data as a critical business asset and are taking on responsibility to govern and actively manage data quality. This trend naturally aligns with the self-service revolution in BI and analytics, with non-technical business people directly engaging with data.

Technology to Support Business Ownership of Data

The Intelligent Data Hub supports business ownership and management of data, embracing the concept that business people are the experts and the policy makers for data usage, data quality, and data governance. The hub is designed and built for non-technical business people, providing:

• A business-friendly and intuitive user experience.
• Ability to understand and act upon the data.
• Collaborative data definition, policy making, and data governance.

Intelligent Data Hub Use Cases

Seven data hub use cases (see figure 6) cover a wide range of master and reference data common to virtually every enterprise. Analytics applications share the same master data, reference data, business glossary, and data definitions. Operational applications use the hub to ensure semantic consistency across the enterprise.

Figure 6. Intelligent Data Hub Use Cases

Customer B2B, Legal Entities, Business Partners, and Suppliers

This data model is centered on legal entity (company) information augmented with information elements such as addresses, sites, hierarchies, market segments, contacts, financials, history, channels, risk metrics, markets, preferences, contracts, sales, service levels, ratings, loyalty, legal structure, compliance, and other interactions and observations.

Business processes that can be managed in the hub include new company creation, match and merge, deduplication, basic data quality, data survivorship and consolidation, external data enrichment, company data lifecycle, legal hierarchy management, data quality/standardization, counter-party risk assessment (credit risk, legal/IP risk, financial risk), GDPR, CCPA, HIPAA, marketing segmentation, surveys, retention, loyalty, and predictive analytics processes.

Customer B2C, Citizens, and Households

This data model is typically centered on the physical person (consumer) information augmented with information elements such as addresses, channels, contact points, households, financials, credit risk, segmentation, credit history, legal data, preferences, social behavior, sales, contracts, and compliance.

Business processes that can be managed in the hub include new person creation, match and merge, deduplication, basic data quality, data survivorship and consolidation, self-service preferences, personal data lifecycle, external data enrichment, household management, advanced data quality/standardization, GDPR, CCPA, privacy, opt-in/out processes, credit risk, legal risk, financial assessments, marketing segmentation, surveys, retention, loyalty, predictive analytics, recommendations, and multi-channel strategy.

Employees, HR, Business Units

This data model centers on employee data augmented with information elements such as addresses, reporting hierarchies, cost centers, business units, legal IDs, employment history, insurance, benefits, authentication, contracts, talents and skills, performance, salaries and bonuses, taxes, travel and expenses, recommendations, procurement, and social activity.

Business processes that can be managed in the hub include new employee creation, match and merge, deduplication, basic data quality, data survivorship and consolidation, self-service enrichment, hiring approvals, multi-application provisioning, off-boarding, job mapping, reorg, reporting hierarchies, cost optimization, authentication, business continuity, taxes, environment, health, legal, privacy, screening assessments, GDPR, cost center assignment, performance assessments, talent management, HR cost optimization, policy enforcement, and referral programs.

Products

This data model centers on product (or part/SKU) data augmented with information elements such as extensible attributes, families and taxonomies, bills of materials, rich content (media, images, videos, PDFs, etc.), variants (sizes, colors, etc.), markets, geographies, pricing, lifecycle, branding and IP, legal data, compliance data, marketing campaigns, costs of sales, social metrics, inventory, supply chain information, sales, recommendations, and call center statistics.

Business processes that can be managed in the hub include product creation, deduplication, consolidation, partner enrichment, classification, media enrichments, configuration, bills of materials, lifecycle management, hierarchies, data quality, pricing, transfer prices, regulatory compliance, licensing, IP, environment, health, formulation, performance monitoring, profitability, quality assurance, satisfaction and loyalty surveys, recommendations, and social campaigns.

Financial Instruments, Services, Hierarchies, and Securities

This data model is centered on the financial instrument and is augmented with information elements such as securities, tickers, policies, contracts, coverages, risks, costs, rating factors, hierarchies, and pricing.

Business processes that can be managed in the hub include instrument creation, deduplication, consolidation, collaborative information completeness, pricing model enrichment, classification, configuration, lifecycle management, hierarchies, advanced data quality, legal compliance, performance monitoring, profitability, trading transparency, and planning.

Assets

This data model is centered on assets and is augmented with information elements such as location, cost center, amortization, tagging, insurance, debt, credits, and risks.

Business processes that can be managed in the hub include asset creation, deduplication, consolidation, classification, configuration, lifecycle and timeline management, hierarchies, advanced data quality, financial compliance, legal compliance, amortizations, maintenance, and planning.

Places

This data model is centered on location and is augmented with information elements such as geography, point of interest, administrative areas, public facilities, terrain, elevation, schedules, maintenance, rich content (pictures, videos, etc.), contacts, facilities costs, business continuity, insurance, sales performance, taxes, legal data, compliance, quality assurance, supply chain, web traffic, planning, and social activity.

Business processes that can be managed in the hub include location creation, deduplication, consolidation, information completeness, location public directory, marketing and media enrichments, internal technical support, store/plant/warehouse readiness, business continuity, tax assessment, legal risks, leases, financial assessment, privacy, GDPR or CCPA compliance, environmental compliance, health compliance, location advertising, external exposure, business performance optimization, sales impact analysis, and brand awareness.

Intelligent Data Hub as a Single Platform

The Intelligent Data Hub is a platform for discovery, governance, management, integration, and measurement of data shared across many applications and business processes. (See figure 7.)

Figure 7. Primary Functions of the Intelligent Data Hub

Discover. Data discovery has become an essential first step in the path from data to business value. With the explosion of data sources, volumes, and types, most organizations have data about which they have only limited knowledge. In support of data discovery, the Intelligent Data Hub connects to data sources, profiles the data, and makes the profiles available to the people and processes that work with data. Profiles include statistical metadata and describe column contents including highest value, lowest value, most frequent value, average value, percent unique, percent null, and number of data patterns. Data stewards and data quality managers use the profiles to identify data quality issues, perform root cause analysis, and identify data constraints and associated rules. Business users and data analysts use the profiles to help them understand the data and to derive appropriate governance rules.

Govern. Data governance addresses both data protection and data usefulness. Protection includes both security and regulatory compliance. Usefulness includes quality of data and quality of definitions. The Intelligent Data Hub includes a data governance application with features to define and enforce security policies, data validation rules, data standardization rules, and matching rules for deduplication and golden record creation. The governance application also supports a collaborative environment with an enterprise glossary to define, describe, and manage business terms, policies, processes, and metrics.

Manage. Data management includes all of the administrative processes for data acquisition, validation, processing, storage, and protection to ensure that data is accessible, reliable, and timely. Resolving identities, preventing duplicate records, managing hierarchies, and ensuring data quality are among the most important data management activities. Intelligent Data Hub functions include those for master data management (MDM), reference data management (RDM), and application data management (ADM). Legacy technologies that treat MDM, ADM, and RDM each in isolation from the others add complexities and inefficiencies to data management processes. It simply makes sense to manage them collectively. The hub also supports data management applications and dashboards to inform and assist people with data management responsibilities.

Integrate. Integration capabilities in the hub include the generation and management of APIs (both REST and SQL) to enable data and application integration. Integration patterns are built into the hub to support batch and real-time integration, ETL, and data virtualization. Data and application integration can be performed across complex ecosystems that include on-premises, cloud, multi-cloud, and hybrid deployments.

Measure. Peter Drucker once said “you can’t manage what you can’t measure,” and this principle applies to data as strongly as to any other business assets and processes. Intelligent Data Hub measures and metrics include the data profile metrics previously described, data quality metrics, and process performance metrics. Metrics stored in the data hub are exposed through APIs and may be accessed, processed, and reported by external applications, dashboards, and data visualization tools.

Intelligent Data Hub in the Ecosystem

The Intelligent Data Hub fits neatly into broader data ecosystems in a way that brings structure and consistency to the management of master and reference data. (See figure 8.) It connects to operational systems and databases to acquire data and populate the hub. It manages master and reference data in the hub, providing all of the capabilities for discovery, integration, enrichment, and quality management. It supports collaborative data stewardship that allows business users, data owners, and data stewards to create high-quality golden records. It governs master and reference data to protect data from corruption and loss and to enhance data usefulness and value. Value is realized when data is used and frequently reused. The hub facilitates reuse by allowing operational systems, analytical applications, and shared data resources such as data warehouses and data lakes to connect to trustworthy and high-quality master data and reference data. It is easy to see how the Intelligent Data Hub becomes a centerpiece of the enterprise data ecosystem.

Figure 8. Intelligent Data Hub in the Ecosystem
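
The idea of a golden record can be illustrated with a small Python sketch of match, merge, and survivorship. This is a simplified illustration under stated assumptions, not how any particular hub product works. The matching rule (same normalized name and postal code) and the survivorship rule (the most recently updated non-null value wins for each field) are both hypothetical choices, as are the function names and the sample records.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical matching rule: same normalized name and postal code."""
    name = " ".join(record["name"].lower().split())
    return (name, record["postal_code"])


def build_golden_records(records):
    """Group matching records, then keep the newest non-null value per field."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)

    golden = []
    for group in groups.values():
        # Oldest first, so newer non-null values overwrite older ones.
        ordered = sorted(group, key=lambda r: r["updated"])
        merged = {}
        for record in ordered:
            for field, value in record.items():
                if value is not None:
                    merged[field] = value
        golden.append(merged)
    return golden


# Hypothetical source records from two applications.
records = [
    {"name": "Acme  Corp", "postal_code": "10001", "phone": "555-0100",
     "email": None, "updated": "2019-03-01"},
    {"name": "acme corp", "postal_code": "10001", "phone": None,
     "email": "info@acme.example", "updated": "2019-06-15"},
    {"name": "Globex", "postal_code": "60601", "phone": "555-0199",
     "email": None, "updated": "2019-01-20"},
]
golden = build_golden_records(records)
print(golden)
```

Real matching rules are usually fuzzier than an exact key, and survivorship is often decided per field by source trust rather than by recency alone. The sketch only shows the overall shape of the process: group candidate duplicates, then consolidate each group into one record.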

Migrating to the Intelligent Data Hub

Moving to an Intelligent Data Hub may seem like a significant endeavor, but it can be graceful and non-disruptive if undertaken as a planned step-by-step migration. (See figure 9.)

Figure 9. The Migration Process

Business case. Begin by describing the business case for moving to an Intelligent Data Hub. Consider benefits such as business ownership of master and reference data, business value of consistent and trustworthy data, ease of data discovery, and confidence in security and compliance.

Technical case. Support the business case by describing the technical benefits of moving to an Intelligent Data Hub. Consider the efficiency gains of managing data on a single platform, integrating data across multi-cloud and hybrid environments, and simplifying both data and application integration.

Technology selection. Perform the due diligence to select the right technology for an Intelligent Data Hub. Develop a short list (tip: it is likely to be a very short list) of technologies that provide all of the features and functions needed, and that maximize the use of smart algorithms for data discovery, data enrichment, and data governance.

Proof of value. Undertake a proof-of-value project with goals to quickly learn the technology and to confidently estimate time, cost, and ROI.

Incremental migration. Avoid the pain and risk of “big bang” implementation by migrating one data domain at a time. Plan and sequence the increments based on business impact and the ability to fully engage business stakeholders and subject matter experts.

Organizational and cultural shift. Throughout the migration and afterward, pay attention to the human elements of data management. Business ownership of data is sure to change roles, responsibilities, and relationships. Awareness and active management of the human dimension is key to success.

Conclusion

• More data
• Faster data
• Better data quality
• Stronger integration
• Semantic integrity
• Trustworthy data

These are the expectations for the very near future of data management. Consider starting your Intelligent Data Hub journey today. A new approach to data management is both imperative and urgent.

About Eckerson Group

Wayne Eckerson, a globally known author, speaker, and advisor, formed Eckerson Group to help organizations get more value from data and analytics. His goal is to provide organizations with a cocoon of support during every step of their data journeys.

Today, Eckerson Group helps organizations in three ways:

• Our thought leaders publish practical, compelling content that keeps you abreast of the latest trends, techniques, and tools in the data analytics field.
• Our consultants listen carefully, think deeply, and craft tailored solutions that translate your business requirements into compelling st
