Common Data Validation Use Cases - Informatica

Transcription

Common Data Validation Use Cases Source to Target Testing: Data is Transformed ETL Validation Data Masking Production to Development Testing: Data is the same ETL version upgrade ETL Migration Application retirement Application upgrades Production Reconciliation Data Warehousing MDM, Operational integration Audit/Balance/Control solutionsPowerCenter Data Validation Option Overview 2011 Informatica. Company Confidential.1121

Current Data Validation ApproachProblems with Manual Validation Takes a long time and is expensive Currently Data Validation is done manually bywriting SQL scripts Time is spent writing queries and waiting for them to run Error-prone manual process Customers estimate Data Validation should take 30% of all hours spent on Data Integration Cannot perform thorough testing Time/Cost pressure leads to “try it here and there” approach Most customers admit they do not do enough data validation,resulting in poorer data quality and higher project risk The tester runs out of time/money before testing is done Usual problems associated with writing custom code PowerCenter upgrades take up to 6 months No audit trail No reuse No methodology It takes one day to upgrade the ETL software342

Data Validation Option OverviewData Validation Option Architecture1 Tool built on top of Informatica PowerCenterDVO is used todefine Test Rules3DVO is usedto display results5ComprehensiveReporting on allTests and Results Users define data rules using easy-to-understand GUIPC APIApplications Data is processed and evaluated using PowerCenterResultsDBDatabases Results are displayed in the GUI and stored for laterretrieval and reportingData Formats4All results arestored in theResults DBReal timeLegacy52PowerCenter mappings are generatedSession is executed63

Customer Example #1Customer Example #2Large Financial InstitutionLarge Health Care OrganizationSituation Need to “Balance” 350 tables being loaded into Data Warehouse onnightly basis to ensure what is loaded is accurateSituation Upgrade from PowerCenter 8.1 to 8.5. Need to test 1 full day of data(30 million rows) across old and new systemsPrior to DVO Manual testing via developer-written mappings & PL/SQL to test 350DW tables. 20 hours to write/test each mapping. Not scalablePrior to DVO Projected 6-8 weeks of manual testing to ensure upgrade wassuccessful with no regressionsResults with DVO Development effort is 3 hours per table. 1 hour for many tables. Can test all data, vs. just a subset previously Can prove to auditors that validation is complete Increased confidence from business that financial data is correctResults with DVO Setup of DVO took 1 day Running full day of data took 1 day Upgrade testing complete in 2 days with full coverage of data784

Data Validation Option Benefits Increased likelihood of project success, lower project risk Significant cost savings, faster time to market– 50% source-to-target testing– 80% regression testing– 90% upgrade testing Ability to test all data, not just a small sample Ability to test in heterogeneous environments No need to know SQL Complete Audit Trail and comprehensive reporting of alltesting activities No need to acquire additional server technology: leveragePowerCenter’s scalability, platform support and data access9105

3 5 Data Validation Option Overview Tool built on top of Informatica PowerCenter Users define data rules using easy-to-understand GUI PC API Data is processed and evaluated using PowerCenter Results are displayed in the GUI and stored for later retrieval and reporting 6 Data Valida