Using Conceptual Data Modeling To Ensure High Information And Data Quality

Transcription

MIT Information Quality Industry Symposium, July 16-17, 2008

The MIT 2008 Information Quality Industry Symposium
Using Conceptual Data Modeling to ensure high Information and Data Quality
Pete Stiglich
Senior .com
www.EWSolutions.com
Strategic Partner & Systems Integrator
2008 Enterprise Warehousing Solutions, Inc. (EWSolutions)
Intelligent Business Intelligence(sm)

EWSolutions’ Background
EWSolutions is a Chicago-headquartered strategic partner and full life-cycle systems integrator providing both award winning strategic consulting and full-service implementation services. This combination affords our clients a full range of services for any size enterprise information management, managed meta data environment, and/or data warehouse/business intelligence initiative. Our notable client projects have been featured in the Chicago Tribune, Federal Computer Weekly, Crain’s Chicago Business, and won the 2004 Intelligent Enterprise’s RealWare award, 2007 Excellence in Information Integrity Award nomination and DM Review’s 2005 World Class Solutions award.
2007 Excellence in Information Integrity Award Nomination
Best Business Intelligence Application, Information Integration. Client: Department of Defense
World Class Solutions Award, Data Management
For more information on our Strategic Consulting Services, Implementation Services, or World-Class Training, call toll free at 866.EWS.1100, 866.397.1100, main number 630.920.0005 or email us at Info@EWSolutions.com

Professional Profile / Contact Information
Pete Stiglich is a Senior Consultant with EWSolutions with nearly 25 years of IT experience in the fields of Data Modeling, Data Warehousing, Business Intelligence, meta data Management, Data Integration, Customer Relationship Management (CRM), Customer Data Integration (CDI), Database Design and Administration, Data Quality, and Transaction Processing. Pete has architected Enterprise Information Management solutions for diverse industries such as Insurance, Credit Card, Medical, Retail, Banking, Manufacturing, Telecom, and Government.
Pete has developed and taught courses on Dimensional Data Modeling, Conceptual Data Modeling, ER/Studio, and SQL. Pete has presented for DAMA at the international and local level, as well as at the 2007 IADQ Conference. Pete’s articles on Data Architecture have been published in Real World Decision Support, DMForum, InfoAdvisors, and the Information and Data Quality Newsletter. Pete is a listed expert in SearchDataManagement on the topics of data modeling and data warehousing.
For the current issue of Real World Decision Support see: http://www.ewsolutions.com/resource-center/rwds
.com Phone: 602-284-0992

EWSolutions’ Partial Client List
Arizona Supreme Court; Bank of Montreal; BankUnited; Basic American Foods; Becton, Dickinson and Company; Blue Cross Blue Shield companies; Branch Banking & Trust (BB&T); British Petroleum (BP); California DMV; College Board; Corning Cable Systems; Countrywide Financial; Defense Logistics Agency (DLA); Delta Dental; Department of Defense (DoD); Driehaus Capital Management; Eli Lilly and Company; Federal Aviation Administration; Federal Bureau of Investigation (FBI); Fidelity Information Services; Ford Motor Company; GlaxoSmithKline; Harris Bank; The Hartford; Harvard Pilgrim HealthCare; Health Care Services Corporation; Hewitt Associates; HP (Hewlett-Packard); Information Resources Inc.; International Paper; Janus Mutual Funds; Johnson Controls; Key Bank; LiquidNet; Loyola Medical Center; Manulife Financial; Mayo Clinic; Microsoft; National City Bank; Nationwide; Neighborhood Health Plan; NORC; Physicians Mutual Insurance; Pillsbury; Quintiles; Sallie Mae; Schneider National; Secretary of Defense/Logistics; South Orange County Community College; SunTrust Bank; Target Corporation; The Regence Group; Thomson Multimedia (RCA); United Health Group; United States Air Force; United States Navy; United States Transportation Command; USAA; Wells Fargo; Wisconsin Department of Transportation; Zurich Cantonal Bank

What will we talk about?
Data Models and Data/Information Quality
What is a Conceptual Data Model?
Benefits of Conceptual Data Models for Information Quality
Developing the Conceptual Data Model
Phased modeling approach (conceptual, logical, physical)
Conceptual Data Model expressiveness

Data Models and Quality

Information and Data Quality
Information and Data Quality is a huge issue for every business, government, or institution.
Poor Information and Data Quality affects every type of information system – OLTP or decision support
Often leads to a lack of confidence in, and credibility of, IT and IT systems.

Information and Data Quality
What is Data Quality?
Accurate, complete, and valid data that is captured, stored and maintained according to business requirements.
What is Information Quality?
First, what is the difference between data and information?

What is Information?
[Figure: a column of raw postal code values (column names “pst cd” and “plcyhldr postalcd”; sample values such as A7E 9U1, B6T 0X9, 85016-0341 and XXX XXX) shown under four panels: DATA, DATA QUALITY, DATA CONTEXT (Policy, Policy Holder Postal Code; related entities Insured and Claim), and METADATA.]
METADATA: “The postal code of the policyholder. Only Canadian postal codes in the format of “ANA NAN” are allowed. Only 1 space is allowed between the 2 sections of the postal code. (A represents an Alpha character, N represents a Numeric character). If the Postal Code is unknown or invalid, the value “XXX XXX” will be substituted in order to facilitate filtering and data correction.”

Information Quality
Information Quality allows us to ask (and answer with confidence) questions such as:
How many unique customers do we have across all lines of business?
What geography would be the best to focus on for a new marketing campaign?
What are patterns to look for in order to identify a potential disease outbreak?
Etc.
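The postal code rule quoted in the metadata example above is precise enough to be checked mechanically. The following is a minimal sketch of such a check, not something taken from the presentation: the function name `standardize_postal_code` is hypothetical, and restricting "Alpha" to uppercase A-Z is an assumption, since the metadata text does not say how lowercase letters should be treated.

```python
import re

# "ANA NAN": A = alpha character, N = numeric character, with exactly
# one space between the two sections (rule taken from the metadata text).
# Assumption: only uppercase letters count as "Alpha".
POSTAL_CODE_PATTERN = re.compile(r"[A-Z][0-9][A-Z] [0-9][A-Z][0-9]")

# Substitute value the metadata prescribes for unknown or invalid codes.
UNKNOWN_POSTAL_CODE = "XXX XXX"


def standardize_postal_code(raw):
    """Return raw unchanged if it matches ANA NAN, else the substitute value."""
    if raw is not None and POSTAL_CODE_PATTERN.fullmatch(raw):
        return raw
    return UNKNOWN_POSTAL_CODE


print(standardize_postal_code("A7E 9U1"))     # A7E 9U1
print(standardize_postal_code("85016-0341"))  # XXX XXX (not in ANA NAN format)
print(standardize_postal_code("A7E  9U1"))    # XXX XXX (two spaces)
print(standardize_postal_code(None))          # XXX XXX (unknown)
```

Without the metadata, a value such as 85016-0341 is just a string; with it, the value can be recognized as invalid and routed for correction.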

Information and Data Quality
There are many causes of poor Data Quality
Lack of system constraints when data is originally captured
Focus on quantity not quality (let’s get these projects done as quickly as possible, and move on to the next thing)
Poor data management practices, e.g. authorization, archival
Programmatic bugs
Lack of management support for Data Governance and Stewardship
Data Profiling tool not acquired/used!
Lack of automated audits and alerts when actual/potential data quality events occur
Etc.

Information and Data Quality
There are many causes of poor Information Quality
Stovepiped, independent data marts – different people get different numbers for the same data
Lack of an integrated Enterprise Data Warehouse, with dependent data marts
Data not structured in an easy to use format (e.g. Dimensional) that can help prevent misunderstandings
Users directly querying (e.g. via SQL tools) databases
Lack of a Managed Meta Data Environment (MME)
  What does this data mean?
  Where did it originate from?
  What were the conditions of the data at the time of the query – e.g. were any loads delayed?
Lack of Data Governance and Stewardship
Etc.

Information and Data Quality
However, an often overlooked cause of poor information and data quality is:

Data models and quality
Data models are often an afterthought or developed only to meet immediate requirements.
Data Models are often developed by application developers or DBAs – not by Data Architects.
It is very common (and very bad practice) to see physical data models being the only data model developed for a system. Better practice is to develop a logical model before a physical – but this is still not BEST practice!!
Physical data models are optimized for performance – NOT for understandability. Often, foreign key relationships are not utilized in Physical Data Models – making the physical model difficult to understand.

Data models and quality
The physical data model, forward engineered to become the database schema, may be in place for years or decades!!!
Often much easier to change a program than to change a data model once a system is operational (or even while still in development)
Ergo, data models should be developed with due rigor following industry best practices
Best Practice is to use a phased modeling approach – conceptual, logical, and finally physical models

Data models and quality
What are some of the data and information quality issues that can arise from poor data models?
The application does not meet business expectations. Rework often required.
The model may meet the immediate needs of the application but may miss the larger needs of the enterprise.
M:M relationships may be missed which can lead to significant data duplication/missing data and increased development and maintenance costs
Business rules not identified, or not identified well. Business exceptions not identified possibly causing system outages.
If cardinality, optionality not properly identified, database constraints may be configured inaccurately leading to data quality problems.
If relationship identification not properly captured, granularity may be affected: data not being captured at the detail necessary, other problems.
Lack of good business meta data (attributes in business terms, business descriptions, identified data steward, etc.)
More

Data models and quality
A bad data architecture practice is developing Physical Data Models without developing Conceptual and Logical Data Models first
IT needs to “Resist the Urge” to design physical (and logical) data models first.

Resisting the Urge
What does this mean?
There is a tendency to build physical data models first and ask questions later!!
Not uncommon to see database schemas being developed in tandem with the application development process
These models may meet initial requirements but break down when additional requirements and functionality are identified
These models often allow or even force Data Quality problems to creep in
Need to develop a conceptual data model as the first step of a phased modeling approach and use the conceptual data model as a tool to validate and communicate understanding of business requirements with the business

Causal factors
Lack of data modeling experience and training
IT professionals often don’t feel productive unless they’re “doing something” – e.g. developing a database or writing code.
Temptation to cut corners when management wants things done yesterday
Designing and creating databases is fun!! Why did we get into IT but to design and build systems?
In IT, there are many ways that something can be accomplished – not each way is equal in value

Result
Systems which may not fully meet business requirements
Physical structures that may initially be easy to load and query but over time become more difficult to use
Poor data quality!
Maintenance headaches
Inflexible for future change
Longer load cycles
Etc.
END RESULT: Unsatisfied customers, increased expense, lack of confidence in IT, etc.

Example
SCENARIO:
A pet hospital chain that performs services and sells products started a CRM (Customer Relationship Management) undertaking and began capturing information about customers and their pets in a CDI (Customer Data Integration) Hub
Also wanted to track household activity. Last name and address used for determining a household. A household is comprised of 1 or many customers.
Data to be used for targeted marketing campaigns
Wanted to be able to track multiple addresses per customer.
Per business requirements, a Customer had only 1 household id

What’s wrong with this picture?
CUSTOMER: CUSTOMER ID, FIRST NAME, MIDDLE NAME, LAST NAME, HOUSEHOLD ID
PET: PET ID, CUSTOMER ID (FK), PET NAME, PET TYPE, PET BREED, HOUSEHOLD ID

What’s wrong with this picture?
ADDRESS: ADDRESS ID, CUSTOMER ID (FK), ADDRESS TYPE ID (FK), ADDRESS LINE1, ADDRESS LINE2, CITY, STATE, POSTAL CODE, COUNTRY
ADDRESS TYPE: ADDRESS TYPE ID, ADDRESS TYPE DESC
CUSTOMER: CUSTOMER ID, FIRST NAME, MIDDLE NAME, LAST NAME, HOUSEHOLD ID
PET: PET ID, CUSTOMER ID (FK), PET NAME, PET TYPE, PET BREED, HOUSEHOLD ID
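The slides pose the question without spelling out the answer. One candidate problem, offered here only as an illustration, is that HOUSEHOLD ID is stored on both CUSTOMER and PET, so nothing stops the two copies from disagreeing. The sketch below reproduces a cut-down version of the two tables in an in-memory SQLite database (the lowercase table and column names, the sample rows, and the choice of SQLite are all assumptions) and shows that the database accepts a pet whose household differs from its owner's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    customer_id  INTEGER PRIMARY KEY,
    last_name    TEXT NOT NULL,
    household_id INTEGER NOT NULL
);
CREATE TABLE pet (
    pet_id       INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer (customer_id),
    pet_name     TEXT NOT NULL,
    household_id INTEGER NOT NULL  -- repeats a fact already held on customer
);
""")

# Hypothetical sample rows: the pet is recorded in a different household
# from its owner, and no constraint in the schema objects.
conn.execute("INSERT INTO customer VALUES (1, 'Smith', 100)")
conn.execute("INSERT INTO pet VALUES (10, 1, 'Rex', 999)")

# The inconsistency can only be found after the fact, by querying for it.
mismatches = conn.execute("""
    SELECT p.pet_id, c.household_id, p.household_id
    FROM pet AS p
    JOIN customer AS c ON c.customer_id = p.customer_id
    WHERE p.household_id <> c.household_id
""").fetchall()
print(mismatches)  # [(10, 100, 999)]
```

A model that records the household in one place only would make this kind of mismatch impossible to store, rather than something ETL code has to detect and repair.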

Example
FACTORS:
Developers assumed they understood the business – they interviewed the customer
A CDM was not created due to many factors such as lack of data modeling expertise and tight deadlines.
Was incredibly difficult to make changes to the model
This “proof of concept” required very extensive modification in order for the business to have some confidence in it. It was eventually outsourced!

Example
END RESULT:
Duplication all over the place, requiring unnecessarily complex processing and longer ETL processing windows
Took heroic effort and a long time to adjust the system for changing business requirements – CMM Level 0!
Excessive maintenance programming
The business rules had to be enforced primarily in the ETL and SQL and not in the database!
The poor data model forced data quality problems into the system
The data model didn’t fulfill its “enforcement” role – enforcing good data quality through the data model!!

Headache
As the old saying goes “An ounce of prevention is worth a pound of cure”
Taking additional time up front to understand the business and develop conceptual data models helps:
  Prevent assumptions which lead to data, information quality problems
  Uncover “gotchas” that can surface later – fewer “OH SHOOT” moments
  Reduce development and maintenance costs

7 Habits
One of the habits in Stephen Covey’s “7 Habits of Highly Effective People” that is commonly quoted is “Begin with the end in mind”
This makes great sense for many things but for good data modeling, start with the beginning in mind with an eye to the end (e.g. to limit scope for the CDM effort)
Understand the business first and finally build physical structures (with many steps and iterations of steps in between)
Understand the business first by developing a CDM, and review the CDM with the business

What is the Conceptual Data Model?

What is a Conceptual Data Model?
A diagram identifying real world concepts/objects/things (entities) and the relationships between these in order to gain, reflect, and document understanding of the business (as-is & to-be), in order to:
  foster semantic reconciliation
  improve business/IT collaboration
  serve as a framework for the development of information systems

What is a Conceptual Data Model?
“A conceptual entity-relationship model shows how the business world sees information. It suppresses non-critical details in order to emphasize business rules and user objects. It typically includes only significant entities which have business meaning, along with their relationships.”
Applied Information Science website

What is a Conceptual Data Model?
‘A data model that represents an abstract view of the real world. A conceptual model represents the human understanding of a system. A conceptual data model describes how relevant information is structured in the natural world. In other words, it is how the human mind is accustomed to thinking of the information.’
OECD Glossary of Statistical Terms

What is a Conceptual Data Model?
It is “stateless” - NOT a state model
The entire possible lifecycle of a relationship should be represented, per current business practice
This includes business exceptions!! (Not exceptions due to poor data quality or due to system limitations)
The CDM should reflect the business – not IT systems
Review optionality and cardinality to ensure c

Where does it fit in the big picture?
Even if not an ECDM, still need to follow similar progression
[Figure: Zachman Framework (partial)]

Semantic Resolution
A CDM is a key tool for semantic resolution
For enterprise applications, have to reach consensus across divisions, departments, external agencies, etc., for naming and defining data entities, and identifying correct relationships.
Semantic resolution is a key activity of Data Governance and Stewardship, and an ECDM is a key enabler of Data Governance and Stewardship – these activities often take place in tandem, iteratively
Difficult to have Information Quality if synonyms, homonyms haven’t been resolved. E.g. Is a customer a party that has placed an order, or can customer be a party who placed an order or a party that might become a paying customer?

Semantic Resolution
Due to fundamental differences with the LDM, the CDM often has to be contained in a separate model file and so there is a risk that lineage from a logical entity to a conceptual entity can be lost
Be sure to save the association between conceptual and logical entities, logical and physical entities, etc. using:
  A meta data repository and related tool which can be used to establish these relationships
  User defined meta data properties within the model
  Spreadsheet, etc. (last resort)
CDMs can help drive creation of a common, corporate lexicon – fostering improved communication, standardization – BENEFICIAL TO THE ENTIRE ENTERPRISE – NOT JUST IT!
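Whatever tool holds the conceptual-to-logical-to-physical associations described above, the information being preserved is a simple cross-reference. The sketch below shows one minimal shape it could take. Every name in it (`LineageLink`, the entity names, the table names) is hypothetical and chosen only for illustration; a real meta data repository would carry far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageLink:
    """One row of a conceptual -> logical -> physical cross-reference."""
    conceptual_entity: str
    logical_entity: str
    physical_table: str


# Hypothetical lineage entries, for illustration only.
LINEAGE = [
    LineageLink("Customer", "Party", "PARTY"),
    LineageLink("Customer", "Party Role", "PARTY_ROLE"),
    LineageLink("Order", "Order Header", "ORDER_HDR"),
]


def conceptual_for_table(table):
    """Conceptual entities that a physical table traces back to."""
    return sorted({link.conceptual_entity
                   for link in LINEAGE if link.physical_table == table})


def tables_for_conceptual(entity):
    """Physical tables that implement a conceptual entity."""
    return sorted({link.physical_table
                   for link in LINEAGE if link.conceptual_entity == entity})


print(conceptual_for_table("PARTY_ROLE"))  # ['Customer']
print(tables_for_conceptual("Customer"))   # ['PARTY', 'PARTY_ROLE']
```

Being able to answer both questions, from table back to business concept and from concept forward to tables, is what keeps the lineage from being lost when the CDM lives in a separate model file.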

Semantic Resolution
[Cartoon: Gary Larson – The Far Side]

Developing the Conceptual Data Model

Getting started developing a CDM
A major hurdle is separating “data thinking” from “process thinking”
For conceptual data modeling, we’re thinking about “what” (data) not the “how” (process).
For a CDM – data is a relative term
Data may not exist currently for a conceptual entity – but an entity must be included in the CDM if it is an object of importance to the business
This – not this

Getting started developing a CDM
When interviewing the business, it is helpful to use a “recipe” analogy (see Steve Hoberman design challenge *). A recipe identifies the ingredients, utensils, equipment (whats) and has directions (hows) in order to meet the desired goal.
If the interviewee focuses on process ask “What things are needed for the XYZ process?” “What are the components of the XYZ process?”
Helpful starting place is to identify “nouns”, e.g. Customer, Product, Inventory
* DMReview January 2008, quoting Geof Clark

When is a CDM finished?
“Perfection does not come into being, when nothing more can be added, but when nothing can be taken away”
Antoine de Saint-Exupéry

IE Notation
Information Engineering
[Figure: annotated IE diagram. Entity Organization (Organization ID as primary key; Organization Name; Taxpayer ID (AK) as alternate key) and entity System (System ID; Organization ID; System Name), joined by the verb phrase “Utilizes”. Callouts label: Entity Name, Primary Key, Attributes, Alternate Key, Verb Phrase, Cardinality (One, Many), Mandatory or Optional, Identifying or Non-Identifying.]

Entity (type)
In the CDM, a real-world object of interest to the business
Don’t include associative entities or entities that mirror a database table (unless it corresponds to a real-world business object)
[Figure: Identifying Relationship between ORDER (independent entity, square corners; ORDER NUMBER) and ORDER LINE (dependent entity, rounded edges; ORDER NUMBER (FK), ORDER LINE NUMBER).]
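Once a model reaches the physical stage, the ORDER / ORDER LINE identifying relationship above typically shows up as a child primary key that includes the parent's key. The sketch below illustrates that under stated assumptions: it uses an in-memory SQLite database, lowercase column names, and the table name `order_header` in place of ORDER (because ORDER is a reserved word in SQL); none of this comes from the presentation. It shows the database rejecting an order line that has no parent order, as well as a duplicate line number within the same order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE order_header (
    order_number INTEGER PRIMARY KEY
);
CREATE TABLE order_line (
    -- The parent's key is part of the child's primary key:
    -- this is what makes the relationship identifying.
    order_number      INTEGER NOT NULL REFERENCES order_header (order_number),
    order_line_number INTEGER NOT NULL,
    PRIMARY KEY (order_number, order_line_number)
);
""")
conn.execute("INSERT INTO order_header VALUES (1001)")
conn.execute("INSERT INTO order_line VALUES (1001, 1)")


def try_insert(sql):
    """Run an INSERT and report whether the constraints let it through."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


orphan = try_insert("INSERT INTO order_line VALUES (2002, 1)")     # no such order
duplicate = try_insert("INSERT INTO order_line VALUES (1001, 1)")  # line 1 again
print(orphan, duplicate)  # rejected rejected
```

Here the data model itself, rather than ETL or application code, enforces that a line cannot exist without its order and that line numbers are unique within an order.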

Identification in Relationships
An identifying relationship is stronger - helps determine the meaning and granularity of a child entity. Is always mandatory. (NOTE: a solid line in a M:M relationship does not denote an identifying relationship!!)
A non identifying relationship may be mandatory or optional, but does not define meaning/granularity
[Figure: Identifying Relationship: ORDER HEADER (ORDER NUMBER) to ORDER LINE (ORDER NUMBER (FK), ORDER LINE NUMBER). Non-identifying Relationship: ORDER TYPE (ORDER TYPE ID) to ORDER HEADER (ORDER NUMBER, ORDER TYPE ID (FK)).]

Relationship Verb Phrase
Describes the relationship using business terminology
Ca
