Principles Of Database Management


Cambridge University Press
978-1-107-18612-5, Principles of Database Management
Wilfried Lemahieu, Seppe vanden Broucke, Bart Baesens
Frontmatter

Principles of Database Management
The Practical Guide to Storing, Managing and Analyzing Big and Small Data

Principles of Database Management provides students with the comprehensive database management information to understand and apply the fundamental concepts of database design and modeling, database systems, data storage and the evolving world of data warehousing, governance and more. Designed for those studying database management for information management or computer science, this illustrated textbook has a well-balanced theory–practice focus and covers the essential topics, from established database technologies up to recent trends like Big Data, NoSQL and analytics. On-going case studies, drill-down boxes that reveal deeper insights on key topics, retention questions at the end of every section of a chapter, and connections boxes that show the relationship between concepts throughout the text are included to provide the practical tools to get started in database management.

Key features include:

Full-color illustrations throughout the text.

Extensive coverage of important trending topics, including data warehousing, business intelligence, data integration, data quality, data governance, Big Data and analytics.

An online playground with diverse environments, including MySQL for querying; MongoDB; Neo4j Cypher; and a tree structure visualization environment.

Hundreds of examples to illustrate and clarify the concepts discussed that can be reproduced on the book’s companion online playground.

Case studies, review questions, problems and exercises in every chapter.

Additional cases, problems and exercises in the appendix.

“Although there have been a series of classical textbooks on database systems, the new dramatic advances call for an updated text covering the latest significant topics, such as Big Data analytics, NoSQL and much more. Fortunately, this is exactly what this book has to offer. It is highly desirable for training the next generation of data management professionals.”
– Jian Pei, Simon Fraser University

“I haven’t seen an as up-to-date and comprehensive textbook for database management as this one in many years. Principles of Database Management combines a number of classical and recent topics concerning data modeling, relational databases, object-oriented databases, XML, distributed data management, NoSQL and Big Data in an unprecedented manner. The authors did a great job in stitching these topics into one coherent and compelling story that will serve as an ideal basis for teaching both introductory and advanced courses.”
– Martin Theobald, University of Luxembourg

“This is a very timely book with outstanding coverage of database topics and excellent treatment of database details. It not only gives very solid discussions of traditional topics such as data modeling and relational databases, but also contains refreshing contents on frontier topics such as XML databases, NoSQL databases, Big Data and analytics. For those reasons, this will be a good book for database professionals, who will keep using it for all stages of database studies and works.”
– J. Leon Zhao, City University of Hong Kong

“This accessible, authoritative book introduces the reader to the most important fundamental concepts of data management, while providing a practical view of recent advances. Both are essential for data professionals today.”
– Foster Provost, New York University, Stern School of Business

“This guide to big and small data management addresses both fundamental principles and practical deployment. It reviews a range of databases and their relevance for analytics. The book is useful to practitioners because it contains many case studies, links to open-source software, and a very useful abstraction of analytics that will help them choose solutions better. It is important to academics because it promotes database principles which are key to successful and sustainable data science.”
– Sihem Amer-Yahia, Laboratoire d’Informatique de Grenoble; Editor-in-Chief, The VLDB Journal (International Journal on Very Large DataBases)

“This book covers everything you will need to teach in a database implementation and design class. With some chapters covering Big Data, analytic models/methods and NoSQL, it can keep our students up to date with these new technologies in data management-related topics.”
– Han-fen Hu, University of Nevada, Las Vegas

Principles of Database Management
The Practical Guide to Storing, Managing and Analyzing Big and Small Data

Wilfried Lemahieu
KU Leuven, Belgium

Seppe vanden Broucke
KU Leuven, Belgium

Bart Baesens
KU Leuven, Belgium; University of Southampton, United Kingdom

University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
314–321, 3rd Floor, Plot 3, Splendor Forum, Jasola District Centre, New Delhi – 110025, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.
It furthers the University’s mission by disseminating knowledge in the pursuit of education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107186125
DOI: 10.1017/9781316888773

© Wilfried Lemahieu, Seppe vanden Broucke, and Bart Baesens 2018

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2018
Printed and bound in Great Britain by Clays Ltd, Elcograf S.p.A.

A catalog record for this publication is available from the British Library.

Library of Congress Cataloging-in-Publication Data
Names: Lemahieu, Wilfried, 1970– author. Broucke, Seppe vanden, 1986– author. Baesens, Bart, author.
Title: Principles of database management : the practical guide to storing, managing and analyzing big and small data / Wilfried Lemahieu, Katholieke Universiteit Leuven, Belgium, Seppe vanden Broucke, Katholieke Universiteit Leuven, Belgium, Bart Baesens, Katholieke Universiteit Leuven, Belgium.
Description: First edition. New York, NY : Cambridge University Press, 2018. Includes bibliographical references and index.
Identifiers: LCCN 2018023251 ISBN 9781107186125 (hardback : alk. paper)
Subjects: LCSH: Database management.
Classification: LCC QA76.9.D3 L454 2018 DDC 005.74–dc23
LC record available at https://lccn.loc.gov/2018023251

ISBN 978-1-107-18612-5 Hardback

Additional resources for this publication at www.cambridge.org/Lemahieu

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

BRIEF CONTENTS

About the Authors
Preface
Sober: 1000‰ Driven by Technology

Part I Databases and Database Design
1 Fundamental Concepts of Database Management
2 Architecture and Categorization of DBMSs
3 Conceptual Data Modeling Using the (E)ER Model and UML Class Diagram
4 Organizational Aspects of Data Management

Part II Types of Database Systems
5 Legacy Databases
6 Relational Databases: The Relational Model
7 Relational Databases: Structured Query Language (SQL)
8 Object-Oriented Databases and Object Persistence
9 Extended Relational Databases
10 XML Databases
11 NoSQL Databases

Part III Physical Data Storage, Transaction Management, and Database Access
12 Physical File Organization and Indexing
13 Physical Database Organization
14 Basics of Transaction Management
15 Accessing Databases and Database APIs
16 Data Distribution and Distributed Transaction Management

Part IV Data Warehousing, Data Governance, and (Big) Data Analytics
17 Data Warehousing and Business Intelligence
18 Data Integration, Data Quality, and Data Governance
19 Big Data
20 Analytics

Appendix Using the Online Environment
Glossary
Index


CONTENTS

About the Authors
Preface
Sober: 1000‰ Driven by Technology

Part I Databases and Database Design

1 Fundamental Concepts of Database Management
  1.1 Applications of Database Technology
  1.2 Key Definitions
  1.3 File versus Database Approach to Data Management
    1.3.1 The File-Based Approach
    1.3.2 The Database Approach
  1.4 Elements of a Database System
    1.4.1 Database Model versus Instances
    1.4.2 Data Model
    1.4.3 The Three-Layer Architecture
    1.4.4 Catalog
    1.4.5 Database Users
    1.4.6 Database Languages
  1.5 Advantages of Database Systems and Database Management
    1.5.1 Data Independence
    1.5.2 Database Modeling
    1.5.3 Managing Structured, Semi-Structured, and Unstructured Data
    1.5.4 Managing Data Redundancy
    1.5.5 Specifying Integrity Rules
    1.5.6 Concurrency Control
    1.5.7 Backup and Recovery Facilities
    1.5.8 Data Security
    1.5.9 Performance Utilities
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

2 Architecture and Categorization of DBMSs
  2.1 Architecture of a DBMS
    2.1.1 Connection and Security Manager
    2.1.2 DDL Compiler
    2.1.3 Query Processor
      2.1.3.1 DML Compiler
      2.1.3.2 Query Parser and Query Rewriter
      2.1.3.3 Query Optimizer
      2.1.3.4 Query Executor
    2.1.4 Storage Manager
      2.1.4.1 Transaction Manager
      2.1.4.2 Buffer Manager
      2.1.4.3 Lock Manager
      2.1.4.4 Recovery Manager
    2.1.5 DBMS Utilities
    2.1.6 DBMS Interfaces
  2.2 Categorization of DBMSs
    2.2.1 Categorization Based on Data Model
      2.2.1.1 Hierarchical DBMSs
      2.2.1.2 Network DBMSs
      2.2.1.3 Relational DBMSs
      2.2.1.4 Object-Oriented DBMSs
      2.2.1.5 Object-Relational/Extended Relational DBMSs
      2.2.1.6 XML DBMSs
      2.2.1.7 NoSQL DBMSs
    2.2.2 Categorization Based on Degree of Simultaneous Access
    2.2.3 Categorization Based on Architecture
    2.2.4 Categorization Based on Usage
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

3 Conceptual Data Modeling Using the (E)ER Model and UML Class Diagram
  3.1 Phases of Database Design
  3.2 The Entity Relationship Model
    3.2.1 Entity Types
    3.2.2 Attribute Types
      3.2.3.1 Domains
      3.2.3.2 Key Attribute Types
      3.2.3.3 Simple versus Composite Attribute Types
      3.2.3.4 Single-Valued versus Multi-Valued Attribute Types
      3.2.3.5 Derived Attribute Type
    3.2.4 Relationship Types
      3.2.4.1 Degree and Roles
      3.2.4.2 Cardinalities
      3.2.4.3 Relationship Attribute Types
    3.2.5 Weak Entity Types
    3.2.6 Ternary Relationship Types
    3.2.7 Examples of the ER Model
    3.2.8 Limitations of the ER Model
  3.3 The Enhanced Entity Relationship (EER) …
    …alization
    Categorization
    Aggregation
    Examples of the EER Model
    Designing an EER Model
  3.4 The UML Class Diagram
    3.4.1 Recap of Object Orientation
    3.4.2 Classes
    3.4.3 Variables
    3.4.4 Access Modifiers
    3.4.5 Associations
      3.4.5.1 Association Class
      3.4.5.2 Unidirectional versus Bidirectional Association
      3.4.5.3 Qualified Association
    3.4.6 Specialization/Generalization
    3.4.7 Aggregation
    3.4.8 UML Example
    3.4.9 Advanced UML Modeling Concepts
      3.4.9.1 Changeability Property
      3.4.9.2 Object Constraint Language (OCL)
      3.4.9.3 Dependency Relationship
    3.4.10 UML versus EER
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

4 Organizational Aspects of Data Management
  4.1 Data Management
    4.1.1 Catalogs and the Role of Metadata
    4.1.2 Metadata Modeling
    4.1.3 Data Quality
      4.1.3.1 Data Quality Dimensions
      4.1.3.2 Data Quality Problems
    4.1.4 Data Governance
  4.2 Roles in Data …
    Information Architect
    Database Designer
    Data Owner
    Data Steward
    Database Administrator
    Data Scientist
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

Part II Types of Database Systems

5 Legacy Databases
  5.1 The Hierarchical Model
  5.2 The CODASYL Model
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

6 Relational Databases: The Relational Model
  6.1 The Relational …
    6.1.1 Basic Concepts
    6.1.2 Formal Definitions
    6.1.3 Types of Keys
      6.1.3.1 Superkeys and Keys
      6.1.3.2 Candidate Keys, Primary Keys, and Alternative Keys
      6.1.3.3 Foreign Keys
    6.1.4 Relational Constraints
    6.1.5 Example Relational Data Model

  6.2 Normalization
    6.2.1 Insertion, Deletion, and Update Anomalies in an Unnormalized Relational Model
    6.2.2 Informal Normalization Guidelines
    6.2.3 Functional Dependencies and Prime Attribute Type
    6.2.4 Normalization Forms
      6.2.4.1 First Normal Form (1 NF)
      6.2.4.2 Second Normal Form (2 NF)
      6.2.4.3 Third Normal Form (3 NF)
      6.2.4.4 Boyce–Codd Normal Form (BCNF)
      6.2.4.5 Fourth Normal Form (4 NF)
  6.3 Mapping a Conceptual ER Model to a Relational Model
    6.3.1 Mapping Entity Types
    6.3.2 Mapping Relationship Types
      6.3.2.1 Mapping a Binary 1:1 Relationship Type
      6.3.2.2 Mapping a Binary 1:N Relationship Type
      6.3.2.3 Mapping a Binary M:N Relationship Type
      6.3.2.4 Mapping Unary Relationship Types
      6.3.2.5 Mapping n-ary Relationship Types
    6.3.3 Mapping Multi-Valued Attribute Types
    6.3.4 Mapping Weak Entity Types
    6.3.5 Putting it All Together
  6.4 Mapping a Conceptual EER Model to a Relational Model
    6.4.1 Mapping an EER Specialization
    6.4.2 Mapping an EER Categorization
    6.4.3 Mapping an EER Aggregation
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

7 Relational Databases: Structured Query Language (SQL)
  7.1 Relational Database Management Systems and SQL
    7.1.1 Key Characteristics of SQL
    7.1.2 Three-Layer Database Architecture
  7.2 SQL Data Definition …
    Key DDL Concepts
    DDL Example
    Referential Integrity Constraints
    DROP and ALTER Command
  7.3 SQL Data Manipulation Language
    7.3.1 SQL SELECT Statement
      7.3.1.1 Simple Queries
      7.3.1.2 Queries with Aggregate Functions
      7.3.1.3 Queries with GROUP BY/HAVING
      7.3.1.4 Queries with ORDER BY
      7.3.1.5 Join Queries
      7.3.1.6 Nested Queries
      7.3.1.7 Correlated Queries
      7.3.1.8 Queries with ALL/ANY
      7.3.1.9 Queries with EXISTS
      7.3.1.10 Queries with Subqueries in SELECT/FROM
      7.3.1.11 Queries with Set Operations
    7.3.2 SQL INSERT Statement
    7.3.3 SQL DELETE Statement
    7.3.4 SQL UPDATE …
  7.4 SQL Views
  7.5 SQL Indexes
  7.6 SQL Privileges
  7.7 SQL for Metadata Management
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

8 Object-Oriented Databases and Object Persistence
  8.1 Recap: Basic Concepts of OO
  8.2 Advanced Concepts of OO
    8.2.1 Method Overloading
    8.2.2 Inheritance
    8.2.3 Method Overriding
    8.2.4 Polymorphism and Dynamic Binding
  8.3 Basic Principles of Object Persistence
    8.3.1 Serialization

  8.4 OODBMS
    8.4.1 Object Identifiers
    8.4.2 ODMG Standard
    8.4.3 Object Model
    8.4.4 Object Definition Language (ODL)
    8.4.5 Object Query Language (OQL)
      8.4.5.1 Simple OQL Queries
      8.4.5.2 SELECT FROM WHERE OQL Queries
      8.4.5.3 Join OQL Queries
      8.4.5.4 Other OQL Queries
    8.4.6 Language Bindings
  8.5 Evaluating OODBMSs
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

9 Extended Relational Databases
  9.1 Limitations of the Relational Model
  9.2 Active RDBMS Extensions
    9.2.1 Triggers
    9.2.2 Stored Procedures
  9.3 Object-Relational RDBMS Extensions
    9.3.1 User-Defined Types
      9.3.1.1 Distinct Data Types
      9.3.1.2 Opaque Data Types
      9.3.1.3 Unnamed Row Types
      9.3.1.4 Named Row Types
      9.3.1.5 Table Data Types
    9.3.2 User-Defined Functions
    9.3.3 Inheritance
      9.3.3.1 Inheritance at Data Type Level
      9.3.3.2 Inheritance at Table Type Level
    9.3.4 Behavior
    9.3.5 Polymorphism
    9.3.6 Collection Types
    9.3.7 Large Objects
  9.4 Recursive SQL Queries
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

10 XML Databases
  10.1 Extensible Markup Language
    10.1.1 Basic Concepts
    10.1.2 Document Type Definition and XML Schema Definition
    10.1.3 Extensible Stylesheet Language
    10.1.4 Namespaces
    10.1.5 XPath
  10.2 Processing XML Documents
  10.3 Storage of XML Documents
    10.3.1 The Document-Oriented Approach for Storing XML Documents
    10.3.2 The Data-Oriented Approach for Storing XML Documents
    10.3.3 The Combined Approach for Storing XML Documents
  10.4 Differences Between XML Data and Relational Data
  10.5 Mappings Between XML Documents and (Object-)Relational Data
    10.5.1 Table-Based Mapping
    10.5.2 Schema-Oblivious Mapping
    10.5.3 Schema-Aware Mapping
    10.5.4 SQL/XML
  10.6 Searching XML Data
    10.6.1 Full-Text Search
    10.6.2 Keyword-Based Search
    10.6.3 Structured Search With XQuery
    10.6.4 Semantic Search With RDF and SPARQL
  10.7 XML for Information Exchange
    10.7.1 Message-Oriented Middleware
    10.7.2 SOAP-Based Web Services
    10.7.3 REST-Based Web Services
    10.7.4 Web Services and Databases
  10.8 Other Data Representation Formats
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

11 NoSQL Databases
  11.1 The NoSQL Movement
    11.1.1 The End of the “One Size Fits All” Era?

    11.1.2 The Emergence of the NoSQL Movement
  11.2 Key–Value …
    From Keys to Hashes
    Horizontal Scaling
    An Example: Memcached
    Request Coordination
    Consistent Hashing
    Replication and Redundancy
    Eventual Consistency
    Stabilization
    Integrity Constraints and Querying
  11.3 Tuple and Document Stores
    11.3.1 Items with Keys
    11.3.2 Filters and Queries
    11.3.3 Complex Queries and Aggregation with MapReduce
    11.3.4 SQL After All...
  11.4 Column-Oriented Databases
  11.5 Graph-Based Databases
    11.5.1 Cypher Overview
    11.5.2 Exploring a Social Graph
  11.6 Other NoSQL Categories
  Summary
  Key Terms
  Review Questions
  Problems and Exercises

Part III Physical Data Storage, Transaction Management, and Database Access

12 Physical File Organization and Indexing
  12.1 Storage Hardware and Physical Database Design
    12.1.1 The Storage Hierarchy
    12.1.2 Internals of Hard Disk Drives
    12.1.3 From Logical Concepts to Physical Constructs
  12.2 Record Organization
  12.3 File Organization
    12.3.1 Introductory Concepts: Search Keys, Primary, and Secondary File Organization
    12.3.2 Heap File Organization
    12.3.3 Sequential File Organization
    12.3.4 Random File Organization (Hashing)
      12.3.4.1 Key-to-Address Transformation
      12.3.4.2 Factors that Determine the Efficiency of Random File Organization
    12.3.5 Indexed Sequential File Organization
      12.3.5.1 Basic Terminology of Indexes
      12.3.5.2 Primary Indexes
      12.3.5.3 Clustered Indexes
      12.3.5.4 Multilevel Indexes
    12.3.6 List Data Organization (Linear and Nonlinear Lists)
      12.3.6.1 Linear Lists
      12.3.6.2 Tree Data Structures
    12.3.7 Secondary Indexes and Inverted Files
      12.3.7.1 Characteristics of Secondary Indexes
      12.3.7.2 Inverted Files
      12.3.7.3 Multicolumn Indexes
      12.3.7.4 Other Index Types
    12.3.8 B-Trees and B+-Trees
      12.3.8.1 Multilevel Indexes Revisited
      12.3.8.2 Binary Search Trees
      12.3.8.3 B-Trees
      12.3.8.4 B+-Trees
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

13 Physical Database Organization
  13.1 Physical Database Organization and Database Access Methods
    13.1.1 From Database to Tablespace
    13.1.2 Index Design
    13.1.3 Database Access Methods
      13.1.3.1 Functioning of the Query Optimizer
      13.1.3.2 Index Search (with Atomic Search Key)
      13.1.3.3 Multiple Index and Multicolumn Index Search
      13.1.3.4 Index-Only Access
      13.1.3.5 Full Table …

    13.1.4 Join Implementations
      13.1.4.1 Nested-Loop Join
      13.1.4.2 Sort-Merge Join
      13.1.4.3 Hash Join
  13.2 Enterprise Storage Subsystems and Business Continuity
    13.2.1 Disk Arrays and RAID
    13.2.2 Enterprise Storage Subsystems
      13.2.2.1 Overview and Classification
      13.2.2.2 DAS (Directly Attached Storage)
      13.2.2.3 SAN (Storage Area Network)
      13.2.2.4 NAS (Network Attached Storage)
      13.2.2.5 NAS Gateway
      13.2.2.6 iSCSI/Storage Over IP
    13.2.3 Business Continuity
      13.2.3.1 Contingency Planning, Recovery Point, and Recovery Time
      13.2.3.2 Availability and Accessibility of Storage Devices
      13.2.3.3 Availability of Database Functionality
      13.2.3.4 Data Availability
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

14 Basics of Transaction Management
  14.1 Transactions, Recovery, and Concurrency Control
  14.2 Transactions and Transaction Management
    14.2.1 Delineating Transactions and the Transaction Lifecycle
    14.2.2 DBMS Components Involved in Transaction Management
    14.2.3 The Logfile
  14.3 Recovery
    14.3.1 Types of Failures
    14.3.2 System Recovery
    14.3.3 Media Recovery
  14.4 Concurrency Control
    14.4.1 Typical Concurrency Problems
      14.4.1.1 Lost Update Problem
      14.4.1.2 Uncommitted Dependency Problem (aka Dirty Read Problem)
      14.4.1.3 Inconsistent Analysis Problem
      14.4.1.4 Other Concurrency-Related Problems
    14.4.2 Schedules and Serial Schedules
    14.4.3 Serializable Schedules
    14.4.4 Optimistic and Pessimistic Schedulers
    14.4.5 Locking and Locking Protocols
      14.4.5.1 Purposes of Locking
      14.4.5.2 The Two-Phase Locking Protocol (2PL)
      14.4.5.3 Cascading Rollbacks
      14.4.5.4 Dealing with Deadlocks
      14.4.5.5 Isolation Levels
      14.4.5.6 Lock Granularity
  14.5 The ACID Properties of Transactions
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

15 Accessing Databases and Database APIs
  15.1 Database System Architectures
    15.1.1 Centralized System Architectures
    15.1.2 Tiered System Architectures
  15.2 Classification of Database APIs
    15.2.1 Proprietary versus Universal APIs
    15.2.2 Embedded versus Call-Level APIs
    15.2.3 Early Binding versus …

  15.3 Universal Database …
    ODBC
    OLE DB and ADO
    ADO.NET
    Java DataBase Connectivity (JDBC)
    Intermezzo: SQL Injection and Access Security
    SQLJ
    Intermezzo: Embedded APIs versus Embedded DBMSs
    Language-Integrated Querying
  15.4 Object Persistence and Object-Relational Mapping APIs
    15.4.1 Object Persistence with Enterprise JavaBeans
    15.4.2 Object Persistence with the Java Persistence API
    15.4.3 Object Persistence with Java Data Objects
    15.4.4 Object Persistence in Other Host Languages
  15.5 Database API Summary
  15.6 Database Access in the World Wide Web
    15.6.1 Introduction: the Original Web Server
    15.6.2 The Common Gateway Interface: Toward Dynamic Web Pages
    15.6.3 Client-Side Scripting: The Desire for a Richer Web
    15.6.4 JavaScript as a Platform
    15.6.5 DBMSs Adapt: REST, Other Web Services, and a Look Ahead
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

16 Data Distribution and Distributed Transaction Management
  16.1 Distributed Systems and Distributed Databases
  16.2 Architectural Implications of Distributed Databases
  16.3 Fragmentation, Allocation, and Replication
    16.3.1 Vertical Fragmentation
    16.3.2 Horizontal Fragmentation (Sharding)
    16.3.3 Mixed Fragmentation
    16.3.4 Replication
    16.3.5 Distribution and Replication of Metadata
  16.4 Transparency
  16.5 Distributed Query Processing
  16.6 Distributed Transaction Management and Concurrency Control
    16.6.1 Primary Site and Primary Copy 2PL
    16.6.2 Distributed 2PL
    16.6.3 The Two-Phase Commit Protocol (2PC)
    16.6.4 Optimistic Concurrency and Loosely Coupled Systems
    16.6.5 Compensation-Based Transaction …
  16.7 Eventual Consistency and BASE Transactions
    16.7.1 Horizontal Fragmentation and Consistent Hashing
    16.7.2 The CAP Theorem
    16.7.3 BASE Transactions
    16.7.4 Multi-Version Concurrency Control and Vector Clocks
    16.7.5 Quorum-Based Consistency
  Summary
  Key Terms
  Review Questions
  Problems and Exercises

Part IV Data Warehousing, Data Governance, and (Big) Data Analytics

17 Data Warehousing and Business Intelligence
  17.1 Operational versus Tactical/Strategic Decision-Making
  17.2 Data Warehouse Definition
  17.3 Data Warehouse Schemas
    17.3.1 Star Schema

    17.3.2 Snowflake Schema
    17.3.3 Fact Constellation
    17.3.4 Specific Schema Issues
      17.3.4.1 Surrogate Keys
      17.3.4.2 Granularity of the Fact Table
      17.3.4.3 Factless Fact Tables
      17.3.4.4 Optimizing the Dimension Tables
      17.3.4.5 Defining Junk Dimensions
      17.3.4.6 Defining Outrigger Tables
      17.3.4.7 Slowly Changing Dimensions
      17.3.4.8 Rapidly Changing Dimensions
  17.4 The Extraction, Transformation, and Loading (ETL) Process
  17.5 Data Marts
  17.6 Virtual Data Warehouses and Virtual Data Marts
  17.7 Operational Data Store
  17.8 Data Warehouses versus Data Lakes
  17.9 Business Intelligence
    17.9.1 Query and Reporting
    17.9.2 Pivot Tables
    17.9.3 On-Line Analytical Processing (OLAP)
      17.9.3.1 MOLAP
      17.9.3.2 ROLAP
      17.9.3.3 HOLAP
      17.9.3.4 OLAP Operators
      17.9.3.5 OLAP Queries in SQL
  Summary
  Key Terms List
  Review Questions
  Problems and Exercises

18 Data Integration, Data Quality, and Data Governance
  18.1 Data and Process Integration
    18.1.1 Convergence of Analytical and Operational Data Needs
    18.1.2 Data Integration and Data Integration Patterns
      18.1.2.1 Data Consolidation: Extract, Transform, Load (ETL)
      18.1.2.2 Data Federation: Enterprise Information Integration (EII)
      18.1.2.3 Data Propagation: Enterprise Application Integration (EAI)
      18.1.2.4 Data Propagation: Enterprise Data Replication (EDR)
      18.1.2.5 Changed Data Capture (CDC), Near-Real-Time ETL, and Event Processing
      18.1.2.6 Data Virtualization
      18.1.2.7 Data as a Service and Data in the Cloud
    18.1.3 Data Services and Data Flows in the Context of Data and Process Integration
      18.1.3.1 Business Process Integration
      18.1.3.2 Patterns for Managing Sequence Dependencies and Data Dependencies in Processes
      18.1.3.3 A Unified View on Data and Process Integration
  18.2 Searching Unstructured Data and Enterprise Search
    18.2.1 Principles of Full-Text Search
    18.2.2 Indexing Full-Text Documents
    18.2.3 Web Search Engines
    18.2.4 Enterprise Search
  18.3 Data Quality and Master Data Management
  18.4 Data Governance
    18.4.1 Total Data Quality Management (TDQM)
    18.4.2 Capability Maturity Model Integration (CMMI)
    18.4.3 Data Management Body of Knowledge

    18.4.4 Control Objectives for Information and Related Technology (COBIT)
    18.4.5 Information Technology Infrastructure Library
  18.5 Outlook
  18.6 Conclusion
  Key Terms List
  Review Questions
  Problems and Exercises

19 Big Data
  19.1 The 5 Vs of Big Data
  19.2 Hadoop
    19.2.1 History of Hadoop
    19.2.2 The Hadoop Stack
      19.2.2.1 The Hadoop Distributed File System
      19.2.2.2 MapReduce
      19.2.2.3 Yet Another Resource Negotiator
  19.3 SQL on Hadoop
    19.3.1 HBase: The First Database on Hadoop
    19.3.2 Pig
    19.3.3 Hive
  19.4 Apache Spark
    19.4.1 Spark Cor
    20.4.5 Outlier Detection and Handling
