Database Management Systems (2nd Ed.)

Transcription

CONTENTSPREFACEPart I1BASICS1INTRODUCTION TO DATABASE SYSTEMS1.11.21.31.41.53OverviewA Historical PerspectiveFile Systems versus a DBMSAdvantages of a DBMSDescribing and Storing Data in a DBMS1.5.1 The Relational Model1.5.2 Levels of Abstraction in a DBMS1.5.3 Data IndependenceQueries in a DBMSTransaction Management1.7.1 Concurrent Execution of Transactions1.7.2 Incomplete Transactions and System Crashes1.7.3 Points to NoteStructure of a DBMSPeople Who Deal with DatabasesPoints to Review457891011141515161718182021THE ENTITY-RELATIONSHIP i2.22.32.4Overview of Database Design2.1.1 Beyond the ER ModelEntities, Attributes, and Entity SetsRelationships and Relationship SetsAdditional Features of the ER Model2.4.1 Key Constraints2.4.2 Participation Constraints2.4.3 Weak Entities2.4.4 Class Hierarchies2.4.5 Aggregationvii

viiiDatabase Management Systems2.52.62.7338394041434445THE RELATIONAL 82833.23.33.43.53.63.73.8Part II4Conceptual Database Design With the ER Model2.5.1 Entity versus Attribute2.5.2 Entity versus Relationship2.5.3 Binary versus Ternary Relationships *2.5.4 Aggregation versus Ternary Relationships *Conceptual Design for Large Enterprises *Points to ReviewIntroduction to the Relational Model3.1.1 Creating and Modifying Relations Using SQL-92Integrity Constraints over Relations3.2.1 Key Constraints3.2.2 Foreign Key Constraints3.2.3 General ConstraintsEnforcing Integrity ConstraintsQuerying Relational DataLogical Database Design: ER to Relational3.5.1 Entity Sets to Tables3.5.2 Relationship Sets (without Constraints) to Tables3.5.3 Translating Relationship Sets with Key Constraints3.5.4 Translating Relationship Sets with Participation Constraints3.5.5 Translating Weak Entity Sets3.5.6 Translating Class Hierarchies3.5.7 Translating ER Diagrams with Aggregation3.5.8 ER to Relational: Additional Examples *Introduction to Views3.6.1 Views, Data Independence, Security3.6.2 Updates on ViewsDestroying/Altering Tables and ViewsPoints to ReviewRELATIONAL QUERIESRELATIONAL ALGEBRA AND CALCULUS4.14.2PreliminariesRelational Algebra4.2.1 Selection and Projection4.2.2 Set Operations4.2.3 Renaming4.2.4 Joins4.2.5 Division4.2.6 More Examples of Relational Algebra Queries899191929394969799100

ixContents4.34.44.55Relational Calculus4.3.1 Tuple Relational Calculus4.3.2 Domain Relational CalculusExpressive Power of Algebra and Calculus *Points to Review106107111114115SQL: QUERIES, PROGRAMMING, .105.115.125.13About the ExamplesThe Form of a Basic SQL Query5.2.1 Examples of Basic SQL Queries5.2.2 Expressions and Strings in the SELECT CommandUNION, INTERSECT, and EXCEPTNested Queries5.4.1 Introduction to Nested Queries5.4.2 Correlated Nested Queries5.4.3 Set-Comparison Operators5.4.4 More Examples of Nested QueriesAggregate Operators5.5.1 The GROUP BY and HAVING Clauses5.5.2 More Examples of Aggregate QueriesNull Values *5.6.1 Comparisons Using Null Values5.6.2 Logical Connectives AND, OR, and NOT5.6.3 Impact on SQL Constructs5.6.4 Outer Joins5.6.5 Disallowing Null ValuesEmbedded SQL *5.7.1 Declaring Variables and Exceptions5.7.2 Embedding SQL StatementsCursors *5.8.1 Basic Cursor Definition and Usage5.8.2 Properties of CursorsDynamic SQL *ODBC and JDBC *5.10.1 Architecture5.10.2 An Example Using JDBCComplex Integrity Constraints in SQL-92 *5.11.1 Constraints over a Single Table5.11.2 Domain Constraints5.11.3 Assertions: ICs over Several TablesTriggers and Active Databases5.12.1 Examples of Triggers in SQLDesigning Active Databases5.13.1 Why Triggers Can Be Hard to Understand

xDatabase Management Systems5.146167168168QUERY-BY-EXAMPLE (QBE)1776.16.2IntroductionBasic QBE Queries6.2.1 Other Features: Duplicates, Ordering AnswersQueries over Multiple RelationsNegation in the Relation-Name ColumnAggregatesThe Conditions Box6.6.1 And/Or QueriesUnnamed ColumnsUpdates6.8.1 Restrictions on Update CommandsDivision and Relational Completeness *Points to Review177178179180181181183184185185187187189DATA STORAGE AND INDEXING1936.36.46.56.66.76.86.96.10Part III75.13.2 Constraints versus Triggers5.13.3 Other Uses of TriggersPoints to ReviewSTORING DATA: DISKS AND 2122142142162182182192217.27.37.47.57.67.7The Memory Hierarchy7.1.1 Magnetic Disks7.1.2 Performance Implications of Disk StructureRAID7.2.1 Data Striping7.2.2 Redundancy7.2.3 Levels of Redundancy7.2.4 Choice of RAID LevelsDisk Space Management7.3.1 Keeping Track of Free Blocks7.3.2 Using OS File Systems to Manage Disk SpaceBuffer Manager7.4.1 Buffer Replacement Policies7.4.2 Buffer Management in DBMS versus OSFiles and Indexes7.5.1 Heap Files7.5.2 Introduction to IndexesPage Formats *7.6.1 Fixed-Length Records7.6.2 Variable-Length RecordsRecord Formats *

xiContents7.88222222224FILE ORGANIZATIONS AND 422432442448.38.48.58.697.7.1 Fixed-Length Records7.7.2 Variable-Length RecordsPoints to ReviewCost ModelComparison of Three File Organizations8.2.1 Heap Files8.2.2 Sorted Files8.2.3 Hashed Files8.2.4 Choosing a File OrganizationOverview of Indexes8.3.1 Alternatives for Data Entries in an IndexProperties of Indexes8.4.1 Clustered versus Unclustered Indexes8.4.2 Dense versus Sparse Indexes8.4.3 Primary and Secondary Indexes8.4.4 Indexes Using Composite Search KeysIndex Specification in SQL-92Points to ReviewTREE-STRUCTURED 2602652662662682712722729.9Indexed Sequential Access Method (ISAM)B Trees: A Dynamic Index StructureFormat of a NodeSearchInsertDelete *Duplicates *B Trees in Practice *9.8.1 Key Compression9.8.2 Bulk-Loading a B Tree9.8.3 The Order Concept9.8.4 The Effect of Inserts and Deletes on RidsPoints to Review10 HASH-BASED INDEXING10.110.210.310.410.5Static Hashing10.1.1 Notation and ConventionsExtendible Hashing *Linear Hashing *Extendible Hashing versus Linear Hashing *Points to Review278278280280286291292

xiiDatabase Management SystemsPart IVQUERY EVALUATION11 EXTERNAL SORTING11.111.211.311.411.5A Simple Two-Way Merge SortExternal Merge Sort11.2.1 Minimizing the Number of Runs *Minimizing I/O Cost versus Number of I/Os11.3.1 Blocked I/O11.3.2 Double BufferingUsing B Trees for Sorting11.4.1 Clustered Index11.4.2 Unclustered IndexPoints to Review12 EVALUATION OF RELATIONAL ion to Query Processing12.1.1 Access Paths12.1.2 Preliminaries: Examples and Cost CalculationsThe Selection Operation12.2.1 No Index, Unsorted Data12.2.2 No Index, Sorted Data12.2.3 B Tree Index12.2.4 Hash Index, Equality SelectionGeneral Selection Conditions *12.3.1 CNF and Index Matching12.3.2 Evaluating Selections without Disjunction12.3.3 Selections with DisjunctionThe Projection Operation12.4.1 Projection Based on Sorting12.4.2 Projection Based on Hashing *12.4.3 Sorting versus Hashing for Projections *12.4.4 Use of Indexes for Projections *The Join Operation12.5.1 Nested Loops Join12.5.2 Sort-Merge Join *12.5.3 Hash Join *12.5.4 General Join Conditions *The Set Operations *12.6.1 Sorting for Union and Difference12.6.2 Hashing for Union and DifferenceAggregate Operations *12.7.1 Implementing Aggregation by Using an IndexThe Impact of Buffering 343348349349350350351352

xiiiContents12.9Points to Review13 INTRODUCTION TO QUERY OPTIMIZATION13.113.213.313.4Overview of Relational Query Optimization13.1.1 Query Evaluation Plans13.1.2 Pipelined Evaluation13.1.3 The Iterator Interface for Operators and Access Methods13.1.4 The System R OptimizerSystem Catalog in a Relational DBMS13.2.1 Information Stored in the System CatalogAlternative Plans: A Motivating Example13.3.1 Pushing Selections13.3.2 Using IndexesPoints to Review14 A TYPICAL RELATIONAL QUERY OPTIMIZER14.114.214.314.414.514.614.7Part VTranslating SQL Queries into Algebra14.1.1 Decomposition of a Query into Blocks14.1.2 A Query Block as a Relational Algebra ExpressionEstimating the Cost of a Plan14.2.1 Estimating Result SizesRelational Algebra Equivalences14.3.1 Selections14.3.2 Projections14.3.3 Cross-Products and Joins14.3.4 Selects, Projects, and Joins14.3.5 Other EquivalencesEnumeration of Alternative Plans14.4.1 Single-Relation Queries14.4.2 Multiple-Relation QueriesNested SubqueriesOther Approaches to Query OptimizationPoints to ReviewDATABASE DESIGN15 SCHEMA REFINEMENT AND NORMAL FORMS15.115.215.3Introduction to Schema Refinement15.1.1 Problems Caused by Redundancy15.1.2 Use of Decompositions15.1.3 Problems Related to DecompositionFunctional DependenciesExamples Motivating Schema 403415417418418420421422423

xivDatabase Management Systems15.415.515.615.715.815.915.3.1 Constraints on an Entity Set15.3.2 Constraints on a Relationship Set15.3.3 Identifying Attributes of Entities15.3.4 Identifying Entity SetsReasoning about Functional Dependencies15.4.1 Closure of a Set of FDs15.4.2 Attribute ClosureNormal Forms15.5.1 Boyce-Codd Normal Form15.5.2 Third Normal FormDecompositions15.6.1 Lossless-Join Decomposition15.6.2 Dependency-Preserving DecompositionNormalization15.7.1 Decomposition into BCNF15.7.2 Decomposition into 3NF *Other Kinds of Dependencies *15.8.1 Multivalued Dependencies15.8.2 Fourth Normal Form15.8.3 Join Dependencies15.8.4 Fifth Normal Form15.8.5 Inclusion DependenciesPoints to Review16 PHYSICAL DATABASE DESIGN AND TUNING16.116.216.316.416.516.616.716.8Introduction to Physical Database Design16.1.1 Database Workloads16.1.2 Physical Design and Tuning Decisions16.1.3 Need for Database TuningGuidelines for Index SelectionBasic Examples of Index SelectionClustering and Indexing *16.4.1 Co-clustering Two RelationsIndexes on Multiple-Attribute Search Keys *Indexes that Enable Index-Only Plans *Overview of Database Tuning16.7.1 Tuning Indexes16.7.2 Tuning the Conceptual Schema16.7.3 Tuning Queries and ViewsChoices in Tuning the Conceptual Schema *16.8.1 Settling for a Weaker Normal Form16.8.2 Denormalization16.8.3 Choice of Decompositions16.8.4 Vertical 463465468470471474474475476477478478479480

xvContents16.8.5 Horizontal Decomposition16.9 Choices in Tuning Queries and Views *16.10 Impact of Concurrency *16.11 DBMS Benchmarking *16.11.1 Well-Known DBMS Benchmarks16.11.2 Using a Benchmark16.12 Points to Review17 SECURITY17.117.217.317.417.517.6Part VIIntroduction to Database SecurityAccess ControlDiscretionary Access Control17.3.1 Grant and Revoke on Views and Integrity Constraints *Mandatory Access Control *17.4.1 Multilevel Relations and Polyinstantiation17.4.2 Covert Channels, DoD Security LevelsAdditional Issues Related to Security *17.5.1 Role of the Database Administrator17.5.2 Security in Statistical Databases17.5.3 EncryptionPoints to ReviewTRANSACTION MANAGEMENT18 TRANSACTION MANAGEMENT OVERVIEW18.118.218.318.418.518.6The Concept of a Transaction18.1.1 Consistency and Isolation18.1.2 Atomicity and DurabilityTransactions and SchedulesConcurrent Execution of Transactions18.3.1 Motivation for Concurrent Execution18.3.2 Serializability18.3.3 Some Anomalies Associated with Interleaved Execution18.3.4 Schedules Involving Aborted TransactionsLock-Based Concurrency Control18.4.1 Strict Two-Phase Locking (Strict 2PL)Introduction to Crash Recovery18.5.1 Stealing Frames and Forcing Pages18.5.2 Recovery-Related Steps during Normal Execution18.5.3 Overview of ARIESPoints to Review19 CONCURRENCY 532532533535536537537540

xviDatabase Management Systems19.119.219.319.419.519.6Lock-Based Concurrency Control Revisited19.1.1 2PL, Serializability, and Recoverability19.1.2 View SerializabilityLock Management19.2.1 Implementing Lock and Unlock Requests19.2.2 Deadlocks19.2.3 Performance of Lock-Based Concurrency ControlSpecialized Locking Techniques19.3.1 Dynamic Databases and the Phantom Problem19.3.2 Concurrency Control in B Trees19.3.3 Multiple-Granularity LockingTransaction Support in SQL-92 *19.4.1 Transaction Characteristics19.4.2 Transactions and ConstraintsConcurrency Control without Locking19.5.1 Optimistic Concurrency Control19.5.2 Timestamp-Based Concurrency Control19.5.3 Multiversion Concurrency ControlPoints to Review20 CRASH RECOVERY20.120.220.320.420.5Part VIIIntroduction to ARIES20.1.1 The Log20.1.2 Other Recovery-Related Data Structures20.1.3 The Write-Ahead Log Protocol20.1.4 CheckpointingRecovering from a System Crash20.2.1 Analysis Phase20.2.2 Redo Phase20.2.3 Undo PhaseMedia RecoveryOther Algorithms and Interaction with Concurrency ControlPoints to ReviewADVANCED TOPICS21 PARALLEL AND DISTRIBUTED DATABASES21.121.221.3Architectures for Parallel DatabasesParallel Query Evaluation21.2.1 Data Partitioning21.2.2 Parallelizing Sequential Operator Evaluation CodeParallelizing Individual Operations21.3.1 Bulk Loading and 7588595597598600601601602602

221.1321.1421.3.2 Sorting21.3.3 JoinsParallel Query OptimizationIntroduction to Distributed Databases21.5.1 Types of Distributed DatabasesDistributed DBMS Architectures21.6.1 Client-Server Systems21.6.2 Collaborating Server Systems21.6.3 Middleware SystemsStoring Data in a Distributed DBMS21.7.1 Fragmentation21.7.2 ReplicationDistributed Catalog Management21.8.1 Naming Objects21.8.2 Catalog Structure21.8.3 Distributed Data IndependenceDistributed Query Processing21.9.1 Nonjoin Queries in a Distributed DBMS21.9.2 Joins in a Distributed DBMS21.9.3 Cost-Based Query OptimizationUpdating Distributed Data21.10.1 Synchronous Replication21.10.2 Asynchronous ReplicationIntroduction to Distributed TransactionsDistributed Concurrency Control21.12.1 Distributed DeadlockDistributed Recovery21.13.1 Normal Execution and Commit Protocols21.13.2 Restart after a Failure21.13.3 Two-Phase Commit Revisited21.13.4 Three-Phase CommitPoints to Review22 INTERNET DATABASES22.122.222.3The World Wide Web22.1.1 Introduction to HTML22.1.2 Databases and the WebArchitecture22.2.1 Application Servers and Server-Side JavaBeyond HTML22.3.1 Introduction to XML22.3.2 XML DTDs22.3.3 Domain-Specific DTDs22.3.4 XML-QL: Querying XML 642643643645645647651652654657659

xviiiDatabase Management Systems22.422.522.622.3.5 The Semistructured Data Model22.3.6 Implementation Issues for Semistructured DataIndexing for Text Search22.4.1 Inverted Files22.4.2 Signature FilesRanked Keyword Searches on the Web22.5.1 An Algorithm for Ranking Web PagesPoints to Review23 DECISION SUPPORT23.123.223.323.423.523.623.7Introduction to Decision SupportData Warehousing23.2.1 Creating and Maintaining a WarehouseOLAP23.3.1 Multidimensional Data Model23.3.2 OLAP Queries23.3.3 Database Design for OLAPImplementation Techniques for OLAP23.4.1 Bitmap Indexes23.4.2 Join Indexes23.4.3 File Organizations23.4.4 Additional OLAP Implementation IssuesViews and Decision Support23.5.1 Views, OLAP, and Warehousing23.5.2 Query Modification23.5.3 View Materialization versus Computing on Demand23.5.4 Issues in View MaterializationFinding Answers Quickly23.6.1 Top N Queries23.6.2 Online AggregationPoints to Review24 DATA MINING24.124.224.3Introduction to Data MiningCounting Co-occurrences24.2.1 Frequent Itemsets24.2.2 Iceberg QueriesMining for Rules24.3.1 Association Rules24.3.2 An Algorithm for Finding Association Rules24.3.3 Association Rules and ISA Hierarchies24.3.4 Generalized Association Rules24.3.5 Sequential 7707708709711713714714715716717

xixContents24.424.524.624.724.824.3.6 The Use of Association Rules for Prediction24.3.7 Bayesian Networks24.3.8 Classification and Regression RulesTree-Structured Rules24.4.1 Decision Trees24.4.2 An Algorithm to Build Decision TreesClustering24.5.1 A Clustering AlgorithmSimilarity Search over Sequences24.6.1 An Algorithm to Find Similar SequencesAdditional Data Mining TasksPoints to Review25 OBJECT-DATABASE ing Example25.1.1 New Data Types25.1.2 Manipulating the New Kinds of DataUser-Defined Abstract Data Types25.2.1 Defining Methods of an ADTStructured Types25.3.1 Manipulating Data of Structured TypesObjects, Object Identity, and Reference Types25.4.1 Notions of Equality25.4.2 Dereferencing Reference TypesInheritance25.5.1 Defining Types with Inheritance25.5.2 Binding of Methods25.5.3 Collection Hierarchies, Type Extents, and QueriesDatabase Design for an ORDBMS25.6.1 Structured Types and ADTs25.6.2 Object Identity25.6.3 Extending the ER Model25.6.4 Using Nested CollectionsNew Challenges in Implementing an ORDBMS25.7.1 Storage and Access Methods25.7.2 Query Processing25.7.3 Query OptimizationOODBMS25.8.1 The ODMG Data Model and ODL25.8.2 OQLComparing RDBMS with OODBMS and ORDBMS25.9.1 RDBMS versus ORDBMS25.9.2 OODBMS versus ORDBMS: Similarities25.9.3 OODBMS versus ORDBMS: 6757758759760761763765765768769769770770

xxDatabase Management Systems25.10 Points to Review26 SPATIAL DATA MANAGEMENT26.126.226.326.426.526.626.726.8Types of Spatial Data and QueriesApplications Involving Spatial DataIntroduction to Spatial Indexes26.3.1 Overview of Proposed Index StructuresIndexing Based on Space-Filling Curves26.4.1 Region Quad Trees and Z-Ordering: Region Data26.4.2 Spatial Queries Using Z-OrderingGrid Files26.5.1 Adapting Grid Files to Handle RegionsR Trees: Point and Region Data26.6.1 Queries26.6.2 Insert and Delete Operations26.6.3 Concurrency Control26.6.4 Generalized Search TreesIssues in High-Dimensional IndexingPoints to Review27 DEDUCTIVE DATABASES27.127.227.327.427.5Introduction to Recursive Queries27.1.1 DatalogTheoretical Foundations27.2.1 Least Model Semantics27.2.2 Safe Datalog Programs27.2.3 The Fixpoint Operator27.2.4 Least Model Least FixpointRecursive Queries with Negation27.3.1 Range-Restriction and Negation27.3.2 Stratification27.3.3 Aggregate OperationsEfficient Evaluation of Recursive Queries27.4.1 Fixpoint Evaluation without Repeated Inferences27.4.2 Pushing Selections to Avoid Irrelevant InferencesPoints to Review28 ADDITIONAL TOPICS28.128.2Advanced Transaction Processing28.1.1 Transaction Processing Monitors28.1.2 New Transaction Models28.1.3 Real-Time DBMSsIntegrated Access to Multiple Data 814816818822822822823824824

27828829829DATABASE DESIGN CASE STUDY: THE INTERNET831SHOPA.1A.2A.3A.4A.5A.6A.7BMobile DatabasesMain Memory DatabasesMultimedia DatabasesGeographic Information SystemsTemporal and Sequence DatabasesInformation VisualizationSummaryRequirements AnalysisConceptual DesignLogical Database DesignSchema RefinementPhysical Database DesignA.5.1 Tuning the DatabaseSecurityApplication Layers831832832835836838838840THE MINIBASE SOFTWARE842B.1B.2842843843844845B.3What’s AvailableOverview of Minibase AssignmentsB.2.1 Overview of Programming ProjectsB.2.2 Overview of Nonprogramming AssignmentsAcknowledgmentsREFERENCES847SUBJECT INDEX879AUTHOR INDEX896

PREFACEThe advantage of doing one’s praising for oneself is that one can lay it on so thickand exactly in the right places.—Samuel ButlerDatabase management systems have become ubiquitous as a fundamental tool for managing information, and a course on the principles and practice of database systems isnow an integral part of computer science curricula. This book covers the fundamentalsof modern database management systems, in particular relational database systems.It is intended as a text for an introductory database course for undergraduates, andwe have attempted to present the material in a clear, simple style.A quantitative approach is used throughout and detailed examples abound. An extensive set of exercises (for which solutions are available online to instructors) accompanieseach chapter and reinforces students’ ability to apply the concepts to real problems.The book contains enough material to support a second course, ideally supplementedby selected research papers. It can be used, with the accompanying software and SQLprogramming assignments, in two distinct kinds of introductory courses:1. A course that aims to present the principles of database systems, with a practicalfocus but without any implementation assignments. The SQL programming assignments are a useful supplement for such a course. The supplementary Minibasesoftware can be used to create exercises and experiments with no programming.2. A course that has a strong systems emphasis and assumes that students havegood programming skills in C and C . In this case the software can be usedas the basis for projects in which students are asked to implement various partsof a relational DBMS. Several central modules in the project software (e.g., heapfiles, buffer manager, B trees, hash indexes, various join methods, concurrencycontrol, and recovery algorithms) are described in sufficient detail in the text toenable students to implement them, given the (C ) class interfaces.Many instructors will no doubt teach a course that falls between these two extremes.xxii

PrefacexxiiiChoice of TopicsThe choice of material has been influenced by these considerations:To concentrate on issues central to the design, tuning, and implementation of relational database applications. However, many of the issues discussed (e.g., bufferingand access methods) are not specific to relational systems, and additional topicssuch as decision support and object-database systems are covered in later chapters.To provide adequate coverage of implementation topics to support a concurrentlaboratory section or course project. For example, implementation of relationaloperations has been covered in more detail than is necessary in a first course.However, the variety of alternative implementation techniques permits a widechoice of project assignments. An instructor who wishes to assign implementationof sort-merge join might cover that topic in depth, whereas another might chooseto emphasize index nested loops join.To provide in-depth coverage of the state of the art in currently available commercial systems, rather than a broad coverage of several alternatives. For example,we discuss the relational data model, B trees, SQL, System R style query optimization, lock-based concurrency control, the ARIES recovery algorithm, thetwo-phase commit protocol, asynchronous replication in distributed databases,and object-relational DBMSs in detail, with numerous illustrative examples. Thisis made possible by omitting or briefly covering some related topics such as thehierarchical and network models, B tree variants, Quel, semantic query optimization, view serializability, the shadow-page recovery algorithm, and the three-phasecommit protocol.The same preference for in-depth coverage of selected topics governed our choiceof topics for chapters on advanced material. Instead of covering a broad range oftopics briefly, we have chosen topics that we believe to be practically importantand at the cutting edge of current thinking in database systems, and we havecovered them in depth.New in the Second EditionBased on extensive user surveys and feedback, we have refined the book’s organization.The major change is the early introduction of the ER model, together with a discussionof conceptual database design. As in the first edition, we introduce SQL-92’s datadefinition features together with the relational model (in Chapter 3), and wheneverappropriate, relational model concepts (e.g., definition of a relation, updates, views, ERto relational mapping) are illustrated and discussed in the context of SQL. Of course,we maintain a careful separation between the concepts and their SQL realization. Thematerial on data storage, file organization, and indexes has been moved back, and the

xxivDatabase Management Systemsmaterial on relational queries has been moved forward. Nonetheless, the two parts(storage and organization vs. queries) can still be taught in either order based on theinstructor’s preferences.In order to facilitate brief coverage in a first course, the second edition contains overviewchapters on transaction processing and query optimization. Most chapters have beenrevised extensively, and additional explanations and figures have been added in manyplaces. For example, the chapters on query languages now contain a uniform numberingof all queries to facilitate comparisons of the same query (in algebra, calculus, andSQL), and the results of several queries are shown in figures. JDBC and ODBCcoverage has been added to the SQL query chapter and SQL:1999 features are discussedboth in this chapter and the chapter on object-relational databases. A discussion ofRAID has been added to Chapter 7. We have added a new database design case study,illustrating the entire design cycle, as an appendix.Two new pedagogical features have been introduced. First, ‘floating boxes’ provide additional perspective and relate the concepts to real systems, while keeping the main discussion free of product-specific details. Second, each chapter concludes with a ‘Pointsto Review’ section that summarizes the main ideas introduced in the chapter andincludes pointers to the sections where they are discussed.For use in a second course, many advanced chapters from the first edition have beenextended or split into multiple chapters to provide thorough coverage of current topics. In particular, new material has been added to the chapters on decision support,deductive databases, and object databases. New chapters on Internet databases, datamining, and spatial databases have been added, greatly expanding the coverage ofthese topics.The material can be divided into roughly seven parts, as indicated in Figure 0.1, whichalso shows the dependencies between chapters. An arrow from Chapter I to Chapter Jmeans that I depends on material in J. The broken arrows indicate a weak dependency,which can be ignored at the instructor’s discretion. It is recommended that Part I becovered first, followed by Part II and Part III (in either order). Other than these threeparts, dependencies across parts are minimal.Order of PresentationThe book’s modular organization offers instructors a variety of choices. For example, some instructors will want to cover SQL and get students to use a relationaldatabase, before discussing file organizations or indexing; they should cover Part IIbefore Part III. In fact, in a course that emphasizes concepts and SQL, many of theimplementation-oriented chapters might be skipped. On the other hand, instructorsassigning implementation projects based on file organizations may want to cover Part

xxvPreface417Relational Algebraand CalculusIntroduction,Data Storage265QBESQL Queries, etc.IVVVIVII8ER ModelConceptual DesignIntroduction toFile Organizations3III12External SortingEvaluation ofRelational OperatorsPhysical DBDesign, TuningConcurrencyControlTransaction Figure 0.110Hash Indexes14Parallel andDistributed DBs2220CrashRecovery26SpatialDatabasesA TypicalRelational Introduction toQuery Optimization1615Schema Refinement,FDs, NormalizationTree IndexesIIIRelational ModelSQL ionalTopicsChapter Organization and DependenciesIII early to space assignments. As another example, it is not necessary to cover all thealternatives for a given operator (e.g., various techniques for joins) in Chapter 12 inorder to cover later related material (e.g., on optimization or tuning) adequately. Thedatabase design case study in the appendix can be discussed concurrently with theappropriate design chapters, or it can be discussed after all design topics have beencovered, as a review.Several section headings contain an asterisk. This symbol does not necessarily indicatea higher level of difficulty. Rather, omitting all asterisked sections leaves about theright amount of material in Chapters 1–18, possibly omitting Chapters 6, 10, and 14,for a broad introductory one-quarter or one-semester course (depending on the depthat which the remaining material is discussed and the nature of the course assignments).

xxviDatabase Management SystemsThe book can be used in several kinds of introductory or second courses by choosingtopics appropriately, or in a two-course sequence by supplementing the material withsome advanced readings in the second course. Examples of appropriate introductorycourses include courses on file organizations and introduction to database managementsystems, especially if the course focuses on relational database design or implementation. Advanced courses can be built around the later chapters, which contain detailedbibliographies with ample pointers for further study.Supplementary MaterialEach chapter contains several exercises designed to test and expand the reader’s understanding of the material. Students can obtain solutions to odd-numbered chapterexercises and a set of lecture slides for each chapter through the Web in Postscript andAdobe PDF formats.The following material is available online to instr

x Database Management Systems 5.13.2 Constraints versus Triggers 167 5.13.3 Other Uses of Triggers 168 5.14 Points to Review 168 6 QUERY-BY-EXAMPLE (QBE) 177 6.1 Introduction 177 6.2 Basic QBE Queries 178 6.2.1 Other Features: Duplicates, Ordering Answers 179 6.3 Queries over Multiple R