Accolades For Database Administration

Transcription

Accolades for Database Administration“I’ve forgotten how many times I’ve recommended this book to people. It’swell written, to the point, and covers the topics that you need to know tobecome an effective DBA.”—Scott Ambler, Thought Leader, Agile Data Method“This is a well-written, well-organized guide to the practice of databaseadministration. Unlike other books on general database theory or relationaldatabase theory, this book focuses more directly on the theory and realityof database administration as practiced by database professionals today, anddoes so without catering too much to any specific product implementation.As such, Database Administration is very well suited to anyone interestedin surveying the job of a DBA or those in similar but more specific rolessuch as data modeler or database performance analyst.”—Sal Ricciardi, Program Manager, Microsoft“One of Craig’s hallmarks is his ability to write in a clear, easy-to-read fashion. The main purpose of any technical book is to transfer information fromwriter to reader, and Craig has done an excellent job. He wants the readerto learn—and it shows.”—Chris Foot, Manager, Remote DBA Experts and Oracle ACE“A complete and comprehensive listing of tasks and responsibilities forDBAs, ranging from creating the database environment to data warehouseadministration, and everything in between.”—Mike Tarrani, Computer Consultant“I think every business manager and every IT manager should have a copyof this book.”—Dan Hotka, Independent Consultant and Oracle ACE“This book by Craig Mullins is wonderfully insightful and truly important.Mullins describes the role and duties of data administrators and databaseadministrators in modern organizations with remarkable insight and clarity.”—Michael Tozer, Author and former U.S. Navy officer

This page intentionally left blank

DatabaseAdministrationSecond Edition

This page intentionally left blank

DatabaseAdministrationThe Complete Guide to DBA Practicesand ProceduresSecond EditionCraig S. MullinsUpper Saddle River, NJ Boston Indianapolis San FranciscoNew York Toronto Montreal London Munich Paris MadridCapetown Sydney Tokyo Singapore Mexico City

Many of the designations used by manufacturers and sellers to distinguish theirproducts are claimed as trademarks. Where those designations appear in thisbook, and the publisher was aware of a trademark claim, the designations havebeen printed with initial capital letters or in all capitals.The author and publisher have taken care in the preparation of this book, butmake no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information orprograms contained herein.The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versionsand/or custom covers and content particular to your business, training goals,marketing focus, and branding interests. For more information, please contact:U.S. Corporate and Government Sales(800) 382-3419corpsales@pearsontechgroup.comFor sales outside the United States, please contact:International k TaubAcquisitions EditorGreg DoenchDevelopment EditorSusan Brown ZahnManaging EditorJohn FullerProduction EditorCaroline SenayCopy EditorBarbara WoodIndexerRichard EvansProofreaderDiane FreedVisit us on the Web: informit.com/awTechnical ReviewersLibrary of Congress Cataloging-in-Publication DataWilliam ArledgeKevin KlineDatabase administration : the complete guide to DBA practices and procedures /Craig S. Mullins.—2 [edition].pages cmIncludes bibliographical references and index.ISBN 978-0-321-82294-9 (alk. paper)—ISBN 0-321-82294-3 (alk. paper)1. Database management. I. Title.QA76.9.D3M838 2013005.74—dc232012029466Copyright 2013 Craig S. MullinsAll rights reserved. Printed in the United States of America. This publication isprotected by copyright, and permission must be obtained from the publisherprior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying,recording, or likewise. To obtain permission to use material from this work,please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may faxyour request to (201) 236-3290.ISBN-13: 978-0-321-82294-9ISBN-10:0-321-82294-3Text printed in the United States on recycled paper at Edwards Brothers Malloyin Ann Arbor, Michigan.First printing October, 2012Editorial AssistantMichelle HousleyCover DesignerChuti PrasertsithCompositorRob MauharThe CIP Group

To my wife, Beth, for her unending love,constant support, and beautiful smile.

This page intentionally left blank

ContentsPrefacexxxiHow to Use This BookxxxiiiAcknowledgmentsxxxvAbout the AuthorxxxviiChapter 1 What Is a DBA?1Why Learn Database Administration?3A Unique Vantage Point4DBA Salaries4Database Technology6The Management Discipline of Database AdministrationA Day in the Life of a DBA12Evaluating a DBA Job Offer149ix

xContentsDatabase, Data, and System Administration15Data Administration15Database Administration19System Administration20DBA Tasks20Database Design21Performance Monitoring and Tuning22Ensuring Availability24Database Security and Authorization24Governance and Regulatory Compliance26Backup and Recovery26Ensuring Data Integrity27DBMS Release Migration29Jack-of-All-Trades29The Types of DBAs31System DBA31Database Architect32Database Analyst33Data Modeler33Application DBA34Task-Oriented DBA36Performance Analyst36Data Warehouse Administrator36Staffing Considerations37How Many DBAs?37DBA Reporting Structures40Multiplatform DBA Issues42Production versus Test44The Impact of Newer Technology on DBA46Procedural DBAs: Managing Database Logic46

xiContentsThe Internet: From DBA to eDBA50The Personal DBA and the Cloud53NoSQL, Big Data, and the DBA55New Technology Impacts on DBA56DBA Certification56The Rest of the Book58Review58Bonus Question59Chapter 2 Creating the Database Environment61Defining the Organization’s DBMS Strategy61Choosing a DBMS63DBMS Architectures68DBMS Clustering71DBMS Proliferation73Hardware Issues73Cloud Database Systems74Installing the DBMS75DBMS Installation Basics75Hardware Requirements76Storage Requirements76Memory Requirements78Configuring the DBMS80Connecting the DBMS to Supporting Infrastructure SoftwareInstallation Verification81DBMS Environments82Upgrading DBMS Versions and Releases82Features and Complexity87Complexity of the DBMS Environment87Reputation of the DBMS Vendor89Support Policies of the DBMS8981

xiiContentsOrganization Style89DBA Staff Skill Set90Platform Support90Supporting Software91Fallback Planning92Migration Verification92The DBMS Upgrade Strategy92Database Standards and Procedures92Database Naming Conventions93Other Database Standards and ProceduresDBMS Education103Summary104Review104Bonus Question105Suggested Reading10596Chapter 3 Data Modeling and Normalization107Data Modeling Concepts108Entity-Relationship Diagramming110The Components of a Data ps122Discovering Entities, Attributes, and Relationships124Conceptual, Logical, and Physical Data Models125What Is Normalization?128The Normal Forms128First Normal Form129Second Normal Form129Third Normal Form132

ContentsA Normalized Data Model133Further Normal Forms134Normalization in Practice135Additional Data Modeling Issues135Summary136Review137Bonus Question138Suggested Reading138Chapter 4 Database Design141From Logical Model to Physical Database141Transform Entities to Tables142Transform Attributes to Columns142Build Referential Constraints for All RelationshipsBuild Physical Data Structures147Database Performance Design150Designing Indexes150Hashing158Clustering159Interleaving Data160Denormalization160When to Denormalize161Prejoined Tables164Report Tables164Mirror Tables165Split Tables165Combined Tables168Redundant Data168Repeating Groups169Derivable Data170Hierarchies171146xiii

xivContentsSpecial Physical Implementation NeedsDenormalization Summary173Views175Data Definition Language177Temporal Data Support177A Temporal Example178Business Time and System Time179Summary180Review181Bonus Question181Suggested Reading182173Chapter 5 Application Design185Database Application Development and SQL186SQL186Set-at-a-Time Processing and Relational ClosureEmbedding SQL in a Program191SQL Middleware and APIs192Application Infrastructure193Object Orientation and SQL199Types of SQL200SQL Coding for Performance202Querying XML Data203Defining Transactions205Transaction Guidelines207Unit of Work207Transaction Processing Systems207Application Servers209Locking210Types of Locks212Lock Time-outs213189

xvContentsDeadlocks214Lock Duration215Lock Escalation219Programming Techniques to Minimize Locking ProblemsLocking Summary220Batch Processing221Summary222Review222Bonus Question223Suggested Reading223Chapter 6 Design Reviews227What Is a Design Review?227Rules of Engagement228Design Review Participants229Knowledge and Skills Required232Types of Design Reviews232Conceptual Design Review233Logical Design Review235Physical Design Review236Organizational Design Review237SQL and Application Code Design Review238Pre-Implementation Design Review239Post-Implementation Design Review239Design Review Output239Additional Considerations240Dealing with Remote Staff240Mentorship and Knowledge Transfer240Summary241Review241Suggested Reading242220

xviContentsChapter 7 Database Change Management243Change Management Requirements244The Change Management Perspective of the DBA246Types of Changes247DBMS Software248Hardware Configuration248Logical and Physical Design248Applications249Physical Database Structures250Impact of Change on Database Structures250The Limitations of ALTER252Database Change Scenarios254Comparing Database Structures257Requesting Database Changes258Standardized Change Requests259Communication260Coordinating Database and Application Changes260Compliance261DBA Scripts and Change Management262Summary262Review263Suggested Reading263Chapter 8 Data Availability265Defining Availability267Increased Availability RequirementsCost of Downtime271How Much Availability Is Enough?Availability Problems274Loss of the Data Center274Network Problems275268273

ContentsLoss of the Server Hardware276Disk-Related Outages278Operating System Failure279DBMS Software Failure279Application Problems279Security and Authorization Problems280Corruption of Data280Loss of Database Objects281Loss of Data282Data Replication and Propagation Failures283Severe Performance Problems283Recovery Issues284DBA Mistakes284Outages: Planned and Unplanned286Ensuring Availability287Perform Routine Maintenance While Systems RemainOperational288Automate DBA Functions290Exploit High-Availability Features291Exploit Clustering Technology292Database Architecture and NoSQL296Summary296Review297Suggested Reading298Chapter 9 Performance Management299Defining Performance299A Basic Database Performance Road Map302Monitoring versus Management304Reactive versus Proactive306xvii

xviiiContentsPreproduction Performance EstimationHistorical Trending308Service-Level Management308Types of Performance Tuning311System Tuning311Database Tuning312Application Tuning312Performance Tuning Tools313DBMS Performance Basics315Summary316Review316Bonus Question317Suggested Reading317307Chapter 10 System Performance319The Larger Environment320Interaction with the Operating System320Allied Agents321Hardware Configuration322Components of the DBMS324DBMS Installation and Configuration Issues327Types of Configuration327Memory Usage328Data Cache Details332“Open” Database Objects336Database Logs336Locking and Contention341The System Catalog342Other Configuration Options343General Advice344System Monitoring345

ContentsSummary346Review346Bonus QuestionSuggested Reading347347Chapter 11 Database Performance349Techniques for Optimizing Databases349Partitioning350Raw Partition versus File 6Interleaving Data360Free Space360Compression361File Placement and Allocation362Page Size (Block Size)364Database Reorganization365Determining When to sted Reading372Chapter 12 Application PerformanceDesigning Applications for Relational AccessRelational Optimization374CPU and I/O Costs376Database Statistics376Query Analysis378Joins379Access Path Choices381373373xix

xxContentsAdditional Optimization Considerations391View Access391Query Rewrite392Rule-Based Optimization393Reviewing Access Paths394Forcing Access Paths398SQL Coding and Tuning for Efficiency399A Dozen SQL Rules of Thumb400Additional SQL Tuning Tips406Identifying Poorly Performing SQL406Summary407Review407Suggested Reading408Chapter 13 Data Integrity409Types of Integrity409Database Structure Integrity410Types of Structural Problems410Managing Structural Problems411Semantic Data Integrity414Entity Integrity416Unique Constraints417Data Types417Default Values419Check Constraints419Triggers426Referential Integrity433Temporal Database Systems444Summary446Review447Suggested Reading448

ContentsChapter 14 Database Security449Data Breaches449Database Security Basics451Database Users455Granting and Revoking Authority456Types of Privileges457Granting to PUBLIC460Revoking Privileges461Label-Based Access Control463Security Reporting465Authorization Roles and Groups466Roles466Groups467Other Database Security Mechanisms468Using Views for Security468Using Stored Procedures for Security470Encryption470Data at Rest Encryption472Data in Transit Encryption472Encryption Techniques472SQL Injection473SQL Injection Prevention475Auditing477External Security478Job Scheduling and Security479Non-DBMS DBA Security480DBMS Fixpacks and Maintenance480Summary481Review481Suggested Reading482xxi

xxiiContentsChapter 15Regulatory Compliance and DatabaseAdministration483A Collaborative Approach to Compliance486Why Should DBAs Care about Compliance?487Metadata Management, Data Quality, and Data GovernanceMetadata488Data Quality489Data Governance489Database Auditing and Data Access Tracking490Database Auditing Techniques493Privileged User Auditing495Data Masking and Obfuscation496Data Masking Techniques497Database Archiving for Long-Term Data Retention498The Life Cycle of Data499Database Archiving500Components of a Database Archiving Solution505The Impact of e-Discovery on DBA506Closer Tracking of Traditional DBA Tasks507Database Change Management508Database Backup and Recovery508Summary511Review511Suggested Reading512Chapter 16 Database Backup and RecoveryThe Importance of Backup and Recovery515Preparing for Problems516Backup517Full versus Incremental Backups521Database Objects and Backups523515488

ContentsDBMS Control524Concurrent Access Issues525Backup Consistency527Log Archiving and Backup529Determining Your Backup Schedule531DBMS Instance Backup533Designing the DBMS Environment for Recovery533Alternate Approaches to Database Backup534Document Your Backup Strategy536Database Object Definition Backups536Recovery537Determining Recovery Options538General Steps for Database Object Recovery540Types of Recovery541Index Recovery550Testing Your Recovery Plan551Recovering a Dropped Database Object552Recovering Broken Blocks and Pages553Populating Test Databases553Alternatives to Backup and Recovery554Standby Databases554Replication555Disk Mirroring556Summary557Review557Suggested Reading558Chapter 17 Disaster PlanningThe Need for Planning559Risk and Recovery561559xxiii

xxivContentsGeneral Disaster Recovery Guidelines563The Remote Site564The Written Plan564Personnel569Backing Up the Database for Disaster Recovery569Tape Backups570Storage Management Backups572Other Approaches573Some Guidelines573Disaster Prevention575Disaster and Contingency Planning Web Sites576Summary576Review576Suggested Reading577Chapter 18 Data and Storage ManagementStorage Management Basics579Files and Data Sets583File Placement on Disk584Raw Partitions versus File Systems586Temporary Database Files587Space Management587Data Page Layouts588Index Page Layouts592Transaction Logs594Fragmentation and Storage595Storage Options596RAID597JBOD604Storage Area Networks604579

ContentsNetwork-Attached StorageTiered Storage606Planning for the Future608Capacity Planning608Summary609Review609Suggested Reading610605Chapter 19 Data Movement and DistributionLoading and Unloading Data614The LOAD Utility614The UNLOAD Utility618Maintaining Application Test Beds621EXPORT and IMPORT622Bulk Data Movement623ETL Software623Replication and Propagation623Messaging Software624Other Methods625Distributed Databases626Setting Up a Distributed Environment627Data Distribution Standards629Accessing Distributed Data630Two-Phase COMMIT631Distributed Performance Problems632Summary633Review634Bonus Question634Suggested Reading635613xxv

xxviContentsChapter 20 Data Warehouse Administration637What Is a Data Warehouse?637Analytical versus Transaction Processing638Administering the Data Warehouse640Too Much Focus on Technology?641Data Warehouse Design641Data Movement644Data Cleansing645Data Warehouse Scalability649Data Warehouse Performance650Data Freshness654Data Content654Data Usage655Financial Chargeback655Backup and Recovery656Don’t Operate in a Vacuum!657Summary658Review658Suggested Reading659Chapter 21 Database ConnectivityMultitier, Distributed Computing661A Historical Look661Business Issues663What Is Client/Server Computing?Types of Client/Server ApplicationsNetwork Traffic670Database Gateways671Database Drivers672Connection Pooling674661663667

ContentsxxviiDatabases, the Internet, and the Web675Internet-Connected Databases676Web Development and Web Services680Summary681Review682Suggested Reading682Chapter 22 Metadata Management685What Is Metadata?685From Data to Knowledge and Beyond686Metadata Strategy687Data Warehousing and Metadata688Types of Metadata689Repositories and Data Dictionaries691Repository Benefits693Repository Challenges693Data Dictionaries695Summary696Review696Suggested Reading697Chapter 23 DBA Tools699Types and Benefits of DBA Tools699Data Modeling and Design700Database Change Management701Table Editors707Performance Management708Backup and Recovery714Database Utilities715Data Protection, Governance, Risk, and Compliance Tools716Data Warehousing, Analytics, and Business Intelligence721

xxviiiContentsProgramming and Development ToolsMiscellaneous Tools726Examine Native DBA Tools728Evaluating DBA Tool Vendors729Homegrown DBA Tools732Summary733Review733724Chapter 24 DBA Rules of Thumb735Write Down Everything735Keep Everything736Automate!737Share Your Knowledge739Analyze, Simplify, and Focus741Don’t Panic!742Measure Twice, Cut Once743Understand the Business, Not Just the TechnologyDon’t Become a Hermit745Use All of the Resources at Your Disposal745Keep Up-to-Date746Invest in Yourself747Summary748Final Exam748Appendix A Database FundamentalsWhat Is a Database?753Why Use a DBMS?754Advantages of Using a DBMS755Summary759753743

ContentsAppendix B The DBMS Vendors761The Big Three762The Second Tier763Other Significant Players763Open-Source DBMS Offerings764Nonrelational DBMS Vendors765NoSQL DBMS Vendors765Object-Oriented DBMS Vendors766PC-Based DBMS Vendors766Appendix C DBA Tool Vendors769The Major Vendors769Other DBA Tool Vendors770Data Modeling Tool Vendors771Repository Vendors772Data Movement and Business Intelligence VendorsAppendix D DBA Web ResourcesUsenet Newsgroups775Mailing Lists776Web Sites, Blogs, and Portals778Vendor Web Sites778Magazine Web Sites778Consultant Web Sites779Blogs780Database Portals781Other Web Sites782775Appendix E Sample DBA Job PostingJob Posting

The Internet: From DBA to eDBA 50 The Personal DBA and the Cloud 53 NoSQL, Big Data, and the DBA 55 New Technology Impacts on DBA 56 DBA Certification 56 The Rest of the Book 58 Review 58 Bonus Question 59 Chapter 2 Creating the Database Environment 61 Defini