The Practice Of System And Network Administration




The Practice of System and Network Administration
Second Edition

Thomas A. Limoncelli
Christina J. Hogan
Strata R. Chalup

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales, (800) 382-3419, corpsales@pearsontechgroup.com

For sales outside the United States please contact:

International Sales, international@pearsoned.com

Visit us on the Web: www.awprofessional.com

Library of Congress Cataloging-in-Publication Data

Limoncelli, Tom.
The practice of system and network administration / Thomas A. Limoncelli, Christina J. Hogan, Strata R. Chalup.—2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-321-49266-1 (pbk. : alk. paper)
1. Computer networks—Management. 2. Computer systems.
I. Hogan, Christine. II. Chalup, Strata R. III. Title.
TK5105.5.L53 2007
004.6068–dc22
2007014507

Copyright © 2007 Christine Hogan, Thomas A. Limoncelli, Virtual.NET Inc., and Lumeta Corporation.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
75 Arlington Street, Suite 300
Boston, MA 02116
Fax: (617) 848-7047

ISBN 13: 978-0-321-49266-1
ISBN 10: 0-321-49266-8

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
Seventh printing, February 2012

Contents at a Glance

Part I  Getting Started
    Chapter 1  What to Do When . . .
    Chapter 2  Climb Out of the Hole

Part II  Foundation Elements
    Chapter 3  Workstations
    Chapter 4  Servers
    Chapter 5  Services
    Chapter 6  Data Centers
    Chapter 7  Networks
    Chapter 8  Namespaces
    Chapter 9  Documentation
    Chapter 10  Disaster Recovery and Data Integrity
    Chapter 11  Security Policy
    Chapter 12  Ethics
    Chapter 13  Helpdesks
    Chapter 14  Customer Care

Part III  Change Processes
    Chapter 15  Debugging
    Chapter 16  Fixing Things Once
    Chapter 17  Change Management
    Chapter 18  Server Upgrades
    Chapter 19  Service Conversions
    Chapter 20  Maintenance Windows
    Chapter 21  Centralization and

Part IV  Providing Services
    Chapter 22  Service Monitoring
    Chapter 23  Email Service
    Chapter 24  Print Service
    Chapter 25  Data Storage
    Chapter 26  Backup and Restore
    Chapter 27  Remote Access Service
    Chapter 28  Software Depot Service
    Chapter 29  Web Services

Part V  Management Practices
    Chapter 30  Organizational Structures
    Chapter 31  Perception and Visibility
    Chapter 32  Being Happy
    Chapter 33  A Guide for Technical Managers
    Chapter 34  A Guide for Nontechnical Managers
    Chapter 35  Hiring System Administrators
    Chapter 36  Firing System

Epilogue

Appendixes
    Appendix A  The Many Roles of a System Administrator
    Appendix B  Acronyms

Bibliography
Index

Contents

Preface
Acknowledgments
About the Authors

Part I  Getting Started

1 What to Do When . . .
    1.1 Building a Site from Scratch
    1.2 Growing a Small Site
    1.3 Going Global
    1.4 Replacing Services
    1.5 Moving a Data Center
    1.6 Moving to/Opening a New Building
    1.7 Handling a High Rate of Office Moves
    1.8 Assessing a Site (Due Diligence)
    1.9 Dealing with Mergers and Acquisitions
    1.10 Coping with Machine Crashes
    1.11 Surviving a Major Outage or Work Stoppage
    1.12 What Tools Should Every Team Member Have?
    1.13 Ensuring the Return of Tools
    1.14 Why Document Systems and Procedures?
    1.15 Why Document Policies?
    1.16 Identifying the Fundamental Problems in the Environment
    1.17 Getting More Money for Projects
    1.18 Getting Projects Done
    1.19 Keeping Customers Happy

    1.20 Keeping Management Happy
    1.21 Keeping SAs Happy
    1.22 Keeping Systems from Being Too Slow
    1.23 Coping with a Big Influx of Computers
    1.24 Coping with a Big Influx of New Users
    1.25 Coping with a Big Influx of New SAs
    1.26 Handling a High SA Team Attrition Rate
    1.27 Handling a High User-Base Attrition Rate
    1.28 Being New to a Group
    1.29 Being the New Manager of a Group
    1.30 Looking for a New Job
    1.31 Hiring Many New SAs Quickly
    1.32 Increasing Total System Reliability
    1.33 Decreasing Costs
    1.34 Adding Features
    1.35 Stopping the Hurt When Doing “This”
    1.36 Building Customer Confidence
    1.37 Building the Team’s Self-Confidence
    1.38 Improving the Team’s Follow-Through
    1.39 Handling Ethics Issues
    1.40 My Dishwasher Leaves Spots on My Glasses
    1.41 Protecting Your Job
    1.42 Getting More Training
    1.43 Setting Your Priorities
    1.44 Getting All the Work Done
    1.45 Avoiding Stress
    1.46 What Should SAs Expect from Their Managers?
    1.47 What Should SA Managers Expect from Their SAs?
    1.48 What Should SA Managers Provide to Their Boss?

2 Climb Out of the Hole
    2.1 Tips for Improving System
        2.1.1 Use a Trouble-Ticket System
        2.1.2 Manage Quick Requests Right
        2.1.3 Adopt Three Time-Saving Policies
        2.1.4 Start Every New Host in a Known State
        2.1.5 Follow Our Other Tips
    2.2 Conclusion

Part II  Foundation Elements

3 Workstations
    3.1 The Basics
        3.1.1 Loading the OS
        3.1.2 Updating the System Software and Applications
        3.1.3 Network Configuration
        3.1.4 Avoid Using Dynamic DNS with DHCP
    3.2 The Icing
        3.2.1 High Confidence in Completion
        3.2.2 Involve Customers in the Standardization Process
        3.2.3 A Variety of Standard Configurations
    3.3 Conclusion

4 Servers
    4.1 The Basics
        4.1.1 Buy Server Hardware for Servers
        4.1.2 Choose Vendors Known for Reliable Products
        4.1.3 Understand the Cost of Server Hardware
        4.1.4 Consider Maintenance Contracts and Spare Parts
        4.1.5 Maintaining Data Integrity
        4.1.6 Put Servers in the Data Center
        4.1.7 Client Server OS Configuration
        4.1.8 Provide Remote Console Access
        4.1.9 Mirror Boot Disks
    4.2 The Icing
        4.2.1 Enhancing Reliability and Service Ability
        4.2.2 An Alternative: Many Inexpensive Servers
    4.3 Conclusion

5 Services
    5.1 The Basics
        5.1.1 Customer Requirements
        5.1.2 Operational Requirements
        5.1.3 Open Architecture
        5.1.4 Simplicity
        5.1.5 Vendor Relations

        5.1.6 Machine Independence
        5.1.7 Environment
        5.1.8 Restricted Access
        5.1.9 Reliability
        5.1.10 Single or Multiple Servers
        5.1.11 Centralization and
        5.1.14 Service Rollout
    5.2 The Icing
        5.2.1 Dedicated Machines
        5.2.2 Full Redundancy
        5.2.3 Dataflow Analysis for Scaling
    5.3 Conclusion

6 Data Centers
    6.1 The Basics
        6.1.4 Power and Cooling
        6.1.5 Fire
        6.1.9 Communication
        6.1.10 Console Access
        6.1.11 Workbench
        6.1.12 Tools and Supplies
        6.1.13 Parking Spaces
    6.2 The Icing
        6.2.1 Greater Redundancy
        6.2.2 More Space
    6.3 Ideal Data Centers
        6.3.1 Tom’s Dream Data Center
        6.3.2 Christine’s Dream Data Center
    6.4 Conclusion

7 Networks
    7.1 The Basics
        7.1.1 The OSI Model
        7.1.2 Clean Architecture
        7.1.3 Network Topologies
        7.1.4 Intermediate Distribution Frame
        7.1.5 Main Distribution Frame
        7.1.6 Demarcation Points
        7.1.7 Documentation
        7.1.8 Simple Host Routing
        7.1.9 Network Devices
        7.1.10 Overlay Networks
        7.1.11 Number of Vendors
        7.1.12 Standards-Based Protocols
        7.1.13 Monitoring
        7.1.14 Single Administrative Domain
    7.2 The Icing
        7.2.1 Leading Edge versus Reliability
        7.2.2 Multiple Administrative Domains
    7.3 Conclusion
        7.3.1 Constants in Networking
        7.3.2 Things That Change in Network Design

8 Namespaces
    8.1 The Basics
        8.1.1 Namespace Policies
        8.1.2 Namespace Change Procedures
        8.1.3 Centralizing Namespace Management
    8.2 The Icing
        8.2.1 One Huge Database
        8.2.2 Further Automation
        8.2.3 Customer-Based Updating
        8.2.4 Leveraging Namespaces
    8.3 Conclusion

9 Documentation
    9.1 The Basics
        9.1.1 What to Document

        9.1.2 A Simple Template for Getting Started
        9.1.3 Easy Sources for Documentation
        9.1.4 The Power of Checklists
        9.1.5 Storage Documentation
        9.1.6 Wiki Systems
        9.1.7 A Search Facility
        9.1.8 Rollout Issues
        9.1.9 Self-Management versus Explicit Management
    9.2 The Icing
        9.2.1 A Dynamic Documentation Repository
        9.2.2 A Content-Management System
        9.2.3 A Culture of Respect
        9.2.4 Taxonomy and Structure
        9.2.5 Additional Documentation Uses
        9.2.6 Off-Site Links
    9.3 Conclusion

10 Disaster Recovery and Data Integrity
    10.1 The Basics
        10.1.1 Definition of a Disaster
        10.1.2 Risk Analysis
        10.1.3 Legal Obligations
        10.1.4 Damage Limitation
        10.1.5 Preparation
        10.1.6 Data Integrity
    10.2 The Icing
        10.2.1 Redundant Site
        10.2.2 Security Disasters
        10.2.3 Media Relations
    10.3 Conclusion

11 Security Policy
    11.1 The Basics
        11.1.1 Ask the Right Questions
        11.1.2 Document the Company’s Security Policies
        11.1.3 Basics for the Technical Staff
        11.1.4 Management and Organizational Issues

    11.2 The Icing
        11.2.1 Make Security Pervasive
        11.2.2 Stay Current: Contacts and Technologies
        11.2.3 Produce Metrics
    11.3 Organization Profiles
        11.3.1 Small Company
        11.3.2 Medium-Size Company
        11.3.3 Large Company
        11.3.4 E-Commerce Site
        11.3.5 University
    11.4 Conclusion

12 Ethics
    12.1 The Basics
        12.1.1 Informed Consent
        12.1.2 Professional Code of Conduct
        12.1.3 Customer Usage Guidelines
        12.1.4 Privileged-Access Code of Conduct
        12.1.5 Copyright Adherence
        12.1.6 Working with Law Enforcement
    12.2 The Icing
        12.2.1 Setting Expectations on Privacy and Monitoring
        12.2.2 Being Told to Do Something Illegal/Unethical
    12.3 Conclusion

13 Helpdesks
    13.1 The Basics
        13.1.1 Have a Helpdesk
        13.1.2 Offer a Friendly Face
        13.1.3 Reflect Corporate Culture
        13.1.4 Have Enough Staff
        13.1.5 Define Scope of Support
        13.1.6 Specify How to Get Help
        13.1.7 Define Processes for Staff
        13.1.8 Establish an Escalation Process
        13.1.9 Define “Emergency” in Writing
        13.1.10 Supply Request-Tracking Software

    13.2 The Icing
        13.2.1 Statistical Improvements
        13.2.2 Out-of-Hours and 24/7 Coverage
        13.2.3 Better Advertising for the Helpdesk
        13.2.4 Different Helpdesks for Service Provision and Problem Resolution
    13.3 Conclusion

14 Customer Care
    14.1 The Basics
        14.1.1 Phase A/Step 1: The Greeting
        14.1.2 Phase B: Problem Identification
        14.1.3 Phase C: Planning and Execution
        14.1.4 Phase D: Verification
        14.1.5 Perils of Skipping a Step
        14.1.6 Team of One
    14.2 The Icing
        14.2.1 Model-Based Training
        14.2.2 Holistic Improvement
        14.2.3 Increased Customer Familiarity
        14.2.4 Special Announcements for Major Outages
        14.2.5 Trend Analysis
        14.2.6 Customers Who Know the Process
        14.2.7 Architectural Decisions That Match the Process
    14.3 Conclusion

Part III  Change Processes

15 Debugging
    15.1 The Basics
        15.1.1 Learn the Customer’s Problem
        15.1.2 Fix the Cause, Not the Symptom
        15.1.3 Be Systematic
        15.1.4 Have the Right Tools
    15.2 The Icing
        15.2.1 Better Tools
        15.2.2 Formal Training on the Tools
        15.2.3 End-to-End Understanding of the System
    15.3 Conclusion

16 Fixing Things Once
    16.1 The Basics
        16.1.1 Don’t Waste Time
        16.1.2 Avoid Temporary Fixes
        16.1.3 Learn from Carpenters
    16.2 The Icing
    16.3 Conclusion

17 Change Management
    17.1 The Basics
        17.1.1 Risk Management
        17.1.2 Communications Structure
        17.1.3 Scheduling
        17.1.4 Process and Documentation
        17.1.5 Technical Aspects
    17.2 The Icing
        17.2.1 Automated Front Ends
        17.2.2 Change-Management Meetings
        17.2.3 Streamline the Process
    17.3 Conclusion

18 Server Upgrades
    18.1 The Basics
        18.1.1 Step 1: Develop a Service Checklist
        18.1.2 Step 2: Verify Software Compatibility
        18.1.3 Step 3: Verification Tests
        18.1.4 Step 4: Write a Back-Out Plan
        18.1.5 Step 5: Select a Maintenance Window
        18.1.6 Step 6: Announce the Upgrade as Appropriate
        18.1.7 Step 7: Execute the Tests
        18.1.8 Step 8: Lock out Customers
        18.1.9 Step 9: Do the Upgrade with Someone Watching
        18.1.10 Step 10: Test Your Work
        18.1.11 Step 11: If All Else Fails, Rely on the Back-Out Plan
        18.1.12 Step 12: Restore Access to Customers
        18.1.13 Step 13: Communicate Completion/Back-Out

    18.2 The Icing
        18.2.1 Add and Remove Services at the Same Time
        18.2.2 Fresh Installs
        18.2.3 Reuse of Tests
        18.2.4 Logging System Changes
        18.2.5 A Dress Rehearsal
        18.2.6 Installation of Old and New Versions on the Same Machine
        18.2.7 Minimal Changes from the Base
    18.3 Conclusion

19 Service Conversions
    19.1 The Basics
        19.1.1 Minimize Intrusiveness
        19.1.2 Layers versus
        19.1.5 Small Groups First
        19.1.6 Flash-Cuts: Doing It All at Once
        19.1.7 Back-Out Plan
    19.2 The Icing
        19.2.1 Instant Rollback
        19.2.2 Avoiding Conversions
        19.2.3 Web Service Conversions
        19.2.4 Vendor Support
    19.3 Conclusion

20 Maintenance Windows
    20.1 The Basics
