Oracle High Availability Architecture And Best Practices

Oracle High Availability Architecture and Best Practices
10g Release 1 (10.1)
Part No. B10726-01
December 2003

Oracle High Availability Architecture and Best Practices, 10g Release 1 (10.1)
Part No. B10726-01
Copyright 2003 Oracle Corporation. All rights reserved.

Primary Author: Cathy Baird

Contributors: David Austin, Andrew Babb, Mark Bauer, Ruth Baylis, Tammy Bednar, Pradeep Bhat, Donna Cooksey, Ray Dutcher, Jackie Gosselin, Mike Hallas, Daniela Hansell, Wei Hu, Susan Kornberg, Jeff Levinger, Diana Lorentz, Roderick Manalac, Ashish Ray, Antonio Romero, Vivian Schupmann, Deborah Steiner, Ingrid Stuart, Bob Thome, Lawrence To, Paul Tsien, Douglas Utzig, Jim Viscusi, Shari Yamaguchi

Graphic Artist: Valarie Moore

The Programs (which include both the software and documentation) contain proprietary information of Oracle Corporation; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent and other intellectual and industrial property laws. Reverse engineering, disassembly or decompilation of the Programs, except to the extent required to obtain interoperability with other independently created software or as specified by law, is prohibited.

The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Oracle Corporation does not warrant that this document is error-free. Except as may be expressly permitted in your license agreement for these Programs, no part of these Programs may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Oracle Corporation.

If the Programs are delivered to the U.S. Government or anyone licensing or using the programs on behalf of the U.S. Government, the following notice is applicable:

Restricted Rights Notice. Programs delivered subject to the DOD FAR Supplement are "commercial computer software" and use, duplication, and disclosure of the Programs, including documentation, shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement. Otherwise, Programs delivered subject to the Federal Acquisition Regulations are "restricted computer software" and use, duplication, and disclosure of the Programs shall be subject to the restrictions in FAR 52.227-19, Commercial Computer Software - Restricted Rights (June, 1987). Oracle Corporation, 500 Oracle Parkway, Redwood City, CA 94065.

The Programs are not intended for use in any nuclear, aviation, mass transit, medical, or other inherently dangerous applications. It shall be the licensee's responsibility to take all appropriate fail-safe, backup, redundancy, and other measures to ensure the safe use of such applications if the Programs are used for such purposes, and Oracle Corporation disclaims liability for any damages caused by such use of the Programs.

Oracle is a registered trademark, and Oracle Store, Oracle8i, Oracle9i, PL/SQL, and SQL*Plus are trademarks or registered trademarks of Oracle Corporation. Other names may be trademarks of their respective owners.

Contents

Send Us Your Comments

Preface
    Audience
    Organization
    Related Documentation
    Conventions
    Documentation Accessibility

Part I  Getting Started

1  Overview of High Availability
    Introduction to High Availability
    What is Availability?
    Importance of Availability
    Causes of Downtime
    What Does This Book Contain?
    Who Should Read This Book?

2  Determining Your High Availability Requirements
    Why It Is Important to Determine High Availability Requirements
    Analysis Framework for Determining High Availability Requirements
        Business Impact Analysis
        Cost of Downtime
        Recovery Time Objective
        Recovery Point Objective
    Choosing a High Availability Architecture
        HA Systems Capabilities
        Business Performance, Budget and Growth Plans
        High Availability Best Practices

Part II  Oracle Database High Availability Features, Architectures, and Policies

3  Oracle Database High Availability Features
    Oracle Real Application Clusters
    Oracle Data Guard
    Oracle Streams
    Online Reorganization
    Transportable Tablespaces
    Automatic Storage Management
    Flashback Technology
        Oracle Flashback Query
        Oracle Flashback Version Query
        Oracle Flashback Transaction Query
        Oracle Flashback Table
        Oracle Flashback Drop
        Oracle Flashback Database
    Dynamic Reconfiguration
    Oracle Fail Safe
    Recovery Manager
    Flash Recovery Area
    Hardware Assisted Resilient Data (HARD) Initiative

4  High Availability Architectures
    Oracle Database High Availability Architectures
        "Database Only" Architecture
        "RAC Only" Architecture
        "Data Guard Only" Architecture
        Maximum Availability Architecture
        Streams Architecture
    Choosing the Correct HA Architecture
    Assessing Other Architectures

5  Operational Policies for High Availability
    Introduction to Operational Policies for High Availability
    Service Level Management for High Availability
    Planning Capacity to Promote High Availability
    Change Management for High Availability
    Backup and Recovery Planning for High Availability
    Disaster Recovery Planning
    Planning Scheduled Outages
    Staff Training for High Availability
    Documentation as a Means of Maintaining High Availability
    Physical Security Policies and Procedures for High Availability

Part III  Configuring a Highly Available Oracle Environment

6  System and Network Configuration
    Overview of System Configuration Recommendations
    Recommendations for Configuring Storage
        Ensure That All Hardware Components Are Fully Redundant and Fault-Tolerant
        Use an Array That Can Be Serviced Online
        Mirror and Stripe for Protection and Performance
        Load-Balance Across All Physical Interfaces
        Create Independent Storage Areas
        Storage Recommendations for Specific HA Architectures
        Define ASM Disk and Failure Groups Properly
        Use HARD-Compliant Storage for the Greatest Protection Against Data Corruption
        Storage Recommendation for RAC
        Protect the Oracle Cluster Registry and Voting Disk From Media Failure
    Recommendations for Configuring Server Hardware
        Server Hardware Recommendations for All Architectures
            Use Fewer, Faster, and Denser Components
            Use Redundant Hardware Components
            Use Systems That Can Detect and Isolate Failures
            Protect the Boot Disk With a Backup Copy
        Server Hardware Recommendations for RAC
            Use a Supported Cluster System to Run RAC
            Choose the Proper Cluster Interconnect
        Server Hardware Recommendations for Data Guard
            Use Identical Hardware for Every Machine at Both Sites
    Recommendations for Configuring Server Software
        Server Software Recommendations for All Architectures
            Use the Same OS Version, Patch Level, Single Patches, and Driver Versions
            Use an Operating System That is Fault-Tolerant to Hardware Failures
            Configure Swap Partitions Appropriately
            Set Operating System Parameters to Enable Future Growth
            Use Logging or Journal File Systems
            Mirror Disks That Contain Oracle and Application Software
        Server Software Recommendations for RAC
            Use Supported Clustering Software
            Use Network Time Protocol (NTP) On All Cluster Nodes
    Recommendations for Configuring the Network
        Network Configuration Best Practices for All Architectures
            Ensure That All Network Components Are Redundant
            Use Load Balancers to Distribute Incoming Requests
        Network Configuration Best Practices for RAC
            Classify Network Interfaces Using the Oracle Interface Configuration Tool
        Network Configuration Best Practices for Data Guard
            Configure System TCP Parameters Appropriately
            Use WAN Traffic Managers to Provide Site Failover Capabilities

7  Oracle Configuration Best Practices
    Configuration Best Practices for the Database
        Use Two Control Files
        Set CONTROL_FILE_RECORD_KEEP_TIME Large Enough
        Configure the Size of Redo Log Files and Groups Appropriately
        Multiplex Online Redo Log Files
        Enable ARCHIVELOG Mode
        Enable Block Checksums
        Enable Database Block Checking
        Log Checkpoints to the Alert Log
        Use Fast-Start Checkpointing to Control Instance Recovery Time
        Capture Performance Statistics About Timing
        Use Automatic Undo Management
        Use Locally Managed Tablespaces
        Use Automatic Segment Space Management
        Use Temporary Tablespaces and Specify a Default Temporary Tablespace
        Use Resumable Space Allocation
        Use a Flash Recovery Area
        Enable Flashback Database
        Set Up and Follow Security Best Practices
        Use the Database Resource Manager
        Use a Server Parameter File
    Configuration Best Practices for Real Application Clusters
        Register All Instances with Remote Listeners
        Do Not Set CLUSTER_INTERCONNECTS Unless Required for Scalability
    Configuration Best Practices for Data Guard
        Use a Simple, Robust Archiving Strategy and Configuration
        Use Multiplexed Standby Redo Logs and Configure Size Appropriately
        Enable FORCE LOGGING Mode
        Use Real Time Apply
        Configure the Database and Listener for Dynamic Service Registration
        Tune the Network in a WAN Environment
        Determine the Data Protection Mode
            Determining the Protection Mode
            Changing the Data Protection Mode
        Conduct a Performance Assessment with the Proposed Network Configuration
        Use a LAN or MAN for Maximum Availability or Maximum Protection Modes
        Set SYNC NOPARALLEL/PARALLEL Appropriately
        Use ARCH for the Greatest Performance Throughput
        Use the ASYNC Attribute with a 50 MB Buffer for Maximum Performance
        Evaluate SSH Port Forwarding with Compression
        Set LOG_ARCHIVE_LOCAL_FIRST to TRUE
        Provide Secure Transmission of Redo Data
        Set DB_UNIQUE_NAME
        Set LOG_ARCHIVE_CONFIG Correctly
        Recommendations for the Physical Standby Database Only
            Tune Media Recovery Performance
        Recommendations for the Logical Standby Database Only
            Use Supplemental Logging and Primary Key Constraints
            Set the MAX_SERVERS Initialization Parameter
            Increase the PARALLEL_MAX_SERVERS Initialization Parameter
            Set the TRANSACTION_CONSISTENCY Initialization Parameter
            Skip SQL Apply for Unnecessary Objects
    Configuration Best Practices for MAA
        Configure Multiple Standby Instances
        Configure Connect-Time Failover for Network Service Descriptors
    Recommendations for Backup and Recovery
        Use Recovery Manager to Back Up Database Files
        Understand When to Use Backups
            Perform Regular Backups
            Initial Data Guard Environment Set-Up
            Recovering from Data Failures Using File or Block Media Recovery
            Double Failure Resolution
            Long-Term Backups
        Use an RMAN Recovery Catalog
        Use the Autobackup Feature for the Control File and SPFILE
        Use Incrementally Updated Backups to Reduce Restoration Time
        Enable Change Tracking to Reduce Backup Time
        Create Database Backups on Disk in the Flash Recovery Area
        Create Tape Backups from the Flash Recovery Area
        Determine Retention Policy and Backup Frequency
        Configure the Size of the Flash Recovery Area Properly
        In a Data Guard Environment, Back Up to the Flash Recovery Area on All Sites
        During Backups, Use the Target Database Control File as the RMAN Repository
        Regularly Check Database Files for Corruption
        Periodically Test Recovery Procedures
        Back Up the OCR to Tape or Offsite
    Recommendations for Fast Application Failover
        Configure Connection Descriptors for All Possible Production Instances
        Use RAC Availability Notifications and Events
        Use Transparent Application Failover If RAC Notification Is Not Feasible
            New Connections
            Existing Connections
            LOAD_BALANCE Parameter in the Connection Descriptor
            FAILOVER Parameter in the Connection Descriptor
            SERVICE_NAME Parameter in the Connection Descriptor
            RETRIES Parameter in the Connection Descriptor
            DELAY Parameter in the Connection Descriptor
        Configure Services
        Configure CRS for High Availability
        Configure Service Callouts to Notify Middle-Tier Applications and Clients
        Publish Standby or Nonproduction Services
        Publish Production Services

Part IV  Managing a Highly Available Oracle Environment

8  Using Oracle Enterprise Manager for Monitoring and Detection
    Overview of Monitoring and Detection for High Availability
    Using Enterprise Manager for System Monitoring
        Set Up Default Notification Rules for Each System
        Use Database Target Views to Monitor Health, Availability, and Performance
        Use Event Notifications to React to Metric Changes
        Use Events to Monitor Data Guard System Availability
    Managing the HA Environment with Enterprise Manager
        Check Enterprise Manager Policy Violations
        Use Enterprise Manager to Manage Oracle Patches and Maintain System Baselines
        Use Enterprise Manager to Manage Data Guard Targets
    Highly Available Architectures for Enterprise Manager
        Recommendations for an HA Architecture for Enterprise Manager
            Protect the Repository and Processes As Well as the Configuration They Monitor
            Place the Management Repository in a RAC Instance and Use Data Guard
            Configure At Least Two Management Service Processes and Load Balance Them
            Consider Hosting Enterprise Manager on the Same Hardware as an HA System
            Monitor the Network Bandwidth Between Processes and Agents
        Unscheduled Outages for Enterprise Manager
    Additional Enterprise Manager Configuration
        Configure a Separate Listener for Enterprise Manager
        Install the Management Repository Into an Existing Database

9  Recovering from Outages
    Recovery Steps for Unscheduled Outages
        Recovery Steps for Unscheduled Outages on the Primary Site
        Recovery Steps for Unscheduled Outages on the Secondary Site
    Recovery Steps for Scheduled Outages
        Recovery Steps for Scheduled Outages on the Primary Site
        Recovery Steps for Scheduled Outages on the Secondary Site
        Preparing for Scheduled Secondary Site Maintenance

10  Detailed Recovery Steps
    Summary of Recovery Operations
