Application Clusters Troubleshooting - GBV


Oracle Press
Oracle Database 12c Release 2 Real Application Clusters Handbook
Concepts, ...
...nan
Sam R. Alapati
McGraw-Hill Education
New York  Chicago  San Francisco  Athens  London  Madrid  Mexico City  Milan  New Delhi  Singapore  Sydney  Toronto

Acknowledgments
Introduction

PART I  High Availability Architecture and Clusters

1  Introduction to High Availability and Scalability
    High Availability
    HA Terminology
    Planned and Unplanned Outages
    An End-to-End Perspective
    Cost of Downtime
    Building Redundant Components
    Common Solutions for HA
    Cluster, Cold Failover, and Hot Failover
    HA Option Pros and Cons
    Scalability
    The Oracle RAC Solution
    Agility
    The Oracle Database 12c RAC Solution
    Summary

2  Oracle Database Clustering Basics and Its Evolution
    Cloud Computing with Clusters
    Shared Storage in Clustering
    Types of Clustering Architectures
    Hadoop Clusters
    Architecture of Hadoop
    Historical Background of Oracle RAC
    Oracle Parallel Storage Evaluation

    Oracle Parallel Server Architecture
    Components of an OPS Database
    Cluster Group Services (CGS)
    Distributed Lock Manager (DLM)
    Locking Concepts in Oracle Parallel Server
    Cache Fusion Stage 1, CR Server
    The Oracle RAC ...
    Summary

3  Oracle RAC Architecture
    Introduction to Oracle ...ters
    Oracle RAC Environment
    Oracle Flex Clusters
    Oracle Extended Clusters
    Oracle Multitenant and Oracle ...
    ... Databases
    ... Services
    Oracle Database Quality of Service Management
    Oracle RAC Components
    Shared Disk System
    Oracle Clusterware
    The Oracle High Availability Services Technology Stack
    Oracle RAC Networking Concepts and Components
    Key Networking Concepts
    The Networking Stack Components
    Oracle Kernel Components
    Global Cache and Global Enqueue Services
    Global Resource Directory
    Oracle RAC Background Processes
    Summary

PART II  Installation, Configuration, and Storage

4  Oracle Grid Infrastructure Installation
    An Overview of the Grid Infrastructure Installation Process
    Preinstallation Tasks
    Understanding the Installer, CVU, and ORAchk
    Configuring the Operating System
    Configuring the Network
    Configuring NTP
    Setting Up the Groups and Users

    Creating the Required Linux Directories
    Configuring Shared Storage
    Secure Shell and User Limits Configuration
    Setting User Limits
    Configuring the Kernel Parameters
    Running the Cluster Verification Utility
    Installing Oracle Grid Infrastructure
    Installing Oracle Grid Infrastructure with OUI
    Choosing the Installation Options and Naming Your Cluster
    Specifying the Cluster Nodes and Verifying SSH Connectivity
    Specifying the Network Interfaces
    Selecting the Storage Options
    Specifying Management Options and Privileged OS System Groups
    Performing the Prerequisite Checks
    The Product Installation
    Running the root ...
    ... the Oracle Grid Infrastructure Installation
    Summary

5  Installing Oracle RAC and ... Oracle RAC Database
    Installing Oracle Real ... Clusters
    Creating the Oracle RAC Database
    Summary

6  Automatic Storage Management
    Standard Oracle ASM and Oracle Flex ASM
    Introduction to Automatic Storage Management
    Physical Limits of ASM
    ASM in Operation
    ASM Building Blocks
    Managing Oracle ASM Files and Directories
    ASM Filenames
    Creating and Referencing ASM Files
    Managing Disk Group Directories
    ASM Administration and Management
    Managing an ASM Instance
    ASM Initialization Parameters
    Managing ASM Disk Groups
    Creating a Disk Group
    Adding Disks to a Disk Group
    Dropping, Undropping, Resizing, and Renaming Disks in a Disk Group
    Administering ACFS
    Setting Up ACFS
    Creating an ACFS Snapshot

    ASM Disk Rebalancing
    Manually Rebalancing a Disk Group
    Rebalancing Phase Options
    Monitoring the Performance of Balancing Operations
    Tuning Disk Rebalancing Operations
    Backup and Recovery in ASM
    ASM Flex Clusters
    Configuration of Oracle ASM in Flex ASM
    Setting Up Flex ASM
    Managing ASM Flex Disk Groups
    Understanding ASM File Groups and ASM Quota Groups
    Oracle Extended Disk Groups
    ASM Tools
    ASMCA: The ASM Configuration Assistant
    ASMCMD: The ASM Command-Line Utility
    ASM FTP Utility
    ASMLib
    Installing ASMLib
    Configuring ASMLib
    Oracle ASM Filter Driver (Oracle ASMFD)
    Summary

PART III  Oracle RAC Administration and Management

7  Oracle RAC Basic Administration
    Oracle RAC Initialization Parameters
    Parameters That Are Unique to an Instance
    Identical Parameters
    Instance Parameters That "Should" Be the Same
    Managing the Parameter File
    Backing Up the Server Parameter File
    Search Order for the Parameter Files in an Oracle RAC Database
    Starting and Stopping Instances
    Using SRVCTL to Start/Stop Databases and Instance(s)
    Administering the Oracle ASM Instances
    Using CRSCTL to Stop Databases and Instances
    Using SQL*Plus to Start/Stop Instances
    Common SRVCTL Management Commands
    Database-Related SRVCTL Commands
    Instance-Related SRVCTL Commands
    Listener-Related SRVCTL Commands
    Setting, Unsetting, and Displaying the Environment Variables
    Changing the Configuration of Databases and Instances
    Relocating Services

    Removing the Configuration Information for Specific Targets
    Predicting the Impact of Failures
    Managing Pluggable Databases in a RAC Environment
    Administering Undo in an Oracle RAC Database
    Administering a Temporary Tablespace
    Managing Traditional (Global) Tablespaces
    Managing Local Temporary Tablespaces
    Hierarchy of Temporary Tablespaces
    Administering Online Redo Logs
    Enabling Archive Logs in the Oracle RAC Environment
    Enabling the Flashback Area
    Managing Database Configuration with SRVCTL
    Killing Sessions on Specific Instances
    Managing Database Objects
    Managing Sequences
    Managing Tables
    Managing Indexes
    Scope of SQL ...
    Database Connections
    Administering ...
    ... Server Pools
    Managing Server Pools
    Consolidation of Databases
    Creating a Server Pool
    Converting an Administrator-Managed ... to a Policy-Managed ...

8  ... Oracle Clusterware
    Configuring and Administering Oracle Clusterware
    Benefits of Server Pools
    Server Pools and Policy-Based Management
    Server Pools and Categorization
    How Server Pools Work
    Types of Server Pools
    Creating Server Pools
    Evaluating the Addition of a Server Pool
    Deleting a Server Pool
    Role-Separated Management
    Managing Cluster Administrators
    Configuring Role Separation
    Using the crsctl setperm Command
    Weight-Based Server Node Eviction
    Administering SCAN
    Starting and Stopping SCAN
    Displaying the SCAN Status
    Administering the Grid Naming Service (GNS)

    Using the CLUVFY Utility for Managing Oracle RAC
    Clusterware Startup
    The Clusterware Startup Process
    Clusterware Starting Sequence
    Oracle Clusterware Auto-Startup
    Oracle Clusterware Manual Startup
    Using CRSCTL to Manage the Clusterware
    Starting and Stopping CRS
    Clusterized (Cluster-Aware) CRSCTL Commands
    Verifying the Status of CRS
    Disabling and Enabling CRS
    The CRSCTL EVAL Commands
    Other Utilities to Manage Oracle Clusterware
    Using the olsnodes Command
    The GPnP Tool
    The Cluster Health Monitor
    The OCLUMON Tool
    Oracle Interface Configuration: oifcfg
    Cluster Configuration Utility: clscfg
    The Cluster Name Check Utility: cemutlo
    Oracle Trace File Analyzer
    Administering the OCR
    Checking OCR Integrity
    Dumping OCR Information
    Managing the OCR with the OCRCONFIG Utility
    Maintaining a Mirror OCR
    Migrating the OCR to ASM
    Administering the Oracle Local ...
    Administering the ...
    Storage for Voting ...
    Backing Up Voting Disks
    Restoring Voting Disks
    Adding and Deleting Voting Disks
    Migrating Voting Disks
    Summary

9  Oracle RAC Backup and Recovery
    Introduction to Backups
    Oracle Backup Basics
    ...ng Backups in Oracle
    Backup Using ...
    ... Recovery
    ... in Oracle RAC
    Redo Threads and ...
    Redo Records and Change Vectors

    Crash Recovery
    Steps in Crash Recovery (Single Instance)
    Crash Recovery in Oracle RAC
    Instance Recovery
    Crash Recovery and Media Recovery
    Bounded Recovery
    Block-Written Record (BWR)
    Past Image (PI)
    Two-Pass Recovery
    Cache Fusion Recovery
    Dynamic Reconfiguration and Affinity Remastering
    Fast Reconfiguration in Oracle RAC
    Internals of Cache Fusion Recovery
    Backup and Recovery of the Voting Disk and OCR
    Backup and Recovery of Voting Disks
    Backup and Recovery of OCR
    Validating OCR Backups
    Summary

10  Oracle RAC Performance Management
    Oracle RAC Design Considerations
    Oracle Design Best Practices
    Oracle RAC-Specific Design Best Practices
    Partitioning the Workload
    Scalability and Performance
    Choosing the Block Size for an Oracle RAC Database
    Introduction to the V$ and GV$ Views
    Parallel Query Slaves
    V$ Views Containing Cache Fusion Statistics
    Oracle RAC Wait Events
    Understanding Cluster Waits
    Global Cache Statistics
    Global Cache Statistics Summary
    Global Cache Service Times
    Global Cache Service Times Summary
    Enqueue Tuning in Oracle RAC
    Oracle AWR Report
    Interpreting the AWR Report
    ADDM
    ASH Reports
    Tuning the Cluster Interconnect
    Verifying That Private Interconnect Is Used
    Interconnect Latencies
    Verifying That Network Interconnect Is Not Saturated
    Summary

PART IV  Advanced Oracle RAC Concepts

11  Global Resource Directory
    Resources and Enqueues
    Grants and Conversions
    Locks and Enqueues
    Cache Coherency
    Global Enqueue Services
    Latches and Enqueues
    Global Locks Database and Structure
    Messaging in Oracle RAC
    Global Cache Services
    Lock Modes and Lock Roles
    Consistent Read Processing
    GCS Resource Mastering
    Read-Mostly Locking
    Summary

12  A Closer Look at Cache Fusion
    Key Components in Cache Fusion
    Ping
    Deferred Ping
    Past Image (PI) Blocks
    Lock Mastering
    Types of Contention
    Cache Fusion I, or Consistent Read Server
    Cache Fusion II, or Write/Write Cache Fusion
    Cache Fusion in Operation
    Cache ...
    ... and Remastering
    ... Processes and Cache Fusion
    LMON: Lock Monitor Process
    LMS: Lock Manager Server
    LMD: Lock Manager Daemon Process (LMDn)
    LCKn: Lock Process (LCK0)
    DIAG: Diagnostic Daemon (DIAG)
    Summary

13  Workload Management, Connection Management, and Application Continuity
    Understanding Dynamic Database Services
    Service Characteristics
    Services and Policy-Managed Databases
    Resource Management and Services
    Using Services with Oracle Scheduler

    Administering Services
    Using Views to Get Service Information
    Distributed Transaction Processing
    AQ HA Notifications
    Workload Distribution and Load Balancing
    Hardware and Software Load Balancing
    Client-Side Load Balancing
    Server-Side Load Balancing
    Transparent Application Failover
    TAF Considerations
    Workload Balancing
    Measuring Workloads by Service
    Using Service-Level Thresholds
    Oracle RAC High Availability Features
    High Availability, Notifications, and FAN
    Event-Based Notification
    Application Failure Issues
    Using Transaction Guard for Efficient Client Failover
    Application Continuity
    Summary

14  Oracle ... Troubleshooting
    Log Directory Structure in the Oracle RDBMS
    Log Directory Structure in Oracle Grid Infrastructure
    Troubleshooting a Failed Oracle Grid Infrastructure Installation
    Inside the Database Alert Log
    RAC ON and OFF
    Database Performance Issues
    Hung Database
    Hanganalyze Utility
    Debugging Node Eviction Issues
    Cluster Health Monitor
    Instance Membership Recovery
    Advanced Debugging for Oracle Clusterware Modules
    Debugging Various Utilities in Oracle RAC
    Using ORAchk to Troubleshoot RAC
    Summary

PART V  Deploying Oracle RAC

15  Extending Oracle RAC for Maximum Availability
    Benefits of Extended RAC Clusters
    Full Utilization of Resources
    Extremely Rapid Recovery

    Design Considerations
    Speed of Light
    Network Connectivity
    Cache Fusion Performance
    Data Storage
    Common Techniques for Data Mirroring
    Array-Based Mirroring
    Host-Based Mirroring
    ASM Preferred Read
    Challenges in Extended Clusters
    Extended Oracle RAC Limitations
    Extended Oracle RAC vs. Oracle Data Guard
    Summary

16  Developing Applications for Oracle RAC
    Application Partitioning
    Best Practice: Application Partitioning
    Data Partitioning
    Best ... Guidance Systems
    Buffer Busy Waits/Block ...
    ... Waits: Index Branch/Leaf Blocks Contention
    Sorted Hash Clusters
    Working with Sequences
    CACHE and NOORDER
    CACHE and ORDER
    NOCACHE and ORDER
    Best Practice: Use Different Sequences for Each Instance
    Connection Management
    Full Table Scans
    Identifying Full Table Scans
    Interconnect Protocol
    Ethernet Frame Size
    Library Cache Effect in the Parsing
    Commit Frequency
    Summary

PART VI  Appendixes

A  Oracle RAC Reference
    Global Cache Services and Cache Fusion Diagnostics
    V$CACHE
    V$CACHE_TRANSFER
    V$INSTANCE_CACHE_TRANSFER

    V$CR_BLOCK_SERVER
    V$CURRENT_BLOCK_SERVER
    V$GC_ELEMENT
    Global Enqueue Services Diagnostics
    V$LOCK
    V$GES_BLOCKING_ENQUEUE
    V$ENQUEUE_STATISTICS
    V$LOCKED_OBJECT
    V$GES_STATISTICS
    V$GES_ENQUEUE
    V$GES_CONVERT_LOCAL
    V$GES_CONVERT_REMOTE
    V$GES_RESOURCE
    Dynamic Resource Remastering Diagnostics
    V$HVMASTER_INFO
    V$GCSHVMASTER_INFO
    V$GCSPFMASTER_INFO
    Cluster Interconnect Diagnostics
    V$CLUSTER_INTERCONNECTS
    V$CONFIGURED_INTERCONNECTS

B  Adding and Removing Cluster Nodes
    Adding a Node
    Performing Pre-Installation Checks
    Executing the addNode.sh Script
    Installing the Oracle Database Software
    Creating a Database Instance
    Removing a Node
    Deleting the Instance on the Node to Be Deleted
    Removing the Node from the Database
    Removing the RAC4 Node from the Clusterware

Index
