SOA PaaS Disaster Recovery Overview - Oracle


SOA PaaS DR Overview
SOACS and SOAMP on OCI Disaster Recovery
PaaS MAA team
September 2021

Program agenda
1. Introduction
2. SOA Cloud Service & SOA Marketplace
3. SOACS/MP DR Topology
4. SOACS/MP DR Setup
5. SOACS/MP DR main Lifecycle operations
6. Links
Copyright 2021, Oracle and/or its affiliates. Confidential: Public


INTRODUCTION
Maximum Availability Architecture
- Oracle Maximum Availability Architecture (MAA)
  - Oracle’s best-practices blueprint, which uses Oracle’s proven technologies to provide Disaster Recovery solutions for the whole Oracle stack.
  - The key goal of MAA is to achieve optimal high availability, data protection, and disaster recovery for Oracle customers at the lowest cost and complexity, by minimizing the RPO and RTO of the system.
  - MAA consists of reference architectures, configuration practices, and HA lifecycle operational best practices applicable to non-engineered systems, engineered systems, and non-cloud and cloud deployments.
- Disaster Recovery (DR) solutions are MAA architectures intended to protect mission-critical systems by providing a secondary system in another geographical area.
- Disaster Recovery protection is also required for systems running in the cloud, such as SOA Cloud Service (SOACS) and Oracle SOA Suite on Marketplace (SOAMP).
  - DR is protection in addition to High Availability. SOAMP and SOACS provide High Availability by default.

INTRODUCTION
High Availability in the scope of a single OCI Region
- Oracle SOA Cloud Service uses the Active high availability (HA) policy for compute when it provisions compute nodes: virtual machines (VMs) fail over automatically to another physical compute node in the same compute zone if the primary compute node fails.
- By default, SOACS and SOAMP place each compute instance used by the WLS cluster in a different Fault Domain.
- In SOAMP, when using regional subnets, the provisioning process places each compute instance used by the WLS cluster in a different Availability Domain (AD).
- Additionally, the front-end LBR in OCI used by SOAMP is regional, and failover across ADs is provided out of the box (OOTB) in regions with more than one AD.
- The database can also be protected against AD failures by using Oracle Data Guard and placing the standby in a different AD (see on-prem MDC AA for datasource configuration).
- This configuration, however, does not protect against disasters that affect an entire region.

INTRODUCTION
MAA topology for SOA on Cloud
- The Disaster Recovery solution for SOA in Cloud was initially released in 2016 and has been implemented by many customers.
- Three whitepapers were released, one for each SOA cloud service type and infrastructure:
  - For SOA Marketplace: SOA Suite on Oracle Cloud Infrastructure Marketplace Disaster Recovery
  - For SOACS on OCI: SOA Cloud Service Disaster Recovery on OCI - Production and DR in the Cloud
  - For SOACS on OCI Classic: SOA Cloud Service Disaster Recovery on OCI Classic - Production and DR in the Cloud
- The solution provided by each whitepaper consists of:
  - A recommended Disaster Recovery topology
  - A list of steps and automation tools for the initial DR setup
  - A list of recommendations and steps for the system’s lifecycle management

INTRODUCTION
MAA topology for SOA on Cloud
The DR solution for SOACS and SOAMP systems involves setting up a standby system in a geographically separated Oracle Cloud data center. It uses an active-passive model.
- Based on solid and proven DR technologies
  - While there are some considerations unique to a cloud disaster recovery configuration, it follows the same Oracle MAA best practices as any Oracle Fusion Middleware (FMW) and Oracle Database deployment.
  - Based on Data Guard (more than 20 years providing DR).
- Cross-region
  - The standby system is located in a geographically different Oracle Cloud data center, in an active-passive model.
  - Cross-region DR provides real protection against any unforeseen event (natural or man-made) that can put your organization at risk.
- Provides the best RTO and RPO
  - By using the high availability and disaster protection capabilities provided by Oracle Fusion Middleware and Oracle Database.
  - RTO for a typical switchover: 15-30 min

INTRODUCTION
Customer experiences
- The SOACS/MP DR whitepaper defines the reference topology for disaster recovery.
- However, customers have implemented variations on the reference topology to address particular cases:
  - Cross-AD instead of cross-region
  - Cross-AD using a single frontend LBR
  - Setup with RAC DB systems in the DB layer using automated scripts (*)
  - DR setup integrated with the customer’s automation tools
  - Integration with JMS client applications
(*) When RAC was not supported by the Disaster Recovery Setup (DRS) tool. The DRS tool has supported RAC since August 2020.


SOA Cloud Service & SOA Marketplace
Introduction
- Both Oracle SOA Cloud Service (SOACS) and Oracle SOA Suite on Marketplace (SOAMP) provide a PaaS (Platform as a Service) computing platform for running SOA applications in the cloud.
- SOACS
  - It was initially released for OCI Classic and then migrated to OCI.
  - It is based on PSM (Platform Service Manager).
- SOAMP
  - It is an OCI-native solution, provisioned via Marketplace images.
  - It is recommended for new deployments.
- Complete list of differences between Oracle SOA Cloud Service and Oracle SOA Suite [...]

SOA Cloud Service & SOA Marketplace
Comparison in the Disaster Recovery area
Similarities
- The DR topology is the same (the frontend can differ: OTD/LBR in SOACS and LBR in SOAMP).
- The setup procedure is almost the same.
- The main lifecycle operations are the same.
- Oracle Site Guard can be used in both cases to manage switchovers.
Differences
- WLS config replica methods:
  - SOACS supports only the DBFS method.
  - SOAMP supports the DBFS method, FSS with rsync, and Block Volume cross-region replica.
- New features and improvements will be introduced only in the SOAMP solution. Examples: FSS with rsync, the scale-out steps, etc.
- The whitepapers are different
  - To better accommodate the specifics of each service (provisioning menus, resource naming conventions, differences in lifecycle operations, setup, etc.) and future changes that affect only one of them.
- SOAMP uses an improved DRS framework
  - The main features are aligned, but there are differences between the two (support for the FSS with rsync method, new runtime options, etc.).


SOACS/MP DR Topology
Overview
- Active-passive model
  - Primary SOA and DB system in one region
  - Standby SOA and DB system in a different region
- DB systems are configured with Data Guard.
- The standby WLS domain is a replica of the primary domain (same name, schemas, passwords, etc.; only the db connect string is different). Two options for the WLS config replica:
  - DBFS based method
  - FSS with rsync method
- A unique frontend hostname is used to access the system. It is a "virtual name" that points to the IP of the LBR of the site with the primary role.
- Network communication between the primary and secondary networks goes through a Dynamic Routing Gateway (recommended).
[Figure: DNS (or local hosts resolution) maps mysoampdr.mycompany.com to the LBR in the VCN of OCI Region 1 (Primary); Data Guard and the WLS domain config replica copy to the VCN of OCI Region 2 (Standby), which has its own LBR.]
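The role of the virtual frontend name can be illustrated with a small check. This is a minimal sketch, not part of the whitepaper tooling: the function name active_site and the IP addresses are hypothetical, and the resolver is injectable so the logic can be exercised without real DNS.

```python
import socket
from typing import Callable, Dict


def active_site(frontend: str,
                lbr_ips: Dict[str, str],
                resolve: Callable[[str], str] = socket.gethostbyname) -> str:
    """Return the name of the site whose LBR IP the frontend name resolves to."""
    ip = resolve(frontend)  # DNS or local hosts-file resolution
    for site, lbr_ip in lbr_ips.items():
        if lbr_ip == ip:
            return site
    raise LookupError(f"{frontend} resolves to {ip}, which matches no known LBR")


# Hypothetical LBR public IPs (documentation address ranges), one per region.
EXAMPLE_LBRS = {"region1": "203.0.113.10", "region2": "198.51.100.20"}
```

Calling active_site("mysoampdr.mycompany.com", EXAMPLE_LBRS) with the default resolver would report which region clients are currently being sent to, which is a quick way to confirm that a DNS change has taken effect.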

SOACS/MP DR Topology
Overview: normal operation and after a switchover
[Figure: two topology diagrams. Normal operation: DNS (or local hosts resolution) points the frontend name to the LBR in OCI Region 1 (Primary); Data Guard and the WLS domain config replica copy to OCI Region 2 (Standby). After a switchover: the DNS entry (or local hosts resolution) is switched to the LBR in OCI Region 2 (Primary), and OCI Region 1 becomes the standby.]

SOACS/MP DR Topology
WebLogic Domain config replica
DBFS based method
- A DBFS mount is used as the staging file system for a copy of the WLS domain.
- Uses the underlying Data Guard replica to copy the domain to the standby region.
- Recommended for any latency (high or low).
- Supported in SOACS and SOAMP DR.
- Takes advantage of the robustness of the DG replica.
- More resilient behavior through the Oracle Driver’s retry logic.
- More complex to configure (db client required) and maintain.
FSS with rsync method
- File Storage Service (FSS) is used as the staging file system for a copy of the WLS domain.
- Uses rsync to copy the domain to the standby region.
- Recommended when latency is low.
- Supported only in SOAMP DR.
- Easy to configure and maintain.
- More sensitive to latency and jitter.
[Figure: in both methods, the config_replica.sh script runs on the primary site and on the standby site.]
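The comparison above can be condensed into a small lookup. This is only a sketch of the slide's recommendations: the names SUPPORTED and recommended_methods are hypothetical, the slide gives no numeric threshold for "low latency" (so it is passed in as a flag), and the Block Volume option is left out because this slide compares only the two staging-based methods.

```python
from typing import Dict, List

# Support matrix as stated on the slide: FSS with rsync is SOAMP-only.
SUPPORTED: Dict[str, List[str]] = {
    "SOACS": ["DBFS"],
    "SOAMP": ["DBFS", "FSS_RSYNC"],
}


def recommended_methods(service: str, low_latency: bool) -> List[str]:
    """Return the staging-based replica methods recommended for a service."""
    methods = SUPPORTED[service.upper()]
    if low_latency:
        return list(methods)
    # FSS with rsync is recommended only when latency is low.
    return [m for m in methods if m != "FSS_RSYNC"]
```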

SOACS/MP DR Topology
Topology - DBFS based method
[Figure: topology diagram for the DBFS based method]

SOACS/MP DR Topology
Topology - FSS with rsync method (supported in SOAMP only)
[Figure: topology diagram for the FSS with rsync method]

SOACS/MP DR Topology
Topology - Block Volume cross-region replica (new since July 2021; supported in SOAMP only)
- The Block Volume containing the WLS domain is replicated using the Cross-Region Block Volume Replication feature (automatic asynchronous replication to another region).
- No staging location is used; hence, the setup and ongoing replication differ significantly from the DBFS and FSS with rsync approaches.
- Main disadvantages of this model:
  - Slightly worse RTO
  - More complex switchover operations
- Main advantages:
  - It is a general-purpose solution, applicable not only to FMW-based PaaS services.
  - It provides continuous and automated replication.
- More details in Appendix E of the whitepaper.

SOACS/MP DR Topology
Topology - Block Volume cross-region replica (new since July 2021; supported in SOAMP only)
[Figure: two topology diagrams. Normal operation: DNS (or local hosts resolution) points to the LBR in OCI Region 1 (Primary); Data Guard and the Block Volumes replica copy to OCI Region 2 (Standby). After a switchover: DNS (or local hosts resolution) is switched to the LBR in OCI Region 2 (Primary), and OCI Region 1 becomes the standby.]


SOACS/MP DR setup
Setup vs Management
DR Setup
- Initial configuration; a one-time operation.
- The DR setup has evolved since the initial whitepapers:
  - Automation level 0, manual step-by-step: initially, the DR setup was a very manual step-by-step procedure (copy folders, tar, scp, replacements, etc.).
  - Automation level 1, DR setup scripts: when SOACS was released on OCI, disaster recovery setup scripts were created to automate many steps.
  - Automation level 3, DRS tool: in 2019, the DRS tool was released to wrap the DR setup scripts in a single operation, orchestrate the execution, and automate some additional tasks (aliases, etc.). Available for both SOACS and SOAMP.
DR Management
- Similar to on-premises. The specific DR operations are:
  - Switchover/failover. They can be done:
    - Manually
    - With Oracle Site Guard (when configured)
  - WebLogic config replication
    - Oracle provides a script to replicate the mid-tier configuration.
Regardless of how the setup is done (more manually with the DR setup scripts, or more automated with the DRS tool), the resulting DR topology is supported and the runtime is the same.

SOACS/MP DR setup
Starting point
- Starting point assumption: the primary SOA system already exists (along with its LBR and DB system).
- The impact of the DR setup on the existing system is minimal:
  - Downtime (a restart of the managed servers) is needed only if the frontend name was not already configured or the existing frontend name is not going to be reused for DR.
- The DR setup process has been designed to be idempotent: each step can be retried.
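What "idempotent" means for a setup step can be shown with a toy example. This is a sketch under stated assumptions: ensure_hosts_entry is a hypothetical step (adding a host alias to hosts-file text), not code from the DR setup scripts or the DRS tool. The point is only that running the step a second time leaves the result unchanged, so a retry is safe.

```python
def ensure_hosts_entry(hosts_text: str, ip: str, name: str) -> str:
    """Return hosts-file text that maps name to ip, adding a line only if needed."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # ignore trailing comments
        if len(fields) >= 2 and fields[0] == ip and name in fields[1:]:
            return hosts_text  # entry already present: nothing to do
    separator = "" if (not hosts_text or hosts_text.endswith("\n")) else "\n"
    return f"{hosts_text}{separator}{ip} {name}\n"
```

Applying the function to its own output returns the same text, which is the property that makes a failed or interrupted run safe to repeat.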

SOACS/MP DR setup
Steps
Starting point: the primary SOACS/MP exists.
1. Choose a virtual frontend name and register it in DNS
2. Prepare the primary mid-tier to use the virtual frontend name
3. Set up the secondary database
4. Provision the secondary SOA
5. Prepare the secondary mid-tier for the virtual frontend
6. Configure the staging mounts for WebLogic config replication (in FSS method)
7. Download and run DRS
Result: DR setup complete.

SOACS DR setup
Details on step 3: Set up the secondary database
The secondary database is created as a Data Guard physical standby of the primary database. There are two ways to do this.
Option 1) Configuring Data Guard using the OCI Console ("auto DG")
- Since March 2020, the OCI Console allows configuring Data Guard cross-region (before, only cross-AD was supported).
- Some requirements: same tenancy, same compartment, communication between Dynamic Routing Gateways.
Option 2) Configuring Data Guard manually ("manual DG")
- For scenarios where option 1 does not apply, it can be done manually.
- First, provision the standby database as a regular DB System (same version, shape, password, etc. as the primary).
- Second, use the scripts provided in the whitepaper, dataguardit_primary.sh and dataguardit_standby_root.sh, to configure it as a standby (rman duplicate, dgmgrl commands, etc.).

SOACS DR setup
Details on step 4: Provision the secondary SOA
- Convert the secondary database into a SNAPSHOT STANDBY (a fully updatable database; any modification is lost when it is converted to a physical standby again):
    [oracle@soacsdrDBa ~]$ dgmgrl sys/your_sys_password@primary_db_unqname
    DGMGRL> CONVERT DATABASE secondary_db_unqname TO SNAPSHOT STANDBY;
    Converting database "secondary_db_unqname" to a Snapshot Standby database, please wait.
    Database "secondary_db_unqname" converted successfully
- Provision the secondary SOA as usual (SOACS or SOAMP), pointing to the secondary database.
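The conversion is later undone with the matching PHYSICAL STANDBY command. The sketch below only builds the two DGMGRL command strings; convert_command is a hypothetical helper, the name check is a conservative assumption rather than Oracle's exact naming rule, and nothing here connects to a database.

```python
import re

# Conservative, assumed pattern for an unquoted database unique name.
_VALID_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def convert_command(db_unique_name: str, to_snapshot: bool) -> str:
    """Build the DGMGRL command that converts a standby to snapshot or back."""
    if not _VALID_NAME.match(db_unique_name):
        raise ValueError(f"unexpected database unique name: {db_unique_name!r}")
    role = "SNAPSHOT STANDBY" if to_snapshot else "PHYSICAL STANDBY"
    return f"CONVERT DATABASE {db_unique_name} TO {role};"
```

For example, convert_command("secondary_db_unqname", False) yields the command that returns the database to its physical standby role once the secondary SOA has been provisioned and set up.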

SOACS DR setup
Details on step 7: Download and run DRS
- The Disaster Recovery Setup (DRS) framework is written in Python. It wraps the FMW DR setup scripts, orchestrates the execution of the DR setup, and runs pre-checks and post-checks.
- To run the DRS tool:
  - Choose a host that has SSH access to all the hosts in the DR topology (primary and secondary mid-tier and db hosts).
  - Download the DRS tool, upload it to the host, and untar it.
  - Review README.md and customize drs_user_config.yaml with the environment values.
  - Run "sh drs_run.sh --config_dr".
- The DRS tool can be re-run:
  - First, shut down the secondary processes if they are running (admin server, WLS servers, node managers).
  - Restore the domain backup that DRS takes on the secondary hosts.
  - Verify that the standby database is in snapshot standby mode.
  - Re-run "sh drs_run.sh --config_dr --skip_checks".
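The re-run conditions can be written down as a small pre-flight check. This is a sketch, not part of DRS: rerun_blockers and rerun_command are hypothetical helpers, the three inputs must be gathered by the operator, and the command spelling (drs_run.sh, --config_dr, --skip_checks, with underscores) is an assumption about the slide's original text that should be confirmed against the tool's README.md.

```python
from typing import List


def rerun_blockers(secondary_processes_running: bool,
                   domain_backup_restored: bool,
                   standby_role: str) -> List[str]:
    """List what still has to be done before DRS can be run again."""
    blockers: List[str] = []
    if secondary_processes_running:
        blockers.append("shut down secondary admin server, WLS servers and node managers")
    if not domain_backup_restored:
        blockers.append("restore the domain backup taken by DRS on the secondary hosts")
    if standby_role.strip().upper() != "SNAPSHOT STANDBY":
        blockers.append("convert the standby database to snapshot standby")
    return blockers


def rerun_command() -> List[str]:
    """Command line for a re-run (assumed spelling, see lead-in)."""
    return ["sh", "drs_run.sh", "--config_dr", "--skip_checks"]
```

An empty list from rerun_blockers means the three conditions on the slide are met and the command from rerun_command can be issued.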


SOACS/MP DR Lifecycle Operations
Main lifecycle operations in DR
- WLS config replication
- Switchover
- Failover
Other lifecycle operations
- Scale-out/in
- Recreate the dbfs wallet
- Open the secondary for validation

SOACS/MP DR Lifecycle Operations
WLS Config Replication
OPTION 1) WHEN DOMAIN CHANGES ARE INFREQUENT
- Apply the configuration manually twice:
  1. Apply the configuration change normally in the primary site.
  2. Convert the standby database to a snapshot standby.
  3. Start the WebLogic Administration Server on the secondary site (if it was not already started).
  4. Repeat the configuration change in the secondary site.
  5. Revert the database to a physical standby.
OPTION 2) WHEN DOMAIN CHANGES ARE FREQUENT
- Use the provided script to replicate changes:
  - Run the script on the primary WLS Administration host:
    - It copies the primary domain to the staging mount (DBFS or FSS).
    - In the FSS with rsync approach, the script also rsyncs the copy to the secondary FSS.
    - In the DBFS approach, the DBFS content is automatically replicated to the secondary site by Data Guard.
  - Run the script on the secondary WLS Administration host:
    - It copies from the secondary staging mount (DBFS or FSS) to the secondary domain, making the required replacements in the db connection string.
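The "replacements in the db connection string" on the secondary side can be pictured as follows. This is an illustrative sketch only and does not reproduce what the provided script does internally: retarget_connect_string is a hypothetical helper, and the connect strings used with it are made-up examples.

```python
def retarget_connect_string(config_text: str,
                            primary_connect: str,
                            secondary_connect: str) -> str:
    """Point a copied datasource definition at the secondary database.

    Raises ValueError if the primary connect string is not found, so a copy
    that was already retargeted (or is unexpected) is not silently accepted.
    """
    if primary_connect not in config_text:
        raise ValueError("primary connect string not found in configuration")
    return config_text.replace(primary_connect, secondary_connect)
```

Everything else in the copied domain stays identical, which matches the topology rule that the standby domain differs from the primary only in its db connect string.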

SOACS/MP DR Lifecycle Operations
WLS Config Replication
- DBFS approach: script config_replica.sh (dbfscopy.sh in previous versions and in SOACS)
- FSS with rsync approach: script config_replica.sh

SOACS/MP DR Lifecycle Operations
Switchover
A switchover is a planned operation in which an administrator reverses the roles of the two sites.
SWITCHOVER STEP / DETAILS
1. Propagate any pending configuration changes
   Run config_replica.sh on the primary admin node and then on the secondary admin node.
2. Stop servers in the primary site
   Use the WebLogic Administration Server Console or scripts to stop the managed servers in the primary site (the admin server can remain up).
3. Switchover DNS name
   Perform the required DNS push in the DNS server hosting the names used by the system, or alter the hosts file resolution in clients, to point the frontend address of the system to the public IP used by the LBR in site 2.
4. Switchover database
   Use DG broker on the primary db host to perform the switchover. As user oracle:
     # dgmgrl sys/your_sys_password@primary_db_unqname
     DGMGRL> switchover to "secondary_db_unqname"
5. Start the servers in the secondary site (new primary)
   Restart the secondary Admin Server if configuration changes were replicated while this site was the standby, so they take effect. Start the secondary managed servers (using the WebLogic Console or scripts).
A switchback consists of the same steps, in the opposite direction.
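The ordering of the five switchover steps matters, and a failed step should stop the sequence. The sketch below models that as a tiny runbook; SWITCHOVER_STEPS and run_switchover are hypothetical names, and each action is a caller-supplied function returning True on success, so the sketch itself touches no servers, DNS, or database.

```python
from typing import Callable, Dict, List, Optional, Tuple

SWITCHOVER_STEPS: List[str] = [
    "propagate pending configuration changes",
    "stop managed servers in primary site",
    "switch DNS name to secondary LBR",
    "switchover database with DG broker",
    "start servers in secondary site",
]


def run_switchover(actions: Dict[str, Callable[[], bool]]
                   ) -> Tuple[List[str], Optional[str]]:
    """Run the steps in order; return (completed steps, failed step or None)."""
    completed: List[str] = []
    for step in SWITCHOVER_STEPS:
        if not actions[step]():
            return completed, step  # stop at the first failure
        completed.append(step)
    return completed, None
```

Returning the completed steps together with the failed one tells the operator exactly where to resume, or what to undo, after a problem.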

SOACS/MP DR Lifecycle Operations
Switchover
RTO based on our latest tests in SOAMP:
1. Propagate any pending configuration changes: 6 min
2. Stop servers in the primary site: 4 min (complete normal shutdown)
3. Switchover DNS name: depends on DNS and TTL
4. Switchover database: 3 min
5. Start the servers in the secondary site (new primary): 10 min (starting the admin server first and then the managed servers)
These times can vary depending on the host shapes, tuning, etc., but the TOTAL SWITCHOVER time is in the 15-30 min range.
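The measured step times add up as follows. This is simple arithmetic on the figures from the slide; the DNS step has no fixed figure there, so it is left as a parameter, and the names MEASURED_MINUTES and switchover_minutes are hypothetical.

```python
# Step durations in minutes, as reported on the slide for SOAMP tests.
MEASURED_MINUTES = {
    "propagate configuration changes": 6,
    "stop primary servers": 4,
    "switchover database": 3,
    "start secondary servers": 10,
}


def switchover_minutes(dns_minutes: float = 0) -> float:
    """Total switchover time: measured steps plus the DNS propagation time."""
    return sum(MEASURED_MINUTES.values()) + dns_minutes
```

The four measured steps total 6 + 4 + 3 + 10 = 23 minutes, so the stated 15-30 minute range leaves up to about 7 minutes for DNS propagation.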

SOACS/MP DR Lifecycle Operations
Failover
A failover operation is performed when the primary site becomes unavailable, and it is commonly an unplanned operation.
NOTE: it is assumed that the primary WebLogic servers are down. If they are not, it is recommended to shut them down before the failover.
FAILOVER STEP / DETAILS
1. Switchover DNS name
   Perform the required DNS push in the DNS server hosting the names used by the system, or alter the hosts file resolution in clients, to point the frontend address of the system to the public IP used by the LBR in site 2.
2. Failover database
   Use DG broker on the secondary db host to perform the failover. As user oracle:
     # dgmgrl sys/your_sys_password@secondary_db_unqname
     DGMGRL> failover to "secondary_db_unqname"
3. Start the servers in the secondary site
   Restart the secondary admin server if configuration changes were replicated while this site was the standby, so they take effect. Start the secondary managed servers (using the WebLogic Console or scripts).
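At the database level, a failover differs from a switchover in where DG broker is invoked and which command is issued. The sketch below captures just that difference; role_change is a hypothetical helper that returns strings and does not run dgmgrl.

```python
from typing import Tuple


def role_change(kind: str, primary: str, secondary: str) -> Tuple[str, str]:
    """Return (database to connect to, DGMGRL command) for a role change."""
    if kind == "switchover":
        # Planned: connect to the current primary, which is still available.
        return primary, f'switchover to "{secondary}"'
    if kind == "failover":
        # Unplanned: the primary is unavailable, so connect to the standby.
        return secondary, f'failover to "{secondary}"'
    raise ValueError(f"unknown role change: {kind!r}")
```

In both cases the target of the command is the secondary database; only the connection point and the verb change.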

SOACS/MP DR Lifecycle Operations
Using Oracle Site Guard for Switchover/Failover
- A full-stack switchover can be orchestrated by Oracle Site Guard.
- The required setup is documented in a separate whitepaper (common to SOAMP and SOACS DR).
[Figure: an EM administrator performs the switchover/failover from EM Cloud Control using Oracle Site Guard.]



LINKS
Documents in OTN
Summary of the SOA on Cloud Disaster Recovery whitepapers:
- SOA on Marketplace DR: SOA Suite on Oracle Cloud Infrastructure Marketplace Disaster Recovery ([...]amp-dr.pdf)
- SOACS on OCI DR: SOA Cloud Service Disaster Recovery on OCI - Production and DR in the Cloud ([...]-dr-oci.pdf)
- To configure Oracle Site Guard to manage switchovers for SOACS and SOAMP DR: Using Oracle Site Guard to Manage Disaster Recovery for OCI PaaS [...] ([...]eguard-paasdr.pdf)

LINKS
Documents in OTN
- The PaaS DR whitepapers are published in the MAA OTN pages:
  - MAA Best Practices for the Oracle Cloud ([...]-availability/oracle-cloud-maa.html)
  - MAA Best Practices - Oracle Fusion Middleware

Thank you
