Design Considerations For Block Based Storage Virtualization Applications


Venugopal Reddy
Global Solutions Architect
EMC Corporation
Reddy venugopal@emc.com

EMC Proven Professional Knowledge Sharing 2010

Table of Contents

Introduction
Virtualization – Virtual Entities
Virtual Target
Virtual Initiator
Virtual LUN
ITL
Virtualization hardware
Applications
RecoverPoint
SANTAP Implementation
Brocade Implementation
Invista
Storage Encryption
Cisco Storage Media Encryption
Brocade Disk and Tape Encryption
Design Considerations
RecoverPoint
SAN design
Fabric Splitter Sizing
Journal volume design
Sizing the RPAs (Appliances)
Sizing the WAN Pipe
Databases in Consistency Groups
RecoverPoint over Invista
Splitter configuration limits
Storage performance with RecoverPoint
SANTAP Performance
FAP performance
Network latency and BW requirements
Statistics and Bottlenecks
Storage Encryption with Cisco SME
Storage Encryption with Brocade Encryption Services
Future Directions
Conclusion
References

Disclaimer: The views, processes or methodologies published in this compilation are those of the authors. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.

2010 EMC Proven Professional Knowledge Sharing

Introduction

The tumultuous global events during recent years accentuate the need for organizations to optimize their information technology budgets by increasing resource utilization, reducing infrastructure costs, and accelerating deployment cycles. Simultaneously, regulatory framework and compliance requirements are imposing new demands on the ways we store, access, and manage data.

“Network based Storage Virtualization,” the term describing virtualization implemented in Storage Area Networks (SANs), is an enabling technology that has spawned several innovative applications that are beginning to form a framework to optimize information life cycle management in data centers and to solve the challenges mentioned above. These innovations include the ability to create and manage dynamic storage resource pools, perform seamless data migrations, encrypt stored data, and enable long distance replication and serverless backups.

The core functionality of data virtualization in a SAN is its ability to map, redirect, or make a copy of the data within the intelligent SAN switch. The block level I/O data that can be mapped, redirected, or copied fosters unprecedented flexibility and facilitates some of these innovations. Products such as EMC’s RecoverPoint and Invista, and Storage Encryption solutions from Brocade and Cisco are some of these block based storage virtualization applications. They implement virtualization in the “network” to achieve innovative uses of data without impacting I/O performance. The intelligent appliances and SAN switches implementing the virtualization layer provide high performance and scalability to match the needs of enterprise class application environments.

Network based storage virtualization, while enabling a number of innovative applications, poses a number of important design and deployment challenges. The implementation of the technology requires specialized ‘virtualizing modules’ in the SAN switches that are highly vendor specific and hardware dependent. The inherent complexities and the interdisciplinary nature of the technologies also call for special considerations when designing, deploying and maintaining these new generation applications.

Successful deployment of network based storage virtualization applications requires meticulous planning and design, and an intimate understanding of the intelligence implemented in the SAN layer. This article provides a practitioner’s perspective on implementing these solutions and aims to:

1. Extend the understanding of the mechanisms of various virtual entities that exist in network based virtualization
2. Provide insight into a broad range of applications that benefit from virtualization in the SANs
3. Share recommendations and best practices for successful virtualized infrastructure planning, design and deployment considering scalability, availability and reliability
4. Describe techniques to maintain the virtualized infrastructure while sustaining performance

Virtualization – Virtual Entities

The principle behind virtualization is to capture the I/O in flight between the host and the target within the fabric. Once the I/O is intercepted, you can perform various operations on it: for example, make a copy, redirect the I/O, or encrypt the data. Virtualization hardware within the fabric creates virtual entities by implementing this mechanism of intercepting the I/O. There are four main components of virtual entities:

- Virtual Targets (VT)
- Virtual Initiators (VI)
- Virtual LUNs (vLUN)
- Initiator-Target-LUN (ITL) nexus

Virtual Target

A VT is a virtual entity created in the fabric that is presented to the real host as a target. In some cases, the VT can assume the identity (WWN) of the real storage target (as with SANTAP on Cisco switches); in others it has a different identity. The physical host performs I/O on these virtual targets, and the virtualization modules intercept the I/O.

Virtual Initiator

A VI is a virtual entity created in the fabric that is presented to the physical target as a host HBA. In some cases, the VI can assume the identity (WWN) of the real host HBA (as with SANTAP on Cisco switches); in others it has a different identity. The Virtual Initiator performs I/O on the physical storage targets; the virtualization module acts as the ‘intermediary’ between the physical host initiator (HI) and the Physical Target (PT).

Virtual LUN

A vLUN is a virtual entity created on the VT. The physical host initiator performs its I/O on the vLUN. The virtualization hardware intercepts the I/O operation to the vLUN and then redirects it to the physical target (through the VI) after performing the necessary operations (copy, map, encrypt, etc.). A vLUN can be created from one or more physical LUNs from disparate arrays. Features such as striping, mirroring, and concatenation are implemented at this layer.

ITL

A front-end ITL (FITL) is a virtual entity that forms a nexus between the physical host HBA, the Virtual Target, and the virtual LUN. Front-end ITL counts and their placement in the fabric play an important role in optimizing the performance and scalability of the virtualized application. ITLs define the limits of the virtualization hardware.

A back-end ITL (BITL) is a nexus between the Virtual Initiator, the Physical Target, and the physical LUN. The virtualization framework in the fabric manages the mapping between an FITL and a BITL.
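The relationship between the front-end and back-end nexus can be pictured as a lookup table. The following is a minimal Python sketch, not any vendor's implementation: the class names, field names, and the idea of a single in-memory dictionary are assumptions made purely for illustration. It shows one FITL mapping to one or more BITLs, since a vLUN may be built from several physical LUNs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEndITL:
    host_initiator: str  # WWN of the physical host HBA (HI)
    virtual_target: str  # WWN of the VT presented to the host
    vlun: int            # virtual LUN number on the VT


@dataclass(frozen=True)
class BackEndITL:
    virtual_initiator: str  # WWN of the VI presented to the array
    physical_target: str    # WWN of the physical storage port (PT)
    lun: int                # physical LUN number on the PT


class VirtualizationMap:
    """Hypothetical FITL -> BITL table kept by the virtualization layer."""

    def __init__(self):
        self._map = {}

    def bind(self, fitl, bitls):
        # A vLUN may span one or more physical LUNs, hence a list of BITLs.
        self._map[fitl] = list(bitls)

    def redirect(self, fitl):
        # Return the back-end nexus (or nexuses) that host I/O is sent to.
        return self._map[fitl]

    def fitl_count(self):
        # Front-end ITL count, the quantity that bounds hardware limits.
        return len(self._map)
```

Tracking the FITL count explicitly reflects the point made above that ITL counts, rather than port counts alone, define how far a virtualization module can scale.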

Virtualization hardware

In the current market, virtualization hardware comes from two major Fibre Channel switch equipment vendors, Brocade and Cisco. The virtualization modules are available as blade modules that can be inserted into directors and expandable switches, or in the form of standalone switches.

Brocade hardware

Virtualization hardware from Brocade comes in the following forms:

1) AP-7600B Storage Application Services switch
2) PB-48k-AP4-18 Storage Application Services blade
3) ES-5832B Encryption switch
4) PB-DCX-16EB Encryption blade

The Brocade application platforms (the standalone AP-7600B switch, or the PB-48k-AP4-18 blade for the Connectrix ED-48000B or ED-DCX-B director) implement the Storage Application Services (SAS) framework in the SAN using the Fabric Application Programming (FAP) layer. Brocade's SAS API implemented on these specialized modules provides hardware-implemented primitives such as mirroring, copying, extent maps, striping, concatenation, copy on write, resync, and dirty region logging. These primitives enable the virtualization applications to implement features such as mirroring, replication, snapshots, migration, and backup.

The ES-5832B and PB-DCX-16EB implement encryption services for data-at-rest on disk array LUNs using IEEE P1619 standard Advanced Encryption Standard (AES) 256-bit algorithms. I/O redirection using VIs and VTs enables the data to be processed by the six encryption/compression FPGAs in the blade or switch.

Cisco Hardware

Cisco’s virtualization hardware comes in the following forms:

1) Storage Services Module (SSM)
2) MDS-PBFI-1804 (18/4 port) Multi Services Module
3) 9222i MSM switch

Virtualization functionality is enabled through the Storage Services Module (SSM) line card that you can insert into any modular switch within the Cisco MDS 9000 family. Each SSM contains a virtualization daughter card that hosts 4 virtualization engines. Each card has 2 Data Path Processors (DPPs) that support storage virtualization, volume management, reliable replication writes (Cisco SANTAP), SCSI Flow services, Fibre Channel Write Acceleration, and Network-Accelerated Serverless Backup (NASB).

Storage Media Encryption (SME), or encryption of data on the tape libraries, is facilitated by Cisco’s encryption engines integrated on the Cisco MDS 9000 18/4-Port Multiservice Module (MSM), MDS-PBFI-1804, and Cisco MDS 9222i Multiservice Modular Switches. SAN-OS’s FC-Redirect creates VIs and VTs to enable data encryption without any fabric reconfiguration. These Multiservice Modules and switches also support Cisco SANTAP.

Applications

This section briefly describes the applications that use virtualization features in the intelligent switches.

RecoverPoint

RecoverPoint is a distance replication and disaster recovery solution that runs on out-of-band appliances and protects data both locally and remotely. The salient features of the solution are:

1) Heterogeneous storage replication for tiered disaster recovery
2) Innovative data journaling and application data consistency
3) Support of FC and IP replication

RecoverPoint uses splitter technology for replication, where a copy of the host writes is sent to the out-of-band intelligent appliances. The splitters can be host, fabric, or array based. This article focuses on fabric-based splitters. A Cisco fabric splitter uses SANTAP technology on its SSM and MSM modules, and a Brocade fabric splitter uses the AP-7600B or PB-48k-AP4-18 as the splitting engine.

SANTAP Implementation

SANTAP implemented on Cisco SSM / MSM modules ‘taps’ a copy of the I/O to be sent to the RecoverPoint appliance. SANTAP in the SSM / MSM module is responsible for creating the virtual entities in the fabric. SANTAP currently utilizes two VSANs: a front-end VSAN where the Host Initiators and the Virtual Targets reside, and a back-end VSAN where the physical targets, VIs and appliance HBAs are situated. In addition to the host VIs, SANTAP creates a group of virtual entities called Control Virtual Targets (CVT).

SANTAP Communication

The CVT is the portal through which the appliance (RPA) communicates with SANTAP. In an SSM module, when a CVT is created in the back-end VSAN, 10 virtual WWNs are created. Of these, 8 Virtual Initiators (VI) represent the 8 Data Path Processors (DPPs, ASICs on the module), the remaining VI represents the Control Path Processor (CPP) Initiator, and the lone Virtual Target (VT) represents the CPP Target.

Communications between the SANTAP service and the RPA fit into three classes:

1. Control messages from the RPA to the SANTAP service
2. Control messages from the SANTAP service to the RPA
3. Data traffic (reliable writes) mirrored from a host issuing a write to a storage array

The first two classes of communication are messages/notifications between the devices to control various aspects of the SANTAP service. Both the SANTAP service (CPP VI and CPP VT) and the RPA appear as both a standard SCSI initiator and target. SCSI Write operations are used between the SANTAP service and the RPA to convey control messages.

The appliance discovers the SANTAP service when it logs into the FC fabric and queries the name server (FCNS). Once the service is discovered, the RPA issues Port Login (PLOGI) and Process Login (PRLI) commands, followed by the standard SCSI device-discovery process. The SANTAP service (CPP target) responds to a SCSI Inquiry with Vendor Information set to "CISCO MDS" and Product Identification set to "CISCO SANTAP CVT."
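The make-up of the virtual WWNs behind a CVT on an SSM, and the way an appliance might recognize the service from its inquiry data, can be sketched in a few lines of Python. This is only an illustration: the function names and the tuple representation are hypothetical, and the inquiry strings are simply the values quoted above rather than fields parsed from a real SCSI response.

```python
def cvt_virtual_wwns(dpp_count=8):
    """List the virtual entities created with a CVT (role, owner)."""
    entities = [("VI", f"DPP-{i}") for i in range(1, dpp_count + 1)]
    entities.append(("VI", "CPP-initiator"))  # control path initiator
    entities.append(("VT", "CPP-target"))     # the lone virtual target
    return entities


def is_santap_cvt(vendor, product):
    """Hypothetical check of inquiry strings against the quoted values."""
    return vendor.strip() == "CISCO MDS" and product.strip() == "CISCO SANTAP CVT"
```

With the default of 8 DPPs the list has 10 entries, matching the 10 virtual WWNs described for the SSM.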

When mirroring a host write to the appliance, Cisco SANTAP first issues a pending write log (PWL) to the RPA. The PWL is a short SCSI command (several bytes) consisting only of the write operation’s metadata (LBA number). Once the RPA acknowledges the PWL, the Cisco SANTAP service simultaneously performs a write I/O to both the RPA and the target device (storage array). The RPA then acknowledges the write I/O. Finally, the pending write log entry is cleared with another short PWL command. Communications and operations on MSM modules and switches are similar.

Brocade Implementation

RecoverPoint on the AP-7600B and PB-48k-AP4-18 can be deployed in two modes:

- Multi VI mode
- Frame redirect mode

Multi VI mode

In this mode, the HIs are zoned with the Virtual Target (created on the switch when an HI is bound to a VI) and the VI is zoned with the PT. Because of this, you need to mask the VIs on the PT and reorganize the zones. When the host sends an I/O to the VT, the I/O is intercepted by the DPP on the switch. The VI then sends one copy to the Physical Target and the other to the appliance.

Frame Redirect

Frame redirect ensures that a copy of the I/O can be sent to the RecoverPoint appliance. The feature uses a combination of redirect zones and name server changes to map real device WWNs to the FCIDs of the virtual entities. This allows a flow between a host and a target to be redirected to the appliance without reconfiguring them. When you perform binding between an HI and a PT, a new redirect (RD) zone is created. The RD zones have a prefix of “lsan_” and contain the HI, PT, VI and VT.

The RD zone is part of the defined zone configuration and does not appear in the effective zone configuration. When you create the first RD zone (using the bind host initiators command on the RPA), two additional zone objects are created: a base zone "red_______base" and a zone configuration "r_e_d_i_r_c__fg". These additional zone objects are required by the Frame Redirect implementation and must remain on the switch as long as other RD zones are defined.
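The name server behaviour behind frame redirect can be illustrated with a small Python model. It is a sketch under stated assumptions, not Brocade's implementation: the `NameServer` class, its method names, and the integer FCIDs are all hypothetical, and the rule that the host is answered with the VT's FCID while the target is answered with the VI's FCID is taken from this article's description of redirect zones.

```python
class NameServer:
    """Toy name server that answers WWN -> FCID queries."""

    def __init__(self, fcids):
        self.fcids = dict(fcids)  # WWN -> FCID of every logged-in port
        self.redirects = {}       # (querier WWN, queried WWN) -> virtual WWN

    def add_redirect_zone(self, hi, pt, vi, vt):
        # An RD zone contains the HI, PT, VI and VT of one host/target flow.
        self.redirects[(hi, pt)] = vt  # host asking for the target gets the VT
        self.redirects[(pt, hi)] = vi  # target asking for the host gets the VI

    def resolve(self, querier, wwn):
        # Ports outside any RD zone receive the real FCID unchanged.
        answer = self.redirects.get((querier, wwn), wwn)
        return self.fcids[answer]
```

Because only the name server answers change, neither the host nor the array needs to be re-zoned or re-masked, which is the practical difference from Multi VI mode.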

Invista

EMC Invista is a network-based storage virtualization solution that utilizes intelligent Fibre Channel switches to implement centralized storage virtualization services that span heterogeneous storage systems. Using the virtualization modules in the FC switches, Invista provides services such as volume management, mirroring, clones, and non-disruptive data migration across heterogeneous storage systems through an easy-to-manage, centralized management user interface.

Virtual volumes created out of one or more storage systems are presented to the host on the virtual targets created by Invista. Similarly, Invista VIs perform I/O on the physical storage systems on behalf of the hosts. I/O remapping occurs in the data path for fast-path commands (read6/write6 and read10/write10) at hardware speeds, with minimal additional latency. Slow-path commands on the virtual volumes (such as inquiry) are serviced by the highly available and redundantly configured Invista appliances, which maintain the metadata of the virtualized storage on a highly available LUN in the SAN.

Brocade Implementation

Invista creates 16 VIs and 16 VTs on the virtualizing modules or switches. The VIs are zoned with PTs and the VTs are zoned with HIs. The VIs and VTs are equally distributed between the two DPPs on the DPC.

Cisco Implementation

Invista creates 9 VIs (one for each DPP and one Control VI) and 32 VTs. The VTs are zoned to HIs in front-end VSANs, and the VIs are zoned to storage targets. The SAL agent installed on the FC switches communicates with the Invista appliances to configure the intelligent services modules.

Storage Encryption

Storage encryption at the fabric layer is a relatively new application of block level storage virtualization. Key advantages of fabric level storage encryption include:

- The ability to encrypt data at wire speeds
- Central management of encryption resources
- Simplified, non-disruptive installation and configuration

These encryption solutions are ideal for cases such as:

- Highly sensitive data on disk or tape that needs protection (data-at-rest)
- Secure data backups for offsite tape storage and long-term archiving
- Centralized management of heterogeneous disk and tape storage environments
- Secure replication of encrypted data backups to remote facilities
- Scaling data center encryption services by implementing clusters of encryption blades or switches

Cisco Storage Media Encryption

The Cisco MDS Storage Media Encryption (SME) service enables encryption of data stored on tape. This protects the backed up data on the tapes from unauthorized access or tape loss. SME creates VIs and VTs. An I/O sent to the VT is intercepted, encrypted, and written to the tape by the MSM module through the VI. SME is a transparent fabric service, and the MSM module can be deployed anywhere in the fabric. It does not need to be directly in the data path; hence no cabling or configuration changes are required. Once SME is enabled, traffic that is to be encrypted is redirected to the appropriate MSM in the fabric using the FC-Redirect service.

FC-Redirect

VIs and VTs are created and placed in the default zone when SME is enabled. When an HI-PT nexus is configured on the SME, a LOGO (Logout) is sent to the host to abort any existing sessions and exchanges to the physical target that may be in transit. The host then performs another PLOGI, but the MSM module intercepts it and redirects it to the VT. The VI corresponding to the VT then performs a PLOGI on behalf of the host, and continues through the PRLI and discovery sequence. Once this is complete, the VT acknowledges the host’s PLOGI request and accepts the host’s PRLI request. From then on, the VT intercepts the host I/O sent to the PT, the encryption module encrypts it, and the encrypted data is forwarded to the VI, which sends it to the PT. This is transparent to the HI and PT.
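The transparency of this data path can be illustrated with a toy Python model. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the read path is assumed to mirror the write path, and the single-byte XOR is a deliberately trivial placeholder that is not encryption and has nothing to do with the cipher the real service uses.

```python
class EncryptingPath:
    """Toy model of a VT/VI pair sitting between a host and a target."""

    def __init__(self, key):
        self.key = key             # placeholder key, an integer 1..255
        self.physical_target = {}  # LBA -> bytes as stored on the media

    def _transform(self, data):
        # Placeholder only: XOR is NOT real encryption. It merely stands
        # in for the module's encrypt/decrypt step and is its own inverse.
        return bytes(b ^ self.key for b in data)

    def host_write(self, lba, data):
        # The VT intercepts the host write and the payload is transformed.
        transformed = self._transform(data)
        # The VI then writes the transformed payload to the PT.
        self.physical_target[lba] = transformed

    def host_read(self, lba):
        # Assumed reverse path: the VI reads, the module restores the data.
        return self._transform(self.physical_target[lba])
```

For example, after `path = EncryptingPath(0x5A)` and `path.host_write(0, b"payroll")`, the host reads back `b"payroll"` while the stored bytes on the target differ from the plaintext.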

Brocade Disk and Tape Encryption

Similar to the frame redirect option in a RecoverPoint deployment, the Brocade encryption engine uses RD zones for encryption. The HI gets the FCID of the VT when it queries for the FCID of the PT, and the PT gets the FCID of the VI when it queries for the FCID of the HI. The I/O intercepted by the VT is encrypted by the encryption engine and is written to the PT by the VI.

The reverse happens when data is read from the PT. In addition, there is another entity named the CryptoTarget Container that binds all these virtual entities. A CryptoTarget Container holds configuration information for a single target, including:

- Target port, initiators, and LUN settings
- Interfaces between the encryption engine and the targets and initiators that access storage devices

Design Considerations

We will discuss a few design considerations when deploying the above mentioned applications using virtualization in the fabrics. These considerations stem from the experience of deploying these applications and may be considered additional to the product documentation provided by vendors.

RecoverPoint

Consider four major components when designing a replication solution using RecoverPoint.

- RecoverPoint Appliances (RPA): RecoverPoint appliances are Linux based boxes and are instrumental for replication activities. They accept “split” data and, based on policy settings, apply bandwidth reduction techniques, ensure write order fidelity, guarantee data consistency, and route the data to the appropriate destination volume, either via IP or Fibre Channel. The RPA also acts as the sole management interface to the RecoverPoint installation.

- RecoverPoint Journal Volumes: Journal volumes are dedicated LUNs on both Production and Target sides used to stage small aperture, incremental snapshots of host data. As the personality of production and target can change during failover and failback scenarios, Journal volumes are required on all sides of replication (production, CDP and CRR).

- Intelligent Fabric Splitter: The RecoverPoint splitter driver is a use-specific, small footprint software that enables continuous data protection (CDP) and continuous remote replication (CRR). The splitter driver can be loaded on a host, on an Intelligent Blade within a SAN director, or on a CLARiiON array. The intelligent fabric splitter is the intelligent-switch hardware that contains specialized port-level processors (ASICs) to perform virtualization operations on I/O at line speed. As mentioned in the previous sections, this functionality is available from two vendors: Brocade and Cisco. Brocade’s intelligent switch, the AP-7600, can be linked through ISLs to a new or existing SAN. Cisco’s intelligent blades are the Storage Services Module (SSM) and the MultiServices Module (MSM), which can be installed in an MDS 9513, 9509, 9506, 9216i, 9216A, or 9222i.

- Remote Replication: Two RecoverPoint Appliance (RPA) clusters can be connected via TCP/IP or FC to perform replication to a remote location. RPA clusters connected via TCP/IP for remote communication will transfer “split” data via IP to the remote cluster. The target cluster’s distance from the source is only limited by the physical limitations of TCP/IP. RPA clusters can also be connected remotely via Fibre Channel. They can reside on the same fabric or on different fabrics, as long as the two clusters can be zoned together. The target cluster’s distance from the source is again only limited by FC’s physical limitations. RPA clusters can support distance extension hardware (such as DWDM) to extend the distance between clusters.

SAN design

Deciding where to place the SSM modules or the AP-7600 switch / PB-48k-AP4-18 module in the SAN is one of the most common design considerations in RecoverPoint. You also have to decide on the location of the RecoverPoint appliances on the SAN.

Here are guidelines for placing the intelligent switch modules and SSM modules:

1) As a best practice, the intelligent modules/switches should be placed nearest to the storage ports. In Core-Edge fabrics, the intelligent modules/switches should be connected on the core switches. Similarly, in Host-edge-Core-Storage-edge fabrics, the most logical place would be on the storage edge fabrics. However, if the modules will be used by multiple storage ports on different storage edge switches, placing the intelligent modules on the core switches is ideal.

2) As a best practice, the RecoverPoint appliances should be placed as close to the intelligent modules as possible. In AP-7600B deployments, it is preferable to place the appliance ports on the switch itself. Similarly, with MDS MSM modules and switches, the appliance ports should be placed on the module/switch. However, on MDS switches with an SSM module, the appliance should be connected to a regular line card on the switch, on non-shared FC ports. Further, the appliance ports should not be connected to the ports on SSM modules.

3) Inserting an SSM module in an MDS 9513 director reduces the director port count to 255. For this reason, placing SSM modules in a 9513 director is not recommended where scalability of the ports in MDS directors is a concern.

Complex SAN topologies

A complex topology with numerous switches in the fabrics will most likely be in one of two designs, each discussed in the next sections:

- Core/edge (hosts on an edge switch and storage on a core switch)
- Edge/core/edge (hosts and storage on an edge switch but on different tiers)
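The first placement guideline above can be condensed into a small decision helper. The Python sketch below is only a restatement of that guideline under assumed names: the function, its parameters, and the topology labels are hypothetical and do not come from any vendor tool.

```python
def recommend_module_placement(topology, storage_edge_switches_served=1):
    """Suggest the tier for an intelligent module, per guideline 1.

    topology: "core-edge" or "edge-core-edge" (labels assumed here).
    storage_edge_switches_served: number of distinct storage edge
    switches whose storage ports will use the module.
    """
    if topology == "core-edge":
        # Storage attaches to the core, so the module sits there too.
        return "core"
    if topology == "edge-core-edge":
        # Stay next to the storage unless several storage edges share it.
        if storage_edge_switches_served > 1:
            return "core"
        return "storage-edge"
    raise ValueError(f"unknown topology: {topology!r}")
```

For instance, `recommend_module_placement("edge-core-edge", 3)` returns `"core"`, reflecting the exception for modules shared by storage ports on different storage edge switches.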

Core/Edge configurations

In this model, you connect hosts to the edge tier switches and storage to the core tier switch(es). The core tier is the centralized location and connection point for all physical storage in this model. All I/O between the host and storage must flow over the ISLs between the edge and core. It is a one-hop logical infrastructure. (An ISL hop is the link between two switches.)

MDS Configurations

The SSM/MSM blade is located in the core. To minimize latency and increase fabric efficiency, co-locate the SSM/MSM blade with the storage that it is virtualizing, just as you would do in a Layer 2 SAN.

In these deployments:

- ISLs between the switches do not have to be connected to the SSM/MSM blade
- Hosts do not have to be connected to the SSM/MSM blade
- Storage should not be connected to the SSM blade

Internal routing between blades in the chassis is not considered an ISL hop. Since the SSM/MSM is located inside the MDS switch, there is additional latency with the virtualization ASICs. However, there is no protocol overhead associated with routing between blades in a chassis.

If you are using a switch with an embedded MSM blade (MDS 9222i switch) for virtualization, it should be connected through ISLs to the core tier switch. The number of ISLs should be sized to carry the expected virtualization traffic.

Brocade Configurations

In these configurations, you can locate an AP4-18 blade on the core tier director, or you can use an external AP-7600B switch linked through ISLs to the core switch. The considerations for locating the blade are similar to those for the MSM blade in the MDS configuration. The AP-7600B is an external intelligent switch; it must be linked through ISLs to the core switch. The RecoverPoint Appliances can be placed anywhere within the fabric and need not be connected directly to the intelligent switch, although direct connection is the most commonly employed approach.
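The advice to size the ISLs between an external intelligent switch and the core for the virtualization traffic comes down to simple arithmetic. The Python sketch below is one way to express it; the function name, the megabytes-per-second units, and the default utilization ceiling of 70 percent are assumptions for illustration, not figures from the product documentation.

```python
import math


def isls_required(virtualized_mb_s, isl_mb_s, max_utilization=0.7):
    """Estimate how many ISLs are needed to carry virtualized traffic.

    virtualized_mb_s: peak virtualized throughput in MB/s.
    isl_mb_s: usable throughput of one ISL in MB/s.
    max_utilization: assumed ceiling on per-ISL load (0 < value <= 1).
    """
    if isl_mb_s <= 0 or not 0 < max_utilization <= 1:
        raise ValueError("ISL speed and utilization must be positive")
    usable = isl_mb_s * max_utilization
    # Always keep at least one link, even for negligible traffic.
    return max(1, math.ceil(virtualized_mb_s / usable))
```

As a worked case, 1000 MB/s of virtualized traffic over ISLs of nominally 400 MB/s each, loaded to at most half their capacity, gives ceil(1000 / 200) = 5 links. A real design would also add links for redundancy, which this sketch does not model.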

When using AP-7600B switches, hosts are connected to the edge tier and storage isconnected to the core tier. The core tier is the centralized location and connection pointfor all physical storage in this model. All IO between the host and storage must flow overthe ISLs between the edge and core. It is a one-hop infrastructure for non-virtualizedstorage. However, all virtualized storage traffic must pass through at least one
