Pure Storage Architectural Comparison-RGB-15

Transcription

SolidFire and Pure Storage Architectural Comparison
SEPTEMBER 2015

This document includes general information about the Pure Storage architecture as it compares to SolidFire. Not intended to be exhaustive, it covers architectural elements where the solutions differ and impact overall suitability for the needs of the Next Generation Data Center (NGDC).

Be fueled by SolidFire. Do what’s never been done.

Overview

The FlashArray series from Pure Storage is an all-flash, dual-controller storage solution¹ based on a traditional active/passive scale-up architecture, where capacity is scaled by adding drive shelves to a pair of controllers and performance is scaled by replacing the controllers with more powerful controller models. Architectures based on storage controller pairs, also referred to as storage node pairs, have been the standard for enterprise block storage for many years. Enterprise familiarity, standard design concepts, and fast time to market are some of the key reasons Pure Storage based its FlashArray on this familiar architecture. Unfortunately, dual-controller, scale-up architectures are not without their caveats, and with all-flash storage many of those caveats become limitations.

The intention of Pure’s FlashArray was to replace traditional block storage arrays with a similar architecture, the difference being that it has been designed and optimized specifically for flash with a focus on low latency. This approach not only enables Pure to directly replace legacy spinning disk with all-flash storage but also addresses application performance complaints common to block storage. While Pure has been successful at implementing key functionality such as inline efficiencies, its focus on latency and on fitting into the existing block storage paradigm has kept it from addressing two issues: scaling a traditional architecture, and controlling how flash performance is delegated to application workloads. The resulting solution from Pure is well suited for high-performance, single point solutions, but it is not a good fit for mixed workloads or for solutions where very large performance and capacity scale is desired.

1. Effective capacity is defined as raw capacity – storage overhead * claimed efficiency

September 2015 SolidFire.com
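The effective-capacity definition in footnote 1 can be worked through with a small sketch. The footnote does not say how the expression is grouped; the version below assumes the reading (raw capacity - storage overhead) * claimed efficiency. The function name and the sample figures are hypothetical and are not taken from either vendor.

```python
def effective_capacity_tb(raw_tb: float, overhead_tb: float, efficiency_ratio: float) -> float:
    """Effective capacity, assuming the footnote means (raw - overhead) * efficiency."""
    if raw_tb < 0 or overhead_tb < 0 or efficiency_ratio <= 0:
        raise ValueError("capacities must be non-negative and the ratio positive")
    if overhead_tb > raw_tb:
        raise ValueError("overhead cannot exceed raw capacity")
    return (raw_tb - overhead_tb) * efficiency_ratio


# Hypothetical inputs: 100 TB raw, 20 TB of overhead, a claimed 5:1 efficiency ratio.
example_tb = effective_capacity_tb(raw_tb=100.0, overhead_tb=20.0, efficiency_ratio=5.0)
print(example_tb)  # 400.0
```

Because the claimed efficiency ratio multiplies the usable capacity, any effective-capacity figure is only as reliable as that ratio is for the workload in question.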

Findings:

As a high-performance alternative to traditional block storage point solutions, Pure Storage’s solutions offer high performance and an implementation familiar to the industry. The active/passive controller-centric architecture provides a familiar look to a block storage solution, but carries with it many of the traditional drawbacks found in a dual-controller architecture, limiting its ability to enable a next generation data center.

Specifically:

Agile - SolidFire enables enterprises to cost-effectively support specific solutions and adapt on the fly to multiple workload environments without affecting the performance of existing applications. Likewise, SolidFire’s shared-nothing architecture allows for the addition or removal of any model of cluster node, 1U at a time, on the fly while maintaining application-specific Quality of Service (QoS) with maximum, minimum, and burst IOPS settings independent of capacity.

Pure solutions offer scale-up of capacity, but only up to 400TB of effective¹ capacity, depending on the model of controllers in the individual implementation. The //m20, //m50, and //m70 models each scale up through the addition of external drive shelves to offer claimed effective capacities of 5-120TB, 30-250TB, and 44-400TB, respectively. After implementation, any need to scale beyond (above or below) one of the capacity ranges has historically required a disruptive “forklift” controller swap-out. With the new //m series, Pure claims that is no longer the case.

Scalability - Pure’s solutions follow a traditional controller-centric design, which limits the current usable capacity of deployments to 400TB and provides tiered rather than linear scaling of performance. Pure’s architecture is location-addressed rather than content-addressed: it maps content to a specific location on disk, which means that expansion and movement of data become significantly more overhead intensive as the array fills, negatively affecting performance and manageability at scale.
Guaranteed - A key requirement of the next generation data center is to have an environment based on repeatable, predictable performance. Designed to get to the market quickly with an all-flash replacement for traditional block storage, Pure solutions offer good speed with low latency, but do not have the ability to specify QoS for individual volumes, meaning applications and users can experience inconsistent performance in multiple parallel workload environments.

SolidFire enables enterprises to specify and guarantee minimum, maximum, and burst IOPS for individual storage volumes on the fly, independently of capacity, eliminating the “noisy neighbor” problem in mixed workload environments.

Automated - Both SolidFire and Pure Storage have APIs for automating storage management, but only SolidFire offers the ability to automate every storage function of the array from the API.

With SolidFire, data availability is also highly automated. In the unlikely event of a SolidFire node or multiple node failures, SolidFire automatically rebalances the lost node’s data across all remaining nodes, restoring complete redundancy while maintaining all guaranteed QoS settings. As an added bonus, performance of SolidFire’s rebalance improves as more nodes are added to the cluster!

Pure’s dual controllers, on the other hand, are deployed in an active/passive configuration. Whenever a controller experiences a failure, the deployment fails over to the backup controller. The now active controller becomes a single point of failure until the failing controller is manually replaced.
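The per-volume minimum, maximum, and burst settings described under "Guaranteed" above can be pictured with a toy allocator. This is a minimal sketch under stated assumptions, not SolidFire's actual algorithm: the class `VolumeQoS`, the function `allocate_iops`, the list-order sharing policy, and all numbers are hypothetical, and burst behavior is validated but not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeQoS:
    name: str
    min_iops: int    # floor the volume should always be able to reach
    max_iops: int    # sustained ceiling
    burst_iops: int  # short-term ceiling; validated here but not modeled below

    def __post_init__(self) -> None:
        if not 0 <= self.min_iops <= self.max_iops <= self.burst_iops:
            raise ValueError(f"{self.name}: expected 0 <= min <= max <= burst")


def allocate_iops(volumes, demand, cluster_iops):
    """Grant every volume its floor first, then share spare IOPS in list order."""
    grants = {v.name: min(demand[v.name], v.min_iops) for v in volumes}
    spare = cluster_iops - sum(grants.values())
    if spare < 0:
        raise ValueError("minimums exceed what the cluster can deliver")
    for v in volumes:
        # Never grant more than the volume asks for or more than its maximum.
        extra = min(min(demand[v.name], v.max_iops) - grants[v.name], spare)
        grants[v.name] += extra
        spare -= extra
    return grants


# Hypothetical mixed workload: a noisy batch job listed ahead of a database volume.
volumes = [VolumeQoS("batch", 500, 2000, 3000), VolumeQoS("db", 1000, 5000, 8000)]
demand = {"batch": 50000, "db": 3000}
grants = allocate_iops(volumes, demand, cluster_iops=4000)
print(grants)  # {'batch': 2000, 'db': 2000}
```

Even though the batch volume asks for far more than the cluster can supply, it is capped at its maximum and the database volume still receives more than its configured minimum, which is the noisy-neighbor protection the text describes.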

SolidFire vs Pure Storage

Data Addressing/Management

Both SolidFire and Pure use a log-structured approach in writing to disk, which optimizes utilization and performance of SSDs, significantly improves the lifespan of SSDs, and most importantly enables the use of less expensive consumer grade MLC SSDs.

The log-structured approach reads in current valid data together with new data and writes it to disk in a linear fashion, in essence aggregating many small writes into one large write. Compared to a fixed block approach, the log-structured approach significantly simplifies supporting compression and variable block sizes.

One area where the Pure and SolidFire architectures differ is that Pure relies on layering of metadata for functionality like deduplication, snapshots, and clones. The architecture begins with a location addressing schema, as opposed to SolidFire’s content addressing technique, using a logical unit number (LUN) and logical block address (LBA) to tag a specific piece of data to a physical location in the array. Layered on top of this base key value store, additional tables of block checksum values, link values, and shared block values are maintained, both to compare incoming data against and to map multiple reference pointers for deduplication, snapshots, and clones.

Differences between Pure’s Location Addressing Architecture and SolidFire’s Content Addressing Architecture

In the Pure architecture all of the metadata is stored on SSDs and partially cached (as opposed to 100% cached at runtime for SolidFire). The net result of this tradeoff is an average of 50% (1.5 to 1) cache miss ratio. This can lead to somewhat inconsistent read performance, particularly as the array begins to fill up.

Any time data is moved within the array (as is frequently done in storage systems) the primary data structure must be updated, compared to SolidFire’s content addressing system, where the content ID does not change and thus there is no need for I/O intensive updates.
Because of the need for tight coupling between multiple layers of metadata, Pure’s location addressing architecture works well in a tightly coupled dual-controller architecture, but would not be well-suited for global deduplication.

Quality of Service

Clearly Pure’s all-flash systems are fast arrays with consistently low latency. However, Pure provides no Quality of Service (QoS) or performance provisioning to ensure that all applications in mixed workload deployments consistently get the IOPS they need and are protected from unpredictable application I/O usage such as noisy neighbor scenarios. Without QoS, large-scale infrastructure customers and those looking to consolidate multiple applications onto a single platform may find they have to buy more Pure Storage arrays or larger controllers to provide insurance against application performance variability.

Figure 1: SolidFire QoS
SolidFire architecture allows users to set minimum, maximum, and burst IOPS on a per volume basis.

To deliver predictable and guaranteed storage performance, SolidFire leverages QoS Performance Virtualization of resources. Patented by SolidFire, this technology permits the management of storage performance independently from storage capacity. The SolidFire architecture makes it possible to set minimum, maximum, and burst IOPS on a per volume basis. Because performance and capacity are managed independently, SolidFire clusters are able to deliver predictable storage performance to thousands of applications within a shared infrastructure.
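To make the location vs. content addressing comparison from the Data Addressing/Management section concrete, the following toy model counts how often the primary (LUN, LBA) map has to be rewritten when data moves. It is a simplified sketch, not either vendor's implementation: both classes, their method names, the slot numbers, and the use of SHA-256 as the content ID are assumptions made for illustration.

```python
import hashlib


class LocationAddressedStore:
    """Primary map ties (lun, lba) directly to a physical slot."""

    def __init__(self) -> None:
        self.primary_map = {}          # (lun, lba) -> physical slot
        self.media = {}                # physical slot -> data
        self.primary_map_updates = 0

    def write(self, lun: int, lba: int, slot: int, data: bytes) -> None:
        self.media[slot] = data
        self.primary_map[(lun, lba)] = slot
        self.primary_map_updates += 1

    def move(self, lun: int, lba: int, new_slot: int) -> None:
        old_slot = self.primary_map[(lun, lba)]
        self.media[new_slot] = self.media.pop(old_slot)
        self.primary_map[(lun, lba)] = new_slot   # the move rewrites the primary map
        self.primary_map_updates += 1


class ContentAddressedStore:
    """Primary map ties (lun, lba) to a content ID that never changes."""

    def __init__(self) -> None:
        self.primary_map = {}          # (lun, lba) -> content ID
        self.blocks = {}               # content ID -> data, stored once
        self.placement = {}            # content ID -> physical slot
        self.primary_map_updates = 0

    def write(self, lun: int, lba: int, slot: int, data: bytes) -> None:
        content_id = hashlib.sha256(data).hexdigest()
        if content_id not in self.blocks:         # identical data is stored only once
            self.blocks[content_id] = data
            self.placement[content_id] = slot
        self.primary_map[(lun, lba)] = content_id
        self.primary_map_updates += 1

    def move(self, content_id: str, new_slot: int) -> None:
        self.placement[content_id] = new_slot     # the primary map is untouched


loc = LocationAddressedStore()
loc.write(lun=0, lba=10, slot=1, data=b"A")
loc.move(lun=0, lba=10, new_slot=7)

cas = ContentAddressedStore()
cas.write(lun=0, lba=10, slot=1, data=b"A")
cas.move(cas.primary_map[(0, 10)], new_slot=7)
updates_after_move = (loc.primary_map_updates, cas.primary_map_updates)
print(updates_after_move)  # (2, 1)

cas.write(lun=1, lba=99, slot=2, data=b"A")   # duplicate content from another LUN
print(len(cas.blocks))  # 1
```

In this model the content-addressed store also deduplicates as a side effect of its addressing, whereas the location-addressed store would need the additional checksum and reference tables the text mentions.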

Scaling

SolidFire’s clustered architecture enables the linear scale-out of capacity and performance as nodes are added to the system, meaning each additional node provides a predictable amount of performance and capacity, scaling up to 3.4PB of effective capacity and 7.5M guaranteed IOPS.

Pure’s use of a scale-up architecture means a Pure Storage array’s performance and ability to scale capacity are based completely on the capabilities of the controllers in use. Increasing performance or capacity beyond the controller limit requires new, more powerful controllers or the deployment of a new Pure Storage array.

Positive aspects of Pure’s scaling model:
- Capacity can be scaled independently of performance
- Performance is “consistent” (excluding cache miss variations) in the event of a controller or disk failure
- Top end performance with a relatively small amount of capacity

Potential negative aspects of Pure’s scaling model:
- Performance upgrade is only possible by upgrading controllers
- Unplanned controller upgrades can be required if performance or capacity limits are reached
- Large environments may require deployment of multiple siloed arrays

Figure 2: SolidFire mixed-node scale-out
At any point during or after deployment, nodes can be added, removed, or replaced to increase capacity and/or performance without impacting existing workloads. As nodes are added, their capacity and IOPS are aggregated into the total provisionable capacity and performance available for assignment to any existing or new volume.

Data Assurance

Pure

Controllers are deployed in an active/passive pair. Under normal operating conditions (i.e. without a controller issue/failure), only one of the controllers is active at any time and the second becomes active only in the event of a failure of the primary.

Pure’s solutions use a traditional shared controller model. Historically with this type of architecture, the redundancy lives with the disk shelf in the form of dual-ported drives and redundant backplanes. Due to cost considerations, dual drive ports are not available for solid state drives or included in Pure’s design.

In the event of a Pure disk shelf failure, RAID groups not striped across multiple shelves will be lost. Use of RAID for data protection means at best that Pure will experience longer drive rebuild times than SolidFire as the array fills up. At worst, a cascading drive failure could result.

As mentioned earlier, the Pure Storage dual controllers are deployed in an active/passive configuration. Whenever a controller experiences a failure, the deployment fails over to the backup controller. The now active controller becomes a single point of failure until the failed controller is replaced.

Figure 3: Pure Scale-Up vs. Scale-Out
In Pure’s architecture, performance increases are achieved by means of a controller upgrade. Pure Storage claims its current largest array scales up to about 400TB of usable capacity with 300,000 32K IOPS in 11U of rack space.
Figure labels (The Pure Storage FlashArray //m series): //m20, //m50, //m70; Capacity: 5-120TB, 44-400TB; Performance: 125,000, 150,000, and 300,000 32K IOPS.
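The contrast between the two scaling models can be sketched in a few lines. The capacity ranges are the claimed effective capacities quoted for the //m20, //m50, and //m70 earlier in this document; everything else (the function names, the `Node` type, and the per-node figures) is hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass

# Claimed effective-capacity ranges in TB, as quoted in this document.
PURE_M_SERIES_TB = {"//m20": (5, 120), "//m50": (30, 250), "//m70": (44, 400)}


def models_covering(target_tb: float) -> list:
    """Scale-up: list the controller models whose claimed range includes the target."""
    return [m for m, (low, high) in PURE_M_SERIES_TB.items() if low <= target_tb <= high]


@dataclass(frozen=True)
class Node:
    effective_tb: float   # hypothetical per-node capacity
    iops: int             # hypothetical per-node performance


def cluster_totals(nodes) -> tuple:
    """Scale-out: capacity and IOPS are the sum of what each node contributes."""
    return sum(n.effective_tb for n in nodes), sum(n.iops for n in nodes)


print(models_covering(100))   # ['//m20', '//m50', '//m70']
print(models_covering(300))   # ['//m70']
print(models_covering(500))   # []

cluster = [Node(20.0, 50_000)] * 4
print(cluster_totals(cluster))                          # (80.0, 200000)
print(cluster_totals(cluster + [Node(20.0, 50_000)]))   # (100.0, 250000)
```

An empty result from `models_covering` corresponds to the case the text describes, where a requirement beyond the largest controller range means deploying an additional array, while in the scale-out model the totals simply grow with each added node.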

SolidFire

With SolidFire’s RAID-less approach there is no sharing of any hardware component in the system. Connectivity between the nodes is redundant, and the design is such that anything in the cluster can fail (any piece of hardware, any software process, any network component) and the system will continue running.

In the event of a node or multiple node failures, SolidFire automatically rebuilds redundant data across remaining nodes in minutes, restoring complete redundancy while maintaining all guaranteed QoS settings.

Efficiency & data integrity - Helix vs. RAID

RAID is commonly promoted as an advantage by other shared disk flash architectures because it is very easy to implement dual-parity protection with acceptable capacity overhead. It is important to remember that in the SolidFire design, the shared-nothing architecture means each node is a fully functional unit, removing the overhead (cost, management, and footprint) of shared controllers. SolidFire Helix takes advantage of this distributed architecture by providing an exact copy of all data in the cluster, so if a drive fails the rebuild process simply reconstructs the data that was on that drive based on the copy. Since the rebuild is simply copying the data from other drives in the cluster, rebuild times are extremely fast, there is far less wear and tear on drives, and there is no RAID overhead to impede the large scale common to scale-out clusters.

Another benefit of Helix and the SolidFire architecture is that in addition to drive failures, Helix can automatically recover from an entire node failure, enabling the non-disruptive addition or removal of nodes to/from an active cluster. When a node fails or is added to or removed from the cluster, the process of rebalancing is exactly the same as that of a drive.
The advantage of SolidFire Helix is that the entire cluster takes part in the rebalancing and everything happens automatically, resulting in a very fast rebuild with minimal performance impact and, most importantly, the elimination of any single point of failure.

Comparing SolidFire’s self-healing approach to Pure’s traditional RAID striping, it quickly becomes apparent that SolidFire offers flexibility and protection not found with Pure Storage. In the event of a controller (node) failure in a Pure array, the array is in a single point of failure scenario until the failed controller is physically replaced. Pure is also unable to add or remove nodes to an array non-disruptively, limiting Pure to dual-controller silos within the data center.

Figure 5: SolidFire Helix Automated Mesh Redundancy
When a drive fails, the rebuild process reconstructs the data that was on that drive and restores the redundancy in the system.

The Bottom Line

Pure Storage offers all-flash solutions built upon the traditional controller-centric, RAID-based architecture many organizations are accustomed to. Pure has been successful at providing a solution that brings the performance benefits of flash to these traditional block storage customers. Like traditional scale-up solutions, Pure’s architecture is well suited for point solution environments but is less than optimal in the areas of scale, automation, QoS, and agility for next generation data center applications, including large-scale multiple/mixed workload and IT as a Service (ITaaS) deployments.

SolidFire’s shared-nothing, scale-out architecture makes it ideally suited for large-scale, mixed-workload enterprise and service provider deployments.
The ability to mix multiple models of nodes within clusters and to scale out performance and capacity linearly at any time (not just at initial deployment), combined with the ability to guarantee IOPS per volume, means deployments can start and grow as needed without disruption to running applications or worry of stranding either performance or capacity.

SolidFire’s architecture means organizations can consolidate multiple applications and workloads onto an agile, scalable, predictable, and automated infrastructure. The flexibility of the SolidFire architecture ultimately saves customers time and money, resulting in a much lower infrastructure TCO and, consequently, a healthier bottom line.
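As a closing illustration of the copy-based rebuild described in the Helix section, the sketch below keeps two copies of every block on different nodes and, after a node failure, re-copies the affected blocks from their surviving replica onto the least-loaded remaining nodes. It is a toy model under stated assumptions, not the Helix algorithm: the function names, the round-robin placement, the least-loaded target choice, and the node and block names are all hypothetical.

```python
def place_blocks(block_ids, nodes):
    """Give each block two copies on distinct nodes, assigned round-robin."""
    if len(nodes) < 2:
        raise ValueError("two copies need at least two nodes")
    return {
        block: {nodes[i % len(nodes)], nodes[(i + 1) % len(nodes)]}
        for i, block in enumerate(block_ids)
    }


def rebuild_after_failure(placement, failed, nodes):
    """Restore two copies of every block using only the surviving nodes."""
    survivors = [n for n in nodes if n != failed]
    if len(survivors) < 2:
        raise ValueError("not enough surviving nodes to restore redundancy")
    # Count how many block copies each surviving node already holds.
    load = {n: 0 for n in survivors}
    for holders in placement.values():
        for n in holders:
            if n != failed:
                load[n] += 1
    healed = {}
    for block, holders in placement.items():
        remaining = holders - {failed}
        if len(remaining) < 2:
            # Copy from the surviving replica to the least-loaded node lacking it.
            target = min((n for n in survivors if n not in remaining),
                         key=lambda n: (load[n], n))
            remaining = remaining | {target}
            load[target] += 1
        healed[block] = remaining
    return healed


nodes = ["n0", "n1", "n2", "n3"]
placement = place_blocks([f"b{i}" for i in range(8)], nodes)
healed = rebuild_after_failure(placement, failed="n0", nodes=nodes)
fully_redundant = all(len(h) == 2 and "n0" not in h for h in healed.values())
print(fully_redundant)  # True
```

Because the replacement copies are spread over several survivors rather than written to a single spare, the rebuild work is shared, which is the intuition behind the claim that rebuilds get faster as the cluster grows.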
