Software-defined Storage: What Can It Do For You


SOFTWARE-DEFINED STORAGE: WHAT CAN IT DO FOR YOU?
Mikhail Gloukhovtsev
Senior Cloud Solutions Architect
Orange Business Services

Table of Contents
1. Introduction
2. Definition of Software-defined Storage
3. Why Has the SDS Concept Emerged?
4. The Main Capabilities of SDS and What Value They Provide to Customers
5. Does Every Company Need SDS?
6. Co-existence of Traditional Storage Platforms and SDS
7. SDS Relies on Hardware Innovations
8. Hardware Support Is a Must-Have Even for SDS
9. SDS Vendors, Products, and Solutions
9.1 What Features to Consider While Selecting an SDS Vendor and Product
9.2 A Variety of SDS Platforms
9.3 EMC ViPR
9.4 EMC ECS Appliance
9.5 EMC ScaleIO
9.5 EVP: Federation Software-defined Data Center
9.6 VMware VSAN
9.7 VMware EVO:RAIL
10. Conclusion
11. References

Disclaimer: The views, processes, or methodologies published in this article are those of the author. They do not necessarily reflect the views, processes or methodologies of EMC Corporation or Orange Business Services (my employer).

2015 EMC Proven Professional Knowledge Sharing

1. Introduction
When I first heard about software-defined storage (SDS) at a technical conference two years ago, I was confused: software has always defined a properly designed infrastructure, has it not? For example, redundant array of independent disks (RAID) sets, well known for more than 30 years, can be seen as software-defined storage. Is “software-defined storage” part of what is covered by a marketing phrase, “Software-defined Everything” [1], that has been called the "next big thing"? How can “software-defined storage” be defined as an IT term to prevent misusing it in a chain of “software-defined X” constructs like “software-defined radio” [2]?

Furthermore, for many years some storage vendors claimed the advantage of their “hardware-based performance” products and it appeared to make sense. While I understood that new software releases come much more frequently than new application-specific integrated circuit (ASIC) types, should it be seen as the main benefit of software-defined storage?

Christos Karamanolis (VMware, Office of the CTO) wrote that 2012 was the year of the “software-defined data center.” [3] It was the year when the term “software-defined data center (SDDC)” was coined by VMware’s former chief technology officer (CTO), Dr. Steve Herrod. The following years have shown that we are witnessing the emergence of a new concept of the data center defined by various terms: software-defined data center (SDDC, VMware) [3,4], application-centric infrastructure (ACI, Cisco) [5], software-defined environment (SDE, IBM) [6], software-defined infrastructure (SDI, Intel) [7], and federated software-defined data center (EVP: EMC, VMware, Pivotal) [8]. SDS can follow the development and acceptance of “software-defined networking” (SDN) that has gained popularity as a component of the SDDC. We see the SDS buzzword in many online and print trade magazines and hear it at almost every technical briefing on the next-generation data center.

Is SDS really the next big thing in storage technology?
Or is it just hype generated by the marketing machine, as Rich Castagna, the editor of Storage Magazine [9], and other critics of SDS point out? Valdis Filks, research director for Storage Technologies and Strategies at Gartner Research, ironically discussed whether SDS is, in fact, re-labeled storage resource management, a kind of SRM 2.0: a creature of “surreally defined marketing.” [10]

If SDDC as an umbrella term for all the derivatives mentioned above (ACI, SDE, SDI, and EVP) is a semantic construct, SDS as an SDDC component is considered by SDS critics as “a synonym for private storage clouds, which is a synonym for Storage as a Service, which is a synonym for managed storage.” [10] Critics of the view of SDS as a complete replacement of hardware-centric storage point out that SDS relies on continuing progress in storage hardware development [9]. Indeed, innovations in hardware technologies such as new powerful processors from Intel and flash storage are enablers for SDS.

This article is my attempt to find answers to the questions of what SDS is and how we can separate marketing myths from reality. I review the benefits of SDS, challenges in developing this technology, how SDS is related to the broader concept of SDDC, and SDS use cases. How will various non-IT companies include SDS in the storage services roadmaps they develop to meet business requirements? As large investments have been made in existing storage environments, it is important to understand whether traditional storage platforms can coexist with SDS. Or will legacy storage be converted into SDS?

To answer these questions, we first need to understand how SDS is defined.

2. Definition of Software-defined Storage
Storage vendors offer various definitions of software-defined storage (SDS), but there is no generally accepted definition at this time. The common element in all the definitions is "hardware independence": hardware-agnostic storage solutions allow users to deploy storage on hardware they choose, including commodity hardware, and thereby avoid vendor lock-in for their future storage purchases. In the “Software-defined Storage” Working Draft [11], the Storage Networking Industry Association (SNIA) proposes an SDS definition via attributes and functionality rather than giving a brief definition. In my opinion, this approach was chosen to make the definition applicable to the broad trends in SDS development. In fact, the SNIA Working Draft considers the SDS implementation model to be the main differentiator of SDS: “Data Services can be executed either in servers, storage, or both spanning the historical boundaries of where they execute.” [11] They can run on any storage device and support many different data types and access protocols.

Gartner defines SDS as “an architectural vision that includes the principles of orchestration, instrumentation and automation. [It] can be fully realized only by a standards-driven integration of heterogeneous storage hardware and software platforms.” [12]

Other definitions of SDS focus on separation of data and control planes: “SDS layers a control plane for applications and policy on top of a data plane, which essentially manages information across various forms of infrastructure from on premise to the cloud.” [13] Or they underscore the hardware-agnostic feature of SDS: “SDS is any storage software stack that can be installed on any commodity resources (x86 hardware, hypervisors, or cloud) and/or off-the-shelf computing hardware.” [14] The control plane becomes a centralized storage resource management (SRM) service capable of managing pools of heterogeneous resources across the entire data center.

VMware’s definition of SDS is VM-centric: “Software-defined Storage (SDS) is the vision that storage services are dynamically created and delivered per VM and controlled by policy.” [15] The VMware vision of SDS assumes the transition of storage services from hardware-centric arrays to a VM-centric environment. This will lead to alignment of the storage services with application requirements.

The SNIA concept of SDS emphasizes the platform independence of SDS, allowing customers to use commodity hardware. At the same time, it considers the possibility of integrating traditional storage and SDS, where SDS may be an addition to the existing storage platform that provides new features or enhances the existing functions of specialized hardware.

Figure 1: The SNIA Vision of Software-defined Storage (Ref. 11)

According to the SNIA Working Draft [11], other attributes of SDS are scale-out capability, use of storage resource pools, ability for incremental growth, management automation, self-service interface for users, and ability to set policy for managing the storage and data services (Figure 1).

3. Why Has the SDS Concept Emerged?
The SDS concept was put forward in 2012 for a few reasons. Historically, the idea of providing storage using free, non-proprietary storage software running on commodity hardware is not new. For example, Ceph is a free, open-source software storage platform developed in 2007 and designed to run on commodity hardware to present object, block, and file storage from a single distributed computer cluster scalable to the exabyte level [16]. GlusterFS, an open-source software-based network-attached filesystem that deploys on commodity hardware and is capable of scaling to several petabytes and handling thousands of clients [17], is another example of what can be called SDS today.

The term SDS for this type of storage product has become popular for describing a general category of hardware-agnostic storage platforms as their development gains momentum and new business needs emerge. These new business requirements resulted from storage service pain points: fast data growth with the advent of Big Data, inability to keep up with quick changes in business processes and related application workloads, growing storage TCO, and security challenges. The list is not short.

Next-generation applications such as Big Data Analytics and cloud-based applications led to the concept of application-aware storage in the application-centric data center. The traditional storage platforms are not designed for cloud and Big Data applications, having a fundamentally different architecture [18,19]. These new applications, which require hyper-scalability, use standard communication protocols to facilitate universal access and interoperability, mix both structured and unstructured content, process massive data sets, and must store both object- and block-access content. They need a storage platform that can support many different data types and access protocols independently of the hardware.
This has been characterized as the emergence of the Third Platform [20], driven by cloud computing, Big Data, social media, and mobile computing. The Third Platform includes the SDS concept as a component of a broader vision of the Software-defined Data Center (SDDC).

Figure 2: Business Drivers for Software-defined Storage (Ref. 21)

With all these new business requirements (Figure 2), the purpose of storage virtualization becomes not consolidation, as it was in the recent past, but rather storage agility requiring storage service provisioning on demand. As the application landscape is moving from “many applications on one server” to “one application on many servers” deployment, the role of scale-out storage technologies is growing.

There has been a confluence of new business needs for storage services with the post-recession mind-set of consumers no longer accepting the escalating cost of storage and looking to “do more with less.” Consumers have understood that simply adding more capacity is unsustainable: they cannot continually purchase more storage as their primary storage reaches the capacity limit. It has become clear that the traditional storage services based on static, slow-to-respond, hardware-centric storage infrastructures cannot scale economically. Growth of the data residing on proprietary storage hardware often leads to premature rip-and-replace upgrades in order to meet new performance and capacity requirements. Companies start to compare their cost of traditional storage service ($ per GB) with the cost of storage services offered by public cloud providers with a very low $ per GB ratio. The end-users begin to demand that their companies deliver storage service with more cloud-like features. If their company cannot provide it, they have the alternative of using cloud provider services. This has led to end-users’ investment into “shadow IT.” Thus, the fundamental shift in customer expectations for storage services is one of the factors which have brought SDS into existence.

Emerging hyperscale computing environments that work with multi-petabytes of storage and tens of thousands of servers change the storage economics. Let us look at the numbers: at traditional, terabyte scale, just a one-cent reduction in the storage service cost ($0.01/GB) results in savings of up to $1,000 per month for 100 TB of storage. However, the same cost reduction ($0.01/GB) for petabyte-scale storage (for example, 100 PB) results in cost savings of $1,000,000 per month. This would be something to tell the CFO about. Using SDS on commodity hardware platforms allows consumers to bring hyperscale economics to their data centers, making them software-defined data centers.

Hyperscale and Big Data applications have become the new normal, and wish lists of many storage customers now include cloud-type scalability to thousands of nodes and support for multi-petabytes of data. Storage services should be able to manage Big Data Analytics, be “application workload smart”, and provide flexibility, intelligence, security, and cost-effectiveness. As proprietary hardware solutions struggle to deliver these features in a cost-effective manner, developing intelligent storage software that allows consumers to build massively scalable unified storage with commodity hardware has come as a solution. This is a conceptual shift from solutions developed for infrastructure based on reliable expensive hardware with unreliable software to an architecture using reliable software running on unreliable commodity hardware.

SDS can be seen as an evolution of legacy storage virtualization through creative destruction [18]. Traditional block-based storage virtualization solutions are based on a virtualizer running on an array-controller, appliance, or SAN-switch blade. SDS is more than just storage virtualization.
In the SDS realm, a storage virtualizer evolves into a storage hypervisor like the VM-centric storage hypervisor developed by DataCore (SANsymphony-V). The storage hypervisor provides a higher level of software intelligence capable of delivering storage services on commodity hardware without relying on any ASIC-built storage functionality.
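
The hyperscale cost arithmetic given earlier in this section can be checked with a short calculation. The sketch below is illustrative only: the function name `monthly_savings_usd` is hypothetical, and it assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and a cost reduction expressed in whole cents per GB per month.

```python
GB_PER_TB = 1_000        # decimal units, an assumption of this sketch
GB_PER_PB = 1_000_000


def monthly_savings_usd(capacity_gb: int, reduction_cents_per_gb: int) -> float:
    """Monthly savings in dollars for a per-GB cost reduction given in cents."""
    return capacity_gb * reduction_cents_per_gb / 100


# One-cent reduction per GB at terabyte scale and at petabyte scale.
terabyte_scale = monthly_savings_usd(100 * GB_PER_TB, 1)
petabyte_scale = monthly_savings_usd(100 * GB_PER_PB, 1)

print(f"100 TB: ${terabyte_scale:,.0f} per month")
print(f"100 PB: ${petabyte_scale:,.0f} per month")
```

The thousand-fold difference in capacity carries straight through to the savings, which is the point of the hyperscale argument.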

4. The Main Capabilities of SDS and Value They Provide to Customers
The SNIA Working Draft on SDS [11] lists the following capabilities SDS platforms should include:
- Automation
- Standard interfaces
- Virtualized data path
- Scalability

Automation is of key importance, as storage users are moving to hyperscale and cloud-based computing. Automated policies used for storage management functions, such as storage provisioning, automated dynamic storage tiering, and information life cycle management (ILM), simplify the management tasks that can be orchestrated by implementing a storage service catalog. This improves service metrics such as storage amount per storage administrator (TB/FTE) and, therefore, reduces the cost of providing storage services.

Standard interfaces are required for programmatic control of storage infrastructure via open and standards-based APIs. The most common of them is Representational State Transfer (REST), which is widely used in clouds and other web-based and network-based services. In contrast, a traditional storage platform mainly uses its own proprietary APIs and management tools.

Virtualized data path is related to block, file, and object interfaces that support applications using these interfaces. The unified SDS platform does not need to use different hardware platforms for every data type and access protocol.

Scalability is critical, as users want to be able to scale the storage infrastructure in a cost-effective manner without disruption to availability or performance.

Other features include federation capability, which allows customers to create a large-scale solution that aggregates disparate storage sources into a single pool with data mobility within the pool. This significantly improves storage resource utilization and makes it possible to avoid creating separate storage silos. SDS should enable mixing and matching hardware types if needed.
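
As a rough illustration of policy-driven provisioning over a federated pool, the sketch below picks a backend for a request by matching stated requirements against advertised capabilities. Everything in it is hypothetical: the `Backend` and `Policy` classes, their fields, and the `select_backend` function are assumptions made for illustration and do not correspond to the SNIA draft or to any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backend:
    """One storage resource in the federated pool (hypothetical model)."""
    name: str
    media: str              # e.g. "flash" or "disk"
    protocols: frozenset    # access protocols the backend exposes
    free_tb: float


@dataclass(frozen=True)
class Policy:
    """Requirements a storage request must satisfy (hypothetical model)."""
    protocol: str
    media: str
    capacity_tb: float


def select_backend(pool: list, policy: Policy) -> Optional[Backend]:
    """Return the matching backend with the most free capacity, or None."""
    candidates = [
        b for b in pool
        if policy.protocol in b.protocols
        and b.media == policy.media
        and b.free_tb >= policy.capacity_tb
    ]
    return max(candidates, key=lambda b: b.free_tb, default=None)


# A mixed pool: different hardware types aggregated behind one software layer.
pool = [
    Backend("array-a", "disk", frozenset({"block", "file"}), 40.0),
    Backend("server-b", "flash", frozenset({"block"}), 5.0),
    Backend("server-c", "flash", frozenset({"block", "object"}), 12.0),
]

chosen = select_backend(pool, Policy(protocol="block", media="flash", capacity_tb=4.0))
print(chosen.name if chosen else "no backend satisfies the policy")
```

Choosing the backend with the most free capacity is only one possible placement rule; a real policy engine would also weigh service levels, cost, and data mobility.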

Automated policies used to maintain service levels and match requirements with capabilities allow storage administrators to focus on higher-level tasks rather than spending time fixing run-and-maintain type problems.

The main benefit for customers from the cost perspective is that SDS decouples the software from the hardware life cycle. For a hardware-defined platform, a mix and match of multiple system generations is typically not allowed, and hardware upgrades are intricately tied to the development cycle of the software that is supported on that platform. By decoupling the software from the hardware life cycle, customers can extend the value of their investment by using a mix of hardware instances and generations. They can run and federate newer software on multiple older hardware instances as necessary for greater investment protection.

Another benefit is that SDS provides the customer with flexibility in procuring the storage platform via multiple delivery models: as software and hardware instances (appliances), as software only (which means the customer chooses the hardware instance), or as virtual machines, etc. (see Section 9). As a result, customers are able to leverage existing investments and gain cost savings. SDS allows customers to homogenize the hardware-vendor heterogeneous data center at the software layer [22].

Table 1 provides a comparison between traditional hardware-centric storage and SDS and shows how companies can benefit from implementing SDS.

| Features | Traditional Hardware-centric Storage | Software-defined Storage (SDS) |
|---|---|---|
| Reliability | Mature technologies. Stable software. Reliability is built in storage hardware, software protects against rare cases that are not handled by hardware protection features. | SDS is developed under an assumption that the hardware is not necessarily reliable and the software is responsible to continue providing storage services in case of hardware failures. The alternative to high cost redundancy in a single device is to scale out by adding additional servers and storage in a distributed SDS environment. |
| Performance/quality of | While implementation of storage pools shared by applications with very | Linear scalable performance. By definition, SDS platform |
