Easily Manage Your Structured And Unstructured Data For Analytics - Cisco


Easily Manage Your Structured and Unstructured Data for Analytics
With Software-Defined Storage Using IBM Spectrum Scale and Cisco UCS Integrated Infrastructure

Solution Brief
August 2016

Cisco in collaboration with IBM

Highlights

Deploy Integrated Infrastructure for Software-Defined Storage
Cisco Unified Computing System (Cisco UCS) and IBM Spectrum Scale make it easy to quickly deploy consistent software-defined storage.

Automate Storage Tiering and Data Lifecycle Management
Storage policies allow data to be automatically tiered, compressed, and migrated to the right storage platform.

Keep Information Secure
Native encryption capabilities help protect data from unauthorized access, and cryptographic erase features provide fast and secure file deletion.

Simplify Management
A single namespace and a single point of management make it easy for you to manage and monitor large volumes of file and object data.

Reduce Cost
The solution supports a range of disk storage options, and inexpensive solid-state disk (SSD) drives or flash memory can be used in local caches to balance capacity, performance, and cost.

Reduce Risk
This prevalidated solution from Cisco and IBM reduces integration and deployment risk by giving your IT staff proven configurations and guidelines for deployment.

Cisco and IBM provide the right foundation for software-defined storage so that you can easily store, access, and manage your structured and unstructured data for big data analytics.

With digital transformation changing the way your business works, your data center must do more than simply store digital records. You rely on your computing, storage, and networking resources to store, retrieve, and analyze data to help you achieve a competitive advantage.
IBM Spectrum Scale running on Cisco UCS Integrated Infrastructure for Big Data and Analytics delivers the performance and scalability you need to quickly store data to and retrieve data from your content repositories.

Cisco UCS Integrated Infrastructure for Big Data and Analytics with IBM Spectrum Scale

Cisco UCS Integrated Infrastructure for Big Data and Analytics with IBM Spectrum Scale integrates computing, network, storage, and management resources into a cohesive programmable infrastructure that can respond to the demands of users and workloads. The solution unifies Cisco UCS C240 M4 Rack Servers, Cisco Unified Computing System (Cisco UCS) fabric interconnects, and IBM Spectrum Scale storage with virtualization, analytics, and file-level and object-level access in a scale-out solution that is prevalidated to reduce integration and deployment risk (Figure 1).

Cisco UCS C240 M4 Rack Servers

Cisco UCS C240 M4 high-density rack servers support a range of computing, I/O, and storage-capacity demands in a compact design. The server uses dual Intel Xeon processor E5-2600 v4 series CPUs and supports up to 1.5 terabytes (TB) of main memory and a range of hard-disk drive (HDD) and solid-state disk (SSD) drive options. Twenty-four small-form-factor (SFF) disk drives are supported in the performance-optimized option, and 12 large-form-factor (LFF) disk drives are supported in the capacity-optimized option. Cisco UCS C240 M4 servers can be used with the Cisco UCS Virtual Interface Card (VIC) 1227 or 1387, depending on the fabric interconnect that is being used. The VIC 1227 is designed to optimize high-bandwidth and low-latency cluster connectivity. The VIC 1387 offers dual-port Enhanced Quad Small Form-Factor Pluggable (QSFP+) 40 Gigabit Ethernet and Fibre Channel over Ethernet (FCoE) in a modular LAN-on-motherboard (mLOM) form factor.

© 2016 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information.

Cisco UCS 6200 and 6300 Series Fabric Interconnects

Cisco UCS 6200 Series Fabric Interconnects provide high-bandwidth, low-latency connectivity for servers, with integrated, unified management provided for all connected devices by Cisco UCS Manager. Cisco UCS 6300 Series Fabric Interconnects, the latest version of this technology, expand these capabilities with support for 10 and 40 Gigabit Ethernet, FCoE, and Fibre Channel connectivity. Deployed in redundant pairs, Cisco fabric interconnects offer the full active-active redundancy, performance, and scalability needed to support the large number of nodes that are typical in clusters serving big data applications.

Cisco UCS Manager

Unified management simplifies your deployment and provisioning processes and provides the automation you need to be efficient. Using the role- and policy-based management capabilities of Cisco UCS Manager, your IT staff can provision servers in minutes rather than the days or weeks required in traditional environments. Ongoing maintenance activities are automated, and advanced monitoring allows the system to raise alarms and send notifications about the health of the solution.

IBM Spectrum Scale

IBM Spectrum Scale allows you to combine flash-memory, hard-disk, and tape storage into a high-performance, low-cost, unified system.
This software-defined storage solution provides many advanced capabilities.

Figure 1. Reference Architecture for Software-Defined Storage Deployment (figure labels: Cisco UCS 6200 Series Fabric Interconnects; Cisco UCS C240 M4 Rack Servers with LFF Disk Drives; 16 Servers per Rack)

Automated storage tiering and data lifecycle management: Storage policies allow data to be automatically tiered, compressed, and migrated to the right storage platform. You can group your storage devices (flash memory, SSD drives, and HDDs) based on your latency, performance, locality, or cost requirements.

Scalability: You can independently scale storage capacity, performance, and protocols. With the flexibility to choose what to scale and when to scale it, you can start with a small configuration and expand to petabytes of capacity.

Information security: Native encryption capabilities can help you protect your data from unauthorized access, theft, loss, and inadvertent or improper deletion. When you want to make sure that your data is deleted, cryptographic erase features provide fast and secure file deletion.

Data reliability, availability, and integrity: With no single point of failure, even in large-scale deployments, you can have confidence that the system will automatically recover so that your data remains available in the event of a node, storage, or other infrastructure failure.

Global file sharing: By using Active File Management (AFM) capabilities, you can deliver the right data to the right user at the right time regardless of location. This distributed disk-caching technology expands the namespace across geographic locations while delivering accelerated read and write performance to every user.

Management simplicity: With a single namespace and a single point of management, you can easily manage very large quantities of file and object data. Your administrators can monitor multiple installations from a single interface, improving visibility, control, and productivity.
Performance: Using inexpensive SSD drives or flash memory located in your servers for your local cache helps accelerate I/O performance. CPUs spend less time waiting for data, and the load on your network and storage resources is significantly reduced, allowing other applications to benefit from available bandwidth.

Delivering Performance

IOzone was used to validate the performance characteristics of Cisco UCS Integrated Infrastructure for Big Data and Analytics with IBM Spectrum Scale. These tests used four Cisco UCS C240 M4 Rack Servers with 6-TB LFF disk drives. The four servers and eight client nodes were connected to two fabric interconnects.

Test Configuration

To avoid a single point of failure, systems and data were protected at several levels.

Server: Three Cisco UCS C240 M4 servers ran the IBM Spectrum Scale software. The servers were configured into a cluster running in peer mode so that the failure of a server would not affect operation.

Disk: The servers were configured with two data replicas to balance cost and reliability. These two copies of data help ensure that information is available in the event of a node failure. Furthermore, the disks in the servers used RAID 6 mechanisms to help ensure data integrity. By using striping, distributed parity with two independent parity blocks, and at least three drives in the configuration, data access was assured even if two drives failed.

Network: Linux bonding (mode 6) was used to combine multiple network links from the servers to the fabric interconnects into a single interface for improved reliability. In mode 6, adaptive load balancing distributes network traffic to optimize the use of the underlying network links and increase throughput.
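The disk-level protection described above (RAID 6 inside each server plus two data replicas across servers) has a direct capacity cost, which can be estimated with simple arithmetic. The sketch below is illustrative only. It assumes, which the brief does not state, that all 12 LFF drives in a server form a single RAID 6 group, and it ignores file-system metadata and formatting overhead. The function and parameter names are hypothetical.

```python
def usable_capacity_tb(servers, drives_per_server, drive_tb,
                       replicas=2, parity_drives=2):
    """Rough usable-capacity estimate in TB.

    Assumptions (not from the brief): one RAID 6 group per server,
    no hot spares, no file-system overhead.
    """
    if drives_per_server <= parity_drives:
        raise ValueError("need more drives than parity drives")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")

    raw_tb = servers * drives_per_server * drive_tb
    # RAID 6 spends the equivalent of two drives per group on parity.
    after_raid6_tb = servers * (drives_per_server - parity_drives) * drive_tb
    # Every block is stored `replicas` times across the cluster.
    usable_tb = after_raid6_tb / replicas
    return {
        "raw_tb": raw_tb,
        "after_raid6_tb": after_raid6_tb,
        "usable_tb": usable_tb,
    }


# Four test servers, assumed to hold 12 x 6-TB LFF drives each.
estimate = usable_capacity_tb(servers=4, drives_per_server=12, drive_tb=6)
print(estimate)
```

Under these assumptions, 288 TB of raw disk becomes 240 TB after RAID 6 parity and about 120 TB once two replicas are kept. Actual figures depend on how the RAID groups and replication are really configured.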

Tests with Different Block Sizes

The solution was tested using multiple transfer block sizes to determine the I/O throughput characteristics of the solution. To fully saturate the servers, IOzone cluster mode was used to start multiple IOzone test threads on multiple client nodes. Doing so allowed the aggregate throughput of the server to be gauged.

Four IOzone threads ran on each of the eight client nodes, resulting in a total of 32 IOzone threads. The sequential read and write test was performed multiple times using transfer block sizes ranging from 4 KB to 16 MB (Figure 2). These results show that the system performs consistently and can adapt to many application scenarios regardless of the application's I/O block size. Because the system used two data replicas, the total throughput measured is twice that shown in Figure 2.

Figure 2. Test Results Using Different Block Sizes (y-axis: throughput in gigabytes per second; x-axis: block size, labeled 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 2 MB, 4 MB, and 16 MB; series: write, rewrite, read, reread)

Tests for Scalability

Several tests were run to highlight the scale-out performance of the solution, with the number of storage nodes increased in each test iteration. IOzone was used to perform a sequential read and write test on multiple nodes. This test was run multiple times over different numbers of server nodes. The solution scales linearly with the number of server nodes, validating that IBM Spectrum Scale can be used in clusters ranging from a few nodes to thousands of nodes to support your applications and growing volumes of data (Figure 3).

Figure 3. Scalability Test Results (y-axis: throughput in gigabytes per second; x-axis: number of servers; series: write, rewrite, read, reread)
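The load shape of the block-size test described above can be written down as a short sketch: the number of concurrent IOzone threads, and the list of transfer block sizes to sweep. This is only a planning aid under stated assumptions. The block sizes are the labels recoverable from Figure 2, so the real sweep may have included other sizes, and the sketch does not invoke IOzone itself. All names are hypothetical.

```python
THREADS_PER_CLIENT = 4   # from the brief
CLIENT_NODES = 8         # from the brief

# Block-size labels recoverable from Figure 2 (may not be the full sweep).
FIGURE_2_LABELS = ["4 KB", "16 KB", "64 KB", "256 KB",
                   "1 MB", "2 MB", "4 MB", "16 MB"]

UNIT_BYTES = {"KB": 1024, "MB": 1024 ** 2}


def to_bytes(label):
    """Convert a label such as '256 KB' to a byte count."""
    value, unit = label.split()
    return int(value) * UNIT_BYTES[unit]


total_threads = THREADS_PER_CLIENT * CLIENT_NODES
block_sizes = sorted(to_bytes(label) for label in FIGURE_2_LABELS)

# One test run per block size, each using every client thread.
test_plan = [{"block_bytes": size, "threads": total_threads}
             for size in block_sizes]

print(f"{total_threads} threads per run, {len(test_plan)} runs")
for run in test_plan:
    print(run["block_bytes"], "bytes per transfer")
```

Keeping the plan as data makes it easy to hand each entry to whatever launcher starts the IOzone threads on the client nodes.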

Easy Ordering

With Cisco UCS Solution Accelerator Paks, you can quickly and easily deploy the software-defined storage you need in your enterprise without the expense entailed in designing and building your own custom solution (Table 1). You can scale the capacity of the solution by adding servers as needed, and the servers are integrated into the cluster in minutes.

Conclusion

If you need software-defined storage, consider programmable infrastructure from Cisco and IBM. This enterprise platform can help you support your data-intensive applications and content repositories. And you can simplify your data workflows, improve service levels, reduce costs, manage risk, and deliver business results today while positioning your data center for growth.

For More Information

For more information about Cisco big data solutions, visit http://www.cisco.com/go/bigdata.
For more information about Cisco UCS Integrated Infrastructure for Big Data, visit http://blogs.cisco.com/datacenter/cpav4.
For more information about IBM Spectrum Scale, visit

Table 1. Cisco UCS Solution Accelerator Paks for Software-Defined Storage

Solution
  Option 1: Capacity Optimized Option 1
  Option 2: Capacity Optimized Option 2

Solution SKU
  Option 1: UCS-SL-CPA4-C1
  Option 2: UCS-SL-CPA4-C2

Server SKU
  Option 1: UCS-SPBD-C240M4-C1
  Option 2: UCS-SPBD-C240M4-C2

Connectivity
  Option 1: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects
  Option 2: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects

Servers
  Option 1: 16 Cisco UCS C240 M4 Rack Servers, each with:
    2 Intel Xeon processor E5-2620 v4 CPUs (8 cores; 256 cores for solution)
    128 GB of memory
    2 x 240-GB 6-Gbps SSD drives
    12 x 6-TB 7.2K LFF SAS drives
    Total of 72 TB of storage and 2.5 GBps of I/O bandwidth
    Cisco UCS VIC 1227
  Option 2: 16 Cisco UCS C240 M4 Rack Servers, each with:
    2 Intel Xeon processor E5-2620 v4 CPUs (8 cores; 256 cores for solution)
    256 GB of memory
    2 x 240-GB 6-Gbps SSD drives
    12 x 8-TB 7.2K LFF SAS drives
    Total of 96 TB of storage and 2.3 GBps of I/O bandwidth
    Cisco UCS VIC 1227

Storage controller
  Cisco 12-Gbps SAS Modular RAID Controller with 2-GB flash-based write cache (FBWC)

Rack space
  Option 1: 36 RU
  Option 2: 36 RU

Scaling
  Option 1: Up to 80 servers per domain with no oversubscription
  Option 2: Up to 80 servers per domain with no oversubscription
  Scalability to thousands of servers with Cisco Nexus 7000 or 9000 Series Switches

Key to abbreviations: 10,000-rpm (10K); 7200-rpm (7.2K); large form factor (LFF); rack units (RU); small form factor (SFF); terabyte (TB); and virtual interface card (VIC)

Americas Headquarters: Cisco Systems, Inc., San Jose, CA
Asia Pacific Headquarters: Cisco Systems (USA) Pte. Ltd., Singapore
Europe Headquarters: Cisco Systems International BV, Amsterdam, The Netherlands

Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website at www.cisco.com/go/offices.

Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: www.cisco.com/go/trademarks. Third party trademarks mentioned are the property of their respective owners.
The use of the word partner does not imply a partnership relationship between Cisco and any other company. (1110R) LE-58601-00 08/16
