SCALING AN ALL FLASH ARRAY

Transcription

SCALING AN ALL FLASH ARRAY: THE DEVIL IS IN THE DRAM
NEIL VACHHARAJANI, LEAD SOFTWARE ARCHITECT
2015 Pure Storage Inc.

The Economics of an All Flash Storage Array: The Media
Diagram labels: $/GB; Steep Descent; Data Reduction; Low $/GB; High $/GB; Low $/GB usable.

The Economics of an All Flash Storage Array: The Full System
  4x Intel Xeon CPUs
  512 GB to 1 TB DRAM
  4x FC HBAs
  6x SAS HBAs
  2x drive enclosures
  1x server chassis
  battery backup
  Infiniband HCA
  Infiniband switch
  (diagram label: 88x)
SIGNIFICANT COST OF A SYSTEM LIES OUTSIDE OF FLASH

Scale-Up vs Scale-Out: A False Choice
Scale-Up
  Amortizes the overhead over more flash
  Capacity and performance scale separately
  Limitations on how far you can scale
Scale-Out
  No limitations on how far you can scale
  Capacity and performance scale together
  Fixed overhead per TB of storage
    CPU and DRAM
    HBAs, NICs, switches, switch ports
    Power supplies, rack space, cooling

THE STRAWMAN ARCHITECTURE: CONTENT-ADDRESSED STORAGE

How It Scales: Content Addressed Storage
Diagram: a 16KB I/O is split into four 4KB blocks (Fixed Logical Block Size). Both layers are DRAM Resident.
  Logical Layer: LBA -> hash (Scales with logical capacity)
  Physical Layer: hash -> phys (Scales with physical capacity)
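The two-layer mapping on this slide can be sketched in a few lines. This is a toy model under stated assumptions, not the design of any particular product: the class name `ContentAddressedStore`, the in-memory `media` list standing in for flash, and the choice of SHA-1 as the 20-byte hash are all illustrative.

```python
import hashlib

BLOCK_SIZE = 4 * 1024  # fixed logical block size from the slide


class ContentAddressedStore:
    """Toy content-addressed store: LBA -> hash -> physical location."""

    def __init__(self):
        self.lba_to_hash = {}   # logical layer: one entry per written logical block
        self.hash_to_phys = {}  # physical layer: one entry per unique stored block
        self.media = []         # stand-in for flash; list index = physical address

    def write(self, lba, data):
        """Split an I/O into fixed-size blocks and store each one by its hash."""
        if len(data) % BLOCK_SIZE:
            raise ValueError("I/O must be a multiple of the block size")
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha1(block).digest()  # 20-byte hash (assumed SHA-1)
            if digest not in self.hash_to_phys:    # new content: store it once
                self.media.append(block)
                self.hash_to_phys[digest] = len(self.media) - 1
            # LBAs are counted in blocks here, not bytes.
            self.lba_to_hash[lba + i // BLOCK_SIZE] = digest

    def read(self, lba):
        """Follow LBA -> hash -> phys to return one logical block."""
        return self.media[self.hash_to_phys[self.lba_to_hash[lba]]]


store = ContentAddressedStore()
store.write(0, b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE)  # one 16KB I/O
```

After the 16KB write, the logical layer holds four entries while the physical layer holds only two, which is why the first map grows with logical capacity and the second with physical capacity.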

Doing the Math: Content Addressed Storage

Physical layer
  Quantity    Type
  50 TiB      Physical capacity
  4 KiB       Logical block size
  12.5 Gi     Blocks
  20 B        Cryptographic hash size
  250 GiB     Hash metadata for the physical layer

Logical layer
  Quantity    Type
  250 TiB     Logical capacity
  4 KiB       Logical block size
  62.5 Gi     Blocks
  20 B        Cryptographic hash size
  1250 GiB    Hash metadata for the logical layer

DRAM HUNGRY ARCHITECTURE
  1500 GiB JUST FOR HASHES!
SCALES LINEARLY
  2X CAPACITY -> 2X DRAM
AT ODDS WITH DATA REDUCTION
  MORE LOGICAL CAPACITY -> MORE DRAM
TRADE DRAM FOR DEDUPE
  2X BLOCK SIZE -> ½ DRAM
  2X BLOCK SIZE -> LESS DEDUPE
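The figures on this slide can be reproduced with a short calculation. The function name `hash_metadata_bytes` is hypothetical; the capacities, block size, and hash size are the ones quoted on the slide.

```python
KIB, GIB, TIB = 2**10, 2**30, 2**40


def hash_metadata_bytes(capacity_bytes, block_size=4 * KIB, hash_size=20):
    """DRAM needed to hold one hash per block of the given capacity."""
    blocks = capacity_bytes // block_size
    return blocks * hash_size


physical = hash_metadata_bytes(50 * TIB)   # physical layer
logical = hash_metadata_bytes(250 * TIB)   # logical layer
print(physical / GIB, logical / GIB, (physical + logical) / GIB)
# prints: 250.0 1250.0 1500.0

# Doubling the block size halves the metadata, at the cost of coarser dedupe.
print(hash_metadata_bytes(50 * TIB, block_size=8 * KIB) / GIB)
# prints: 125.0
```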

A MORE SCALABLE ARCHITECTURE: DECOUPLING DEDUPE FROM THE DATA PATH

Making Metadata Not DRAM Resident: A More Scalable Architecture
Diagram: a 16KB I/O is split into four 4KB blocks (Fixed Logical Block Size). Label: DRAM Resident Metadata.
  Logical Layer to Physical Layer: LBA -> phys (Scales with logical capacity)
  hash -> phys (Scales with physical capacity)

Efficiently Encoding the Block Map: A More Scalable Architecture
Diagram: a 16KB I/O (Fixed Logical Block Size).
  Logical Layer to Physical Layer: LBA and LBA extent -> phys (Scales with logical capacity, reduced coefficient)
  hash -> phys (Scales with physical capacity)
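One way to picture the "reduced coefficient" is to merge runs of contiguous block mappings into extents, so that a large sequential write costs one map entry instead of one per 4KB block. The sketch below is an assumption about how such an encoding could work, not the format described in the talk; `coalesce` and `lookup` are hypothetical names, and addresses are counted in blocks.

```python
from bisect import bisect_right


def coalesce(block_map):
    """Merge per-block {lba: phys} pairs into (lba_start, phys_start, length) extents."""
    extents = []
    for lba, phys in sorted(block_map.items()):
        if extents:
            start, pstart, length = extents[-1]
            # Extend the last extent when both LBA and phys continue contiguously.
            if lba == start + length and phys == pstart + length:
                extents[-1] = (start, pstart, length + 1)
                continue
        extents.append((lba, phys, 1))
    return extents


def lookup(extents, lba):
    """Binary-search the extent covering lba; return its physical block or None."""
    i = bisect_right(extents, (lba, float("inf"), float("inf"))) - 1
    if i >= 0:
        start, pstart, length = extents[i]
        if start <= lba < start + length:
            return pstart + (lba - start)
    return None


block_map = {0: 100, 1: 101, 2: 102, 3: 103, 8: 500, 9: 501, 10: 40}
extents = coalesce(block_map)  # seven per-block entries become three extents
```

The map still grows with logical capacity, but by one entry per contiguous run rather than one per block.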

Efficiently Encoding the Dedupe Map: A More Scalable Architecture

Cryptographic hash
  Quantity    Type
  4 KiB       Dedupe block size
  20 B        Cryptographic hash size
  0.5%        Hash size/block size

Non-cryptographic hash
  Quantity    Type
  4 KiB       Dedupe block size
  8 B         Non-cryptographic hash size
  0.2%        Hash size/block size

2.5X SAVINGS FROM SIMPLER HASH
  SMALLER HASH -> MORE HASHES IN DRAM
VERIFY DATA IS REALLY DUPE
  8B HASH -> HIGHER CHANCE OF COLLISION
  JUST CHECK THE DATA
DATA SET VS DATA STREAM
  SET: MANY COPIES OF IDENTICAL DATA
  STREAM: MOSTLY UNIQUE DATA
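The "just check the data" step can be sketched as follows. Everything here is illustrative: the class `VerifyingDedupe`, the in-memory `media` list, and the use of BLAKE2b truncated to 8 bytes as a stand-in for whichever 8-byte non-cryptographic hash a real system would pick are assumptions, not details from the talk.

```python
import hashlib

BLOCK_SIZE = 4 * 1024


def short_hash(block):
    # Stand-in 8-byte hash (assumption): BLAKE2b truncated to 8 bytes.
    return hashlib.blake2b(block, digest_size=8).digest()


class VerifyingDedupe:
    """Dedupe on a short hash, but trust a match only after comparing the data."""

    def __init__(self, hash_fn=short_hash):
        self.hash_fn = hash_fn
        self.media = []       # stand-in for flash; list index = physical address
        self.dedupe_map = {}  # 8-byte hash -> physical address

    def store(self, block):
        """Return (physical address, was_deduplicated)."""
        h = self.hash_fn(block)
        phys = self.dedupe_map.get(h)
        # A hash match is only a hint; confirm by comparing the stored bytes.
        if phys is not None and self.media[phys] == block:
            return phys, True
        # Unique data, or a hash collision: write the block as new.
        self.media.append(block)
        new_phys = len(self.media) - 1
        if phys is None:
            self.dedupe_map[h] = new_phys
        return new_phys, False
```

With the byte comparison in place, a collision costs a missed dedupe opportunity rather than data corruption, which is what makes the smaller hash safe to use.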

Squeezing Dedupe Map into the Dedupe Set: A More Scalable Architecture
Diagram: Dedupe Map (hash -> phys) and Dedupe Set (hash).
MOST BLOCKS ARE UNIQUE
  LOOKUP IN MAP ONLY IF IN SET
  SET CAN BE REPRESENTED WITH LESS DRAM
DEDUPE SET OPTIMIZATIONS
  NO FALSE NEGATIVES
  ALLOW FALSE POSITIVES
  MUCH MORE COMPACT REPRESENTATION!
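A Bloom filter is one well-known structure with exactly these properties (no false negatives, some false positives, very compact). The slide does not name the structure actually used, so treat the sketch below as an assumption; `BloomFilter` and `find_duplicate` are hypothetical names, and the bit count and hash count are arbitrary.

```python
import hashlib


class BloomFilter:
    """Compact set membership: never misses an added item, may report extras."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting BLAKE2b with the index.
        for i in range(self.num_hashes):
            d = hashlib.blake2b(item, digest_size=8,
                                salt=i.to_bytes(2, "little")).digest()
            yield int.from_bytes(d, "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def find_duplicate(block_hash, dedupe_set, dedupe_map):
    """Consult the (possibly non-DRAM-resident) map only if the set says maybe."""
    if block_hash not in dedupe_set:
        return None                    # definitely unique: skip the map lookup
    return dedupe_map.get(block_hash)  # still None on a false positive


dedupe_set = BloomFilter(num_bits=1 << 16, num_hashes=4)
dedupe_map = {}
for phys, h in enumerate(bytes([n]) * 8 for n in range(100)):
    dedupe_set.add(h)
    dedupe_map[h] = phys
```

Because most incoming blocks are unique, most lookups end at the compact set and never touch the larger map.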

CONCLUSIONS: THE DEVIL IS TRULY IN THE DETAILS

Conclusions: The Devil is Truly in the Details
WHAT WE BUILT
  FlashArray //m
  //m70 SCALES TO 136 TB
IMPLEMENTATIONS MATTER TOO!
  MANAGING FLASH IS TRICKY
ECONOMICS IS ONLY PART OF THE STORY
  PERFORMANCE MATTERS
  SIMPLICITY MATTERS
  RELIABILITY MATTERS
  OPERATIONS MATTER

THANK YOU. QUESTIONS?
