Memory & Storage Day Intel Optane DC Persistent Memory Breakout Session

Transcription

Memory & Storage Day
Intel Optane DC Persistent Memory Breakout Session
Mohamed Arafa, PhD, Sr. Principal Engineer, Datacenter Engineering and Architecture

Legal Disclaimer

All information provided here is subject to change without notice. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.

Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

Intel does not control or audit third-party data. You should review this content, consult other sources, and confirm whether referenced data are accurate.

Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

Intel Optane DC and Intel QLC Technology Comparison

[Chart: graphical comparison of the two technologies across attributes including affordability (per capacity) and persistency¹; bigger is better. Note that these can be orders of magnitude apart.²]

1. Persistency is enabled when product is used in App Direct mode.
2. Graphical representation of product comparison is based on internal Intel analysis, and is provided here for informational purposes only. Any differences in system hardware, software or configuration may affect actual performance.

More Acceleration: Offload Functions with FPGA

[Diagram: FPGA-based offload attached over UPI; announced September '19.]

SAP HANA: Faster Restart Times, Increased Memory Capacity

[Diagram: SAP HANA memory layout. Working memory (row store, column store Delta, volatile data structures) stays in DRAM; the column store Main moves to persistent memory; LOG and DATA remain on storage.]

- Working memory: volatile data structures remain in DRAM; the column store Main moves to persistent memory (DIMM form factor, replacing DRAM).
- The SAP HANA Main Store is relocated to larger persistent memory to achieve lower TCO; this can be configured for each table, partition, or column.
- Loading of tables into memory at startup becomes obsolete.
- Lower TCO, larger capacity.
- No changes to the persistence layer (LOG and DATA).

SAP HANA controls what is placed in persistent memory and what remains in DRAM.

Redis

[Diagram: without persistent memory, a write request stores the key and value in DRAM and appends the full operation into the AOF log file. With persistent memory, the key is stored in DRAM, the value is stored in Intel Optane DC persistent memory, and only a pointer to the value is appended into the AOF log.]

- Moving values to Intel Optane DC persistent memory improves performance for AOF-Always.
- Use persistence to reduce writes to SSD: a pointer only instead of the full value.
- Direct access vs. disk protocol.
- Reduces DRAM requirements.

Moving the value to App Direct reduces DRAM and optimizes logging.
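The pointer-only AOF idea above can be illustrated with a toy byte-accounting model. This is a sketch under assumed sizes (the value, key, and pointer sizes below are hypothetical placeholders, not measured Redis figures), showing why logging a pointer into persistent memory instead of the full value shrinks what gets appended to the SSD-resident log:

```python
# Toy model: bytes appended to the AOF log per write, comparing
# full-value logging (value goes into the SSD-backed log) with
# pointer-only logging (value lives in persistent memory, the log
# stores a fixed-size pointer). All sizes below are illustrative
# assumptions, not measurements.

VALUE_SIZE = 1024   # hypothetical average value size in bytes
POINTER_SIZE = 8    # a pointer into persistent memory
KEY_SIZE = 32       # hypothetical key size

def aof_bytes(num_writes, payload_bytes):
    # Each log record carries the key plus either the value or a pointer.
    return num_writes * (KEY_SIZE + payload_bytes)

full_value = aof_bytes(1_000_000, VALUE_SIZE)
pointer_only = aof_bytes(1_000_000, POINTER_SIZE)

print(full_value // pointer_only)  # log-write reduction factor
```

With these assumed sizes, the log traffic drops by more than an order of magnitude; the real reduction depends entirely on the value-size distribution of the workload.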

Oracle Exadata: Persistent Memory Accelerator for OLTP

- Exadata Storage Servers will add a Persistent Memory Accelerator in front of flash memory.
- RDMA bypasses the software stack, giving 10X faster access latency to remote persistent memory.
- Persistent memory is mirrored across storage servers for fault tolerance.
- Persistent memory used as a shared cache effectively increases its capacity 10X vs. using it directly as expensive storage.
- Log writes will use RDMA to achieve super-fast commits.

10X lower latency. Slide courtesy of Oracle. See Appendix A for system configuration information.
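The "persistent memory in front of flash" pattern above can be sketched as a small read cache. This is a toy model, not Oracle's implementation: the class name, the LRU policy, and the latency constants (a ~10X gap, echoing the slide's claim) are all illustrative assumptions.

```python
# Minimal sketch of a cache tier in front of slower storage:
# 'PmemReadCache' stands in for the persistent-memory tier,
# the 'flash' dict for the flash tier. Latencies are illustrative
# placeholders in microseconds, not measured Exadata numbers.

from collections import OrderedDict

FLASH_LATENCY_US = 200  # assumed flash read latency
PMEM_LATENCY_US = 20    # assumed RDMA-to-PMEM latency (~10X lower)

class PmemReadCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # LRU order: oldest entry first

    def read(self, key, flash):
        if key in self.cache:
            self.cache.move_to_end(key)    # refresh LRU position
            return self.cache[key], PMEM_LATENCY_US
        value = flash[key]                 # miss: fetch from flash
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict least-recently-used
        return value, FLASH_LATENCY_US

flash = {i: f"block-{i}" for i in range(100)}
cache = PmemReadCache(capacity=10)
_, cold = cache.read(7, flash)  # first read pays flash latency
_, warm = cache.read(7, flash)  # repeat read is served from the cache
print(cold, warm)
```

Because every storage server's persistent memory holds cache entries rather than mirrored primary data, hot blocks across the whole fleet share the same pool, which is the intuition behind the "effectively 10X capacity" claim on the slide.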

Accelerating Cassandra with App Direct Mode

- Table Direct persistence enables synchronous data protection and higher availability through fast restarts.
- A short read-lookup code path simplifies the storage engine and accelerates reads; data is sharded within persistent memory to reduce locking.
- Multi-TB dataset persisted directly in memory.

[Diagram: Cassandra front end with reader/writer threads over a PM storage engine built from per-shard row trees and queues. Adaptive radix tree: https://db.in.tum.de/%7Eleis/papers/ART.pdf]

Workload | 8x NAND NVMe SSD (1.6TB): throughput (op/sec) / client threads | 12x Intel Optane DC PMEM: throughput (op/sec) / client threads | Throughput speedup | Client-load increase
80/20 Read/Update mix | 76,747 / 800 | 491,831 / 4,400 | 6.4X | 5.5X
Writes (updates) | 54,013 / 440 | 390,935 / 4,000 | 7.2X | 9.0X

Up to 7X client threads and 8X throughput¹ with Intel Optane DC Persistent Memory vs an NVMe-based solution. See Appendices B, C and D for system configuration information.

1. Performance results are based on testing as of 2/20/2019 and may not reflect all publicly available security updates. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
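The per-shard locking the slide describes ("data sharded within PM to reduce locking") can be sketched as follows. This is an illustrative pattern, not Cassandra's actual storage-engine code: the shard count, hashing, and dict-backed shards are assumptions standing in for the per-thread row trees in persistent memory.

```python
# Illustrative sketch of sharded storage with per-shard locks, so
# concurrent reader/writer threads touching different shards never
# contend on the same lock. Not Cassandra code; plain dicts stand
# in for the persistent-memory row trees on the slide.

import threading

NUM_SHARDS = 8  # assumed shard count

class ShardedStore:
    def __init__(self):
        self.shards = [dict() for _ in range(NUM_SHARDS)]
        self.locks = [threading.Lock() for _ in range(NUM_SHARDS)]

    def _shard(self, key):
        return hash(key) % NUM_SHARDS

    def put(self, key, row):
        i = self._shard(key)
        with self.locks[i]:   # only this shard is locked, not the store
            self.shards[i][key] = row

    def get(self, key):
        i = self._shard(key)
        with self.locks[i]:
            return self.shards[i].get(key)

store = ShardedStore()
store.put("user:1", {"name": "a"})
print(store.get("user:1"))
```

With many shards relative to threads, the probability that two operations collide on one lock stays low, which is what lets the reader/writer thread counts in the table scale.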

Backup

Appendix A: Oracle Exadata Example & Config Details

10X lower latency. Slide courtesy of Oracle.

Appendix B - Cassandra config (1)

Parameters for the two configurations (NVMe vs DCPMM); values not legible in the transcription are marked "—":

- Test by: Intel/Java Performance (both)
- Test date, Platform, # Nodes, # Sockets, CPU, Cores/socket / Threads/socket, ucode, HT, Turbo: —
- BIOS version: C620.86B.0D.01.0286.0111.20190816 (both)
- DCPMM BKC version: NA (NVMe); WW52-2018 (DCPMM)
- DCPMM FW version: NA (NVMe); 5318 (DCPMM)
- System DDR Mem Config (slots / cap / run-speed): 12 slots / 16GB / 2666 (both)
- System DCPMM Config: — (NVMe); 12 slots / 512GB (DCPMM)
- Total Memory/Node (DDR, DCPMM): 192GB, 0 (NVMe); 192GB, 6TB (DCPMM)
- Storage - boot: 1x Intel 800GB SSD OS drive (both)
- Storage - application drives: 4x P4610 1.6TB NVMe (NVMe); 12x 512GB DCPMM (DCPMM)
- NIC: 1x Intel X722 (both)
- OS: Red Hat Enterprise Linux Server 7.6 (both)
- Kernel: 4.19.0 (64-bit) (both)
- Mitigation log attached: Yes (both)
- DCPMM mode: NA (NVMe); App Direct, Persistent Memory (DCPMM)
- Run method: 5 minute warm up post boot, then start performance recording (both)
- Iterations and result choice: 3 iterations, median (both)
- Dataset size: two 1.5 billion partitions (Insanity schema) (both)
- Workload & version: Read Only; Mix 80% Read / 20% Updates; Updates Only (both)
- Compiler: ANT 1.9.4 compiler for Cassandra (both)
- Libraries: NA (NVMe); PMDK 1.5, LLPL (latest as of 2/20/2019) (DCPMM)
- Other SW (Frameworks, Topologies): NA (both)

Appendix C - Cassandra config (2)

Software versions:
- Cassandra: NVMe uses the 3.11.3 released version; DCPMM uses the 4.0 trunk with the persistent-memory a/tree/13981 llpl engine
- PMDK: 1.5
- LLPL: https://github.com/pmem/llpl/ pulled 2/20/19
- Java: Java SE Runtime Environment 1.8.0_201, Java HotSpot 64-bit Server VM (build 25.201)

Cassandra settings:
- Yaml modifications: concurrent read / concurrent write 168/168 for DCPMM; 56/32 for NVMe
- Jvm.options (comment out CMS section in file): -Xms64G -Xmx64G -Xmn48G for DCPMM, no read cache; -Xms32G -Xmx32G -Xmn24G for NVMe, more read cache; -XX:+UseAdaptiveSizePolicy for both
- Number of Cassandra processes, databases, clusters: 2 independent Cassandra processes, each with a database, each process running a 1-node cluster configuration
- Cassandra database per application: cqlstress-insanity-example.yaml schema, with 1.5 billion partitions per database (3.0 billion total)
- Cassandra application pinned to CPU: numactl -m 0 -C 0-27,56-83 for socket 0; numactl -m 1 -C 28-55,84-111 for socket 1
- Cassandra-stress command to populate database: cassandra-stress user profile=CASSANDRA_HOME/tools/cqlstress-insanity-example.yaml ops\(insert=1\) n=1500000000 cl=ONE no-warmup -pop seq=1..1500000000 -mode native cql3 -node <ip addr> -rate threads=<variable>
- Cassandra-stress command to read database: cassandra-stress user profile=CASSANDRA_HOME/tools/cqlstress-insanity-example.yaml ops\(simple1=1\) duration=30m cl=ONE -pop dist=UNIFORM\(1..1500000000\) -mode native cql3 -node <ip addr> -rate threads=<variable>

Appendix D - Cassandra Result Summary

Methodology: adjust the cassandra-stress load (number of client threads) to get the maximum throughput at which the 99th-percentile latency is less than 20 ms. This method has been accepted by our partners (Apple, Netflix and others).

There are two different ways of classifying the speedup:
- Increased throughput: for example, the maximum seen, for the read workload, is 8.13 times more throughput with DCPMM vs NVMe.
- Increased number of supported client threads: for example, the maximum seen, for the update workload, is 9.09 times more client threads supported for a similar SLA with DCPMM vs NVMe.

Result table (throughput in op/sec, 99th-percentile latency in ms, client load in # threads; cells not legible in the transcription are marked "—"):

Workload | NVMe throughput | NVMe 99th latency | NVMe client load | DCPMM throughput | DCPMM 99th latency | DCPMM client load | Throughput speedup | Client-load increase
Read | 66,018 | — | — | — | — | — | 8.13X | —
Mix (80/20) | 76,747 | — | 800 | 491,831 | — | 4,400 | 6.41X | 5.50X
Update | 54,013 | — | 440 | 390,935 | 6.1 | 4,000 | 7.23X | 9.09X
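As a quick sanity check, the mix and update multipliers can be recomputed from the throughput and client-thread figures reported on the Cassandra slide earlier in the deck (the read-workload DCPMM throughput is not legible in the transcription, so only these two workloads are checked; small differences from the quoted values come from rounding):

```python
# Recompute the DCPMM-vs-NVMe multipliers from the reported
# (ops/sec, client threads) pairs for the mix and update workloads.

nvme = {"mix": (76_747, 800), "update": (54_013, 440)}
dcpmm = {"mix": (491_831, 4_400), "update": (390_935, 4_000)}

for wl in ("mix", "update"):
    tput = dcpmm[wl][0] / nvme[wl][0]
    load = dcpmm[wl][1] / nvme[wl][1]
    print(f"{wl}: {tput:.2f}X throughput, {load:.2f}X client load")
```

This reproduces the quoted multipliers: roughly 6.4X throughput and 5.5X client load for the 80/20 mix, and roughly 7.2X throughput and 9.09X client load for updates.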
