DGX Systems: Deep Learning From Desk to Data Center - NVIDIA

Transcription

DGX SYSTEMS: DEEP LEARNING FROM DESK TO DATA CENTER
Markus Weber and Haiduong Vo

NVIDIA DGX SYSTEMS
Agenda: NVIDIA DGX-1; NVIDIA DGX Station

ONE YEAR LATER – NVIDIA DGX-1
Barriers Toppled, the Unsolvable Solved – a Sampling of DGX-1 Impact
Timeline (2016 to 2017): DGX-1 launch; UC; OpenAI; NYU; Mass. General Hosp.; DFKI; DSIA; Microsoft; SAP; NVIDIA SATURNV launch; Facebook

INTRODUCING THE DGX FAMILY
AI Workstation: DGX Station with Tesla V100. The Personal AI Supercomputer.
AI Data Center: DGX-1 with Tesla P100. The World’s First AI Supercomputer in a Box.
AI Data Center: DGX-1 with Tesla V100. The Essential Instrument for AI Research.
Cloud-Scale AI: NVIDIA GPU Cloud. Cloud platform with the highest deep learning efficiency.

NVIDIA DGX-1 OVERVIEW

USERS AND CUSTOMERS FOR NVIDIA DGX-1
Data scientists and AI researchers: Why use NVIDIA DGX-1?
CIO, CTO, CMO, line of business (LOB): Why buy NVIDIA DGX-1?
IT directors and managers: Why add NVIDIA DGX-1 into your datacenter?
Benefits listed: Reduce DL training time. Extract actionable insights. Analyze and visualize vast amounts of data. Create new business opportunities. Cut infrastructure footprint by 400x and reduce cost by 20x. Accelerate deep learning frameworks. Turn huge amounts of data into extreme value. Design more sophisticated neural networks. Reduce power and cooling costs. Save installation and configuration time.

NVIDIA DGX-1 WITH VOLTA
Highest Performance, Fully Integrated HW System
Performance: 1 PetaFLOPS
GPUs: 8x Tesla V100 16GB
Interconnect: 300 GB/s NVLink Hybrid Cube Mesh
CPUs: 2x Xeon
Storage: 7 TB SSD, RAID 0
Network: Quad IB 100Gbps, Dual 10GbE
Form factor and power: 3U, 3200W
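As a quick sanity check on the headline figures above, the following minimal sketch divides the quoted system totals by the GPU count and the power rating. It assumes the 1 PetaFLOPS figure is the aggregate across all eight GPUs and treats 3200 W as the full-system draw; the variable names are illustrative only.

```python
# Figures quoted on the "DGX-1 with Volta" slide.
SYSTEM_PFLOPS = 1.0      # "1 PetaFLOPS" (assumed to be the 8-GPU aggregate)
GPU_COUNT = 8            # "8x Tesla V100 16GB"
GPU_MEMORY_GB = 16       # memory per GPU
SYSTEM_POWER_W = 3200    # "3200W" (assumed to be the whole-system rating)

system_tflops = SYSTEM_PFLOPS * 1000
tflops_per_gpu = system_tflops / GPU_COUNT
total_gpu_memory_gb = GPU_COUNT * GPU_MEMORY_GB
gflops_per_watt = system_tflops * 1000 / SYSTEM_POWER_W

print(f"Per-GPU throughput: {tflops_per_gpu:.1f} TFLOPS")
print(f"Total GPU memory:   {total_gpu_memory_gb} GB")
print(f"Efficiency:         {gflops_per_watt:.1f} GFLOPS/W")
```

The per-GPU result lines up with the DGX Station figures later in the deck, which quote 500 TFLOPS across 4 Tesla V100 GPUs.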

NEXT-GENERATION NVIDIA NVLINK FOR V100
300 GB/sec per GPU, 10x Faster than PCIe Gen3
Figure: NVLink topology for Tesla Pascal (eight P100 GPUs, 2 rings) compared with NVLink for Tesla Volta (3 rings)
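The bandwidth claim can be made concrete with a little arithmetic. The sketch below takes the slide's 300 GB/s and 10x figures at face value, derives the PCIe Gen3 bandwidth that ratio implies, and estimates ideal transfer times for a payload. The 16 GB payload (one V100's full memory) is an illustrative assumption, and real transfers would not reach these peak rates.

```python
NVLINK_GB_PER_S = 300.0      # "300 GB/sec per GPU" from the slide
SPEEDUP_VS_PCIE = 10.0       # "10x Faster than PCIe Gen3" from the slide
PAYLOAD_GB = 16.0            # assumption: one V100's full 16 GB of memory

# PCIe bandwidth implied by the slide's ratio (not a measured value).
implied_pcie_gb_per_s = NVLINK_GB_PER_S / SPEEDUP_VS_PCIE


def ideal_transfer_ms(payload_gb, bandwidth_gb_per_s):
    """Best-case transfer time in milliseconds, ignoring latency and overhead."""
    return payload_gb / bandwidth_gb_per_s * 1000.0


nvlink_ms = ideal_transfer_ms(PAYLOAD_GB, NVLINK_GB_PER_S)
pcie_ms = ideal_transfer_ms(PAYLOAD_GB, implied_pcie_gb_per_s)

print(f"Implied PCIe Gen3 bandwidth: {implied_pcie_gb_per_s:.0f} GB/s")
print(f"NVLink transfer: {nvlink_ms:.1f} ms, PCIe transfer: {pcie_ms:.1f} ms")
```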

NVIDIA DGX-1
Customer Momentum

RIKEN SUCCESS STORY
Fujitsu and NVIDIA Build AI Supercomputer With 24 DGX-1s
Challenges: Enterprises and research organizations embracing AI/DL. Needed to accelerate research in areas including medicine, manufacturing and healthcare. Conventional HPC architectures too costly and inefficient.
Solution: Partnered with Fujitsu for scale-out AI architecture built on DGX-1. 24 DGX-1s deliver 4 petaflops, powering the RIKEN supercomputer. NVIDIA Cloud Services streamlines AI researcher workflow, helping accelerate RIKEN productivity.
Impact: Accelerated real-world implementation of scale-out AI. Enables RIKEN team to take advantage of next-gen DL algorithms. Helping create a future in which AI finds solutions to societal issues.

MASS. GENERAL HOSPITAL SUCCESS STORY
Man, Machine & Medicine: AI-Powered Research at MGH
Challenges: Clinical data science center needed to apply ML to medicine. Data volume requires immense computational capacity to process. Immediate applications include radiology to improve accuracy, reduce variation.
Solution: 1st medical institute in the world to leverage the DGX-1. Center for Clinical Data Sciences expands to partner hosp. (3X data). Deployment has grown to scale-out architecture with 4 DGX-1s.
Impact: New prostate cancer pathology developed on DGX-1 in 6 months. AI/DL becomes critical tool in physician’s toolkit in 5-10 years. Advancements in diagnostics, genomics and genetics.

SMARTER SYSTEMS FOR AI ASSISTED RADIOLOGY
Center for Clinical Data Science Received First DGX Systems with Volta
“Today’s practitioners have a barrage of data thrown at them – lab reports, MRIs, CAT scans, family health histories and more – which makes it incredibly difficult to make decisions. So, having technology that can aid them in this effort can be incredibly transformative.”
-- Dr. Mark Michalski, CCDS Executive Director

BENEVOLENTAI: TRAINING REDUCED TO DAYS
3x-4x FASTER TRAINING
Technology Review Article on DGX-1: The Pint-Sized Supercomputer That Companies Are Scrambling to ing-to-get/
Chart: system installation and model training timelines for NVIDIA DGX-1 versus another GPU system (labels: Days; Weeks of Training)
“The cost of renting enough servers on Amazon Web Services would surpass the system’s $129,000 price tag within a year.”
- Jackie Hunter, CEO, BenevolentAI
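The break-even reasoning in the quote can be sketched as simple arithmetic. The example below uses the quoted 129,000 price tag (assumed to be US dollars) to work out the rental spend at which a year of cloud use equals the purchase price. The 15,000-per-month rental figure is purely hypothetical, and the sketch ignores power, hosting, and support costs.

```python
SYSTEM_PRICE_USD = 129_000      # price tag quoted on the slide
HOURS_PER_YEAR = 365 * 24

# Rental spend at which one year of cloud use equals the purchase price.
break_even_monthly_usd = SYSTEM_PRICE_USD / 12
break_even_hourly_usd = SYSTEM_PRICE_USD / HOURS_PER_YEAR


def months_to_break_even(monthly_rental_usd):
    """Months of rental after which cumulative spend reaches the price tag."""
    return SYSTEM_PRICE_USD / monthly_rental_usd


HYPOTHETICAL_MONTHLY_RENTAL_USD = 15_000  # illustrative value, not from the source
months = months_to_break_even(HYPOTHETICAL_MONTHLY_RENTAL_USD)

print(f"Break-even rental: {break_even_monthly_usd:,.0f} USD/month "
      f"({break_even_hourly_usd:.2f} USD/hour, around the clock)")
print(f"At {HYPOTHETICAL_MONTHLY_RENTAL_USD:,} USD/month: {months:.1f} months")
```

Any rental bill above the break-even monthly figure is consistent with the quote's claim that cloud costs would pass the purchase price within a year.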

DGX STATION OVERVIEW

INTRODUCING NVIDIA DGX STATION
The Fastest Personal Supercomputer for Researchers and Data Scientists
Groundbreaking AI – at your desk
Revolutionary form factor designed for the desk, whisper-quiet
Start experimenting in hours, not weeks, powered by DGX Stack
Productivity that goes from desk to data center to cloud
Breakthrough performance and precision – powered by Volta

INTRODUCING NVIDIA DGX STATION
The Personal AI Supercomputer for Researchers and Data Scientists
Groundbreaking AI – at your desk
Key Features
1. 4 x NVIDIA Tesla V100 GPU
2. 2nd-gen NVLink (4-way)
3. Water-cooled design
4. 3 x DisplayPort (4K resolution)
5. Intel Xeon E5-2698 20-core
6. 256GB DDR4 RAM

SPECIFICATIONS
At a Glance
GPUs: 4x NVIDIA Tesla V100
TFLOPS (GPU FP16): 500
GPU Memory: 16 GB per GPU
NVIDIA Tensor Cores: 2,560 (total)
NVIDIA CUDA Cores: 20,480 (total)
CPU: Intel Xeon E5-2698 v4 2.2 GHz (20-core)
System Memory: 256 GB LRDIMM DDR4
Storage: Data: 3 x 1.92 TB SSD RAID 0; OS: 1 x 1.92 TB SSD
Network: Dual 10GBASE-T LAN (RJ45)
Display: 3x DisplayPort, 4K Resolution
Additional Ports: 2x eSATA, 2x USB 3.1, 4x USB 3.0
Acoustics: 35 dB
Maximum Power Requirements: 1500 W
Operating Temperature Range: 10 - 30 °C
Software: Ubuntu Desktop Linux OS; DGX Recommended GPU Driver; CUDA Toolkit
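Several of the totals in this specification break down cleanly per GPU. The following minimal sketch derives per-GPU core counts and the usable capacity of the data volume from the listed figures. It assumes the totals are spread evenly across the four GPUs and relies on RAID 0 striping, which sums drive capacities without redundancy; all names are illustrative.

```python
# Totals listed in the DGX Station specification.
GPU_COUNT = 4
TOTAL_TENSOR_CORES = 2560
TOTAL_CUDA_CORES = 20480
TOTAL_TFLOPS_FP16 = 500
GPU_MEMORY_GB = 16           # per GPU
DATA_SSD_COUNT = 3
SSD_CAPACITY_TB = 1.92

# Assumes the totals are split evenly across identical GPUs.
per_gpu = {
    "tensor_cores": TOTAL_TENSOR_CORES // GPU_COUNT,
    "cuda_cores": TOTAL_CUDA_CORES // GPU_COUNT,
    "tflops_fp16": TOTAL_TFLOPS_FP16 / GPU_COUNT,
}
total_gpu_memory_gb = GPU_COUNT * GPU_MEMORY_GB

# RAID 0 stripes data across all drives, so usable capacity is the plain sum.
raid0_capacity_tb = round(DATA_SSD_COUNT * SSD_CAPACITY_TB, 2)

for name, value in per_gpu.items():
    print(f"{name} per GPU: {value}")
print(f"Total GPU memory: {total_gpu_memory_gb} GB")
print(f"RAID 0 data volume: {raid0_capacity_tb} TB (no redundancy)")
```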

3X FASTER THAN THE FASTEST WORKSTATIONS
Supercomputing performance at your desk
500 TFLOPS: Water-cooled performance – the only workstation built on 4 Tesla V100s
3X: 3X the performance of today’s fastest GPU workstations
30%: 30% faster training over non-DGX stack solutions
5X: 5X increase in I/O performance with 4-way NVLink vs. PCIe-connected GPUs

NVIDIA DGX SOFTWARE STACK
Fully Integrated Software for Instant Productivity
DGX software stack, top to bottom:
Deep learning user software: NVIDIA DIGITS
Deep learning frameworks: Caffe, CNTK, MXNet, PyTorch, TensorFlow, Theano, and Torch
Containerization tool: NVIDIA Docker
GPU driver: NVIDIA Driver
System: Host OS
Advantages:
Instant productivity with NVIDIA optimized deep learning frameworks
Performance optimized across the entire stack
Faster time-to-insight with pre-built, tested, and ready-to-run framework containers
Flexibility to use different versions of libraries like libc, cuDNN in each framework container

THE POWER TO RUN MULTIPLE FRAMEWORKS AT ONCE
Container Images portable across new driver versions
Diagram: containerized applications, each running under NVIDIA Docker with its own tuned software and CUDA runtime (TF; CNTK, the Microsoft Cognitive Toolkit; Caffe2; Pytorch; other frameworks and apps), all sharing the Linux kernel and CUDA driver on NVIDIA DGX Systems
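To illustrate the layering on this slide, where each container carries its own tuned framework and CUDA runtime while sharing the host's kernel and CUDA driver, here is a minimal sketch that builds one launch command per framework. It assumes the `nvidia-docker` wrapper and its `NV_GPU` environment variable for GPU selection; the image names, tags, container names, and the one-GPU-per-framework assignment are hypothetical placeholders, not values from the presentation. The commands are only printed, never executed.

```python
import shlex

# Hypothetical image names: substitute the framework containers you actually use.
FRAMEWORK_IMAGES = {
    "tensorflow": "example.registry/tensorflow:TAG",
    "cntk": "example.registry/cntk:TAG",
    "caffe2": "example.registry/caffe2:TAG",
    "pytorch": "example.registry/pytorch:TAG",
}


def build_launch_commands(images):
    """Return {framework: (env, argv)}, pinning each container to one GPU."""
    commands = {}
    for gpu_index, (name, image) in enumerate(images.items()):
        # NV_GPU restricts which GPUs the nvidia-docker wrapper exposes (assumption).
        env = {"NV_GPU": str(gpu_index)}
        argv = ["nvidia-docker", "run", "--rm", "--name", f"{name}-job", image]
        commands[name] = (env, argv)
    return commands


launch_commands = build_launch_commands(FRAMEWORK_IMAGES)
for name, (env, argv) in launch_commands.items():
    env_prefix = " ".join(f"{key}={value}" for key, value in env.items())
    print(env_prefix, " ".join(shlex.quote(arg) for arg in argv))
```

Pinning one framework per GPU is only one possible policy; it is used here because four frameworks map neatly onto a four-GPU system such as the DGX Station.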

DL FROM DEVELOPMENT TO PRODUCTION
Accelerated Deep Learning Value with DGX Solutions
From Desk (DGX Station) To Data Center or To Cloud (DGX-1/SATURNV/Cloud)
Diagram labels: installed; Fast Bring-up; Procure DGX Station; Install aining at Scale; Experiment; Tune/Optimize; Deploy; Train; Insights

GE/AVITAS ENHANCE AI FOR ROBOTIC INSPECTION AND AUTOMATED DEFECT RECOGNITION
Powered by DGX-1 and DGX Station

NVIDIA DGX SYSTEMS
Faster AI Innovation and Insight
The World’s First Portfolio of Purpose-Built AI Supercomputers
Get Started in AI – Faster
Effortless Productivity
Performance Without Compromise
For More Information: nvidia.com/dgx-systems
