History And Overview Of High Performance Computing - Gordon College

Transcription

History and overview of high performance computing
CPS343 Parallel and High Performance Computing
Spring 2020

Outline
1 High-Performance Computing in the modern computing era
  - 1940s–1960s: the first supercomputers
  - 1975–1990: the Cray era
  - 1990–2010: the cluster era
  - 2000–present: the GPU and hybrid era
2 What does high performance computing look like today?
  - Top Lists
  - Grand Challenge Problems
  - Parallelism
  - Some things to think about

Military-driven evolution of supercomputing
The development of the modern supercomputer was largely driven by military needs.
World War II
- Hand-computed artillery tables were used during the war; this led to the development in the USA of ENIAC (Electronic Numerical Integrator and Computer) from 1943 to 1946.
- Nazi codes like Enigma were cracked in large part by the Bombe and Colossus machines in the UK.
Cold War
- Nuclear weapon design
- Aircraft, submarine, etc. design
- Intelligence gathering and processing
- Code breaking

ENIAC, 1943
The "Electronic Numerical Integrator and Computer" was the first general-purpose electronic computer. Designed and built at the University of Pennsylvania from 1943 to 1946, it was transferred to Maryland in 1947 and remained in operation until 1955.
Image source: /eniac3.gif

CDC 6600, 1964
The Control Data Corporation's CDC 6600 could do 500 KFLOP/s up to 1 MFLOP/s and was the first computer dubbed a "supercomputer."
Image source: http://en.wikipedia.org/wiki/File:CDC_6600.jc.jpg

ILLIAC-IV, 1966–1976
ILLIAC-IV was the fourth Illinois Automatic Computer.
- Design began in 1966; the goal was 1 GFLOP/s and the estimated cost was $8 million.
- "Finished" in 1971–1972 at a cost of $31 million and a top speed well below the 1 GFLOP/s goal.
- Designed to be a parallel computer with a linear array of 256 64-bit processing elements; the final computer had only a fraction of that number.
Image source: http://en.wikipedia.org/wiki/File:ILLIAC_4_parallel_computer.jpg

Cray 1, 1976
Seymour Cray, the architect of the 6600 and other CDC computers, started his own company, Cray Research Inc., in 1972.
Cray 1
- scalar and vector processor, 80 MHz clock, capable of 133 MFLOP/s in normal use
- sold for $5 to $8 million
- 20-ton compressor for Freon cooling system
- shape facilitated shorter wire lengths, increasing clock speed
Image source: -museum.jpg

Cray X-MP (1982) and Y-MP (1988)
Cray X-MP
- 105 MHz
- two vector processors, each capable of 200 MFLOPS
- memory shared by all processors
Cray Y-MP
- 167 MHz
- 2, 4, or 8 vector processors, each capable of 333 MFLOPS
- shared memory for all processors
Image sources: -computing-in-the-80s/

Vector computers
During the 1980s speed in supercomputers was primarily achieved through two mechanisms:
1 Vector processors: these were designed using a pipeline architecture to rapidly perform a single floating point operation on a large amount of data. Achieving high performance depended on data arriving in the processing unit in an uninterrupted stream.
2 Shared memory multiprocessing: a small number (up to 8) of processors with access to the same memory space. Interprocess communication took place via the shared memory.
Cray was not the only player; other companies like Convex and Alliant marketed vector supercomputers during the 1980s.
Some believed vector processing was a better approach to HPC than using many smaller processors. Seymour Cray famously said "If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?"
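The payoff of a pipelined vector unit can be estimated with a simple textbook model: an s-stage pipeline produces its first result after s cycles and one result per cycle thereafter, so a length-n vector takes s + n - 1 cycles instead of the s * n cycles an unpipelined unit would need. The stage count and vector length below are hypothetical round numbers, not figures from these slides:

```python
def pipeline_cycles(n, stages):
    """Cycles for an s-stage pipeline to process n elements:
    s cycles to fill, then one result per cycle."""
    return stages + n - 1 if n > 0 else 0

def scalar_cycles(n, stages):
    """Unpipelined unit: every element pays the full latency."""
    return stages * n

n, stages = 64, 6  # hypothetical vector length and pipeline depth
speedup = scalar_cycles(n, stages) / pipeline_cycles(n, stages)
print(f"pipelined: {pipeline_cycles(n, stages)} cycles, "
      f"scalar: {scalar_cycles(n, stages)} cycles, "
      f"speedup: {speedup:.1f}x")
```

As n grows, the speedup approaches the pipeline depth s, which is why performance depended on an uninterrupted stream: every stall drains the pipeline and costs another s-cycle refill.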

Distributed memory computers
The age of truly effective parallel computers had begun, but was already limited by access to shared memory.
- Memory contention was a major impediment to increasing speed; the vector processors required high-speed access to memory, but multiple processors working simultaneously created contention for memory that reduced access speed.
- Vector processing worked well with 4 or 8 processors, but memory contention would prevent a 64 or 128 processor machine from working efficiently.

Distributed memory computers
- The alternative to shared memory is distributed memory, where each processor has a dedicated memory space.
- The challenge became implementing effective interprocess communication: processes can't communicate with one another by writing data into shared memory; a message must be passed.
- During the 1980s there began to be a lot of interest in distributed memory computers.
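The idea can be sketched with Python's standard multiprocessing module standing in for the message-passing libraries used on real clusters; this is an illustrative sketch, not cluster code. Each process owns its memory, so data moves only through explicit send and receive operations:

```python
from multiprocessing import Process, Pipe

def worker(conn):
    # The worker has its own private memory; the only way data
    # arrives is as an explicit message on the channel.
    data = conn.recv()        # receive a chunk of work
    conn.send(sum(data))      # send the partial result back
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = Pipe()
    p = Process(target=worker, args=(child_conn,))
    p.start()
    parent_conn.send(list(range(100)))  # message-pass the data out
    print(parent_conn.recv())           # -> 4950, the partial sum
    p.join()
```

The same send/receive pattern, generalized to many nodes and a fast interconnect, is what message-passing systems on distributed memory machines provide.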

Intel iPSC Hypercube, 1985
- between 32 and 128 nodes
- each node has
  - 80286 processor
  - 80287 math co-processor
  - 512K of RAM
  - 8 Ethernet ports (7 for other compute nodes, one for control node)
- used hypercube connection scheme between processors
- base model was a 5-dimension hypercube (2^5 = 32 processors)
- superseded by iPSC/2 in 1987
Image source: 588/
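In a d-dimensional hypercube, each of the 2^d nodes gets a d-bit label and is wired to the d nodes whose labels differ in exactly one bit, so any message needs at most d hops. A small sketch of the addressing scheme:

```python
def hypercube_neighbors(node, d):
    """Neighbors of `node` in a d-dimensional hypercube:
    flip each of the d address bits in turn."""
    return [node ^ (1 << bit) for bit in range(d)]

# 5-dimensional hypercube, as in the base iPSC: 2**5 = 32 nodes,
# each wired to exactly 5 others.
print(hypercube_neighbors(0b00000, 5))  # -> [1, 2, 4, 8, 16]
print(hypercube_neighbors(0b10110, 5))  # -> [23, 20, 18, 30, 6]
```

The number of hops between two nodes is the Hamming distance between their labels, which is what makes the topology attractive: links per node and worst-case distance both grow only as log2 of the machine size.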

Thinking Machines' Connection Machines
- The CM-1 was a massively parallel computer with 65,536 SIMD processing elements arranged in a hypercube.
- Each element was composed of a 1-bit processor and 4 Kbits of RAM (the CM-2 upped this to 64 Kbits).
- CM-5, a MIMD computer with a different network topology, was at the top of the first Top 500 list in 1993. Located at Los Alamos National Lab, it had 1024 processors and was capable of 59.7 GFLOP/s.

Beowulf Clusters, 1994–present
- In 1994 Donald Becker and Thomas Sterling, both at NASA, built a cluster using available PCs and networking hardware.
- 16 Intel 486DX PCs connected with 10 Mb/s Ethernet
- Achieved 1 GFLOP/s on a $50,000 system
- Named Beowulf by Sterling in reference to a quote in some translations of the epic poem Beowulf that says "Because my heart is pure, I have the strength of a thousand men."
Image source: http://www.ligo.caltech.edu/LIGO_web/0203news/0203mit.html

Hybrid clusters, 2000–present
- During the 2000s the trend of increasing processor speeds was replaced with increasing the number of processor cores.
- This led to hybrid clusters with a large number of processors, each with a small number of cores sharing RAM and some cache space.
- With the development of GPUs and other general-purpose accelerator hardware, today's top supercomputers are hybrid clusters with
  - a large number of standard processor nodes, where each node has a multicore processor with some individual cache, some shared cache, and RAM shared by all cores
  - some number of GPUs or other accelerators that are used for offloading specific types of computation from the CPU nodes
- Computation speed is often limited by the data-movement rate.
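That data-movement limit can be made concrete with a simple roofline-style estimate: a kernel's attainable rate is capped either by the floating-point units or by memory bandwidth times the kernel's arithmetic intensity (FLOPs per byte moved). The peak and bandwidth figures below are hypothetical round numbers, not the specs of any machine in these slides:

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline model: performance is the lesser of the compute
    ceiling and the bandwidth ceiling."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

peak, bw = 1000.0, 100.0  # hypothetical: 1 TFLOP/s peak, 100 GB/s memory
# Vector update y[i] += a * x[i]: 2 FLOPs per 24 bytes moved
# (read x[i], read y[i], write y[i]; 8-byte doubles).
intensity = 2 / 24
print(f"{attainable_gflops(peak, bw, intensity):.2f}")  # -> 8.33
```

On this hypothetical machine the update runs at under 1% of peak: the memory system, not the arithmetic units, sets the speed, which is exactly the situation the slide describes.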

The Top 500 List
The site www.top500.org maintains a list of the fastest computers in the world, according to a particular benchmark program. A new list comes out every June and November. As of November 2019 the list starts with:

#1 – 148.6 PFLOP/s – Summit: IBM Power System AC922
Location: Oak Ridge National Laboratory, Tennessee, USA
- 4,608 compute nodes
- two IBM POWER9 processors and 512 GB DDR4 RAM per node
- six NVIDIA Volta V100 GPUs with 640 Tensor cores, 5,120 CUDA cores, and 96 GB HBM2 RAM per node
- Interconnect: Mellanox EDR 100G InfiniBand

#2 – 94.6 PFLOP/s – Sierra: IBM Power System AC922
Location: Lawrence Livermore National Laboratory, California, USA
- 4,320 compute nodes
- two IBM POWER9 processors with 256 GB DDR4 RAM per node
- four NVIDIA Volta V100 GPUs with 640 Tensor cores, 5,120 CUDA cores, and 64 GB HBM2 RAM per node
- Interconnect: Mellanox EDR 100G InfiniBand
Image source: sierraLLNL.width-880.png

#3 – 93.0 PFLOP/s – Sunway TaihuLight: Sunway MPP
Location: National Supercomputing Center, Wuxi, China
- Purpose: research and engineering work
- Cores: 10,649,600
- Memory: 1,310,720 GB
- Processor: Sunway SW26010 260C 1.45 GHz
- Interconnect: Sunway

HPCG: a new benchmark
The benchmark used for the Top 500 list is the High Performance LINPACK (HPL) benchmark and is based on a variant of LU factorization with row partial pivoting. The main criticism of the HPL benchmark is that this problem is not representative of many important applications.

A new benchmark, called the High Performance Conjugate Gradients (HPCG) benchmark (www.hpcg-benchmark.org), is an effort to create a new metric for ranking HPC systems and is intended as a complement to the High Performance LINPACK (HPL) benchmark.

The HPCG benchmark is designed to exercise computational and data access patterns that more closely match a broad set of important applications, and to give incentive to computer system designers to invest in capabilities that will have impact on the collective performance of these applications.
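HPCG's core kernel is the conjugate gradient iteration, whose cost is dominated by sparse matrix-vector products and vector updates rather than the dense factorization work of HPL. A bare-bones dense sketch of the iteration, for illustration only (the real benchmark runs a large sparse, preconditioned system):

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite A
    (dense list-of-lists), starting from x = 0."""
    n = len(b)
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    matvec = lambda M, v: [dot(row, v) for row in M]
    x = [0.0] * n
    r = b[:]          # residual b - A x, since x = 0
    p = r[:]          # initial search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

A = [[4.0, 1.0], [1.0, 3.0]]  # small SPD test system
b = [1.0, 2.0]
print(conjugate_gradient(A, b))  # approximately [0.0909, 0.6364]
```

Each iteration does one matrix-vector product, two dot products, and three vector updates: low arithmetic intensity, memory-bound work, which is exactly the access pattern HPCG wants to reward.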

HPCG: a new benchmark
November 2019 HPCG results from https://www.top500.org:

Supercomputers use a lot of power!
- Summit draws 10,096 kW and uses 88 GWh per year.
- Sierra draws 7,438 kW and uses 65 GWh per year.
- TaihuLight draws 15,371 kW and uses 135 GWh per year.
According to the U.S. Energy Information Administration, in 2018 the average U.S. home used 10,972 kWh of electricity per year.[1]
- Summit uses enough energy to power 8,061 U.S. homes.
- TaihuLight needs about as much power as 12,272 U.S. homes!
Using $0.12 per kWh, the cost of running Summit is just over $10.6 million per year; the annual cost for TaihuLight is over $16.1 million per year! This does not include the cost of cooling!
The Massachusetts Green High Performance Computing Center in Holyoke draws 10,000 kW, mostly hydroelectric power.
[1] http://www.eia.gov/tools/faqs/faq.cfm?id=97&t=3
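The arithmetic behind these figures is worth making explicit: a constant draw of P kW over a year is P × 8,760 kWh. A quick check reproduces the slide's numbers:

```python
HOURS_PER_YEAR = 24 * 365        # 8,760 hours
HOME_KWH_PER_YEAR = 10_972       # EIA average cited above
PRICE_PER_KWH = 0.12             # dollars

def annual_kwh(draw_kw):
    # A constant draw of P kW for a year uses P * 8760 kWh.
    return draw_kw * HOURS_PER_YEAR

for name, kw in [("Summit", 10_096), ("Sierra", 7_438), ("TaihuLight", 15_371)]:
    kwh = annual_kwh(kw)
    print(f"{name}: {kwh / 1e6:.0f} GWh/yr, "
          f"{kwh / HOME_KWH_PER_YEAR:,.0f} homes, "
          f"${kwh * PRICE_PER_KWH / 1e6:.1f}M/yr")
# -> Summit: 88 GWh/yr, 8,061 homes, $10.6M/yr
```

The home-equivalent and dollar figures on the slide fall straight out of this calculation.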

The Green 500 List
It is becoming more common to talk about HPC systems in terms of their MFLOPs per watt as a measure of how efficient they are.

The top of the November 2019 Top 500 list:

System         GFLOPs/watt   Total power (kW)
Summit (ORNL)  14.719        10,096
Sierra (LLNL)  12.723        7,438
TaihuLight     6.051         15,371

The top 3 of the November 2019 Green 500 list:

System         GFLOPs/watt   Total power (kW)
A64FX (Japan)  16.876        118
NA-1 (Japan)   16.256        80
AiMOS (RPI)    15.771        510
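The efficiency numbers follow directly from each machine's Rmax and power draw; a quick computation using the figures quoted earlier in these slides (PFLOP/s converted to GFLOP/s, kW to W):

```python
def gflops_per_watt(rmax_pflops, power_kw):
    """Efficiency: (PFLOP/s -> GFLOP/s) divided by (kW -> W)."""
    return (rmax_pflops * 1e6) / (power_kw * 1e3)

# Rmax and power figures from the slides above
print(f"Summit:     {gflops_per_watt(148.6, 10_096):.2f} GFLOPs/watt")
print(f"Sierra:     {gflops_per_watt(94.6, 7_438):.2f} GFLOPs/watt")
print(f"TaihuLight: {gflops_per_watt(93.0, 15_371):.2f} GFLOPs/watt")
```

Note how a modest Rmax with a small power draw can beat the Top 500 leaders on this metric, which is why the Green 500 ordering looks so different.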

The Green 500 List
November 2019 Green500 results from https://www.top500.org:

Grand Challenges
Quote from: National Science Foundation Advisory Committee for Cyberinfrastructure Task Force on Grand Challenges, Final Report, March 2011:

"The 'Grand Challenges' were U.S. policy terms set in the 1980s as goals for funding high-performance computing and communications research in response to foreign competition. They were described as 'fundamental problems of science and engineering, with broad applications, whose solution would be enabled by high-performance computing resources.'

Today, the Grand Challenges are interpreted in a much broader sense with the realization that they cannot be solved by advances in HPC alone: they also require extraordinary breakthroughs in computational models, algorithms, data and visualization technologies, software, and collaborative organizations uniting diverse disciplines."

Grand Challenges
According to the NSF Grand Challenges final report, some of the important Grand Challenges are:
- Advanced New Materials
- Prediction of Climate Change
- Understanding Biological Systems
- Quantum Chromodynamics and Condensed Matter Theory
- New Combustion Systems
- Semiconductor Design and Manufacturing
- Hazard Analysis and Management
- Assembling the Tree of Life
- Human Sciences and Policy
- Drug Design and Development
- Virtual Product Design
- Energy through Fusion
- Cancer Detection and Therapy
- Water Sustainability
- CO2 Sequestration
- Astronomy and Cosmology

Parallelism
The following slides are shamelessly borrowed from Henry Neeman's presentation "What the heck is Supercomputing?", given at the NCSI-sponsored Introduction to Parallel and Cluster Computing workshop at Oklahoma University, summer 2012.

These slides use a simple metaphor to introduce several key issues (look for the bold underlined terms) in parallel computing that we will need to deal with during our course.

[Slides 35–44 are image slides from Neeman's presentation; no text was transcribed.]

Why bother with HPC?
- Making effective use of HPC takes quite a bit of effort, both in learning how to use it and in developing software.
- It seems like a lot of trouble just to get code to run faster.
- Sure, it's nice to speed up code that normally runs for 24 hours so that it runs in 1 hour, but if you can afford to wait a day for the result, why bother with HPC?

…because it's worth it (usually)
In many cases, HPC is worth pursuing because
- HPC provides the ability (unavailable elsewhere) to solve bigger and more exciting problems: you can tackle bigger problems in the same amount of time and/or you can solve the same sized problems in less time.
- What happens in HPC today will be on your desktop (or in your pocket) in about 10 to 15 years; people working in HPC are ahead of the curve!

A dividend of high performance computing
And now for something completely different…

Question: What historically significant web application was developed and released by the National Center for Supercomputing Applications in 1993?

Answer: The NCSA Mosaic Web Browser. This was the precursor of all browsers such as Netscape, Firefox, Safari, Chrome, etc.
Image source: c plaque.jpg

Acknowledgements
Some material used in creating these slides comes from
- Johnnie W. Baker, CS 4/59995 Parallel Programming, Kent State University
- Henry Neeman, University of Oklahoma Supercomputing Center
- Oak Ridge and Lawrence Livermore National Laboratories
- Lecture on the history of supercomputing from the 2009 offering of CMPE113 by Andrea Di Blas, University of California Santa Cruz
- Wikipedia, which provided a starting point and led to many pages on which facts and images were found
- Don Becker: The Inside Story of the Beowulf Saga
