Parallel Computer Architectures

Transcription

Parallel Computer Architectures
Chapter 8
Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006 Pearson Education, Inc. All rights reserved. 0-13-148521-0

Parallel Computer Architectures
(a) On-chip parallelism. (b) A coprocessor. (c) A multiprocessor. (d) A multicomputer. (e) A grid.

Instruction-Level Parallelism
(a) A CPU pipeline. (b) A sequence of VLIW instructions. (c) An instruction stream with bundles marked.

The TriMedia VLIW CPU (1)
A typical TriMedia instruction, showing five possible operations.

The TriMedia VLIW CPU (2)
The TM3260 functional units, their quantity, latency, and which instruction slots they can use.

The TriMedia VLIW CPU (3)
The major groups of TriMedia custom operations.

The TriMedia VLIW CPU (4)
(a) An array of 8-bit elements. (b) The transposed array. (c) The original array fetched into four registers. (d) The transposed array in four registers.
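
The data movement in this caption can be sketched in Python. This is only an illustration of the end result, not of the TriMedia custom operations that produce it. The 4 x 4 array size, the byte order (first element in the high byte), and the helper names `pack`, `unpack`, and `transpose_registers` are assumptions made for the example.

```python
# Sketch: a 4 x 4 array of 8-bit elements held in four 32-bit registers,
# and the same array transposed. Byte order within a register is assumed.

def pack(row):
    """Pack four 8-bit elements into one 32-bit value, first element highest."""
    value = 0
    for byte in row:
        value = (value << 8) | (byte & 0xFF)
    return value

def unpack(reg):
    """Split a 32-bit value back into its four 8-bit elements."""
    return [(reg >> shift) & 0xFF for shift in (24, 16, 8, 0)]

def transpose_registers(regs):
    """Return four registers whose rows are the columns of the input."""
    rows = [unpack(r) for r in regs]
    return [pack(col) for col in zip(*rows)]

original = [pack([0, 1, 2, 3]), pack([4, 5, 6, 7]),
            pack([8, 9, 10, 11]), pack([12, 13, 14, 15])]
for reg in transpose_registers(original):
    print(unpack(reg))
```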

On-Chip Multithreading (1)
(a)-(c) Three threads. The empty boxes indicate that the thread has stalled waiting for memory. (d) Fine-grained multithreading. (e) Coarse-grained multithreading.

On-Chip Multithreading (2)
Multithreading with a dual-issue superscalar CPU. (a) Fine-grained multithreading. (b) Coarse-grained multithreading. (c) Simultaneous multithreading.

Hyperthreading on the Pentium 4
Resource sharing between threads in the Pentium 4 NetBurst microarchitecture.

Homogeneous Multiprocessors on a Chip
Single-chip multiprocessors. (a) A dual-pipeline chip. (b) A chip with two cores.

Heterogeneous Multiprocessors on a Chip (1)
The logical structure of a simple DVD player, which contains a heterogeneous multiprocessor with multiple cores for different functions.

Heterogeneous Multiprocessors on a Chip (2)
An example of the IBM CoreConnect architecture.

Introduction to Networking (1)
How users are connected to servers on the Internet.

Introduction to Networking (2)
A packet as it appears on the Ethernet.

Introduction to Network Processors
A typical network processor board and chip.

The Nexperia Media Processor
The Nexperia heterogeneous multiprocessor on a chip.

Multiprocessors
(a) A multiprocessor with 16 CPUs sharing a common memory. (b) An image partitioned into 16 sections, each being analyzed by a different CPU.

Multicomputers (1)
(a) A multicomputer with 16 CPUs, each with its own private memory. (b) The bit-map image of Fig. 8-17 split up among the 16 memories.

Multicomputers (2)
Various layers where shared memory can be implemented. (a) The hardware. (b) The operating system. (c) The language runtime system.

Taxonomy of Parallel Computers (1)
Flynn’s taxonomy of parallel computers.

Taxonomy of Parallel Computers (2)
A taxonomy of parallel computers.

Sequential Consistency
(a) Two CPUs writing and two CPUs reading a common memory word. (b)-(d) Three possible ways the two writes and four reads might be interleaved in time.
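
The idea behind this figure can be sketched in Python: under sequential consistency every CPU observes the operations in one global order, so replaying that order determines what each reader sees. The event encoding, the function name `replay`, the written values 100 and 200, and the particular order shown are illustrative assumptions, not a reproduction of panels (b)-(d).

```python
# Sketch: replay one global interleaving of writes and reads on a single
# shared memory word. Event format and values are illustrative assumptions.

def replay(events, initial=0):
    """Apply events in order; return the values each reading CPU observed."""
    word = initial
    seen = {}
    for cpu, op, value in events:
        if op == "write":
            word = value
        elif op == "read":
            seen.setdefault(cpu, []).append(word)
        else:
            raise ValueError(f"unknown op: {op}")
    return seen

# CPUs 0 and 1 each write once; CPUs 2 and 3 each read twice.
one_order = [
    (0, "write", 100),
    (2, "read", None),
    (1, "write", 200),
    (3, "read", None),
    (3, "read", None),
    (2, "read", None),
]
print(replay(one_order))
```

Any other interleaving of the same six events is equally legal; what sequential consistency rules out is two readers disagreeing about the order in which the two writes happened.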

Weak Consistency
Weakly consistent memory uses synchronization operations to divide time into sequential epochs.

UMA Symmetric Multiprocessor Architectures
Three bus-based multiprocessors. (a) Without caching. (b) With caching. (c) With caching and private memories.

Snooping Caches
The write-through cache coherence protocol. The empty boxes indicate that no action is taken.

The MESI Cache Coherence Protocol
The MESI cache coherence protocol.
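
A minimal Python sketch of how the four MESI states (Modified, Exclusive, Shared, Invalid) change for a single cache line shared by several caches. It follows the commonly taught form of the protocol and ignores the actual data and bus traffic. The class name `MesiLine` and its methods are hypothetical.

```python
# Sketch of MESI state changes for ONE cache line across several caches.
# Only states are tracked; write-backs and bus signals are noted in comments.

M, E, S, I = "M", "E", "S", "I"

class MesiLine:
    def __init__(self, n_caches):
        self.state = [I] * n_caches  # every cache starts without the line

    def read(self, cpu):
        if self.state[cpu] != I:
            return  # read hit: no state change
        holders = [c for c in range(len(self.state))
                   if c != cpu and self.state[c] != I]
        for c in holders:
            # A holder in M would write the line back first; M and E drop to S.
            self.state[c] = S
        self.state[cpu] = S if holders else E

    def write(self, cpu):
        for c in range(len(self.state)):
            if c != cpu:
                # Other copies are invalidated (an M copy is written back first).
                self.state[c] = I
        self.state[cpu] = M

line = MesiLine(3)
line.read(0)   # only copy: Exclusive
line.read(1)   # second reader: both Shared
line.write(1)  # writer becomes Modified, the other copy Invalid
print(line.state)
```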

UMA Multiprocessors Using Crossbar Switches
(a) An 8 × 8 crossbar switch. (b) An open crosspoint. (c) A closed crosspoint.

UMA Multiprocessors Using Multistage Switching Networks (1)
(a) A 2 × 2 switch. (b) A message format.

UMA Multiprocessors Using Multistage Switching Networks (2)
An omega switching network.
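
Routing through an omega network is often described as destination-tag routing: lines are shuffled between stages, and each 2 × 2 switch looks at one bit of the destination module number, sending the message to its upper output for 0 and its lower output for 1. The Python sketch below assumes an 8 × 8 network with three stages and switches numbered 0-3 from the top. The function name `omega_route` and the numbering are assumptions, not labels from the figure.

```python
# Sketch: destination-tag routing in an omega network with 2**stages inputs.

def omega_route(source, dest, stages=3):
    """Return (stage, switch index, output used) for each hop."""
    mask = (1 << stages) - 1
    line = source
    path = []
    for stage in range(stages):
        # Perfect shuffle wiring: rotate the line number left by one bit.
        line = ((line << 1) | (line >> (stages - 1))) & mask
        # The switch examines destination bits from most significant down.
        bit = (dest >> (stages - 1 - stage)) & 1
        switch = line >> 1          # lines 2j and 2j+1 enter switch j
        line = (switch << 1) | bit  # leave on upper (0) or lower (1) output
        path.append((stage, switch, "lower" if bit else "upper"))
    if line != dest:
        raise AssertionError("routing did not reach the destination")
    return path

for hop in omega_route(0b011, 0b110):
    print(hop)
```

After the last stage the line number equals the destination, because each stage shifts out one source bit and shifts in one destination bit.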

NUMA Multiprocessors
A NUMA machine based on two levels of buses. The Cm* was the first multiprocessor to use this design.

Cache Coherent NUMA Multiprocessors
(a) A 256-node directory-based multiprocessor. (b) Division of a 32-bit memory address into fields. (c) The directory at node 36.
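
A hedged Python sketch of the address split in panel (b). With 256 nodes, 8 bits are needed to name a node. How the remaining 24 bits are divided is not stated in the caption, so the 18-bit block field and 6-bit offset field (64-byte cache lines) used here are assumptions, as are the names `split_address`, `NODE_BITS`, `BLOCK_BITS`, and `OFFSET_BITS`.

```python
# Sketch: split a 32-bit address into (node, block, offset) fields.
# Field widths other than the 8-bit node field are assumed.

NODE_BITS, BLOCK_BITS, OFFSET_BITS = 8, 18, 6  # 8 + 18 + 6 = 32

def split_address(addr):
    """Return (node, block, offset) for a 32-bit physical address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    block = (addr >> OFFSET_BITS) & ((1 << BLOCK_BITS) - 1)
    node = (addr >> (OFFSET_BITS + BLOCK_BITS)) & ((1 << NODE_BITS) - 1)
    return node, block, offset

# An address whose home is node 36, the node whose directory is shown in (c).
print(split_address(0x24000108))
```

The node field selects which directory to consult, and the block field indexes the entry within that directory.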

The Sun Fire E25K NUMA Multiprocessor (1)
The Sun Microsystems E25K multiprocessor.

The Sun Fire E25K NUMA Multiprocessor (2)
The Sun Fire E25K uses a four-level interconnect. Dashed lines are address paths. Solid lines are data paths.

Message-Passing Multicomputers
A generic multicomputer.

Topology
Various topologies. The heavy dots represent switches. The CPUs and memories are not shown. (a) A star. (b) A complete interconnect. (c) A tree. (d) A ring. (e) A grid. (f) A double torus. (g) A cube. (h) A 4D hypercube.
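
For the hypercube in panel (h), a short Python sketch shows the usual numbering scheme: nodes carry 4-bit labels, two nodes are linked exactly when their labels differ in one bit, and the hop count between two nodes is the number of differing bits. The function names are hypothetical and the labeling is a convention, not something the figure prescribes.

```python
# Sketch: neighbors and hop distance in a d-dimensional hypercube,
# assuming nodes are labeled 0 .. 2**d - 1.

def hypercube_neighbors(node, dim=4):
    """Flip each of the dim label bits in turn."""
    return [node ^ (1 << d) for d in range(dim)]

def hop_distance(a, b):
    """Minimum number of links between nodes a and b."""
    return bin(a ^ b).count("1")

print(hypercube_neighbors(0b0000))
print(hop_distance(0b0000, 0b1111))
```

The longest shortest path (the diameter) therefore equals the dimension, which is why hypercubes scale well: doubling the node count adds only one hop.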

BlueGene (1)
The BlueGene/L custom processor chip.

BlueGene (2)
The BlueGene/L. (a) Chip. (b) Card. (c) Board. (d) Cabinet. (e) System.

Red Storm (1)
Packaging of the Red Storm components.

Red Storm (2)
The Red Storm system as viewed from above.

A Comparison of BlueGene/L and Red Storm
A comparison of BlueGene/L and Red Storm.

Google (1)
Processing of a Google query.

Google (2)
A typical Google cluster.

Scheduling
Scheduling a cluster. (a) FIFO. (b) Without head-of-line blocking. (c) Tiling. The shaded areas indicate idle CPUs.

Distributed Shared Memory (1)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. (a) The initial situation.

Distributed Shared Memory (2)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. (b) After CPU 0 references page 10.

Distributed Shared Memory (3)
A virtual address space consisting of 16 pages spread over four nodes of a multicomputer. (c) After CPU 1 references page 10, here assumed to be a read-only page.
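
The page behavior described across the three Distributed Shared Memory captions can be sketched in Python: a reference to a page held elsewhere either moves the page to the referencing node or, for a read-only page, adds a replica. The class name `Dsm`, its method, and the round-robin initial placement are assumptions. The actual initial layout is in the figure and is not reproduced here.

```python
# Sketch: page placement in a distributed shared memory with 16 pages
# over four nodes. The initial placement below is an assumed round-robin.

class Dsm:
    def __init__(self, placement):
        # Map each page number to the set of nodes holding a copy.
        self.copies = {page: {node} for page, node in placement.items()}

    def reference(self, node, page, read_only=False):
        holders = self.copies[page]
        if node in holders:
            return "hit"
        if read_only:
            holders.add(node)          # replicate; existing copies stay valid
            return "replicated"
        self.copies[page] = {node}     # page fault: the page migrates
        return "moved"

dsm = Dsm({page: page % 4 for page in range(16)})
print(dsm.reference(0, 10))                   # CPU 0 touches page 10
print(dsm.reference(1, 10, read_only=True))   # CPU 1 reads it as read-only
print(sorted(dsm.copies[10]))
```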

Linda
Three Linda tuples.

Orca
A simplified ORCA stack object, with internal data and two operations.

Software Metrics (1)
Real programs achieve less than the perfect speedup indicated by the dotted line.

Software Metrics (2)
(a) A program has a sequential part and a parallelizable part. (b) Effect of running part of the program in parallel.
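
The effect in panel (b) is usually quantified with Amdahl's law. If a program takes time T on one CPU and a fraction f of it is inherently sequential, then on n CPUs the run time is fT + (1 - f)T/n, giving a speedup of n / (1 + (n - 1)f). The Python sketch below evaluates that formula. The symbols f and n and the function name `speedup` are chosen for the example.

```python
# Sketch: Amdahl's law for a program with sequential fraction f on n CPUs.

def speedup(f, n):
    """Speedup T / (f*T + (1 - f)*T/n), simplified to n / (1 + (n - 1)*f)."""
    if not 0.0 <= f <= 1.0 or n < 1:
        raise ValueError("need 0 <= f <= 1 and n >= 1")
    return n / (1 + (n - 1) * f)

for n in (1, 4, 16, 64):
    print(n, round(speedup(0.1, n), 2))
```

As n grows the speedup approaches 1/f, so even a small sequential part caps the benefit of adding CPUs.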

Achieving High Performance
(a) A 4-CPU bus-based system. (b) A 16-CPU bus-based system. (c) A 4-CPU grid-based system. (d) A 16-CPU grid-based system.

Grid Computing
The grid layers.
