Transcription
Digital IntegratedCircuitsA Design PerspectiveJan M. RabaeyAnantha ChandrakasanBorivoje NikolicArithmetic CircuitsJanuary, 2003EE141 Integrated DigitalCircuits 2nd1Arithmetic Circuits
A Generic Digital ProcessorINPUT-OUTPUTMEMORYCONTROLDATAPATHEE141 Integrated DigitalCircuits 2nd2Arithmetic Circuits
Building Blocks for Digital ArchitecturesArithmetic unit- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)Memory- RAM, ROM, Buffers, Shift registersControl- Finite state machine (PLA, random logic.)- CountersInterconnect- Switches- Arbiters- BusEE141 Integrated DigitalCircuits 2nd3Arithmetic Circuits
ag64CARRYGENSUMSELnode12-1 Mux9-1 Muxck1SUMGEN LUbREG5-1 Mux9-1 MuxAn Intel Microprocessorsumsumbto Caches0s1LU : LogicalUnit1000umItanium has 6 integer execution units like thisEE141 Integrated DigitalCircuits 2nd4Arithmetic Circuits
Bit-Sliced DesignControlBit 2Bit 1Data-OutMultiplexerShifterAdderRegisterData-InBit 3Bit 0Tile identical processing elementsEE141 Integrated DigitalCircuits 2nd5Arithmetic Circuits
Bit-Sliced DatapathFrom register files / Cache / BypassMultiplexersShifterAdder stage 1Adder stage 2WiringBit slice 0Bit slice 1Bit slice 2Bit slice 63Adder stage 3Loopback BusLoopback BusLoopback BusWiringSum SelectTo register files / CacheEE141 Integrated DigitalCircuits 2nd6Arithmetic Circuits
Itanium Integer DatapathFetzer, Orton, ISSCC’022ndEE141 Integrated Circuits Digital7Arithmetic Circuits
AddersEE141 Integrated DigitalCircuits 2nd8Arithmetic Circuits
Full-AdderACinBFulladderCoutSumEE141 Integrated DigitalCircuits 2nd9Arithmetic Circuits
The Binary AdderACinBFulladderCoutSumS A B Ci ABC i ABC i ABCi ABCiC o AB BCi ACiEE141 Integrated DigitalCircuits 2nd10Arithmetic Circuits
Express Sum and Carry as a function of P, G, DDefine 3 new variable which ONLY depend on A, BGenerate (G) ABPropagate (P) A BDelete A BCan also derive expressions for S and Co based on D and PNote that we will be sometimes using an alternate definition forPropagate (P) A BEE141 Integrated DigitalCircuits 2nd11Arithmetic Circuits
The Ripple-Carry 1S2S3( Ci,1)S0Worst case delay linear with the number of bitstd O(N)tadder (N-1)tcarry tsumGoal: Make the fastest possible carry path circuitEE141 Integrated DigitalCircuits 2nd12Arithmetic Circuits
Complimentary Static CMOS Full 8 TransistorsEE141 Integrated DigitalCircuits 2nd13Arithmetic Circuits
Inversion PropertyACiFASEE141 Integrated DigitalABCircuits 2ndCoCiBFACoS14Arithmetic Circuits
Minimize Critical Path by Reducing Inverting StagesEven cellA0B0Ci,0A1B1Co,0A2Odd cellB2Co,1A3B3Co,2Co,3FAFAFAFAS0S1S2S3Exploit Inversion PropertyEE141 Integrated DigitalCircuits 2nd15Arithmetic Circuits
A Better Structure: The Mirror A"1"-PropagateGenerateABBABCiAB24 transistorsEE141 Integrated DigitalCircuits 2nd16Arithmetic Circuits
Mirror AdderStick DiagramVDDABCiBA CiCoCiABCoSGNDEE141 Integrated DigitalCircuits 2nd17Arithmetic Circuits
The Mirror Adder The NMOS and PMOS chains are completely symmetrical.A maximum of two series transistors can be observed in the carrygeneration circuitry. When laying out the cell, the most critical issue is the minimizationof the capacitance at node Co. The reduction of the diffusioncapacitances is particularly important. The capacitance at node Co is composed of four diffusioncapacitances, two internal gate capacitances, and six gatecapacitances in the connecting adder cell . The transistors connected to Ci are placed closest to the output. Only the transistors in the carry stage have to be optimized foroptimal speed. All transistors in the sum stage can be minimalsize.EE141 Integrated DigitalCircuits 2nd18Arithmetic Circuits
Transmission Gate Full AdderPVDDCiAPAAPBVDDCiAPCiS Sum GenerationCiPBVDDAPCo Carry GenerationCiASetupEE141 Integrated DigitalVDDCircuits 2ndP19Arithmetic Circuits
Manchester Carry ChainVDDPiVDDPiCoCiGiCoCiφGiDiPiEE141 Integrated DigitalCircuits 2ndφ20Arithmetic Circuits
Manchester Carry ChainVDDφP0P1P2P3C3Ci,0G1G0G3G2φC0EE141 Integrated DigitalCircuits 2ndC1C2C321Arithmetic Circuits
Manchester Carry ChainStick DiagramPropagate/Generate RowVDDPiCi - 1GiφPi 1Gi 1CiφCi 1GNDInverter/Sum RowEE141 Integrated DigitalCircuits 2nd22Arithmetic Circuits
Carry-Bypass AdderG1Ci,0P0G1C o,0P0FAP2Also calledCarry-SkipFAG2Co,1FAG3C o,3FAG1C o,0P3Co,2FAP0 G1G2Co,1FACi,0P2P3G3BP PoP 1 P2 P 3C o,2FAFAMultiplexerP0Co,3Idea: If (P0 and P1 and P2 and P3 1)then C o3 C 0, else “kill” or “generate”.EE141 Integrated DigitalCircuits 2nd23Arithmetic Circuits
Carry-Bypass Adder (cont.)Bit 0–3Bit 4–7SetuptsetupSetuptbypassBit 8–11Bit arrypropagationCarrypropagationSumSumSumtsumSumM bitstadder tsetup Mtcarry (N/M-1)tbypass (M-1)tcarry tsumEE141 Integrated DigitalCircuits 2nd24Arithmetic Circuits
Carry Ripple versus Carry Bypasstpripple adderbypass adder4.8EE141 Integrated DigitalCircuits 2ndN25Arithmetic Circuits
Carry-Select AdderSetupP,G"0""0" Carry Propagation"1""1" Carry PropagationCo,k-1MultiplexerCo,k 3Carry VectorSum GenerationEE141 Integrated DigitalCircuits 2nd26Arithmetic Circuits
Carry Select Adder: Critical PathBit 0–3Bit 4–7Bit 8–11Bit erSum GenerationSum GenerationSum GenerationSum GenerationS0–3S4–7S8–11S12–15EE141 Integrated DigitalCircuits 2ndCo,1527Arithmetic Circuits
Linear Carry SelectBit 0-3Bit 4-7SetupSetupBit 8-11Bit 12-15SetupSetup(1)"0" Carry"0""0""0" Carry"0""0" Carry"0""0" Carry(1)"1" Carry"1""1" Carry"1"(5)(5)(5)(6)Multiplexer"1" Carry"1"(5)(7)Multiplexer"1" Carry"1"(5)(8)MultiplexerCi,0Multiplexer(9)Sum GenerationS0-3EE141 Integrated DigitalCircuits 2ndSum GenerationSum GenerationS 4-7S8-11Sum GenerationS12-15 (10)28Arithmetic Circuits
Square Root Carry SelectBit 0-1Bit 2-4SetupSetupBit 5-8Bit 9-13SetupSetupBit 14-19(1)"0" Carry"0""0""0" Carry"0""0" Carry"0""0" Carry(1)"1" Carry"1" Carry"1"(3)"1""1" Carry"1"(3)(4)(4)Multiplexer(5)(5)Multiplexer"1" 8)Sum GenerationSum GenerationSum GenerationS2-4S5-8S0-1EE141 Integrated DigitalCircuits 2ndSum GenerationS9-13SumS14-19 (9)29Arithmetic Circuits
Adder Delays - Comparison50Ripple addertp (in unit delays)4030Linear select20100Square root select0204060NEE141 Integrated DigitalCircuits 2nd30Arithmetic Circuits
LookAhead - Basic IdeaA0 , B0Ci,0A1, B1P0 Ci,1S0 P1S1AN-1 , BN-1Ci, N-1 PN-1SN-1C o, k f (A k, B k, Co , k – 1) Gk P k Co, k – 1EE141 Integrated DigitalCircuits 2nd31Arithmetic Circuits
Look-Ahead: TopologyExpanding Lookahead equations:Co,kVDD G k P k (Gk – 1 P k – 1 Co k – 2),G3G2All the way:G1C o, k G k Pk ( G k – 1 P k – 1( P 1( G 0 P 0 Ci , 0)))G0Ci,0Co,3P0P1P2P3EE141 Integrated DigitalCircuits 2nd32Arithmetic Circuits
Logarithmic Look-Ahead AdderFA0A1A2A3A4A5A6A7tp NA0A1A2A3FA4A5A6A7EE141 Integrated DigitalCircuits 2ndtp log2 (N)33Arithmetic Circuits
Carry Lookahead TreesCo , 0 G 0 P 0 Ci , 0C o, 1 G 1 P 1 G0 P 1P 0 Ci, 0C o, 2 G 2 P2 G1 P2 P1 G0 P2 P 1P 0C i, 0 ( G 2 P2 G1) ( P2 P1 ) ( G0 P0 Ci, 0 ) G 2:1 P 2:1 C o, 0Can continue building the tree hierarchically.EE141 Integrated DigitalCircuits 2nd34Arithmetic Circuits
EE141 Integrated DigitalS2S3S4S5S6S7S8S9S10S11S12S13S14S15(A3, B3)(A4, B4)(A5, B5)(A6, B6)(A7, B7)(A8, B8)(A9, B9)(A10, B10)(A11, B11)(A12, B12)(A13, B13)(A14, B14)(A15, B15)S1(A1, B1)(A2, B2)S0(A0, B0)Tree Adders16-bit radix-2 Kogge-Stone treeCircuits 2nd35Arithmetic Circuits
EE141 Integrated DigitalS2S3S4S5S6S7S8S9S10S11S12S13S14S15(a3 , b3 )(a4 , b4 )(a5 , b5 )(a6 , b6 )(a7 , b7 )(a8 , b8 )(a9 , b9 )(a10 , b10 )(a11 , b11 )(a12 , b12 )(a13 , b13 )(a14 , b14 )(a15 , b15 )S1(a1 , b1 )(a2 , b2 )S0(a0 , b0 )Tree Adders16-bit radix-4 Kogge-Stone TreeCircuits 2nd36Arithmetic Circuits
EE141 Integrated DigitalS2S3S4S5S6S7S8S9S10S11S12S13S14S15(a3 , b3 )(a4 , b4 )(a5 , b5 )(a6 , b6 )(a7 , b7 )(a8 , b8 )(a9 , b9 )(a10 , b 10 )(a11 , b 11 )(a12 , b 12 )(a13 , b 13 )(a14 , b 14 )(a15 , b 15 )S1(a1 , b1 )(a2 , b2 )S0(a0 , b0 )Sparse Trees16-bit radix-2 sparse tree with sparseness of 2Circuits 2nd37Arithmetic Circuits
S0S1S2S3S4S5S6S7S8S9S10S11S12S13S14S15(A0, B0)(A1, B1)(A2, B2)(A3, B3)(A4, B4)(A5, B5)(A6, B6)(A7, B7)(A8, B8)(A9, B9)(A10, B10)(A11, B11)(A12, B12)(A13, B13)(A14, B14)(A15, B15)Tree AddersBrent-Kung TreeEE141 Integrated DigitalCircuits 2nd38Arithmetic Circuits
Example: Domino AdderVDDVDDClkClkP i a i b iaibiGi a ibiaibiClkClkPropagateEE141 Integrated DigitalCircuits 2ndGenerate39Arithmetic Circuits
Example: Domino AdderVDDVDDClkkClkkG i:i-2k 1Pi:i-2k 1Pi:i-k 1Pi:i-k 1Gi:i-k 1Gi-k:i-2k 1Pi-k:i-2k 1PropagateEE141 Integrated DigitalCircuits 2ndGenerate40Arithmetic Circuits
Example: Domino SumVDDClkVDDKeeperClkdSumGi:0ClkSi 0ClkdClkGi:0Si 1ClkEE141 Integrated DigitalCircuits 2nd41Arithmetic Circuits
MultipliersEE141 Integrated DigitalCircuits 2nd42Arithmetic Circuits
The Binary Multiplication··Z X YM N– 1 Zk2k 0 N – 1i M – 1 Xi 2 i 0 j 0 j Yj 2 M – 1 N – 1 k Xi Yj 2i 0 j 0i j withM –1X X i2i 0N–1Y Yj2j 0EE141 Integrated DigitalCircuits 2ndij43Arithmetic Circuits
The Binary Multiplication1 0 1 0 1 01 0 1 1xMultiplicandMultiplier1 0 1 0 1 01 0 1 0 1 00 0 0 0 0 0 1 0 1 0 1 01 1 1 0 0 1 1 1 0EE141 Integrated DigitalPartial productsCircuits 2ndResult44Arithmetic Circuits
The Array MultiplierX3Z7X2X1X0Y1 Z 6EE141 Integrated DigitalZ5Circuits 2ndZ4Y3Y2Y0Z1Z2Z345Arithmetic Circuits
The MxN Array Multiplier— Critical PathFAHAFAFAFAFAHAHACritical Path 1Critical Path 2FAEE141 Integrated DigitalFACircuits 2ndFAHACritical Path 1 & 246Arithmetic Circuits
Carry-Save MultiplierHAHAHAHAHAFAFAFAHAFAFAFAFAFAHAHAVector Merging AdderEE141 Integrated DigitalCircuits 2nd47Arithmetic Circuits
Multiplier FloorplanX3X2X1X0Y0HA Multiplier CellY1C SC SC SC SZ0FA Multiplier CellY2C SY3C SCZ7EE141 Integrated DigitalC SC SCC SC SCC SVector Merging CellZ2X and Y signals are broadcastedthrough the complete array.()C SCSSSSZ6Z5Z4Z3Circuits 2ndZ148Arithmetic Circuits
Wallace-Tree MultiplierPartial products6543First stage210654(a)5410 Bit position210Final adder3FA2106543HA(c)EE141 Integrated Digital2(b)Second stage63Circuits 2nd(d)49Arithmetic Circuits
Wallace-Tree MultiplierPartial productsx 3y 3x 3y 2x2y3First stageSecond stageFAx2y2 x3y1 x1y2 x3y0 x1y1 x2y0 x0y1x1y3x0y3 x2y1 x0y2 x1y0 x0y0HAHAFAFAFAFinal adderz7 z6EE141 Integrated DigitalCircuits 2ndz5z4z3z2z1z050Arithmetic Circuits
Wallace-Tree Multipliery 0 y1y2y 0 y1 y2Ci-1FAy3Ciy 3 y 4 CCEE141 Integrated DigitalSSCircuits 2nd51Arithmetic Circuits
Multipliers —Summary Optimization Goals Different Vs Binary Adder Once Again: Identify Critical Path Other possible techniques- Logarithmic versus Linear (Wallace Tree Mult)- Data encoding (Booth)- PipeliningFIRST GLIMPSE AT SYSTEM LEVEL OPTIMIZATIONEE141 Integrated DigitalCircuits 2nd52Arithmetic Circuits
ShiftersEE141 Integrated DigitalCircuits 2nd53Arithmetic Circuits
The Binary ShifterRightnopAiLeftBiAi-1Bi-1Bit-Slice i.EE141 Integrated DigitalCircuits 2nd54Arithmetic Circuits
The Barrel ShifterA3B3Sh1A2B2: Data WireSh2A1B1: Control WireSh3A0B0Sh0Sh1Sh2Sh3Area Dominated by WiringEE141 Integrated DigitalCircuits 2nd55Arithmetic Circuits
4x4 barrel shifterA3A2A1A0Sh0Sh1Sh2Sh3BufferWidthbarrel 2 pm MEE141 Integrated DigitalCircuits 2nd56Arithmetic Circuits
Logarithmic ShifterSh1 Sh1Sh2 Sh2Sh4 Sh4A3B3A2B2A1B1A0B0EE141 Integrated DigitalCircuits 2nd57Arithmetic Circuits
0-7 bit Logarithmic ShifterA3Out3AAA2Out21Out10EE141 Integrated DigitalOut0Circuits 2nd58Arithmetic Circuits
EE141 2 Digital Integrated Circuits2nd Arithmetic Circuits A Generic Digita