Crosstalk-Aware Transmitter Pulse-Shaping For Parallel .


Crosstalk-Aware Transmitter Pulse-Shaping for Parallel Chip-to-Chip Links

by

Mike Bichan

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science
Graduate Department of Electrical and Computer Engineering
University of Toronto

© Copyright by Mike Bichan 2006

Crosstalk-Aware Transmitter Pulse-Shaping for Parallel Chip-to-Chip Links

Mike Bichan
Master of Applied Science, 2006
Graduate Department of Electrical and Computer Engineering
University of Toronto

Abstract

This thesis examines the difficulties involved in transmitting data over chip-to-chip links. Links for which crosstalk from adjacent parallel channels dominates the receiver noise are given particular attention. The idea of using a crosstalk-aware pulse shape to mitigate crosstalk is explored. This method stands in contrast to the traditional method of explicit crosstalk cancellation, in which each parallel transmitter takes the bit streams of its two neighbours as input in order to cancel the crosstalk from those bit streams. Measurements performed on a board-to-board channel are then used to find the optimal transmitter pulse shape for that channel. Finally, a 5-Gb/s chip-to-chip transmitter is designed in 0.13-µm CMOS based on information from the simulations and measurements performed.

Acknowledgments

Rationality is the recognition of the fact that nothing can alter the truth and nothing can take precedence over the act of perceiving it.
Ayn Rand, Atlas Shrugged

I am grateful to my supervisor, Professor Tony Chan Carusone. This thesis would not have been possible without his guidance. His enthusiasm is contagious and makes it easy to stay motivated.

I would also like to thank Professor Johns, Professor Gulak, and Professor Yu for serving on my thesis defense committee and providing valuable feedback that helped improve this thesis.

Many thanks go to CMC, MOSIS, and IBM for allowing me access to the 0.13-µm CMOS design kit and especially CMC for allocating me area on a fabrication run.

Thank you to my fellow graduate students in BA5158 who were always ready and willing to share their knowledge and expertise, and especially Raf Karakiewicz for many fruitful discussions.

I am indebted to my parents for their constant love and support throughout my life. I would not be where I am today without them.

Finally, I would like to thank Danielle Tchao for being a wonderful companion on the journey.

Contents

List of Figures
List of Tables

1 Introduction
  1.1 Motivation
  1.2 Background Information
  1.3 The State of the Art
    1.3.1 Chip-to-Chip Transceivers
    1.3.2 Fractionally-Spaced Equalizers
    1.3.3 Filter Tap Weight Selection
    1.3.4 Insights from DSL
  1.4 Thesis Organization

2 Chip-to-Chip Channel Impairments
  2.1 Introduction
  2.2 Chip-to-Chip Channel Modelling
  2.3 Summary

3 Optimal Pulse Shape
  3.1 Introduction
  3.2 Pulse Shape Search Methodology
    3.2.1 Figure of Merit
    3.2.2 Searching the Space of Candidate Pulse Shapes
  3.3 Results of the Exhaustive Search
    3.3.1 PCB Channel with No Crosstalk
    3.3.2 PCB Channel with Crosstalk
  3.4 Guideline for Equalizer Specification: Optimal Pulse Shapes above 2.7 Gb/s
  3.5 Summary

4 Measurement Results
  4.1 Introduction
  4.2 Time Domain Reflectometry
  4.3 Output Eye Diagrams
  4.4 Bit Error Rate Testing
  4.5 Summary

5 Transmitter Design
  5.1 Introduction
  5.2 Transmitter Architecture
    5.2.1 Output Driver
    5.2.2 Delay Cell
  5.3 Simulation Results
  5.4 Test Chip Results
  5.5 Summary

6 Conclusion
  6.1 Conclusion
  6.2 Suggestions for Future Work

Appendix A Aggregate Data Rate Derivations
Appendix B Simulation Data
References

List of Figures

2.1 General chip-to-chip channel.
2.2 (a) Physical configuration and (b) frequency response of a simple PCB channel and (c) a more complex PCB channel.
2.3 Two parallel microstrip lines.
2.4 Through and crosstalk responses of the channel.
2.5 Effect of L variation on f3dB.
2.6 Effect of s variation on f3dB.
2.7 Effect of L variation on G(f) max.
2.8 Effect of s variation on G(f) max.

3.1 Measured step response of the board-to-board channel shown in Figure 3.2.
3.2 Board-to-board communication link.
3.3 Close-up of six adjacent chip-to-chip links.
3.4 Contour plot of simulated crosstalk-free eye opening.
3.5 Contour plot of simulated E2C against number of taps and taps per UI.
3.6 Comparison of optimal and regular NRZ pulses.
3.7 Simulated E2C vs. Tapstotal for a filter with TapsperUI = 2.
3.8 Plot of simulated E2C showing effect of time granularity.
3.9 Contour plot of E2C for a data rate of 5 Gb/s.
3.10 Plot of simulated E2C showing effect of time granularity at 5 Gb/s.
3.11 Contour plot of E2C for a data rate of 7.5 Gb/s.
3.12 Plot of simulated E2C showing effect of time granularity at 7.5 Gb/s.

4.1 Equalizer proof-of-concept test setup.
4.2 Impulse response of the channel, computed from the step response (time step 10 ps).
4.3 Frequency response of the through channel, first crosstalk channel, and second crosstalk channel computed from the step response (time step 50 ps).
4.4 Measured eye diagram at 2.7 Gb/s with a PRBS sequence of length 2^31 - 1. This figure shows the channel input with no aggressors.
4.5 Measured eye diagram at 2.7 Gb/s with a PRBS sequence of length 2^31 - 1. This figure shows the channel output corresponding to Figure 4.4 with no aggressors.
4.6 Impact of adjacent aggressor signals on desired signal.
4.7 Measured eye diagram at 2.7 Gb/s with a PRBS sequence of length 2^31 - 1. This figure shows the output of the chip-to-chip channel for square pulse input with two aggressors.
4.8 Measured eye diagram at 2.7 Gb/s with a PRBS sequence of length 2^31 - 1. This figure shows the output of the chip-to-chip channel for crosstalk-aware pulse input with two aggressors.
4.9 Bathtub plot comparing crosstalk-aware and square pulses.
4.10 Bathtub plot comparing crosstalk-aware and pre-emphasis pulses. BER is higher than in Figure 4.9 because a smaller signal swing was used in this measurement for both pulse shapes.

5.1 Block diagram of the proposed transmitter.
5.2 Schematic diagram of the output driver cell. Gate lengths are 0.12 µm. Gate widths of M1 and M2 are 80 µm.
5.3 Schematic diagram of the crossbar switch. Gate lengths are 0.12 µm and gate widths are 10 µm. The pass transistors M3–M6 have Ron 285 Ω.
5.4 Schematic diagram of the improved output driver cell.
5.5 Schematic diagram of the starved inverter delay cell. Gate lengths are 0.12 µm.
5.6 Schematic diagram of the common source delay cell. Gate lengths are 0.12 µm and gate widths are 10 µm unless otherwise stated.
5.7 (a) Single transistor load, (b) I-V curve comparison, (c) symmetric load.
5.8 Schematic diagram of the diode-connected delay cell. Gate lengths are 0.12 µm and gate widths are 10 µm unless otherwise stated.
5.9 Schematic diagram of the self-biased symmetric-load delay cell. Gate lengths are 0.12 µm and gate widths are 10 µm unless otherwise stated.
5.10 Schematic diagram of the low voltage delay cell. Gate lengths are 0.12 µm and gate widths are 10 µm.
5.11 Schematic diagram of the low voltage delay cell with cross-coupled inverters. Gate lengths are 0.12 µm and gate widths are 10 µm.
5.12 Delay cell comparison. Note: simulations conducted at 1 Gb/s. Power dissipation of the starved inverter cell would be higher at the targeted bit rate of 5 Gb/s.
5.13 Supply voltage scaling: (a) low voltage cell, (b) self-biased, symmetric-load cell.
5.14 Spectre simulation of (a) regular NRZ and (b) pre-emphasis pulse shapes at 5 Gb/s with a PRBS sequence of length 2^7 - 1. The corresponding signal and crosstalk outputs are shown in (c)-(f) for the chip-to-chip channel. TapsperUI = 3.
5.15 Spectre simulation of (a) slew-rate limited and (b) square pulse shapes at 5 Gb/s with a PRBS sequence of length 2^7 - 1. The corresponding signal and crosstalk outputs are shown in (c)-(f) for the chip-to-chip channel. TapsperUI = 3.
5.16 Spectre simulation of (a) the optimal pulse shape at 5 Gb/s with a PRBS sequence of length 2^7 - 1. Also shown are (b) the corresponding signal output and (c) the crosstalk output for the chip-to-chip channel. TapsperUI = 3.
5.17 Spectre simulation at 5 Gb/s with a PRBS sequence of length 2^7 - 1. When longer delays are used the bandwidth of the delay cell decreases, increasing jitter in the delay cells farther down the chain.
5.18 Photomicrograph of the test chip in 0.13-µm CMOS. The die dimensions are 1.5 mm × 1.5 mm.

List of Tables

1.1 Comparison of several signalling schemes.
1.2 State of the art chip-to-chip communication circuits.
1.3 Circuits using fractionally-spaced equalizers.

5.1 Summary of delay cell characteristics.
5.2 Simulated transmitter characteristics.

... link with no crosstalk
... link including crosstalk
... link with crosstalk at 5 Gb/s
... link with crosstalk at 7.5 Gb/s

List of Acronyms

2-PAM two-level pulse amplitude modulation
4-PAM four-level pulse amplitude modulation
BER bit error rate
DAC digital-to-analog converter
DLL delay-locked loop
DSL digital subscriber line
DUT device under test
DSM dynamic spectrum management
E2C eye-to-crosstalk ratio
FEXT far-end crosstalk
FIR finite impulse response
IC integrated circuit
I/O input/output
ISI inter-symbol interference
MIMO multiple-input multiple-output
NEXT near-end crosstalk
NRZ nonreturn-to-zero
ParBERT parallel bit error ratio tester
PCB printed circuit board
PD phase detector
PSD power spectral density
PRBS pseudo-random bit stream
Tapstotal total number of taps
TapsperUI taps per UI
UI unit interval

1 Introduction

1.1 Motivation

Crosstalk between adjacent channels is a severe problem in chip-to-chip communication links. It exists as a result of parasitic capacitance and inductance on printed circuit boards and it is a barrier preventing bit rates for parallel chip-to-chip links from increasing past 5 Gb/s/pin. Even more dramatic are the effects of crosstalk on board-to-board channels and multidrop busses. To extend the useful bandwidth of these channels, it is possible to use a transmitted pulse shape that minimizes crosstalk while also equalizing inter-symbol interference (ISI).

The desire for higher chip-to-chip bit rates stems from the computer industry. For most of the history of the computer, system performance has been limited by the maximum clock frequency of the CPU. In recent years, improvements in integrated circuit (IC) fabrication technology have led to computer chips running at speeds approaching 4 GHz. This frequency is approximately equal to the bandwidth of a typical chip-to-chip channel on a printed circuit board (PCB). Now an important performance-limiting factor is the speed at which data can be sent between different chips in the same system.

As chip speeds increased over the past two decades, the a
