Compilers

Transcription


Compilers: Principles, Techniques, & Tools
Second Edition

Alfred V. Aho, Columbia University
Monica S. Lam, Stanford University
Ravi Sethi, Avaya
Jeffrey D. Ullman, Stanford University

Boston  San Francisco  New York  London  Toronto  Sydney  Tokyo  Singapore  Madrid  Mexico City  Munich  Paris  Cape Town  Hong Kong  Montreal

Publisher: Greg Tobin
Executive Editor: Michael Hirsch
Acquisitions Editor: Matt Goldstein
Project Editor: Katherine Harutunian
Associate Managing Editor: Jeffrey Holcomb
Cover Designer: Joyce Cosentino Wells
Digital Assets Manager: Marianne Groth
Media Producer: Bethany Tidd
Senior Marketing Manager: Michelle Brown
Marketing Assistant: Sarah Milmore
Senior Author Support/Technology Specialist: Joe Vetere
Senior Manufacturing Buyer: Carol Melville
Cover Image: Scott Ullman of Strange Tonic Productions (www.strangetonic.com)

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial caps or all caps.

The interior of this book was composed in LaTeX.

Library of Congress Cataloging-in-Publication Data

Compilers : principles, techniques, and tools / Alfred V. Aho ... [et al.]. -- 2nd ed.
    p. cm.
  Rev. ed. of: Compilers, principles, techniques, and tools / Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. 1986.
  ISBN 0-321-48681-1 (alk. paper)
  1. Compilers (Computer programs) I. Aho, Alfred V. II. Aho, Alfred V. Compilers, principles, techniques, and tools.
QA76.76.C65A37 2007
005.4'53--dc22
2006024333

Copyright © 2007 Pearson Education, Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. For information on obtaining permission for use of material in this work, please submit a written request to Pearson Education, Inc., Rights and Contracts Department, 75 Arlington Street, Suite 300, Boston, MA 02116, fax your request to 617-848-7047, or e-mail at http://www.pearsoned.com/legal/permissions.htm.

2 3 4 5 6 7 8 9 10—CW—10 09 08 07 06

Preface

In the time since the 1986 edition of this book, the world of compiler design has changed significantly. Programming languages have evolved to present new compilation problems. Computer architectures offer a variety of resources of which the compiler designer must take advantage. Perhaps most interestingly, the venerable technology of code optimization has found use outside compilers. It is now used in tools that find bugs in software, and most importantly, find security holes in existing code. And much of the "front-end" technology — grammars, regular expressions, parsers, and syntax-directed translators — is still in wide use.

Thus, our philosophy from previous versions of the book has not changed. We recognize that few readers will build, or even maintain, a compiler for a major programming language. Yet the models, theory, and algorithms associated with a compiler can be applied to a wide range of problems in software design and software development. We therefore emphasize problems that are most commonly encountered in designing a language processor, regardless of the source language or target machine.

Use of the Book

It takes at least two quarters or even two semesters to cover all or most of the material in this book. It is common to cover the first half in an undergraduate course and the second half of the book — stressing code optimization — in a second course at the graduate or mezzanine level. Here is an outline of the chapters:

Chapter 1 contains motivational material and also presents some background issues in computer architecture and programming-language principles.

Chapter 2 develops a miniature compiler and introduces many of the important concepts, which are then developed in later chapters. The compiler itself appears in the appendix.

Chapter 3 covers lexical analysis, regular expressions, finite-state machines, and scanner-generator tools. This material is fundamental to text-processing of all sorts.

Chapter 4 covers the major parsing methods, top-down (recursive-descent, LL) and bottom-up (LR and its variants).

Chapter 5 introduces the principal ideas in syntax-directed definitions and syntax-directed translations.

Chapter 6 takes the theory of Chapter 5 and shows how to use it to generate intermediate code for a typical programming language.

Chapter 7 covers run-time environments, especially management of the run-time stack and garbage collection.

Chapter 8 is on object-code generation. It covers construction of basic blocks, generation of code from expressions and basic blocks, and register-allocation techniques.

Chapter 9 introduces the technology of code optimization, including flow graphs, data-flow frameworks, and iterative algorithms for solving these frameworks.

Chapter 10 covers instruction-level optimization. The emphasis is on the extraction of parallelism from small sequences of instructions and scheduling them on single processors that can do more than one thing at once.

Chapter 11 talks about larger-scale parallelism detection and exploitation. Here, the emphasis is on numeric codes that have many tight loops that range over multidimensional arrays.

Chapter 12 is on interprocedural analysis. It covers pointer analysis, aliasing, and data-flow analysis that takes into account the sequence of procedure calls that reach a given point in the code.

Courses from material in this book have been taught at Columbia, Harvard, and Stanford. At Columbia, a senior/first-year graduate course on programming languages and translators has been regularly offered using material from the first eight chapters. A highlight of this course is a semester-long project in which students work in small teams to create and implement a little language of their own design. The student-created languages have covered diverse application domains including quantum computation, music synthesis, computer graphics, gaming, matrix operations and many other areas. Students use compiler-component generators such as ANTLR, Lex, and Yacc and the syntax-directed translation techniques discussed in chapters two and five to build their compilers. A follow-on graduate course has focused on material in Chapters 9 through 12, emphasizing code generation and optimization for contemporary machines including network processors and multiprocessor architectures.

At Stanford, a one-quarter introductory course covers roughly the material in Chapters 1 through 8, although there is an introduction to global code optimization from Chapter 9. The second compiler course covers Chapters 9 through 12, plus the more advanced material on garbage collection from Chapter 7. Students use a locally developed, Java-based system called Joeq for implementing data-flow analysis algorithms.
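[Transcriber's note: to give a flavor of the syntax-directed translation style these course projects rely on, here is a minimal, hypothetical Yacc specification. It is an illustrative sketch only, not a program from the book; it assumes a standard Yacc or Bison toolchain and restricts operands to single digits, in the spirit of the infix-to-postfix translator developed in Chapter 2.]

    %{
    /* Sketch: translate one-digit infix expressions to postfix notation. */
    #include <stdio.h>
    #include <ctype.h>
    int yylex(void);
    void yyerror(const char *s) { fprintf(stderr, "%s\n", s); }
    %}
    %token DIGIT
    %left '+' '-'          /* + and - are left associative, equal precedence */
    %%
    line : expr '\n'       { putchar('\n'); }
         ;
    expr : expr '+' expr   { putchar('+'); }   /* emit operator after its operands */
         | expr '-' expr   { putchar('-'); }
         | DIGIT           { putchar(yylval + '0'); }
         ;
    %%
    int yylex(void) {                   /* trivial lexical analyzer: single digits */
        int c = getchar();
        if (c == EOF) return 0;         /* end of input */
        if (isdigit(c)) { yylval = c - '0'; return DIGIT; }
        return c;
    }
    int main(void) { return yyparse(); }

[Run through yacc (or bison -y) and a C compiler, an input such as 9-5+2 would print 95-2+, because each semantic action fires when its production is reduced, after the actions for its operands have already printed their output.]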

Prerequisites

The reader should possess some "computer-science sophistication," including at least a second course on programming, and courses in data structures and discrete mathematics. Knowledge of several different programming languages is useful.

Exercises

The book contains extensive exercises, with some for almost every section. We indicate harder exercises or parts of exercises with an exclamation point. The hardest exercises have a double exclamation point.

Gradiance On-Line Homeworks

A feature of the new edition is that there is an accompanying set of on-line homeworks using a technology developed by Gradiance Corp. Instructors may assign these homeworks to their class, or students not enrolled in a class may enroll in an "omnibus class" that allows them to do the homeworks as a tutorial (without an instructor-created class). Gradiance questions look like ordinary questions, but your solutions are sampled. If you make an incorrect choice you are given specific advice or feedback to help you correct your solution. If your instructor permits, you are allowed to try again, until you get a perfect score.

A subscription to the Gradiance service is offered with all new copies of this text sold in North America. For more information, visit the Addison-Wesley web site www.aw.com/gradiance or send email to computing@aw.com.

Support on the World Wide Web

The book's home page is

    dragonbook.stanford.edu

Here, you will find errata as we learn of them, and backup materials. We hope to make available the notes for each offering of compiler-related courses as we teach them, including homeworks, solutions, and exams. We also plan to post descriptions of important compilers written by their implementers.

Acknowledgements

Cover art is by S. D. Ullman of Strange Tonic Productions.

Jon Bentley gave us extensive comments on a number of chapters of an earlier draft of this book. Helpful comments and errata were received from:

Domenico Bianculli, Peter Bosch, Marcio Buss, Marc Eaddy, Stephen Edwards, Vibhav Garg, Kim Hazelwood, Gaurav Kc, Wei Li, Mike Smith, Art Stamness, Krysta Svore, Olivier Tardieu, and Jia Zeng. The help of all these people is gratefully acknowledged. Remaining errors are ours, of course.

In addition, Monica would like to thank her colleagues on the SUIF compiler team for an 18-year lesson on compiling: Gerald Aigner, Dzintars Avots, Saman Amarasinghe, Jennifer Anderson, Michael Carbin, Gerald Cheong, Amer Diwan, Robert French, Anwar Ghuloum, Mary Hall, John Hennessy, David Heine, Shih-Wei Liao, Amy Lim, Benjamin Livshits, Michael Martin, Dror Maydan, Todd Mowry, Brian Murphy, Jeffrey Oplinger, Karen Pieper, Martin Rinard, Olatunji Ruwase, Constantine Sapuntzakis, Patrick Sathyanathan, Michael Smith, Steven Tjiang, Chau-Wen Tseng, Christopher Unkel, John Whaley, Robert Wilson, Christopher Wilson, and Michael Wolf.

A. V. A., Chatham NJ
M. S. L., Menlo Park CA
R. S., Far Hills NJ
J. D. U., Stanford CA
June, 2006

Table of Contents

1 Introduction
  1.1 Language Processors
    1.1.1 Exercises for Section 1.1
  1.2 The Structure of a Compiler
    1.2.1 Lexical Analysis
    1.2.2 Syntax Analysis
    1.2.3 Semantic Analysis
    1.2.4 Intermediate Code Generation
    1.2.5 Code Optimization
    1.2.6 Code Generation
    1.2.7 Symbol-Table Management
    1.2.8 The Grouping of Phases into Passes
    1.2.9 Compiler-Construction Tools
  1.3 The Evolution of Programming Languages
    1.3.1 The Move to Higher-level Languages
    1.3.2 Impacts on Compilers
    1.3.3 Exercises for Section 1.3
  1.4 The Science of Building a Compiler
    1.4.1 Modeling in Compiler Design and Implementation
    1.4.2 The Science of Code Optimization
  1.5 Applications of Compiler Technology
    1.5.1 Implementation of High-Level Programming Languages
    1.5.2 Optimizations for Computer Architectures
    1.5.3 Design of New Computer Architectures
    1.5.4 Program Translations
    1.5.5 Software Productivity Tools
  1.6 Programming Language Basics
    1.6.1 The Static/Dynamic Distinction
    1.6.2 Environments and States
    1.6.3 Static Scope and Block Structure
    1.6.4 Explicit Access Control
    1.6.5 Dynamic Scope
    1.6.6 Parameter Passing
    1.6.7 Aliasing
    1.6.8 Exercises for Section 1.6
  1.7 Summary of Chapter 1
  1.8 References for Chapter 1

2 A Simple Syntax-Directed Translator
  2.1 Introduction
  2.2 Syntax Definition
    2.2.1 Definition of Grammars
    2.2.2 Derivations
    2.2.3 Parse Trees
    2.2.4 Ambiguity
    2.2.5 Associativity of Operators
    2.2.6 Precedence of Operators
    2.2.7 Exercises for Section 2.2
  2.3 Syntax-Directed Translation
    2.3.1 Postfix Notation
    2.3.2 Synthesized Attributes
    2.3.3 Simple Syntax-Directed Definitions
    2.3.4 Tree Traversals
    2.3.5 Translation Schemes
    2.3.6 Exercises for Section 2.3
  2.4 Parsing
    2.4.1 Top-Down Parsing
    2.4.2 Predictive Parsing
    2.4.3 When to Use ε-Productions
    2.4.4 Designing a Predictive Parser
    2.4.5 Left Recursion
    2.4.6 Exercises for Section 2.4
  2.5 A Translator for Simple Expressions
    2.5.1 Abstract and Concrete Syntax
    2.5.2 Adapting the Translation Scheme
    2.5.3 Procedures for the Nonterminals
    2.5.4 Simplifying the Translator
    2.5.5 The Complete Program
  2.6 Lexical Analysis
    2.6.1 Removal of White Space and Comments
    2.6.2 Reading Ahead
    2.6.3 Constants
    2.6.4 Recognizing Keywords and Identifiers
    2.6.5 A Lexical Analyzer
    2.6.6 Exercises for Section 2.6
  2.7 Symbol Tables
    2.7.1 Symbol Table Per Scope
    2.7.2 The Use of Symbol Tables
  2.8 Intermediate Code Generation
    2.8.1 Two Kinds of Intermediate Representations
    2.8.2 Construction of Syntax Trees
    2.8.3 Static Checking
    2.8.4 Three-Address Code
    2.8.5 Exercises for Section 2.8
  2.9 Summary of Chapter 2

3 Lexical Analysis
  3.1 The Role of the Lexical Analyzer
    3.1.1 Lexical Analysis Versus Parsing
    3.1.2 Tokens, Patterns, and Lexemes
    3.1.3 Attributes for Tokens
    3.1.4 Lexical Errors
    3.1.5 Exercises for Section 3.1
  3.2 Input Buffering
    3.2.1 Buffer Pairs
    3.2.2 Sentinels
  3.3 Specification of Tokens
    3.3.1 Strings and Languages
    3.3.2 Operations on Languages
    3.3.3 Regular Expressions
    3.3.4 Regular Definitions
    3.3.5 Extensions of Regular Expressions
    3.3.6 Exercises for Section 3.3
  3.4 Recognition of Tokens
    3.4.1 Transition Diagrams
    3.4.2 Recognition of Reserved Words and Identifiers
    3.4.3 Completion of the Running Example
    3.4.4 Architecture of a Transition-Diagram-Based Lexical Analyzer
    3.4.5 Exercises for Section 3.4
  3.5 The Lexical-Analyzer Generator Lex
    3.5.1 Use of Lex
    3.5.2 Structure of Lex Programs
    3.5.3 Conflict Resolution in Lex
    3.5.4 The Lookahead Operator
    3.5.5 Exercises for Section 3.5
  3.6 Finite Automata
    3.6.1 Nondeterministic Finite Automata
    3.6.2 Transition Tables
    3.6.3 Acceptance of Input Strings by Automata
    3.6.4 Deterministic Finite Automata
    3.6.5 Exercises for Section 3.6
  3.7 From Regular Expressions to Automata
    3.7.1 Conversion of an NFA to a DFA
    3.7.2 Simulation of an NFA
    3.7.3 Efficiency of NFA Simulation
    3.7.4 Construction of an NFA from a Regular Expression
    3.7.5 Efficiency of String-Processing Algorithms
    3.7.6 Exercises for Section 3.7
  3.8 Design of a Lexical-Analyzer Generator
    3.8.1 The Structure of the Generated Analyzer
    3.8.2 Pattern Matching Based on NFA's
    3.8.3 DFA's for Lexical Analyzers
    3.8.4 Implementing the Lookahead Operator
    3.8.5 Exercises for Section 3.8
  3.9 Optimization of DFA-Based Pattern Matchers
    3.9.1 Important States of an NFA
    3.9.2 Functions Computed From the Syntax Tree
    3.9.3 Computing nullable, firstpos, and lastpos
    3.9.4 Computing followpos
    3.9.5 Converting a Regular Expression Directly to a DFA
    3.9.6 Minimizing the Number of States of a DFA
    3.9.7 State Minimization in Lexical Analyzers
    3.9.8 Trading Time for Space in DFA Simulation
    3.9.9 Exercises for Section 3.9
  3.10 Summary of Chapter 3
  3.11 References for Chapter 3

4 Syntax Analysis
  4.1 Introduction
    4.1.1 The Role of the Parser
    4.1.2 Representative Grammars
    4.1.3 Syntax Error Handling
    4.1.4 Error-Recovery Strategies
  4.2 Context-Free Grammars
    4.2.1 The Formal Definition of a Context-Free Grammar
    4.2.2 Notational Conventions
    4.2.3 Derivations
    4.2.4 Parse Trees and Derivations
    4.2.5 Ambiguity
    4.2.6 Verifying the Language Generated by a Grammar
    4.2.7 Context-Free Grammars Versus Regular Expressions
    4.2.8 Exercises for Section 4.2
  4.3 Writing a Grammar
    4.3.1 Lexical Versus Syntactic Analysis
    4.3.2 Eliminating Ambiguity
    4.3.3 Elimination of Left Recursion
    4.3.4 Left Factoring
    4.3.5 Non-Context-Free Language Constructs
    4.3.6 Exercises for Section 4.3
  4.4 Top-Down Parsing
    4.4.1 Recursive-Descent Parsing
    4.4.2 FIRST and FOLLOW
    4.4.3 LL(1) Grammars
    4.4.4 Nonrecursive Predictive Parsing
    4.4.5 Error Recovery in Predictive Parsing
    4.4.6 Exercises for Section 4.4
  4.5 Bottom-Up Parsing
    4.5.1 Reductions
    4.5.2 Handle Pruning
    4.5.3 Shift-Reduce Parsing
    4.5.4 Conflicts During Shift-Reduce Parsing
    4.5.5 Exercises for Section 4.5
  4.6 Introduction to LR Parsing: Simple LR
    4.6.1 Why LR Parsers?
    4.6.2 Items and the LR(0) Automaton
    4.6.3 The LR-Parsing Algorithm
    4.6.4 Constructing SLR-Parsing Tables
    4.6.5 Viable Prefixes
    4.6.6 Exercises for Section 4.6
  4.7 More Powerful LR Parsers
    4.7.1 Canonical LR(1) Items
    4.7.2 Constructing LR(1) Sets of Items
    4.7.3 Canonical LR(1) Parsing Tables
    4.7.4 Constructing LALR Parsing Tables
    4.7.5 Efficient Construction of LALR Parsing Tables
    4.7.6 Compaction of LR Parsing Tables
    4.7.7 Exercises for Section 4.7
  4.8 Using Ambiguous Grammars
    4.8.1 Precedence and Associativity to Resolve Conflicts
    4.8.2 The "Dangling-Else" Ambiguity
    4.8.3 Error Recovery in LR Parsing
    4.8.4 Exercises for Section 4.8
  4.9 Parser Generators
    4.9.1 The Parser Generator Yacc
    4.9.2 Using Yacc with Ambiguous Grammars
    4.9.3 Creating Yacc Lexical Analyzers with Lex
    4.9.4 Error Recovery in Yacc
    4.9.5 Exercises for Section 4.9
  4.10 Summary of Chapter 4
  4.11 References for Chapter 4

5 Syntax-Directed Translation
  5.1 Syntax-Directed Definitions
    5.1.1 Inherited and Synthesized Attributes
    5.1.2 Evaluating an SDD at the Nodes of a Parse Tree
    5.1.3 Exercises for Section 5.1
  5.2 Evaluation Orders for SDD's
    5.2.1 Dependency Graphs
    5.2.2 Ordering the Evaluation of Attributes
    5.2.3 S-Attributed Definitions
    5.2.4 L-Attributed Definitions
    5.2.5 Semantic Rules with Controlled Side Effects
    5.2.6 Exercises for Section 5.2
  5.3 Applications of Syntax-Directed Translation
    5.3.1 Construction of Syntax Trees
    5.3.2 The Structure of a Type
    5.3.3 Exercises for Section 5.3
  5.4 Syntax-Directed Translation Schemes
    5.4.1 Postfix Translation Schemes
    5.4.2 Parser-Stack Implementation of Postfix SDT's
    5.4.3 SDT's With Actions Inside Productions
    5.4.4 Eliminating Left Recursion From SDT's
    5.4.5 SDT's for L-Attributed Definitions
    5.4.6 Exercises for Section 5.4
  5.5 Implementing L-Attributed SDD's
    5.5.1 Translation During Recursive-Descent Parsing
    5.5.2 On-The-Fly Code Generation
    5.5.3 L-Attributed SDD's and LL Parsing
    5.5.4 Bottom-Up Parsing of L-Attributed SDD's
    5.5.5 Exercises for Section 5.5
  5.6 Summary of Chapter 5
  5.7 References for Chapter 5

6 Intermediate-Code Generation
  6.1 Variants of Syntax Trees
    6.1.1 Directed Acyclic Graphs for Expressions
    6.1.2 The Value-Number Method for Constructing DAG's
    6.1.3 Exercises for Section 6.1
  6.2 Three-Address Code
    6.2.1 Addresses and Instructions
    6.2.2 Quadruples
    6.2.3 Triples
    6.2.4 Static Single-Assignment Form
    6.2.5 Exercises for Section 6.2
  6.3 Types and Declarations
    6.3.1 Type Expressions
    6.3.2 Type Equivalence
    6.3.3 Declarations
    6.3.4 Storage Layout for Local Names
    6.3.5 Sequences of Declarations
    6.3.6 Fields in Records and Classes
    6.3.7 Exercises for Section 6.3
  6.4 Translation of Expressions
    6.4.1 Operations Within Expressions
    6.4.2 Incremental Translation
    6.4.3 Addressing Array Elements
    6.4.4 Translation of Array References
    6.4.5 Exercises for Section 6.4
  6.5 Type Checking
    6.5.1 Rules for Type Checking
    6.5.2 Type Conversions
    6.5.3 Overloading of Functions and Operators
    6.5.4 Type Inference and Polymorphic Functions
    6.5.5 An Algorithm for Unification
    6.5.6 Exercises for Section 6.5
  6.6 Control Flow
    6.6.1 Boolean Expressions
    6.6.2 Short-Circuit Code
    6.6.3 Flow-of-Control Statements
    6.6.4 Control-Flow Translation of Boolean Expressions
    6.6.5 Avoiding Redundant Gotos
    6.6.6 Boolean Values and Jumping Code
    6.6.7 Exercises for Section 6.6
  6.7 Backpatching
    6.7.1 One-Pass Code Generation Using Backpatching
    6.7.2 Backpatching for Boolean Expressions
    6.7.3 Flow-of-Control Statements
    6.7.4 Break-, Continue-, and Goto-Statements
    6.7.5 Exercises for Section 6.7
  6.8 Switch-Statements
    6.8.1 Translation of Switch-Statements
    6.8.2 Syntax-Directed Translation of Switch-Statements
    6.8.3 Exercises for Section 6.8
  6.9 Intermediate Code for Procedures
  6.10 Summary of Chapter 6
  6.11 References for Chapter 6

7 Run-Time Environments
  7.1 Storage Organization
    7.1.1 Static Versus Dynamic Storage Allocation
  7.2 Stack Allocation of Space
    7.2.1 Activation Trees
    7.2.2 Activation Records
    7.2.3 Calling Sequences
    7.2.4 Variable-Length Data on the Stack
    7.2.5 Exercises for Section 7.2
  7.3 Access to Nonlocal Data on the Stack
    7.3.1 Data Access Without Nested Procedures
    7.3.2 Issues With Nested Procedures
    7.3.3 A Language With Nested Procedure Declarations
    7.3.4 Nesting Depth
    7.3.5 Access Links
    7.3.6 Manipulating Access Links
    7.3.7 Access Links for Procedure Parameters
    7.3.8 Displays
    7.3.9 Exercises for Section 7.3
  7.4 Heap Management
    7.4.1 The Memory Manager
    7.4.2 The Memory Hierarchy of a Computer
    7.4.3 Locality in Programs
    7.4.4 Reducing Fragmentation
    7.4.5 Manual Deallocation Requests
    7.4.6 Exercises for Section 7.4
  7.5 Introduction to Garbage Collection
    7.5.1 Design Goals for Garbage Collectors
    7.5.2 Reachability
    7.5.3 Reference Counting Garbage Collectors
    7.5.4 Exercises for Section 7.5
  7.6 Introduction to Trace-Based Collection
    7.6.1 A Basic Mark-and-Sweep Collector
    7.6.2 Basic Abstraction
    7.6.3 Optimizing Mark-and-Sweep
    7.6.4 Mark-and-Compact Garbage Collectors
    7.6.5 Copying Collectors
    7.6.6 Comparing Costs
    7.6.7 Exercises for Section 7.6
  7.7 Short-Pause Garbage Collection
    7.7.1 Incremental Garbage Collection
    7.7.2 Incremental Reachability Analysis
    7.7.3 Partial-Collection Basics
    7.7.4 Generational Garbage Collection
    7.7.5 The Train Algorithm
    7.7.6 Exercises for Section 7.7
  7.8 Advanced Topics in Garbage Collection
    7.8.1 Parallel and Concurrent Garbage Collection
    7.8.2 Partial Object Relocation
    7.8.3 Conservative Collection for Unsafe Languages
    7.8.4 Weak References
    7.8.5 Exercises for Section 7.8
  7.9 Summary of Chapter 7
  7.10 References for Chapter 7

8 Code Generation
  8.1 Issues in the Design of a Code Generator
    8.1.1 Input to the Code Generator
    8.1.2 The Target Program
    8.1.3 Instruction Selection
    8.1.4 Register Allocation
    8.1.5 Evaluation Order
  8.2 The Target Language
    8.2.1 A Simple Target Machine Model
    8.2.2 Program and Instruction Costs
    8.2.3 Exercises for Section 8.2
  8.3 Addresses in the Target Code
    8.3.1 Static Allocation
    8.3.2 Stack Allocation
    8.3.3 Run-Time Addresses for Names
    8.3.4 Exercises for Section 8.3
  8.4 Basic Blocks and Flow Graphs
    8.4.1 Basic Blocks
    8.4.2 Next-Use Information
    8.4.3 Flow Graphs
    8.4.4 Representation of Flow Graphs
    8.4.5 Loops
    8.4.6 Exercises for Section 8.4
  8.5 Optimization of Basic Blocks
    8.5.1 The DAG Representation of Basic Blocks
    8.5.2 Finding Local Common Subexpressions
    8.5.3 Dead Code Elimination
    8.5.4 The Use of Algebraic Identities
    8.5.5 Representation of Array References
    8.5.6 Pointer Assignments and Procedure Calls
    8.5.7 Reassembling Basic Blocks From DAG's
    8.5.8 Exercises for Section 8.5
  8.6 A Simple Code Generator
    8.6.1 Register and Address Descriptors
    8.6.2 The Code-Generation Algorithm
    8.6.3 Design of the Function getReg
    8.6.4 Exercises for Section 8.6
  8.7 Peephole Optimization
    8.7.1 Eliminating Redundant Loads and Stores
    8.7.2 Eliminating Unreachable Code
    8.7.3 Flow-of-Control Optimizations
    8.7.4 Algebraic Simplification and Reduction in Strength
    8.7.5 Use of Machine Idioms
    8.7.6 Exercises for Section 8.7
  8.8 Register Allocation and Assignment
    8.8.1 Global Register Allocation
    8.8.2 Usage Counts
    8.8.3 Register Assignment for Outer Loops
    8.8.4 Register Allocation by Graph Coloring
    8.8.5 Exercises for Section 8.8
  8.9 Instruction Selection by Tree Rewriting
    8.9.1 Tree-Translation Schemes
    8.9.2 Code Generation by Tiling an Input Tree
    8.9.3 Pattern Matching by Parsing
    8.9.4 Routines for Semantic Checking
    8.9.5 General Tree Matching
    8.9.6 Exercises for Section 8.9
  8.10 Optimal Code Generation for Expressions
    8.10.1 Ershov Numbers
    8.10.2 Generating Code From Labeled Expression Trees
    8.10.3 Evaluating Expressions with an Insufficient Supply of Registers
    8.10.4 Exercises for Section 8.10
  8.11 Dynamic Programming Code-Generation
    8.11.1 Contiguous Evaluation
    8.11.2 The Dynamic Programming Algorithm
    8.11.3 Exercises for Section 8.11
  8.12 Summary of Chapter 8
  8.13 References for Chapter 8

9 Machine-Independent Optimizations
  9.1 The Principal Sources of Optimization
    9.1.1 Causes of Redundancy
    9.1.2 A Running Example: Quicksort
    9.1.3 Semantics-Preserving Transformations
    9.1.4 Global Common Subexpressions
    9.1.5 Copy Propagation
    9.1.6 Dead-Code Elimination
    9.1.7 Code Motion
    9.1.8 Induction Variables and Reduction in Strength
    9.1.9 Exercises for Section 9.1
  9.2 Introduction to Data-Flow Analysis
    9.2.1 The Data-Flow Abstraction
    9.2.2 The Data-Flow Analysis Schema
    9.2.3 Data-Flow Schemas on Basic Blocks
    9.2.4 Reaching Definitions
    9.2.5 Live-Variable Analysis
    9.2.6 Available Expressions
    9.2.7 Summary
    9.2.8 Exercises for Section 9.2
  9.3 Foundations of Data-Flow Analysis
    9.3.1 Semilattices
    9.3.2 Transfer Functions
    9.3.3 The Iterative Algorithm for General Frameworks
    9.3.4 Meaning of a Data-Flow Solution
    9.3.5 Exercises for Section 9.3
  9.4 Constant Propagation
    9.4.1 Data-Flow Values for the Constant-Propagation Framework
    9.4.2 The Meet for the Constant-Propagation Framework
    9.4.3 Transfer Functions for the Constant-Propagation Framework
    9.4.4 Monotonicity of the Constant-Propagation Framework
    9.4.5 Nondistributivity of the Constant-Propagation Framework
    9.4.6 Interpretation of the Results
    9.4.7 Exercises for Section 9.4
  9.5 Partial-Redundancy Elimination
    9.5.1 The Sources of Redundancy
    9.5.2 Can All Redundancy Be Eliminated?
    9.5.3 The Lazy-Code-Motion Problem
    9.5.4 Anticipation of Expressions
    9.5.5 The Lazy-Code-Motion Algorithm
    9.5.6 Exercises for Section 9.5
  9.6 Loops in Flow Graphs
    9.6.1 Dominators
    9.6.2 Depth-First Ordering
    9.6.3 Edges in a Depth-First Spanning Tree
    9.6.4 Back Edges and Reducibility
    9.6.5 Depth of a Flow Graph
    9.6.6 Natural Loops
    9.6.7 Speed of Convergence of Iterative Data-Flow Algorithms
    9.6.8 Exercises for Section 9.6
  9.7 Region-Based Analysis
    9.7.1 Regions
    9.7.2 Region Hierarchies for Reducible Flow Graphs
    9.7.3 Overview of a Region-Based Analysis
    9.7.4 Necessary Assumptions About Transfer Functions
    9.7.5 An Algorithm for Region-Based Analysis
    9.7.6 Handling Nonreducible Flow Graphs
    9.7.7 Exercises for Section 9.7
  9.8 Symbolic Analysis
    9.8.1 Affine Expressions of Reference Variables
    9.8.2 Data-Flow Problem Formulation
    9.8.3 Region-Based Symbolic Analysis
    9.8.4 Exercises for Section 9.8
  9.9 Summary of Chapter 9
  9.10 References for Chapter 9

10 Instruction-Level Parallelism
  10.1 Processor Architectures
    10.1.1 Instruction Pipelines and Branch Delays
    10.1.2 Pipelined Execution
    10.1.3 Multiple Instruction Issue
  10.2 Code-Scheduling Constraints
    10.2.1 Data Dependence
    10.2.2 Finding Dependences Among Memory Accesses
    10.2.3 Tradeoff Between Register Usage and Parallelism
    10.2.4 Phase Ordering Between Register Allocation and Code Scheduling
    10.2.5 Control Dependence
    10.2.6 Speculative Execution Support
    10.2.7 A Basic Machine Model
    10.2.8 Exercises for Section 10.2
  10.3 Basic-Block Scheduling
    10.3.1 Data-Dependence Graphs
    10.3.2 List Scheduling of Basic Blocks
    10.3.3 Prioritized Topological Orders
    10.3.4 Exercises for Section 10.3
  10.4 Global Code Scheduling
    10.4.1 Primitive Code Motion
    10.4.2 Upward Code Motion
    10.4.3 Downward Code Motion
    10.4.4 Updating Data Dependences
    10.4.5 Global Scheduling Algorithms
    10.4.6 Advanced Code Motion Techniques
    10.4.7 Interaction with Dynamic Schedulers
    10.4.8 Exercises for Section 10.4
  10.5 Software Pipelining
    10.5.1 Introduction
    10.5.2 Software Pipelining of Loops
    10.5.3 Register Allocation and Code Generation
    10.5.4 Do-Across Loops
    10.5.5 Goals and Constraints of Software Pipelining
    10.5.6 A Software-Pipelining Algorithm
    10.5.7 Scheduling Acyclic Data-Dependence Graphs
    10.5.8 Scheduling Cyclic Dependence Graphs
    10.5.9 Improvements to the Pipelining Algorithms
    10.5.10 Modular Variable Expansion
    10.5.11 Conditional Statements
    10.5.12 Hardware Support for Software Pipelining
    10.5.13 Exercises for Section 10.5
  10.6 Summary of Chapter 10
  10.7 References for Chapter 10

11 Optimizing for Parallelism and Locality
  11.1 Basic Concepts
    11.1.1 Multiprocessors
    11.1.2 Parallelism in Applications
    11.1.3 Loop-Level Parallelism
    11.1.4 Data Locality
    11.1.5 Introduction to Affine Transform Theory
  11.2 Matrix Multiply: An In-Depth Example
    11.2.1 The Matrix-Multiplication Algorithm
    11.2.2 Optimizations
    11.2.3 Cache Interference
    11.2.4 Exercises for Section 11.2
  11.3 Iteration Spaces
    11.3.1 Constructing Iteration Spaces from Loop Nests
    11.3.2 Execution Order for Loop Nests
    11.3.3 Matrix Formulation of Inequalities
    11.3.4 Incorporating Symbolic Constants
    11.3.5 Controlling the Order of Execution
    11.3.6 Changing Axes
    11.3.7 Exercises for Section 11.3
  11.4 Affine Array Indexes
    11.4.1 Affine Accesses
    11.4.2 Affine and Nonaffine Accesses in Practice
    11.4.3 Exercises for Section 11.4
