
Data Structures and Algorithm Analysis in C++
Fourth Edition


Data Structures and Algorithm Analysis in C++
Fourth Edition

Mark Allen Weiss
Florida International University

Boston  Columbus  Indianapolis  New York  San Francisco  Upper Saddle River
Amsterdam  Cape Town  Dubai  Mexico City  São Paulo  Sydney
Hong Kong  Seoul  Singapore  Taipei  Tokyo

Editorial Director, ECS: Marcia Horton
Executive Editor: Tracy Johnson
Editorial Assistant: Jenah Blitz-Stoehr
Director of Marketing: Christy Lesko
Marketing Manager: Yez Alayan
Senior Marketing Coordinator: Kathryn Ferranti
Marketing Assistant: Jon Bryant
Director of Production: Erin Gregg
Senior Managing Editor: Scott Disanno
Senior Production Project Manager: Marilyn Lloyd
Manufacturing Buyer: Linda Sager
Art Director: Jayne Conte
Cover Designer: Bruce Kenselaar
Permissions Supervisor: Michael Joyce
Permissions Administrator: Jenell Forschler
Cover Image: © De-kay / Dreamstime.com
Media Project Manager: Renata Butera
Full-Service Project Management: Integra Software Services Pvt. Ltd.
Composition: Integra Software Services Pvt. Ltd.
Text and Cover Printer/Binder: Courier Westford

Copyright © 2014, 2006, 1999 Pearson Education, Inc., publishing as Addison-Wesley. All rights reserved. Printed in the United States of America. This publication is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to 201-236-3290.

Many of the designations by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Library of Congress Cataloging-in-Publication Data
Weiss, Mark Allen.
  Data structures and algorithm analysis in C++ / Mark Allen Weiss, Florida International University. — Fourth edition.
    pages cm
  ISBN-13: 978-0-13-284737-7 (alk. paper)
  ISBN-10: 0-13-284737-X (alk. paper)
  1. C++ (Computer program language)  2. Data structures (Computer science)  3. Computer algorithms.  I. Title.
  QA76.73.C153W46 2014
  005.7'3—dc23
                                      2013011064

10 9 8 7 6 5 4 3 2 1

ISBN-13: 978-0-13-284737-7
ISBN-10: 0-13-284737-X

To my kind, brilliant, and inspiring Sara.


CONTENTS

Preface  xv

Chapter 1  Programming: A General Overview  1
  1.1  What's This Book About?  1
  1.2  Mathematics Review  2
    1.2.1  Exponents  3
    1.2.2  Logarithms  3
    1.2.3  Series  4
    1.2.4  Modular Arithmetic  5
    1.2.5  The P Word  6
  1.3  A Brief Introduction to Recursion  8
  1.4  C++ Classes  12
    1.4.1  Basic class Syntax  12
    1.4.2  Extra Constructor Syntax and Accessors  13
    1.4.3  Separation of Interface and Implementation  16
    1.4.4  vector and string  19
  1.5  C++ Details  21
    1.5.1  Pointers  21
    1.5.2  Lvalues, Rvalues, and References  23
    1.5.3  Parameter Passing  25
    1.5.4  Return Passing  27
    1.5.5  std::swap and std::move  29
    1.5.6  The Big-Five: Destructor, Copy Constructor, Move Constructor, Copy Assignment operator=, Move Assignment operator=  30
    1.5.7  C-style Arrays and Strings  35
  1.6  Templates  36
    1.6.1  Function Templates  37
    1.6.2  Class Templates  38
    1.6.3  Object, Comparable, and an Example  39
    1.6.4  Function Objects  41
    1.6.5  Separate Compilation of Class Templates  44
  1.7  Using Matrices  44
    1.7.1  The Data Members, Constructor, and Basic Accessors  44
    1.7.2  operator[]  45
    1.7.3  Big-Five  46
  Summary  46
  Exercises  46
  References  48

Chapter 2  Algorithm Analysis  51
  2.1  Mathematical Background  51
  2.2  Model  54
  2.3  What to Analyze  54
  2.4  Running-Time Calculations  57
    2.4.1  A Simple Example  58
    2.4.2  General Rules  58
    2.4.3  Solutions for the Maximum Subsequence Sum Problem  60
    2.4.4  Logarithms in the Running Time  66
    2.4.5  Limitations of Worst-Case Analysis  70
  Summary  70
  Exercises  71
  References  76

Chapter 3  Lists, Stacks, and Queues  77
  3.1  Abstract Data Types (ADTs)  77
  3.2  The List ADT  78
    3.2.1  Simple Array Implementation of Lists  78
    3.2.2  Simple Linked Lists  79
  3.3  vector and list in the STL  80
    3.3.1  Iterators  82
    3.3.2  Example: Using erase on a List  83
    3.3.3  const_iterators  84
  3.4  Implementation of vector  86
  3.5  Implementation of list  91
  3.6  The Stack ADT  103
    3.6.1  Stack Model  103
    3.6.2  Implementation of Stacks  104
    3.6.3  Applications  104
  3.7  The Queue ADT  112
    3.7.1  Queue Model  113
    3.7.2  Array Implementation of Queues  113
    3.7.3  Applications of Queues  115
  Summary  116
  Exercises  116

Chapter 4  Trees  121
  4.1  Preliminaries  121
    4.1.1  Implementation of Trees  122
    4.1.2  Tree Traversals with an Application  123
  4.2  Binary Trees  126
    4.2.1  Implementation  128
    4.2.2  An Example: Expression Trees  128
  4.3  The Search Tree ADT—Binary Search Trees  132
    4.3.1  contains  134
    4.3.2  findMin and findMax  135
    4.3.3  insert  136
    4.3.4  remove  139
    4.3.5  Destructor and Copy Constructor  141
    4.3.6  Average-Case Analysis  141
  4.4  AVL Trees  144
    4.4.1  Single Rotation  147
    4.4.2  Double Rotation  149
  4.5  Splay Trees  158
    4.5.1  A Simple Idea (That Does Not Work)  158
    4.5.2  Splaying  160
  4.6  Tree Traversals (Revisited)  166
  4.7  B-Trees  168
  4.8  Sets and Maps in the Standard Library  173
    4.8.1  Sets  173
    4.8.2  Maps  174
    4.8.3  Implementation of set and map  175
    4.8.4  An Example That Uses Several Maps  176
  Summary  181
  Exercises  182
  References  189

Chapter 5  Hashing  193
  5.1  General Idea  193
  5.2  Hash Function  194
  5.3  Separate Chaining  196
  5.4  Hash Tables without Linked Lists  201
    5.4.1  Linear Probing  201
    5.4.2  Quadratic Probing  202
    5.4.3  Double Hashing  207
  5.5  Rehashing  208
  5.6  Hash Tables in the Standard Library  210
  5.7  Hash Tables with Worst-Case O(1) Access  212
    5.7.1  Perfect Hashing  213
    5.7.2  Cuckoo Hashing  215
    5.7.3  Hopscotch Hashing  227
  5.8  Universal Hashing  230
  5.9  Extendible Hashing  233
  Summary  236
  Exercises  237
  References  241

Chapter 6  Priority Queues (Heaps)  245
  6.1  Model  245
  6.2  Simple Implementations  246
  6.3  Binary Heap  247
    6.3.1  Structure Property  247
    6.3.2  Heap-Order Property  248
    6.3.3  Basic Heap Operations  249
    6.3.4  Other Heap Operations  252
  6.4  Applications of Priority Queues  257
    6.4.1  The Selection Problem  258
    6.4.2  Event Simulation  259
  6.5  d-Heaps  260
  6.6  Leftist Heaps  261
    6.6.1  Leftist Heap Property  261
    6.6.2  Leftist Heap Operations  262
  6.7  Skew Heaps  269
  6.8  Binomial Queues  271
    6.8.1  Binomial Queue Structure  271
    6.8.2  Binomial Queue Operations  271
    6.8.3  Implementation of Binomial Queues  276
  6.9  Priority Queues in the Standard Library  282
  Summary  283
  Exercises  283
  References  288

Chapter 7  Sorting  291
  7.1  Preliminaries  291
  7.2  Insertion Sort  292
    7.2.1  The Algorithm  292
    7.2.2  STL Implementation of Insertion Sort  293
    7.2.3  Analysis of Insertion Sort  294
  7.3  A Lower Bound for Simple Sorting Algorithms  295
  7.4  Shellsort  296
    7.4.1  Worst-Case Analysis of Shellsort  297
  7.5  Heapsort  300
    7.5.1  Analysis of Heapsort  301
  7.6  Mergesort  304
    7.6.1  Analysis of Mergesort  306
  7.7  Quicksort  309
    7.7.1  Picking the Pivot  311
    7.7.2  Partitioning Strategy  313
    7.7.3  Small Arrays  315
    7.7.4  Actual Quicksort Routines  315
    7.7.5  Analysis of Quicksort  318
    7.7.6  A Linear-Expected-Time Algorithm for Selection  321
  7.8  A General Lower Bound for Sorting  323
    7.8.1  Decision Trees  323
  7.9  Decision-Tree Lower Bounds for Selection Problems  325
  7.10  Adversary Lower Bounds  328
  7.11  Linear-Time Sorts: Bucket Sort and Radix Sort  331
  7.12  External Sorting  336
    7.12.1  Why We Need New Algorithms  336
    7.12.2  Model for External Sorting  336
    7.12.3  The Simple Algorithm  337
    7.12.4  Multiway Merge  338
    7.12.5  Polyphase Merge  339
    7.12.6  Replacement Selection  340
  Summary  341
  Exercises  341
  References  347

Chapter 8  The Disjoint Sets Class  351
  8.1  Equivalence Relations  351
  8.2  The Dynamic Equivalence Problem  352
  8.3  Basic Data Structure  353
  8.4  Smart Union Algorithms  357
  8.5  Path Compression  360
  8.6  Worst Case for Union-by-Rank and Path Compression  361
    8.6.1  Slowly Growing Functions  362
    8.6.2  An Analysis by Recursive Decomposition  362
    8.6.3  An O(M log* N) Bound  369
    8.6.4  An O(M α(M, N)) Bound  370
  8.7  An Application  372
  Summary  374
  Exercises  375
  References  376

Chapter 9  Graph Algorithms  379
  9.1  Definitions  379
    9.1.1  Representation of Graphs  380
  9.2  Topological Sort  382
  9.3  Shortest-Path Algorithms  386
    9.3.1  Unweighted Shortest Paths  387
    9.3.2  Dijkstra's Algorithm  391
    9.3.3  Graphs with Negative Edge Costs  400
    9.3.4  Acyclic Graphs  400
    9.3.5  All-Pairs Shortest Path  404
    9.3.6  Shortest Path Example  404
  9.4  Network Flow Problems  406
    9.4.1  A Simple Maximum-Flow Algorithm  408
  9.5  Minimum Spanning Tree  413
    9.5.1  Prim's Algorithm  414
    9.5.2  Kruskal's Algorithm  417
  9.6  Applications of Depth-First Search  419
    9.6.1  Undirected Graphs  420
    9.6.2  Biconnectivity  421
    9.6.3  Euler Circuits  425
    9.6.4  Directed Graphs  429
    9.6.5  Finding Strong Components  431
  9.7  Introduction to NP-Completeness  432
    9.7.1  Easy vs. Hard  433
    9.7.2  The Class NP  434
    9.7.3  NP-Complete Problems  434
  Summary  437
  Exercises  437
  References  445

Chapter 10  Algorithm Design Techniques  449
  10.1  Greedy Algorithms  449
    10.1.1  A Simple Scheduling Problem  450
    10.1.2  Huffman Codes  453
    10.1.3  Approximate Bin Packing  459
  10.2  Divide and Conquer  467
    10.2.1  Running Time of Divide-and-Conquer Algorithms  468
    10.2.2  Closest-Points Problem  470
    10.2.3  The Selection Problem  475
    10.2.4  Theoretical Improvements for Arithmetic Problems  478
  10.3  Dynamic Programming  482
    10.3.1  Using a Table Instead of Recursion  483
    10.3.2  Ordering Matrix Multiplications  485
    10.3.3  Optimal Binary Search Tree  487
    10.3.4  All-Pairs Shortest Path  491
  10.4  Randomized Algorithms  494
    10.4.1  Random-Number Generators  495
    10.4.2  Skip Lists  500
    10.4.3  Primality Testing  503
  10.5  Backtracking Algorithms  506
    10.5.1  The Turnpike Reconstruction Problem  506
    10.5.2  Games  511
  Summary  518
  Exercises  518
  References  527

Chapter 11  Amortized Analysis  533
  11.1  An Unrelated Puzzle  534
  11.2  Binomial Queues  534
  11.3  Skew Heaps  539
  11.4  Fibonacci Heaps  541
    11.4.1  Cutting Nodes in Leftist Heaps  542
    11.4.2  Lazy Merging for Binomial Queues  544
    11.4.3  The Fibonacci Heap Operations  548
    11.4.4  Proof of the Time Bound  549
  11.5  Splay Trees  551
  Summary  555
  Exercises  556
  References  557

Chapter 12  Advanced Data Structures and Implementation  559
  12.1  Top-Down Splay Trees  559
  12.2  Red-Black Trees  566
    12.2.1  Bottom-Up Insertion  567
    12.2.2  Top-Down Red-Black Trees  568
    12.2.3  Top-Down Deletion  570
  12.3  Treaps  576
  12.4  Suffix Arrays and Suffix Trees  579
    12.4.1  Suffix Arrays  580
    12.4.2  Suffix Trees  583
    12.4.3  Linear-Time Construction of Suffix Arrays and Suffix Trees  586
  12.5  k-d Trees  596
  12.6  Pairing Heaps  602
  Summary  606
  Exercises  608
  References  612

Appendix A  Separate Compilation of Class Templates  615
  A.1  Everything in the Header  616
  A.2  Explicit Instantiation  616

Index  619

PREFACE

Purpose/Goals

The fourth edition of Data Structures and Algorithm Analysis in C++ describes data structures, methods of organizing large amounts of data, and algorithm analysis, the estimation of the running time of algorithms. As computers become faster and faster, the need for programs that can handle large amounts of input becomes more acute. Paradoxically, this requires more careful attention to efficiency, since inefficiencies in programs become most obvious when input sizes are large. By analyzing an algorithm before it is actually coded, students can decide if a particular solution will be feasible. For example, in this text students look at specific problems and see how careful implementations can reduce the time constraint for large amounts of data from centuries to less than a second. Therefore, no algorithm or data structure is presented without an explanation of its running time. In some cases, minute details that affect the running time of the implementation are explored.

Once a solution method is determined, a program must still be written. As computers have become more powerful, the problems they must solve have become larger and more complex, requiring development of more intricate programs. The goal of this text is to teach students good programming and algorithm analysis skills simultaneously so that they can develop such programs with the maximum amount of efficiency.

This book is suitable for either an advanced data structures course or a first-year graduate course in algorithm analysis. Students should have some knowledge of intermediate programming, including such topics as pointers, recursion, and object-based programming, as well as some background in discrete math.

Approach

Although the material in this text is largely language-independent, programming requires the use of a specific language. As the title implies, we have chosen C++ for this book. C++ has become a leading systems programming language. In addition to fixing many of the syntactic flaws of C, C++ provides direct constructs (the class and template) to implement generic data structures as abstract data types.

The most difficult part of writing this book was deciding on the amount of C++ to include. Use too many features of C++ and one gets an incomprehensible text; use too few and you have little more than a C text that supports classes.

The approach we take is to present the material in an object-based approach. As such, there is almost no use of inheritance in the text. We use class templates to describe generic data structures. We generally avoid esoteric C++ features and use the vector and string classes that are now part of the C++ standard. Previous editions have implemented class templates by separating the class template interface from its implementation. Although this is arguably the preferred approach, it exposes compiler problems that have made it difficult for readers to actually use the code. As a result, in this edition the online code represents class templates as a single unit, with no separation of interface and implementation. Chapter 1 provides a review of the C++ features that are used throughout the text and describes our approach to class templates. Appendix A describes how the class templates could be rewritten to use separate compilation.

Complete versions of the data structures, in both C++ and Java, are available on the Internet. We use similar coding conventions to make the parallels between the two languages more evident.

Summary of the Most Significant Changes in the Fourth Edition

The fourth edition incorporates numerous bug fixes, and many parts of the book have undergone revision to increase the clarity of presentation. In addition,

- Chapter 4 includes implementation of the AVL tree deletion algorithm—a topic often requested by readers.
- Chapter 5 has been extensively revised and enlarged and now contains material on two newer algorithms: cuckoo hashing and hopscotch hashing. Additionally, a new section on universal hashing has been added. Also new is a brief discussion of the unordered_set and unordered_map class templates introduced in C++11.
- Chapter 6 is mostly unchanged; however, the implementation of the binary heap makes use of move operations that were introduced in C++11.
- Chapter 7 now contains material on radix sort, and a new section on lower-bound proofs has been added. Sorting code makes use of move operations that were introduced in C++11.
- Chapter 8 uses the new union/find analysis by Seidel and Sharir and shows the O(M α(M, N)) bound instead of the weaker O(M log* N) bound in prior editions.
- Chapter 12 adds material on suffix trees and suffix arrays, including the linear-time suffix array construction algorithm by Kärkkäinen and Sanders (with implementation). The sections covering deterministic skip lists and AA-trees have been removed.
- Throughout the text, the code has been updated to use C++11. Notably, this means use of the new C++11 features, including the auto keyword, the range for loop, move construction and assignment, and uniform initialization.

Overview

Chapter 1 contains review material on discrete math and recursion. I believe the only way to be comfortable with recursion is to see good uses over and over. Therefore, recursion is prevalent in this text, with examples in every chapter except Chapter 5. Chapter 1 also includes material that serves as a review of basic C++. Included is a discussion of templates and important constructs in C++ class design.

Chapter 2 deals with algorithm analysis. This chapter explains asymptotic analysis and its major weaknesses. Many examples are provided, including an in-depth explanation of logarithmic running time. Simple recursive programs are analyzed by intuitively converting them into iterative programs. More complicated divide-and-conquer programs are introduced, but some of the analysis (solving recurrence relations) is implicitly delayed until Chapter 7, where it is performed in detail.

Chapter 3 covers lists, stacks, and queues. This chapter includes a discussion of the STL vector and list classes, including material on iterators, and it provides implementations of a significant subset of the STL vector and list classes.

Chapter 4 covers trees, with an emphasis on search trees, including external search trees (B-trees). The UNIX file system and expression trees are used as examples. AVL trees and splay trees are introduced. More careful treatment of search tree implementation details is found in Chapter 12. Additional coverage of trees, such as file compression and game trees, is deferred until Chapter 10. Data structures for an external medium are considered as the final topic in several chapters. Included is a discussion of the STL set and map classes, including a significant example that illustrates the use of three separate maps to efficiently solve a problem.

Chapter 5 discusses hash tables, including the classic algorithms such as separate chaining and linear and quadratic probing, as well as several newer algorithms, namely cuckoo hashing and hopscotch hashing. Universal hashing is also discussed, and extendible hashing is covered at the end of the chapter.

Chapter 6 is about priority queues. Binary heaps are covered, and there is additional material on some of the theoretically interesting implementations of priority queues. The Fibonacci heap is discussed in Chapter 11, and the pairing heap is discussed in Chapter 12.

Chapter 7 covers sorting. It is very specific with respect to coding details and analysis. All the important general-purpose sorting algorithms are covered and compared. Four algorithms are analyzed in detail: insertion sort, Shellsort, heapsort, and quicksort. New to this edition is radix sort and lower bound proofs for selection-related problems. External sorting is covered at the end of the chapter.

Chapter 8 discusses the disjoint set algorithm with proof of the running time. This is a short and specific chapter that can be skipped if Kruskal's algorithm is not discussed.

Chapter 9 covers graph algorithms. Algorithms on graphs are interesting, not only because they frequently occur in practice but also because their running time is so heavily dependent on the proper use of data structures. Virtually all of the standard algorithms are presented along with appropriate data structures, pseudocode, and analysis of running time. To place these problems in a proper context, a short discussion on complexity theory (including NP-completeness and undecidability) is provided.

Chapter 10 covers algorithm design by examining common problem-solving techniques. This chapter is heavily fortified with examples. Pseudocode is used in these later chapters so that the student's appreciation of an example algorithm is not obscured by implementation details.

Chapter 11 deals with amortized analysis. Three data structures from Chapters 4 and 6 and the Fibonacci heap, introduced in this chapter, are analyzed.

Chapter 12 covers search tree algorithms, the suffix tree and array, the k-d tree, and the pairing heap. This chapter departs from the rest of the text by providing complete and careful implementations for the search trees and pairing heap. The material is structured so that the instructor can integrate sections into discussions from other chapters. For example, the top-down red-black tree in Chapter 12 can be discussed along with AVL trees (in Chapter 4).

Chapters 1 to 9 provide enough material for most one-semester data structures courses. If time permits, then Chapter 10 can be covered. A graduate course on algorithm analysis could cover chapters 7 to 11. The advanced data structures analyzed in Chapter 11 can easily be referred to in the earlier chapters. The discussion of NP-completeness in Chapter 9 is far too brief to be used in such a course. You might find it useful to use an additional work on NP-completeness to augment this text.

Exercises

Exercises, provided at the end of each chapter, match the order in which material is presented. The last exercises may address the chapter as a whole rather than a specific section. Difficult exercises are marked with an asterisk, and more challenging exercises have two asterisks.

References

References are placed at the end of each chapter. Generally the references either are historical, representing the original source of the material, or they represent extensions and improvements to the results given in the text. Some references represent solutions to exercises.

Supplements

The following supplements are available to all readers at http://cssupport.pearsoncmg.com/

- Source code for example programs
- Errata

In addition, the following material is available only to qualified instructors at Pearson Instructor Resource Center (www.pearsonhighered.com/irc). Visit the IRC or contact your Pearson Education sales representative for access.

- Solutions to selected exercises
- Figures from the book
- Errata

Acknowledgments

Many, many people have helped me in the preparation of books in this series. Some are listed in other versions of the book; thanks to all.

As usual, the writing process was made easier by the professionals at Pearson. I'd like to thank my editor, Tracy Johnson, and production editor, Marilyn Lloyd. My wonderful wife Jill deserves extra special thanks for everything she does.

Finally, I'd like to thank the numerous readers who have sent e-mail messages and pointed out errors or inconsistencies in earlier versions. My website www.cis.fiu.edu/~weiss will also contain updated source code (in C++ and Java), an errata list, and a link to submit bug reports.

M.A.W.
Miami, Florida

CHAPTER 1

Programming: A General Overview

In this chapter, we discuss the aims and goals of this text and briefly review programming concepts and discrete mathematics. We will . . .

- See that how a program performs for reasonably large input is just as important as its performance on moderate amounts of input.
- Summarize the basic mathematical background needed for the rest of the book.
- Briefly review recursion.
- Summarize some important features of C++ that are used throughout the text.

1.1 What's This Book About?

Suppose you have a group of N numbers and would like to determine the kth largest. This is known as the selection problem. Most students who have had a programming course or two would have no difficulty writing a program to solve this problem. There are quite a few "obvious" solutions.

One way to solve this problem would be to read the N numbers into an array, sort the array in decreasing order by some simple algorithm such as bubble sort, and then return the element in position k.

A somewhat better algorithm might be to read the first k elements into an array and sort them (in decreasing order). Next, each remaining element is read one by one. As a new element arrives, it is ignored if it is smaller than the kth element in the array. Otherwise, it is placed in its correct spot in the array, bumping one element out of the array. When the algorithm ends, the element in the kth position is returned as the answer.

Both algorithms are simple to code, and you are encouraged to do so. The natural questions, then, are: Which algorithm is better? And, more important, Is either algorithm good enough? A simulation using a random file of 30 million elements and k = 15,000,000 will show that neither algorithm finishes in a reasonable amount of time; each requires several days of computer processing to terminate (albeit eventually with a correct answer). An alternative method, discussed in Chapter 7, gives a solution in about a second.
Thus, although our proposed algorithms work, they cannot be considered good algorithms,

because they are entirely impractical for input sizes that a third algorithm can handle in a reasonable amount of time.

A second problem is to solve a popular word puzzle. The input consists of a two-dimensional array of letters and a list of words. The object is to find the words in the puzzle. These words may be horizontal, vertical, or diagonal in any direction. As an example, the puzzle shown in Figure 1.1 contains the words this, two, fat, and that. The word this begins at row 1, column 1, or (1,1), and extends to (1,4); two goes from (1,1) to (3,1); fat goes from (4,1) to (2,3); and that goes from (4,4) to (1,1).

        1   2   3   4
   1    t   h   i   s
   2    w   a   t   s
   3    o   a   h   g
   4    f   g   d   t

Figure 1.1  Sample word puzzle

Again, there are at least two straightforward algorithms that solve the problem. For each word in the word list, we check each ordered triple (row, column, orientation) for the presence of the word. This amounts to lots of nested for loops but is basically straightforward. Alternatively, for each ordered quadruple (row, column, orientation, number of characters) that doesn't run off an end of the puzzle, we can test whether the word indicated is in the word list. Again, this amounts to lots of nested for loops. It is possible to save some time if the maximum number of characters in any word is known.

It is relatively easy to code up either method of solution and solve many of the real-life puzzles commonly published in magazines. These typically have 16 rows, 16 columns, and 40 or so words. Suppose, however, we consider the variation where only the puzzle board is given and the word list is essentially an English dictionary. Both of the solutions proposed require considerable time to solve this problem and therefore might not be acceptable. However, it is possible, even with a large word list, to solve the problem very quickly.

An important concept is that, in many problems, writing a working program is not good enough.
If the program is to be run on a large data set, then the running time becomes an issue. Throughout this book we will see how to estimate the running time of a program for large inputs and, more important, how to compare the running times of two programs without actually coding them. We will see techniques for drastically improving the speed of a program and for determining program bottlenecks. These techniques will enable us to find the section of the code on which to concentrate our optimization efforts.

1.2 Mathematics Review

This section lists some of the basic formulas you need to memorize, or be able to derive, and reviews basic proof techniques.

1.2.1 Exponents

$$X^A X^B = X^{A+B}$$
$$\frac{X^A}{X^B} = X^{A-B}$$
$$(X^A)^B = X^{AB}$$
$$X^N + X^N = 2X^N \neq X^{2N}$$
$$2^N + 2^N = 2^{N+1}$$

1.2.2 Logarithms

In computer science, all logarithms are to the base 2 unless specified otherwise.

Definition 1.1
$X^A = B$ if and only if $\log_X B = A$

Several convenient equalities follow from this definition.

Theorem 1.1
$$\log_A B = \frac{\log_C B}{\log_C A}; \quad A, B, C > 0,\ A \neq 1$$

Proof
Let $X = \log_C B$, $Y = \log_C A$, and $Z = \log_A B$. Then, by the definition of logarithms, $C^X = B$, $C^Y = A$, and $A^Z = B$. Combining these three equalities yields $B = C^X = (C^Y)^Z$. Therefore, $X = YZ$, which implies $Z = X/Y$, proving the theorem.

Theorem 1.2
$$\log AB = \log A + \log B; \quad A, B > 0$$

Proof
Let $X = \log A$, $Y = \log B$, and $Z = \log AB$. Then, assuming the default base of 2, $2^X = A$, $2^Y = B$, and $2^Z = AB$. Combining the last three equalities yields $2^X 2^Y = AB = 2^Z$. Therefore, $X + Y = Z$, which proves the theorem.

Some other useful formulas, which can all be derived in a similar manner, follow.

$$\log A/B = \log A - \log B$$
$$\log(A^B) = B \log A$$
$$\log X < X \quad \text{for all } X > 0$$
$$\log 1 = 0,\quad \log 2 = 1,\quad \log 1{,}024 = 10,\quad \log 1{,}048{,}576 = 20$$

1.2.3 Series

The easiest formulas to remember are
$$\sum_{i=0}^{N} 2^i = 2^{N+1} - 1$$
and the companion,
$$\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$$
In the latter formula, if $0 < A < 1$, then
$$\sum_{i=0}^{N} A^i \leq \frac{1}{1-A}$$
and as $N$ tends to $\infty$, the sum approaches $1/(1-A)$. These are the "geometric series" formulas.

We can derive the last formula for $\sum_{i=0}^{\infty} A^i$ ($0 < A < 1$) in the following manner. Let $S$ be the sum. Then
$$S = 1 + A + A^2 + A^3 + A^4 + A^5 + \cdots$$
Then
$$AS = A + A^2 + A^3 + A^4 + A^5 + \cdots$$
If we subtract these two equations (which is permissible only for a convergent series), virtually all the terms on the right side cancel, leaving
$$S - AS = 1$$
which implies that
$$S = \frac{1}{1-A}$$
We can use this same technique to compute $\sum_{i=1}^{\infty} i/2^i$, a sum that occurs frequently. We write
$$S = \frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \frac{5}{32} + \cdots$$
and multiply by 2, obtaining
$$2S = 1 + \frac{2}{2} + \frac{3}{4} + \frac{4}{8} + \frac{5}{16} + \cdots$$
Subtracting these two equations yields
$$2S - S = S = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$$
Thus, $S = 2$.

Another type of common series in analysis is the arithmetic series. Any such series can be evaluated from the basic formula:
$$\sum_{i=1}^{N} i = \frac{N(N+1)}{2} \approx \frac{N^2}{2}$$
For instance, to find the sum $2 + 5 + 8 + \cdots + (3k-1)$, rewrite it as $3(1 + 2 + 3 + \cdots + k) - (1 + 1 + 1 + \cdots + 1)$, which is clearly $3k(k+1)/2 - k$. Another way to remember this is to add the first and last terms (total $3k+1$), the second and next-to-last terms (total $3k+1$), and so on. Since there are $k/2$ of these pairs, the total sum is $k(3k+1)/2$, which is the same answer as before.

The next two formulas pop up now and then but are fairly uncommon.
$$\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6} \approx \frac{N^3}{3}$$
$$\sum_{i=1}^{N} i^k \approx \frac{N^{k+1}}{|k+1|} \quad k \neq -1$$
When $k = -1$, the latter formula is not valid. We then need the following formula, which is used far more in computer science than in other mathematical disciplines. The numbers $H_N$ are known as the harmonic numbers, and the sum is known as a harmonic sum. The error in the following approximation tends to $\gamma \approx 0.57721566$, which is known as Euler's constant.
$$H_N = \sum_{i=1}^{N} \frac{1}{i} \approx \log_e N$$
These two formulas are just general algebraic manipulations:
$$\sum_{i=1}^{N} f(N) = N f(N)$$
$$\sum_{i=n_0}^{N} f(i) = \sum_{i=1}^{N} f(i) - \sum_{i=1}^{n_0 - 1} f(i)$$

1.2.4 Modular Arithmetic
