Query Evaluation Techniques For Large Databases

Transcription

Query EvaluationTechniquesfor Large DatabasesGOETZ tment,P. O. Box751,Portland,Oregon97207-0751Database management systems will continue to manage large data volumes. Thus,efficient algorithms for accessing and manipulating large sets and sequences will berequired to provide acceptable performance. The advent of object-oriented and extensibledatabase systems will not solve this problem. On the contrary, modern data modelsexacerbate the problem: In order to manipulate large sets of complex objects asefficiently as today’s database systems manipulate simple records, query processingalgorithms and software will become more complex, and a solid understanding ofalgorithm and architectural issues is essential for the designer of database managementsoftware.This survey provides a foundation for the design and implementation of queryexecution facilities innew database management systems. It describes awide array ofpractical query evaluation techniques for both relational and postrelational databasesystems, including iterative execution of complex query evaluation plans, the duality ofsort- and hash-based set-matching algorithms, types of parallel query execution andtheir implementation, and special operators for emerging database application domains.Categories and Subject Descriptors: E.5 [Data]:Systems—query processingFiles; H.2.4 [DatabaseManagement]:General Terms: Algorithms, PerformanceAdditional Key Words and Phrases: Complex query evaluation plans, dynamic queryevaluation plans; extensible database systems, iterators, object-oriented databasesystems, operator model of parallelization, parallel algorithms, relational databasesystems, set-matching algorithms, sort-hash dualityINTRODUCTIONEffectiveand efficientmanagementoflarge data volumesis necessaryin virtually all computerapplications,from business dataprocessingto cationswithimagesandsound,computer-aideddesign and manufacturing, real-timeprocess control,and msarestandardtools in businessdata processing,theyare only slowlybeing introducedto allthe other emergingdatabaseapplicationareas.In most of these new applicationdomains,databasemanagementsystemshave traditionallynot been used for tworeasons. First, restrictivedata definitionand manipulationlanguagescan makeapplicationdevelopmentandmaintenance unbearablycumbersome.Researchintosemanticand object-orienteddatamodels and into persistentdatabaseprogramminglanguageshas been addressing this problemand will eventuallyleadto acceptablesolutions.Second, data vol-Permission to copy without fee all or part of this material is granted provided that the copies are not madeor distributed for direct commercial advantage, the ACM copyright notice and the title of the publicationand its date appear, and notice is given that copying ie by permission of the Association for ComputingMachinery. To copy otherwise, or to republish, requires a fee and/or specific permission.@ 1993 ACM 0360-0300/93/0600-0073 01.50ACM Computing Surveys,Vol. 25, No. 2, June1993

GUNI tNT”SINTRODUCTION1 ARCHITECTUREOF QUERY EXECUTIONENGINES9-, SORTING AND HASHING2.1 Sorting2.2 Hashing3. DISK ACCESS31 File Scans32 AssociativeAccess Using Indices3.3 Buffer Management4 AGGREGATIONAND DUPLICATEREMOVAL41 AggregationAlgorithmsBased on NestedLoops4,z Aggrcgat onAlgorithmsBased on S(,rtlng43 AggregationAlgorithmsBased on Hashing44 ARoughPerformanceC’omparlson45 AddltlonalRemarks on AggregationMATCHINGOPERATIONS5 BINARY51 Nested-LoopsJom Algorithms52 Merge-JoinAlgorithms53 Hash Join AIgorlthms54 Pointer-BasedJoins55 ARoughPerformanceComparisonQuANTIFICATION6 UNIVERSAL7 DUALITYOF SORT- AND HASH-BASEDQUERY PROCESSINGALGORITHMS8 EXECUTIONOF COMPLEXQUERY PLANS9 MECHANISMSFOR PARALLELQUERYEXECUTION91 Parallel versus DlstrlbutedDatabaseSystemsg rm fp ralle]lsm9.3 ImplementationStrategies94 Load Balancingand Skew95 Arcbltecturesand ArchitectureIndependence10 PARALLELALGORITHMS10.1 Parallel Selections and Updates102 Parallel SOrtmg103Parallel Aggregationand DuphcateRemoval10.4 Parallel JolnsandOtber Binary Matcb, ngOperations105 Parallel UniversalQuantlficatlon11 NONSTANDARDQUERY PROCESSINGALGORITHMS11 1 Nested Relations112 TemPoral and Scientific DatabaseManagement11.3 Object-orientedDatabase Systems114 More Control Operators12 ALJDITIONALTECHNIQUESFORPERFORMANCEIMPROVEMENT121 Precomputatlonand Derived Data122 Data Compression12.3 SurrogateProcessing124 B,t Vector F,lter, ng12.5 Specmllzed HardwareSUMMARYAND OUTLOOKACKNOWLEDGMENTSREFERENCESumes mightbe so large or complexthatthe real or perceivedperformanceadvantage of file systemsis consideredmoreimportantthan all other criteria,e.g., thehigher levels of abstractionand programmer temsthat are designedfor bledatabasemanagementsystemtoolkitsthatsupporta varietyof datamodelsmustprovideexcellentperformanceto meet the challengesof verylarge data volumes,and techniquesformanipulatinglargedata sets willfindrenewedand increasedinterestin thedatabasecommunity.The purposeof this paper is to surveyefficientalgorithmsand softwarearchitecturesof databasequery executionengines for executingcomplexqueriesoverqueryislargedatabases.A “complex”one thatrequiresa numberof queryprocessingalgorithmsto work together,databaseuses files withand a “large”sizes fromseveralmegabytesto manyterabytes,which are typicalfor databaseapplicationsat presentand in the nearfuture[Dozier1992; Silberschatzet al.1991]. This survey discusses a large variety of queryexecutiontechniquesthatmust be consideredwhen designingandimplementingthe query executionmodule of a new databasemanagementsystem: algorithmsand their source allocationand schedulingissuesin complex queries,special h as statisticaland scientificdatabases, and generalperformance-enhancing techniquessuch as precomputationand compression.Whilemany,althoughnot all, techniquesdiscussedin this paper have been developedin the context ofrelationaldatabasesystems,mostofthem are applicableto and useful in thequery processingfacilityfor any databasemanagementsystem and any data model,providedthe data model permitsqueriesover “bulk”data types such as sets andlists.

QueryUser InterfaceFDatabase Query LanguageQuery OptimizerQuery Execution EngineFiles and Indices1/0 BufferDiskFigure 1.Query processing in a database system,It is assumed that the reader possessesbasictextbookknowledgeof databasequerylanguages,in particularof relationalalgebra,and of file systems,includingsome basic knowledgeof indexstructures.As shown in Figure1, queryprocessingfills the gap betweendatabasequery languagesand file systems. It canbe dividedinto queryoptimizationandqueryexecution.A queryoptimizertranslatesa query expressedin a highlevel query languageinto a sequenceofoperationsthat are implementedin thequery executionengine or the file system.The goal of query optimizationis to finda query evaluationplan that minimizesthe most relevantperformancemeasure,which can be the databaseuser’s wait forthe first or last resultitem,CPU, 1/0,and networktime and effort(timeandeffortcan differdue to parallelism),memorycosts (as maximumallocationoras time-spaceproduct),total resource usage, even energyconsumption(e.g., forbattery-poweredlaptopsystems or spacecraft), a combinationof the above, or someother performancemeasure.Query optimizationis a specialform of planning,employingtechniquesfrom artificialintelligencesuch as hms,etc. The query executionengineis a collectionof queryexecutionoperatorsand gsystems,networks,and paralleland distributedcomputation.The facilitiesof the niques 75possibleplansthatcan be chosenbythe auerv o timizer,A en ra outlineof the steps requiredforprocessinga databasequeryareshownin Fimu-e2. Of course.this sequence is only a generalguideline,anddifferentdatabasesystemsmay use different steps or merge multiplesteps intoone. Aftera query or requesthas beenenteredinto the databasesystem,be itinteractivelyor by an applicationprogram, the query is parsed into an internal form.Next,the queryis validatedagainstthe metadata(dataaboutthedata, also called schema or catalogs)toensure that the query containsonly validreferencesto existingdatabaseobjects. Ifthe databasesystemprovidesa macrofacilitysuch as relationalviews,referenced macrosand viewsare expandedinto the query[ Stonebraker1975]. Integrityconstraintsmight be expressedasviews (externallyor internally)and wouldalso be integratedinto the query at thispoint in most systems [Metro1989]. Thequery optimizerthen maps the expandedquery expressioninto an optimizedplanthatoperatesdirectlyon thestoreddatabaseobjects.This mappingprocesscan be very complexand mightrequiresubstantialsearchand cost estimationeffort.(O timizationis not discussedinthis pape ; a survey can be found in Jarkeand Koch [1984 ].) The optimizer’soutputis called a queryexecutionplan, queryevaluationplan,QEP, or simplyplan.Using a simple tree traversalalgorithm,this plan is translatedinto a representation readv for executionbv the database’squery ex cutionengine; t e result of thistranslationcan be compiledmachinecodeor a semicompiledor interpretedlanguage or data structure.Thissurveydiscussesonly read-onlyqueriesexplicitly;however,most of thetechniquesare also applicableto updaterequests.In most databasemanagementsystems,updaterequestsmay ts are to be modified.Standardqueryoptimizationand executiontechniquesapply to this search; the actual updateprocedurecan be eitherap-ACMComputingSurveys,Vol. 25, No. 2, June1993

76*Goetz GraefeParsingLQuery ValidationLView ResolutionLOptimizationLPlan CompilationLFigure 2.Query processing steps.plied in a second phase, a methodcalleddeferredupdates,or mergedintothesearchphase if thereis no dangerofcreatingambiguousupdate semantics. 1The problemof ensuringACID ,Isolated(fromotherqueries and requests),and Durable(persistent across all failures)—isbeyond thescope of this paper;suitabletechniqueshave been describedby many other authors,e.g.,BernsteinandGoodman[1981], Bernsteinet al. [1987], Gray andReuter[1991],and Haerderand Reuter[1983].Most researchinto providingACID semanticsfocuses on efficienttechniquesfor s.For example,increasingthe balance of one account anddecreasingthe balance of anotheraccountrequireexclusiveaccessto onlytwodatabaserecordsandwritingsomeinformationto an updatelog. rocessingtargethundredsand even thousandsof small transactionsper second[Davis1992;Serlin1991].—1A standard example for this danger is the “Halloween” problem: Consider the request to “give allemployees with salaries greater than 30,000 a 3%raise.” If (i) these employees are found using anindex on salaries, (ii) index entries are scanned inincreasing salary order, and (iii ) the index is updated immediately as index entries are found, theneach qualifying employee will get an infinite number of raises.ACMComputingSurveys,Vol25, No2, JuneExecution1993Query processing,on the other hand, focuses on extractinginformationfrom alargeamountof data tingreportsfor each branchofficewith average salariesof employeesunder30 years old requiresshared access to alarge numberof records. Mixed requestsarealsopossible,e.g., forcreditingmonthlyearningsto a stock accountbycombininginformationabouta numberof sales transactions.The techniquesdiscussed here apply to the search effort forsuch a mixed request,e.g., for findingtherelevantsales transactionsfor each stockaccount.Embeddedqueries,i.e.,databasequeries that are containedin an application programwrittenin a standardprogramminglanguagesuch as Cobol, PL/1,C, or Fortran,are also not addressedspecificallyin thispaperbecausealltechniquesdiscussedhere can be usedforinteractiveas wellas when the programis compiled,in order to avoid the as pioneeredin System R, includingmechanismsforstoringoptimized plans and invalidatingstored planswhen they become infeasible,e.g., whenan indexis droppedfrom the database[Chamberlainet al. 1981b]. Of course, thecut betweencompile-timeand run-timecan be placed at any other point in thesequence in Figure2.Recursivequeries are omittedfrom thissurvey, because the entirefield of recursivequeryprocessing—optimizationrules and heuristics,selectivityand cost

Queryestimation,algorithmsand ffice it to point to two recent surveysby Bancilhonand Ramakrishnan[1986]and Cacace et al. [1993]).The presentpaper surveysquery executiontechniques;othersurveysthatpertainto the wide subjectof databasesystems have considereddata models andquerylangaages[Gallaireet al. 1984;Hull and King 1987; Jarke and Vassiliou1985;McKenzieandSnodgrass1991;Peckhamand Maryanski1988],accessmethods[Comer1979; Enbodyand Du1988;Faloutsos1985;Samet1984;Sockut and Goldberg1979], compressiontechniques[Bell et al. 1989; LelewerandHirschberg1987],distributedand heterogeneoussystems[Batiniet al. 1986;Litwinet al. 1990; Shethand Larson1990; Thomaset al. 1990], 91;Bernsteinand Goodman1981; Grayet al. 1981; HaerderandReuter1983; Knapp1987],availabilityand reliability[Davidsonet al. 1985; Kim1984],queryoptimization[JarkeandKoch 1984; Manninoet al. 1988; Yu andChang1984],and a varietyof otherdatabase-relatedtopics [Adam and Wortmann1989; Atkinsonand Bunemann1987; Katz 1990; Kemperand Wallrath1987; Lyytinen1987; Teoroy et al. 1986].Bittonet al. [1984]havediscussedanumberof parallel-sortingtechniques,only a few of whichare reallyused inMishraandEich’sdatabasesystems.[1992] recent survey of relationaljoin dfrom one by Kitsuregawa et al. [1983] and also describes joinmethodsusing index structuresand joinmethodsfor distributedsystems.Thepresentsurvey is much broaderin scopeas it also considerssystem architecturesfor complexquery plans and for parallelexecution,selectionand aggregationalgorithms,the relationshipof sortingandhashingas it pertainsto databasequeryprocessing,special operationsfor nontraditionaldata models, and auxiliarytechniques such as compression.Section1 discusses the uationTechniques“77hashing,the two generalapproachestomanagingand matchingelementsof largesets, are describedin Section 2. Section 3focuses on accessinglarge data sets ondisk. Section4 begins the discussionofactualdata manipulationmethodswithalgorithmsfor aggregationand duplicateremoval,continuedin Section 5 with binarymatchingoperationssuch as joinand intersectionand in Section6 withoperationsfor universalquantification.Section 7 reviewsthe manv dualitiesbetweensortingand hashi gand pointsout their differencesthat have an impacton the performanceof algorithmsbasedon either one of these approaches.Execution of very complexqueryplanswithmany operatorsand with nontrivialplanshaues is discussedin Section 8. Section9 is’ devotedto mechanismsfor algorithms.Section11 outlinessome nonstandardoperatorsfor emergingdatabaseapplicationssuch as statisticaland scientificdatabasemanagementsystems.Section12 is a potpourriof additionaltechniquesthatenhancethe performanceof manyalgorithms,e.g., compression,precomputation,and specializedhardware.The final section containsa brief summarv . andan outlookon query processingresearchand its future.For readers who are more interestedinsome tonics thanothers.most sectionsare fair y self-contained.’Moreover,thehurriedreadermay wantto skip the;2 theirrederivationof cost functionssults and effects are summarizedlater indiagrams.1. eson usefulmechanisms for processingsets of items. Theseitems can be records,tuples,entities,orobjects.Furthermore,most of the tech-2 In any case, our cost functions cover only a limited, though Important, aspect of query executioncost, namely 1/0 effort.ACMComputingSurveys,Vol25, No 2, June1993

78“Goetz Graefeniquesdiscussedin this survey apply tosequences,not only sets, of items,althoughmost queryprocessingresearchhas assumed relationsand sets. All queryprocessingalgorithmimplementationsiterate over the membersof theirinputsets; thus,sets are alwaysrepresentedby sequences.Sequencescan be used torepresentnot only sets but also otherone-dimensional“bulktypessuchaslists, arrays,and time series, and manydatabasequeryprocessingalgorithmsand techniquescan be used to manipulate these otherbulktypes as well assets. The importantpoint is to thinkofthesealgorithmsas algebraoperatorsconsumingzero or more inputs(sets orsequences)and producingone (or sometimesmore) outputs.A completequeryexecutionengineconsistsof a collectionof operatorsand mechanismsto executecomplex expressionsusing multipleoperators, includingmultipleoccurrencesofthe same operator.Taken as a whole, thequery processingalgorithmsform an algebra which we call the physicalalgebraof a databasesystem.The physicalalgebrais equivalentto,but quite differentfrom, the logical algebra of the data modelor the databasesystem.Thelogicalalgebrais morecloselyrelatedto the datamodelanddefines what queries can be expressedinthe data model;for example,the relationalalgebrais a logicalalgebra.Aphysicalalgebra,on the other the same data model and thesame logicalalgebrabut may use verydifferentphysicalalgebras.For example,while one relationalsystem may use ested-loopsjoinandmerge-join,whilea thirdone may relyentirelyon hash joinalgorithms.(Joinalgorithmsare discussedin detaillaterin the section on binarymatchingoperators and icaland physicalalgebrasis the factthatspecificalgorithmsand thereforecost functionsare associatedonly withphysicaloperators,not with logical alge-ACMComput,ngSurveys,V.125, No. 2, June1993bra operators.Because of the lack of analgorithmspecification,a logical algebraexpressionis not directlyexecutableandmust be mappedinto a physicalalgebraexpression.For example,it is impossibleto determinethe executiontime for theleft expressionin Figure3, i.e., a logicalalgebraexpression,withoutmappingitfirstinto a physicalalgebraexpressionsuch as the query evaluationplan on therightof Figure3. This mappingprocesscan be trivialin some databasesystemsbutusuallyis fairlycomplexin realdatabasesvstems because it involvesalgorithmc oices and because logicalandphysicaloperatorsfrequentlydo not mapdirectlyinto one another,as shown in thefollowingfour examples.First,some operatorsin the physicalalgebramay implementmultiplelogicaloperators.Forexample,all seriousimplementationsofrelationaljoin algorithmsincludea facility to outputfewer thanall attributes,i.e., a relationaldelta-project(a projectionwithoutdudicateremoval)is included in the ph sical join operator.Second, some physicaloperatorsimplementonly part of a logical operator.For example, a duplicateremovalalgorithmimplementsonly the “secondhalf”of a ratorsdo not exist in thelogicalalgebra.Concretely,a sort operator has no place in pure relationalalgebra because it is an algebraof sets. andsets are, by theirdefi ition,unordered.Finally,some propertiesthatholdforlogical operatorsdo not hold, or only withsome qualifications,for the counterpartsin physicalalgebra.For example,whileintersectionand union are entirelysymmetricand commutative,algorithmsimplementingthem(e.g., nestedloops orhybridhash join) do not treat their twoinputsequally.The differenceof logicaland physicalalgebrascan also be looked at in a different way. Any databasesystem raises thelevelof abstractionabovefilesandrecords;to do so, there are some logicaltype constructorssuch as tuple, relation,set, list, array, pointer,etc. Each logicaltypeconstructoris complementedby

QueryEvaluationMerge-JoinIntersection/\Set ASet BFigure 3.Techniques 79(Intersect)/\sortsortIFile Scan AFile Scan BILogical and physical algebra expressions.some operationsthatare permittedoninstancesof such types,e.g., tc.On the physicalor representationlevel,there is typicallya smallerset of representationtypes and structures,e.g., file,record, record identifier(RID), and maybevery large byte arrays [Carey et al. 1986].Formanipulation,therepresentationtypes have theirown operations,whichwill be differentfrom the operationsonlogicaltypes. Multiplelogicaltypes andtype constructorscan be mappedto thesame physicalconcept. They may also besituationsin which one logical type constructorcan be mapped to multiplephysical concepts, e.g., a set dependingon itssize. The mappingfrom logicaltypes tophysicalrepresentationtypes and structures is called physicaldatabasedesign.Query optimizationis the mappingfromlogicalto physicaloperations,and thequeryexecutionengineis the implementationof operationson physicalrepresentationtypesandof mechanismsfor coordinationand cooperationamongmultiplesuch operationsin complex queries. The policiesfor using these mechanisms are part of the query optimizer.Synchronizationand data transferbetween operatorsis the main issue to beaddressed in the architectureof the queryexecutionengine.Imaginea query withtwo joins, and considerhow the result ofthe first join is passed to the second one.The simplestmethodis to create (write)and read a temporaryfile. The need fortemporaryfiles, whetherthey are kept inthe bufferor not, is a directresultofexecutingan operator’sinputsubplanscompletelybefore startingthe operator.Alternatively,it is possible to create oneprocess for each operatorand then to useinterprocesscommunicationmechanisms(e.g., pipes) to transferdata betweenoperators,leavingit to the operatingsystem to scheduleand suspendoperatorprocessesas pipesare fullor empty.Whilesuchdata-drivenexecutionremoves the need for temporarydisk files,it introducesanothercost, that of operating systemschedulingand interprocesscommunication.In orderto avoid bothtemporaryfilesand operatingsystemscheduling,Freytagand rogramsthattransforma plan represented as a tree structureinto a singleiterativeprogramwith nested loops andother control structures.However,the requiredrule set is not simple,in particular for algorithmswithcomplexcontrollogic such as sorting,merge-join,or evenhybridhash join (to be discussed later inthe section on matching).The most practicalalternativeis to implementall operatorsin such a way thatthey schedule each other withina singleoperatingsystem process. The basic ideais to define a granule,typicallya singlerecord,and to iterateover all granulescomprisingan intermediatequery result.3Eachtimean operatorneedsanothergranule,it calls its input(operator)toproduceone. This call is a simplepro-—3 It is possible to use multiple granule sizes withina single query-processing system and to providespecial operators with the sole purpose of translating from one granule size to another. An example Ma query processing system that uses records as aniteration granule except for the inputs of merge-join(see later in the section on binary matching), forwhich it uses “value packets,” i.e., groups of recordswith equal join attribute values.ACMComputmgSurveys,Vol. 25, No. 2, June1993

80Goetz Graefe cedurecall, muchcheaperthaninterprocess communicationsince it does notinvolve the operatingsystem. The callingoperatorwaits(just as any callingroutine waits)untilthe inputoperatorhasproducedan item. That inputoperator,in a complexquery plan, mightrequirean item from its own input to produce anitem; in that case, it calls its own input(operator)to produce one. Two importantfeaturesof operatorsimplementedin thisway are that they can be combinedintoarbitrarilycomplexqueryevaluationplans and that any numberof operatorscan execute and scheduleeach other in asingle process withoutassistancefrom orinteractionwiththe underlyingoperating system. This model of operatorimplementationand schedulingresemblesveryclosely those used in relationalsystems,e.g., SystemR (and laterSQL/DSandDB2),Ingres,Informix,and Oracle,aswell as in experimentalsystems, e.g., theE programminglanguageused in EXODUS [Richardsonand Carey 1987], Genesis [Batoryet al. 1988a;1988 b], andStarburst[Haas et al. 1989; 1990]. Operatorsimdementedin elines,row-sources,or similarnamesin the “lingo”of commercialsystems.To make the implementationof operators a littleeasier,it makessense toseparatethe functions(a) to prepareanoperatorfor producingdata, (b) to produce an item,and (c) to performfinalhousekeeping.In a file scan, these functionsare calledo en.next.and closeprocedures;we adopt these names for alloperators.Table 1 gives a rough idea ofwhatthe open, next, and close procedures for some operatorsdo, as well asthe principallocal state that needs to besaved from one invocationto the next.(Latersections will discuss sort and joinoperationsin detail. ) Thefirstthreeexamplesare trivial,but the hash joinoperatorshowshowan operatorcanscheduleitsinputsina nontrivialmanner.Theinterestingobservationsare that (i) the entirequery plan is executed withina single process, (ii) operatorsproduceone itemat a timeonLACMComputingSurveys,Vol25, No2. June1993request,(iii)this modeleffectivelyimplements,withina -drivendata flow. (iv) itemsneverwaitin a temporary‘file or bufferbetween operatorsbecause they are neverm-educedbeforethevare needed.(v) hereforethis model “& very efficientinitstime-space-productmemorycosts,(vi) iteratorscan scheduleanv tree. includingbushytrees(see below),‘(vii)no operatoris affectedby the complexity of the wholeplan,i.e., thismodelof operatorimplementationandsynchronizationworksfor sim leas wellas very complexquery plan;.As a finalremark,there are effectiveways to combinethe iteratormodelwith arallelquery processing,as will be disc ssedinSection 9.Since query plans are algebraexpressions, they can be representedas trees.Query plans can be divided into prototypical shapes, and query executionenginescan be dividedinto groupsaccordingtowhich shapes of plans they can evaluate.Figure4 showsprototypicalleft-deep,right-deep,and bushy plans for a join ntbecausejoinalgorithmsuse theirtwo inputsin different ways; for example,in the nested-loopsjoinalgorithm,the outerloop iteratesover one input(usuallydrawnas leftinput)while the inner loop iteratesoverthe other input. The set of bushy plans isthe most general as it includesthe sets ofbothleft-deepandright-deepplans.These names are taken from Graefe andDeWitt[ 1987]; left-deepplansare alsocalled“linearprocessingtrees”or “plans[Krishnamurthyet al. 1986]withno compositeinner”[OnoandLehman1990].For querieswithcommonsubexpressions, the query evaluationplan is not atree but an acyclic directedgraph (DAG).Mostsystemsthatidentifyand exploitcommon subexpressionsexecute the planequivalentto a commonsubexpressionseparately,savingthe intermediateresult in a temporaryfile to be scannedrepeatedlyand destroyedafter the last

QueryTable 1.IteratorOpenPrintopenScanopen fileSelectopenHash niquesLocal StateNextClosecall next on input;format the item onscreenclose inputclose filecall next on inputuntil an itemqualifiescloseallocate hashdirectory; open left“build” input; buildhash table callingnext on build input;close build input;open right “probe”inputcall next on probeinput until a match isfoundclose probe input;deallocate hashdirectoryMerge-Join(withoutduplicates)open both inputsget next item frominput with smallerkey until a match isfoundclose both inputsSortopen input; build allinitial run filescalling next on input;close input; mergerun files untd onlyone merge step is leftdetermine nextoutput item; readnew item from thecorrect run filedestroy remainingrun filesopen file descriptorinputhash directorymerge heap, open filedescriptors for run filesJoin A-BJoin C-DJo::@A81Examples of Iterator Functionsread next iteminput :fi:‘m:.-.BcFigure 4.DLeft-deep, bushy, and right-deep plans.scan. Each plan fragmentthatis executedas a unitis indeeda tree. Thealternativeis a “split”iteratorthat candeliverdata to multipleconsumers,i.e.,that can be invokedas iteratorby multiple consumeriterators.The split iteratorpaces its inputsubtreeas fast as thefastestconsumerrequiresit and holdsitemsuntilthe slowestconsumerhasconsumedthem.If the consumersrequest data at about the same rate, thesplit operatordoes not requirea temporary spool file; such a file and its associated 1/0cost are requiredonly if thedata rate requiredby the consumersdi-verges above some predefinethreshold.Amongthe implementationsof iterators for query processing,one group canbe called“stored-setorientedand theother “algebraoriented.”In System R, anexamplefor the first group, complex joinplans are constructedusing binaryjoiniteratorsthat“attach”one moreset(stored relation)to an existingin

database management systems. Thus, object-oriented database management systems that are designed for nontradi-tional database application domains and extensible database management system toolkits that support a variety of data models must provide excellent perfor-mance to meet the