Parsing - U-aizu.ac.jp

Transcription

Language Processing SystemsSyntax Analysis(Parsing)Prof. Mohamed HamadaSoftware Engineering Lab.The University of AizuJapan1.Uses Regular Expressions to define tokens2.Uses Finite Automata to recognize tokensnext charlexicalanalyzerget nextcharnext tokenSourceProgramParsingSyntaxanalyzerget nexttokenTop Down ParsingPredictive ParsingsymboltableLL(k) ParsingBottom Up ParsingShift-reduce ParsingLR(k) Parsing(Contains a recordfor each identifier)Left RecursionUses Top-down parsing or Bottom-up parsingLeft FactoringTo construct a Parse treeParsingTop Down ParsingPredictive ParsingLL(k) ParsingLeft RecursionA Predictive ParserBottom Up ParsingHow it works?1. Construct the parsing table from the given grammar2. Apply the predictive parsing algorithm to construct the parse treeShift-reduce ParsingLR(k) ParsingLeft FactoringTop-down parsers: starts constructing the parse tree at the top (root) of the treeand move down towards the leaves.Easy to implement by hand, but work with restricted grammars.Example: predictive parsers

A Predictive ParserA Predictive Parser2. Apply the predictive parsing algorithm to construct the parse tree1. Construct the parsing table from the given grammarThe following algorithm shows how we can construct the move parsing table for aninput string w with respect to a given grammar G.The following algorithm shows how we can construct the parsing table:set ip to point to the first symbol of the input string w repeatInput: a grammar Gif Top(stack) is a terminal or thenif Top(stack) Current-Input(ip) thenPop(stack) and advance ipelse nullelseif M[X,a] Xà Y1Y2 Yk thenbeginPop(stack);Push Y1; Y2; ; Yk onto the stack, with Y1 on top;Output the production Xà Y1Y2 YkendOutput: the corresponding parsing table MMethod: For each production A à α of the grammar do the following steps:1. For each terminal a in FIRST(α), add A à α to M[A,a].2. If λ in FIRST(α), add A à α to M[A,b] for each terminal b in FOLLOW(A).3. If λ FIRST(α) and in FOLLOW(A), add A à α to M[A, ]else nulluntil Top(stack) (i.e. the stack become empty)A Predictive Parser2. Apply the predictive parsing algorithm to construct the parse treeExampleE TE’E’ TE’ λT FT’T’ *FT’ λF ( E ) idGrammar:Set ipip toto pointpoint toto thethe firstfirst symbolsymbol ofof thethe inputinput stringstring w w Setrepeatif Top(stack) is a terminal or thenif Top(stack) Current-Input(ip) thenPop(stack) and advance ipelse else nullififM[X,a] M[X,a] XàXàYY1Y Ykkthenthen22 Y1YbeginPop(stack);Push YY11;; YY22; ; ;; YYkk ontoonto thethe stack,stack, withwith YY11 onon top;top;PushOutputthetheproductionproduction XàY1; Y21; ; Ykk ;OutputY2 Yendelse nulluntil Top(stack) (i.e.thestackbecomeempty)Top(stack) INPUT:NONINPUT SYMBOLTERMINALid *() E TE’E TE’EE’ TE’E’ λ E’ λE’T FT’T FT’TT’ λT’ *FT’T’ λ T’ λT’F idF (E)Fid id*ididid*OUTPUT: ENONTERMINALEE’TT’FidE TE’T FT’INPUT SYMBOL*(E TE’ E’ TE’T’ λT FT’T’ *FT’F idF (E)) E’ λE’ λT’ λT’ λE’A Predictive ParserOUTPUT: TPredictive ParsingProgramETE’ PARSINGTABLE:A Predictive ParserINPUT: ipSTACK:ParsingTable:idINPUT:id id*id OUTPUT:ESTACK:PARSINGTABLE:TPredictive ParsingProgramFTE’T’E’ NONTERMINALEE’TT’FidE TE’T FT’F id E’ TE’T’ λINPUT SYMBOL*(E TE’T’ *FT’T FT’F (E)FE’T’) E’ λE’ λT’ λT’ λESTACK:PARSINGTABLE:TPredictive ParsingProgramidTFE’T’E’ E’FT’idNONTERMINALEE’TT’FidE TE’T FT’F id E’ TE’T’ λINPUT SYMBOL*(E TE’T’ *FT’T FT’F (E)) E’ λE’ λT’ λT’ λ

A Predictive ParserA Predictive ParserAction when Top(Stack) input : Pop stack, advance input.INPUT:id id*idOUTPUT: INPUT:id id*idOUTPUT: ESTACK:Predictive ParsingProgramidFT’E’ ’FidE TE’T FT’ E’ TE’T’ λF idINPUT SYMBOL*(E TE’T’ *FT’T FT’F (E)) E’ λE’ λT’ λT’ λPredictive ParsingProgramE’T’E’ PARSINGTABLE:input string.LL(k) ParserThe grammarWhose PARSINGTABLE:NONTERMINALEE’TT’FidE TE’T FT’F idIs LL(1) grammar E’ TE’T’ λT’ *FT’T FT’F (E))T’ λF idT’ *FT’T FT’F (E)λ) E’ λE’ λT’ λT’ λE’T’*λFT’idλAn LL(k) parser looks k symbols ahead to decideits action.LL(1) A grammar whose parsing table has nomultiply-defined entriesLL(1) grammars enjoys several nice properties: for examplethey are not ambiguous and not left recursive.LL(1) A grammar whose parsing table has nomultiply-defined entriesE TE’E’ TE’ λT FT’T’ *FT’ λF ( E ) idINPUT SYMBOL*(E TE’T FT’ E’ TE’idLL(k) ParserLL(1) A grammar whose parsing table has nomultiply-defined entriesExample 1TFididE TE’INPUT SYMBOL*(E TE’T’This parser parses from left to right, and does aleftmost-derivation. It looks up 1 symbol ahead tochoose its next action. Therefore, it is known asa LL(1) parser.E’ NONTERMINALEE’TT’F E’ λE’ λT’ λT’ λE’FLL(k) ParserA Predictive ParserEThe predictive parser proceedsin this fashion emiting theTfollowing productions:FT’E’ TE’T FT’idλF idT’ * FT’F idWhen Top(Stack) input T’ λthe parser halts and accepts theE’ λTThe grammarExample 2S iEtSS aS’ eS λE FbWhose PARSINGTABLE:NONTERMINALaS aSbS’ λS’ eSS’EINPUT SYMBOLeitS iEtSS’E bIs NOT LL(1) grammar S’ λ

Bottom-Up ParsersParsingTop Down ParsingPredictive ParsingBottom Up ParsingShift-reduce ParsingLL(k) ParsingBottom-up parsers: build the nodes on the bottom of the parse tree first.Suitable for automatic parser generation, handle a larger class ofgrammars.Examples: shift-reduce parser (or LR(k) parsers)LR(k) ParsingLeft RecursionLeft FactoringGrammar HierarchyBottom-up ParsingNon-ambiguous CFG No problem with left-recursion Widely used in practice LR(1), SLR(1), LALR(1)CLR(1)LALR(1)LL(1)SLR(1)Bottom-up ParsingBottom-up Parsing1 (2) (3) Works from tokens to start-symbol Repeat:– identify handle - reducible sequence: non-terminal is not constructed but all its children have been constructedE (E) (3)E (3)EE (E)EE– reduce - construct non-terminal and updatestack Until reducing to start-symbolE E (E)E ii 0,1, 2, , 9E (2) (3)E1E (2E) (3)

Bottom-up ParsingBottom-Up Parser Is the following grammar LL(1) ?A bottom-up parser, or a shift-reduce parser, beginsat the leaves and works up to the top of the tree.E E (E)E i NOThe reduction steps trace a rightmost derivationon reverse.S aABeA Abc bB dConsider the Grammar:1 (2)1 (2) (3)We want to parse the input string abbcde. But this is a useful grammarBottom-Up ParserExampleINPUT:aProductionS aABeA AbcA bB dbbcde Bottom-Up ParserExampleOUTPUT:Bottom-Up ParserExampleProductionS aABeA AbcA bB daAbcde Bottom-Up ParsingProgramaProductionS aABeA AbcA bB dBottom-Up ParsingProgramINPUT:INPUT:bbcde Bottom-Up ParsingProgramOUTPUT:AbBottom-Up ParserExampleOUTPUT:AbINPUT:ProductionS aABeA AbcA bB daAbcde Bottom-Up ParsingProgramOUTPUT:AbWe are not reducing here in this example.A parser would reduce, get stuck and then backtrack!

Bottom-Up ParserExampleINPUT:aProductionS aABeA AbcA bB dAbcde Bottom-Up ParsingProgramBottom-Up ParserExampleOUTPUT:INPUT:ProductionS aABeA AbcA bB dAAbcbBottom-Up ParserExampleINPUT:aProductionS aABeA AbcA bB dAde Bottom-Up ParsingProgramOUTPUT:ABe de Bottom-Up ParsingProgramOUTPUT:AAbAABbc daProductionS aABeA AbcA bB dABe Bottom-Up ParsingProgramOUTPUT:AABbc dbBottom-Up ParserExampleOUTPUT:INPUT:S OUTPUT:SProductionS aABeA AbcA bB dBottom-Up ParsingProgramcbINPUT:Bottom-Up ParserExampleaABottom-Up ParserExamplebINPUT:aaAbSABbc deProductionS aABeA AbcA bB dBottom-Up ParsingProgramaAABbc dbThis parser is known as an LR Parser becauseit scans the input from Left to right, and it constructsa Rightmost derivation in reverse order.e

Bottom-Up ParserExampleThe scanning of productions for matching withhandles in the input string, and backtracking makesthe method used in the previous example veryinefficient.Can we do better?See next lecture

1. Construct the parsing table from the given grammar The following algorithm shows how we can construct the parsing table: Input: a grammar G Output: the corresponding parsing table M Method: For each production A à α of the grammar do the following steps: 1. For each terminal a in FIRST(α), add A à α to M[A,a]. 2.