CS453 Colorado State University ========================================== Top-Down Predictive Parsers ========================================== Goals for Today - Recall predictive parsing - when it works - when it doesn't - removing left recursion - left factoring - General error recovery for predictive parsers - General parsing versus top-down predictive - complexity - requirements on the grammar to be recursive descent parsed - Studying for the midterm ----------------------------- Predictive Parser Table A predictive parser is a recursive descent parser that does not require backtracking. Using the FIRST and FOLLOW sets we can construct a parse table that for each pairing of nonterminal and terminal indicates the relevant production rule. ----------------------------- Error Recovery Goals - Provide program with a list of as many errors as possible - Provide USEFUL error messages - appropriate line and position information - guidance for fixing the error - Avoid infinite loops or recursion - Add minimal overhead to the processing of correct programs Insert tokens? (done by some compilers, e.g. ";") Usually: skip tokens, or give up after x errors (to avoid avalanching) Approaches - Stop after first error - Panic mode, skip tokens until "synchronizing token" encountered (1) in function A() use FOLLOW(A) as set of synchronizing tokens (2) might want to also include start of other statement constructs (3) could put FIRST(A) in synchronizing tokens and restart parsing A (4) Use all tokens other than current error token as synchronizing set. ------------------------------------- Panic mode using heuristic (1) Predictive Parser - each nonterminal requires a function definition - the function for nonterminal X should have one clause for each possible production rule for X. A clause includes a case for every character in the FIRST set for the rhs of the production, each character in the FOLLOW set if the rhs is nullable, and calls to match the tokens or calls to other nonterminal functions to process the rhs of the production. - For panic mode, match tokens until get to follow of nonterminal currently parsing // write panic method for each nonterminal panic_nonterminal( ) { print error; while ( scan() not in (FOLLOW(nonterminal) union {EOF}) ) { } } Float Assignments Grammar S -> StmtList EOF Stm -> id ASSIGN float StmList -> Stm StmList -> epsilon What is nullable, FIRST, FOLLOW for each nonterminal? non terminal nullable FIRST FOLLOW -------------------------------------------------------- Stm no id id, EOF StmList yes id EOF S no id, EOF none - (slide 3) show the predictive parser code that has been modified to have match(tok) throw an exception and has try and catch blocks around the processing of each rhs of production rule ----------------------------- Predictive Parsing Complexity grammar classes LL(k) - left-to-right scan, left-most derivation, k tokens of lookahead General parsing versus top-down predictive - complexity of general parsing O(N^3), where N is number of tokens Earley parser does a form of dynamic programming: O(n^3) in the general case O(n^2) for unambiguous grammars O(n) for most LR(k) grammars http://en.wikipedia.org/wiki/Earley_parser - complexity of predictive parsing is O(N) - requirements on the grammar - for all the productions for nonterminal A - none of the FIRST(rhs) for A production rules can overlap' [-> students if they do what do we do?] - if nullable(A)==true, then FOLLOW(A) must not overlap with FIRST(rhs) for any A->rhs ----------------- Notes about studying for the midterm --------------------------------- mstrout@cs.colostate.edu, 1/25/10 wim bohm@cs.colostate.edu, 1/29/12 mstrout@cs.colostate.edu, 3/5/13