CS453 Colorado State University
==========================================
Context-Free Grammars and Top-Down Parsing
==========================================

class announcements
    - recitation this week will be about svn; you will also need to show
      a lexer that can handle about half the tokens in MiniSVG (you can
      provide your own input file)
    - reading that was due today: Ch. 2.1, 2.2, 4 thru 4.2
    - reading for next week: Ch. 2.3 thru 2.5, 4.3, 4.4

---------------
Goals for today and Tuesday
    - Learn context-free grammars and associated terminology
    - Relationship between parse trees and context-free grammars
    - Syntax-directed translation
    - Top-down predictive parsing, an initial overview

--------------------
Context-free grammars

    -useful because we can express programming languages with
     context-free grammars
        - show grammar for MiniSVG
        - show grammar for MeggyJava (webpage, MeggyJava grammar)
        - CF grammars for most languages: SSL, python bytecode,
          Java bytecode, etc.

    -why not just use regular expressions to describe programming languages?
        balanced parentheses example
        (regular expressions cannot match arbitrarily deep nesting)

----------
vocabulary

    terminal         "characters" of the grammar alphabet, in our case tokens
    nonterminal      represents a set of strings made up of terminals
    symbol           a terminal, nonterminal, or epsilon
    production       has a nonterminal on the left side and a string of
                     terminals and nonterminals, or epsilon, on the right
                     side; also called a rewrite rule
    start symbol     a nonterminal from which derivations begin
    epsilon          a special symbol denoting the empty string
    derivation       step-by-step replacement of nonterminals with the
                     right-hand sides of productions
    sentential form  a string of symbols derived from the start symbol
                     of a grammar

------------
parse trees (slides 14 through 21)

    create them directly from derivations
    -root is the start symbol
    -each nonterminal in the derivation is an internal node
    -each leaf is a terminal or epsilon
    -each production links the left-hand-side nonterminal with each of
     the symbols on the right-hand side

    Ambiguity (slide 22)
    No ambiguity (slide 23)

    show derivation for 42 + 7 * 6

        expr1 -> expr2 + expr3                 used production (1)
              -> expr2 + expr4 * expr5         used production (2)
              -> NUM(42) + NUM(7) * NUM(6)     used production (3), 3 times

    --> have students do both derivations and parse trees

    SUGGESTED EXERCISE: Is the MiniSVG grammar ambiguous?

    -We use parse trees to do syntax-directed translation or interpretation...
        1) translate from one language to another,
        2) create a data structure that represents the program, or
        3) evaluate the program like an interpreter.

---------------------------
syntax-directed translation or interpretation (slides 25-27)

    -associate an action or rule with each production; can think of the
     action as being one more entity on the right-hand side of the
     production

--------------------------------------
How this is related to parsing MiniSVG
--------------------------------------

    -> show example MiniSVG parse tree (slide ?)
    - syntax-directed translation corresponds to rendering objects to the
      screen and printing to the log
    - The trick is that the parse tree will be implicit.  We won't actually
      be constructing an explicit data structure.

Example: Using parse trees to evaluate expressions (slide 27)

    -semantics of Exp --> (Stm, Exp) is that Stm is executed and then
     the lhs Exp gets the value of the rhs expression
        -like the C comma operator

    -Do a depth-first, post-order traversal of nodes in the parse tree and
     map each expression node to a number and each statement node to a
     side effect.

(slide 28) Two possible parse trees for 42 + 7 * 6
    -> do interpretation of both
    -> which is correct?

---------------------------
parsing (slides 29-43)

    - naive algorithm and its complexity
    - need for O(N) solutions
    - top-down, predictive parsing example
    - Next time we will be covering top-down, predictive parsing in detail

------------------------
mstrout@cs.colostate.edu, 1/27/10
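
------------------------
supplementary sketch: predictive parsing with an implicit parse tree

    A minimal sketch tying together the evaluation and top-down parsing
    items above. It is not the grammar from the slides. It assumes a
    two-level, unambiguous expression grammar (expr over '+', term over
    '*'), and the names tokenize, Parser, and evaluate are hypothetical.
    Each parsing method corresponds to one nonterminal and picks its
    production from a single token of lookahead. No tree is built: the
    call stack plays the role of the parse tree, and each method returns
    its value only after its children have been evaluated (post-order).

```python
import re


def tokenize(text):
    """Turn the input into (kind, value) pairs, ending with an EOF marker."""
    tokens = []
    for lexeme in re.findall(r"\d+|\S", text):
        if lexeme.isdigit():
            tokens.append(("NUM", int(lexeme)))
        elif lexeme in "+*":
            tokens.append((lexeme, None))
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    tokens.append(("EOF", None))
    return tokens


class Parser:
    """Assumed grammar (not taken from the slides):
         expr -> term ('+' term)*
         term -> NUM  ('*' NUM)*
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # One token of lookahead is enough to choose a production.
        return self.tokens[self.pos][0]

    def match(self, kind):
        actual, value = self.tokens[self.pos]
        if actual != kind:
            raise SyntaxError(f"expected {kind}, found {actual}")
        self.pos += 1
        return value

    def expr(self):
        value = self.term()
        while self.peek() == "+":
            self.match("+")
            value += self.term()      # action attached to the '+' production
        return value

    def term(self):
        value = self.match("NUM")
        while self.peek() == "*":
            self.match("*")
            value *= self.match("NUM")  # action attached to the '*' production
        return value


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    parser.match("EOF")               # reject trailing input
    return value
```

    With this grammar, evaluate("42 + 7 * 6") returns 84, the value of
    the parse tree that groups 7 * 6 first. The other tree from the
    ambiguous grammar, (42 + 7) * 6, would evaluate to 294. Each token is
    examined once, so the running time is linear in the number of tokens.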