CS453 Colorado State University
==============================================
PA5, PA6, Debugging Grammars, Symbol Tables, and Semantic Analysis
==============================================

-------------------------
Today
    More PA5 Details
    PA6 overview
    Debugging grammars
    Symbol Tables
    Scope
    Symbol Table with a single scope
    Symbol Table with nested scopes
    Error recovery for multiple variable redeclarations
    Representing class references as types
    Semantic analysis for ...
        -variables
        -integer and byte expressions
        -assignment statements
        -method calls

-------------------------
More PA5 Details

- sign extension
    After a byte x byte subtraction or addition, the result needs to be
    an int so that MeggyJava is a subset of Java.  Therefore we need to
    sign extend the resulting byte into two bytes.  IOW if the most
    significant bit of the result is 1 and the number is therefore
    negative, then the generated code needs the hi bits for the result
    to all be set.

        brmi L2             ; last result negative
        ldi r25,0           ; if not negative then hi8 bits are 0
        jmp L3
    L2:                     ; if is negative then hi8 bits are all set
        ldi r25,lo8(-1)
    L3:

    [FIXME: need to handle carry bit if generating code that handles
    overflow arithmetic]

- example why can but shouldn't do everything as two bytes, byte casts
    Can do it because we said we won't test any arithmetic overflow or
    underflow, even in subexpressions.
    Shouldn't do it because the code you generate will have different
    results in some situations where there are byte casts.
        Meggy.setPixel((byte)0, (byte)((byte)(255+1)+(byte)(0-511)),
                       Meggy.Color.VIOLET);

        (byte)(255+1) = 0    if not casting then get 256
        (byte)(0-511) = 1    if not casting then get -511

- need to know what the return type of a method is to generate type
  specific code
    -> see PA5callExp.java example

- callee saved register store and restore is not an issue until we do
  register allocation
    -we currently generate code where all of the parameters are being
     stored on the stack and accessed from there anyway
    -could and probably would be if any avr-gcc routines were calling
     our routines
    -will need callee-saved register store and restore in PA7 when doing
     register allocation

What won't be tested
    - run-time integer and/or byte overflow or underflow
    - passing buttons as parameters to user-defined functions or
      assigning button values to variables just won't be done
      (see email on mailing list about this for more details)

-------------------------
PA6 Overview

Goals
    - code generation for objects and assignment statements
    - create a symbol table that is used by later passes/visitors in
      the compiler
    - perform semantic analysis to find ALL redeclared variables and to
      report the first type error

New pieces of grammar
    - variable declarations: method and local
    - assignment statements

-------------------------
Debugging Grammars

-> show them the AmbiguousGrammars .cup file and the shift-reduce error
   Kiley will be showing them the fix in recitation

--------------------------
Symbol Table Intro

-"the role of a symbol table is to pass information from declarations
  to uses"

Basic info maintained
    - for each identifier what is its type, scope (includes lifetime),
      visibility, and location
    - for each named scope, what identifiers does it contain?
    - while processing program, what is the current set of scopes?

-> what kind of information did we need in PA5 for code generation that
   required an additional pass?
   [Method return types]

-> what about information that could be computed in the same pass as
   code gen, but might be nice to separate out?
   [exp types and parameter types]

---------------------
Scope

-In dragon book, "a statement can be a block [with variable declarations
 allowed within each block] so our language allows nested blocks, where
 an identifier can be redeclared"

-> in MeggyJava can variables be redeclared within block statements?
   [look at the grammar]

environments
    - where do you use environment variables every day?
    - the unix scripting environment assigns values to various
      environment variables
    - while compiling we need to maintain a mapping of identifier names
      to type, scope, and location information.  Symbol Table is a kind
      of environment.

scope - a range of stmts over which an identifier is visible

lifetime - how long the variable must be stored at runtime

What is the lifetime for each of the following:
    globals             whole program execution
    locals              while each instance of function is executed
    static locals       whole program execution
    member variables    from the time object is allocated until object
                        is deallocated

Example scoping in languages such as C and Java
    C
        global scope
        file scope
        function scope
        unnamed scopes {}
    Java
        package scope
        only class names in default package live in global scope

Scoping in MeggyJava
-> Show them an example program (PA6movedot.java)
    - can we declare variables in a while loop?
    - where can we declare variables?
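The lifetime categories listed above can be observed in ordinary Java. The sketch below is only an illustration, not part of the MeggyJava compiler: the class and method names (LifetimeDemo, addUpTo) are made up for this example, and a static field stands in for a "global" since Java has neither true globals nor static locals.

```java
// Hypothetical example; names are not from the course materials.
class LifetimeDemo {
    // Static field: closest Java analogue to a global. One copy exists
    // for the whole program execution.
    static int calls = 0;

    // Member variable: exists from the time the object is allocated
    // until the object is reclaimed.
    int total = 0;

    int addUpTo(int n) {
        calls = calls + 1;
        // Local variable: each active invocation of addUpTo has its own
        // 'local', so the recursive calls do not overwrite each other.
        int local = n;
        if (n > 0) {
            local = local + addUpTo(n - 1);
        }
        total = local;
        return local;
    }

    public static void main(String[] args) {
        LifetimeDemo d = new LifetimeDemo();
        System.out.println(d.addUpTo(3));   // sum 3+2+1+0
        System.out.println(calls);          // number of invocations so far
        System.out.println(d.total);        // survives after addUpTo returns
    }
}
```

The scope of 'local' is the body of addUpTo, but four separate instances of it are live at the deepest point of the recursion, which is why scope and lifetime are tracked as separate pieces of information.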

-------------------------
Possible strategy for PA6

    (1) visitor that builds a symbol table so can look up type
        information for variables, methods, and expressions
    (2) visitor that does type checking
    (3) visitor that determines AVR-specific base and offsets for
        variables and puts that information in symbol table entries
    (4) visitor that does AVR code generation

---------------------
SymTable and STE

-> See slide for suggested interface for SymTable

STE, symbol table entry

VarSTE, variable symbol table entry
    - type, will now have to be implemented with more than an
      enumerated type
    - base, string for base register "Y" or "Z"
    - offset, number or string for offset from base register

Allocating space in stack frame in run-time stack
    - recommend you have a separate visitor used after building the
      symbol table and before code generator visitor
    - maintain a current offset for each method and class in visitor
    - get STE from symbol table and then call setLocation() method on
      STE

---------------------
Using the SymTable interface in single scope

-> show insertion of variable into SymTable
    BuildSymTable visitor
        outVarDecl
            - check if var name has already been inserted in SymTable
              using SymTable.lookup(name)
            - create VarSTE, increment visitor maintained offset
            - call SymTable.insert

-> error message due to duplicate declaration (next slide for summary)
    - in outVarDecl check, SymTable.lookup will return an STE

-> assume that example no longer has duplicate declaration, how will
   the undeclared variable be detected?
    CheckTypes visitor
        outAssignStatement
            - do SymTable.lookup(name) on the variable name being
              assigned
        outIdExp
            - do SymTable.lookup(name) on the variable name

---------------------
Creating Multiple Scoping Levels in Symbol Table

Discussed in Ch. 2.7.1 as well.

-> show the symbol table for PA5movedot.java

Data structures (-> data structures slide)

-The book suggests an Env class that contains a hashtable and a link to
 its enclosing Env.
 With this Env class, you can construct a tree structure while building
 the symbol table to represent scoping.  The semantic rules in 2.38
 maintain a stack of Env and keep a reference to the most deeply nested
 scope.

-In the MeggyJava compiler, we are creating a symbol table that is then
 passed on to later stages of the compiler.  We recommend that you have
 a single symbol table class instance that represents the symbol table
 and provides an interface for its creation and usage.  We recommend the
 following set of classes:

    SymTable
        The symbol table should maintain a stack of scopes with the
        current most deeply nested scope at the top of the stack.  The
        symbol table should also maintain a reference to the outermost
        (or global) scope.

        void insertAndPushScope(NamedScopeSTE)
            For first time a named scope is created like a MethodSTE.
        void pushScope(String)
            Looks up a named scope like a method and then pushes its
            scope on the stack.
        void popScope()
            Pops the top scope off the stack.
        STE lookup(String)
            Does a lookup in most nested scope.
        void insert(STE)
            Inserts symbol table entry into most deeply nested scope.

    MethodSTE
        The method symbol table entry contains a reference to signature
        information and to the method's scope.

    Scope
        Contains a dictionary that maps strings to symbol table entries.
        Also maintains a reference to enclosing scope.

        STE lookup(String)
            Looks for given symbol in this scope.  If it doesn't find
            it, then calls lookup on enclosing scope.
        void insert(STE)
            Inserts symbol table entry into this scope.

-> show AST for PA6movedot.java and go through symbol table creation
   calls made in the symbol table creation visitor.
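One way the recommended SymTable, MethodSTE, and Scope classes could fit together is sketched below in Java. This is a minimal sketch under assumptions, not the course's reference implementation: the fields, constructors, the bare STE base class, and the method name "move" are all hypothetical, and the Signature that a MethodSTE would hold is left out to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Assumed minimal base class: an entry only knows its name here.
class STE {
    final String name;
    STE(String name) { this.name = name; }
}

class Scope {
    private final Map<String, STE> entries = new HashMap<>();
    private final Scope enclosing;              // null for the outermost scope

    Scope(Scope enclosing) { this.enclosing = enclosing; }

    // Look in this scope first, then fall back to the enclosing scope.
    STE lookup(String name) {
        STE ste = entries.get(name);
        if (ste == null && enclosing != null) {
            return enclosing.lookup(name);
        }
        return ste;
    }

    void insert(STE ste) { entries.put(ste.name, ste); }
}

// An entry that owns a scope of its own (methods, and later classes).
class NamedScopeSTE extends STE {
    Scope scope;                                // set by insertAndPushScope
    NamedScopeSTE(String name) { super(name); }
}

class MethodSTE extends NamedScopeSTE {
    // A real MethodSTE would also reference its Signature (omitted here).
    MethodSTE(String name) { super(name); }
}

class SymTable {
    private final Scope global = new Scope(null);
    private final Deque<Scope> stack = new ArrayDeque<>();

    SymTable() { stack.push(global); }

    // First time a named scope is seen: create its scope, record the
    // entry in the current scope, then make the new scope current.
    void insertAndPushScope(NamedScopeSTE ste) {
        ste.scope = new Scope(stack.peek());
        insert(ste);
        stack.push(ste.scope);
    }

    // Later passes: find an existing named scope and make it current.
    void pushScope(String name) {
        STE ste = lookup(name);
        if (!(ste instanceof NamedScopeSTE)) {
            throw new IllegalArgumentException("no named scope: " + name);
        }
        stack.push(((NamedScopeSTE) ste).scope);
    }

    // Never pops the global scope.
    void popScope() {
        if (stack.size() > 1) {
            stack.pop();
        }
    }

    STE lookup(String name) { return stack.peek().lookup(name); }

    void insert(STE ste) { stack.peek().insert(ste); }
}

class SymTableDemo {
    public static void main(String[] args) {
        SymTable st = new SymTable();
        st.insertAndPushScope(new MethodSTE("move")); // building pass
        st.insert(new STE("x"));
        st.popScope();
        System.out.println(st.lookup("x") != null);   // false: x is local to move
        st.pushScope("move");                         // later pass re-enters scope
        System.out.println(st.lookup("x") != null);   // true
        st.popScope();
    }
}
```

Keeping the Scope objects reachable from their MethodSTEs is what lets a later visitor call pushScope(name) and see exactly the entries the building visitor inserted, without rebuilding anything.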

    inMainClass:
        1) create a signature for the main function
             formalTypes is an empty LinkedList of types
             Signature sig = new Signature(Type.INT, formalTypes);
        2) create MethodSTE for main, main_ste
        3) insert it into symbol table
             st.insertAndPushScope(main_ste)

    outMainClass:
        1) st.popScope()

    inMethodDecl
        0) check for name conflict of method name
        1) create a signature for the method
        2) create a MethodSTE, mste
        3) st.insertAndPushScope(mste)

    outMethodDecl
        1) st.popScope()

--------------------------
Function signatures

function signature: T1 x T2 x T3 -> T

The function signature includes the types, order, and number of
parameters as well as the return type for the function.

-> Have students describe the type signatures for the methods in
   PA6movedot.java.

TRICKY BIT
    The implicit "this" parameter type can be specified.  Not really
    needed here though because when at CallExp or CallStatement will
    have to use the Type mapped to the receiver expression to get the
    ClassSTE so can look up MethodSTE.
    OR
    Could have SymTable keep a mapping of CallExp and CallStatement
    nodes to MethodSTEs, but that would require yet another pass after
    building the symbol table.

--------------------------
Using Multiple Scoping Levels in Symbol Table

Discussed in Ch. 2.7.2 as well.

-> Show AST for PA6movedot.java and go through symbol table usage calls
   made in the type checking visitor.

    inMainClass:
        No symbols in main's scope so could just do nothing.

    outMainClass:
        Do nothing.

    inMethodDecl
        1) st.pushScope( method name )

    outMethodDecl
        1) check that the return expression type matches the return
           type in the method signature
             st.lookup( method name )
        2) st.popScope()

------------------------
Type checking for functions and function calls

-> see slide with possible error messages associated with AST nodes

---------------------
Representing class reference types

See slide 8
-> we can try to construct the example in class

Main idea is that each instance of the Type class represents a unique
type within MeggyJava.

---------------------
New type checking for PA6 (see slide)

-----------------------
Action items for PA6
    -implement a SymTable class and associated classes
    -implement a new Type class that can represent class references
    -visitor that builds a symbol table
    -type checking for method calls and assignments
    -visitor that assigns variables locations
    -code generation visitor

------------------------
mstrout@cs.colostate.edu, 3/31/11