CS 453 Programming Assignment #5 — Semantic Analysis
Due Friday, April 16th (by 11:59pm)
late policy
Preliminary Subversion Log
Due Monday, March 31st (by 11:59pm)
see below
Introduction:
For your last assignment you wrote a parser using JavaCUP.
The parser generates an abstract syntax tree representation of the
program.
Now that we have a clean representation of the input program, it's
time to perform semantic analysis.
- As a first step, you will write
a visitor that determines what line and position to associate with
each node in the abstract syntax tree.
- We provide most of a symbol table implementation. You will
need to implement the lookup method for the class Scope.
- Then you will use the symbol table classes and
ast.analysis classes to write a short visitor
class that stores away information about classes, methods, and
variables as it traverses an AST. This visitor will also generate
errors when a symbol is defined more than once within the visible
scopes (e.g. in the case of multiple declarations of the same variable name).
You will test all the pieces by
compiling with a driver that will generate an AST, traverse it
with your visitors, and dump the resulting symbol table information
to the file inputfile.ST.dot.
- While generating the symbol table entries for variables,
your symbol table building visitor will need to assign offsets for each
variable. For local variables including formal parameters, the offsets
are offsets relative to the frame pointer in the target MIPS code.
For member variables, the offset is relative to the start of the memory
allocation for the class instance.
- The final step will be to perform some semantic analysis. You will be
performing typechecking. You will test
the semantic analysis by writing test cases that contain the kinds of
errors for which you are expected to print error messages.
The error messages should be printed to the standard error stream.
The more test case sharing that occurs
on the mailing list, the better your chances of fully testing your
implementation.
To get started, download and unpack MJSemDriverStart.tar.
Start a subversion repository and verify that the starter project
compiles and runs using the main method in SemDriver.java.
Then copy your parser from PA4 into the MJSemDriverStart/src/mjparser directory
and call your parser within SemDriver.java.
Lines and Positions
In order to print reasonable error messages, you will need to know
the line and position for each error. We provide the Lines data
structure (MJSemDriverStart/lines/Lines.java)
for maintaining the mapping between nodes and the line
and position information.
You need to write a visitor over the AST that determines this information
for each node in the AST. Token nodes have line and position information
already associated with them. Non-token nodes should be given the same
line and position as the FIRST child token with which they are associated.
If the non-token node has no children, then it should be given the line and
position for the first token that follows it in a depth-first search traversal.
For example, consider the expression
    a + b - c * d
and assume that the variable c is at line 10 and position 11.
The line and position associated with the multiplication (i.e. the MulExp node)
should be line 10 and position 11, because c is the first token under that node.
You may want to use the ReversedDepthFirstAdapter to implement the visitor
that calculates lines and positions.
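The rule above can be sketched with a reversed walk. This is only an illustration under stated assumptions: the Node class, its fields, and the positions map are hypothetical stand-ins, not the provided ast classes, Lines.java, or ReversedDepthFirstAdapter, and position 15 for d is made up. Because children are visited last to first, the most recently seen token is always the first token of the current subtree, or the next token after it when the subtree contains no tokens.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class LinePosSketch {
    /** Hypothetical stand-in for an AST node; not the course's ast classes. */
    static class Node {
        final String label;
        final List<Node> children;
        final boolean isToken;
        final int line, pos;

        /** Non-token node with zero or more children. */
        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
            this.isToken = false;
            this.line = -1;
            this.pos = -1;
        }

        /** Token node, which already knows its line and position. */
        Node(String label, int line, int pos) {
            this.label = label;
            this.children = Collections.emptyList();
            this.isToken = true;
            this.line = line;
            this.pos = pos;
        }
    }

    /** Maps each node (by identity) to {line, pos}. */
    final Map<Node, int[]> positions = new IdentityHashMap<>();
    /** Most recent token seen in the reversed walk, i.e. the next token in source order. */
    private int[] nextToken = null;

    void visit(Node n) {
        if (n.isToken) {
            nextToken = new int[] { n.line, n.pos };
        } else {
            // Visit children last to first so the leftmost token is seen last.
            for (int i = n.children.size() - 1; i >= 0; i--) {
                visit(n.children.get(i));
            }
        }
        // Remains null only if no token is in or after this node.
        positions.put(n, nextToken);
    }

    public static void main(String[] args) {
        Node c = new Node("c", 10, 11);
        Node d = new Node("d", 10, 15);
        Node mul = new Node("MulExp", new Node("IdExp", c), new Node("IdExp", d));
        LinePosSketch v = new LinePosSketch();
        v.visit(mul);
        int[] p = v.positions.get(mul);
        System.out.println("MulExp at [" + p[0] + "," + p[1] + "]"); // prints MulExp at [10,11]
    }
}
```

A forward traversal would also work, but it would need a second pass (or a list of pending nodes) to handle childless nodes that take their position from a later token.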
Symbol Tables
We provide most of a symbol table implementation in MJSemDriverStart/symtable.
The SymTable data structure contains a stack of scopes.
The outermost, or bottom-most, scope is the global scope. It is then possible
to have a class scope and a method scope on the stack as well.
Each Scope instance will refer to its enclosing Scope. Therefore, if a lookup
does not find a symbol in the current scope, it can
traverse to the enclosing scope and look for the symbol there.
Each symbol will have a symbol table entry (STE) associated with it.
Your first task will be to fill in the missing lookup method body in the
Scope class.
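As a rough illustration of the enclosing-scope idea, here is a minimal sketch. The Scope and STE classes below are hypothetical stand-ins written for this example; the provided symtable classes may store entries differently and may split the work between Scope and SymTable, so treat the field and method names as assumptions.

```java
import java.util.HashMap;
import java.util.Map;

class ScopeSketch {
    /** Hypothetical symbol table entry; the provided STE classes carry more data. */
    static class STE {
        final String name;
        STE(String name) { this.name = name; }
    }

    /** Hypothetical scope: a name-to-entry map plus a link to the enclosing scope. */
    static class Scope {
        private final Map<String, STE> entries = new HashMap<>();
        private final Scope enclosing; // null for the global scope

        Scope(Scope enclosing) { this.enclosing = enclosing; }

        void insert(STE ste) { entries.put(ste.name, ste); }

        /** Searches this scope first, then each enclosing scope; null if absent. */
        STE lookup(String name) {
            for (Scope s = this; s != null; s = s.enclosing) {
                STE found = s.entries.get(name);
                if (found != null) {
                    return found;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Scope global = new Scope(null);
        Scope classScope = new Scope(global);
        global.insert(new STE("Foo"));
        classScope.insert(new STE("x"));
        System.out.println(classScope.lookup("Foo") != null); // found via the enclosing scope
        System.out.println(global.lookup("x") != null);       // inner names are not visible outward
    }
}
```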
The Symbol Table Builder Visitor
Once you've finished with the symbol table code, define a visitor
class that will add entries to the table as it traverses the AST.
It may help to look at
PrintVisitor.java. The main difference between BuildSymTable and
PrintVisitor is that in BuildSymTable you don't have to
override the case methods. Instead you will be able to implement
your symbol table builder visitor using the in and out methods
for various AST nodes.
The parser has built lists of variable
declarations, method definitions, and class declarations, so you
won't have to look through much of the tree to find the information
you need. Make sure
your visitor class has a SymTable instance variable, and
a getSymTable() method that returns it so that we can get at
the fruits of its labor.
For the MiniJava we are implementing,
no symbol can be defined twice in the same scope. Also,
no name can be defined in a scope if an enclosing scope already
has a definition for that name.
This means that
no method name or member variable name can be the same as any of
the classes that have been defined before it, and
no local variable can have the same name as any class, current class method, or current class member that has
already been defined.
The symbol table builder should print the following error to
standard error any time a symbol is redefined:
    [LINENUM,POSNUM] Redefined symbol SYMNAME
where LINENUM is the line number for the symbol,
POSNUM is the position number for the symbol, and
SYMNAME is the symbol name.
If any such errors were printed, then at the end of symbol table construction
the following error should be printed to standard error and
your compiler should exit:
    Errors found while building SymTable
The BuildSymTable visitor should just skip adding symbol table entries for
any of the identifiers within any class or method if the class name or method name is multiply defined.
Notice that SemDriver.java
expects to catch a SemanticException. You can find the definition of
SemanticException in MJSemDriverStart/exceptions.
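One possible shape for the redefinition reporting is sketched below. The class, the flat set of visible names, and the local SemanticException are all hypothetical stand-ins for this example; in particular, signalling the final error by throwing an exception with that message is an assumption about how the driver is meant to be used, not something the starter code is known to require.

```java
import java.util.HashSet;
import java.util.Set;

class RedefinitionSketch {
    /** Hypothetical stand-in for the exception in MJSemDriverStart/exceptions. */
    static class SemanticException extends RuntimeException {
        SemanticException(String message) { super(message); }
    }

    // Simplification: one flat set stands in for all currently visible scopes.
    private final Set<String> visibleNames = new HashSet<>();
    private int errorCount = 0;

    /** Builds the message in the exact format the handout specifies. */
    static String redefinedMessage(int line, int pos, String name) {
        return "[" + line + "," + pos + "] Redefined symbol " + name;
    }

    /** Records a declaration; reports and counts an error if the name is taken. */
    boolean declare(String name, int line, int pos) {
        if (!visibleNames.add(name)) {
            System.err.println(redefinedMessage(line, pos, name));
            errorCount++;
            return false;
        }
        return true;
    }

    /** Called once at the end of symbol table construction. */
    void finish() {
        if (errorCount > 0) {
            throw new SemanticException("Errors found while building SymTable");
        }
    }

    public static void main(String[] args) {
        RedefinitionSketch builder = new RedefinitionSketch();
        builder.declare("x", 4, 9);
        builder.declare("x", 5, 9); // prints [5,9] Redefined symbol x to standard error
        try {
            builder.finish();
        } catch (SemanticException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

Collecting all redefinition errors before stopping matches the requirement that every such error is printed, with the summary error appearing once at the end.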
The MethodSTE constructor takes a Frame interface as one of its parameters.
For now, just pass the following:
    mFrame.newFrame(
        new Temp.Label(mCurrentClass.getName() + "_" + node.getName().getText()),
        new LinkedList<Boolean>()
    )
The LocalSTE constructor takes an Access interface
as one of its parameters. For now, just pass the following:
new Mips.InReg(new Temp.Temp())
The correct way to create this parameter is discussed in
the next step of the assignment.
Memory Layout
Next, you will change the MemberSTE and LocalSTE constructor
calls
so that each LocalSTE and MemberSTE ends
up with information about where it will be stored at runtime.
Local variables and
parameters will be stored in the stack frame. Class member variables
will be stored in a class instance, which will be allocated on the heap.
Stack Layout
The handout provided in class describes the Frame.Frame, Frame.Access,
Mips.Frame, Mips.InFrame, Mips.InReg, Temp.Temp, and Temp.Label classes.
These classes have been provided for you in the MJSemDriverStart.
When the BuildSymTable is processing a method declaration, it should
call newFrame to create a frame for the method and then it should
create LocalSTEs for each of the parameters including the implicit this
parameter.
When the BuildSymTable visitor is processing a local variable declaration,
that local should be given space in the frame by calling
allocLocal() on the enclosing method's frame.
Local variables should be allocated space in the order they are declared.
    public abstract class Frame {
        // Creates a new frame for the method labeled name.
        public abstract Frame newFrame(Label name, List<Boolean> formals);
        public Label name;
        public List<Access> formals;
        // Allocates space for one local variable in this frame.
        public abstract Access allocLocal(boolean escape);
        public abstract int wordSize();
    }
Class Layout
The ClassSTE will need to keep track of the current available offset
within its layout. For example, let's say the class has three member
variables: int a, boolean b, and Class1 c. When the declaration for
int a is being processed, the BuildSymTable visitor will need to call
allocMember(frame.wordSize()) and the first offset received will be 0.
When allocMember() is called for boolean b, the return value will be
equivalent to frame.wordSize(), which is 4 in the case of MIPS. The offset
returned from allocMember() for Class1 c will be 8. The integer, boolean,
and class reference datatypes are all wordSize() in MiniJava.
Member variables must be allocated space in the order they are declared.
    public class ClassSTE extends STE {
        ...
        // returns an offset into the class for storing a member variable
        public int allocMember(int sizeInBytes) { ... }
        ...
    }
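The offset arithmetic described above can be sketched as a simple running counter. The class below is a hypothetical stand-in, not the provided ClassSTE, and the nextOffset field name is an assumption; WORD_SIZE is hard-coded to 4 to match the MIPS word size mentioned in the text.

```java
class ClassLayoutSketch {
    static final int WORD_SIZE = 4; // MIPS word size, as stated in the handout

    /** Hypothetical stand-in for ClassSTE that only tracks member offsets. */
    static class ClassSTE {
        private int nextOffset = 0; // next free byte offset within an instance

        /** Returns the offset for a new member and advances past it. */
        int allocMember(int sizeInBytes) {
            int offset = nextOffset;
            nextOffset += sizeInBytes;
            return offset;
        }
    }

    public static void main(String[] args) {
        ClassSTE ste = new ClassSTE();
        System.out.println(ste.allocMember(WORD_SIZE)); // int a      -> 0
        System.out.println(ste.allocMember(WORD_SIZE)); // boolean b  -> 4
        System.out.println(ste.allocMember(WORD_SIZE)); // Class1 c   -> 8
    }
}
```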
The Type Check Visitor
The type check visitor is responsible for flagging incorrect type
usage within a language, for example adding an integer to a boolean.
The type errors you are responsible for generating along with hints as
to where such errors might occur are listed below.
To make your grade better and Alan's life easier,
do not change the phrasing of the error messages.
[LINENUM,POSNUM] Invalid type returned from method METHODNAME
// any method declaration
[LINENUM,POSNUM] Class CLASSNAME does not exist
// anywhere an id token should be a class name
[LINENUM,POSNUM] Invalid condition type for if statement
[LINENUM,POSNUM] Invalid condition type for while statement
[LINENUM,POSNUM] Undeclared variable VARNAME
// anywhere an id token could be a variable name
[LINENUM,POSNUM] Invalid expression type assigned to variable VARNAME
// assignment statements
[LINENUM,POSNUM] Array reference to non-array type
// array expressions and array assignments
[LINENUM,POSNUM] Invalid index expression type for array reference
[LINENUM,POSNUM] Operator length called on non-array type
[LINENUM,POSNUM] Invalid expression type assigned into array
// array assignments
[LINENUM,POSNUM] Invalid left operand type for operator OP
[LINENUM,POSNUM] Invalid right operand type for operator OP
[LINENUM,POSNUM] Invalid operand type for operator !
[LINENUM,POSNUM] Receiver of method call must be a class type
[LINENUM,POSNUM] Method METHODNAME does not exist in class type CLASSNAME
[LINENUM,POSNUM] Method METHODNAME requires exactly NUM arguments
[LINENUM,POSNUM] Invalid argument type for method METHODNAME
// any call to a method (note that println is a method)
[LINENUM,POSNUM] Invalid operand type for new array operator
With this visitor, you should stop type checking after the first error
is detected and reported. If multiple errors are relevant for the same
AST node, then use the above list as an ordering of which error message
should be generated. For example, if a call is made to a non-existing method
and passed the wrong number of arguments, then the following error should be
printed to standard error:
[LINENUM,POSNUM] Method METHODNAME does not exist in class type CLASSNAME
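The ordering rule for a method call can be sketched as a chain of checks that returns at the first failure. Everything here is a hypothetical simplification for illustration: types are plain strings, the class table is a nested map from class name to method name to formal types, and argument types must match exactly. Only the message texts come from the list above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CallCheckSketch {
    /**
     * Returns the first applicable error message for a call, following the
     * order of the handout's list, or null if the call type checks.
     */
    static String checkCall(int line, int pos, String receiverType, String method,
                            List<String> argTypes,
                            Map<String, Map<String, List<String>>> classes) {
        String prefix = "[" + line + "," + pos + "] ";
        Map<String, List<String>> methods = classes.get(receiverType);
        if (methods == null) {
            return prefix + "Receiver of method call must be a class type";
        }
        List<String> formals = methods.get(method);
        if (formals == null) {
            return prefix + "Method " + method + " does not exist in class type " + receiverType;
        }
        if (formals.size() != argTypes.size()) {
            return prefix + "Method " + method + " requires exactly " + formals.size() + " arguments";
        }
        for (int i = 0; i < formals.size(); i++) {
            if (!formals.get(i).equals(argTypes.get(i))) {
                return prefix + "Invalid argument type for method " + method;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> fooMethods = new HashMap<>();
        fooMethods.put("add", Arrays.asList("int", "int"));
        Map<String, Map<String, List<String>>> classes = new HashMap<>();
        classes.put("Foo", fooMethods);
        // Missing method and no arguments: only the earlier message is reported.
        System.err.println(checkCall(3, 9, "Foo", "sub",
                Collections.<String>emptyList(), classes));
    }
}
```

Returning a message rather than printing inside each check keeps the sketch easy to test; a real visitor would print to standard error and then stop type checking.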
The Assignment:
After each piece of the assignment you should use the
SemDriver.java driver
to test that piece (just comment out calls to later pieces).
We will use the SemDriver and diff the files you
generate and your standard error output against the corresponding output from
our reference compiler implementation. There are example programs and their
output in MJSemDriver/TestCases/. The .out files include the standard error
and standard output results for running the reference compiler on the associated
input file.
To partially test the visitor that builds the symbol table, you can compare your
results with the .ST.dot files. Other relevant output files include
.ast.dot and .astlines.dot.
Submitting:
Preliminary Subversion Log
Assignment
- Make sure you test your semantic analyzer thoroughly: ensure that it
recognizes all legal programs and generates appropriate error messages for
illegal programs. You should test your finished semantic analyzer on some
of the sample programs from the MiniJava site and write at least one test case
for each error message you are expected to generate.
- Create a jar file that contains all the source code, the .class files, a README file, and a subversion.txt file. We should be able to use the jar file to run the SemDriver class.
- Create a mainClass.txt file in MJSemDriver-groupname/src/ with the following contents (make sure there is a newline after the second line in the file):
Main-Class: SemDriver
Class-Path: java-cup-11a-runtime.jar
- Also put the subversion.txt and README files into the src/ subdirectory.
- While in the src/ subdirectory, do the following commands:
% javac -classpath .:mjparser/java-cup-11a-runtime.jar SemDriver.java
% jar cmf mainClass.txt MJSemDriver-groupname.jar *.class */*.class */*/*.class README subversion.txt
- The README file should include your username(s), email(s), and any
information you want the TA to have before grading your assignment,
e.g., this part is not working due to..., a favorite quote or joke
to amuse the TA, etc.
- The subversion.txt file should include output from the following:
svn log
svn info
svnlook tree REPOSITORY_PATH    (svnlook must be issued on a department machine)
Submit the assignment using the checkin utility:
~cs453/bin/checkin PA5 MJSemDriver-groupname.jar
Sanity Check (procedure TA will use to run your assignment):
% java -classpath java-cup-11a-runtime.jar -jar MJSemDriver-groupname.jar filename.java
% mkdir MJSemDriver-groupname/src/
% cd MJSemDriver-groupname/src/
% cp ../../MJSemDriver-groupname.jar .
% jar xf MJSemDriver-groupname.jar
Note that you need to have a copy of the java-cup-11a-runtime.jar file in the
same directory as the MJSemDriver-groupname.jar file. We will provide our own
copy of the runtime jar file for testing. We will be running your parser with
semantic analysis on multiple test files. Also, the TA will be looking at the
source files.
Late Policy:
Late assignments will be accepted up to 24 hours past the due date for a deduction of
20% and will not be accepted past this period.
mstrout@cs.colostate.edu
.... March 23, 2008
Originally written by Brad Richards 2006,
modified with permission by Michelle Strout 2007 and 2008.