My Project
cs270 Programming Assignment - Symbol Table
Programming Assignment P9: CS 270 Computer Organization

CS 270, Fall 2015
Programming Assignment P9
Symbol Table

P9 due Monday, Nov. 30 at 11:59pm, no late submissions.


Goals of Assignment

In this assignment you will build a symbol table. This assignment serves several purposes:

Overview

A symbol table is a data structure that relates an identifier to imformation about that identifier. For this assignment, the identifiers are the labels on lines of LC3 assembly language statements. The information associated with the label is its address. For more information on symbol tables see this wikipedia entry.

In this assignment, you will use a hash table with chaining. You will learn more about hash tables in your Data Structures course. The data structures you use are already defined for you in symbol.h/symbol.c.

The hash function you will use computes an integer value from a string. More than one string may hash to the same value. However, a good hash function will minimize this. You can think of the hash value as an "alias" for the string. Thus, when searching for a string, you first search for the hash value (using integer comparison, which is fast) and only when you find an equal hash value, use strcasecmp() (much slower than integer comparison) to verify you have found the string of interest.

The symbol table contains a hash_table which is an array. The index into this array is calculated as index = hash-value % size-of-hash-table. A given index in the hash table contains a pointer to the head of a singly linked list of nodes containing symbols whose hash values share the same index.

There is a scond optional table in the symbol table structure. It is a dynmacilly allocated array of char* (C's version of strings), that allow a fast lookup of a symbols name given its address.

Here is a graphical representation of the data structure. Make sure you understand how the declarations relate to the drawing.


Getting Started

  1. Create a directory for this assignment and cd to it
  2. Copy the following files to the directory you created. It is easiest to right click on the link, and do a Save Target As.. for each of the files.
  3. Build the excutable by typing make
  4. You now have a functional program, but it will not produce the corrrect results
  5. Enter ./testSymbol. You will see a usage message which tells you how to use the program. You may also use help within the program to reprint the message
  6. You are now ready to begin implementation. You have seven functions to implement. In the reference implmentation, approximately 90 line of code were added. Implement functions one at a time and test as you go.

Implementing symbol_init()

In this function, you must allocate space for your symbol table and initialize it approrpiately. Compare malloc()/calloc(). You may optionally allocate and initialize addr_table[].

Implementing symbol_search()

This function will do a lot of the work needed for the functions symbol_add() and symbol_find_by_name(). The basic algorithm to search for a symbol by name is:
  1. compute the hash value for the name using symbol_hash()
  2. compute the index based on (hash-value % table-size)
  3. search the linked list to find the symbol. To speed thing up search first for the hash value. Only when the hash values match do you need to do a case independent string comparison.
  4. return the information if the name is found, or NULL otherwise
You can test implementation by using the command search in the test driver.

Implementing symbol_add()

This function adds a symbol to the symbol table if it does not already exist. If it does not exist, you must allocate memory for the appropriate structure, initialize the fields of the structure appropriately and store in at the correct index in the symbol table. The entry at a given index of the symbol table is a pointer to the head of a linked list of nodes. The symbol just created should become the new head of the list. You may test your implementation using the add command the the test driver.

A hash table is designed to spread out the entries, often sacrificing space for speed. To thoroughtly test your function, you will need to force multiple names to occupy the same index in the hash table. Think about the pigeon hole princple from CS161, study the code and you should understand what to modify to test your code. Your code will be tested with a different values.


Implementing the remaining functions

Implement the remaining functions one by one and test them as you go.
  1. The function symbol_find_by_name() is only a few lines of code. Most of the work is done by symbol_search(). Test by using the command get in the test driver.
  2. The function symbol_iterate() is only a few (10) lines of code. It consists of nested loops with the outer loop varying over the symbol table and the inner loop varying over the linked list at the current index. Using a function pointer is really straight forward. The actual line of code you will use is:
    
        (*fnc)(symbol, data);
        
        This is almost identical to calling a normal function named fnc() except
        that you would write: 
    
        fnc(symbol, data);
        
        You can test your function using the commands count/list in the test
        driver
  3. The function symbol_reset() consist of two nested loops as you wrote for symbol_iterate(). For each node you will need to free any allocated memory. Test it with the command reset in the test driver.
  4. The function symbol_term() may use symbol_reset to do most of the work. Be sure and deallocate the main symbol table as well. This will be tested when the test driver terminates.
  5. Implement the remaining functions defined in the TODO list. The symbol_find_by_addr function is optional, for extra credit.

Thoroughly test your code using the test driver. You may find it convenient to build a file containing a series of commands to execute, then use this file to test your program without having to type in commands over and over. Execute

   ./testSymbol  <0/1> < testFile


Grading Criteria

Preliminary testing:

  • Compile test: make sure your code compiles (0 points)
  • Basic test: tests symbol_init, symbol_add, symbol_search, symbol_iterate, symbol_term, with one symbol (25 points)
  • Simple test: same as basic test, with more symbols (25 points)
  • Valgrind test: test that all memory is freed (10 points)
Final testing:
  • Reset test: tests symbol_reset (25 points)
  • Complex test: same as simple test, with many collisions in hash table(25 points)
  • Extra credit: tests symbol_find_by_addr (10 points)

Checking in Your Code

You will submit the single file symbol.c using the checkin program or the Checkin tab of the course web page. Use the key SYMBOL. At the terminal type:

    ~cs270/bin/checkin P9 symbol.c