CS 270
cs270 Stack Dump
Recitation Stack Dump: CS 270 Computer Organization

Recitation 5 - Stack Dump


What to turn in

You will get credit for this recitation by completing the Recitation 5 Quiz

Objectives:

  1. to learn about the memory layout and stack for C programs,
  2. to learn how to use C pointers to view memory,
  3. to write a C program prints a dump of the top of the stack,
  4. to analyze the stack layout in the stack dump,
  5. to see if you can follow directions,
  6. to learn about the assembly code created by C compilers (if time)

About The Recitation

In this recitation you will learn about the memory layout for programs, how the programs use the stack to support function calls, and how to use pointers to view the contents of memory.

The goal of the recitation is to implement a small routine that prints the contents of the program stack so you can analyze the information.

C Pointers

In C we can examine the contents of memory directly using pointers. Here is an example of an integer and a pointer to the integer. It shows how to:

  • declare a pointer to an integer: int *pi;,
  • set a pointer to the address of a variable: pi = &i,
  • obtain the value of the integer using: *pi.
We will use a simple program to learn about pointers. As we progress you can add each new code example after the previous code and before the return statement so it is one large program when you are finished. This is required since each example builds on the previous examples.
  • Copy and paste the following code in a file names pointer.c.
  • Compile the program using gcc -m32 -o pointer pointer.c.
  • Run the program as ./pointer.
#include <stdio.h>

int main() {
    int i;  	// an integer value
    int *pi;	// a pointer to an integer value
    
    i = -1;
    pi = &i ;	// set pi to the address of i
    
    printf("i: %d, pi: %p > *pi: %d\n", i, pi, *pi);
    
    // place additional code here
    
    return 0;    
} 
Pointers may also point to other pointers. This example shows how to:
  • declare a pointer to a pointer to an integer: int **ppi;,
  • set the value using the address of a variable that is a pointer to an integer: &pi,
  • obtain the value of the integer using: **ppi.
Add this to the end of your program and try again.
    int **ppi;
    ppi = &pi ; // set ppi to the address of pi
    
    printf("i: %d, pi: %p > *pi: %d, ppi: %p > **ppi: %d \n", i, pi, *pi, ppi, **ppi);
We can use pointers to traverse consecutive memory locations to view their values rather than using array indices. After setting the pointer to the first element, all we need to do is increment the pointer to the next memory location as in the following example. Add this to the end of your program and try again.
    int memory[] = {1,2,3,4,5,6,7,8,9};
    int *pm;

    pm = memory;  // &memory[0] is an alternative.
    for (int i=0; i<9; i++) {
        printf("pm: %p > *pm: %d\n", pm, *pm);
        pm++;
    }
We can also assign a 32 bit integer from memory to a pointer with the appropriate type casting. Here we obtain a 32 bit integer using *pm, casting it with (int *) to assign to the unsigned int pointer nm. In general this is propably not a good idea as an integer variable is unlikely to be a valid memory address, but there are times when this can be usefull. Add this code to the end of your program and try again.
    int *nm;
    pm = memory;
    nm = (int *) *pm;
    printf ("pm: %p, nm: %p\n", pm, nm);
Pointers can be used with any type, including characters. When we increment the int pointer by 1 to the next int, the memory address is actually incremented by 4 since pointers refer to bytes, not words. When we increment a char pointer by 1 to the next char, the memory address is incremented by 1. This code prints the same values as above, except as 36 bytes instead of 9 words.
    char *cp;
    cp = (char *) pm;
    
    for (int i=0; i<36; i++) {
      printf ("cp: %p > *cp: 0x%02X\n", cp, *cp);
      cp++;
      }
Try changing the first value in 'memory' to 0x12345678 by adding this line of code
 
memory[0] = 0x123454678;
between the code
  cp = (char *) pm;
  for (int i=0; i<36; i++);
above, then run the code again. What do you notice about how the value is stored in memory? If you are curious about this google 'endianness'.

To summarize, given these declarations:

  int word[50];
  int *cptr;
  cptr = word;
we can say the following:

cptrword&word[0] all refer to the address of the first element of the word array.
(cptr + n)(word + n)&word[n] all refer to the address of the nth element of the word array.
*cptr*wordword[0] all refer to the first value in the word array.
*(cptr+n)*(word+n)word[n] are refer to the nth value in the word array.

Memory Layout for C programs

C programs use the following memory layout where the high addresses are at the top and the low address are at the bottom of the diagram.

  • The stack contains function parameters and local variables. The stack grows down towards the heap as additional functions are called.
  • Free space exists between the stack and heap that allows each to grow.
  • The heap contains dynamically allocated memory that we will learn more about later. The heap grows up towards the stack.
  • The BSS segment contains uninitialized global and static variables. This segment is initialized to zeros.
  • The data segment contains initialized global and static variables.
  • The text segment contains the object code for the program.
address      ________
0xffffffff  |        | 
            |        | stack (function local data)
            |_ _  _ _|   
            |   vv   | (grows down)
            |        |  
            |        | free space
            |        |
            |_ _^^_ _| (grows up)
            |        |  
            |        | heap (dynamic)  
            |________| 
            |        |    
            |        | bss (uninitialized data)   
            |________| 
            |        |
            |        | data (initialized data)
            |________| 
            |        |
            |        | text (code)
0x00000000  |________|


The Stack

The program stack is a series of memory locations. The 64-bit and 32-bit Intel sytems use somewhat different stack architectures. For the purpose of this recitation we are studying the 32-bit Intel architecture since it is similar to the LC3 architecture we use later in the semester. Memory addresses and integers are 32 bits (4 bytes) so we will study the stack as an array of unsigned ints. The stack is often organized as depicted in this diagram.

top of stack
fn   address     memory  description
f2  fffeffc0  |--------| ^
f2  fffeffc4  |--------| ^
f2  fffeffc8  |--------| local variables
f2  fffeffcc  |--------| exception
f2  fffeffd0  |--------|  handler
f2  fffeffd4  |ffff0000| frame pointer
f2  fffeffd8  |--------| return address
f2  fffeffdc  |--------| return value
f2  fffeffe0  |--------| parameters
f2  fffeffe4  |--------| v
f2  fffeffe8  |--------| v
f1  fffeffec  |--------| ^
f1  fffefff0  |--------| ^
f1  fffefff4  |--------| local variables
f1  fffefff8  |--------| exception
f1  fffefffc  |--------|  handler
f1  ffff0000  |ffff002c| frame pointer
f1  ffff0004  |--------| return address
f1  ffff0008  |--------| return value
f1  ffff000c  |--------| parameters
f1  ffff0010  |--------| v
f1  ffff0014  |--------| v
... ffff0018  |--------| ...

Note we have shown the top of the stack at the top of the diagram. The addresses in the diagram increase as you go down, the opposite of the memory layout diagram. Pushing a 32 bit value on the stack subtracts 4 from the stack pointer. Popping a 32 bit value from the stack adds 4 to the stack pointer.

Each called function has a similar layout in the stack. In this sample, the function f1 called the function f2. The addresses labelled f2 denote the values accessible by the function f2, and similarly for f1. The information added to the stack for each function includes:

  • The local variables allocated by this function.
  • Space for the exception handler.
  • The frame pointer value for the calling function. The current frame pointer points to this value and is used to access local variables and parameters.
  • The return address contains the address of the next instruction to execute when this function returns.
  • The function's return value.
  • The parameters passed to this function.
Compilers are responsible for the code generation that maintains the stack during execution. Viewing the assembly code for a particular architecture can help you understand how the stack works.

There are many variations depending on the hardware architecture and compiler optimization level. Parameters and return values may be passed in registers. Local variables may stored in registers. Frame pointers may not be maintained in the stack. In our case the value returned by a function is placed in a register and does not appear on the stack. On other architectures, the return value is placed on the stack between the parameters and the return address.


Dumping the Stack

You only need to complete the dumpStack function to print the stack starting at the top for a fixed number of memory locations. The function should print the header as given with a line from each memory location starting at the top of the stack, tos. Each line should contain the memory address, it's contents in hexadecimal, and it's contents in decimal. Remember that the local variables of the current function are on the top of the stack. My solution was several lines of code.

/*
 * dumpStack prints a memory dump starting at the top of the stack 
 * for a fixed number of locations.
 * For each memory location, dumpStack prints a line containing 
 * the memory address and its contents in both Hexadecimal and decimal.
 */
 
#include <stdio.h>

int depth = 0; // number of memory locations to print in stack dump

void dumpStack(char *label) {

    unsigned int *tos;  // literally the memory location at the top of the stack
                        // when this function is called.
    
    printf("Top of stack: %s\n       Address  Hexadecimal      Decimal\n", label);

    // @todo complete the function.
}

// Do not change the code below this comment 

int fact(int n, int parm2) {
    // the extra parameter and local variables are 
    // sentinel values to help analyze the stack.
    int local1 = -1;
    int local2 = 0x55555555;

    // base case
    if (n < 1) {
        dumpStack("at the base case for fact(n)");
        return 1;
    }

    // capture the return value for the stack dump
    local1 = n * fact(n-1,1-n); 

    return local1; 
}

int main(int argc, char *argv[]){

    int f;
    int n; 

    // initialize locals and global
    f = 65535;
    n = 4;
    depth = 100;

    // obtain n,depth from command line?

    f = fact(n,-n);  

    return f;
}

You will notice an extra parameter parm1 and local variables local1, local2 in this source. These sentinel values were added to aid your analysis of the values on the stack by giving you some known reference points.

Create a dump.c file and paste the code below into the file. To compile the C source file and create the program file:

    gcc -Wall -m32 -g -o dump dump.c  		
		
To run the program:
    ./dump		
		
Remember to use the -m32 option to get the correct object code. The default -m64 will produce results substantially different than this recitation. You may wish to try -m64 option to see the differences.


Analyzing the stack dump

If your function is implemented correctly, the first line should have the same address and value since it is tos. Things to identify in the stack dump include:

  • The frame pointers should be easy to identify since the memory value will be similar to the address. You should be able to trace them through the stack dump. Identify how to tell when you reach the bottom of the stack at main. This is important for the programming assignment.
  • The return address will have a much smaller value than the addresses in the stack since it is in a different segment in the memory layout. You should see the same return addresses for the recursive calls to fact. The first return address will be slightly different since it is returning from the call to the line after dumpStack.
  • The parameters to the function should be adjacent to the return address. Notice the order of the parameters. There may be some unused values that were allocated by the compiler.
  • The local variables will be near the frame pointer. There may be some unused values that were allocated by the compiler.
  • Determine if there are missing or extra elements on this implementation of the stack.
You will notice some significant differences if you repeat this with the default -m64 option.


Optional, work on only after fully completing above instructions


Analyzing assembly code

The hardware does not actually implement C. The compiler translates the C source code into assembly code (machine instructions) that can be executed. We can view the assembly code produced by the compiler.

  • Copy and paste the following code in a file named fact.c.
  • Compile the file using gcc -m32 -g -c fact.c.
  • Generate the object code listing using objdump -S --no-show-raw-insn fact.o >fact.s.

int fact(int n, int o) {
    // extraneous local variable to help view the stack.
    int f = 0x12346789; 
    int g = 0x55555555;

    // base case
    if (n < 1) {
        return 1;
    }

    // capture the return value for the stack dump
    f = n * fact(n-1,1-n); 

    return f; 
}

Now view the object code in the file fact.s. This cheat sheet might help you intrepret the assembly code listing. The listing contains the source instructions interspersed with the corresponding object code. Each line of object code contains the following information:

  • The relative address of the line of object code in hexadecimal. You should notice that instructions are not all the same length.
  • The instruction mnemonic follows the address. These are fairly straight forward.
  • One or two operands for the instructions follow the mnemonic. The operands refer to constants $0x18, registers %ebp, and offsets from the registers -0x10(%ebp).
Some registers are reserved for special purposes while others can be used for computation. Special purpose registers include
  • ebp is the current frame pointer. It points the the memory location containing the previous frame pointer.
  • esp is the stack pointer. It points to the memory location containing the value on the top of the stack.
  • eax is the function return value. It is set before the function returns and may be used for computation during the function.
Draw a stack and execute the code by hand to get a better understanding how this function uses the stack. You can find more information on the internet about the registers and instruction set. Highlight the code and note what happens to the registers and stack:
  • when you enter the function.
  • before you make the recursive function call.
  • after you make the recurvsive function call.
  • before you return from the function.
Which instructions change the stack? Push and pop are the obvious ones. What other instructions change the stack or stack pointer?

Try this again with the -m64 compiler option to compile for a 64 bit machine and note the differences.

  • Compile the file using gcc -m64 -g -c fact.c.
  • Generate the object code listing using objdump -S --no-show-raw-insn fact.o >fact64.s.
Then try again when the -O or -O2 (letter O, not the number 0) compiler options to optimize the code at different levels and note the differences. You might find the second one quite interesting.
  • Copy and paste the following code in a file named fact.c.
  • Compile the file using gcc -m64 -O -g -c fact.c.
  • Generate the object code listing using objdump -S --no-show-raw-insn fact.o >fact64O.s.
  • Compile the file using gcc -m64 -O2 -g -c fact.c.
  • Generate the object code listing using objdump -S --no-show-raw-insn fact.o >fact64O2.s.


© 2018 CS270 Colorado State University. All Rights Reserved.