Stack Frames

A stack frame (aka activation record) contains all of the data required by a function (aka method, subroutine) during its execution. At a minimum, it includes:

  • space for the parameters

  • space to hold the return address/value and other book keeping value(s)

  • space for the local variables

The stack frame is created before function execution begins, and deleted after the function execution is complete. As such, it satisfies the requirement of only allocating space during the execution of the routine. It also handles recursion nicely, as a new stack frame is created for every call. All "data" is physically grouped together which aids caching.

Conceptually, the stack frame is a C struct. A struct is a complex data type declaration the defines a physically grouped list of variables placed together in a contiguous block of memory. (See Wikipedia for additional information.) Given a pointer to a struct, the individual elements are referenced with the notation ptr→field.

Conceptual Runtime Stack

The stack frames are manged as a stack of stack frames. The stack frame of the currently executing function is on the top of the stack. Its caller is in the next stack frame. Its caller is in the following stack frame and so on. A new stack frame is pushed on the stack when the function begins, and popped off when the function returns. This corresponds to what happens when a function is called. The currently executing function is suspended and the called function begins execution. When it terminates, execution resumes in the calling function.

One way of implementing a stack is via a singly linked list. The head points to the top of the stack and each element of the stack contains a next pointer to successive elements.

So the entry to a function looks like:

  StackFrame* sf = malloc(sizeof(StackFrame));
  sf->next       = head;  // link it into the list
  head           = sf;    // reset the head (push)
                          // initialize parameters

And exit from the function looks like:

                             // grab return value (if present)
  StackFrame* sf = head;
  head           = sf->next; // reset the head (pop)
  free(sf);                  // clean up memory

In diagram form, the runtime stack looks like the following:

stack

Consider a program that calls function1() and that function calls function2(), which in turn calls function3(). This represents three nested function calls, with the currently executing function (function3) at the top of the stack (and pointed to by head). Notice that you can access all of the fields of the stack frame (e.g. head→param1 or head→local1. And by following the next field you can access the fields of the function2 stack frame (e.g. head→next→local1), and so on.

While this diagram and the sample code is a good conceptual view of what is happening, there are several problems:

  • Stack frames differ in size, so how can the space be allocated?

  • The size of the stack frame for a function may differ from call to call. Think about printf(), which has a variable number of parameters.

  • Only the caller of a function knows the values passed as parameters. Hence the idea of call-by-value.

  • Only the function knows how many locals it requires.

So building the activation record requires cooperation between the caller/callee. And dynamically allocating space (i.e. malloc) is too much overhead for function calls.

The Hardware Stack

Some computers have special hardware devoted to the stack. However, many simply choose to use a portion of memory and use it as a stack. In this case, one need only maintain a pointer to the top of the stack. Allocating/deallocating space on the stack only requires moving the stack pointer (i.e. changing its value). Thus, it is fast and allows one to allocate/deallocate variable amounts of space easily and quickly. It also facilitates dividing the effort to create the activation record between the caller and callee.

In most cases, the stack is placed in high memory and grows downward towards lower memory addresses. Placing the stack in high memory and dynamic memory (i.e. malloc()) in lower memory, the two grow/shrink towards/away from each other and only if they meet are we "out of memory". Pushing values onto the stack involves decrementing the stack pointer, while popping values involves incrementing the stack pointer. A simple add/subtract is all that is required. Thus the allocation/deallocation of space is fast. The stack is referenced so often that a register of the CPU is dedicated to holding the current value. In the case of the LC3 computer, that register is R6 (SP).

Building the LC3 stack runtime stack

The stack frame is built as a cooperative effort by the calling function and the called function. Two registers are dedicated to managing the runtime stack. Register R6 (SP) is used as the stack pointer. Its value contains the address of the word at the top of the stack. Register R5 contains the "head" pointer of the linked list of stack frames and is referred to as the frame pointer (FP).

Since the caller know the values of the parameters, it passes them to the called function by pushing them onto the stack, one at a time. The values could be pushed either left to right, or right to left. The convention is to push the values right to left. As a result, the first parameter ends up at the top of the stack. Thus, the first parameter is at a known location regardless of how many parameters were actually passed. Think of the printf() function. The format parameter (the first) tells how many other parameters there are (count of the % specifiers). By putting it at the top of the stack, the printf() function knows where to find the format parameter and can thus find the others.

A function call begins by having the caller push its parameters right to left onto the stack. The result is that just before the JSR to the function, the stack looks like:

pushParam

Then the code uses JSR to being execution of the function. Now it is time for the called function to take over. If the function has a return value, it must allocate some space to store it. By placing it next on the stack, upon return, the caller can get the value by simply popping it. It should also store the return address so that this function may call other functions. To reserve space for the return value, the function need only adjust the stack pointer (ADD R6,R6,#-1). The return address may be saved by PUSH R7. After these two steps, the stack looks like:

overhead

To complete the creation of the stack frame, two things are required. One is to make the new stack frame the new head of the list. The other is to allocate space for local variables. The first step requires saving head as the "next" pointer of the new stack frame This is accomplished by PUSH R5. Then, the head is reset. The head could point anywhere in the stack frame. Elements of the stack frame are accessed at fixed offsets from the frame pointer. Since LC3 instructions can have both positive and negative offsets, it make sense that the frame pointer points somewhere in the middle of the stack frame to handle the largest possible stack frame. The convention used is to have the head (the frame pointer) point to the first local variable.

Finally, space of the locals is allocated by decrementing the stack pointer. At this point, the stack looks like:

final

The stack with three nested function calls

Note how the three functions are contiguous on the stack. That is, the last parameter of the called function is adjacent to the last local of the calling function.

functions3

Unwinding the runtime stack

On completion of the function, the work of building the runtime stack must be undone. Again, it is a cooperative effort split between the called and calling functions. The work is done in the opposite order to the building of the stack. The called function cleans up its part, and after RET, the calling function cleans up its part. Note that after this work is complete, the memory used for the stack is not cleared. It still contains whatever was stored there. Logically, however, it is no longer part of the stack and will be overwritten on the next function call(s).

A summary of function call/return

There are a lot of steps involved in using the runtime stack. However, the steps are very mechanical and straightforward if you follow the steps precisely. Normally the code is created by a compiler.

  1. caller pushes arguments right to left

  2. caller executes JSR

  3. callee makes space for return value (ADD R6,R6,#-1)

  4. callee pushes return address (PUSH R7)

  5. callee pushes old frame pointer (PUSH R5)

  6. callee resets frame pointer (ADD R5, R6 #-1) This makes the frame pointer point to the first local. When the locals are removed in step 10 the old frame pointer is at the top of the stack and can be popped.

  7. callee allocates locals by decrementing stack pointer(ADD R6, R6, #-numLocals)

  8. callee does its work (includes saving return value STR Rx,R5,#3)

  9. callee removes locals by incrementing stack pointer (ADD R6, R6, #numLocals)

  10. callee restores old frame pointer (POP R5)

  11. callee pops return address (POP R7)

  12. callee executes RET

  13. caller pops return value

  14. caller removes parameters by incrementing stack pointer (ADD R6, R6, #numParams)

  15. caller continues its execution

Notice the symmetry of the steps. The actions taken in the beginning to build the stack frame are done in the reverse order to remove the stack frame. See steps 7/9, 5&6/10, 4/11. Note that steps 2,3 and 13,14 are not symmetric. If they were, one would have the caller allocate space for the return value before the JSR. But, it is easier to put the code in the callee once, rather that before each call to the function each place it is called.

Other protocalls

The previous section detailed one possible call/return protocol. There are others and we will consider some of them now. Which protocol is not important as long as the the caller and callee agree.

Using a register for 1st parameter

A typical entry to a subroutine looks like the following:

Entry code
   LD   someReg, addr   # get the 1st/only parameter
   PUSH someReg         # save it on the stack
   JSR  mySubroutine    # execute the subroutine

In this prototocol, the caller and collee agree that the first/only parameter will be passed in an agreed upon register (say R0). All other parameters will be passed as before. Now, the entry code looks like:

Entry code
   LD   R0, value       # get the 1st/only parameter
   JSR  mySubroutine    # execute the subroutine

One then modifies the callee entry code to push R0 as its very first action. We have moved the push from the caller to the callee. As a result, the callers code is shortened by a single instruction. A small, but easy change.

Using a register for the return value

Consider the code typically following a subroutine call

Exit code
   JSR mySubroutine
   POP someRegister
   ST  someRegister, addr # Save the return value
   ADD R6,R6,#params      # clean up the stack

Again, by returning the value via a register, we eliminate a POP after every JSR. One woud most likely choose to use the same reqister used for the first parameter. After all, we have completed the subroutine call and no longer have any need of the parameter. Another small, but easy change.

Exit code
   JSR mySubroutine
   ST  R0, addr           # Save the return value
   ADD R6,R6,#params      # clean up the stack

Because the return value is in a register, it no longer takes space on the stack.

Subroutines with a fixed number of parameters

When the number of parameters is fixed, one can actually move the cleanup of the parameters from the caller to the calee. Furthermore, in the callee, the cleanup of the entire stack frame (locals/book-keeping/paramters) can be collapsed into a single ajustment of the stack pointer. This assumes that the return value is in a register. In some architectures, the RET statement encodes a stack-adjustment value directly in the RET instruction itself. Then executing the RET changes both the PC to the return address, and updates the stack pointer.

Incorporating these ideas, the subroutine call/return code looks like:

The new flow
                    # in the *caller*
   LD R0,addr
   JSR mySuboutine
                    # now in the *callee*
   PUSH R0          # save the parameter
   ...              # the rest of the subroutine
                    # restore the return address/frame pointer
   ADD R6,R6,size   # clean up everything
   RET
                    # back in the *caller*
   ST  R0,addr      # save the return value

Takeaways

  1. The runtime stack is a linked list of stack frames.

  2. The elements of the stack frame are located at fixed offsets from the frame pointer for that frame, regardless of the actual address of the stack frame.

  3. Allocating and deallocating the stack frame is a cooperative effort with some of the code in the caller and some of the code in the callee.

  4. The stack frames occupy contiguous addresses in memory.

(c) Fritz Sieker - 2016-2026