C to LC3 Assembly language

One of the keys to learning how to write assembly language is understanding how to map high-level language constructs to equivalent assembly language. This page describes simple assignment, assignment using pointers, conditionals, loops, and array access. The registers used in the examples are arbitrary; you may use any of the registers. However, you must not overwrite a register that contains data you wish to use later, unless that data is also stored somewhere in memory.

Simple assignment

The most basic operation is the assignment of one variable to another. Most modern computer architectures are load/store architectures. This means that a value must be loaded from memory into a register, then written back to memory from that register. There are no direct memory-to-memory copies. There may be additional instructions between the LD and ST.

Each C statement below is followed by the equivalent LC3 code.

b = a;

LD R0, a ; load from memory to a register
ST R0, b ; store from register to memory

b = a + 1;

LD R0, a       ; load from memory to a register
ADD R0, R0, #1 ; increment value
ST R0, b       ; store from register to memory

When using LD/ST, the location of the variable is fixed at assembly time, and the location must be within about 256 words of the instruction (the 9-bit PC-relative offset covers -256 to +255 words from the incremented PC). If either of these conditions is not met, pointers must be used.

Assignment using Pointers

A pointer is a variable whose contents are the address of something else. This is most commonly another variable, but it may be the address of an instruction. Since a variable in LC3 is 16 bits, and LC3 addresses are also 16 bits, a pointer can access any location in the LC3's memory.

Each C statement below is followed by the equivalent LC3 code.

pa = &a;

LEA R0, a  ; get the address of the variable
ST  R0, pa ; store it in the pointer variable

b = *pa;

LDI R0, pa ; get the value at the address stored in pa
ST  R0, b  ; store it in b

*pa = b;

LD  R0, b   ; load the value of b
STI R0, pa  ; store it at the address stored in pa

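Like the LD/ST instructions, the LDI/STI/LEA instructions have a limited range. For LDI/STI, the pointer variable itself must be within about 256 words of the instruction, but the pointer can address any location in memory. For LEA, the variable whose address is taken must be within that range.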

Alternative pointer access

The LDR/STR instructions also access memory via pointers. However, the value of the pointer is held in a register instead of a pointer variable. This is useful for accessing structures and variables on the stack. Consider a simple C structure that represents a date. It contains three integers representing a day, month and year. Such a structure and its use might be written as follows.

Each C fragment below is followed by the equivalent LC3 code.

typedef struct date {
  int day;
  int month;
  int year;
} date_t;

No LC3 equivalent

date_t  birthday;     // a date
date_t* birthday_ptr; // pointer to a date

birthday     .BLKW 3 ; a date is 3 words of memory
birthday_ptr .BLKW 1 ; a pointer is 1 word of memory

birthday_ptr = &birthday;

LEA R0, birthday     ; get the address of the structure
ST  R0, birthday_ptr ; store it in the pointer variable

int d = birthday_ptr->day;
int m = birthday_ptr->month;
int y = birthday_ptr->year;

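LD  R0, birthday_ptr ; R0 = address of the structure
LDR R1, R0, #0       ; load day (offset 0)
ST  R1, d
LDR R1, R0, #1       ; load month (offset 1)
ST  R1, m
LDR R1, R0, #2       ; load year (offset 2)
ST  R1, y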

Note how the address is held in a register and then different offsets are used to access the different fields of the structure. In fact, C has a macro, offsetof(), that computes the offset of a field within a structure. Of course, in LC3 you must compute the offsets yourself.
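
The base-plus-offset idea can also be sketched in C itself. The following is a minimal illustration, not part of the original examples: the helper name read_field is hypothetical, and the sketch assumes the usual layout in which the three int fields are stored one after another. Note that offsetof() reports offsets in bytes, whereas the LC3 offsets #0, #1, #2 are in words.

```c
#include <stddef.h>  /* offsetof, size_t */

typedef struct date {
    int day;
    int month;
    int year;
} date_t;

/* Hypothetical helper: reads an int field given the address of the
   structure and the byte offset of the field. This is the C analogue
   of "LDR R1, R0, #offset", where R0 holds the base address. */
int read_field(const date_t *base, size_t byte_offset) {
    const char *field_addr = (const char *)base + byte_offset;
    return *(const int *)field_addr;
}
```

For example, read_field(&birthday, offsetof(date_t, month)) yields the same value as birthday.month.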

The LDR/STR instructions are also important in accessing function parameters and local variables. This is done using offsets from the frame pointer.


Conditionals

Conditionals allow different code to be executed depending on values encountered when the program is running. In C, we write logical expressions to compare values using a variety of symbols (<, >, ==, !=, ...). In assembly language we can only test a value and determine if it is negative, zero or positive. Therefore, the logical expressions in C must be converted to operations with respect to zero. The following table shows the conversion.

logical expr    numeric expression     branch instruction   negated branch instruction
if (a < b)      if ((a - b) < 0)       BRn                  BRzp
if (a <= b)     if ((a - b) <= 0)      BRnz                 BRp
if (a == b)     if ((a - b) == 0)      BRz                  BRnp
if (a >= b)     if ((a - b) >= 0)      BRzp                 BRn
if (a > b)      if ((a - b) > 0)       BRp                  BRnz
if (a != b)     if ((a - b) != 0)      BRnp                 BRz

Simple if

Converting typical C code with a condition is a two-part process. First, one must generate LC3 operations that set the condition codes in a way that can be used to make a decision. Then one has to conditionally execute or skip some code.

Comparison in LC3 often requires subtraction. But the LC3 has no subtract instruction, so one must change this to addition of a negated value. Negation is done using two's complement: invert the bits with NOT, then add 1. The logical structure of a condition in C is: if the condition is true, do the following. If we want the structure of the assembly language code to mimic the form of the high-level code (i.e. the code to execute directly follows the test), the test must be changed to: if the condition is not true, skip the following. This is done by using the negated test.

The C code below is followed by the equivalent LC3 code.

if (a < b) {
  // do something
}

     LD  R0, a       ; load a
     LD  R1, b       ; load b
     NOT R1, R1      ; begin 2's complement of b
     ADD R1, R1, #1  ; R1 now has -b
     ADD R0, R0, R1  ; R0 = a + (-b)
                     ; condition code now set
     BRzp SKIP       ; if false, skip over code

                     ; code to do something (the then clause)

SKIP                 ; remainder of code after if
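
The same sequence can be traced in C. This is a rough sketch rather than a definitive translation: the function name then_clause_runs is hypothetical, int16_t stands in for a 16-bit LC3 register, and the casts assume the usual wrap-around behaviour when narrowing to 16 bits. It negates b with NOT-then-add-1, adds, and uses the negated test to skip the then clause.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if the then clause of
   "if (a < b)" would execute, following the LC3 steps. */
int then_clause_runs(int16_t a, int16_t b) {
    int16_t neg_b = (int16_t)(~b + 1);     /* NOT R1, R1 ; ADD R1, R1, #1 */
    int16_t diff  = (int16_t)(a + neg_b);  /* ADD R0, R0, R1 */
    int ran = 0;

    if (diff >= 0) goto skip;              /* BRzp SKIP (negated test) */
    ran = 1;                               /* the then clause */
skip:
    return ran;                            /* remainder of code after if */
}
```

As in the LC3 code, the result is only meaningful when a - b does not overflow 16 bits.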

Simple if/else

Now consider the case of if/else, where one selects between two pieces of code. One is executed if the condition is true, the other if the condition is false. To make this change, the else clause is added, along with an unconditional branch at the end of the then clause that skips the else clause when the condition is true. The final code becomes:

The C code below is followed by the equivalent LC3 code.

if (a < b) {
  // do something
}
else {
  // do something else
}

     LD  R0, a       ; load a
     LD  R1, b       ; load b
     NOT R1, R1      ; begin 2's complement of b
     ADD R1, R1, #1  ; R1 now has -b
     ADD R0, R0, R1  ; R0 = a + (-b)
                     ; condition code now set
     BRzp ELSE       ; if false, go to the else clause

                     ; code to do something (the then clause)

     BR   END_ELSE   ; don't execute else code

ELSE                 ; code for else clause here

END_ELSE             ; remainder of code after else

Conditions with constants

Conditions often involve constants. For example, a typical logical expression might be:

  if (score >= 90) {
    grade = 'A';
  }
When converting this to assembly, one needs to compute score - 90. One can store the value 90 using a .FILL. However, the code will still need to negate it (NOT, then add 1) to get the negative value. An easy alternative is to store the negative value. Then the final code will only need to add. Compare the two possibilities below.

Code using the positive constant:

NINETY .FILL #90        ; the constant 90
       ...
       LD R0, score     ; load score
       LD R1, NINETY    ; get 90
       NOT R1, R1       ; start two's complement
       ADD R1, R1, #1   ; R1 = -90
       ADD R0, R0, R1   ; R0 = score + (-90)
       BRn NOT_AN_A     ; didn't get an A

Code using the negative constant:

NNINETY .FILL #-90      ; the constant -90
        ...
        LD R0, score    ; load score
        LD R1, NNINETY  ; get -90
        ADD R0, R0, R1  ; R0 = score + (-90)
        BRn NOT_AN_A    ; didn't get an A


Loops

Loops combine a condition that is used to terminate the loop and an unconditional branch which causes the body of the loop to be repeated until the termination condition occurs. The test can occur at the beginning of the loop, at the end of the loop, or in the middle of the loop. Multiple tests and terminations are allowed. As an example, consider the for loop of C. The syntax for the for is:


  for (initialization; termination condition; increment) {
    // body of loop
  }

The for has three distinct parts: the initialization, the termination condition, and the increment.

Using the concepts presented in previous sections, we will write three snippets of code, one for each part. As an example, we will convert this for to LC3 assembly.


  for (int i = 0; i < limit; i++) {
    // body of loop
  }

In the following listings, the numbers at the beginning of each line are used for reference in the discussion. They do not actually occur in the code. The C fragment that each group of instructions implements is shown as a comment above it.

LC3 code

                        ; i = 0;
01:      AND R0, R0, #0 ; AND with 0 yields zero
02:      ST  R0, i      ; store 0 in i
                        ; i < limit;
03: TEST LD R0, i       ; get i
04:      LD R1, limit   ; get limit
05:      NOT R1, R1     ; two's complement
06:      ADD R1, R1, #1
07:      ADD R0, R0, R1 ; compute i - limit
08:      BRzp END       ; done when (i - limit) >= 0
                        ; LC3 code for body
                        ; i++
09:      LD  R0, i      ; get i
10:      ADD R0, R0, #1 ; increment it
11:      ST  R0, i      ; save the new value
12:      BR TEST        ; go back and test again
13: END                 ; code after loop completes

Optimized LC3 code

                        ; i = 0;
01:      AND R0, R0, #0 ; AND with 0 yields zero
02: TOP  ST  R0, i      ; store i (0 1st time)
                        ; i < limit;
                        ; R0 contains i
03:      LD R1, limit   ; get limit
04:      NOT R1, R1     ; two's complement
05:      ADD R1, R1, #1
06:      ADD R0, R0, R1 ; compute i - limit
07:      BRzp END       ; done when (i - limit) >= 0
                        ; LC3 code for body
                        ; i++
08:      LD  R0, i      ; get i
09:      ADD R0, R0, #1 ; increment it
10:      BR TOP         ; store and test again
11: END                 ; code after loop completes

There are redundant instructions in the unoptimized code. In general, it is the job of the optimizer to fix this, but if you are writing in assembly language, you may choose to do it yourself. For example, consider the use of R0 to hold the value i. Note that at line 02: the value of R0 is stored in i, only to be immediately reloaded at line 03:. And, at line 11: the updated value of i is also in R0. Thus, the LD at line 03: is redundant and can be removed. There is also a redundant ST instruction: the ST at line 11: can be merged with the ST at line 02: by branching back to the store (the label TOP in the optimized code). A word to the wise: first get it correct, then optimize (if necessary).
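
To see the shape of the unoptimized translation without leaving C, the for loop can be rewritten with explicit labels and branches. This is only an illustrative sketch: the function name count_iterations and the counter body_runs are hypothetical, and the comments refer to the reference numbers of the unoptimized LC3 code. It assumes i - limit does not overflow.

```c
/* Hypothetical helper: runs "for (i = 0; i < limit; i++)" using
   explicit branches and returns how many times the body executed. */
int count_iterations(int limit) {
    int body_runs = 0;
    int i;

    i = 0;                         /* 01-02: initialization */
test:
    if (i - limit >= 0) goto end;  /* 03-08: BRzp END */
    body_runs++;                   /* body of loop */
    i = i + 1;                     /* 09-11: increment */
    goto test;                     /* 12: BR TEST */
end:
    return body_runs;              /* 13: code after loop completes */
}
```

The test sits at the top of the loop, so the body does not run at all when limit is zero or negative.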


Simple Array Access

The following assumes that each element of the array takes one word of memory. See the section on Advanced Array Access for more complex examples. Array access requires two variables, the array and an index into the array. To do this in assembly language, we get the address of the beginning of the array, and compute the address of the i-th item by adding the index. This is illustrated below.

Each C statement below is followed by the equivalent LC3 code.

b = c[i];

LEA R0, c      ; address of beginning of c
LD  R1, i      ; load index (i)
ADD R0, R0, R1 ; compute c + i (address of i-th element)
LDR R0, R0, #0 ; load c[i]
ST  R0, b      ; store it back (now b = c[i])

c[i] = b;

LEA R0, c      ; address of beginning of c
LD  R1, i      ; load index (i)
ADD R0, R0, R1 ; compute c + i (address of i-th element)
LD  R1, b      ; load b
STR R1, R0, #0 ; store it back (now c[i] = b)

In C, array access (e.g. c[i]) is equivalent to pointer access (e.g. *(c + i)).
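
That equivalence can be made explicit with a small sketch that performs the address computation by hand, one step per LC3 instruction. The function names load_element and store_element are hypothetical, chosen only for this illustration.

```c
/* b = c[i], written as the address computation the LC3 code performs. */
int load_element(const int *c, int i) {
    const int *addr = c + i;  /* ADD R0, R0, R1: address of i-th element */
    return *addr;             /* LDR R0, R0, #0: load the value there */
}

/* c[i] = b, written the same way. */
void store_element(int *c, int i, int b) {
    int *addr = c + i;        /* ADD R0, R0, R1 */
    *addr = b;                /* STR R1, R0, #0 */
}
```

One difference from the LC3 code is worth noting: in C, the expression c + i is scaled by the element size automatically, while in LC3 a plain ADD is only correct when each element occupies one word.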


Advanced Array Access

The correct way to do array access in general is to multiply i by the size of one element in the array, instead of simply computing (c + i). Each element of the array takes the same amount of space, but the size of an element may not be 1. Here is an example where the size of each element is greater than 1.

LC3 code

       .ORIG x3000      ; print out an abbreviation of the I-th month
       LD  R0, I        ; get the index
       ADD R0, R0, #-1  ; convert to 0-based index
       ADD R0, R0, R0   ; R0 = 2*(I-1)
       ADD R0, R0, R0   ; R0 = 4*(I-1) (size of each entry is 4)
       LEA R1, MONTHS   ; R1 is the address of the beginning of the array
       ADD R0, R0, R1   ; R0 contains the address of the I-th element
       PUTS             ; print out the abbreviation
       HALT
I      .BLKW 1          ; the variable I (1-12)
                        ; an array of strings each 4 characters long
                        ; 3 chars plus terminating null
MONTHS .STRINGZ "Jan"
       .STRINGZ "Feb"
       .STRINGZ "Mar"
       .STRINGZ "Apr"
       .STRINGZ "May"
       .STRINGZ "Jun"
       .STRINGZ "Jul"
       .STRINGZ "Aug"
       .STRINGZ "Sep"
       .STRINGZ "Oct"
       .STRINGZ "Nov"
       .STRINGZ "Dec"
       .END
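
For comparison, here is a rough C sketch of the same address computation. It is an illustration under stated assumptions, not the original author's code: the function name month_abbrev is hypothetical, and the months are stored in one flat character array so that each entry occupies exactly 4 bytes, mirroring the consecutive .STRINGZ entries.

```c
/* Twelve 4-byte entries: three letters plus a terminating null each.
   The compiler supplies the final null after "Dec". */
static const char MONTHS[] =
    "Jan\0Feb\0Mar\0Apr\0May\0Jun\0Jul\0Aug\0Sep\0Oct\0Nov\0Dec";

/* Hypothetical helper: returns the abbreviation of month i (1-12).
   No range checking is done, as in the LC3 program. */
const char *month_abbrev(int i) {
    int offset = i - 1;        /* convert to 0-based index */
    offset = offset + offset;  /* 2 * (i - 1) */
    offset = offset + offset;  /* 4 * (i - 1): size of each entry is 4 */
    return MONTHS + offset;    /* address of the i-th entry */
}
```

Doubling twice stands in for the multiplication by 4, just as the two ADD instructions do in the LC3 version, since the LC3 has no multiply instruction.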


Fritz Sieker - Apr 2012