Assembly Line

Recall that C programs are first compiled into a lower-level language called “assembly” before that assembly is assembled into machine code that a computer can execute. There are a number of different types of assembly languages, but they generally share similar properties: assembly languages have a limited set of instructions for performing basic operations like putting data in a variable (otherwise known in this context as a “register”), moving data from one location in memory to another, etc.

Consider a simplified assembly language wherein there are four registers (i.e., locations to store values) called r1, r2, r3, and r4. This assembly language supports the following instructions, wherein R, Rx, Ry, and Rz represent (any of those) registers, V represents a literal value (an integer or a string), and L represents a line number:

  • PRINT R prints the value in register R.
  • INPUT R prompts the user for input and stores it in register R.
    • You may assume that if the input looks like an integer (i.e., it consists of only digits), it will be stored as an integer; otherwise, it will be stored as a string.
  • SET R V stores the value V in register R.
    • For example, SET r1 50 would store the value 50 in register r1.
  • ADD Rz Rx Ry adds the value stored in register Rx to the value stored in register Ry and stores the result in register Rz.
  • JUMPEQ Rx Ry L checks if the values stored at registers Rx and Ry are equal to one another. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
  • JUMPLT Rx Ry L checks if the value stored at register Rx is less than the value stored at register Ry. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
  • EXIT exits the program.
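
The rule INPUT uses to tell integers from strings can be sketched in C. This is only an illustration, not part of the assembly language itself: the function name looks_like_integer is hypothetical, and treating empty input as a string is an assumption, since the description above does not cover that case.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: returns true if s consists of only digits,
// which is the rule INPUT is described as using.
// Assumption: empty input counts as a string, not an integer.
bool looks_like_integer(const char *s)
{
    if (s == NULL || *s == '\0')
    {
        return false;
    }
    for (const char *p = s; *p != '\0'; p++)
    {
        // Cast to unsigned char, as isdigit requires
        if (!isdigit((unsigned char) *p))
        {
            return false;
        }
    }
    return true;
}
```

Under this rule, input like 50 would be stored as an integer, whereas input like -5 or fifty would be stored as a string, since each contains a character that is not a digit.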

Every line of code in this assembly language consists of a line number followed by a single instruction. No parentheses, curly braces, semicolons, or any syntax other than the above instructions!
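
To make the line-by-line execution and the jump instructions concrete, here is a minimal C sketch of how a program in this language might be run. It is an illustration under stated assumptions, not a definitive implementation: the names opcode, instruction, and run are hypothetical, it models only integer values and the SET, ADD, JUMPEQ, JUMPLT, and EXIT instructions, and it assumes that running past the last line ends the program.

```c
// Hypothetical opcodes for the integer-only subset modeled here
typedef enum { OP_SET, OP_ADD, OP_JUMPEQ, OP_JUMPLT, OP_EXIT } opcode;

// One instruction. Depending on op, the fields a, b, and c hold
// register numbers (1 through 4), a literal value, or a line number.
typedef struct
{
    opcode op;
    int a;
    int b;
    int c;
} instruction;

// Runs program, where program[i] is line i + 1 and n is the number of lines.
// Register rK lives in regs[K]; regs[0] is unused.
// Assumption: the caller initializes regs before calling run.
void run(const instruction *program, int n, int regs[5])
{
    int line = 1;
    while (line >= 1 && line <= n)
    {
        instruction in = program[line - 1];
        int next = line + 1; // by default, continue to the next instruction
        switch (in.op)
        {
            case OP_SET: // SET R V: a is R, b is V
                regs[in.a] = in.b;
                break;
            case OP_ADD: // ADD Rz Rx Ry: a is Rz, b is Rx, c is Ry
                regs[in.a] = regs[in.b] + regs[in.c];
                break;
            case OP_JUMPEQ: // JUMPEQ Rx Ry L: a is Rx, b is Ry, c is L
                if (regs[in.a] == regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_JUMPLT: // JUMPLT Rx Ry L: a is Rx, b is Ry, c is L
                if (regs[in.a] < regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_EXIT:
                return;
        }
        line = next;
    }
}
```

For instance, given the six lines SET r1 2, SET r2 3, ADD r3 r1 r2, JUMPLT r1 r2 6, SET r3 0, and EXIT, the jump on line 4 skips line 5, so r3 is left holding 5.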

For example, here is a program that prompts the user for two numbers and prints whether they are equal or not:

 1  SET r1 "x: "
 2  PRINT r1
 3  INPUT r2
 4  SET r1 "y: "
 5  PRINT r1
 6  INPUT r3
 7  JUMPEQ r2 r3 11
 8  SET r1 "x is not equal to y"
 9  PRINT r1
10  EXIT
11  SET r1 "x is equal to y"
12  PRINT r1
13  EXIT

  1. (2 points.) In English, explain how the program above works, making clear why it is correct, for example by explaining the role of each line, from 1 through 13.

  2. (3 points.) Rewrite the program above in such a way that, instead of just printing out x is not equal to y when the two numbers are not equal, it instead prints either x is less than y or x is greater than y, depending on which number is greater. The program should still print x is equal to y if the two numbers are equal.

  3. (4 points.) Write a program in this assembly language that “coughs” (i.e., prints cough) some number of times. Your program should first prompt the user for a number and then print cough exactly that many times. You may assume the user will input a non-negative number.

The assembly language you just used to write these programs is a simplified version of the assembly language your computer might use when compiling a C program. When clang compiles your C program in CS50 IDE, it first compiles your C program into an assembly language called “x86-64” and then assembles that assembly into machine code. It turns out we can actually stop clang midway through that process so as to take a look at the assembly code corresponding to our program.

Copy the program below into a file called compare.c in CS50 IDE.

#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int x = get_int("x: ");
    int y = get_int("y: ");

    if (x < y)
    {
        printf("x is less than y\n");
    }
    else
    {
        printf("x is not less than y\n");
    }
}

In your terminal, run clang -S compare.c. The -S flag tells clang to output the assembly code for the program. After you run the command, you should see a file called compare.s containing the assembly. Open that file and take a look!

Odds are it looks pretty complicated! No need to understand all the details, but notice that most lines contain some instruction followed by one or more arguments for that instruction. The movl instruction, for example, moves data from one location to another.

  1. (1 point.) Unlike our own assembly language above, x86-64 has an instruction for calling a function from inside of a program. Based on the assembly code in compare.s, what is the name of the instruction for calling a function? How do you know?
  2. (2 points.) Based on the assembly code in compare.s, what is the name of the x86-64 instruction via which the program decides what to print? And how does that instruction decide what to print?

CHANGELOG

  • Fixed line 7 of the sample program to be 7 JUMPEQ r2 r3 11.
  • Fixed printf call in compare.c to print "x is not less than y\n".