Assembly Line
Recall that C programs are first compiled into a lower-level language called “assembly” before that assembly is assembled into machine code that a computer can execute. There are a number of different types of assembly languages, but they generally share similar properties: assembly languages have a limited set of instructions for performing basic operations like putting data in a variable (otherwise known in this context as a “register”), moving data from one location in memory to another, etc.
Consider a simplified assembly language wherein there are four registers (i.e., locations to store values) called r1, r2, r3, and r4. This assembly language supports the following instructions, wherein R, Rx, Ry, and Rz represent (any of those) registers, V represents a literal value (an integer or a string), and L represents a line number:
- PRINT R prints the value in register R.
- INPUT R prompts the user for input and stores it in register R.
  - You may assume that if the input looks like an integer (i.e., it consists of only digits), it will be stored as an integer; otherwise, it will be stored as a string.
- SET R V stores the value V in register R.
  - For example, SET r1 50 would store the value 50 in register r1.
- ADD Rz Rx Ry adds the value stored in register Rx to the value stored in register Ry and stores the result in register Rz.
- JUMPEQ Rx Ry L checks if the values stored in registers Rx and Ry are equal to one another. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
- JUMPLT Rx Ry L checks if the value stored in register Rx is less than the value stored in register Ry. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
- EXIT exits the program.
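The rule that INPUT uses to decide between an integer and a string can be sketched in C. This is only an illustration: the function name looks_like_integer is hypothetical, and treating empty input as a string is an assumption, since the description above does not cover that case.

```c
#include <ctype.h>
#include <stdbool.h>

// Returns true if s consists of only decimal digits.
// Assumption: an empty string does not count as an integer.
bool looks_like_integer(const char *s)
{
    if (s[0] == '\0')
    {
        return false;
    }
    for (int i = 0; s[i] != '\0'; i++)
    {
        if (!isdigit((unsigned char) s[i]))
        {
            return false;
        }
    }
    return true;
}
```

Under this reading, an input like 50 would be stored as an integer, whereas an input like -5 would be stored as a string, since the minus sign is not a digit.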
Every line of code in this assembly language consists of a line number followed by a single instruction. No parentheses, curly braces, semicolons, or any other syntax other than the above instructions!
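To make the jump semantics concrete, here is a minimal sketch in C of how a program in this language might be executed. It rests on several assumptions that are not part of the language description: registers hold integers only, PRINT and INPUT are left out, instructions arrive already parsed, and the names opcode, instruction, and run are hypothetical. It does not check register indices or guard against programs that loop forever.

```c
// Opcodes for the integer-only subset modeled here (PRINT and INPUT are omitted).
typedef enum { OP_SET, OP_ADD, OP_JUMPEQ, OP_JUMPLT, OP_EXIT } opcode;

// One parsed instruction. Depending on the opcode, a, b, and c hold
// register indices (0..3 for r1..r4), a literal value, or a line number.
typedef struct
{
    opcode op;
    int a, b, c;
} instruction;

// Runs program[0..n-1], where program[i] is the instruction on line i + 1.
// regs[0..3] model r1..r4. Stops at EXIT or when the line number leaves 1..n.
void run(const instruction *program, int n, int regs[4])
{
    int line = 1;
    while (line >= 1 && line <= n)
    {
        instruction in = program[line - 1];
        int next = line + 1; // by default, continue to the next instruction
        switch (in.op)
        {
            case OP_SET: // SET R V
                regs[in.a] = in.b;
                break;
            case OP_ADD: // ADD Rz Rx Ry
                regs[in.a] = regs[in.b] + regs[in.c];
                break;
            case OP_JUMPEQ: // JUMPEQ Rx Ry L
                if (regs[in.a] == regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_JUMPLT: // JUMPLT Rx Ry L
                if (regs[in.a] < regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_EXIT: // EXIT
                return;
        }
        line = next;
    }
}
```

The variable line plays the role of the current line number: every instruction either falls through to line + 1 or, when a jump's condition holds, replaces it with L.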
For example, here is a program that prompts the user for two numbers and prints whether they are equal or not:
1 SET r1 "x: "
2 PRINT r1
3 INPUT r2
4 SET r1 "y: "
5 PRINT r1
6 INPUT r3
7 JUMPEQ r2 r3 11
8 SET r1 "x is not equal to y"
9 PRINT r1
10 EXIT
11 SET r1 "x is equal to y"
12 PRINT r1
13 EXIT
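For comparison, the branching in lines 7 through 13 can be mirrored in C with goto, which behaves much like a jump to a line number. This is a rough sketch, not a translation the course prescribes: the function name compare_message is hypothetical, and it returns the message rather than printing it.

```c
// Hypothetical helper mirroring lines 7 through 13 of the program above.
// x and y stand in for the values in r2 and r3.
const char *compare_message(int x, int y)
{
    const char *r1;
    if (x == y)
    {
        goto line11;            // 7 JUMPEQ r2 r3 11
    }
    r1 = "x is not equal to y"; // 8 SET r1 "x is not equal to y"
    return r1;                  // 9 PRINT r1, then 10 EXIT
line11:
    r1 = "x is equal to y";     // 11 SET r1 "x is equal to y"
    return r1;                  // 12 PRINT r1, then 13 EXIT
}
```

The early return plays the role of EXIT on line 10: without it, execution would fall through into the code for line 11.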
- (2 points.) In English, explain how the program above works, making clear why it is correct, as by explaining the role of each line, from 1 through 13.
- (3 points.) Rewrite the program above in such a way that, instead of just printing out x is not equal to y when the two numbers are not equal, it instead prints either x is less than y or x is greater than y, depending on which number is greater. The program should still print x is equal to y if the two numbers are equal.
- (4 points.) Write a program in this assembly language that “coughs” (i.e., prints cough) some number of times. Your program should first prompt the user for a number and then print cough exactly that many times. You may assume the user will input a non-negative number.
The assembly language you just used to write these programs is a simplified version of the assembly language your computer might use when compiling a C program. When clang compiles your C program in CS50 IDE, it first compiles your C program into an assembly language called “x86-64” and then assembles that assembly into machine code. It turns out we can actually stop clang midway through that process so as to take a look at the assembly code corresponding to our program.
Copy the program below into a file called compare.c in CS50 IDE.
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int x = get_int("x: ");
    int y = get_int("y: ");
    if (x < y)
    {
        printf("x is less than y\n");
    }
    else
    {
        printf("x is not less than y\n");
    }
}
In your terminal, run clang -S compare.c. The -S flag tells clang to output the assembly code for the program. After you run the command, you should see a file called compare.s containing the assembly. Open that file and take a look!
Odds are it looks pretty complicated! No need to understand all the details, but notice that most lines contain some instruction followed by one or more arguments for that instruction. The movl instruction, for example, moves data from one location to another.
- (1 point.) Unlike our own assembly language above, x86-64 has an instruction for calling a function from inside of a program. Based on the assembly code in compare.s, what is the name of the instruction for calling a function? How do you know?
- (2 points.) Based on the assembly code in compare.s, what is the name of the x86-64 instruction via which the program decides what to print? And how does that instruction decide what to print?