Assembly Line
Recall that C programs are first compiled into a lower-level language called “assembly” before that assembly is assembled into machine code that a computer can execute. There are a number of different types of assembly languages, but they generally share similar properties: assembly languages have a limited set of instructions for performing basic operations like putting data in a variable (otherwise known in this context as a “register”), moving data from one location in memory to another, etc.
Consider a simplified assembly language wherein there are four registers (i.e., locations to store values) called r1, r2, r3, and r4. This assembly language supports the following instructions, wherein R, Rx, Ry, and Rz represent (any of those) registers, V represents a literal value (an integer or a string), and L represents a line number:
- PRINT R prints the value in register R.
- INPUT R prompts the user for input and stores it in register R.
  - You may assume that if the input looks like an integer (i.e., it consists of only digits), it will be stored as an integer; otherwise, it will be stored as a string.
- SET R V stores the value V in register R.
  - For example, SET r1 50 would store the value 50 in register r1.
- ADD Rz Rx Ry adds the value stored in register Rx to the value stored in register Ry and stores the result in register Rz.
- JUMPEQ Rx Ry L checks if the values stored in registers Rx and Ry are equal to one another. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
- JUMPLT Rx Ry L checks if the value stored in register Rx is less than the value stored in register Ry. If so, the program jumps to line L. Otherwise, the program continues to the next instruction.
- EXIT exits the program.
Every line of code in this assembly language consists of a line number followed by a single instruction. No parentheses, curly braces, semicolons, or any other syntax other than the above instructions!
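To make the jump semantics above concrete, here is a minimal sketch in C of how a machine might step through such a program. It is not part of the assembly language itself: the names Op, Instr, and run are hypothetical, and the sketch assumes integer-only registers (no strings and no INPUT) and treats running past the last line as the end of the program, a case the description above leaves unspecified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical encoding of the instructions (integers only; INPUT omitted).
typedef enum { OP_SET, OP_ADD, OP_JUMPEQ, OP_JUMPLT, OP_PRINT, OP_EXIT } Op;

typedef struct
{
    Op op;
    int a, b, c; // operands; their meaning depends on op (see run)
} Instr;

// Executes program[0..n-1], where program[i] holds line number i + 1.
// Registers r1..r4 live in regs[1..4]; regs[0] is unused.
// Register operands are assumed to be in the range 1..4 (not checked here).
// Returns the number of instructions executed.
int run(const Instr *program, int n, int regs[5])
{
    assert(program != NULL && regs != NULL);
    int line = 1;  // line number of the instruction to execute next
    int steps = 0; // how many instructions have been executed so far
    while (line >= 1 && line <= n)
    {
        Instr in = program[line - 1];
        int next = line + 1; // by default, continue to the next instruction
        steps++;
        switch (in.op)
        {
            case OP_SET: // SET R V: a is R, b is V
                regs[in.a] = in.b;
                break;
            case OP_ADD: // ADD Rz Rx Ry: a is Rz, b is Rx, c is Ry
                regs[in.a] = regs[in.b] + regs[in.c];
                break;
            case OP_JUMPEQ: // JUMPEQ Rx Ry L: a is Rx, b is Ry, c is L
                if (regs[in.a] == regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_JUMPLT: // JUMPLT Rx Ry L: a is Rx, b is Ry, c is L
                if (regs[in.a] < regs[in.b])
                {
                    next = in.c;
                }
                break;
            case OP_PRINT: // PRINT R: a is R
                printf("%d\n", regs[in.a]);
                break;
            case OP_EXIT:
                return steps;
        }
        line = next;
    }
    return steps; // assumption: leaving the range of lines ends the program
}
```

Under these assumptions, passing run the four instructions SET r1 2, SET r2 3, ADD r3 r1 r2, and EXIT leaves 5 in regs[3]. The key design point is that each jump instruction only overwrites next; every other instruction falls through to line + 1.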
For example, here is a program that prompts the user for two numbers and prints whether they are equal or not:
1 SET r1 "x: "
2 PRINT r1
3 INPUT r2
4 SET r1 "y: "
5 PRINT r1
6 INPUT r3
7 JUMPEQ r2 r3 11
8 SET r1 "x is not equal to y"
9 PRINT r1
10 EXIT
11 SET r1 "x is equal to y"
12 PRINT r1
13 EXIT
- (2 points.) In English, explain how the program above works, making clear why it is correct, as by explaining the role of each line, from 1 through 13.
- (3 points.) Rewrite the program above in such a way that, instead of just printing out x is not equal to y when the two numbers are not equal, it instead prints either x is less than y or x is greater than y, depending on which number is greater. The program should still print x is equal to y if the two numbers are equal.
- (4 points.) Write a program in this assembly language that “coughs” (i.e., prints cough) some number of times. Your program should first prompt the user for a number and then print cough exactly that many times. You may assume the user will input a non-negative number.
The assembly language you just used to write these programs is a simplified version of the assembly language your computer might use when compiling a C program. When clang compiles your C program in CS50 IDE, it first compiles your C program into an assembly language called “x86-64” and then assembles that assembly into machine code. It turns out we can actually stop clang midway through that process so as to take a look at the assembly code corresponding to our program.
Copy the program below into a file called compare.c in CS50 IDE.
#include <cs50.h>
#include <stdio.h>

int main(void)
{
    int x = get_int("x: ");
    int y = get_int("y: ");
    if (x < y)
    {
        printf("x is less than y\n");
    }
    else
    {
        printf("x is not less than y\n");
    }
}
In your terminal, run clang -S compare.c. The -S flag tells clang to output the assembly code for the program. After you run the command, you should see a file called compare.s containing the assembly. Open that file and take a look!
Odds are it looks pretty complicated! No need to understand all the details, but notice that most lines contain some instruction followed by one or more arguments for that instruction. The movl instruction, for example, moves data from one location to another.
- (1 point.) Unlike our own assembly language above, x86-64 has an instruction for calling a function from inside of a program. Based on the assembly code in compare.s, what is the name of the instruction for calling a function? How do you know?
- (2 points.) Based on the assembly code in compare.s, what is the name of the x86-64 instruction via which the program decides what to print? And how does that instruction decide what to print?