Lecture 2
From Binary to Programming Languages
Machine Code
-
Computer manufacturers make CPUs, or Central Processing Units, which recognize certain patterns of bits. These patterns are therefore specific to a given CPU.
-
CPUs understand machine code. These are the zeroes and ones that tell the machine what to do. Machine code might look like this:
01111111 01000101 01001100 01000110 00000010 00000001 00000001 00000000
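To make these patterns a little more concrete, here is a small Python sketch that reads each group of eight bits from the sample above as a base-2 number. The variable names are illustrative only.

```python
# The sample machine code from above, one byte (eight bits) per group.
bits = "01111111 01000101 01001100 01000110 00000010 00000001 00000001 00000000"

# int(text, 2) interprets a string of zeroes and ones as a base-2 number.
values = [int(group, 2) for group in bits.split()]

print(values)  # [127, 69, 76, 70, 2, 1, 1, 0]
```

Each pattern is just a number; what that number means, as an instruction or as data, depends on the CPU reading it.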
Assembly Code
-
It’s quite difficult for us to write machine code directly, so assembly code was created.
-
Assembly code uses a more English-like syntax and is an example of source code.
-
Source code is code with a more English-like syntax that can be translated to machine code.
-
Some sequences of characters in assembly code include movl, addq, popq, and callq, which we might be able to assign meaning to. For example, perhaps addq means to add and callq means to call a function. What values are these operations performed on? Registers!
-
The smallest unit of useful memory is called a register. Registers are the smallest units that we can perform an operation on. They have names, which also appear in assembly code, such as %ecx, %eax, %rsp, and %rbp.
-
Languages with easier to understand syntax than assembly code were created. Below is a program called hello.c that prints “hello, world” in the programming language C.
#include <stdio.h>
int main(void)
{
    printf("hello, world\n");
}
Compilers and Interpreters
-
With hello.c from earlier, we have to convert the program to the zeroes and ones that the computer can understand.
-
To do this, we can use compilers: pieces of software that understand both source code and the patterns of zeroes and ones in machine code, and that can translate one language into the other.
-
To compile hello.c, we can use something installed on our computers called cc, the C compiler.
- To use the compiler, we go to our terminal window and type at the prompt.
- A terminal window is a keyboard-only interface for telling your computer what to do.
- The prompt is represented by a dollar sign, $.
-
We type cc -o hello hello.c. This creates a new file called hello.
-
To run this program called hello, we type ./hello at the prompt, where . represents the folder or directory that this file is in.
-
A sample of the terminal window might look like this:
$ cc -o hello hello.c
$ ./hello
hello, world
-
-
Some languages skip the step of compilers and instead use interpreters. Interpreters take in source code and run the source code, line by line, from top to bottom and left to right.
-
Interpreters are themselves programs made of the zeroes and ones that the CPU understands. They are written to recognize the keywords and functions in the source code.
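As a rough illustration of the idea, the sketch below interprets a hypothetical mini-language that has a single keyword, print, followed by quoted text. The language, the function name run, and the sample program are all invented for this example; real interpreters such as python are far more involved.

```python
def run(source):
    """Interpret a made-up mini-language, line by line, top to bottom."""
    output = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split the line into its first word and everything after it.
        keyword, _, argument = line.partition(" ")
        if keyword == "print":
            output.append(argument.strip('"'))
        else:
            raise SyntaxError(f"line {line_number}: unknown keyword {keyword!r}")
    return output


program = 'print "hello, world"\nprint "goodbye"'
for text in run(program):
    print(text)
```

The loop recognizes a keyword and acts on it immediately, with no separate compiled file produced along the way.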
-
Python is an interpreted language. To say “hello, world” in Python, we write the following line in hello.py.
print("hello, world")
-
To interpret this source code, at the terminal, we simply type python hello.py, where python is the name of the interpreter.
-
The program python, in this case, opens the file hello.py, reads it top to bottom, recognizes the function print, and knows what to do: print “hello, world” on the screen and quit.
-
A sample of the terminal window might look like this:
$ python hello.py
hello, world
-
-
Comparing compilers and interpreters, we might note that interpreters skip the step of producing a compiled program before running it. This causes a performance penalty for interpreted languages, since the interpreter has to re-interpret the code each time the program runs.
-
To combat this issue, Python now generates bytecode: it compiles the source code once and can save the result in a cache file. On a later run, Python can use the pre-compiled version instead of translating the source again.
-
Bytecode looks something like this:
0 LOAD_GLOBAL 0 (print)
3 LOAD_CONST 1 ('hello, world')
6 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
9 POP_TOP
10 LOAD_CONST 0 (None)
13 RETURN_VALUE
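A listing like the one above can be produced with the dis module from Python's standard library. The sketch below wraps the print call in a function named greet, a name chosen only for this example. The exact instruction names and offsets differ between Python versions, so the output will probably not match the listing above line for line.

```python
import dis


def greet():
    print("hello, world")


# Print a human-readable listing of the function's bytecode.
dis.dis(greet)

# Collect just the instruction names for closer inspection.
opnames = [instruction.opname for instruction in dis.get_instructions(greet)]
```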
-
Virtual Machines
What if we want to run these programs on different computers, with different CPUs?
-
A virtual machine is software that mimics the behavior of an imaginary machine.
-
Instead of compiling the same code over and over again for different platforms, we can run the exact same code on any platform that has this virtual machine installed.
Python
Input and Printing
-
To greet our human, we might write this in hello1.py:
name = input("What is your name? ")
print("hello, " + name)
-
In Python, input is a function that gets user input.
-
This function takes a string, which prompts the human for input, and returns a string.
-
After returning this string, we would like to store it somewhere for access in the future. We can store these values in variables.
-
To set a variable equal to a value, we use one single equal sign, often called the assignment operator.
-
When printing, we can use the + operator to concatenate two strings.
-
-
In the terminal, we would then have this, using “David” as input:
$ python hello1.py
What is your name? David
hello, David
-
We can print in multiple ways.
-
The print function can take multiple arguments, and it separates them with spaces.
-
If we wrote print("hello, ", name), we would get two spaces between “hello,” and “David”: one inside the string and another as the separator between the arguments.
-
To fix this, we can simply write print("hello,", name).
-
-
The print function can also be given a formatted string, in which we literally write name in the string and have its value printed instead. We must surround the variable with curly braces and prefix the string with f; this tells Python that the string should be formatted in a special way. These strings are often called format strings or f-strings.
- We can write print(f"hello {name}").
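The three ways of printing described above can be compared side by side. This sketch hard-codes the name instead of calling input, purely so that it runs without a human at the keyboard.

```python
name = "David"  # stand-in for the value that input() would return

concatenated = "hello, " + name  # + joins two strings
formatted = f"hello, {name}"     # an f-string substitutes the variable's value

print(concatenated)
print("hello,", name)            # print inserts one space between arguments
print(formatted)
```

All three lines print the same greeting.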
-
-
Let’s write the following code in arithmetic.py:
x = input("x: ")
y = input("y: ")
print(x + y)
-
Running this in the terminal, we get…
$ python arithmetic.py
x: 1
y: 2
12
-
We get 1 + 2 = 12. Remember that the input function returns a string and the + operator concatenates strings; thus, we get the string “1” concatenated with “2”.
-
-
To fix this issue, we can convert the input value from a string to an int, or integer. The function to do that is simply int.
-
Our code can then be written as…
x = int(input("x: "))
y = int(input("y: "))
print(x + y)
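The difference between concatenating strings and adding integers can be seen without typing any input. In this sketch the strings "1" and "2" stand in for what input would return.

```python
x = "1"  # what input() returns when the user types 1
y = "2"

as_text = x + y               # string concatenation
as_numbers = int(x) + int(y)  # integer addition

print(as_text)     # 12
print(as_numbers)  # 3
```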
Conditionals
- Let us instead write a program that compares two numbers.
-
In conditions.py, we might write…
x = int(input("x: "))
y = int(input("y: "))

if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
elif x == y:
    print("x equals y")
- The Boolean expressions are x < y, x > y, and x == y.
- To check for equality, we have to use ==, since = is already the assignment operator.
-
The colon after each if and elif condition says to run the indented code that follows if the Boolean expression is true.
-
The indentation is necessary: the print calls are executed only if the Boolean expressions above them evaluate to true.
-
The second elif, or “else if”, statement is unnecessary, since if a number is neither less than nor greater than another number, it must be equal to that number. We can modify our program to get this…
x = int(input("x: "))
y = int(input("y: "))

if x < y:
    print("x is less than y")
elif x > y:
    print("x is greater than y")
else:
    print("x equals y")
- In Boolean expressions, we can also use certain keywords: "or" and "and".
-
We might write a program answer.py that does the following:
c = input("Answer: ")
if c == "Y" or c == "y":
    print("yes")
elif c == "N" or c == "n":
    print("no")
- In this program, if the user inputs “Y”, c == "Y" will evaluate to true, and the program will print “yes”. If the user inputs “y”, c == "y" will evaluate to true, and the program will also print “yes”.
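To try each branch without retyping input, the same conditions can be wrapped in functions. The function names compare and answer are made up for this sketch, and returning None for unrecognized input is a choice made here, not something the programs above do.

```python
def compare(x, y):
    """Return a sentence describing how x relates to y."""
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x equals y"


def answer(c):
    """Map Y/y to "yes" and N/n to "no"; anything else gives None."""
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    return None


print(compare(1, 2))  # x is less than y
print(compare(2, 2))  # x equals y
print(answer("y"))    # yes
```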
Functions
-
We might want to define our own function, such as square, where calling it returns the square of an input.
-
In return.py, we might define our own function called square.
def main():
    x = int(input("x: "))
    print(square(x))


def square(n):
    return n * n


if __name__ == "__main__":
    main()
-
Note that we can’t call the function square before defining it, since the interpreter reads from top to bottom. To fix this, we can create a main function and then call main at the end of the file, after every function has been defined.
-
When we call the main function, we normally wrap the call in the line if __name__ == "__main__": so that main runs only when the file is executed directly as a program, and not when it is imported by another file.
-
With the square function, we’ve abstracted away the multiplication, and now we can simply call square.
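The top-to-bottom rule can be observed directly. In this sketch, a hypothetical function cube is called once before its definition has run and once after; the first attempt raises NameError, which is caught here so the script can continue.

```python
# Calling a function before its "def" line has run fails with NameError.
try:
    early_result = cube(3)
except NameError:
    early_result = None


def cube(n):
    return n * n * n


# After the definition has run, the same call works.
late_result = cube(3)

print(early_result)  # None
print(late_result)   # 27
```

Placing the call inside main and invoking main at the bottom of the file avoids the problem, because by then every def has been executed.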
Loops
While Loops
-
To write a program positive.py that will pester the human until the human inputs a positive integer, we might write the following:
def main():
    i = get_positive_int("i: ")
    print(i)


def get_positive_int(prompt):
    while True:
        n = int(input(prompt))
        if n > 0:
            break
    return n


if __name__ == "__main__":
    main()
- In the function get_positive_int, while True gives us an infinite loop. Python will execute the indented code again and again until it is told to stop.
- Note that True and False are Boolean values.
- The break keyword tells Python to exit the loop.
- Once the loop has been broken, the function returns the value.
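The same loop shape can be exercised without a keyboard by feeding it a list of attempts. The function first_positive and its candidates parameter are inventions for this sketch; they replace the calls to input in get_positive_int.

```python
def first_positive(candidates):
    """Return the first value greater than zero, using while True and break."""
    values = iter(candidates)
    while True:
        n = next(values)  # raises StopIteration if the candidates run out
        if n > 0:
            break
    return n


print(first_positive([-3, 0, 7, 2]))  # 7
```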
For Loops
-
To write a program score.py, where the user inputs a number and that many hashes are printed, we might write the following:
n = int(input("n: "))
for i in range(n):
    print("#", end="")
print()
-
range is a function built into Python that returns a range of values from 0 to n - 1 inclusive.
-
The print function automatically prints a new line. In other words, it moves the cursor to the next line after printing. To stop Python from printing each hash on a separate line, we specify end="" as another argument to print, which tells Python to end each printed piece with nothing.
-
The final print() moves the cursor to the next line.
-
-
In the terminal, if we input 10 as n, we might see the following:
$ python score.py
n: 10
##########
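To see what range produces and what end="" changes, the sketch below uses a fixed n of 4 in place of user input.

```python
n = 4

# range(n) yields 0, 1, ..., n - 1; list() makes the values visible.
indices = list(range(n))
print(indices)  # [0, 1, 2, 3]

# With end="", all hashes land on one line.
for i in range(n):
    print("#", end="")
print()  # finish the line

# Without it, print ends each call with a newline: one hash per line.
for i in range(n):
    print("#")
```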
Mario
-
In Super Mario Bros., a two-dimensional world is created! Here’s one setting:
-
To print the series of question marks shown, we might write
for i in range(4):
    print("?", end="")
print()
-
Here’s another setting with a 4x4 block.
-
To print the block shown, we’ll need to print hashes across both rows and columns. We first iterate through the rows, and within each row, we iterate through each column and print a hash.
for row in range(4):
    for column in range(4):
        print("#", end="")
    print()
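The nested loops generalize to any size. The sketch below uses a hypothetical function block that builds the grid as a string instead of printing it piece by piece, which makes the result easy to inspect.

```python
def block(height, width):
    """Build a height x width grid of hashes, one text line per row."""
    rows = []
    for row in range(height):
        line = ""
        for column in range(width):
            line += "#"  # one hash per column
        rows.append(line)
    return "\n".join(rows)


print(block(4, 4))
```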
Types
- In Python, there are many data types.
- bool: True or False
- int: Integers
- str: Strings of text
- float: Real numbers with a decimal point and digits after it
- dict: Hash table of key-value pairs
- list: Any number of values back to back
- range: Range of values
- set: A set of values with no duplicates
- tuple: A fixed group of values, such as x, y or latitude, longitude
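One value of each type listed above can be inspected with the built-in type function. The sample values are arbitrary.

```python
examples = [
    True,            # bool
    42,              # int
    "hello",         # str
    3.14,            # float
    {"x": 1},        # dict
    [1, 2, 3],       # list
    range(3),        # range
    {1, 2, 2},       # set: the duplicate 2 is dropped
    (40.7, -74.0),   # tuple: e.g. latitude, longitude
]

for value in examples:
    print(type(value).__name__, value)

type_names = [type(value).__name__ for value in examples]
```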
Libraries
-
In addition to the functions built into the core language, there are libraries and frameworks that provide additional features. These have to be imported manually to be used.
-
For example, in Python, if we want to generate pseudorandom numbers, we have to import a function randint from a library called random.
-
For example, to get a random integer between 1 and 10 inclusive, we can write this:
from random import randint
print(randint(1, 10))
-
We can also just write import random without importing the specific function. In this case, we’ll have to prefix the function with the library name using dot notation as shown below.
-
To create a game where the user guesses a random integer between 1 and 10, we can write this:
import random

n = random.randint(1, 10)
guess = int(input("Guess: "))
if guess == n:
    print("Correct")
else:
    print("Incorrect")
-
-
Note that these numbers are pseudorandom because computers can’t pick a random number the way humans can; they have to use algorithms, which are deterministic processes.
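The deterministic nature of the algorithm shows up when the generator is started from the same seed twice. The seed value 50 below is arbitrary and chosen only for this sketch.

```python
import random

random.seed(50)  # start the generator from a known state
first = [random.randint(1, 10) for _ in range(5)]

random.seed(50)  # reset to the same state
second = [random.randint(1, 10) for _ in range(5)]

print(first)
print(second)
print(first == second)  # True
```

Starting from the same seed replays the same sequence, which is exactly what "deterministic" means here.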
Memory
-
Inside a computer are hardware chips called RAM, or Random Access Memory. Each of these chips holds some finite number of bytes, which are used to represent the values in our programs.
-
Python, like most other languages, decides in advance how many bits to use to represent certain values in our programs, such as floats.
-
Thus, if a value cannot be represented exactly in that many bits, the language will store an approximation of it instead.
Imprecision
-
Let’s take a look at a program called imprecision.py that divides two numbers and prints the quotient.
x = int(input("x: "))
y = int(input("y: "))
z = x / y
print(f"x / y = {z:.30f}")
- The syntax :.30f signifies that we’re printing z as a float to 30 decimal places.
-
We get…
$ python imprecision.py
x: 1
y: 10
x / y = 0.100000000000000005551115123126
-
This value isn’t what we expect! We don’t have enough bits to store the entire precise value, so the computer approximates the quotient. This is called floating-point imprecision.
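A practical consequence is that comparing floats with == can surprise us. The sketch below repeats the division from imprecision.py with fixed values and then shows one common remedy, math.isclose from the standard library, which compares within a small tolerance.

```python
import math

print(f"{1 / 10:.30f}")  # 0.100000000000000005551115123126

total = 0.1 + 0.2
print(total == 0.3)              # False: both sides are approximations
print(math.isclose(total, 0.3))  # True: equal within a small tolerance
```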
Integer Overflow
-
A similar problem occurs with integers.
- Consider a number that has been allocated three digits.
- We start by counting.
- Suppose we count until 999. We carry, and we get 1000.
- However, the computer has only allocated three digits, so our 1000 gets mistaken for 000.
- This is an example of integer overflow, where our large number has wrapped to a small number.
-
As December 31, 1999 approached, people began to get nervous: many programs stored the calendar year with only two digits. The year 1999 was stored as 99, so when 2000 arrived, the year would be stored as 00, leading to confusion between 1900 and 2000. This became known as the Y2K problem.
- In the past, Boeing 787 planes stored the number of hundredths of a second elapsed in a counter. Once that counter overflowed, after about 248 days of continuous operation, the plane would go into fail-safe mode and the power would shut off.
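Python's own integers grow as needed and do not wrap around, so the sketch below simulates a fixed-size counter with the remainder operator. The function wrap is hypothetical. The last part assumes the Boeing counter was a signed 32-bit integer, which the notes above do not state; under that assumption the arithmetic lands on roughly 248 days.

```python
def wrap(value, digits=3):
    """Keep only the lowest `digits` decimal digits, like a fixed-size counter."""
    return value % (10 ** digits)


print(wrap(999 + 1))         # 0: 1000 no longer fits in three digits
print(wrap(1999, digits=2))  # 99
print(wrap(2000, digits=2))  # 0: indistinguishable from 1900

# Assumption: a signed 32-bit counter overflows after 2**31 ticks,
# with one tick per hundredth of a second.
ticks = 2 ** 31
days = ticks / 100 / 60 / 60 / 24
print(round(days, 2))  # 248.55
```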