Internet Technologies

by Nick Wong ‘20


Protocols

  • A way of communicating - more specifically, a protocol is a set of rules or conventions that computers or computer programs use while communicating with each other
  • Real-world example - We shake hands, a protocol with plenty of rules about how long the handshake lasts and whether it is polite to respond or not

How the Internet Works

  • We go “on the Internet” and visit some webpage (Facebook, Google, Amazon)
  • Somehow, that webpage understands how to interpret what we want and display an appropriate output
  • We can imagine that somehow, our computer puts information into the Internet, which delivers that information to some computer or server owned by the company that owns the site we’re trying to visit
  • Then, this server sends back the appropriate information to the Internet, which makes sure it gets delivered to our computer at home
  • In essence, the Internet is a delivery mechanism for information…but how?

DHCP - Dynamic Host Configuration Protocol

  • This protocol lets a device you have (a phone, a laptop, etc.) announce itself when it joins a network and ask for an address
  • The protocol says that these devices will be assigned a numeric address, much like our physical addresses, except unlike our physical addresses (like 123 Main St), this looks like #.#.#.#, where each # is between 0 and 255 (why?)
  • Despite the large number of combinations here, there are so many devices that we are increasingly using something called IPv6 (the previous version was IPv4), which allows for far more combinations
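
One way to approach the "why?" above: each of the four numbers is stored in 8 bits, and 8 bits can hold 256 different values (0 through 255). The sketch below checks that arithmetic with Python's standard ipaddress module. The address 192.168.0.1 is only an example, and the variable names are ours.

```python
import ipaddress

# Each of the four numbers in an IPv4 address is one byte (8 bits).
BITS_PER_NUMBER = 8
values_per_number = 2 ** BITS_PER_NUMBER      # 256 possible values: 0..255
total_ipv4 = values_per_number ** 4           # 2**32 addresses in total
total_ipv6 = 2 ** 128                         # IPv6 addresses are 128 bits long

addr = ipaddress.ip_address("192.168.0.1")    # example address, chosen for illustration
as_bytes = list(addr.packed)                  # the four numbers as raw byte values

print(values_per_number - 1)   # largest value a single number can take
print(total_ipv4)              # how many IPv4 addresses exist
print(as_bytes)                # the four numbers of the example address

# A number above 255 does not fit in one byte, so the address is rejected.
try:
    ipaddress.ip_address("192.168.0.256")
    too_big_ok = True
except ValueError:
    too_big_ok = False
print(too_big_ok)
```

Comparing total_ipv4 with total_ipv6 gives a sense of how much more room IPv6 provides.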

DNS - Domain Name System

  • When our computer sends out a request, it has to use the destination’s IP address to make sure our data goes to the proper place
  • However, we, as humans, don’t really read addresses like 8.8.8.8 or 192.168.0.1
  • There is a system to “translate” the human-readable domain names (google.com, facebook.com, cs50.io) to their IP address counterparts
  • This service is called DNS, which allows us to use this translation to get from point A to B
  • We also have routers or gateways, which know how to take in information, look at where it’s going, and send it to the proper router
  • Data doesn’t have to follow the same path each time, but it will usually get to where it needs to go in 30 or fewer hops, or jumps, from router to router or gateway to gateway
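
The name-to-address translation can be tried from Python. This is a minimal sketch that hands the question to whatever DNS resolver the operating system is configured to use. The helper name resolve is ours, not part of any library.

```python
import socket

def resolve(hostname: str) -> str:
    """Return one IPv4 address for hostname, using the system's resolver."""
    # gethostbyname returns a dotted string such as "8.8.8.8".
    # If the argument is already an IPv4 address, it is returned unchanged.
    return socket.gethostbyname(hostname)

# No network needed for this call: an IPv4 address resolves to itself.
print(resolve("8.8.8.8"))
```

With a network connection, resolve("cs50.io") would return whatever address DNS currently reports for that name. The answer can differ by location and over time.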

TCP - Transmission Control Protocol

  • Guarantees with high probability that data gets to where it needs to go
  • Sometimes, computers drop packets (data) - they get more data than they can handle, or they miss it entirely
  • TCP allows computers to know if they should resend data
  • Port numbers, specifically TCP Port numbers, help identify which service should take which data
  • For example - Data headed to #.#.#.#:80 says that the data should be sent to #.#.#.# and put through port 80, which happens to be a human-defined standard port for HTTP or web requests.
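
To make the address:port idea concrete, here is a small sketch that splits such a string and looks the port up in a table. The table is a short illustrative subset of well-known ports, and the function names are our own.

```python
# A few well-known TCP ports; a small illustrative subset, not a full list.
WELL_KNOWN_PORTS = {
    22: "SSH",
    25: "SMTP (email)",
    80: "HTTP",
    443: "HTTPS",
}

def split_endpoint(endpoint: str) -> tuple:
    """Split 'address:port' into the IP address and the numeric port."""
    address, _, port_text = endpoint.rpartition(":")
    port = int(port_text)
    if not 0 <= port <= 65535:   # TCP port numbers are 16 bits
        raise ValueError(f"port out of range: {port}")
    return address, port

def service_for(endpoint: str) -> str:
    """Name the service that conventionally listens on the endpoint's port."""
    _, port = split_endpoint(endpoint)
    return WELL_KNOWN_PORTS.get(port, "unknown service")

print(split_endpoint("192.168.0.1:80"))
print(service_for("192.168.0.1:80"))
print(service_for("192.168.0.1:443"))
```

The address picks the computer, and the port picks which program on that computer should receive the data.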

UDP - User Datagram Protocol

  • The feature here is that it does not guarantee delivery and does not resend lost data…what?
  • Still fairly common and appropriate
  • For example, video streaming, video conferencing, live communication - we don’t want a retransmission, we would rather stay up to date chronologically
  • In these cases, UDP is actually a better fit than TCP. Can you see why?
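
One way to think about the question above is a toy simulation of a live video. Everything here is made up for illustration: the frame numbers, which frames get lost, and how long a resend takes. It is a simplified model of the trade-off, not of how either protocol is really implemented.

```python
FRAMES = list(range(8))       # frame numbers of a live video, one sent per tick
LOST = {2, 5}                 # frames dropped on their first transmission (made up)
RESEND_DELAY = 3              # extra ticks a retransmission costs (made up)

def udp_style(frames, lost):
    """Play whatever arrives; lost frames are simply skipped."""
    # Each frame is sent at the tick equal to its number.
    return [(frame, frame) for frame in frames if frame not in lost]

def tcp_style(frames, lost, resend_delay):
    """Deliver every frame in order, waiting for resends of lost frames."""
    schedule = []
    delay = 0
    for frame in frames:
        if frame in lost:
            delay += resend_delay    # later frames queue up behind the resend
        schedule.append((frame, frame + delay))
    return schedule

udp = udp_style(FRAMES, LOST)
tcp = tcp_style(FRAMES, LOST, RESEND_DELAY)

print("UDP (frame, tick played):", udp)
print("TCP (frame, tick played):", tcp)
print("UDP lag on last frame:", udp[-1][1] - FRAMES[-1])
print("TCP lag on last frame:", tcp[-1][1] - FRAMES[-1])
```

In this model the UDP-style viewer misses two frames but stays live, while the TCP-style viewer sees every frame but falls further behind with each loss.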

Traceroute

  • Literally traces the route that information takes from our computer to some destination
  • Allows us to see which routers are being used by data to get to where it needs to go
  • This route may change over time and according to web traffic patterns
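
A toy model of a route can help picture what the tool reports. The router names below are hypothetical, and each one only knows which neighbor to hand data to next. The real traceroute tool discovers routers by probing the network, so this sketch models the output, not the mechanism.

```python
# Hypothetical routers: each one knows only which neighbor to hand data to next.
NEXT_HOP = {
    "my-laptop": "home-router",
    "home-router": "isp-router",
    "isp-router": "backbone-router",
    "backbone-router": "site-gateway",
    "site-gateway": "web-server",
}

def trace_route(source, destination, max_hops=30):
    """List every stop between source and destination, one hop at a time."""
    path = [source]
    current = source
    while current != destination:
        if len(path) > max_hops:
            raise RuntimeError("gave up after too many hops")
        if current not in NEXT_HOP:
            raise RuntimeError(f"{current} has no route onward")
        current = NEXT_HOP[current]
        path.append(current)
    return path

route = trace_route("my-laptop", "web-server")
for hop_number, stop in enumerate(route[1:], start=1):
    print(hop_number, stop)
```

Changing an entry in NEXT_HOP changes the printed route, which mirrors how real routes shift over time.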

Undersea Cabling

  • We can also traceroute to international destinations, especially those on different continents
  • There is a lot of cabling that connects locations across oceans, including the Pacific and Atlantic

TCP/IP

  • How do we make sure that data, even large amounts of data, gets to where it needs to go, and does so “fairly”, so that a single piece of data doesn’t take up more space than it should?
  • How do we send the data and make sure whoever gets it knows what to do with it?
  • Maybe we could label the data in order, so that the recipient knows where each piece belongs, whatever order the pieces arrive in
  • Additionally, if some data gets lost along the way, TCP allows us to ask for the missing data and complete it
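
The labeling idea can be sketched in a few lines: split a message into numbered pieces, let the "network" shuffle them and lose one, then have the recipient spot the gap, get that piece resent, and rebuild the message. Real TCP is more involved (it numbers bytes and uses acknowledgements), so treat this as a model of the idea only. The function names and the message are ours.

```python
import random

def split_into_segments(data: bytes, size: int):
    """Label each chunk with a sequence number so order can be rebuilt."""
    return [(seq, data[start:start + size])
            for seq, start in enumerate(range(0, len(data), size))]

def missing_numbers(received, total):
    """Sequence numbers the recipient has not seen yet."""
    seen = {seq for seq, _ in received}
    return [seq for seq in range(total) if seq not in seen]

def reassemble(received):
    """Sort by sequence number and join the chunks back together."""
    return b"".join(chunk for _, chunk in sorted(received))

message = b"Hello from the other side of the Internet!"
segments = split_into_segments(message, size=8)

# Simulate the network: segments arrive out of order and one is lost.
in_transit = segments[:]
random.Random(0).shuffle(in_transit)
lost_segment = in_transit.pop()          # this one never arrives
received = in_transit

# The recipient notices the gap, and that exact segment is sent again.
gaps = missing_numbers(received, total=len(segments))
for seq in gaps:
    received.append(segments[seq])       # the sender retransmits it

print(gaps == [lost_segment[0]])
print(reassemble(received) == message)
```

Because every piece carries its number, the arrival order stops mattering, and a missing number tells the recipient exactly what to ask for.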

HTTP - Hypertext Transfer Protocol

  • A very common protocol, which you’ve likely seen before - http://example.com
  • HTTP is a sort of virtual envelope, which allows computers to communicate with one another, specifically in a webpage context (so between web browsers and servers)
  • We can use nslookup to check the IP address of a web domain - e.g. nslookup www.facebook.com
  • We can pretend to be a browser, and see the response that comes back when we visit a webpage
  • curl -I http://31.13.65.36/ - which tells us that Facebook would prefer we used their domain name (specifically which type of HTTP?)
  • If a request worked, the response comes back with the status code 200, which tells us, and our computers, that everything was OK
  • We’re likely more familiar with 404, which means the page we asked for was not found
  • What we see here are called headers, which give us additional information about the data we’re given
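
To show what headers look like as data, the sketch below pulls apart a response of the kind curl -I prints. The sample text is made up for illustration (it is not real output from any site), and the parser is a minimal one of our own that ignores many details of real HTTP.

```python
# A made-up response of the kind `curl -I` prints; not real output from any site.
SAMPLE_RESPONSE = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "Location: https://www.example.com/\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)

def parse_response_head(raw: str):
    """Split a response head into (status code, reason, headers dict)."""
    head = raw.split("\r\n\r\n", 1)[0]         # headers end at the first blank line
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers

code, reason, headers = parse_response_head(SAMPLE_RESPONSE)
print(code, reason)
print(headers["location"])
```

The first line carries the status code (200, 404, and so on), and each later line is one header in the form name: value.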

HTML - Hypertext Markup Language

  • curl without the -I flag lets us see the HTML results, or the data
  • This language tells the browser how to display everything from where pictures are located to how to format text on the page
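
As a rough picture of how a browser reads this markup, the sketch below walks a tiny made-up page with Python's standard html.parser module and collects the tags, the image locations, and the text. The page and the class name PageOutline are ours. A real browser does far more, including layout and styling.

```python
from html.parser import HTMLParser

# A tiny made-up page; real pages are much longer.
PAGE = """<html>
  <head><title>Hello</title></head>
  <body>
    <h1>Welcome</h1>
    <img src="cat.jpg" alt="A cat">
    <p>Some <b>bold</b> text.</p>
  </body>
</html>"""

class PageOutline(HTMLParser):
    """Collect the tags, image locations, and text found in a page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.images = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "img":
            # attrs is a list of (name, value) pairs; src says where the picture lives.
            self.images.append(dict(attrs).get("src"))

    def handle_data(self, data):
        if data.strip():                 # skip whitespace between tags
            self.text.append(data.strip())

outline = PageOutline()
outline.feed(PAGE)
outline.close()

print(outline.tags)
print(outline.images)
print(outline.text)
```

The tags say what each piece of the page is, and attributes such as src say where to find things like pictures.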