by Spencer Tiberi


  • Odds are you use it everyday, but what is it?


  • Computers are good at recording, playing back, and generating audio
  • Uses different file formats
    • File formats are just a way of storing 0s and 1s on disk so that certain software knows how to interpret it
  • MIDI
    • Way of storing musical notes for certain songs
    • Can do this for different instruments
      • Programs can render the notes for these instruments
  • GarageBand
    • Included with macOS


    • This is the Star Wars theme in MIDI
      • Doesn’t sound quite as good as the actual version
      • Computer synthesizes the notes
        • Not an actual recording
        • Computer interprets notes in the MIDI file
  • MIDI is common in the digital workspace among musicians who wish to share music with each other.
  • Humans typically like to hear music preformed and recorded by humans
    • File formats for recorded music include:
      • AAC
      • MIDI
      • MP3
      • WAV
  • WAV is an early sound format, but still used
    • Uncompressed data storage allowing high quality
  • MP3
    • File format for audio that uses compression
      • Significantly reduce how many bits are necessary to store a song
      • Discards 0s and 1s that humans can’t necessarily hear
        • True audiophiles may disagree
      • Trade off between optimizing storage space and sacrificing quality
      • This compression is said to be lossy
        • Losing the quality in the compression process
  • AAC
    • Similar to MP3
    • May see when you download a song from iTunes
  • Streaming services such as Spotify don’t transfer a file to you but rather stream bits of information to you
  • How do we think about the quality of these formats?
    • Sampling frequency
      • Number of times per seconds we take a digital snapshot of what a person would hear
    • Bit depth
      • Number of bits used for these individual snapshots
    • Sampling frequency x bit depth = number of bits necessary to store one second of music
    • Audio file formats allow you to modify what these parameters are


  • A graphic, what we see with multimedia, is really just a bunch of pixels both horizontal and vertical
    • All file formats are rectangular in nature, though transparent pixels can make images look to take on other shapes
    • In the simplest form, each of the dots or pixels is a bunch of 0s and 1s


    • To create a file format, we just need to determine a mapping
  • This image is only black and white, so how to represent color?


  • RGB stands for Red Green Blue
    • With information giving an amount of red, an amount of green, and an amount of blue, you can tell a computer how to colorize pixels
    • None of the colors yields a black pixel
    • All of the colors yields a white pixel
    • In between these two options is where we get all sorts of colors
  • Consider the three bytes: 11111111 00000000 00000000
    • If we interpret these bytes to represent colors, it appears we want all of the red, none of the green, and none of the blue
    • These 24 bits (3 bytes = 3 x 8 bits = 24 bits) represent the color we know as red!
      • If a computer wanted to represent this color, it would store these 24 bits
  • Consider the three bytes: 00000000 11111111 00000000
    • Green
  • Consider the three bytes: 00000000 0000000 11111111
    • Blue
  • Consider the three bytes: 00000000 0000000 0000000
    • Black
  • Consider the three bytes: 11111111 11111111 11111111
    • White
  • Can get many color variations by mixing the above colors in different quantities
  • When we talk about image formats, we typically don’t talk in terms of binary but rather something called hexadecimal (base-16, contains 16 digits)
    • 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
    • 0 is the smallest number we can represent in single digit
    • f is the largest number (value of 15) we can represent in a single digit
  • Consider the 8 bits: 1111 1111
    • Each hexadecimal digit represents four bits
    • One hexadecimal digit can represent the first four bits, another can represent the second four
      • Represent something with eight symbols using only two!
    • 1111 is the decimal number 15, which is f
    • Therefore, 1111 1111 in hexadecimal is ff
  • Red can thus be represented in hexadecimal as ff 00 00
  • Green can be represented in hexadecimal as 00 ff 00
  • Blue can be represented in hexadecimal as 00 00 ff
  • A lot of graphical editing software such as Photoshop use hexadecimal to represent colors

Bitmap Format


  • This background for Windows XP was a bitmap file (.bmp)
    • A mapping or grid of bits much like the smiley face from before
  • Zooming in on this image show that it is just a grid of dots

    XP zoom

    • Notice the pixelation
  • Much like with audio, so too in the world of images do you have discretion over how many bits to use
    • How many bits to represent each pixel’s color?
    • Resolution is another factor
      • An image that is only 100 pixels scaled up only duplicates the existing limited information, resulting in a blotchy image
      • Would be better to start with image that has a higher resolution (more pixels)
  • A lot of repeated colors, so it seems silly to represent each color with the same number of bits

Image Compression

  • Graphical file formats can often be compressed
  • Can be done lossy or losslessly
    • With audio, we threw away audio information that the human ear can’t necessarily hear
      • This is lossy compression; throwing information away
    • Using fewer bits to represent the same information is lossless compression
  • Lossless compression


  • There is a lot of repeated blue in the first image
    • Using the same 24 bits to represent each pixel!
  • The second image is compressed and not what a user would see
    • The first column contains the color that the rest of the row (scan line) should have
      • Image contains instructions on how to repeat the color in a particular row
    • When a color is encountered that isn’t in the first column (the apple in this case), the instructions would list the colors for each non-repeated pixel
    • This uses less bits but makes the original information recoverable
  • Lossy compression


  • This is a .jpg photograph that is somewhat compressed, but not easy to tell
  • Let’s say we want to compress this image further so that we can share it without going over a social media platform’s limit
  • It contains more complicated patterns of colors, so let’s try a lossy compression resulting in the following:


  • Lossy compression means that I won’t be able to get that original image back
    • The compression throws away bits of information
      • “Does the sky really need this many shades of blue?”
      • “Does this leaf really need this many shades of green”
        • Replaces bits with only a few colors giving an approximation
        • I will not be able to know how clear the sky used to be from this information

Image File Formats

  • BMP
    • Originally used by Windows
    • Not super common these days
  • GIF
    • Low quality images
      • Only supports 8-bit color
    • Often used for memes
      • Can be animated
      • Like a video file with only a few images


  • JPEG
    • Supports 24-bit color
    • Losslessly compresses
      • Can minimize amount of compression to create high quality photos
  • PNG
    • High quality graphics
    • Supports 24-bit color
  • All these formats ultimately have an limited amount of information
    • Ultimate just store pixels and colors of when the image was taken


  • Common for popular culture abuses of what it means to be a multimedia format
    • “Enhancing” means to make an image as clear as possible not matter what format it was saved in
  • David shows a clip of characters “enhancing” an image
    • The characters zoom into a pixelated frame of a video and somehow clear it up to see a reflection
      • Video is just a whole bunch of images being shown to us quickly (24 frames per second, etc.)
    • The pixelated image only contains information for those pixels
      • There is no way to obtain a clear image unless the original image was already at a high resolution
  • David contrasts this with an aware clip of Futurama

Video Compression

  • You can think of a video format as similar to a flip book
  • Video formats are just a bunch of images shown quickly in succession to create the illusion of motion
    • Not necessarily all information stored as png, jpg, gif, or even images
    • Algorithms and mathematics can help go from one frame to another
  • Opportunities for compression
    • Can leverage same image compression techniques for each frame (intra-frame coding)
    • Background of multiple frames can contain redundant information


    • Compare current frame and next frame of video and determine what has changed
      • Store these differences
      • Key frames store a snapshot of time to remember what the video looks like
      • In each subsequent frame remember what has changed
        • Using algorithms and math, background is drawn
      • Key frames are stored multiple times to guarantee that frames can be recovered

Video File Formats

  • In the world of video, there are more solutions on how to store information
  • Video file formats are containers
    • Containers are digital container in which you can put multiple types of data
    • Can include a video track, audio track, a secondary audio track (for different languages), closed captions, …
  • AVI
    • Commonly used in Windows
  • DivX
  • Matroska
    • Open source container meant to be more versatile
  • MP4
    • Pretty much universal in all browsers
  • QuickTime
    • Commonly used in MacOS
  • Codecs
    • Ways of storing and encoding information
    • For video:
      • H.264
      • MPEG-4 Part-2
    • For audio:
      • Can be stand alone files or tracks in a container!
      • AAC
      • MP3

3D Video

  • Increasingly, 3D formats are becoming more common


    • This is a 360 degree image of Sanders Theatre
      • A spherical image
      • Looks distorted in 2D
      • Like flattening a globe
  • Images can contain metadata
    • Information that viewers can’t see
    • Tells programs, applications, and browsers how to display the image
  • With sensors on a headset, users can experience virtual reality


  • More file formats are still on the horizon, but ultimately all of them boil down to storing 0s and 1s and why!