Less is More

Problem 1

T3A4C2G1A3

Problem 2

Run-length encoding is likely to increase the size of sequences that contain few runs of repeated characters, rather than decrease it. For example, ATGC would be represented as A1T1G1C1, which requires 8 characters instead of the original 4.
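The size increase is easy to check with a short sketch. The Python below is one possible implementation of the character-then-count format used in these answers; the function name rle_encode is an assumption made for illustration, not something the problem set defines.

```python
from itertools import groupby


def rle_encode(seq: str) -> str:
    """Encode each run of identical characters as the character followed by the run length."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(seq))


# Every run in ATGC has length 1, so the encoding doubles the length.
print(rle_encode("ATGC"), len(rle_encode("ATGC")))      # A1T1G1C1 8
# Long runs are where the encoding pays off.
print(rle_encode("AAAAAAAA"), len(rle_encode("AAAAAAAA")))  # A8 2
```

In general, this format only saves space when the average run is longer than two characters.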

Problem 3

Germany’s flag would be compressed more. Since each row of pixels in Germany’s flag only contains a single color (as compared to Romania’s three), the runs of pixels in Germany’s flag are longer and therefore can be compressed more.
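To make the comparison concrete, the sketch below counts runs row by row in two tiny stand-in images. The 9-by-6 grid size, the single-letter color codes, and the helper name count_runs are all assumptions chosen for illustration; it also assumes that a run never continues from one row into the next.

```python
from itertools import groupby


def count_runs(rows):
    """Total number of runs when each row of pixels is encoded separately."""
    return sum(len(list(groupby(row))) for row in rows)


WIDTH, HEIGHT = 9, 6  # hypothetical image size

# Germany: three horizontal stripes (black, red, gold), so each row is one color.
germany = ["K" * WIDTH] * 2 + ["R" * WIDTH] * 2 + ["Y" * WIDTH] * 2
# Romania: three vertical stripes (blue, yellow, red), so each row has three colors.
romania = ["B" * 3 + "Y" * 3 + "R" * 3] * HEIGHT

print(count_runs(germany))  # 6
print(count_runs(romania))  # 18
```

Fewer runs over the same number of pixels means longer runs, which is why the horizontally striped flag compresses better under this scheme.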

Problem 4

11101011110

Problem 5

11101011110

Problem 6

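The problem with the encoding is that it’s ambiguous: two distinct DNA sequences can have the same encoding, so a decoder cannot always recover the original sequence. An unambiguous alternative is to use two bits for each nucleotide: 00 for A, 01 for C, 10 for G, and 11 for T. Because every code has the same length, an encoded bit string can be split into nucleotides in only one way.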
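The two-bit scheme can be sketched in Python as a pair of lookup tables. The names CODES, DECODE, encode, and decode are assumptions introduced for this example, and GATTACA is an arbitrary sample sequence.

```python
# Fixed-length codes from the answer above: two bits per nucleotide.
CODES = {"A": "00", "C": "01", "G": "10", "T": "11"}
DECODE = {bits: nucleotide for nucleotide, bits in CODES.items()}


def encode(seq: str) -> str:
    """Replace each nucleotide with its two-bit code."""
    return "".join(CODES[nucleotide] for nucleotide in seq)


def decode(bits: str) -> str:
    """Read the bit string two bits at a time and map each pair back to a nucleotide."""
    return "".join(DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2))


encoded = encode("GATTACA")
print(encoded)          # 10001111000100
print(decode(encoded))  # GATTACA
```

Since decoding always consumes exactly two bits per nucleotide, no two distinct sequences can produce the same bit string.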