Count on It
Recall that ASCII is just one way to represent characters.
- (1 point.) How many total characters can be represented in ASCII, if each character is represented using 7 bits?
If we want to represent more characters than ASCII allows, we can use Unicode, which uses more bits than ASCII to represent some characters. One implementation of Unicode, UTF-8, uses “variable-width encoding” to represent characters: each character is represented by one, two, three, or four bytes.
Read up on UTF-8 at fileformat.info/info/unicode/utf8.htm.
- (2 points.) When reading, as via fread, a text file encoded as UTF-8, how can you determine how many bytes a character will take?
- (2 points.) If you’re reading a file encoded as UTF-8, and you read a byte, how can you determine if that byte is a continuation of an existing character, rather than the beginning of a new character?
- (4 points.) The program below counts the number of characters in a file, assuming the file is encoded as ASCII. Modify the program so that it counts the number of characters in a file encoded as UTF-8.
#include <stdbool.h>
#include <stdio.h>
typedef unsigned char BYTE;
int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./count INPUT\n");
        return 1;
    }
    FILE *file = fopen(argv[1], "r");
    if (!file)
    {
        printf("Could not open file.\n");
        return 1;
    }
    int count = 0;
    while (true)
    {
        BYTE b;
        fread(&b, 1, 1, file);
        if (feof(file))
        {
            break;
        }
        count++;
    }
    fclose(file);
    printf("Number of characters: %i\n", count);
}
In addition to submitting this subquestion using the instructions contained in the “How to Submit” section of the Test, you must also:
- Write your program in a file called count.c
- Submit count.c by running submit50 cs50/problems/2019/fall/test/count.