Count on It

Problem 1

128

Problem 2

Looking at the first byte of a character tells us how many bytes the character uses. If the first byte begins with 0, then the character is represented by one byte; otherwise, the number of 1s before the first 0 indicates the number of bytes used to represent the character.
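
As a rough sketch of that rule, the function below counts the 1 bits at the start of a first byte. The name utf8_length is hypothetical (it is not part of the problem), and returning 0 for a byte that cannot begin a character is an assumption made here for illustration.

```c
typedef unsigned char BYTE;

// Hypothetical helper: number of bytes in the character that begins with `first`.
// Returns 0 if `first` cannot begin a character.
int utf8_length(BYTE first)
{
    // A leading 0 bit means a one-byte character
    if ((first & 0x80) == 0)
    {
        return 1;
    }

    // Otherwise, count the 1 bits before the first 0 bit
    int ones = 0;
    for (BYTE mask = 0x80; mask != 0 && (first & mask); mask >>= 1)
    {
        ones++;
    }

    // Exactly one leading 1 marks a continuation byte, and UTF-8 stops at four bytes
    if (ones < 2 || ones > 4)
    {
        return 0;
    }
    return ones;
}
```

For instance, 0xE2 is 11100010 in binary, so utf8_length(0xE2) is 3.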

Problem 3

If the byte has a value in the range of 128 to 191 (in other words, its first two bits are 10), then it is a continuation in a multi-byte sequence.
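
The same check can be written with a bit mask instead of a range comparison. This is only a sketch, and is_continuation is a hypothetical name.

```c
#include <stdbool.h>

typedef unsigned char BYTE;

// Hypothetical helper: true if `b` has the form 10xxxxxx
bool is_continuation(BYTE b)
{
    // 0xC0 is 11000000, which keeps only the first two bits;
    // 0x80 is 10000000, the pattern those two bits must match
    return (b & 0xC0) == 0x80;
}
```

Both forms agree: 128 is 10000000 and 191 is 10111111, so every value in that range passes the mask test and no other value does.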

Problem 4

#include <stdbool.h>
#include <stdio.h>

typedef unsigned char BYTE;

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("Usage: ./count INPUT\n");
        return 1;
    }

    // Open in binary mode so that no byte is translated on the way in
    FILE *file = fopen(argv[1], "rb");
    if (!file)
    {
        printf("Could not open file.\n");
        return 1;
    }

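    // Every byte that is not a continuation byte (128 to 191) begins one character
    int count = 0;
    BYTE b;

    // Stop when fread fails to return a byte, whether at end of file or on a read error
    while (fread(&b, 1, 1, file) == 1)
    {
        if (b < 128 || b > 191)
        {
            count++;
        }
    }
    fclose(file);
    printf("Number of characters: %i\n", count);
}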