Houses

For this lab, we’ll implement a program to import student data into a database and then produce class rosters, a la the below.

$ python import.py characters.csv

$ python roster.py
House: Gryffindor
Lavender Brown, born 1979
Colin Creevey, born 1981
Seamus Finnigan, born 1979
Hermione Jean Granger, born 1979
Neville Longbottom, born 1980
Parvati Patil, born 1979
Harry James Potter, born 1980
Dean Thomas, born 1980
Romilda Vane, born 1981
Ginevra Molly Weasley, born 1981
Ronald Bilius Weasley, born 1980

Getting Started

Visit sandbox.cs50.io/2089cc5b-2bf6-436b-810d-9fac44eb0a5e, which should contain four files:

  • characters.csv, a CSV file containing data on Hogwarts students
  • import.py, in which you’ll soon write a program via which to import that CSV into a SQLite database
  • roster.py, in which you’ll soon write a program via which to generate a class roster for a specific Hogwarts house
  • students.db, a SQLite database with this schema:
      CREATE TABLE students (
          id INTEGER PRIMARY KEY AUTOINCREMENT,
          first VARCHAR(255),
          middle VARCHAR(255),
          last VARCHAR(255),
          house VARCHAR(10),
          birth INTEGER
      );
    

You’re welcome to download students.db from cdn.cs50.net/hbs/2020/spring/labs/5/houses/students.db to your own computer if you’d like to try out some INSERTs and SELECTs with DB Browser.

Background

Hogwarts is in need of a student database. For years, the professors have been maintaing a CSV file containing all of the students’ names and houses and years. But that file didn’t make it particularly easy to get access to certain data, such as a roster of all the Ravenclaw students, or an alphabetical listing of the students enrolled at the school.

The challenge ahead of you is to import all of the school’s data into a SQLite database, and write a Python program to query that database to get house rosters for each of the houses of Hogwarts.

Specification

In import.py, write a program that imports data from characters.csv.

  • You may assume that characters.csv exists and has columns name, house, and birth.
  • For each student in the CSV file, insert the student into the students table in the students.db database.
    • While the CSV file provided to you has just a name column, the database has separate columns for first, middle, and last names. You’ll thus want to first parse each name and separate it into first, middle, and last names. You may assume that each person’s name field will contain either two space-separated names (a first and last name) or three space-separated names (a first, middle, and last name). For students without a middle name, you should leave their middle name field as NULL in the table.

In roster.py, write a program that prints a list of students for a given house in alphabetical order.

  • Your program should accept the name of a house via input.
  • Your program should query the students table in the students.db database for all of the students in the specified house.
  • Your program should then print out each student’s full name and birth year (formatted as, e.g., Harry James Potter, born 1980 or Luna Lovegood, born 1981).
    • Each student should be printed on their own line.
    • Students should be ordered by last name. For students with the same last name, they should be ordered by first name.

Usage

Your program should behave per the example below:

$ python import.py
$ python roster.py
House: Gryffindor
Hermione Jean Granger, born 1979
Harry James Potter, born 1980
Ginevra Molly Weasley, born 1981
Ronald Bilius Weasley, born 1980
...

References

Hints

Recall that after you’ve imported SQL from cs50, you can set up a database connection using syntax like

db = SQL("sqlite:///students.db")

Then, you can use db.execute to execute SQL queries from inside of your Python script.

Recall, too, that when you call db.execute and perform a SELECT query, the return value will be a list of rows that are returned, where each row is represented by a Python dict.

If you realize you’ve imported data into students.db incorrectly, you can empty the table manually using sqlite3:

$ sqlite3 students.db
sqlite> DELETE FROM students;
sqlite> .quit

Or automatically, every time you execute import.py, with a line of code atop your program like:

db.execute("DELETE FROM students")

Your code for roster.py, meanwhile, once correct, should behave like the below.

$ python roster.py
House: Gryffindor
Lavender Brown, born 1979
Colin Creevey, born 1981
Seamus Finnigan, born 1979
Hermione Jean Granger, born 1979
Neville Longbottom, born 1980
Parvati Patil, born 1979
Harry James Potter, born 1980
Dean Thomas, born 1980
Romilda Vane, born 1981
Ginevra Molly Weasley, born 1981
Ronald Bilius Weasley, born 1980

$ python roster.py
House: Hufflepuff
Hannah Abbott, born 1980
Susan Bones, born 1979
Cedric Diggory, born 1977
Justin Finch-Fletchley, born 1979
Ernest Macmillan, born 1980

$ python roster.py
House: Ravenclaw
Terry Boot, born 1980
Mandy Brocklehurst, born 1979
Cho Chang, born 1979
Penelope Clearwater, born 1976
Michael Corner, born 1979
Roger Davies, born 1978
Marietta Edgecombe, born 1978
Anthony Goldstein, born 1980
Robert Hilliard, born 1974
Luna Lovegood, born 1981
Isobel MacDougal, born 1980
Padma Patil, born 1979
Lisa Turpin, born 1979

$ python roster.py
House: Slytherin
Millicent Bulstrode, born 1979
Vincent Crabbe, born 1979
Tracey Davis, born 1980
Marcus Flint, born 1975
Gregory Goyle, born 1980
Terence Higgs, born 1979
Draco Lucius Malfoy, born 1980
Adelaide Murton, born 1982
Pansy Parkinson, born 1979
Adrian Pucey, born 1977
Blaise Zabini, born 1979