Hall of Prophecy
Learning Goals
- “Refactor” a SQLdatabase to eliminate redundancies
- Use SELECT,CREATE, andINSERTstatements to reorganize data
- Write Python to load new SQLtables

Background
The keeper of the Hall of Prophecy, whose job entailed labelling records and keeping them up to date, decided to create a SQL database of Hogwarts students. Unfortunately the database was poorly designed! The database, roster.db, contains only one table, students, with the name and head of each of the four Hogwarts Houses.
Stumbling upon this database, you decide it could be better designed. Take a look at roster.db and notice how the name and head of each house repeats over and over. A better design would contain a students table (for only students), a houses table (for only houses), and a table that codifies the relationship between students and houses. This process of changing the “schema” of a database is known as refactoring.
- Hint
    - You can view the data in the Hall of Prophecy’s table by executing sqlite3 roster.db, followed by.schemain yoursqlite3prompt. This will output theCREATE TABLEstatement that was used to generate thestudentstable.
- You can then use SELECTstatements to view the contents of this table.
 
- You can view the data in the Hall of Prophecy’s table by executing 
Getting Started
- Log into cs50.dev using your GitHub account.
- Click inside the terminal window and execute cd.
- Execute wget https://cdn.cs50.net/2022/fall/labs/7/prophecy.zipfollowed by Enter in order to download a zip calledprophecy.zipin your codespace. Take care not to overlook the space betweenwgetand the following URL, or any other character for that matter!
- Now execute unzip prophecy.zipto create a folder calledprophecy.
- You no longer need the ZIP file, so you can execute rm prophecy.zipand respond with “y” followed by Enter at the prompt.
Implementation Details
You will use the existing data in roster.db to create a new database, one with a table for students, a table for houses, and a table for house assignments. You can do this with individual SQL queries, though we recommend ultimately writing your own Python program to automate the process! Notice we’ve given you the data from the students table of roster.db in CSV format, students.csv, for your convenience.
Developing a Schema
First, let’s develop a new schema for the database. In schema.sql, document the three CREATE TABLE commands you’ll need to create your three new tables.
Keep in mind you’ll want each table to represent a single entity: that is, your students table should represent only students, your houses table should represent only houses, and your house assignments table should represent only house assignments. You might find it helpful to first consider the pieces of information you’ll need in each table, and then about which SQLite data type best represents that information. For example, this was the CREATE TABLE command for students:
CREATE TABLE students (
    id INTEGER,
    student_name TEXT,
    house TEXT,
    head TEXT,
    PRIMARY KEY(id)
);
Keep in mind that every table should have a primary key: a column that uniquely identifies every row in the table. Some tables may be best designed with foreign keys: columns that reference the primary keys of another table.
When you’re ready to configure your database with your new schema, run your three new CREATE TABLE queries. Type .schema to see your results.
Inserting Data
After you’ve configured your database with your new schema, you can begin inserting data into your new database (using your new schema!). It’s best to write a Python program here, which can save you the trouble of writing many INSERT queries. Keep in mind that you have students.csv, which contains the data from the previous database’s students table.
Thought Question
- Why do you think it’s considered better design not to repeat information like houses and heads for each student?
How to Test Your Code
You’ll likely find the following commands helpful for testing your code:
- .schemato check the schema of your database
- SELECT * FROM table;where- tableis the name of the table you’d like to see data from
No check50 for this one!
How to Submit
No need to submit. This is a practice problem!