On Time

“Adams Square Station, looking southerly,” taken on Sept. 10, 1898.
Problem to Solve
The first subway tunnels in the United States, built to help people stay on time while moving around the city, are still in use today under Boston Common (not too far from Harvard itself!). Over 100 years later, the MBTA—the Massachusetts Bay Transportation Authority—manages public transportation around Boston, still ensuring everyone can stay on time, whether by subway, bus, railroad, or ferry.
The MBTA has asked you to clean up a data set it will use to calculate its service reliability, putting the data in a SQLite database for easy use and maintenance.
Distribution Code
Download the distribution code
Log into cs50.dev, click on your terminal window, and execute cd by itself. You should find that your terminal window’s prompt resembles the below:
$
Next execute
wget https://cdn.cs50.net/sql/2025/spring/sections/3/ontime.zip
in order to download a ZIP called ontime.zip into your codespace.
Then execute
unzip ontime.zip
to create a folder called ontime. You no longer need the ZIP file, so you can execute
rm ontime.zip
and respond with “y” followed by Enter at the prompt to remove the ZIP file you downloaded.
Now type
cd ontime
followed by Enter to move yourself into (i.e., open) that directory. Your prompt should now resemble the below.
ontime/ $
If all was successful, you should execute
ls
and see a database named ontime.db alongside a rail.csv file.
Specification
Write a series of SQL (and SQLite) statements to import and clean the data from rail.csv into a table, rail, in a database called ontime.db.
Within ontime.db, the rail table is already created for you.
To consider the data in the rail table clean, you should ensure…
- The data includes only those rows related to “PEAK” transit times.
- All green line trains (e.g., “Green-D”, “Green-E”, etc.) are listed simply as “Green.”
- Each row has a unique ID.
Advice
Begin by importing rail.csv into a temporary table
Start by getting all of the data from rail.csv into a temporary table, one called rail_temp. This table was created for you already in ontime.db.
A temporary table is a helpful placeholder: you can use it to clean your data until it’s in a form that’s suitable for your final rail table.
Recall that .import is a SQLite statement that can import a CSV file into a table of your choice.
Clean the imported data
With your data in a temporary table, continue writing SQL statements to clean the data.
You’ll need to use UPDATE and DELETE statements to ensure the data includes only those rows related to “PEAK” transit times, and that all green line trains (e.g., “Green-D”, “Green-E”, etc.) are listed simply as “Green.”
Transfer the data from your temporary table into the rail table
Recall that you can INSERT values into a new table by SELECTing rows from another:
INSERT INTO "table0" ("column0", "column1")
SELECT "column0", "column1" FROM "table1";
If your table has a primary key column, as rail does, a unique ID should be automatically generated for each row you insert.