Players
The 1901 Boston Americans (now Red Sox) Team
Problem to Solve
If you’re not familiar, baseball is a popular sport in which two teams of 9 players take turns batting (hitting a ball) and fielding (catching and throwing hit balls). Points (“runs”) are scored when a hitting team’s player hits a ball and eventually touches all bases before the fielding team’s players have the chance to get them “out.” Baseball is arguably most popular in the United States and Canada, where the MLB (Major League Baseball) has served as the professional association for players since 1876.
In a database called players.db
, using a table called players
, answer questions about MLB players who’ve played from 1871 to 2023.
Demo
Distribution Code
For this problem, you’ll need to download players.db
, along with several .sql
files in which you’ll write your queries.
Download the distribution code
Log into cs50.dev, click on your terminal window, and execute cd
by itself. You should find that your terminal window’s prompt resembles the below:
$
Next execute
wget https://cdn.cs50.net/sql/2023/x/psets/0/players.zip
in order to download a ZIP called players.zip
into your codespace.
Then execute
unzip players.zip
to create a folder called players
. You no longer need the ZIP file, so you can execute
rm players.zip
and respond with “y” followed by Enter at the prompt to remove the ZIP file you downloaded.
Now type
cd players
followed by Enter to move yourself into (i.e., open) that directory. Your prompt should now resemble the below.
players/ $
If all was successful, you should execute
ls
and see a database named players.db
alongside several .sql
files. Executing sqlite3 players.db
should open the database in sqlite3
, via which you’ll execute SQL queries. If not, retrace your steps and see if you can determine where you went wrong!
Schema
In players.db
you’ll find a single table, players
. In the players
table, you’ll find the following columns:
id
, which uniquely identifies each row (player) in the tablefirst_name
, which is the first name of the playerlast_name
, which is the last name of the playerbats
, which is the side (R
for right orL
for left) the player bats onthrows
, which is the hand (R
for right orL
for left) the player throws withweight
, which is the player’s weight in poundsheight
, which is the player’s height in inchesdebut
, which is the date (expressed asYYYY-MM-DD
) the player began their career in the MLBfinal_game
, which is the date (expressed asYYYY-MM-DD
) the player played their last game in the MLBbirth_year
, which is the year the player was bornbirth_month
, which is the month (expressed as an integer) the player was bornbirth_day
, which is the day the player was bornbirth_city
, which is the city in which the player was bornbirth_state
, which is the state in which the player was bornbirth_country
, which is the country in which the player was born
Specification
- In
1.sql
, write a SQL query to find the hometown (including city, state, and country) of Jackie Robinson. - In
2.sql
, write a SQL query to find the side (e.g., right or left) Babe Ruth hit. - In
3.sql
, write a SQL query to find the ids of rows for which a value in the columndebut
is missing. - In
4.sql
, write a SQL query to find the first and last names of players who were not born in the United States. Sort the results alphabetically by first name, then by last name. - In
5.sql
, write a SQL query to return the first and last names of all right-handed batters. Sort the results alphabetically by first name, then by last name. - In
6.sql
, write a SQL query to return the first name, last name, and debut date of players born in Pittsburgh, Pennsylvania (PA). Sort the results first by debut date—from most recent to oldest—then alphabetically by first name, followed by last name. - In
7.sql
, write a SQL query to count the number of players who bat (or batted) right-handed and throw (or threw) left-handed, or vice versa. - In
8.sql
, write a SQL query to find the average height and weight, rounded to two decimal places, of baseball players who debuted on or after January 1st, 2000. Return the columns with the name “Average Height” and “Average Weight”, respectively. - In
9.sql
, write a SQL query to find the players who played their final game in the MLB in 2022. Sort the results alphabetically by first name, then by last name. - In
10.sql
, write SQL query to answer a question of your choice. This query should:- Make use of
AS
to rename a column - Involve at least condition, using
WHERE
- Sort by at least one column using
ORDER BY
- Make use of
Feeling more comfortable?
You can optionally attempt the below queries, which may require some advanced knowledge!
- Write a single SQL query to list the first and last names of all players of above average height, sorted tallest to shortest, then by first and last name.
Usage
To test your queries as you write them in your .sql
files, you can query the database by running
.read FILENAME
where FILENAME
is the name of the file containing your SQL query. For example,
.read 1.sql
You can also run
$ cat FILENAME | sqlite3 players.db > output.txt
to redirect the output of the query to a text file called output.txt
. (This can be useful for checking how many rows are returned by your query!)
Assessment
See the feedback page to learn more about how problems like these will be assessed.
Correctness
While check50
is available for this problem, you’re encouraged to instead test your code on your own for each of the following. If you’re using the players.db
database provided in this problem’s distribution, you should find that…
- Executing
1.sql
results in a table with 3 columns and 1 row. - Executing
2.sql
results in a table with 1 column and 1 row. - Executing
3.sql
results in a table with 1 column and 213 rows. - Executing
4.sql
results in a table with 2 columns and 2814 rows. - Executing
5.sql
results in a table with 2 columns and 12878 row. - Executing
6.sql
results in a table with 3 columns and 134 rows. - Executing
7.sql
results in a table with 1 columns and 1 row. - Executing
8.sql
results in a table with 2 columns and 1 row. - Executing
9.sql
results in a table with 2 columns and 516 rows.
10.sql
is up to you!
Note that row counts do not include header rows that only show column names.
check50
check50 cs50/problems/2024/sql/players
How to Submit
After you submit, be sure to check your autograder results. If you see SUBMISSION ERROR: missing files (0.0/1.0)
, it means your file was not named exactly as prescribed (or you uploaded it to the wrong problem).
Correctness in submissions entails everything from reading the specification, writing code that is compliant with it, and submitting files with the correct name. If you see this error, you should resubmit right away, making sure your submission is fully compliant with the specification. The staff will not adjust your filenames for you after the fact!
Ensure your terminal’s working directory is the players
folder. Your prompt should resemble the below:
players/ $
When you type ls
to list the files in your working directory, you should see your .sql
files for players
.
Zip up your solution files by executing the following:
zip players-solutions.zip *.sql
Want to learn about this command?
Notice this command has three parts:
zip
, which is the name of the tool which will create a.zip
fileplayers-solutions.zip
, which is the name of the.zip
file to create*.sql
, which represents the file(s) to include.*.sql
matches all files that end with.sql
, as*
is a “wildcard” character which matches any set of characters (similar to%
in SQL!).
Next, download players-solutions.zip
by control-clicking or right-clicking on the file in your codespace’s file browser and choosing Download.
Go to CSCI E-151’s Gradescope page.
Click Problem Set 0: Players.
Drag and drop your .zip
file to the area that says Drag & Drop. Be sure that each .sql
file is correctly named exactly as prescribed above, lest the autograder fail to run on your submission! Note that your submission is considered incomplete if any of the files are missing—be sure they’re all there!
Click Upload.
You should see a message that says “Problem Set 0: Players submitted successfully!”
Be sure to double-check your autograder results before moving on!
Acknowledgements
Data retrieved and modified from github.com/chadwickbureau/baseballdatabank/tree/master, licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.