Exploratory Data Analysis

Hurricane Anita

Problem to Solve

Exploratory data analysis (EDA) is a method used by data scientists to find interesting characteristics of data and test hypotheses. It’s often one of the initial steps a researcher might take in a larger data analysis program.

You now have many of the tools you need to perform your own exploratory data analysis! In a file called eda.R, in a folder called eda, write a program to visualize and explore a data set of your choice.

Getting Started

For this problem, you’ll need to create eda.R in a folder called eda.

Create eda.R

Open RStudio per the linked steps and navigate to the R console:

>

Next execute

getwd()

to print your working directory. Ensure your current working directory is where you’d like to create this problem’s folder. If using RStudio through cs50.dev, the recommended directory is /workspaces/NUMBER, where NUMBER is a number unique to your codespace.

If you do not see the right working directory, use setwd to change it! Try typing setwd("..") if in the working directory of another problem, which will move you one directory higher.
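As a rough illustration of checking and changing the working directory, the sketch below prints the current directory, moves one level up, and then moves back. The variable name original is only an assumption for this example. The final setwd(original) is there so the sketch leaves your session unchanged; omit it when you actually want to stay one directory higher.

```r
# Remember and print the current working directory.
original <- getwd()
print(original)

# Move one directory higher, then confirm where we ended up.
setwd("..")
print(getwd())

# Return to the starting directory (illustration only).
setwd(original)
```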

Next execute

dir.create("eda")

in order to create a folder called eda in your codespace.

Now type

setwd("eda")

followed by Enter to move yourself into (i.e., open) that directory. Your working directory should now end with

eda/

Finally, type

file.create("eda.R")

to create a file called eda.R inside of the eda folder.

If all was successful, you should be able to execute

list.files()

and see eda.R. If not, retrace your steps and see if you can determine where you went wrong!
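The steps above can be sketched as one short sequence. This is a minimal sketch, not the required way to do it: a temporary directory (the hypothetical variable workspace) stands in for /workspaces/NUMBER so the example can run anywhere, and the existence checks are an added precaution, since dir.create() warns if the folder already exists and file.create() empties a file that already exists.

```r
# Assumption: a temporary directory stands in for your real workspace.
workspace <- tempfile("workspace")
dir.create(workspace)
old <- setwd(workspace)  # setwd() returns the previous working directory

# Create the eda folder only if it is not already there.
if (!dir.exists("eda")) {
  dir.create("eda")
}
setwd("eda")

# Create eda.R only if it is missing, so existing work is not truncated.
if (!file.exists("eda.R")) {
  file.create("eda.R")
}

# List the folder's contents; eda.R should appear.
files <- list.files()
print(files)

# Go back to where we started (illustration only).
setwd(old)
```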

Specification

The only requirement for this problem is that you produce a visualization you care about, that is interesting to you, and that you feel proud of. Oh, and that you save the visualization in a file called visualization.png!
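To make the one firm requirement concrete, here is one possible shape for eda.R, offered as a sketch rather than a model answer. It assumes R's built-in airquality data set stands in for a data set of your choice, and it uses base R graphics so that it runs without extra packages. The plot title, image size, and variable names are illustrative choices only.

```r
# Assumption: the built-in airquality data set stands in for your own data.
data <- airquality

# Explore: how big is the data, and what does each column look like?
print(dim(data))
print(summary(data))

# Average of the daily maximum temperature for each month.
monthly <- aggregate(Temp ~ Month, data = data, FUN = mean)

# Open a PNG device, draw the plot, then close the device to write the file.
png("visualization.png", width = 800, height = 600)
barplot(
  monthly$Temp,
  names.arg = month.name[monthly$Month],
  main = "Average Daily Maximum Temperature, New York, 1973",
  xlab = "Month",
  ylab = "Temperature (degrees Fahrenheit)"
)
invisible(dev.off())
```

If you build your plot with ggplot2 instead, its ggsave function can write the file in place of the png() and dev.off() pair.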

You might find it helpful to get a bit of inspiration:

Usage

Assuming eda.R is in your working directory, enter the below in the R console to test your program:

source("eda.R")

How to Test

We’re afraid there isn’t a specific way to test your code, other than using trial and error until you produce the visualization you’re looking for!
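That said, one thing you can check mechanically is whether your program actually saved a file. The helper below, visualization_saved, is a hypothetical name invented for this sketch, not part of R or of CS50's tooling. It only confirms that a non-empty file exists at the given path; whether the picture is the one you wanted still takes your own eyes.

```r
# Hypothetical helper: TRUE when `path` exists and is a non-empty file.
visualization_saved <- function(path = "visualization.png") {
  file.exists(path) && file.size(path) > 0
}

# After running source("eda.R") in the same directory, check the result.
print(visualization_saved())
```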

check50

You can also check your code using check50, a program that CS50 will use to test your code when you submit. But be sure to test it yourself as well!

Run the following command in the RStudio console:

check50("cs50/problems/2024/r/eda")

Green smilies mean your program has passed a test! Red frownies will indicate your program output something unexpected. Visit the URL that check50 outputs to see the input check50 handed to your program, what output it expected, and what output your program actually gave.

How to Submit

You can submit your code using submit50.

Keeping in mind the course’s policy on academic honesty, run the following command in the RStudio console:

submit50("cs50/problems/2024/r/eda")