Big 5
Problem to Solve
Ever taken a BuzzFeed quiz to determine whether you’re more like a brownie or a chocolate chip cookie? Turns out personality can be characterized in many ways and—within the present-day psychological community—some of the most common traits to describe personality include:
- Extroversion, the extent to which one might be socially outgoing
- Neuroticism, the extent to which one might experience emotional swings
- Agreeableness, the extent to which one might seek to be cooperative and empathetic
- Conscientiousness, the extent to which one might prioritize order and self-discipline
- Openness, the extent to which one might be open to new experiences
These 5 personality traits are together referred to as “The Big 5.” Psychologists (or those who are just curious about their personality!) might use various personality tests to assess the relative strength of these traits in one’s personality.
In a program called big5.R, in a folder called big5, write a program to analyze the results of thousands of Big 5 personality tests.
Distribution Code
For this problem, you’ll need to download big5.R, along with a tests.tsv file and corresponding codebook.
Download the distribution code
Open RStudio per the linked steps and navigate to the R console:
>
Next execute
getwd()
to print your working directory. Ensure your current working directory is where you’d like to download this problem’s distribution code. If using RStudio through cs50.dev, the recommended directory is /workspaces/NUMBER, where NUMBER is a number unique to your codespace.
If you do not see the right working directory, use setwd to change it!
Next execute
download.file("https://cdn.cs50.net/r/2024/x/psets/1/big5.zip", "big5.zip")
in order to download a ZIP called big5.zip into your codespace.
Then execute
unzip("big5.zip")
to create a folder called big5. You no longer need the ZIP file, so you can execute
file.remove("big5.zip")
Now type
setwd("big5")
followed by Enter to move yourself into (i.e., open) that directory. Your working directory should now end with
big5/
If all was successful, you should execute
list.files()
and see files named big5.R, codebook.txt, and tests.tsv. If not, retrace your steps and see if you can determine where you went wrong!
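If you would rather check the folder's contents programmatically than by eye, the following is a minimal sketch. The helper name missing_files is hypothetical and not part of the distribution code; it assumes the three file names listed above.

```r
# Return the names in `expected` that are not present in directory `dir`.
# (`missing_files` is a hypothetical helper, not part of the distribution code.)
missing_files <- function(dir, expected = c("big5.R", "codebook.txt", "tests.tsv")) {
  setdiff(expected, list.files(dir))
}

# Check the current working directory and report the outcome.
absent <- missing_files(getwd())
if (length(absent) == 0) {
  message("All distribution files found.")
} else {
  message("Missing: ", paste(absent, collapse = ", "))
}
```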
Specification
In big5.R, analyze the personality tests in tests.tsv, writing the results to a new file, analysis.csv.
analysis.csv should retain all columns in tests.tsv, with the following updates:
- Convert the gender column from a numeric representation to a textual representation.
- Add the following columns:
  - extroversion, a column that represents each test’s result on the extroversion trait
  - neuroticism, a column that represents each test’s result on the neuroticism trait
  - agreeableness, a column that represents each test’s result on the agreeableness trait
  - conscientiousness, a column that represents each test’s result on the conscientiousness trait
  - openness, a column that represents each test’s result on the openness trait
To understand tests.tsv, be sure to reference codebook.txt!
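As a rough illustration of writing a data frame to a .csv file, here is a sketch on made-up data. The data frame toy and the temporary output path are hypothetical. Passing row.names = FALSE is an assumption, made because the sample output on this page shows no row-number column.

```r
# A made-up data frame standing in for real results.
toy <- data.frame(name = c("Ada", "Grace"), score = c(5, 4))

# Write to a temporary file so nothing in the working directory is touched.
out <- tempfile(fileext = ".csv")

# row.names = FALSE omits the leading column of row numbers (an assumption here).
write.csv(toy, out, row.names = FALSE)

# Show the raw lines that were written.
print(readLines(out))
```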
Convert Demographic Data
To convert values in the gender column to text, adhere to the mapping between numbers and text provided by codebook.txt.
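One possible way to turn numeric codes into text is factor(), sketched below on an unrelated toy column. The column shirt and its codes and labels are placeholders and are not the mapping in codebook.txt, which remains the source of truth for gender.

```r
# Toy data; the codes and labels are placeholders, NOT the codebook's mapping.
toy <- data.frame(id = 1:4, shirt = c(2, 1, 3, 2))

# factor() pairs each numeric level with a label;
# as.character() then yields plain text rather than a factor.
toy$shirt <- as.character(
  factor(toy$shirt, levels = c(1, 2, 3), labels = c("Small", "Medium", "Large"))
)

print(toy$shirt)
```

Any code missing from levels would become NA, so it is worth listing every value the codebook defines.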
Compute Test Results
Test results for each Big 5 personality trait should be computed as follows:
- Sum the values of the relevant columns.
- Divide by the maximum possible sum for those columns, which is 15 (three columns, each with a maximum value of 5).
- Round the test results to 2 decimal places using a function called round.
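As a worked example of this arithmetic for a single test, the sketch below uses the E1, E2, and E3 values from the first sample row shown under Correctness on this page. The variable names e and score are illustrative only.

```r
# E1, E2, E3 from the first sample row under Correctness.
e <- c(E1 = 4, E2 = 5, E3 = 5)

# Sum, divide by the maximum possible sum (15), and round to 2 decimal places.
score <- round(sum(e) / 15, 2)

print(score)  # 0.93, matching that row's extroversion value
```

On a whole data frame, the same arithmetic applies element-wise when columns are added together, since R's arithmetic operators are vectorized.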
Advice
Consider the below as advice to help you on your way:
Read a .tsv file
tests.tsv is a Tab-Separated Values file. A .tsv is much like a .csv, save for the fact that values are separated by tab characters, not commas. For this reason, a function like read.csv, which expects commas by default, won’t be suitable as is.
Consider the more generic read.table, passing the right value to its sep parameter. In particular, a tab character can be represented with "\t". If curious, \t is an example of an escape sequence.
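To see read.table and its sep parameter in action without touching tests.tsv, here is a small sketch that writes a made-up tab-separated file to a temporary path and reads it back. The file contents and the names path and toy are hypothetical. Passing header = TRUE assumes the first line holds column names.

```r
# Write a tiny tab-separated file to a temporary path (made-up data).
path <- tempfile(fileext = ".tsv")
writeLines(c("name\tscore", "Ada\t5", "Grace\t4"), path)

# header = TRUE treats the first line as column names;
# sep = "\t" splits each line on tab characters.
toy <- read.table(path, header = TRUE, sep = "\t")

print(toy)
```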
Add a new column to a data frame
To add a new column to a data frame, assign a vector to a column name that does not yet exist. For example, to create a new column called extroversion on a data frame called tests, consider the below:
tests$extroversion <- ...
where ... is replaced with the vector you wish to assign to the extroversion column.
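The same pattern on a toy data frame might look like the sketch below. The data frame toy and its columns a, b, and total are made up for illustration.

```r
# A made-up data frame with two numeric columns.
toy <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))

# Assigning to a name that does not exist yet creates the column.
# The vector on the right must have one value per row.
toy$total <- toy$a + toy$b

print(toy)
```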
Usage
Assuming big5.R
is in your working directory, enter the below in the R console to test your program:
source("big5.R")
Assessment
See the feedback page to learn more about how problems like these will be assessed.
Correctness
While check50 is available for this problem (see below), here’s how to test your code manually.
Run your program with source("big5.R"). Your program should output a file, analysis.csv, where the first few rows have the following values:
age | gender | country | E1 | E2 | E3 | N1 | N2 | N3 | A1 | A2 | A3 | C1 | C2 | C3 | O1 | O2 | O3 | extroversion | neuroticism | agreeableness | conscientiousness | openness |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
53 | Male | US | 4 | 5 | 5 | 1 | 2 | 1 | 5 | 5 | 5 | 4 | 5 | 4 | 3 | 4 | 5 | 0.93 | 0.27 | 1.00 | 0.87 | 0.80 |
46 | Female | US | 2 | 3 | 3 | 2 | 4 | 4 | 3 | 4 | 3 | 4 | 3 | 4 | 3 | 3 | 2 | 0.53 | 0.67 | 0.67 | 0.73 | 0.53 |
14 | Female | PK | 5 | 1 | 5 | 5 | 5 | 5 | 1 | 5 | 5 | 4 | 5 | 5 | 5 | 5 | 5 | 0.73 | 1.00 | 0.73 | 0.93 | 1.00 |
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
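To compare your output against the rows above, one option is to read the file back and print its first few rows. The helper preview_csv below is a hypothetical sketch, not part of the distribution code. It assumes analysis.csv, if present, is in your working directory.

```r
# Return the first `n` rows of the CSV at `path`, or NULL if the file is absent.
# (`preview_csv` is a hypothetical helper for manual checking.)
preview_csv <- function(path, n = 3) {
  if (!file.exists(path)) {
    message("No file at ", path)
    return(NULL)
  }
  head(read.csv(path), n)
}

result <- preview_csv("analysis.csv")
if (!is.null(result)) print(result)
```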
check50
You can check your code using check50, a program that CS50 will use to test your code when you submit. But be sure to test it yourself as well!
Run the following command in the RStudio console:
check50("cs50/problems/2024/r/big5")
Green smilies mean your program has passed a test! Red frownies will indicate your program output something unexpected. Visit the URL that check50 outputs to see the input check50 handed to your program, what output it expected, and what output your program actually gave.
How to Submit
After you submit, be sure to check your autograder results. If you see SUBMISSION ERROR: missing files (0.0/1.0), it means your file was not named exactly as prescribed (or you uploaded it to the wrong problem).
Correctness in submissions entails everything from reading the specification and writing code that is compliant with it to submitting files with the correct name. If you see this error, you should resubmit right away, making sure your submission is fully compliant with the specification. The staff will not adjust your filenames for you after the fact!
In RStudio, select the big5.R file containing your work for this problem, as by checking the box to the left of the file’s name. With the file selected, click on the icon at the top of the file explorer. Choose Export followed by Download.
Go to CSCI E-5a’s Gradescope page.
Click Problem Set 1: Big 5.
Drag and drop your .R file to the area that says Drag & Drop. Be sure that your .R file is correctly named exactly as prescribed above, lest the autograder fail to run on your submission! Note that your submission is considered incomplete if any of the files are missing, so be sure they’re all there!
Click Upload.
You should see a message that says “Problem Set 1: Big 5 submitted successfully!”
Be sure to double-check your autograder results before moving on!
Acknowledgements
Data adapted from the Open-Source Psychometrics Project, openpsychometrics.org/_rawdata. Cover photo retrieved from commons.wikimedia.org/wiki/File:Wiki-grafik_peats-de_big_five_ENG.svg.