Nobel Prize Winners
Notable topics: Data manipulation, Graphing for EDA (Exploratory Data Analysis)
Recorded on: 2019-05-23
Timestamps by: Alex Cookson
Screencast
Timestamps
Creating a stacked bar plot using geom_col and the aes function's fill argument (also bins years into decades with the integer division operator %/%)
Using n_distinct function to quickly count unique years in a group
Using distinct function and its .keep_all argument to de-duplicate data
Using coalesce function to replace NAs in a variable (similar to SQL's COALESCE function)
Using year function from lubridate package to calculate (approx.) age of laureates at time of award
Using fct_reorder function to arrange boxplot graph by the median age of winners
Defining a new variable within the count function (like doing a mutate before the count, in a single step)
Creating a small multiples bar plot using geom_col and facet_wrap functions
Importing income data from WDI package to explore relationship between high/low income countries and winners
Using fct_relevel function to reorder the levels of a categorical income variable (e.g., "Upper middle income") so that the ordering makes sense
Starting to explore new dataset of Nobel laureate publications
Taking the mean of a subset of data without needing to fully filter the data beforehand
Using rank function and its ties.method argument to add the ordinal number of a laureate's publication (e.g., 1st paper, 2nd paper)
Lots of playing around with exploratory histograms (geom_histogram)
Discussion of right-censoring as an issue (people winning the Nobel Prize but still having active careers)
Summary of screencast
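
Several of the data-manipulation techniques listed above can be tried on a toy example. The following is a minimal sketch, not the code from the screencast: the data frames, column names (prize_year, birth_country, org_country, pub_year) and all values are invented for illustration, and it assumes the dplyr and lubridate packages are installed.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: every name and value here is invented for illustration.
prizes <- tibble(
  laureate      = c("Ada", "Ada", "Ben", "Cai", "Dee"),
  category      = c("Physics", "Physics", "Chemistry", "Chemistry", "Peace"),
  prize_year    = c(1903, 1903, 1911, 1954, 1958),
  birth_date    = as.Date(c("1860-03-01", "1860-03-01", "1870-07-15",
                            "1901-02-28", "1890-01-01")),
  birth_country = c("Poland", "Poland", NA, "Japan", NA),
  org_country   = c("France", "France", "Sweden", "Japan", "Norway")
)

laureates <- prizes %>%
  # De-duplicate on the key columns while keeping all other columns
  distinct(laureate, prize_year, category, .keep_all = TRUE) %>%
  mutate(
    # Replace missing birth countries with the organization's country
    country = coalesce(birth_country, org_country),
    # Approximate age at time of award (ignores month and day)
    age = prize_year - year(birth_date)
  )

# Define a new variable inside count(): bin years into decades with %/%
by_decade <- laureates %>%
  count(decade = 10 * (prize_year %/% 10))

by_category <- laureates %>%
  group_by(category) %>%
  summarize(
    # Count unique prize years within each category
    n_years = n_distinct(prize_year),
    # Mean of a subset, taken by indexing inside mean() instead of filtering first
    mean_age_pre_1950 = mean(age[prize_year < 1950])
  )

# Hypothetical publication data: number each laureate's papers in order of year,
# breaking ties by order of appearance
papers <- tibble(
  laureate = c("Ada", "Ada", "Ada", "Ben", "Ben"),
  pub_year = c(1895, 1890, 1895, 1900, 1898)
) %>%
  group_by(laureate) %>%
  mutate(paper_number = rank(pub_year, ties.method = "first")) %>%
  ungroup()
```

Printing laureates, by_decade, by_category and papers shows the result of each step. A category with no prizes before 1950 gets NaN for mean_age_pre_1950, because the subset passed to mean() is empty.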