Tour de France
Notable topics: Survival analysis, Animated bar graph (gganimate package)
Recorded on: 2020-04-06
Timestamps by: Alex Cookson
Screencast
Timestamps
Getting an overview of the data
Aggregating data into decades using the integer division operator %/% (e.g., 10 * (1987 %/% 10) gives 1980)
Noting that death data is right-censored (i.e., some winners are still alive)
Using transmute function, which combines functionality of mutate (to create new variables) and select (to choose variables to keep)
Using survfit function from survival package to conduct survival analysis
Using glance function from broom package to get a one-row model summary of the survival model
Using extract function from tidyr package to pull out a string matching a regular expression from a variable (stage number in this case)
Theorizing that there is a parsing issue with the original data's time field
Using the "peeling" behaviour of group_by and summarise, where a summarise call "peels away" the last grouping variable but leaves the other groupings intact
Using rank function, then upgrading to percent_rank function to give percentile rankings (between 0 and 1)
Using geom_smooth function with method argument as "lm" to plot a linear regression
Using cut function to bin numbers (percentiles in this case) into categories
Reviewing boxplots exploring relationship between first-stage performance and overall Tour performance
Starting to create an animation using gganimate package
Actually writing the code to create the animation
Using reorder_within function from tidytext package to re-order factors that have the same name across multiple groups
Summary of screencast
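A minimal sketch of the decade aggregation, transmute, survfit and glance steps listed above. The `winners` tibble, its column names, and the cut-off year 2020 are made up for illustration; this is not the code or the data from the screencast.

```r
library(dplyr)
library(survival)
library(broom)

# Hypothetical toy data: invented winners, not real Tour de France records
winners <- tibble(
  winner     = c("A", "B", "C", "D", "E", "F"),
  year       = c(1903, 1927, 1955, 1968, 1989, 2012),
  birth_year = c(1871, 1899, 1925, 1940, 1961, 1980),
  death_year = c(1957, 1980, 2005, NA, NA, NA)  # NA = still alive (right-censored)
)

surv_data <- winners %>%
  # transmute creates new columns and keeps only the ones named here
  transmute(
    winner,
    decade = 10 * (year %/% 10),                   # integer division: 1927 -> 1920
    dead   = as.integer(!is.na(death_year)),       # 1 = death observed, 0 = censored
    age    = coalesce(death_year, 2020) - birth_year  # assumed observation year: 2020
  )

# Kaplan-Meier fit with no covariates; censored rows still count as "alive up to age"
model <- survfit(Surv(age, dead) ~ 1, data = surv_data)

# One-row summary of the fitted survival curve
model_summary <- glance(model)
```

Printing `model_summary` shows the single summary row; with so few made-up rows the estimates themselves carry no meaning.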
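The stage-level steps (extract, percent_rank, cut, and the "peeling" of groups by summarise) can be sketched in the same spirit. The `stages` tibble, its columns, and the bucket boundaries are assumptions chosen for illustration, not values from the screencast.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: three riders, two stages, two years
stages <- tibble(
  year    = rep(c(2018, 2019), each = 6),
  stage   = rep(c("Stage 1", "Stage 1", "Stage 1",
                  "Stage 2", "Stage 2", "Stage 2"), times = 2),
  rider   = rep(c("A", "B", "C"), times = 4),
  seconds = c(100, 110, 120, 205, 200, 230, 90, 95, 99, 180, 185, 170)
)

ranked <- stages %>%
  # Pull the digits out of "Stage 1"; convert = TRUE turns them into integers
  extract(stage, "stage_number", "Stage (\\d+)", convert = TRUE) %>%
  group_by(year, stage_number) %>%
  # Percentile of each rider's time within the stage, between 0 and 1
  mutate(percentile = percent_rank(seconds)) %>%
  ungroup() %>%
  # Bin the percentiles into categories (boundaries are arbitrary here)
  mutate(bucket = cut(percentile, breaks = c(0, 0.25, 0.75, 1),
                      include.lowest = TRUE))

per_stage <- ranked %>%
  group_by(year, stage_number) %>%
  # summarise drops the last grouping variable (stage_number) and keeps year
  summarise(fastest = min(seconds))

remaining_groups <- group_vars(per_stage)
```

After the summarise call, `remaining_groups` holds only `"year"`, so a second summarise would aggregate per year without another group_by.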