Maryland Bridges
Notable topics: Data manipulation, Map visualization
Recorded on: 2018-11-26
Timestamps by: Alex Cookson
Timestamps
Using geom_line to create an exploratory line graph
Using the %/% operator (integer division) to bin years into decades (e.g., 1980, 1984, and 1987 would all become "1980")
Converting two-digit year to four-digit year (e.g., "16" becomes "2016") by adding 2000 to each one
Using percent_format function from scales package to get nice-looking axis labels
Using geom_col to create a nice ordered bar/column graph
Using replace_na to replace NA values with "Other"
Starting exploration of average daily traffic
Using comma_format function from scales package to get more readable axis labels (e.g., "1e+05" becomes "100,000")
Using cut function to bin continuous variable into customized breaks (also does a mutate within a group_by!)
Starting to make a map
Encoding a continuous variable to colour, then using scale_colour_gradient2 function to specify colours and midpoint
Specifying the trans argument (transformation) of the scale_colour_gradient2 function to get a logarithmic scale
Using str_to_title function to get values to Title Case (first letter of each word capitalized)
Predicting whether bridges are in "Good" condition using logistic regression (remember to specify the family argument! Dave fixes this at 52:54)
Explanation of why we should NOT be using an OLS linear regression
Using the augment function from the broom package to illustrate why a linear model is not a good fit
Specifying the type.predict argument in the augment function so that we get the actual predicted probability
Explanation of why the sigmoidal shape of logistic regression can be a drawback
Using a cubic spline model (a type of GAM, Generalized Additive Model) as an alternative to logistic regression
Explanation of the shape that a cubic spline model can take (which logistic regression cannot)
Visualizing the model in a different way, using a coefficient plot
Using geom_vline function to add a red reference line to a graph
Adding confidence intervals to the coefficient plot by specifying the conf.int argument of the tidy function and plotting them with the geom_errorbarh function
Brief explanation of log-odds coefficients
Summary of screencast
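A few of the steps listed above can be sketched in base R. This is a minimal illustration, not the code from the screencast: every variable name and value below is made up, and the screencast itself works with tidyverse and broom functions rather than the base R equivalents used here.

```r
# All values below are hypothetical, not taken from the bridges dataset.

# Bin years into decades: integer-divide by 10, then multiply back.
yr_built <- c(1980, 1984, 1987, 1991, 2005)
decade <- 10 * (yr_built %/% 10)          # 1980 1980 1980 1990 2000

# Convert two-digit years to four-digit years by adding 2000.
inspection_yr <- c(16, 17, 9)
inspection_year <- inspection_yr + 2000   # 2016 2017 2009

# Bin a continuous variable into custom breaks with cut().
avg_daily_traffic <- c(50, 1500, 25000, 120000)
traffic_bin <- cut(
  avg_daily_traffic,
  breaks = c(0, 1000, 10000, Inf),
  labels = c("1,000 or fewer", "1,001-10,000", "more than 10,000")
)

# Logistic regression on simulated data. Without family = "binomial",
# glm() defaults to a gaussian family, i.e. an ordinary linear regression.
set.seed(1)
age <- runif(200, 0, 100)
good <- rbinom(200, 1, plogis(2 - 0.05 * age))
fit <- glm(good ~ age, family = "binomial")

# type = "response" returns predicted probabilities rather than log-odds,
# which plays the same role as type.predict = "response" in broom's augment().
prob <- predict(fit, type = "response")
```

Leaving out the family argument does not raise an error, which is why the mistake is easy to miss: the model still fits, but its predictions are no longer constrained to lie between 0 and 1.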