Government Spending on Kids
Notable topics: Data Manipulation, Functions, Embracing, Reading in Many .csv Files, Pairwise Correlation
Recorded on: 2020-09-14
Timestamps by: Eric Fletcher
Screencast
Timestamps
Using geom_line and summarize to visualize education spending over time: first for all states, then for individual states, then for small groups of states using %in%, then for random groups of size n using %in% and sample with unique. fct_reorder is used to reorder state factor levels by sorting along the inf_adj variable.
geom_vline used to add a reference line for the 2009 financial crisis.
Take the previous chart and set inf_adj_perchild for the first year, 1997, to 100%, so that each state's line shows change over time relative to 1997 rather than an absolute value. geom_hline used to add a reference line at the 100% starting point. David ends up changing the starting point from 100% to 0%.
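A minimal sketch of the indexing step described above, assuming a long table with state, year, and inf_adj_perchild columns. The toy values and the change column name are hypothetical, not taken from the screencast.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the ones mentioned in the notes
spending <- tibble(
  state = rep(c("A", "B"), each = 3),
  year = rep(1997:1999, times = 2),
  inf_adj_perchild = c(2, 3, 4, 5, 5, 10)
)

changed <- spending %>%
  group_by(state) %>%
  arrange(year, .by_group = TRUE) %>%
  # Divide by each state's first-year (1997) value, then subtract 1
  # so the series starts at 0% instead of 100%
  mutate(change = inf_adj_perchild / first(inf_adj_perchild) - 1) %>%
  ungroup()
```

Dropping the `- 1` gives the 100% baseline version of the same series.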
fct_reorder with max used to reorder the plots in descending order based on highest peak values.
David briefly mentions the small multiples approach to analyzing data.
Create a function named plot_changed_faceted to make it easier to visualize the many other variables included in the dataset.
Create a function named plot_faceted with a {{ y_axis }} embracing argument. Adding this function creates two stages: one for data transformation and another for plotting.
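A rough sketch of what an embracing plot function could look like. The function body, the toy data, and the change column are assumptions for illustration; the screencast's actual plot_faceted may differ.

```r
library(dplyr)
library(ggplot2)

# Plotting stage: {{ y_axis }} forwards an unquoted column name into aes()
plot_faceted <- function(tbl, y_axis) {
  tbl %>%
    ggplot(aes(year, {{ y_axis }})) +
    geom_line() +
    facet_wrap(~ state)
}

# Hypothetical toy data
toy <- tibble(
  state = rep(c("A", "B"), each = 3),
  year = rep(1997:1999, times = 2),
  inf_adj_perchild = c(2, 3, 4, 5, 5, 10)
)

# Transformation stage first, then the plotting stage
p <- toy %>%
  group_by(state) %>%
  mutate(change = inf_adj_perchild / first(inf_adj_perchild) - 1) %>%
  ungroup() %>%
  plot_faceted(change)
```

Keeping the transformation outside the function means the same plot_faceted call can be reused with any column, for example plot_faceted(toy, inf_adj_perchild).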
Use the dir function with pattern and the purrr package's map_df function to read in many different .csv files with GDP values for each state.
Troubleshooting the Can't combine <character> and <double> columns error using function and mutate with across and as.numeric.
Extract state name from filename using extract from tidyr and a regular expression.
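The three steps above (listing files, coercing a column before binding, and pulling the state out of the filename) can be sketched together as follows. The folder, the file names, the gdp column, and the read_gdp helper are all hypothetical; the files are written to a temporary directory only so the example is self-contained.

```r
library(dplyr)
library(purrr)
library(readr)
library(tidyr)

# Hypothetical files: one stores gdp as text, the other as numbers
folder <- file.path(tempdir(), "gdp_example")
dir.create(folder, showWarnings = FALSE)
write_csv(tibble(year = 1997:1998, gdp = c("100", "(NA)")),
          file.path(folder, "gdp_Alabama.csv"))
write_csv(tibble(year = 1997:1998, gdp = c(200, 210)),
          file.path(folder, "gdp_Alaska.csv"))

# dir() with pattern lists only the .csv files
files <- dir(folder, pattern = "\\.csv$", full.names = TRUE)

# Coerce gdp to numeric per file so the row-bind does not fail with
# "Can't combine <character> and <double>"; non-numeric text becomes NA
read_gdp <- function(path) {
  read_csv(path, col_types = cols()) %>%
    mutate(across(gdp, ~ suppressWarnings(as.numeric(.x))),
           file = basename(path))
}

gdp <- map_df(files, read_gdp) %>%
  # Capture the state name between "gdp_" and ".csv"
  tidyr::extract(file, "state", "gdp_(.*)\\.csv")
```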
Unsuccessful attempt at importing state population data from a census.gov dataset that is not user-friendly, by skipping the first 3 rows of the Excel file.
Use geom_col to see which states spend the most for each child for a single variable and multiple variables using %in%.
Use scale_fill_discrete with guide_legend(reverse = TRUE) to change the ordering of the legend.
Use geom_col and pairwise_cor to visualize the pairwise correlation between variables across states in 2016.
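A small sketch of the pairwise correlation step, assuming the function in question is pairwise_cor from the widyr package and that the data is in long form with one row per state and variable. The toy values below are made up for illustration.

```r
library(dplyr)
library(widyr)

# Hypothetical long-format data: one value per state and spending variable
toy <- tibble(
  state = rep(c("A", "B", "C", "D"), times = 3),
  variable = rep(c("PK12ed", "highered", "pubhealth"), each = 4),
  inf_adj_perchild = c(1, 2, 3, 4, 2, 4, 6, 8, 4, 3, 2, 1)
)

# Correlate variables with each other, using states as the observations
cors <- toy %>%
  pairwise_cor(variable, state, inf_adj_perchild, sort = TRUE)
```

The result has item1, item2, and correlation columns, which can then be filtered to one item1 and passed to geom_col.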
Use geom_point to plot inf_adjust_perchild_PK12ed versus inf_adj_perchild_highered. geom_text used to apply state names to each point.
Summary of screencast.