Horror Movie Profits
Notable topics: Graphing for EDA (Exploratory Data Analysis)
Recorded on: 2018-10-22
Timestamps by: Alex Cookson
Screencast
Timestamps
Using parse_date_time function from lubridate package to convert date formatted as character to date class (should have used mdy function though)
Using fct_lump function to aggregate distributors into top 6 (by number of movies) and an "Other" category
Investigating strange numbers in the data and discovering duplication
Using problems function to look at parsing errors when importing data
Using arrange and distinct functions, along with the .keep_all argument of distinct, to de-duplicate observations
Using geom_boxplot function to create a boxplot of budget by distributor
Using floor function to bin release years into decades (e.g., "1970" and "1973" both become "1970")
Using summarise_at function to apply the same function to multiple variables at the same time
Using geom_line to visualize multiple metrics at the same time
Using facet_wrap function to graph small multiples of genre-budget boxplots by distributor
Starting analysis of profit ratio of movies
Using paste0 function in a custom function to show labels as multiples (e.g., "4X" or "6X" to mean "4 times" or "6 times")
Starting analysis of the most common genres over time
Starting analysis of the most profitable individual horror movies
Using paste0 function to add release date of movie to labels in a bar graph
Using geom_text function, along with its check_overlap argument, to add labels to some points on a scatterplot
Using ggplotly function from plotly package to create an interactive scatterplot
Reviewing unexplored areas of investigation
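A minimal R sketch of several of the data-cleaning steps listed above (mdy, distinct with .keep_all, fct_lump, floor for decades, summarise_at, and a paste0 label helper). It is an illustration under stated assumptions, not the code from the screencast: the toy data, the column names, and the helper name multiple_label are all hypothetical, and fct_lump keeps only the top 1 distributor here because the toy data is so small.

```r
library(dplyr)
library(forcats)
library(lubridate)

# Hypothetical toy data: names and values are invented for illustration.
# Movie "A" appears twice to mimic the duplication found in the real data.
movies <- tibble(
  movie = c("A", "A", "B", "C", "D", "E"),
  release_date = c("6/22/1973", "6/22/1973", "10/31/1978",
                   "2/9/1984", "7/4/1999", "10/13/2007"),
  distributor = c("Studio X", "Studio X", "Studio X",
                  "Studio Y", "Studio Z", "Studio X"),
  production_budget = c(10, 10, 3, 20, 6, 15),
  worldwide_gross = c(40, 40, 70, 25, 240, 190)
)

cleaned <- movies %>%
  # mdy() parses month/day/year strings into Date objects
  mutate(release_date = mdy(release_date)) %>%
  # Keep the first row per movie and release date, retaining all other columns
  distinct(movie, release_date, .keep_all = TRUE) %>%
  mutate(
    # Keep the most common distributor; lump the rest into "Other"
    distributor = fct_lump(distributor, n = 1),
    # Bin years into decades: 1973 and 1978 both become 1970
    decade = 10 * floor(year(release_date) / 10),
    profit_ratio = worldwide_gross / production_budget
  )

# Apply the same function (median) to several columns at once, per decade
by_decade <- cleaned %>%
  group_by(decade) %>%
  summarise_at(vars(production_budget, worldwide_gross, profit_ratio), median)

# Hypothetical label helper: turns 4 into "4X", for use as axis labels
multiple_label <- function(x) {
  paste0(x, "X")
}

multiple_label(c(4, 6))
```

A helper like multiple_label can be passed as the labels argument of a ggplot2 scale (for example scale_y_continuous(labels = multiple_label)) so that a profit-ratio axis reads "4X" rather than "4".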