Bob Ross Paintings

Network graphs, Principal Component Analysis (PCA)

Published: August 11, 2019

Notable topics: Network graphs, Principal Component Analysis (PCA)

Recorded on: 2019-08-11

Timestamps by: Alex Cookson

Timestamps

clean_names
janitor

Using clean_names function in janitor package to get field names to snake_case
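A minimal sketch of this step, assuming a small made-up data frame whose column names are upper case (the column names and values here are illustrative, not the screencast's data):

```r
library(janitor)

# Hypothetical wide data with upper-case column names
paintings_raw <- data.frame(
  EPISODE = c("S01E01", "S01E02"),
  TITLE = c("A WALK IN THE WOODS", "MT. MCKINLEY"),
  APPLE_FRAME = c(0, 0),
  check.names = FALSE
)

# clean_names() rewrites every column name into snake_case
paintings <- clean_names(paintings_raw)
names(paintings)
```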

gather

Using gather function to get wide elements into tall (tidy) format
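As a rough illustration, assuming a toy data frame with one 0/1 column per element (the names `element` and `present` are placeholders chosen here), the wide-to-tall step might look like this:

```r
library(tidyr)

# Hypothetical wide data: one 0/1 column per painting element
paintings <- data.frame(
  episode = c("S01E01", "S01E02"),
  title = c("A Walk in the Woods", "Mt. McKinley"),
  mountain = c(0, 1),
  tree = c(1, 1)
)

# Gather every column except episode and title into element/present pairs
paintings_tall <- gather(paintings, element, present, -episode, -title)
paintings_tall
```

Each painting now contributes one row per element, which is the shape that counting and correlation steps typically expect.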

str_to_title
str_replace

Cleaning text (str_to_title, str_replace) to get into nicer-to-read format
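A small sketch of this kind of cleanup, assuming the element names are snake_case strings (the three values below are made up for illustration):

```r
library(stringr)

# Hypothetical snake_case element names
elements <- c("apple_frame", "mountain", "palm_trees")

# Replace the first underscore with a space, then capitalise each word
cleaned <- str_to_title(str_replace(elements, "_", " "))
cleaned
```

Note that `str_replace` only touches the first match in each string; `str_replace_all` would be the choice for names with more than one underscore.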

str_remove_all

Using str_remove_all function to trim quotation marks and backslashes
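One way this could look, assuming titles that arrive wrapped in quotation marks, sometimes with escaping backslashes (the sample strings are invented for the example):

```r
library(stringr)

# Hypothetical raw titles: one wrapped in quotes, one with escaped quotes
raw_titles <- c('"A WALK IN THE WOODS"', '\\"MT. MCKINLEY\\"')

# The pattern matches either a double quote or a literal backslash
clean_titles <- str_remove_all(raw_titles, '"|\\\\')
clean_titles
```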

extract

Using extract function to extract the season number and episode number from episode field; uses regex capturing groups
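A hedged sketch of the capturing-group idea, assuming episode codes of the form `S01E01` (the regex and the output column names are assumptions made for this example):

```r
library(tidyr)

# Hypothetical episode codes
episodes <- data.frame(episode = c("S01E01", "S02E13", "S31E10"))

# Two capturing groups: the digits after "S" and the digits after "E".
# convert = TRUE turns the captured strings into integers;
# remove = FALSE keeps the original episode column.
episodes_split <- extract(
  episodes, episode, c("season", "episode_number"),
  regex = "S(\\d+)E(\\d+)", convert = TRUE, remove = FALSE
)
episodes_split
```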

add_count

Using add_count function's name argument to specify field's name
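For illustration, assuming a toy table of paintings by season (the column name `number_episodes` is a placeholder picked here), the `name` argument replaces the default `n` column:

```r
library(dplyr)

# Hypothetical paintings, three in season 1 and two in season 2
paintings <- data.frame(
  season = c(1, 1, 1, 2, 2),
  title = c("A", "B", "C", "D", "E")
)

# Adds a per-season count without collapsing rows; name sets the new column's name
with_counts <- add_count(paintings, season, name = "number_episodes")
with_counts
```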

Getting into whether the elements of Ross's paintings changed over time (e.g., are mountains more/less common over time?)

broom

Quick point: could have used logistic regression to see change over time of elements
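The screencast only mentions this option in passing, so the following is a sketch of how such a model might be set up, not what was actually run. It assumes made-up per-season counts of paintings that contain a mountain:

```r
library(broom)

# Hypothetical per-season counts: paintings containing a mountain vs. total
by_season <- data.frame(
  season = 1:6,
  n_mountain = c(3, 4, 6, 7, 9, 10),
  n_paintings = rep(13, 6)
)

# Binomial GLM on successes/failures; the season coefficient is the change
# in log-odds of a mountain appearing per additional season
model <- glm(
  cbind(n_mountain, n_paintings - n_mountain) ~ season,
  data = by_season, family = "binomial"
)

coefficients_tidy <- tidy(model)
coefficients_tidy
```

A positive `season` estimate would suggest the element became more common over time, and the `p.value` column gives a rough sense of whether the trend is distinguishable from noise.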

widyr

Asking, "What elements tend to appear together?" prompting clustering analysis

pairwise_cor
widyr

Using pairwise_cor to see which elements tend to appear together
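A minimal sketch of the call, assuming tall data with one row per element that appears in a painting (the titles and elements below are invented). When no value column is supplied, `pairwise_cor` treats presence as 1 and absence as 0:

```r
library(widyr)

# Hypothetical tall data: one row per (painting, element) that is present
present <- tibble::tibble(
  title = c("A", "A", "B", "B", "C", "D"),
  element = c("beach", "palm_trees", "beach", "palm_trees", "mountain", "mountain")
)

# Correlate elements (items) across paintings (features), strongest pairs first
correlations <- pairwise_cor(present, element, title, sort = TRUE)
correlations
```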

Discussion of a blind spot of pairwise correlation (high or perfect correlation on elements that only appear once or twice)
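The blind spot is easy to reproduce in base R. In this made-up example, two elements each appear in only one of ten paintings, and it happens to be the same painting:

```r
# Ten hypothetical paintings; both elements appear only in painting 1
boat <- c(1, rep(0, 9))
lighthouse <- c(1, rep(0, 9))

# A single shared painting is enough to produce a perfect correlation
rare_cor <- cor(boat, lighthouse)
rare_cor
```

A common guard, offered here as a suggestion, is to drop elements that appear in fewer than some minimum number of paintings before computing correlations.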

Asking, "What are clusters of elements that belong together?"

ggraph
igraph

Creating network plot using ggraph and igraph packages
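A sketch of a network plot along these lines, assuming an edge list of strongly correlated element pairs (the edges and correlation values are invented, and the layout and geoms are one reasonable choice among several):

```r
library(igraph)
library(ggraph)  # attaches ggplot2 as well

# Hypothetical edge list, e.g. the strongest pairs from a correlation table
edges <- data.frame(
  item1 = c("beach", "beach", "mountain", "cabin"),
  item2 = c("palm_trees", "waves", "snow", "barn"),
  correlation = c(0.8, 0.7, 0.6, 0.5)
)

# The first two columns become the edge endpoints; the rest are edge attributes
graph <- graph_from_data_frame(edges)

set.seed(2019)  # the force-directed layout starts from random positions
network_plot <- ggraph(graph, layout = "fr") +
  geom_edge_link(aes(alpha = correlation)) +
  geom_node_point() +
  geom_node_text(aes(label = name), vjust = 1, hjust = 1) +
  theme_void()
```

Printing `network_plot` draws the graph; connected groups of nodes correspond to elements that tend to appear in the same paintings.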

Reviewing network plot for interesting clusters (e.g., beach cluster, mountain cluster, structure cluster)

Explanation of Principal Component Analysis (PCA)

Start of actual PCA coding

acast
reshape2

Using acast function to create matrix of painting titles x painting elements (initially wrong, corrected at 36:30)
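A hedged sketch of building such a matrix, assuming tall data with a `value` column of 1s for elements that are present (toy data; `fill = 0` is an assumption about how absent pairs should be treated):

```r
library(reshape2)

# Hypothetical tall data: one row per (painting, element) that is present
present <- data.frame(
  title = c("A", "A", "B", "C"),
  element = c("beach", "waves", "mountain", "beach"),
  value = 1
)

# Rows are painting titles, columns are elements; absent pairs become 0
binary_matrix <- acast(present, title ~ element, value.var = "value", fill = 0)
binary_matrix
```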

t
colSums
colMeans

Centering the matrix data using t function (transpose of matrix), colSums function, and colMeans function
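One way to centre columns with these functions, shown on an invented 4 x 3 matrix. This is a sketch of the general idiom and may differ from the exact steps in the screencast:

```r
# Hypothetical paintings-by-elements matrix of 0/1 values
binary_matrix <- matrix(
  c(1, 0, 1,
    1, 1, 0,
    0, 1, 0,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("A", "B", "C", "D"), c("beach", "mountain", "waves"))
)

# R recycles a vector down columns, so subtract the column means from the
# transposed matrix (elements in rows), then transpose back
centered <- t(t(binary_matrix) - colMeans(binary_matrix))

# After centering, every column should sum to (numerically) zero
colSums(centered)
```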

svd

Using svd function to perform singular value decomposition, then tidying with broom package
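A rough sketch of this step on simulated data. It assumes broom's tidier for `svd()` output accepts `matrix = "v"` to return the loadings in long form; the matrix itself is random and purely illustrative:

```r
library(broom)

# Hypothetical centred paintings-by-elements matrix (simulated 0/1 data)
set.seed(1)
m <- matrix(
  rbinom(40, 1, 0.5), nrow = 10,
  dimnames = list(paste0("painting_", 1:10), c("beach", "mountain", "tree", "waves"))
)
centered <- t(t(m) - colMeans(m))

# svd() returns a list with d (singular values), u and v
decomposition <- svd(centered)

# matrix = "v" requests the loadings: one row per (original column, component)
loadings <- tidy(decomposition, matrix = "v")
loadings
```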

Exploring one principal component to get a better feel for what PCA is doing

reorder_within
tidytext

Using reorder_within function to re-order factors within a grouping
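A small sketch of the idea, assuming a long table of loadings per principal component (values invented). `reorder_within` orders a factor separately inside each group, and `scale_y_reordered` strips the suffix it adds to the labels:

```r
library(ggplot2)
library(tidytext)

# Hypothetical loadings for two components
loadings <- data.frame(
  PC = rep(c("PC1", "PC2"), each = 3),
  element = rep(c("beach", "mountain", "tree"), times = 2),
  value = c(0.6, -0.5, 0.1, 0.2, 0.3, -0.7)
)

# Order elements by value separately within each component
loadings$element_ordered <- reorder_within(loadings$element, loadings$value, loadings$PC)

loadings_plot <- ggplot(loadings, aes(value, element_ordered)) +
  geom_col() +
  scale_y_reordered() +
  facet_wrap(~ PC, scales = "free_y")
```

Without this, a single global factor order would be shared by all facets, so bars would appear sorted in at most one panel.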

Exploring different matrix names in PCA (u, v, d)
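To make the three pieces concrete, here is a base R sketch on a random 5 x 4 matrix (not the painting data) showing their shapes and that they multiply back to the original matrix:

```r
set.seed(1)
m <- matrix(rnorm(20), nrow = 5)
s <- svd(m)

dim(s$u)     # one row per observation (here, per painting)
dim(s$v)     # one row per original column (here, per element)
length(s$d)  # singular values, in decreasing order

# u %*% diag(d) %*% t(v) reconstructs the original matrix
reconstructed <- s$u %*% diag(s$d) %*% t(s$v)
all.equal(m, reconstructed)
```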

Looking at top 6 principal components of painting elements

Showing percentage of variation that each principal component is responsible for
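The share of variation per component can be derived from the singular values of the centred matrix: each component's share is its squared singular value divided by the sum of all squared singular values. A base R sketch on random data (not the painting data):

```r
set.seed(1)
m <- matrix(rnorm(40), nrow = 10)

# Centre each column, then take the singular values
centered <- t(t(m) - colMeans(m))
d <- svd(centered)$d

# Proportion of total variance explained by each component
percent <- d^2 / sum(d^2)
percent
```

These proportions sum to 1 and should agree with the ones implied by `prcomp(m)$sdev`.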