GDPR Violations
Notable topics: Data manipulation, Interactive dashboard with shinymetrics and tidymetrics
Recorded on: 2020-04-20
Timestamps by: Eric Fletcher
Screencast
Timestamps
Use the mdy function from the lubridate package to change the date variable from character class to date class.
Use the rename function from the dplyr package to rename variables in the dataset.
Use the fct_reorder function from the forcats package to sort the geom_col in descending order.
Use the fct_lump function from the forcats package within count to lump together country names except for the 6 most frequent.
Use the scale_x_continuous function from ggplot2 with the scales package to change the x-axis values to dollar format.
Use the month and floor_date functions from the lubridate package to get the month component from the date variable to count the total fines per month.
Use the na_if function from the dplyr package to convert a specific date value to NA.
Use the fct_reorder function from the forcats package to sort the stacked geom_col and legend labels in descending order.
Use the dollar function from the scales package to convert the price variable into dollar format.
Use the str_trunc function from the stringr package to shorten the summary string values to 140 characters.
Use the separate_rows function from the tidyr package with a regular expression to separate the values in the article_violated variable with each matching group placed in its own row.
Use the extract function from the tidyr package with a regular expression to turn each matching group into a new column.
Use the geom_jitter function from the ggplot2 package to add points to the horizontal box plot.
Use the inner_join function from the dplyr package to join together article_titles and separated_articles tables.
Use the paste0 function from base R to concatenate article and article_title.
Use the str_detect function from the stringr package to detect the presence of a pattern in a string.
Use the group_by and summarize functions from the dplyr package to aggregate fines that were issued to the same country on the same day, so that the aggregated total can be mapped to size in a geom_point plot.
Use the scale_size_continuous function from the ggplot2 package to remove the size legend.
Create an interactive dashboard using the shinymetrics and tidymetrics packages, which offer a tidy approach to business intelligence.
Use the cross_by_dimensions and cross_by_periods functions from the tidymetrics package. cross_by_dimensions stacks an extra copy of the table for each dimension specified as an argument (country, article_title, type), replaces the value of that column with the word All, and groups by all the columns. cross_by_periods prepares the dated rows so they can be aggregated by period. Together they act as an extended group_by that allows complete summaries across each individual dimension and every possible combination.
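The date-handling steps above (mdy, na_if, floor_date) can be sketched on a small made-up table. The column names, the sample values, and the choice of 1970-01-01 as the placeholder date to blank out are assumptions for illustration, not taken from the screencast's data.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample; the real dataset's columns and values may differ.
fines <- tibble(
  country = c("Spain", "Spain", "Germany", "France"),
  date    = c("05/13/2019", "05/20/2019", "01/01/1970", "10/25/2018"),
  price   = c(5000, 12000, 80000, 400000)
)

fines_clean <- fines %>%
  mutate(
    date = mdy(date),                          # character -> Date
    date = na_if(date, as.Date("1970-01-01"))  # assumed placeholder date -> NA
  )

# Round each date down to the first of its month, then total the fines.
# Dropping `wt = price` would count the number of fines instead.
monthly <- fines_clean %>%
  filter(!is.na(date)) %>%
  count(month = floor_date(date, "month"), wt = price, name = "total_fines")
```

Here floor_date keeps the year together with the month, whereas month() alone would merge the same calendar month across different years.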
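The factor and formatting steps above (fct_lump, fct_reorder, dollar, str_trunc) might look roughly like this. The data is invented, and the example keeps the 2 most frequent countries rather than 6 only so that the lumping is visible on a tiny table.

```r
library(dplyr)
library(forcats)
library(scales)
library(stringr)

# Hypothetical sample data.
fines <- tibble(
  country = c("Spain", "Spain", "Spain", "Germany", "Germany", "France", "Italy"),
  price   = c(1000, 2000, 3000, 50000, 10000, 400000, 7000)
)

by_country <- fines %>%
  # Keep the 2 most frequent countries; everything else becomes "Other".
  count(country = fct_lump(country, 2), wt = price, name = "total") %>%
  mutate(
    country = fct_reorder(country, total),  # order levels by total for plotting
    label   = dollar(total)                 # numeric -> dollar-formatted string
  )

# Shorten a long summary string to at most 140 characters (ellipsis included).
long_summary  <- strrep("a", 200)
short_summary <- str_trunc(long_summary, 140)
```

Note that fct_lump ranks countries by how often they appear, not by the size of their fines, so a country with one very large fine can still end up in "Other".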
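For the article steps above (separate_rows, extract, inner_join, paste0), a minimal sketch follows. The "|" separator, the "Art. N" pattern, and the label format are assumptions about how the article_violated strings are written; the three titles are the GDPR headings for those articles.

```r
library(dplyr)
library(tidyr)

# Hypothetical sample; the real strings may use a different layout.
fines <- tibble(
  id = 1:2,
  article_violated = c("Art. 5 GDPR|Art. 32 GDPR", "Art. 6 (1) GDPR")
)

separated_articles <- fines %>%
  # One row per article; sep is a regular expression, so escape the pipe.
  separate_rows(article_violated, sep = "\\|") %>%
  # Pull the article number out of each string into its own integer column.
  tidyr::extract(article_violated, "article", "Art\\. ?(\\d+)",
                 convert = TRUE, remove = FALSE)

article_titles <- tibble(
  article = c(5L, 6L, 32L),
  article_title = c(
    "Principles relating to processing of personal data",
    "Lawfulness of processing",
    "Security of processing"
  )
)

joined <- separated_articles %>%
  inner_join(article_titles, by = "article") %>%
  mutate(article_label = paste0(article, ". ", article_title))
```

Because inner_join drops unmatched rows, any article number missing from article_titles would silently disappear from the result.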
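To make the "stack an extra copy with All" idea concrete, here is a hand-rolled dplyr illustration. It is not the tidymetrics implementation: stack_all is a hypothetical helper, the data is invented, and the period handling of cross_by_periods is not covered.

```r
library(dplyr)

# Hypothetical sample data.
fines <- tibble(
  country = c("Spain", "Spain", "Germany"),
  type    = c("A", "B", "A"),
  price   = c(100, 200, 300)
)

# Hypothetical helper: append a copy of the table with `col` set to "All".
stack_all <- function(tbl, col) {
  tbl_all <- tbl
  tbl_all[[col]] <- "All"
  bind_rows(tbl, tbl_all)
}

crossed <- fines %>%
  stack_all("country") %>%
  stack_all("type") %>%
  group_by(country, type) %>%
  summarize(total = sum(price), n_fines = n(), .groups = "drop")
```

Each stacked dimension doubles the row count, so the summary holds every country and type combination plus the per-country, per-type, and overall totals in one table, which is the shape a dashboard filter with an "All" option needs.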