library(tidyverse)
library(gganimate)
library(ggbeeswarm)
library(scales)
theme_set(theme_minimal())
# colors for the Democratic and Republican parties
<- "#00AEF3"
dem <- "#E81B23" rep
AE 18: Visualizing increased polarization of baby names
In this application exercise we will use animation and the {gganimate} package to visualize the increasing polarization of baby names in the United States.
Import data
Our data comes from the Social Security Administration which publishes detailed annual data on all babies born in the United States. We have prepared it by matching state-level births to the results of the 2024 U.S. presidential election so we can distinguish “red states” (those won by the Republican candidate Donald Trump) from “blue states” (those won by the Democratic candidate Kamala Harris). The data is stored in data/partisan-names.csv
.
<- read_csv(file = "data/partisan-names.csv") |>
partisan_names # convert year to a factor column for visualizations
mutate(year = factor(year))
Rows: 200 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): outcome, sex, name
dbl (4): year, Trump, Harris, part_diff
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
partisan_names
# A tibble: 200 × 7
outcome sex year name Trump Harris part_diff
<chr> <chr> <fct> <chr> <dbl> <dbl> <dbl>
1 Trump M 1983 Kendrick 0.942 0.0580 0.884
2 Trump M 1983 Trey 0.932 0.0682 0.864
3 Trump M 1983 Rodrick 0.927 0.0732 0.854
4 Trump F 1983 Ashlea 0.917 0.0833 0.833
5 Trump F 1983 Tosha 0.890 0.110 0.780
6 Trump F 1983 Misti 0.886 0.114 0.773
7 Trump F 1983 Latoria 0.883 0.117 0.767
8 Trump M 1983 Jackie 0.876 0.124 0.753
9 Trump M 1983 Demarcus 0.870 0.130 0.740
10 Trump F 1983 Angelia 0.869 0.131 0.737
# ℹ 190 more rows
We have aggregated the data to show the percentage of babies with a given name born in states that voted for Trump and Harris. The data contains the following columns:
outcome
- winner of the state-level vote in the 2024 U.S. presidential electionsex
- sex assigned at birth to the babyyear
- year of birthname
- name of the babyTrump
- percentage of the babies with the given name born in states that voted for Donald Trump in 2024Harris
- percentage of the babies with the given name born in states that voted for Kamala Harris in 2024part_diff
- difference between the percentage of babies with the given name born in Trump states and Harris states. Positive values indicate the name is more common in Trump states, whereas negative values indicate the name is more common in Harris states.
We have filtered the data to focus on the years 1983, 1993, 2003, 2013, and 2023 (the most recent year of births for which data is available), and also limited the data to the 20 most-highly partisan names for red states and blue states (i.e. each year contains 40 rows - the top 20 most “Republican” names and the top 20 most “Democratic” names).
Create a static plot
Before we attempt to animate a chart, let’s first create a single static visualization that communicates the change in polarization over time.
Your turn: Implement a jittered plot that shows the distribution of partisanship for names in red states and blue states for each year. Use the geom_quasirandom()
function from the {ggbeeswarm} package to create the plot.1
# add code here
Define plot components for the animated chart
Before we attempt to animate the chart, let’s first define the components of the chart that will remain static throughout the animation.
Your turn: Modify your plot above to have a single category on the \(y\)-axis. This is what will change throughout the animation.
When we animate the chart, we will define each frame based on the year
variable. When testing the static portions of the chart, you can facet the graph on the year
variable to see how the chart will look for each year. When we animate it, each facet panel will essentially become the frames of the animation.
# add code here
Implement basic animation
Now that we have the static components of the chart, we can animate it. The {gganimate} package provides a simple way to animate a plot by defining the transition_*()
function.
Your turn: Implement the animation using the appropriate transition_*()
function.
# add code here
Your turn: We need to know what years the points are transitioning between. Add this using an appropriate label to the plot.
Look at the documentation for your transition_*()
function. It should provide information on how to add labels to the animation based on the transitioning states.
# add code here
Your turn: Adjust the animation to make it smoother. Consider adjusting appropriate parameters in the transition_*()
function as well as the easing used for interpolation via ease_aes()
.
# add code here
Add shadows
To make the animation easier to interpret, it’s helpful to add reference marks during the transition to show where the points are moving from and to. This can be done using the shadow_*()
functions.
Your turn: Implement a shadow_*()
function to show the transition more smoothly.
# add code here
Rendering
We can control the rendering process using the animate()
function.
Your turn: Improve the animated chart through it’s rendering. Some suggestions include:
- Increase the number of frames to make the animation smoother.
- Increase the length of the animation to give the reader more time to interpret each frame.
- Add a pause at the start and end of the animation to give the reader time to interpret the chart.
# add code here
Your turn: Implement the same settings using a Quarto code chunk option. Adjust the aspect ratio of the plot to make it more compact vertically.
gganimate
code chunk option
Quarto can pass arbitrary R expressions through code chunk options using the syntax !expr
. For example, if we wanted to render the animation with 300 frames at 15 frames per second, we could use the following code chunk option:
#| gganimate: !expr list(nframes = 300, fps = 15)
# add code here
Acknowledgments
- Application exercise inspired by Philip Bump’s How to Read This Chart March 29th newsletter
Footnotes
Unlike
geom_jitter()
, this will ensure there are no overlapping points in the plot. You can also usegeom_beeswarm()
but this introduces unnecessary curvatures into the jittering.↩︎