Annotating charts

Lecture 10

Dr. Benjamin Soltoff

Cornell University
INFO 3312/5312 - Spring 2026

February 24, 2026

Announcements

  • Homework 04
  • Project proposal grades/feedback

2026 Winter Olympics figure skating ⛸️ drama!

The gold medal win in Ice Dance 🧊💃 by the French 🇫🇷🥇 team of Laurence Fournier Beaudry & Guillaume Cizeron over the American 🇺🇸🥈 team of Madison Chock & Evan Bates has become controversial.

The free dance was scored by 9 judges

The French judge gave Fournier Beaudry & Cizeron 🇫🇷 a 137.45 but only gave Chock & Bates 🇺🇸 a 129.74

All other judges were relatively close in their scores of the two teams 🤔

Figure skating uses a “trimmed mean” to determine final scores, meaning the top and bottom judges’ scores are discarded from the averages. Still, an outlier score can have an impact, since it affects which other scores are counted. I have yet to see a thorough mathematical analysis of whether or not the French judge lowballing Chock & Bates actually prevented them from winning gold. Can anyone help out here?
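To see why a discarded outlier can still matter, here is a minimal sketch of a trimmed mean. The scores are hypothetical (not the real Olympic marks), `trimmed_mean()` is an illustrative helper, and the sketch treats each judge as giving one overall score, which simplifies the actual scoring system.

```r
# Simplified trimmed mean: drop the single highest and single lowest score,
# then average the rest.
trimmed_mean <- function(scores) {
  sorted <- sort(scores)
  mean(sorted[-c(1, length(sorted))])
}

# Hypothetical scores from 9 judges (NOT the real Olympic marks)
fair_panel <- c(9.00, 9.25, 9.25, 9.50, 9.50, 9.50, 9.75, 9.75, 10.00)

# Same panel, but the judge who would have given 9.50 gives 8.00 instead
lowball_panel <- fair_panel
lowball_panel[5] <- 8.00

trimmed_mean(fair_panel)    # 9.5
trimmed_mean(lowball_panel) # about 9.43
```

The 8.00 is itself discarded, but the 9.00 that used to be discarded as the lowest score is now counted in place of a 9.50, so the average drops by 0.5 / 7.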

Also, judge No. 4 was just in a bad mood overall 😆

Learning objectives

  • Identify the importance of details in charts
  • Introduce methods for annotating charts
  • Practice designing annotations

Axes

Axis breaks

How can the following figure be improved with custom breaks in axes, if at all?

Context matters

pac_plot +
  scale_x_continuous(breaks = seq(from = 2000, to = 2024, by = 2))

Conciseness matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2024, 4))

Precision matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2024, 4)) +
  labs(x = "Election year")
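Since the data behind `pac_plot` is not shown here, the following is a self-contained sketch of the same idea using a hypothetical stand-in data frame (`toy_elections` and its values are invented for illustration). It also assumes you might want the in-between years kept as minor gridlines, which is one option rather than a recommendation from the slides.

```r
library(ggplot2)

# Hypothetical stand-in data; the data behind `pac_plot` is not shown here
toy_elections <- data.frame(
  year = seq(2000, 2024, by = 2),
  value = c(10, 12, 11, 14, 13, 16, 15, 18, 17, 20, 19, 22, 21)
)

# Label every fourth year from 2000 to 2024
election_breaks <- seq(from = 2000, to = 2024, by = 4)

toy_plot <- ggplot(toy_elections, aes(x = year, y = value)) +
  geom_line() +
  scale_x_continuous(
    breaks = election_breaks,
    # Unlabeled minor gridlines at every observed year
    minor_breaks = toy_elections$year
  ) +
  labs(x = "Election year", y = "Hypothetical value")

toy_plot
```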

Fretting the little things

Little details matter

Obsession with tiny details

Human-focused design

“This is what customers pay us for – to sweat all these details so it’s easy and pleasant for them to use our computers.”

Graph details: Redundant coding

Graph details: Consistent ordering

Annotating plots

How can plots be annotated to enhance their clarity and interpretability?

  • Text
  • Arrows/lines
  • Rectangles
  • Colors/fills

Text in plots

Including text on a plot

Label actual data points

geom_text(), geom_label(), geom_text_repel(), etc.

Label actual data points

library(tidyverse) # provides filter(), mutate(), and ggplot()
library(gapminder)

gapminder_europe <- gapminder |>
  filter(
    year == 2007,
    continent == "Europe"
  )

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_text(aes(label = country))

Label actual data points

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label(aes(label = country))

Solution 1: Repel labels

library(ggrepel)

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_text_repel(aes(label = country))

Solution 1: Repel labels

library(ggrepel)

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label_repel(aes(label = country))

Solution 2a: Don’t use so many labels

gapminder_europe <- gapminder_europe |>
  mutate(
    should_be_labeled = country %in% c(
      "Albania",
      "Norway",
      "Hungary"
    )
  )

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label_repel(
    data = filter(
      gapminder_europe,
      should_be_labeled == TRUE
    ),
    aes(label = country)
  )
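A possible variation, sketched under the assumption that you would rather not add a helper column: a layer's `data` argument also accepts a function, which is called with the plot's data frame. The name `countries_to_label` is introduced here for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)
library(gapminder)

gapminder_europe <- gapminder |>
  filter(year == 2007, continent == "Europe")

# Illustrative name: the countries we want to call out
countries_to_label <- c("Albania", "Norway", "Hungary")

labeled_plot <- ggplot(gapminder_europe, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  geom_label_repel(
    # A function passed to `data` receives the plot's data frame
    data = \(d) filter(d, country %in% countries_to_label),
    aes(label = country)
  )

labeled_plot
```

This keeps the subsetting next to the layer that uses it, at the cost of being slightly less explicit than a named column.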

Solution 2b: Use other aesthetics too

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point(aes(color = should_be_labeled)) +
  geom_label_repel(
    data = filter(
      gapminder_europe,
      should_be_labeled == TRUE
    ),
    aes(
      label = country,
      fill = should_be_labeled
    ),
    color = "white"
  ) +
  scale_color_manual(values = c(
    "grey50",
    "red"
  )) +
  scale_fill_manual(values = c("red")) +
  guides(color = "none", fill = "none")
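One way to make the manual scales less fragile, sketched here as a suggestion rather than the approach used above: name the color values so they are matched to `TRUE`/`FALSE` by name instead of by position. The vector name `highlight_colors` is illustrative.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)
library(gapminder)

gapminder_europe <- gapminder |>
  filter(year == 2007, continent == "Europe") |>
  mutate(should_be_labeled = country %in% c("Albania", "Norway", "Hungary"))

# Names correspond to the two values of the logical column
highlight_colors <- c("FALSE" = "grey50", "TRUE" = "red")

highlight_plot <- ggplot(gapminder_europe, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(color = should_be_labeled)) +
  geom_label_repel(
    data = filter(gapminder_europe, should_be_labeled),
    aes(label = country, fill = should_be_labeled),
    color = "white"
  ) +
  # The same named vector serves both scales
  scale_color_manual(values = highlight_colors) +
  scale_fill_manual(values = highlight_colors) +
  guides(color = "none", fill = "none")

highlight_plot
```

With named values, the fill scale still picks red even though the label layer only contains `TRUE` rows.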

(Highlight non-text things too!)

# Color just Oceania
gapminder_highlighted <- gapminder |>
  mutate(
    is_oceania = continent == "Oceania"
  )

ggplot(
  gapminder_highlighted,
  aes(
    x = year, y = lifeExp,
    group = country,
    color = is_oceania,
    # Line thickness is `linewidth` (not `size`) since ggplot2 3.4.0
    linewidth = is_oceania
  )
) +
  geom_line() +
  scale_color_manual(values = c(
    "grey70",
    "red"
  )) +
  scale_linewidth_manual(values = c(0.1, 0.5)) +
  guides(color = "none", linewidth = "none") +
  theme_minimal()

Including text on a plot

Label actual data points

geom_text(), geom_label(), geom_text_repel(), etc.

Add arbitrary annotations

annotate()

Adding arbitrary annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "text",
    x = 40000, y = 76,
    label = "Some text!"
  )

Adding arbitrary annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "label",
    x = 40000, y = 76,
    label = "Some text!"
  )

Any geom works

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  # This is evil though!!!
  # We just invented a point
  annotate(
    geom = "point",
    x = 40000, y = 76
  )

Any geom works

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "rect",
    xmin = 30000, xmax = 50000,
    ymin = 78, ymax = 82,
    fill = "red", alpha = 0.2
  )

Use multiple annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "rect",
    xmin = 30000, xmax = 50000,
    ymin = 78, ymax = 82,
    fill = "red", alpha = 0.2
  ) +
  annotate(
    geom = "label",
    x = 40000, y = 76.5,
    label = "Rich and long-living"
  ) +
  annotate(
    geom = "segment",
    x = 40000, xend = 40000,
    y = 76.8, yend = 77.8,
    arrow = arrow(
      length = unit(0.1, "in")
    )
  )
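Annotations can also be anchored to a real observation rather than hard-coded coordinates. The sketch below looks up the European country with the highest GDP per capita and points a curved arrow at it. The offsets (8000, 4, and so on) and the object names are arbitrary choices made for illustration and would need tuning for a real figure.

```r
library(dplyr)
library(ggplot2)
library(gapminder)

gapminder_europe <- gapminder |>
  filter(year == 2007, continent == "Europe")

# Look up the point to call out instead of hard-coding its coordinates
richest <- gapminder_europe |>
  slice_max(gdpPercap, n = 1, with_ties = FALSE)

callout_plot <- ggplot(gapminder_europe, aes(x = gdpPercap, y = lifeExp)) +
  geom_point() +
  annotate(
    geom = "text",
    # Place the text below and to the left of the point (arbitrary offsets)
    x = richest$gdpPercap - 8000, y = richest$lifeExp - 4,
    label = paste("Highest GDP per capita:", richest$country)
  ) +
  annotate(
    geom = "curve",
    # Start just above the text and end just below the point
    x = richest$gdpPercap - 8000, xend = richest$gdpPercap,
    y = richest$lifeExp - 3.7, yend = richest$lifeExp - 0.3,
    curvature = 0.3,
    arrow = arrow(length = unit(0.1, "in"))
  )

callout_plot
```

Unlike the invented point shown earlier, this annotation describes an observation that is actually in the data.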

World development indicators

Rows: 6,020
Columns: 9
$ iso2c         <chr> "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF"…
$ iso3c         <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
$ country       <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "…
$ year          <dbl> 2001, 1998, 2009, 2000, 2012, 1996, 1999, 2002, 2003, 2004, 2005, 2006, 2007…
$ population    <dbl> 19688632, 18493132, 27385307, 19542982, 30466479, 17106595, 19262847, 210002…
$ co2_emissions <dbl> 0.05529272, 0.07126970, 0.23950690, 0.05516661, 0.33506104, 0.08226652, 0.05…
$ gdp_per_cap   <dbl> NA, NA, 490.2728, NA, 570.6761, NA, NA, 344.2242, 347.4152, 338.7394, 363.54…
$ region        <chr> "South Asia", "South Asia", "South Asia", "South Asia", "South Asia", "South…
$ income        <chr> "Low income", "Low income", "Low income", "Low income", "Low income", "Low i…

Clean and reshape data

Basic plot

Application exercise

ae-09

Brainstorm methods to improve the readability and interpretability of the chart through annotations

Potential aspects to emphasize

  • What is a “good” rank? What is a “bad” rank?

    Note

    • 1 is lowest carbon emissions per capita
    • 170 is the highest carbon emissions per capita
  • What are the countries that have significantly improved or worsened their rank?

  • What other aspects do you feel should be emphasized?
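As a reminder of how the rank direction works, here is a minimal sketch using a toy data frame. The countries and emissions values are hypothetical, and the column names `rank` and `change` are assumptions, not necessarily those used in the exercise. It ranks countries within each year so that 1 is the lowest emitter, then computes each country's change in rank.

```r
library(dplyr)

# Hypothetical emissions values for illustration only
toy_wdi <- tibble(
  country = c("A", "B", "C", "A", "B", "C"),
  year = c(2000, 2000, 2000, 2020, 2020, 2020),
  co2_emissions = c(0.5, 4.0, 15.0, 3.0, 2.0, 12.0)
)

# Rank within each year: 1 = lowest emissions per capita
toy_ranks <- toy_wdi |>
  group_by(year) |>
  mutate(rank = min_rank(co2_emissions)) |>
  ungroup()

# Negative change = moved toward rank 1 (improved); positive = worsened
rank_change <- toy_ranks |>
  group_by(country) |>
  summarize(change = rank[year == 2020] - rank[year == 2000]) |>
  arrange(country)

rank_change
```

Countries with the largest absolute `change` are natural candidates for text labels or highlight colors.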

Methods for annotation

  • Text labels
  • Arrows/lines
  • Rectangles
  • Colors/fills

Wrap up

Recap

  • Visual storytelling requires a combination of data, visualization, and annotation
  • Attention to detail is key to ensure that the message is clear
  • Annotation is a powerful method for enhancing clarity and interpretability of plots
  • {ggplot2} has powerful annotation tools
  • Alternatively, export to a vector format and use a vector graphics editor (e.g., Illustrator, Inkscape)

Acknowledgements