Annotating charts

Lecture 10

Dr. Benjamin Soltoff

Cornell University
INFO 3312/5312 - Spring 2025

February 25, 2025

Announcements

  • Homework 03
  • Project proposal grades/feedback
  • Project 01 presentations
    • Clarifications
    • Presentation times

Visualization critique

  • What is the story?
  • How does the visual design impact interpretability?

Axes

Axis breaks

How can the following figure be improved with custom breaks in axes, if at all?

Context matters

pac_plot +
  scale_x_continuous(breaks = seq(from = 2000, to = 2020, by = 2))

Conciseness matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2020, 4))

Precision matters

pac_plot +
  scale_x_continuous(breaks = seq(2000, 2020, 4)) +
  labs(x = "Election year")
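To see what the two break specifications above actually produce, the `seq()` calls can be evaluated on their own. This is only a sketch: it assumes the x-axis of `pac_plot` spans 2000 to 2020, as the calls imply, and the names `breaks_every_2` and `breaks_every_4` are introduced here for illustration.

```r
# Breaks every two years vs. every four years over the same range
breaks_every_2 <- seq(from = 2000, to = 2020, by = 2)
breaks_every_4 <- seq(from = 2000, to = 2020, by = 4)

length(breaks_every_2) # 11 tick labels competing for space
length(breaks_every_4) # 6 tick labels

breaks_every_4
# 2000 2004 2008 2012 2016 2020
```

Fewer breaks leave more room for each label. Whether every second or every fourth year is the right choice depends on the data behind the plot.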

Fretting the little things

Little details matter

Obsession with tiny details

Human-focused design

“This is what customers pay us for – to sweat all these details so it’s easy and pleasant for them to use our computers.”

Graph details: Redundant coding

Graph details: Consistent ordering

Annotating plots

How can plots be annotated to enhance their clarity and interpretability?

  • Text
  • Arrows/lines
  • Rectangles
  • Colors/fills

Text in plots

Including text on a plot

Label actual data points

geom_text(), geom_label(), geom_text_repel(), etc.

Label actual data points

library(tidyverse) # provides filter(), mutate(), and ggplot()
library(gapminder)

gapminder_europe <- gapminder |>
  filter(
    year == 2007,
    continent == "Europe"
  )

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_text(aes(label = country))

Label actual data points

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label(aes(label = country))

Solution 1: Repel labels

library(ggrepel)

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_text_repel(aes(label = country))

Solution 1: Repel labels

library(ggrepel)

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label_repel(aes(label = country))

Solution 2a: Don’t use so many labels

gapminder_europe <- gapminder_europe |>
  mutate(
    should_be_labeled = country %in% c(
      "Albania",
      "Norway",
      "Hungary"
    )
  )

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  geom_label_repel(
    data = filter(
      gapminder_europe,
      should_be_labeled
    ),
    aes(label = country)
  )

Solution 2b: Use other aesthetics too

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point(aes(color = should_be_labeled)) +
  geom_label_repel(
    data = filter(
      gapminder_europe,
      should_be_labeled
    ),
    aes(
      label = country,
      fill = should_be_labeled
    ),
    color = "white"
  ) +
  scale_color_manual(values = c(
    "grey50",
    "red"
  )) +
  scale_fill_manual(values = c("red")) +
  guides(color = "none", fill = "none")

(Highlight non-text things too!)

# Color just Oceania
gapminder_highlighted <- gapminder |>
  mutate(
    is_oceania = continent == "Oceania"
  )

ggplot(
  gapminder_highlighted,
  aes(
    x = year, y = lifeExp,
    group = country,
    color = is_oceania,
    linewidth = is_oceania
  )
) +
  geom_line() +
  scale_color_manual(values = c(
    "grey70",
    "red"
  )) +
  # linewidth replaces size for lines as of ggplot2 3.4.0
  scale_linewidth_manual(values = c(0.1, 0.5)) +
  guides(color = "none", linewidth = "none") +
  theme_minimal()

Including text on a plot

Label actual data points

geom_text(), geom_label(), geom_text_repel(), etc.

Add arbitrary annotations

annotate()

Adding arbitrary annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "text",
    x = 40000, y = 76,
    label = "Some text!"
  )

Adding arbitrary annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "label",
    x = 40000, y = 76,
    label = "Some text!"
  )
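One reason to reach for annotate() rather than geom_text() with a fixed label: a geom layer draws once per row of the data, so a constant label is stacked on top of itself. The sketch below uses a small made-up data frame (`toy`, not from the lecture) and counts the rows each text layer would draw.

```r
library(ggplot2)

# Toy data, made up purely for illustration
toy <- data.frame(x = c(1, 2, 3, 4), y = c(2, 4, 3, 5))

# geom_text() with a constant label: one copy of the label per data row
p_geom <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  geom_text(x = 2.5, y = 4.5, label = "Some text!")

# annotate(): a single label, independent of the plot's data
p_annotate <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  annotate(geom = "text", x = 2.5, y = 4.5, label = "Some text!")

# Layer 2 is the text layer in both plots
n_drawn_geom <- nrow(layer_data(p_geom, 2))
n_drawn_annotate <- nrow(layer_data(p_annotate, 2))
```

Here `n_drawn_geom` equals the number of rows in `toy`, while `n_drawn_annotate` is 1. Overplotted copies of the same label tend to look heavier and more jagged, which is easy to miss on screen.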

Any geom works

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  # This is evil though!!!
  # We just invented a point
  annotate(
    geom = "point",
    x = 40000, y = 76
  )

Any geom works

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "rect",
    xmin = 30000, xmax = 50000,
    ymin = 78, ymax = 82,
    fill = "red", alpha = 0.2
  )

Use multiple annotations

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  annotate(
    geom = "rect",
    xmin = 30000, xmax = 50000,
    ymin = 78, ymax = 82,
    fill = "red", alpha = 0.2
  ) +
  annotate(
    geom = "label",
    x = 40000, y = 76.5,
    label = "Rich and long-living"
  ) +
  annotate(
    geom = "segment",
    x = 40000, xend = 40000,
    y = 76.8, yend = 77.8,
    arrow = arrow(
      length = unit(0.1, "in")
    )
  )

World development indicators

Rows: 6,020
Columns: 9
$ iso2c         <chr> "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF"…
$ iso3c         <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
$ country       <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "…
$ year          <dbl> 2001, 1998, 2009, 2000, 2012, 1996, 1999, 2002, 2003, 2004, 2005, 2006, 2007…
$ population    <dbl> 19688632, 18493132, 27385307, 19542982, 30466479, 17106595, 19262847, 210002…
$ co2_emissions <dbl> 0.05529272, 0.07126970, 0.23950690, 0.05516661, 0.33506104, 0.08226652, 0.05…
$ gdp_per_cap   <dbl> NA, NA, 490.2728, NA, 570.6761, NA, NA, 344.2242, 347.4152, 338.7394, 363.54…
$ region        <chr> "South Asia", "South Asia", "South Asia", "South Asia", "South Asia", "South…
$ income        <chr> "Low income", "Low income", "Low income", "Low income", "Low income", "Low i…

Clean and reshape data

# A tibble: 170 × 9
   iso3c country            region income rank_1995 rank_2020 rank_diff big_change better_big_change
   <chr> <chr>              <chr>  <chr>      <dbl>     <dbl>     <dbl> <lgl>      <chr>            
 1 ZWE   Zimbabwe           Sub-S… Lower…        75        39       -36 TRUE       Rank improved    
 2 DNK   Denmark            Europ… High …       160       127       -33 TRUE       Rank improved    
 3 SWE   Sweden             Europ… High …       132       100       -32 TRUE       Rank improved    
 4 SYR   Syrian Arab Repub… Middl… Low i…        96        64       -32 TRUE       Rank improved    
 5 MLT   Malta              Middl… High …       128        99       -29 FALSE      Rank changed a l…
 6 EST   Estonia            Europ… High …       161       133       -28 FALSE      Rank changed a l…
 7 UKR   Ukraine            Europ… Lower…       139       111       -28 FALSE      Rank changed a l…
 8 YEM   Yemen, Rep.        Middl… Low i…        53        25       -28 FALSE      Rank changed a l…
 9 VEN   Venezuela, RB      Latin… Not c…       119        92       -27 FALSE      Rank changed a l…
10 SWZ   Eswatini           Sub-S… Lower…        79        53       -26 FALSE      Rank changed a l…
# ℹ 160 more rows
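The slides show only the result of the cleaning step, not the code. The following is a rough sketch of how a table with this shape could be derived. It is not the lecture's actual code. `wdi_toy` is a made-up stand-in for the real data, and `big_change_cutoff` is a hypothetical threshold, since the real cutoff is not stated.

```r
library(dplyr)
library(tidyr)

# Made-up emissions for four hypothetical countries
wdi_toy <- tribble(
  ~iso3c, ~year, ~co2_emissions,
  "AAA",  1995,  1.0,
  "AAA",  2020,  4.0,
  "BBB",  1995,  2.0,
  "BBB",  2020,  1.0,
  "CCC",  1995,  3.0,
  "CCC",  2020,  2.0,
  "DDD",  1995,  4.0,
  "DDD",  2020,  3.0
)

# Hypothetical threshold for a "big" change in rank
big_change_cutoff <- 2

rank_changes <- wdi_toy |>
  filter(year %in% c(1995, 2020)) |>
  # Rank within each year: 1 = lowest emissions per capita
  group_by(year) |>
  mutate(emissions_rank = min_rank(co2_emissions)) |>
  ungroup() |>
  select(iso3c, year, emissions_rank) |>
  # One row per country, one rank column per year
  pivot_wider(
    names_from = year,
    values_from = emissions_rank,
    names_prefix = "rank_"
  ) |>
  mutate(
    rank_diff = rank_2020 - rank_1995,
    big_change = abs(rank_diff) >= big_change_cutoff
  ) |>
  arrange(rank_diff)
```

A negative `rank_diff` means the country moved toward the low-emissions end of the ranking, matching the sign convention in the table above.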

Basic plot

Application exercise

ae-09: Improving the chart through annotation

Instructions

Brainstorm methods to improve the readability and interpretability of the chart through annotations

Points to emphasize

  • What is a “good” rank? What is a “bad” rank?
    • Note: rank 1 is the lowest carbon emissions per capita
    • Rank 170 is the highest carbon emissions per capita
  • Which countries have significantly improved or worsened their rank?
  • What other aspects do you feel should be emphasized?

Methods for annotation

  • Text labels
  • Arrows/lines
  • Rectangles
  • Colors/fills
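As one possible starting point, the sketch below combines several of these methods on a slope-style chart. Everything in it is an assumption: `ranks_toy` is made-up data, and the chart used in the exercise may look quite different.

```r
library(ggplot2)

# Made-up ranks for three hypothetical countries (long format)
ranks_toy <- data.frame(
  country = rep(c("A", "B", "C"), each = 2),
  year = rep(c(1995, 2020), times = 3),
  rank = c(10, 40, 30, 28, 50, 15),
  big_change = rep(c(TRUE, FALSE, TRUE), each = 2)
)

annotated_plot <- ggplot(
  ranks_toy,
  aes(x = year, y = rank, group = country)
) +
  # Colors: highlight only the countries with a big change
  geom_line(aes(color = big_change)) +
  # Text labels: name the highlighted countries at the 2020 end
  geom_text(
    data = subset(ranks_toy, big_change & year == 2020),
    aes(label = country),
    hjust = 0, nudge_x = 0.5
  ) +
  # Rectangle: shade the low-emissions end of the rank scale
  annotate(
    geom = "rect",
    xmin = -Inf, xmax = Inf,
    ymin = -Inf, ymax = 20,
    fill = "blue", alpha = 0.1
  ) +
  # Text: say which direction is "good"
  annotate(
    geom = "text",
    x = 2007.5, y = 5,
    label = "Lower rank = lower emissions per capita"
  ) +
  scale_color_manual(values = c("FALSE" = "grey70", "TRUE" = "red")) +
  guides(color = "none")
```

An arrow could be added in the same way as in the earlier "Use multiple annotations" example, with `geom = "segment"` and an `arrow` argument.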

Wrap up

Recap

  • Visual storytelling requires a combination of data, visualization, and annotation
  • Attention to detail is key to ensure that the message is clear
  • Annotation is a powerful method for enhancing clarity and interpretability of plots
  • {ggplot2} has powerful annotation tools
  • Alternatively, export to vector format and use a vector graphics editor (e.g. Illustrator, Inkscape)

Acknowledgements