Presentation ready plots

Lecture 11

Dr. Benjamin Soltoff

Cornell University
INFO 3312/5312 - Spring 2024

February 29, 2024

Announcements

Project 01 due tomorrow
- Presentation
  - Teaching team (me)
  - Peers
- Write-up
- Reproducibility, style, and organization
Homework 04 deferred to next week

Visualization critique

U.S. Median House Prices vs. Income

What is the story?
How does the visualization utilize annotations? How effective is it?

Telling a story

Multiple ways of telling a story

Sequential plots: motivation first, then resolution
A single plot: the resolution, with the motivation embedded in it

Project note: you’re asked to create two plots per question. One possible approach: Start with a plot showing the raw data, and show derived quantities (e.g. percent increases, averages, coefficients of fitted models) in the subsequent plot.
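One way the raw-then-derived approach could look, sketched with the `gapminder` data used later in this lecture. The three countries and the names `life_exp`, `life_exp_change`, `p_raw`, and `p_derived` are arbitrary choices for illustration, not part of the project requirements.

```r
library(tidyverse)
library(gapminder)

# raw data: life expectancy over time for a few countries (arbitrary selection)
life_exp <- gapminder |>
  filter(country %in% c("Rwanda", "Japan", "Brazil"))

p_raw <- ggplot(data = life_exp, mapping = aes(x = year, y = lifeExp, color = country)) +
  geom_line() +
  labs(y = "Life expectancy (years)")

# derived quantity: percent change relative to each country's first observed year
life_exp_change <- life_exp |>
  group_by(country) |>
  arrange(year, .by_group = TRUE) |>
  mutate(pct_change = (lifeExp - first(lifeExp)) / first(lifeExp) * 100) |>
  ungroup()

p_derived <- ggplot(data = life_exp_change, mapping = aes(x = year, y = pct_change, color = country)) +
  geom_line() +
  labs(y = "Change in life expectancy since first year (%)")
```

Showing `p_raw` first gives the audience the scale of the data; `p_derived` then makes the comparison across countries explicit.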

Simplicity vs. complexity

When you try to show too much data at once, you may end up showing nothing.

Never assume your audience can rapidly process complex visual displays
Don’t add variables to your plot that are tangential to your story
Don’t jump straight to a highly complex figure; first show an easily digestible subset (e.g., show one facet first)
Aim for memorable, but clear
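A minimal sketch of the "one facet first" idea, assuming the `gapminder` data; the choice of Asia and the object names are illustrative only.

```r
library(tidyverse)
library(gapminder)

# step 1: a single, easily digestible panel (one continent)
asia <- filter(.data = gapminder, continent == "Asia")

p_one <- ggplot(data = asia, mapping = aes(x = year, y = lifeExp, group = country)) +
  geom_line(alpha = 0.4) +
  labs(title = "Asia")

# step 2: the full faceted figure, shown once the audience knows how to read one panel
p_all <- ggplot(data = gapminder, mapping = aes(x = year, y = lifeExp, group = country)) +
  geom_line(alpha = 0.4) +
  facet_wrap(facets = vars(continent))
```

Placing `p_one` on one slide and `p_all` on the next lets you explain the axes and encoding once before the audience has to compare panels.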

Project note: Make sure to leave time to iterate on your plots after you practice your presentation. If certain plots are getting too wordy to explain, take time to simplify them!

Consistency vs. repetitiveness

Be consistent but don’t be repetitive.

Use consistent features throughout plots (e.g., same color represents same level on all plots)
Aim to use a different type of visualization for each distinct analysis
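One possible way to keep colors consistent is to define a single named palette and reuse it in every plot. The sketch below assumes the `gapminder` data; the hex values and the name `continent_palette` are arbitrary choices for illustration.

```r
library(tidyverse)
library(gapminder)

gapminder_07 <- filter(.data = gapminder, year == 2007)

# one named palette, defined once and reused (hex values are illustrative)
continent_palette <- c(
  Africa = "#1b9e77", Americas = "#d95f02", Asia = "#7570b3",
  Europe = "#e7298a", Oceania = "#66a61e"
)

# a different plot type for each analysis, but the same color per continent
p_box_color <- ggplot(data = gapminder_07, mapping = aes(x = continent, y = lifeExp, fill = continent)) +
  geom_boxplot(show.legend = FALSE) +
  scale_fill_manual(values = continent_palette)

p_scatter_color <- ggplot(data = gapminder_07, mapping = aes(x = gdpPercap, y = lifeExp, color = continent)) +
  geom_point() +
  scale_color_manual(values = continent_palette)
```

Because the palette is keyed by level name, each continent keeps its color even if a later plot shows only a subset of continents.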

Project note: If possible, ask a friend who is not in the class to listen to your presentation and then ask them what they remember. Then, ask yourself: is that what you wanted them to remember?

Designing effective visualizations

Keep it simple

Judging relative area

Use color to draw attention

Clarify the story

Leave out non-story details

Order matters

Clearly indicate missing data

Reduce cognitive load

Use descriptive titles

Annotate figures

Untangle a messy line chart

Online restaurant reservations

# A tibble: 3,420 × 5
   type    name          abbrev date       pct_change
   <chr>   <chr>         <chr>  <date>          <dbl>
 1 country United States US     2020-04-01      -1.00
 2 country United States US     2020-04-02      -1.00
 3 country United States US     2020-04-03      -1.00
 4 country United States US     2020-04-04      -1.00
 5 country United States US     2020-04-05      -1.00
 6 country United States US     2020-04-06      -1   
 7 country United States US     2020-04-07      -1.00
 8 country United States US     2020-04-08      -1.00
 9 country United States US     2020-04-09      -1   
10 country United States US     2020-04-10      -1.00
# ℹ 3,410 more rows

All the trends

Highlight specific areas

Small multiples

Incorporate geography

Tell a different story

Project workflow overview

Demo

proj-01

Rendering individual documents
Write-up
Presentation
Website: https://pages.github.coecis.cornell.edu/info3312-sp24/proj-01-YOUR_TEAM_NAME/
- Rendering site
- Making sure your website reflects your latest changes
- Customizing the look of your website

Plot layout

Sample plots

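library(tidyverse) # filter(), ggplot()
library(gapminder) # gapminder data
library(ggrepel)   # geom_text_repel()
library(patchwork) # combining plots on later slides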

gapminder_07 <- filter(.data = gapminder, year == 2007)

p_hist <- ggplot(data = gapminder_07, mapping = aes(x = lifeExp)) +
  geom_histogram(binwidth = 2)
p_box <- ggplot(data = gapminder_07, mapping = aes(x = continent, y = lifeExp)) +
  geom_boxplot()
p_scatter <- ggplot(data = gapminder_07, mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_point()
p_text <- gapminder_07 |>
  filter(continent == "Americas") |>
  ggplot(mapping = aes(x = gdpPercap, y = lifeExp)) +
  geom_text_repel(mapping = aes(label = country)) +
  coord_cartesian(clip = "off")

Slide with single plot, little text

The plot will fill the empty space in the slide.

p_hist

Slide with single plot, lots of text

If there is more text on the slide
The plot will shrink
To make room for the text

p_hist

Small `fig-width`

For a zoomed-in look

```{r}
#| fig-width: 3
#| fig-asp: 0.618

p_hist
```

Large `fig-width`

For a zoomed-out look

```{r}
#| fig-width: 10
#| fig-asp: 0.618

p_hist
```

`fig-width` affects text size

Multiple plots on a slide

First, ask yourself, must you include multiple plots on a slide? For example, is your narrative about comparing results from two plots?

If no, then don’t! Move the second plot to the next slide!
If yes:
- Insert columns using the Insert anything tool
- Use the `layout-ncol` chunk option
- Use the patchwork package
- Possibly, use pivoting to reshape your data and then use facets
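The pivoting option in the list above can be sketched as follows, assuming the `gapminder_07` data frame from the sample plots. The column names `measure` and `value` and the object names `gapminder_long` and `p_facet` are illustrative.

```r
library(tidyverse)
library(gapminder)

gapminder_07 <- filter(.data = gapminder, year == 2007)

# reshape two numeric columns into one long column
gapminder_long <- gapminder_07 |>
  pivot_longer(
    cols = c(lifeExp, gdpPercap),
    names_to = "measure",
    values_to = "value"
  )

# one histogram per measure, each with its own x-axis scale
p_facet <- ggplot(data = gapminder_long, mapping = aes(x = value)) +
  geom_histogram(bins = 20) +
  facet_wrap(facets = vars(measure), scales = "free_x")

p_facet
```

Compared with `layout-ncol` or patchwork, this produces a single plot object, so the panels share one theme and one set of labels.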

Columns

Insert > Slide Columns

Quarto will automatically resize your plots to fit side-by-side.

`layout-ncol`

```{r}
#| fig-width: 5
#| fig-asp: 0.618
#| layout-ncol: 2
#| out-width: 100%

p_hist
p_scatter
```

patchwork

```{r}
#| fig-width: 7
#| fig-asp: 0.4

p_hist + p_scatter
```

patchwork layout I

(p_hist + p_box) /
  (p_scatter + p_text)

patchwork layout II

p_text / (p_hist + p_box + p_scatter)

patchwork layout III

p_text + p_hist + p_box + p_scatter +
  plot_annotation(title = "Gapminder", tag_levels = c("A"))

patchwork layout IV

p_text +
  {
    p_hist + {
      p_box + p_scatter + plot_layout(ncol = 1) + plot_layout(tag_level = "new")
    }
  } +
  plot_layout(ncol = 1) +
  plot_annotation(tag_levels = c("1", "a"), tag_prefix = "Fig ")

More patchwork

Learn more at https://patchwork.data-imaginist.com.

Wrap up

Use data to effectively tell a story
Use the right plot(s) for your story
Ensure plots are clearly legible and interpretable to the audience

Code

library(tidyverse)
library(rvest)
library(tvthemes)

# get episode ratings for season 1
ratings_page <- read_html(x = "https://www.imdb.com/title/tt9018736/episodes/?ref_=tt_eps_sm")

# extract elements
ratings_raw <- tibble(
  episode = html_elements(x = ratings_page, css = ".bblZrR .ipc-title__text") |>
    html_text2(),
  rating = html_elements(x = ratings_page, css = ".ratingGroup--imdb-rating") |>
    html_text2()
)

# clean data
ratings <- ratings_raw |>
  # separate episode number and title
  separate_wider_delim(
    cols = episode,
    delim = " ∙ ",
    names = c("episode_number", "episode_title")
  ) |>
  separate_wider_delim(
    cols = episode_number,
    delim = ".",
    names = c("season", "episode_number")
  ) |>
  # separate rating and number of votes
  separate_wider_delim(
    cols = rating,
    delim = " ",
    names = c("rating", "votes")
  ) |>
  # convert numeric variables
  mutate(
    across(
      .cols = -episode_title,
      .fns = parse_number
    ),
    votes = votes * 1e03
  )

# draw the plot
ratings |>
  # generate x-axis tick mark labels with title and episode number
  mutate(
    episode_title = str_glue("{episode_title}\n(S{season}E{episode_number})"),
    episode_title = fct_reorder(.f = episode_title, .x = episode_number)
  ) |>
  # draw a lollipop chart
  ggplot(mapping = aes(x = episode_title, y = rating)) +
  geom_point(mapping = aes(size = votes)) +
  geom_segment(
    mapping = aes(
      x = episode_title, xend = episode_title,
      y = 0, yend = rating
    )
  ) +
  # adjust the size scale
  scale_size(range = c(3, 8)) +
  # label the chart
  labs(
    title = "Live-action Avatar: The Last Airbender is decent",
    x = NULL,
    y = "IMDB rating",
    caption = "Source: IMDB"
  ) +
  # use an Avatar theme
  theme_avatar(
    # custom font
    title.font = "Slayer",
    text.font = "Slayer",
    legend.font = "Slayer",
    # shrink legend text size
    legend.title.size = 8,
    legend.text.size = 6
  ) +
  theme(
    # remove undesired grid lines
    panel.grid.major.x = element_blank(),
    panel.grid.minor.y = element_blank(),
    # move legend to the top
    legend.position = "top",
    # align title flush with the edge
    plot.title.position = "plot",
    # shrink x-axis text labels to fit
    axis.text.x = element_text(size = rel(x = 0.7))
  )

It was decent
