Deep dive: coordinates + facets

Lecture 6

Dr. Benjamin Soltoff

Cornell University
INFO 3312/5312 - Spring 2024

February 9, 2024

Announcements

  • Homework 1 collected
  • Team projects begin tomorrow – look for team assignments in Canvas

Visualization critique

Where did the school buses go?

  • What is the story?
  • How does the design account for the time gaps?

Agenda

Agenda for today

  • Coordinate systems

  • Facets

  • Themes (a little bit)

Coordinate systems

Coordinate systems: purpose

  • Combine the two position aesthetics (x and y) to produce a 2d position on the plot:
    • linear coordinate system: horizontal and vertical coordinates
    • polar coordinate system: angle and radius
    • maps: latitude and longitude
  • Draw axes and panel backgrounds in coordination with the coordinate systems

Coordinate systems: types

  1. Linear coordinate systems: preserve the shape of geoms
    • coord_cartesian(): the default Cartesian coordinate system, where the 2d position of an element is given by the combination of the x and y positions
    • coord_flip(): Cartesian coordinate system with the x and y axes flipped (rarely needed now that geoms accept aesthetic mappings on either axis)
    • coord_fixed(): Cartesian coordinate system with a fixed aspect ratio (useful only in limited circumstances)
  2. Non-linear coordinate systems: can change the shape of geoms. A straight line may no longer be straight, and the shortest path between two points may no longer be a straight line.
    • coord_trans(): apply arbitrary transformations to x and y positions, after the data has been processed by the stat
    • coord_polar(): polar coordinates
    • coord_map() / coord_quickmap() / coord_sf(): map projections

Setting limits: what the plots say

base_plot <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(alpha = 0.25) +
  geom_smooth()
base_plot

Setting limits: what the plots say

Identify the differences between each plot. Focus on the range of the x and y axes as well as the contents of the plots.

base_plot +
  labs(title = "Plot 1")

base_plot +
  scale_x_continuous(limits = c(190, 220)) +
  scale_y_continuous(limits = c(4000, 5000)) +
  labs(title = "Plot 2")

base_plot +
  xlim(190, 220) +
  ylim(4000, 5000) +
  labs(title = "Plot 3")

base_plot +
  coord_cartesian(xlim = c(190, 220),
                  ylim = c(4000, 5000)) +
  labs(title = "Plot 4")

Setting limits: what the warnings say

base_plot +
  labs(title = "Plot 1")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).

base_plot +
  scale_x_continuous(limits = c(190, 220)) +
  scale_y_continuous(limits = c(4000, 5000)) +
  labs(title = "Plot 2")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 235 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 235 rows containing missing values (`geom_point()`).

base_plot +
  xlim(190, 220) +
  ylim(4000, 5000) +
  labs(title = "Plot 3")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 235 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 235 rows containing missing values (`geom_point()`).

base_plot +
  coord_cartesian(xlim = c(190, 220),
                  ylim = c(4000, 5000)) +
  labs(title = "Plot 4")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).

Setting limits

  • Setting scale limits: any data outside the limits is discarded before statistics such as geom_smooth() are computed
    • scale_*_continuous(limits = ...)
    • xlim() and ylim()
  • Setting coordinate system limits: use all the data, but display only a small region of the plot (zooming in)
    • coord_cartesian(xlim = ..., ylim = ...)
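
The difference can be checked directly from the built plot data. What follows is a minimal sketch, assuming the ggplot2 and palmerpenguins packages are installed. To keep the comparison simple it swaps the loess smoother for a linear fit and drops the color grouping. The names `inside`, `p_scale`, and `p_zoom` are illustrative and do not come from the slides.

```r
library(ggplot2)
library(palmerpenguins)

x_limits <- c(190, 220)
y_limits <- c(4000, 5000)

# TRUE for rows that scale limits would keep: not missing and inside both ranges
inside <- with(
  penguins,
  !is.na(flipper_length_mm) & !is.na(body_mass_g) &
    flipper_length_mm >= x_limits[1] & flipper_length_mm <= x_limits[2] &
    body_mass_g >= y_limits[1] & body_mass_g <= y_limits[2]
)
n_dropped <- sum(!inside)  # rows a scale limit would discard

base <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale limits: out-of-range rows are removed before the line is fitted
p_scale <- base +
  scale_x_continuous(limits = x_limits) +
  scale_y_continuous(limits = y_limits)

# Coordinate limits: the line is fitted to all rows, then the view is cropped
p_zoom <- base + coord_cartesian(xlim = x_limits, ylim = y_limits)

# Layer 2 is the smoother; compare the x range each fit was computed over
fit_scale <- layer_data(p_scale, 2)
fit_zoom <- layer_data(p_zoom, 2)
range(fit_scale$x)
range(fit_zoom$x)
```

If ggplot2 applies the same rule as `inside`, `n_dropped` should correspond to the 235 rows reported in the warnings for Plots 2 and 3. The fitted line in `p_zoom` extends beyond the visible window because the fit never lost any data.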

Cropping scatterplot

Fixing aspect ratio with coord_fixed()

Useful when having an aspect ratio of 1 makes sense, e.g. scores on two tests (reading and writing) on the same scale (0 to 100 points)
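
A small sketch of that situation, assuming ggplot2 is installed. The `scores` data frame is simulated purely for illustration and is not course data.

```r
library(ggplot2)

# Hypothetical reading and writing scores on a 0 to 100 scale (simulated)
set.seed(3312)
scores <- data.frame(reading = round(runif(50, min = 40, max = 100)))
scores$writing <- pmin(100, pmax(0, round(scores$reading + rnorm(50, sd = 8))))

p_scores <- ggplot(scores, aes(x = reading, y = writing)) +
  geom_point() +
  # Reference line where both scores are equal
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  # One unit on x is drawn the same length as one unit on y
  coord_fixed(ratio = 1, xlim = c(0, 100), ylim = c(0, 100))
p_scores
```

With the ratio fixed at 1, the dashed line sits at 45 degrees however the plot window is resized, so points above it always read as "writing higher than reading".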

Transformations

ggplot(penguins, aes(x = bill_depth_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm")

ggplot(
  penguins,
  aes(
    x = log10(bill_depth_mm),
    y = log10(body_mass_g)
  )
) +
  geom_point() +
  geom_smooth(method = "lm")

ggplot(penguins, aes(x = bill_depth_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm") +
  scale_x_log10() +
  scale_y_log10()

ggplot(penguins, aes(x = bill_depth_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm") +
  coord_trans(x = "log10", y = "log10")
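
One way to see why the last two plots differ is to inspect the values the smoother produced. This is a hedged sketch, assuming ggplot2 and palmerpenguins are installed. `penguins_complete`, `fit_scale`, and `fit_coord` are illustrative names, and rows with missing values are dropped up front only to avoid warnings.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing values so the fits run without warnings
penguins_complete <- subset(
  penguins,
  !is.na(bill_depth_mm) & !is.na(body_mass_g)
)

base <- ggplot(penguins_complete, aes(x = bill_depth_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Scale transformation: data is logged first, then the line is fitted
p_scale <- base + scale_x_log10() + scale_y_log10()

# Coordinate transformation: the line is fitted first, then the plane is warped
p_coord <- base + coord_trans(x = "log10", y = "log10")

# Layer 2 is the smoother in both plots
fit_scale <- layer_data(p_scale, 2)
fit_coord <- layer_data(p_coord, 2)

range(fit_scale$y)  # fitted values in log10 units
range(fit_coord$y)  # fitted values in grams
```

The scale version fits a straight line to logged data, so the line stays straight on the plot. The coordinate version fits a straight line in the original units and then bends it when the axes are transformed.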

Polar coordinate systems and pie charts

Pie charts and bullseye charts with coord_polar()

ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  labs(title = "Stacked bar chart")

ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  coord_polar(theta = "y") +
  labs(title = "Pie chart")

ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  coord_polar(theta = "x") +
  labs(title = "Bullseye chart")

Aside: about pie charts…

Pie charts

What do you know about pie charts and data visualization best practices? Love ’em or lose ’em?

Pie charts: when to love ’em, when to lose ’em

❤️ For categorical variables with few levels, pie charts can work well

💔 For categorical variables with many levels, pie charts are difficult to read

Waffle charts

  • Like pie charts, they work best when the number of levels represented is low
  • Unlike pie charts, they make it easier to compare proportions that are not simple fractions

Application exercise

ae-04

  • Go to the course GitHub org and find your ae-04 (repo name will be suffixed with your NetID).
  • Clone the repo in RStudio Workbench, open the Quarto document in the repo, and follow along and complete the exercises.
  • Render, commit, and push your edits by the AE deadline – end of tomorrow.

Facets

facet_*()

  • facet_wrap()
    • “wraps” a 1d ribbon of panels into 2d
    • generally for faceting by a single variable
  • facet_grid()
    • produces a 2d grid of panels defined by variables which form the rows and columns
    • generally for faceting by two variables
  • facet_null(): a single plot, the default
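
The two layouts can be compared side by side. This is a minimal sketch, assuming ggplot2 and palmerpenguins are installed. The object names are illustrative, and `ncol = 2` is chosen only to make the wrapping visible.

```r
library(ggplot2)
library(palmerpenguins)

# Drop rows with missing values so the plots build without warnings
penguins_complete <- subset(
  penguins,
  !is.na(flipper_length_mm) & !is.na(body_mass_g)
)

p_base <- ggplot(
  penguins_complete,
  aes(x = flipper_length_mm, y = body_mass_g)
) +
  geom_point()

# One variable: a ribbon of three panels wrapped into two columns
p_wrap <- p_base + facet_wrap(facets = vars(species), ncol = 2)

# Two variables: islands form the rows, species form the columns
p_grid <- p_base + facet_grid(rows = vars(island), cols = vars(species))

p_wrap
p_grid
```

facet_wrap() draws only the three panels it needs, while facet_grid() draws all nine island-by-species cells, four of which are empty because not every species was observed on every island.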

Free the scales!

p <- ggplot(penguins, aes(
  x = flipper_length_mm,
  y = body_mass_g
)) +
  geom_point()

p +
  facet_wrap(facets = vars(species)) +
  labs(title = "Same scales")

p +
  facet_wrap(
    facets = vars(species),
    scales = "free"
  ) +
  labs(title = "Free scales")

Free some scales

p +
  facet_wrap(
    facets = vars(species),
    scales = "free_x"
  ) +
  labs(title = "Free x scale")

p +
  facet_wrap(
    facets = vars(species),
    scales = "free_y"
  ) +
  labs(title = "Free y scale")

Freeing the y scale improves the display, but it’s still not satisfying. What’s wrong with it?

ggplot(penguins, aes(y = species, x = body_mass_g, fill = species)) +
  geom_boxplot(show.legend = FALSE) +
  facet_grid(rows = vars(island)) +
  labs(title = "Same scale and spacing")

ggplot(penguins, aes(y = species, x = body_mass_g, fill = species)) +
  geom_boxplot(show.legend = FALSE) +
  facet_grid(rows = vars(island), scales = "free_y") +
  labs(title = "Free y scale, same spacing")

Free spaces

ggplot(penguins, aes(y = species, x = body_mass_g, fill = species)) +
  geom_boxplot(show.legend = FALSE) +
  facet_grid(rows = vars(island), scales = "free_y", space = "free") +
  labs(title = "Free y scale and spacing")

Highlighting across facets

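# Copy of the data without the faceting variable, so its points repeat in every panel
penguins_sans_species <- penguins |> select(-species)

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  # Background layer: all penguins, drawn in gray in each panel
  geom_point(data = penguins_sans_species, color = "gray") +
  # Foreground layer: only the panel's own species, drawn in color
  geom_point(aes(color = species)) +
  facet_wrap(vars(species))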

Themes

Complete themes

Themes from ggthemes

Themes and color scales from ggthemes

p +
  aes(color = species) +
  scale_color_wsj() +
  theme_wsj() +
  labs(title = "Wall Street Journal")

Modifying theme elements

p +
  labs(title = "Palmer penguins") +
  theme(
    plot.title = element_text(color = "red", face = "bold", family = "Comic Sans MS"),
    plot.background = element_rect(color = "red", fill = "mistyrose")
  )

Project 01

  • Initial proposal
  • Develop as a team
  • Take chances, make mistakes, get messy!