Lecture 10
Cornell University
INFO 3312/5312 - Spring 2025
February 25, 2025
Source: @MarcBodnick
How can the following figure be improved with custom breaks in axes, if at all?
Image credit: STA 313
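As a starting point for the question above, here is a minimal sketch of setting custom axis breaks in ggplot2. The built-in `mpg` data is only a stand-in for the data behind the figure, and the break values are illustrative assumptions.

```r
library(ggplot2)

# `mpg` ships with ggplot2 and stands in for the figure's data (an assumption)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # Place x-axis ticks at whole liters instead of the default breaks
  scale_x_continuous(breaks = seq(2, 7, by = 1)) +
  # Choose y-axis ticks by hand and widen the limits so they all show
  scale_y_continuous(breaks = c(15, 25, 35, 45), limits = c(10, 45))

p
```

Whether custom breaks help depends on the figure: they are most useful when the default ticks miss values the reader cares about.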
“This is what customers pay us for – to sweat all these details so it’s easy and pleasant for them to use our computers.”
How can plots be annotated to enhance their clarity and interpretability?
Image credit: Flowing Data
Image credit: Charted: Median House Prices vs. Income in the U.S.
geom_text(), geom_label(), geom_text_repel(), etc.
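library(dplyr)
library(ggplot2)
library(ggrepel) # provides geom_label_repel()

# gapminder_europe is assumed to be the European subset of gapminder with a
# logical should_be_labeled column marking the countries to call out
ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point(aes(color = should_be_labeled)) +
  # Label only the flagged countries; ggrepel keeps labels from overlapping
  geom_label_repel(
    data = filter(gapminder_europe, should_be_labeled),
    aes(label = country, fill = should_be_labeled),
    color = "white"
  ) +
  # FALSE maps to grey, TRUE maps to red
  scale_color_manual(values = c("grey50", "red")) +
  scale_fill_manual(values = c("red")) +
  guides(color = "none", fill = "none")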
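The plot above relies on a `gapminder_europe` data frame with a logical `should_be_labeled` column, which the slides do not show being built. One possible way to construct it is sketched below. The year (2007) and the three countries are assumptions chosen for illustration, not necessarily the ones used in the lecture.

```r
library(dplyr)
library(gapminder)

# Assumption: a single year of European data, with three hand-picked
# countries flagged for labeling
gapminder_europe <- gapminder |>
  filter(year == 2007, continent == "Europe") |>
  mutate(
    should_be_labeled = country %in% c("Albania", "Norway", "Hungary")
  )

# Check how many rows are flagged
count(gapminder_europe, should_be_labeled)
```

Storing the flag as a column lets the same variable drive the point color, the label subset, and the label fill.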
library(dplyr)
library(ggplot2)
library(gapminder)

# Highlight just Oceania
gapminder_highlighted <- gapminder |>
  mutate(is_oceania = continent == "Oceania")

ggplot(
  gapminder_highlighted,
  aes(
    x = year, y = lifeExp,
    group = country,
    color = is_oceania,
    linewidth = is_oceania
  )
) +
  geom_line() +
  scale_color_manual(values = c("grey70", "red")) +
  # linewidth replaces size for line thickness as of ggplot2 3.4.0
  scale_linewidth_manual(values = c(0.1, 0.5)) +
  guides(color = "none", linewidth = "none") +
  theme_minimal()
geom_text(), geom_label(), geom_text_repel(), etc.
annotate()
library(ggplot2)

ggplot(
  gapminder_europe,
  aes(x = gdpPercap, y = lifeExp)
) +
  geom_point() +
  # Shaded box around the region of interest
  annotate(
    geom = "rect",
    xmin = 30000, xmax = 50000,
    ymin = 78, ymax = 82,
    fill = "red", alpha = 0.2
  ) +
  # Text label placed just below the box
  annotate(
    geom = "label",
    x = 40000, y = 76.5,
    label = "Rich and long-living"
  ) +
  # Arrow pointing from the label up to the box
  annotate(
    geom = "segment",
    x = 40000, xend = 40000,
    y = 76.8, yend = 77.8,
    arrow = arrow(length = unit(0.1, "in"))
  )
Rows: 6,020
Columns: 9
$ iso2c <chr> "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF", "AF"…
$ iso3c <chr> "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG", "AFG",…
$ country <chr> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "…
$ year <dbl> 2001, 1998, 2009, 2000, 2012, 1996, 1999, 2002, 2003, 2004, 2005, 2006, 2007…
$ population <dbl> 19688632, 18493132, 27385307, 19542982, 30466479, 17106595, 19262847, 210002…
$ co2_emissions <dbl> 0.05529272, 0.07126970, 0.23950690, 0.05516661, 0.33506104, 0.08226652, 0.05…
$ gdp_per_cap <dbl> NA, NA, 490.2728, NA, 570.6761, NA, NA, 344.2242, 347.4152, 338.7394, 363.54…
$ region <chr> "South Asia", "South Asia", "South Asia", "South Asia", "South Asia", "South…
$ income <chr> "Low income", "Low income", "Low income", "Low income", "Low income", "Low i…
Credit: Data Visualization with R
# A tibble: 170 × 9
iso3c country region income rank_1995 rank_2020 rank_diff big_change better_big_change
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <lgl> <chr>
1 ZWE Zimbabwe Sub-S… Lower… 75 39 -36 TRUE Rank improved
2 DNK Denmark Europ… High … 160 127 -33 TRUE Rank improved
3 SWE Sweden Europ… High … 132 100 -32 TRUE Rank improved
4 SYR Syrian Arab Repub… Middl… Low i… 96 64 -32 TRUE Rank improved
5 MLT Malta Middl… High … 128 99 -29 FALSE Rank changed a l…
6 EST Estonia Europ… High … 161 133 -28 FALSE Rank changed a l…
7 UKR Ukraine Europ… Lower… 139 111 -28 FALSE Rank changed a l…
8 YEM Yemen, Rep. Middl… Low i… 53 25 -28 FALSE Rank changed a l…
9 VEN Venezuela, RB Latin… Not c… 119 92 -27 FALSE Rank changed a l…
10 SWZ Eswatini Sub-S… Lower… 79 53 -26 FALSE Rank changed a l…
# ℹ 160 more rows
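The slides do not show how `rank_diff` and `big_change` were derived. The sketch below reproduces those two columns for four rows copied from the table above. The cutoff of 30 is an assumption: it is consistent with the rows shown (a change of 32 is flagged, a change of 29 is not), but the actual value used may differ.

```r
library(dplyr)

# Four rows copied from the rank table above
ranks <- tibble(
  iso3c = c("ZWE", "DNK", "MLT", "EST"),
  rank_1995 = c(75, 160, 128, 161),
  rank_2020 = c(39, 127, 99, 133)
)

# Hypothetical cutoff for what counts as a big change in rank
threshold <- 30

ranks <- ranks |>
  mutate(
    # Negative values mean the rank number fell between 1995 and 2020
    rank_diff = rank_2020 - rank_1995,
    # Flag changes of at least `threshold` places in either direction
    big_change = abs(rank_diff) >= threshold
  )

ranks
```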
ae-09: Improving the chart through annotation
Instructions
Brainstorm methods to improve the readability and interpretability of the chart through annotations
What is a “good” rank? What is a “bad” rank?
What are the countries that have significantly improved or worsened their rank?
What other aspects do you feel should be emphasized?
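One possible direction for the brainstorm, sketched with five rows copied from the rank table: a dashed reference line for "no change", color to separate the flagged countries, and direct labels on those countries only. The chart type, the annotation wording, and all coordinates are illustrative assumptions, not the intended solution.

```r
library(ggplot2)

# Rows copied from the rank table; a full chart would use all 170 countries
ranks <- data.frame(
  country = c("Zimbabwe", "Denmark", "Sweden", "Malta", "Estonia"),
  rank_1995 = c(75, 160, 132, 128, 161),
  rank_2020 = c(39, 127, 100, 99, 133),
  big_change = c(TRUE, TRUE, TRUE, FALSE, FALSE)
)

p <- ggplot(ranks, aes(x = rank_1995, y = rank_2020)) +
  # Points on this line kept the same rank in both years
  geom_abline(slope = 1, intercept = 0, linetype = "dashed", color = "grey50") +
  geom_point(aes(color = big_change), size = 2) +
  # Label only the countries flagged as big changes
  geom_text(
    data = subset(ranks, big_change),
    aes(label = country),
    vjust = -1
  ) +
  # Explain how to read positions relative to the reference line
  annotate(
    geom = "text", x = 60, y = 150,
    label = "Above the line:\nrank number went up",
    hjust = 0
  ) +
  scale_color_manual(values = c("grey50", "red")) +
  guides(color = "none") +
  labs(x = "Rank in 1995", y = "Rank in 2020") +
  theme_minimal()

p
```

The annotation text deliberately describes only the direction of change; deciding whether a higher or lower rank is "good" is part of the exercise.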