Lecture 6

Dr. Benjamin Soltoff

Cornell University

INFO 3312/5312 - Spring 2024

February 9, 2024

- Homework 1 collected
- Team projects begin tomorrow – look for team assignments in Canvas

- What is the story?
- How does the design account for the time gaps?

Coordinate systems

Facets

Themes (a little bit)

- Combine the two position aesthetics (`x` and `y`) to produce a 2d position on the plot:
  - linear coordinate system: horizontal and vertical coordinates
  - polar coordinate system: angle and radius
  - maps: latitude and longitude

- Draw axes and panel backgrounds in coordination with the coordinate systems

**Linear coordinate systems:** preserve the shape of geoms

- `coord_cartesian()`: the default Cartesian coordinate system, where the 2d position of an element is given by the combination of the x and y positions.
- `coord_flip()`: Cartesian coordinate system with x and y axes flipped *(won’t be using much now that geoms can take aesthetic mappings in x and y axes)*
- `coord_fixed()`: Cartesian coordinate system with a fixed aspect ratio. *(useful only in limited circumstances)*

**Non-linear coordinate systems:** can change the shape of geoms. A straight line may no longer be straight, and the shortest path between two points may no longer be a straight line.

- `coord_trans()`: apply arbitrary transformations to x and y positions, *after the data has been processed by the stat*
- `coord_polar()`: polar coordinates
- `coord_map()` / `coord_quickmap()` / `coord_sf()`: map projections
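To make "after the data has been processed by the stat" concrete, here is a minimal sketch contrasting a scale transformation with `coord_trans()`. It assumes the `penguins` data from the palmerpenguins package; the choice of variables and of a linear smoother is an illustration, not something taken from the slides.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Illustrative plot: points plus a linear fit (variables chosen for this sketch)
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, na.rm = TRUE)

# Scale transformation: x is logged BEFORE the stat runs,
# so the model is fit to log10(x) and the fitted line is drawn straight
p_scale <- p + scale_x_log10()

# Coordinate transformation: the stat fits the model on the raw x values,
# and only then is the result bent onto a log10 axis, so the line curves
p_coord <- p + coord_trans(x = "log10")

# x positions of the fitted line as computed by the stat (layer 2)
x_scale <- layer_data(p_scale, 2)$x  # in log10 units
x_coord <- layer_data(p_coord, 2)$x  # in the original mm units
range(x_scale)
range(x_coord)
```

Printing `p_scale` and `p_coord` side by side should show the same axis but a differently shaped fit, because the two approaches hand different x values to the smoother.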

Identify the differences between each plot. Focus on the range of the `x` and `y` axes as well as the contents of the plots.


```
base_plot +
  labs(title = "Plot 1")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).

base_plot +
  scale_x_continuous(limits = c(190, 220)) +
  scale_y_continuous(limits = c(4000, 5000)) +
  labs(title = "Plot 2")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 235 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 235 rows containing missing values (`geom_point()`).

base_plot +
  xlim(190, 220) +
  ylim(4000, 5000) +
  labs(title = "Plot 3")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 235 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 235 rows containing missing values (`geom_point()`).

base_plot +
  coord_cartesian(
    xlim = c(190, 220),
    ylim = c(4000, 5000)
  ) +
  labs(title = "Plot 4")
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).
```

- Setting scale limits: any data outside the limits is thrown away
  - `scale_*_continuous(limits = ...)`
  - `xlim()` and `ylim()`
- Setting coordinate system limits: use all the data, but only display a small region of the plot (zooming in)
  - `coord_cartesian(xlim = ..., ylim = ...)`
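The difference between the two kinds of limits can be checked by counting how many points each plot keeps. This is a minimal sketch: the definition of `base_plot` here is a hypothetical stand-in (the slides do not show it), and the `penguins` data is assumed to come from the palmerpenguins package.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Hypothetical stand-in for base_plot; the real definition is not in the slides
base_plot <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point() +
  geom_smooth()

# Scale limits: out-of-range values are turned into NA before the stat runs,
# so the smoother is fit to the reduced data
p_scale <- base_plot +
  scale_x_continuous(limits = c(190, 220)) +
  scale_y_continuous(limits = c(4000, 5000))

# Coordinate limits: every observation is kept and the view is zoomed in
p_coord <- base_plot +
  coord_cartesian(xlim = c(190, 220), ylim = c(4000, 5000))

# Count the points (layer 1) that still have both an x and a y position
n_scale <- sum(complete.cases(layer_data(p_scale, 1)[, c("x", "y")]))
n_coord <- sum(complete.cases(layer_data(p_coord, 1)[, c("x", "y")]))
c(scale_limits = n_scale, coord_limits = n_coord)
```

Because the smoother in `p_scale` only sees the surviving points, its curve can differ from the one in `p_coord` even within the same visible region.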

`coord_fixed()`

Useful when having an aspect ratio of 1 makes sense, e.g. scores on two tests (reading and writing) on the same scale (0 to 100 points)
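As a sketch of the same idea with data already used in this lecture (rather than the test-score example, for which no data is given), bill length and bill depth are both measured in millimeters, so a 1:1 aspect ratio is meaningful. The `penguins` data is assumed to come from the palmerpenguins package, and the choice of variables is an illustration.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Both axes are in millimeters, so one unit on x should be drawn
# the same length as one unit on y
p_fixed <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(na.rm = TRUE) +
  coord_fixed(ratio = 1) +
  labs(title = "Fixed 1:1 aspect ratio")

p_fixed
```

With the ratio fixed, resizing the plot window changes the overall size of the panel but not the relative length of a millimeter on each axis.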

`coord_polar()`

```
ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  labs(title = "Stacked bar chart")

ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  coord_polar(theta = "y") +
  labs(title = "Pie chart")

ggplot(penguins, aes(x = 1, fill = species)) +
  geom_bar() +
  coord_polar(theta = "x") +
  labs(title = "Bullseye chart")
```

aside: about pie charts…

What do you know about pie charts and data visualization best practices? Love ’em or lose ’em?

❤️ For categorical variables with few levels, pie charts can work well

💔 For categorical variables with many levels, pie charts are difficult to read

- Like with pie charts, work best when the number of levels represented is low
- Unlike pie charts, easier to compare proportions that represent non-simple fractions

`ae-04`

- Go to the course GitHub org and find your `ae-04` (repo name will be suffixed with your NetID).
- Clone the repo in RStudio Workbench, open the Quarto document in the repo, and follow along and complete the exercises.
- Render, commit, and push your edits by the AE deadline – end of tomorrow.


`facet_*()`

- `facet_wrap()`
  - “wraps” a 1d ribbon of panels into 2d
  - generally for faceting by a single variable
- `facet_grid()`
  - produces a 2d grid of panels defined by variables which form the rows and columns
  - generally for faceting by two variables
- `facet_null()`: a single plot, the default
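A minimal sketch of the two main faceting functions, assuming the `penguins` data from the palmerpenguins package; the variables chosen for the panels are an illustration.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE)

# One faceting variable: panels are laid out as a ribbon that wraps into 2d
p_wrap <- p + facet_wrap(vars(species))

# Two faceting variables: one defines the rows, the other the columns
p_grid <- p + facet_grid(rows = vars(island), cols = vars(species))

p_wrap
p_grid
```

Note that `facet_grid()` draws a panel for every row and column combination, so island and species pairs with no observations appear as empty panels.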

Freeing the y scale improves the display, but it’s still not satisfying. What’s wrong with it?

```
ggplot(penguins, aes(y = species, x = body_mass_g, fill = species)) +
  geom_boxplot(show.legend = FALSE) +
  facet_grid(rows = vars(island)) +
  labs(title = "Same scale and spacing")

ggplot(penguins, aes(y = species, x = body_mass_g, fill = species)) +
  geom_boxplot(show.legend = FALSE) +
  facet_grid(rows = vars(island), scales = "free_y") +
  labs(title = "Free y scale, same spacing")
```
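One possible answer, offered as a sketch rather than as the lecture's own solution: with only `scales = "free_y"`, every panel keeps the same height, so an island with a single species gets a much thicker box than an island with two. The `space` argument of `facet_grid()` lets panel heights follow the number of categories shown. The `penguins` data is assumed to come from the palmerpenguins package, and the plot title is made up for this example.

```r
library(ggplot2)
library(palmerpenguins)  # assumed source of the penguins data

# Free the y scale AND the panel heights, so each species gets
# roughly the same amount of vertical space on every island
p_free_space <- ggplot(
  penguins,
  aes(y = species, x = body_mass_g, fill = species)
) +
  geom_boxplot(show.legend = FALSE, na.rm = TRUE) +
  facet_grid(rows = vars(island), scales = "free_y", space = "free_y") +
  labs(title = "Free y scale and spacing")

p_free_space
```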

- Initial proposal
- Develop as a team
- Take chances, make mistakes, get messy!