AE 16: Drawing maps with {ggmap}
Suggested answers
Packages
library(tidyverse)
library(ggmap)
library(colorspace)
# set default theme
theme_set(theme_minimal())
Load NYC 311 reports
Let’s first load a subset of 311 service requests in New York City.[1]
# load data
nyc_311 <- read_csv(file = "data/nyc-311.csv")
Rows: 43323 Columns: 6
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (2): complaint_type, borough
dbl (3): unique_key, latitude, longitude
dttm (1): created_date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(nyc_311)
Rows: 43,323
Columns: 6
$ unique_key <dbl> 44118718, 44144807, 44153542, 44156435, 44160451, 44170…
$ created_date <dttm> 2019-10-22 11:44:38, 2019-10-25 19:58:29, 2019-10-26 1…
$ complaint_type <chr> "Food Poisoning", "Food Poisoning", "Food Poisoning", "…
$ borough <chr> "QUEENS", "QUEENS", "BROOKLYN", "MANHATTAN", "BROOKLYN"…
$ latitude <dbl> 40.71226, 40.75848, 40.66921, 40.72604, 40.70665, 40.78…
$ longitude <dbl> -73.88848, -73.82961, -73.86451, -73.98955, -73.92297, …
This subset includes 311 service requests related to Food Poisoning in commercial establishments (e.g. restaurants, cafeterias, food carts).
Register a Stadia Maps API Key
Your turn: Store your Stadia Maps API key by calling register_stadiamaps(key = "YOUR-API-KEY", write = TRUE), replacing "YOUR-API-KEY" with your actual API key. Otherwise you will not be able to obtain map tiles or complete the application exercise.
Obtain map tiles for New York City
Your turn: Use bboxfinder.com to find bounding box coordinates for New York City. Then, use get_stadiamap() to obtain map tiles for New York City and visualize the map.
I recommend a zoom level of 11.
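The heatmap code further down calls ggmap(nyc), so the tiles need to be stored in an object named nyc. Here is a minimal sketch, assuming an approximate bounding box (the coordinates and the name nyc_bb are illustrative, not the instructor's exact values) and the "stamen_toner_lite" tile style (also an assumption; other styles supported by get_stadiamap() work too). The request only runs when a Stadia Maps key has been registered.

```r
library(ggmap)

# approximate bounding box for New York City
# (assumed values; refine them with bboxfinder.com)
nyc_bb <- c(
  left = -74.26,
  bottom = 40.48,
  right = -73.70,
  top = 40.92
)

# tiles can only be downloaded once an API key is registered
if (has_stadiamaps_key()) {
  nyc <- get_stadiamap(
    bbox = nyc_bb,
    zoom = 11,
    maptype = "stamen_toner_lite"
  )

  # draw the raster map
  print(ggmap(nyc))
}
```

A higher zoom value returns more detailed tiles but takes longer to download, so it may be worth starting at 11 and adjusting from there.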
Food poisoning rates
The COVID-19 pandemic caused massive disruption in the restaurant industry. Due to social distancing measures and lockdowns, restaurant traffic decreased significantly.
While this had significant financial ramifications, one potentially overlooked consequence is the impact on food poisoning rates. With fewer people eating out, the number of food poisoning complaints may have decreased.
Your turn: Visualize the geospatial distribution of complaints related to food poisoning in NYC in March, April, and May over a six-year time period (2018-23). Construct the chart in such a way that you can make valid comparisons over time and geographically. What impact did COVID-19 have on food poisoning cases in NYC? Did it vary geographically?
nyc_covid_food_poison <- nyc_311 |>
  # generate a year variable
  mutate(year = year(created_date)) |>
  # only keep reports in March, April, and May from 2018-23
  filter(month(created_date) %in% 3:5, year %in% 2018:2023)
ggmap(nyc) +
  # add the heatmap
  stat_density_2d(
    data = nyc_covid_food_poison,
    mapping = aes(
      x = longitude,
      y = latitude,
      fill = after_stat(level)
    ),
    alpha = .1,
    bins = 50,
    geom = "polygon"
  ) +
  scale_fill_viridis_c() +
  facet_wrap(facets = vars(year))
Warning: Removed 39 rows containing non-finite outside the scale range
(`stat_density2d()`).
Your turn: Now visualize the change in food complaints over time without making a map. How else could you represent this data? Does making it a map improve the understanding of the data, or add more confusion?
# do we need a map?
nyc_311 |>
  # generate a year variable
  mutate(year = year(created_date)) |>
  filter(month(created_date) %in% 3:5, year < 2025) |>
  count(year) |>
  ggplot(mapping = aes(x = year, y = n)) +
  geom_line() +
  geom_point() +
  labs(
    x = NULL,
    y = NULL,
    title = "Food poisoning complaints in NYC dropped during COVID-19",
    subtitle = "311 complaints recorded in March, April, and May",
    caption = "Source: NYC Open Data"
  )
# by borough
nyc_311 |>
  # generate a year variable
  mutate(year = year(created_date)) |>
  filter(month(created_date) %in% 3:5, year < 2025) |>
  # count number of cases by borough and year
  count(borough, year) |>
  # remove NAs and Unspecified locations
  drop_na() |>
  filter(borough != "Unspecified") |>
  # order the boroughs meaningfully for the guide
  mutate(borough = borough |>
    str_to_title() |>
    fct_reorder2(.x = year, .y = n)) |>
  ggplot(mapping = aes(x = year, y = n, color = borough)) +
  geom_line() +
  geom_point() +
  scale_color_discrete_qualitative() +
  labs(
    x = NULL,
    y = NULL,
    color = NULL,
    title = "Food poisoning complaints in NYC dropped during COVID-19",
    subtitle = "311 complaints recorded in March, April, and May",
    caption = "Source: NYC Open Data"
  )
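To put a number on the change the line charts show, one option is to compute the percent change in complaints from one year to the next. The sketch below uses made-up counts (toy_counts is hypothetical and does not reflect the real 311 totals); with the actual data, the output of count(year) would take its place.

```r
library(dplyr)

# hypothetical yearly counts, for illustration only
toy_counts <- tibble(
  year = 2018:2021,
  n = c(100, 120, 60, 90)
)

# percent change relative to the previous year
# (the first year has no predecessor, so its value is NA)
toy_changes <- toy_counts |>
  mutate(pct_change = (n - lag(n)) / lag(n) * 100)

toy_changes
```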
Visualize food poisoning complaints on Roosevelt Island
Your turn: Now focus on food poisoning complaints on or around Roosevelt Island.[2] Use get_stadiamap() to obtain map tiles for the Roosevelt Island region and overlay them with the food poisoning complaints. What type of chart is more effective for this task?
- Consider adjusting your zoom for this geographic region.
- Try a different set of map tiles. Which one is both interpretable and aesthetically pleasing?
# Obtain map tiles for Roosevelt Island
roosevelt_bb <- c(
  left = -73.967121,
  bottom = 40.748700,
  right = -73.937080,
  top = 40.774704
)
roosevelt <- get_stadiamap(
  bbox = roosevelt_bb,
  zoom = 14,
  maptype = "stamen_watercolor"
)
ℹ © Stadia Maps © Stamen Design © OpenMapTiles © OpenStreetMap contributors.
# Generate a scatterplot of food poisoning complaints
ggmap(roosevelt) +
  # add a scatterplot layer
  geom_point(
    data = nyc_311,
    mapping = aes(
      x = longitude,
      y = latitude
    ),
    alpha = 0.5
  )
Warning: Removed 42271 rows containing missing values or values outside the scale range
(`geom_point()`).
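This warning is expected: nyc_311 covers the whole city, so most reports either fall outside the Roosevelt Island tiles or have no recorded coordinates, and they are dropped when the layer is drawn. One way to avoid the warning is to keep only the reports inside the bounding box before plotting. The sketch below uses made-up coordinates (toy_reports is hypothetical; with the real data, nyc_311 would take its place).

```r
library(dplyr)

# bounding box used for the Roosevelt Island tiles
roosevelt_bb <- c(
  left = -73.967121,
  bottom = 40.748700,
  right = -73.937080,
  top = 40.774704
)

# toy stand-in for nyc_311 (hypothetical coordinates)
toy_reports <- tibble(
  unique_key = 1:5,
  latitude = c(40.7612, 40.7123, NA, 40.7580, 40.7700),
  longitude = c(-73.9510, -73.8885, -73.9500, -73.9900, -73.9400)
)

# keep reports with known coordinates that fall inside the bounding box;
# rows with missing coordinates evaluate to NA and are dropped by filter()
roosevelt_reports <- toy_reports |>
  filter(
    between(longitude, roosevelt_bb[["left"]], roosevelt_bb[["right"]]),
    between(latitude, roosevelt_bb[["bottom"]], roosevelt_bb[["top"]])
  )

roosevelt_reports
```

The filtered data frame can then be passed to geom_point() in place of the full data set.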
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.4.2 (2024-10-31)
os macOS Sonoma 14.6.1
system aarch64, darwin20
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/New_York
date 2025-03-26
pandoc 3.4 @ /usr/local/bin/ (via rmarkdown)
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
archive 1.1.9 2024-09-12 [1] CRAN (R 4.4.1)
bit 4.0.5 2022-11-15 [1] CRAN (R 4.3.0)
bit64 4.0.5 2020-08-30 [1] CRAN (R 4.3.0)
bitops 1.0-7 2021-04-24 [1] CRAN (R 4.3.0)
cli 3.6.3 2024-06-21 [1] CRAN (R 4.4.0)
colorspace * 2.1-1 2024-07-26 [1] CRAN (R 4.4.0)
crayon 1.5.3 2024-06-20 [1] CRAN (R 4.4.0)
curl 6.2.1 2025-02-19 [1] RSPM
dichromat 2.0-0.1 2022-05-02 [1] CRAN (R 4.3.0)
digest 0.6.37 2024-08-19 [1] CRAN (R 4.4.1)
dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.3.1)
evaluate 1.0.3 2025-01-10 [1] CRAN (R 4.4.1)
farver 2.1.2 2024-05-13 [1] CRAN (R 4.3.3)
fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.4.0)
forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.3.0)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.3.0)
ggmap * 4.0.0 2024-01-09 [1] Github (stadiamaps/ggmap@dbbca88)
ggplot2 * 3.5.1 2024-04-23 [1] CRAN (R 4.3.1)
glue 1.8.0 2024-09-30 [1] CRAN (R 4.4.1)
gtable 0.3.6 2024-10-25 [1] CRAN (R 4.4.1)
here 1.0.1 2020-12-13 [1] CRAN (R 4.3.0)
hms 1.1.3 2023-03-21 [1] CRAN (R 4.3.0)
htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.3.1)
htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.3.1)
httr 1.4.7 2023-08-15 [1] CRAN (R 4.3.0)
isoband 0.2.7 2022-12-20 [1] CRAN (R 4.3.0)
jpeg 0.1-10 2022-11-29 [1] CRAN (R 4.3.0)
jsonlite 1.8.9 2024-09-20 [1] CRAN (R 4.4.1)
knitr 1.49 2024-11-08 [1] CRAN (R 4.4.1)
labeling 0.4.3 2023-08-29 [1] CRAN (R 4.3.0)
lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.1)
lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.3.1)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0)
MASS 7.3-61 2024-06-13 [1] CRAN (R 4.4.2)
pillar 1.10.1 2025-01-07 [1] CRAN (R 4.4.1)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.3.0)
plyr 1.8.9 2023-10-02 [1] CRAN (R 4.3.1)
png 0.1-8 2022-11-29 [1] CRAN (R 4.3.0)
purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.3.0)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.0)
RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.3.0)
Rcpp 1.0.14 2025-01-12 [1] CRAN (R 4.4.1)
readr * 2.1.5 2024-01-10 [1] CRAN (R 4.3.1)
rlang 1.1.5 2025-01-17 [1] CRAN (R 4.4.1)
rmarkdown 2.29 2024-11-04 [1] CRAN (R 4.4.1)
rprojroot 2.0.4 2023-11-05 [1] CRAN (R 4.3.1)
scales 1.3.0.9000 2025-03-19 [1] Github (bensoltoff/scales@71d8f13)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0)
stringi 1.8.4 2024-05-06 [1] CRAN (R 4.3.1)
stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.3.1)
tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.3.0)
tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.3.1)
tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.3.1)
tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.3.0)
timechange 0.3.0 2024-01-18 [1] CRAN (R 4.3.1)
tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.3.0)
vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.3.1)
viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.3.0)
vroom 1.6.5 2023-12-05 [1] CRAN (R 4.3.1)
withr 3.0.2 2024-10-28 [1] CRAN (R 4.4.1)
xfun 0.50.5 2025-01-15 [1] https://yihui.r-universe.dev (R 4.4.2)
yaml 2.3.10 2024-07-26 [1] CRAN (R 4.4.0)
[1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
──────────────────────────────────────────────────────────────────────────────
Footnotes
1. These reports were obtained from the NYC Open Data portal API.
2. Also the location of Cornell Tech.