Lecture 18
Cornell University
INFO 3312/5312 - Spring 2025
March 27, 2025
Original implementation. Inspired by @townusa.bsky.social
.shp
- geographic coordinates.dbf
- data associated with the geographic features.prj
- projection of the coordinates in the shapefile-- geo_export_6fd95df5-1136-4829-8f2d-9cb5d1cc2222.dbf
-- geo_export_6fd95df5-1136-4829-8f2d-9cb5d1cc2222.prj
-- geo_export_6fd95df5-1136-4829-8f2d-9cb5d1cc2222.shp
-- geo_export_6fd95df5-1136-4829-8f2d-9cb5d1cc2222.shx
Uses JavaScript Object Notation (JSON) file format
Plain text files
Image credit: Allison Horst
Source: Simple Features for R
Reading layer `geo_export_6fd95df5-1136-4829-8f2d-9cb5d1cc2222' from data source `/Users/soltoffbc/Projects/info-3312/course-site/slides/data/borough-boundaries'
using driver `ESRI Shapefile'
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS84(DD)
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS84(DD)
boro_code boro_name shape_area shape_leng geometry
1 5 Staten Island 1623631283 325924.0 MULTIPOLYGON (((-74.05051 4...
2 2 Bronx 1187189499 463277.2 MULTIPOLYGON (((-73.89681 4...
3 1 Manhattan 636605816 359103.2 MULTIPOLYGON (((-74.01093 4...
4 3 Brooklyn 1934169229 728478.1 MULTIPOLYGON (((-73.86327 4...
5 4 Queens 3041397430 888238.6 MULTIPOLYGON (((-73.82645 4...
Reading layer `borough-boundaries' from data source
`/Users/soltoffbc/Projects/info-3312/course-site/slides/data/borough-boundaries.geojson'
using driver `GeoJSON'
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS 84
Simple feature collection with 5 features and 4 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS 84
boro_code boro_name shape_area shape_leng
1 5 Staten Island 1623631283.36 325924.002076
2 2 Bronx 1187189499.3 463277.240478
3 1 Manhattan 636605816.437 359103.151368
4 3 Brooklyn 1934169228.83 728478.125489
5 4 Queens 3041397430.33 888238.562635
geometry
1 MULTIPOLYGON (((-74.05051 4...
2 MULTIPOLYGON (((-73.89681 4...
3 MULTIPOLYGON (((-74.01093 4...
4 MULTIPOLYGON (((-73.86327 4...
5 MULTIPOLYGON (((-73.82645 4...
Simple feature collection with 56 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -179.1489 ymin: -14.5487 xmax: 179.7785 ymax: 71.36516
Geodetic CRS: NAD83
First 10 features:
STATEFP STATENS AFFGEOID GEOID STUSPS NAME LSAD ALAND
1 56 01779807 0400000US56 56 WY Wyoming 00 2.514587e+11
2 02 01785533 0400000US02 02 AK Alaska 00 1.478943e+12
3 24 01714934 0400000US24 24 MD Maryland 00 2.515199e+10
4 60 01802701 0400000US60 60 AS American Samoa 00 1.977591e+08
5 05 00068085 0400000US05 05 AR Arkansas 00 1.346608e+11
6 38 01779797 0400000US38 38 ND North Dakota 00 1.786943e+11
7 10 01779781 0400000US10 10 DE Delaware 00 5.046732e+09
8 66 01802705 0400000US66 66 GU Guam 00 5.435558e+08
9 35 00897535 0400000US35 35 NM New Mexico 00 3.141986e+11
10 49 01455989 0400000US49 49 UT Utah 00 2.133551e+11
AWATER geometry
1 1867503716 MULTIPOLYGON (((-111.0546 4...
2 245378425142 MULTIPOLYGON (((179.4825 51...
3 6979074857 MULTIPOLYGON (((-76.05015 3...
4 1307243751 MULTIPOLYGON (((-168.1458 -...
5 3121950081 MULTIPOLYGON (((-94.61792 3...
6 4414779956 MULTIPOLYGON (((-104.0487 4...
7 1399179670 MULTIPOLYGON (((-75.56555 3...
8 934337453 MULTIPOLYGON (((144.6454 13...
9 726482113 MULTIPOLYGON (((-109.0502 3...
10 6529973239 MULTIPOLYGON (((-114.053 37...
ggplot()
Source: xkcd
shift_geometry()
crimes_homicide <- filter(.data = crimes, ofns_desc == "MURDER & NON-NEGL. MANSLAUGHTER")
crimes_homicide
# A tibble: 269 × 7
cmplnt_num boro_nm cmplnt_fr_dt law_cat_cd ofns_desc latitude
<chr> <chr> <dttm> <chr> <chr> <dbl>
1 240954923H1 BROOKLYN 1977-12-20 05:00:00 FELONY MURDER & NON… 40.7
2 245958045H1 BROOKLYN 2001-08-13 04:00:00 FELONY MURDER & NON… 40.7
3 8101169H6113 MANHATTAN 2005-03-06 05:00:00 FELONY MURDER & NON… 40.8
4 8101169H6113 MANHATTAN 2005-03-06 05:00:00 FELONY MURDER & NON… 40.8
5 16631466H8909 BROOKLYN 2006-05-24 04:00:00 FELONY MURDER & NON… 40.7
6 246056367H1 QUEENS 2015-05-13 04:00:00 FELONY MURDER & NON… 40.6
7 243507594H1 MANHATTAN 2020-06-19 04:00:00 FELONY MURDER & NON… 40.8
8 243688124H1 BROOKLYN 2021-01-31 05:00:00 FELONY MURDER & NON… 40.7
9 240767513H1 BROOKLYN 2021-02-17 05:00:00 FELONY MURDER & NON… 40.6
10 240767512H1 BROOKLYN 2021-05-24 04:00:00 FELONY MURDER & NON… 40.6
# ℹ 259 more rows
# ℹ 1 more variable: longitude <dbl>
crimes_homicide_sf <- st_as_sf(x = crimes_homicide, coords = c("longitude", "latitude"))
st_crs(crimes_homicide_sf) <- 4326 # set the coordinate reference system
crimes_homicide_sf
Simple feature collection with 269 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -74.08578 ymin: 40.59087 xmax: -73.73331 ymax: 40.90316
Geodetic CRS: WGS 84
# A tibble: 269 × 6
cmplnt_num boro_nm cmplnt_fr_dt law_cat_cd ofns_desc
* <chr> <chr> <dttm> <chr> <chr>
1 240954923H1 BROOKLYN 1977-12-20 05:00:00 FELONY MURDER & NON-NEGL. MA…
2 245958045H1 BROOKLYN 2001-08-13 04:00:00 FELONY MURDER & NON-NEGL. MA…
3 8101169H6113 MANHATTAN 2005-03-06 05:00:00 FELONY MURDER & NON-NEGL. MA…
4 8101169H6113 MANHATTAN 2005-03-06 05:00:00 FELONY MURDER & NON-NEGL. MA…
5 16631466H8909 BROOKLYN 2006-05-24 04:00:00 FELONY MURDER & NON-NEGL. MA…
6 246056367H1 QUEENS 2015-05-13 04:00:00 FELONY MURDER & NON-NEGL. MA…
7 243507594H1 MANHATTAN 2020-06-19 04:00:00 FELONY MURDER & NON-NEGL. MA…
8 243688124H1 BROOKLYN 2021-01-31 05:00:00 FELONY MURDER & NON-NEGL. MA…
9 240767513H1 BROOKLYN 2021-02-17 05:00:00 FELONY MURDER & NON-NEGL. MA…
10 240767512H1 BROOKLYN 2021-05-24 04:00:00 FELONY MURDER & NON-NEGL. MA…
# ℹ 259 more rows
# ℹ 1 more variable: geometry <POINT [°]>
usa_fb <- get_acs(
geography = "state",
variables = c(total = "B05002_001E", native = "B05002_002E",
foreign = "B05002_013E"),
year = 2023,
output = "wide",
geometry = TRUE
) |>
select(GEOID, NAME, total, native, foreign) |>
mutate(pct_foreign = foreign / total) |>
filter(NAME %in% state.name) |>
shift_geometry()
Simple feature collection with 50 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -3111747 ymin: -1697746 xmax: 2258200 ymax: 1565782
Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
First 10 features:
GEOID NAME total native foreign pct_foreign
1 35 New Mexico 2114768 1919284 195484 0.09243756
2 46 South Dakota 899194 863976 35218 0.03916619
3 06 California 39242785 28763259 10479526 0.26704338
4 21 Kentucky 4510725 4311120 199605 0.04425120
5 01 Alabama 5054253 4862214 192039 0.03799553
6 13 Georgia 10822590 9650102 1172488 0.10833710
7 05 Arkansas 3032651 2874432 158219 0.05217185
8 42 Pennsylvania 12986518 12025271 961247 0.07401884
9 29 Missouri 6168181 5898279 269902 0.04375715
10 08 Colorado 5810774 5254183 556591 0.09578603
geometry
1 MULTIPOLYGON (((-1231344 -5...
2 MULTIPOLYGON (((-633765.6 8...
3 MULTIPOLYGON (((-2066923 -2...
4 MULTIPOLYGON (((584560 -886...
5 MULTIPOLYGON (((760323.7 -7...
6 MULTIPOLYGON (((1390722 -58...
7 MULTIPOLYGON (((122656.3 -1...
8 MULTIPOLYGON (((1287712 486...
9 MULTIPOLYGON (((19008.51 34...
10 MULTIPOLYGON (((-1123223 20...
Simple feature collection with 269 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS 84
First 10 features:
boro_code boro_name shape_area shape_leng cmplnt_num
1 5 Staten Island 1623631283.36 325924.002076 239347883H1
1.1 5 Staten Island 1623631283.36 325924.002076 244818447H1
1.2 5 Staten Island 1623631283.36 325924.002076 246985876H1
2 2 Bronx 1187189499.3 463277.240478 242029394H1
2.1 2 Bronx 1187189499.3 463277.240478 242029394H1
2.2 2 Bronx 1187189499.3 463277.240478 243573864H1
2.3 2 Bronx 1187189499.3 463277.240478 244256173H1
2.4 2 Bronx 1187189499.3 463277.240478 240776535H1
2.5 2 Bronx 1187189499.3 463277.240478 240776535H1
2.6 2 Bronx 1187189499.3 463277.240478 240776535H1
boro_nm cmplnt_fr_dt law_cat_cd
1 STATEN ISLAND 2022-01-18 05:00:00 FELONY
1.1 STATEN ISLAND 2022-05-09 04:00:00 FELONY
1.2 STATEN ISLAND 2022-06-21 04:00:00 FELONY
2 BRONX 2021-06-01 04:00:00 FELONY
2.1 BRONX 2021-06-01 04:00:00 FELONY
2.2 BRONX 2021-10-24 04:00:00 FELONY
2.3 BRONX 2021-12-17 05:00:00 FELONY
2.4 BRONX 2022-01-01 05:00:00 FELONY
2.5 BRONX 2022-01-01 05:00:00 FELONY
2.6 BRONX 2022-01-01 05:00:00 FELONY
ofns_desc geometry
1 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-74.05051 4...
1.1 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-74.05051 4...
1.2 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-74.05051 4...
2 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.1 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.2 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.3 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.4 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.5 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
2.6 MURDER & NON-NEGL. MANSLAUGHTER MULTIPOLYGON (((-73.89681 4...
Simple feature collection with 5 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -74.25559 ymin: 40.49613 xmax: -73.70001 ymax: 40.91553
Geodetic CRS: WGS 84
boro_name n geometry
1 Bronx 101 MULTIPOLYGON (((-73.89679 4...
2 Brooklyn 75 MULTIPOLYGON (((-73.86318 4...
3 Manhattan 46 MULTIPOLYGON (((-74.0086 40...
4 Queens 44 MULTIPOLYGON (((-73.82646 4...
5 Staten Island 3 MULTIPOLYGON (((-74.05054 4...
ae-17
Instructions
ae-17
(repo name will be suffixed with your GitHub name).renv::restore()
to install the required packages, open the Quarto document in the repo, and follow along and complete the exercises.ggplot2::binned_scale()
ggplot2::binned_scale()
with quartilesggplot2::binned_scale()
with quartilesImage credit: xkcd
+proj=laea +lat_0=45 +lon_0=-100 +x_0=0 +y_0=0 +a=6370997 +b=6370997 +units=m +no_defs
PROJCS["US National Atlas Equal Area",
GEOGCS["Unspecified datum based upon the Clarke 1866 Authalic Sphere",
DATUM["Not_specified_based_on_Clarke_1866_Authalic_Sphere",
SPHEROID["Clarke 1866 Authalic Sphere",6370997,0,
AUTHORITY["EPSG","7052"]],
AUTHORITY["EPSG","6052"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.01745329251994328,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4052"]],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]],
PROJECTION["Lambert_Azimuthal_Equal_Area"],
PARAMETER["latitude_of_center",45],
PARAMETER["longitude_of_center",-100],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
AUTHORITY["EPSG","2163"],
AXIS["X",EAST],
AXIS["Y",NORTH]]
Source: StackOverflow
Work through part 2
geom_sf()
to visualize simple features data in {ggplot2}# import health data
# source: https://chronicdata.cdc.gov/500-Cities-Places/PLACES-County-Data-GIS-Friendly-Format-2024-releas/i46a-9kgh/about_data
county_health <- read_csv(file = "data/PLACES__County_Data__GIS_Friendly_Format___2024_release_20250326.csv") |>
select(GEOID = CountyFIPS, drinking = BINGE_AdjPrev, smoking = CSMOKING_AdjPrev)
# get county/state boundaries
county_sf <- counties(cb = TRUE) |>
# only keep counties in US states
semi_join(
y = fips_codes |>
filter(state_name %in% state.name),
by = join_by(
STATEFP == state_code,
COUNTYFP == county_code
)
) |>
# shift Alaska and Hawaii
shift_geometry()
state_sf <- states(cb = TRUE) |>
filter(NAME %in% state.name) |>
# shift Alaska and Hawaii
shift_geometry()
# create map
smoke_drink_map <- left_join(x = county_sf, y = county_health) |>
# use biscales to create bivariate color palette
bi_class(
x = smoking,
y = drinking,
style = "quantile"
) |>
# draw map
ggplot() +
geom_sf(mapping = aes(fill = bi_class), linewidth = 0.1) +
# overlay state boundaries for context
geom_sf(data = state_sf, fill = NA, color = "white") +
# use biscale color palette
bi_scale_fill(pal = "PinkGrn", guide = "none") +
# appropriate labels and theme
labs(
title = "Smoking and excessive drinking in the United States",
caption = "Source: Behavioral Risk Factor Surveillance System (2022)"
) +
theme_map(base_family = "Atkinson Hyperlegible", base_size = 14)
# create legend
smoke_drink_legend <- bi_legend(
pal = "PinkGrn",
dim = 3,
xlab = "Higher % Smokers",
ylab = "Higher % Binge Drinkers",
size = 9,
pad_width = 1.5, pad_color = "white",
rotate_pal = FALSE,
base_family = "Atkinson Hyperlegible"
) +
# fix justification of text labels
theme(
axis.title.x = element_text(hjust = 0),
axis.title.y = element_text(hjust = 0)
)
# combine map + legend with patchwork
smoke_drink_map +
inset_element(smoke_drink_legend,
left = 0, bottom = 0.3, right = 0.20, top = 0.575
)