I’ve been downloading Census data through its API for nearly two years. Once
you’ve determined which tables and variables you want, it becomes a breeze to replicate
downloads.
From: Charles Purvis <clpurvis(a)att.net>
Sent: Wednesday, September 20, 2023 4:13 PM
To: The Census Transportation Products Program Community of Practice/Users discussion and
news list <ctpp(a)listserv.transportation.org>
Subject: [CTPP News] Geo-within-Geo: Census 2020: Places within My Counties
I’ve found a newer, simpler method to obtain Census 2020 data on cities (places) within
your counties / MPO region.
It involves using R packages (tidyverse, jsonlite, tidycensus) and the Census Bureau’s
API system. I’ve been reluctant to use the Bureau’s APIs; they seem as easy to decipher
as Egyptian hieroglyphics.
I think the Census Bureau’s “API examples” are worth checking out:
https://api.census.gov/data/2020/dec/dhc/examples.html
To get places within counties, we need to find data from summary level 159. (My previous
efforts used sumlev 155 and 160.)
The function “fromJSON” in the “jsonlite” R package reads the raw API response (JSON,
i.e., JavaScript Object Notation) into an R object. That object is then converted into a
standard R data frame using the function “as.data.frame()”.
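To see the pattern without hitting the API, here is a tiny self-contained sketch. The JSON string just mimics the shape of a Census API response (a header row followed by data rows); the place name and value are illustrative, not real data:

```r
library(jsonlite)

# Census API responses are JSON arrays of arrays: a header row, then data rows.
raw <- '[["NAME","P1_001N"],["Sometown city, California","12345"]]'

json_file <- fromJSON(raw)        # parses into a character matrix
df <- as.data.frame(json_file)    # columns are named V1, V2, ...
# Row 1 of df holds the header labels; that row gets filtered out later.
```

Everything comes back as character strings, which is why the cleanup step below converts the count columns with as.numeric.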
Here are snippets from my R script:
Load the appropriate libraries into your R session:
library(jsonlite)
library(tidyverse)
library(tidycensus)
Then pull data from the Census Bureau’s API, wrapping the call in the “fromJSON”
function.
My example pulls five variables from the 2020 Census DHC, for California, for the nine
counties in the SF Bay Area (FIPS codes 001 through 097)….
temp1 <-
fromJSON("https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N,H8…
bayarea1 <- as.data.frame(temp1)
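The full URL above got truncated in transit, and I won’t guess at its exact geography clause here. But the same wrapper works with any complete API URL; for reference, here is the pattern with a short, complete call (state-level total population for California — the geography is simpler than the places-within-counties call above):

```r
library(jsonlite)

# Same fromJSON-wrapped pattern, with a short, complete URL.
temp0 <- fromJSON("https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N&for=state:06")
ca <- as.data.frame(temp0)
# Row 1 holds the header labels (NAME, P1_001N, state); row 2 holds the values.
```

Add your API key with &key=... if you make more than a handful of calls per day.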
And the LAST step uses “tidyverse” to clean up the variable names and create a
“joining” variable, GEOID, that can be used in subsequent analyses of place-level data for
your region. Those data can come from the ACS or the Decennial Census; the key is having a
discrete list of places with an appropriate joining variable (GEOID).
bayarea2 <- bayarea1 %>%
  filter(V1 != "NAME") %>%
  mutate_at(c("V2","V3","V4","V5","V6"), as.numeric) %>%
rename(place_name = V1,
totpop_2020 = V2, # P1_001N
hhpop_2020 = V3, # H8_001N
total_du_2020 = V4, # H3_001N
occ_du_2020 = V5, # H3_002N
vac_du_2020 = V6, # H3_003N
state_fips = V7,
county_fips = V8,
place_fips = V9) %>%
unite(GEOID,c("state_fips","place_fips"),sep="",
remove=FALSE)
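Once GEOID exists, merging in other data is one join. As a sketch (not part of my script above), here is how you might pull a place-level ACS variable with tidycensus and attach it — B01003_001 is the ACS total-population variable, and this assumes you already have a Census API key installed via census_api_key():

```r
library(tidycensus)
library(tidyverse)

# Pull ACS 5-year total population for all California places,
# then keep only the places on our county-based list via the GEOID key.
acs_places <- get_acs(geography = "place",
                      state = "CA",
                      variables = c(acs_totpop = "B01003_001"),
                      year = 2021)

bayarea3 <- bayarea2 %>%
  inner_join(acs_places %>% select(GEOID, acs_totpop = estimate),
             by = "GEOID")
```

The inner join drops any ACS places outside your counties, which is exactly the point of building the discrete place list first.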
That’s it!
My initial foray into these "API calls wrapped in R functions" was to extract
Census 2020 data on other “geo-within-geo” of interest:
1. Congressional Districts by County
2. Congressional Districts by Place
3. Lower State Houses by County
4. Lower State Houses by Place
5. Upper State House by County
6. Upper State House by Place
I also had to master the process of scraping data from sites like Wikipedia and the Daily
Kos to get data I could merge with census data. I won’t publish details on this in
the CTPP listserv, but if you’re interested in these, let me know.
My inspiration was finding the congressional districts and state legislative districts
that fall within my region. If you’re an MPO, you probably want to know whose door to be
knocking on.
:)
Chuck