I’ve been downloading Census data via its API for nearly two years. Once you’ve determined which tables and variables you want, it becomes a breeze to replicate downloads.
559/224-7388: Landline (not answered unless I know the number)
559/355-1678: cell
Due to the current health emergency, Caltrans' Planning & Local Assistance staff are working remotely whenever possible. While there may be some delay in our response times, we continue to be
available via email and remain committed to customer service.
From: Charles Purvis <clpurvis@att.net>
Sent: Wednesday, September 20, 2023 4:13 PM
To: The Census Transportation Products Program Community of Practice/Users discussion and news list <ctpp@listserv.transportation.org>
Subject: [CTPP News] Geo-within-Geo: Census 2020: Places within My Counties
I’ve found a newer, simpler method to obtain Census 2020 data on cities (places) within your counties / MPO region.
It involves using the R packages (tidyverse, jsonlite, tidycensus) and the Census Bureau’s API system. I’d been reluctant to use the Bureau’s APIs; they seemed as easy to decipher as Egyptian hieroglyphics.
I think the Census Bureau’s “API examples” are worth checking out.
To get places within counties, we need data from summary level 159, place (or part) within county. (My previous efforts used summary levels 155 and 160.)
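To make the distinction concrete, here are the two `for` clauses side by side (a sketch; the exact clause names come from the endpoint’s geography listing at api.census.gov, which is worth double-checking):

```r
# sumlev 160: whole places in a state
url_160 <- "https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N&for=place:*&in=state:06"

# sumlev 159: the pieces of each place that fall inside specific counties
url_159 <- "https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N&for=place%20(or%20part):*&in=state:06%20county:001"
```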
The function “fromJSON” in the “jsonlite” R package reads the raw API call’s response as JSON (JavaScript Object Notation). The JSON object is then converted into a standard R data frame using the function “as.data.frame()”.
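A minimal offline illustration of that two-step conversion, using a made-up two-row response that mimics the API’s shape (first row holds the column names, everything arrives as strings):

```r
library(jsonlite)

# A made-up response in the Census API's shape: row 1 holds column
# names, later rows hold values, all as strings.
json_text <- '[["NAME","P1_001N","state"],
               ["Alameda County, California","1682353","06"]]'

temp <- fromJSON(json_text)   # jsonlite simplifies this to a character matrix
df   <- as.data.frame(temp)   # columns default to V1, V2, V3
```

The header row travels along as an ordinary data row, which is why the real script below filters it out and renames the V* columns afterward.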
Here are snippets from my R script:
Load appropriate libraries into your R session
library(jsonlite)
library(tidyverse)
library(tidycensus)
Then pull data from the Census Bureau’s API by wrapping the call in the “fromJSON” function.
My example pulls five variables from the 2020 Census DHC, for California, for the nine counties in the SF Bay Area (county FIPS 001 through 097)….
temp1 <- fromJSON("https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N,H8_001N,H3_001N,H3_002N,H3_003N&for=place%20(or%20part):*&in=state:06%20county:001,013,041,055,075,081,085,095,097")
bayarea1 <- as.data.frame(temp1)
The LAST step uses the package “tidyverse” to clean up the variable names and create a “joining” variable, GEOID, that can be used in subsequent analyses of place-level data for your region. Those data can be from the ACS or the decennial
census. The key is having a discrete list of places with an appropriate joining variable (GEOID).
bayarea2 <- bayarea1 %>%
  filter(V1 != "NAME") %>%                           # drop the header row
  mutate_at(c("V2","V3","V4","V5","V6"), as.numeric) %>%
  rename(place_name    = V1,
         totpop_2020   = V2,  # P1_001N
         hhpop_2020    = V3,  # H8_001N
         total_du_2020 = V4,  # H3_001N
         occ_du_2020   = V5,  # H3_002N
         vac_du_2020   = V6,  # H3_003N
         state_fips    = V7,
         county_fips   = V8,
         place_fips    = V9) %>%
  unite(GEOID, c("state_fips", "place_fips"), sep = "", remove = FALSE)
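Once GEOID exists, joining in other place-level data is one line. Here is a sketch with hand-built stand-ins; the GEOIDs and values are illustrative, not real downloads, and in practice the ACS side might come from tidycensus::get_acs(), which also returns a GEOID column:

```r
library(dplyr)

# Illustrative stand-in for bayarea2 from the script above
bayarea2 <- data.frame(GEOID       = c("0600562", "0667000"),
                       totpop_2020 = c(78000, 874000))

# Invented ACS-style table keyed on the same GEOID
acs_places <- data.frame(GEOID         = c("0600562", "0667000"),
                         med_hh_income = c(100000, 120000))

bayarea3 <- left_join(bayarea2, acs_places, by = "GEOID")
```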
That’s it!
My initial foray into these "API calls wrapped in R functions" was to extract Census 2020 data on other “geo-within-geo” combinations of interest:
1. Congressional Districts by County
2. Congressional Districts by Place
3. Lower State Houses by County
4. Lower State Houses by Place
5. Upper State Houses by County
6. Upper State Houses by Place
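For those pulls the pattern is the same; only the geography clause changes. A sketch of the districts-by-county case, under the assumption that the dec/dhc endpoint exposes a county-part-within-congressional-district hierarchy (confirm the exact clause names against the endpoint’s geography listing before relying on them):

```r
# Assumed geography clause; verify against the dec/dhc geography listing.
url <- paste0("https://api.census.gov/data/2020/dec/dhc",
              "?get=NAME,P1_001N",
              "&for=county%20(or%20part):*",
              "&in=state:06%20congressional%20district:*")

# Uncomment to run the live call (needs network access):
# cd_by_county <- as.data.frame(fromJSON(url))
```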
I also had to master the process of scraping data from sites like Wikipedia and the Daily Kos to get data that I could merge with census data. I won’t publish details on this in the CTPP listserv, but if you’re interested in these, let
me know.
My inspiration was trying to find the congressional districts and state legislators that are in my region. If you’re an MPO, you probably want to know whose door to be knocking on. :)
Chuck