I’ve been downloading Census data via its API for nearly two years. Once you’ve determined which tables and variables you want, it becomes a breeze to replicate downloads.

 

559/224-7388: Landline (not answered unless I know the number)

559/355-1678: cell

Due to the current health emergency, Caltrans' Planning & Local Assistance staff are working remotely whenever possible. While there may be some delay in our response times, we continue to be available via email and remain committed to customer service.

 

From: Charles Purvis <clpurvis@att.net>
Sent: Wednesday, September 20, 2023 4:13 PM
To: The Census Transportation Products Program Community of Practice/Users discussion and news list <ctpp@listserv.transportation.org>
Subject: [CTPP News] Geo-within-Geo: Census 2020: Places within My Counties

 


I’ve found a newer, simpler method to obtain Census 2020 data on cities (places) within your counties / MPO region.

 

It involves using R packages (tidyverse, jsonlite, tidycensus) and the Census Bureau’s API system. I’ve been reluctant to use the Bureau’s APIs; they seem as easy to decipher as Egyptian hieroglyphics.

 

I think the Census Bureau’s “API examples” are worth checking out:

 

https://api.census.gov/data/2020/dec/dhc/examples.html

 

To get places within counties, we need to find data from summary level 159. (My previous efforts used sumlev 155 and 160.)

 

The function “fromJSON” in the “jsonlite” R package reads the raw JSON (JavaScript Object Notation) returned by the API call into an R object. That object is then converted into a standard R data frame using the function “as.data.frame()”.
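As a minimal, self-contained illustration (using an inline JSON string instead of a live API call, with made-up values), fromJSON parses the Bureau’s array-of-arrays response into a character matrix, and as.data.frame turns that matrix into a data frame with generic column names V1, V2, …:

```r
library(jsonlite)

# Stand-in for a Census API response: a JSON array of arrays whose
# first row is the header. Values here are made up for illustration.
raw_json <- '[["NAME","P1_001N","state"],
              ["Sample City, California","12345","06"],
              ["Example Town, California","678","06"]]'

temp <- fromJSON(raw_json)   # simplifies to a 3 x 3 character matrix
df   <- as.data.frame(temp)  # columns auto-named V1, V2, V3

df$V1[1]   # "NAME" -- the header row rides along as ordinary data
```

Note that every cell, including the counts, comes back as character, which is why the cleanup step later filters out the header row and converts the count columns with as.numeric().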

 

Here are snippets from my R script:

 

Load appropriate libraries into your R session

 

library(jsonlite)
library(tidyverse)
library(tidycensus)

 

Then pull data from the Census Bureau’s API, wrapping the call in the “fromJSON” function.

 

My example pulls five variables from the 2020 Census DHC, for California, for the nine counties of the SF Bay Area (001 through 097)…
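For reference, here are the five variable codes with the meanings the renaming step below assigns to them (a plain named vector; the labels paraphrase the script’s own comments, not the Bureau’s official labels):

```r
# Census 2020 DHC variable codes used in this example, with short labels
dhc_vars <- c(
  P1_001N = "Total population",
  H8_001N = "Household population",
  H3_001N = "Total housing units",
  H3_002N = "Occupied housing units",
  H3_003N = "Vacant housing units"
)
```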

 

temp1 <- fromJSON("https://api.census.gov/data/2020/dec/dhc?get=NAME,P1_001N,H8_001N,H3_001N,H3_002N,H3_003N&for=place%20(or%20part):*&in=state:06%20county:001,013,041,055,075,081,085,095,097")

bayarea1 <- as.data.frame(temp1)

 

And the LAST step uses tidyverse functions to clean up the variable names and create a “joining” variable, GEOID, that can be used in subsequent analyses of place-level data for your region. Those data can be from the ACS or the decennial census. The key is having a discrete list of places with an appropriate joining variable (GEOID).

 

bayarea2 <- bayarea1 %>% 
  filter(V1 != "NAME") %>%
  mutate_at(c("V2","V3","V4","V5","V6"), as.numeric) %>% 
  rename(place_name      = V1,
         totpop_2020     = V2,   # P1_001N
         hhpop_2020      = V3,   # H8_001N
         total_du_2020   = V4,   # H3_001N
         occ_du_2020     = V5,   # H3_002N
         vac_du_2020     = V6,   # H3_003N
         state_fips      = V7,
         county_fips     = V8,
         place_fips      = V9) %>% 
  unite(GEOID, c("state_fips","place_fips"), sep = "", remove = FALSE)
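Once the place list exists, GEOID does the joining work. A sketch of that join with made-up stand-in tables (acs_places here is hypothetical; in practice it might come from, e.g., tidycensus::get_acs() at the place level):

```r
library(dplyr)

# Stand-in for bayarea2: just the GEOID key plus one count (made-up values)
places <- data.frame(
  GEOID       = c("0653000", "0667000"),
  totpop_2020 = c(100, 200)
)

# Hypothetical ACS place-level table keyed by the same GEOID
acs_places <- data.frame(
  GEOID         = c("0653000", "0667000"),
  med_hh_income = c(80000, 90000)
)

# One row per place, with the ACS column attached
joined <- left_join(places, acs_places, by = "GEOID")
```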

 

That’s it! 

 

My initial foray into these "API calls wrapped in R functions" was to extract Census 2020 data on other “geo-within-geo” combinations of interest:

 

1. Congressional Districts by County
2. Congressional Districts by Place
3. Lower State Houses by County
4. Lower State Houses by Place
5. Upper State Houses by County
6. Upper State Houses by Place
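The calls for these follow the same shape as the place example; only the for/in geography clauses change (the examples page has the exact strings for each summary level). Here is a small helper sketch that assembles such a URL — build_dhc_url is my own hypothetical helper, shown rebuilding a shortened version of the place-within-county query from above:

```r
# Assemble a 2020 DHC API query URL from its get/for/in pieces.
# utils::URLencode() percent-encodes the spaces, as in the hand-built URL.
build_dhc_url <- function(get, for_clause, in_clause) {
  paste0("https://api.census.gov/data/2020/dec/dhc",
         "?get=", get,
         "&for=", utils::URLencode(for_clause),
         "&in=",  utils::URLencode(in_clause))
}

url <- build_dhc_url(
  get        = "NAME,P1_001N",
  for_clause = "place (or part):*",
  in_clause  = "state:06 county:001,013"
)
# url is then ready to hand to jsonlite::fromJSON()
```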

 

I also had to master the process of scraping data from sites like Wikipedia and the Daily Kos to get data that I could merge with Census data. I won’t publish details on this to the CTPP listserv, but if you’re interested, let me know.

 

My inspiration was trying to find the congressional districts and state legislators that are in my region. If you’re an MPO, you probably want to know whose door to knock on. :)

 

Chuck