Just a few initial comments on the new CTPP Data Portal (https://ctppdata.transportation.org/#/index)

It works. It’s very well organized for a sequential choice of: dataset (2017-2021); part (1, 2 or 3); Universe (probably okay to skip this); and geographic summary levels.

Then proceed with step 1: choose one (only one) table;
then step 2: choose your geography;
then step 3: retrieve your data.

Well, if you ask for too much data, say, tract-to-tract within your mega-region, the software will say “please reduce the geography selection and try again or use CTPP Data API to download the data.”

If the data request is very small, it will download immediately. 

If it’s a mid-sized data request (say, county-to-county by means of transportation), the system will send off your request in “batch mode” and then send you an e-mail telling you when your “job is done”. Kind of like how it worked in the previous generation of the software. Quite simple and effective.

I am able to download these CSV files fairly routinely from this new AASHTO site. It seems to work perfectly for all small and medium-sized data requests.

I have also developed a series of table-specific R scripts to “clean up” variable names and variable content. For example, the “+/-” characters in the MOE (margin of error) variables are a minor nuisance, but can be eliminated by changing “+/-” to “” in all instances. QED. A lot of this cleanup uses the R packages “dplyr” and “janitor”.
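For anyone working in Python rather than R, the same kind of cleanup can be sketched roughly as follows. This is not a port of the R scripts mentioned above, just a minimal illustration of the idea using only the standard library. The sample rows and column names are made up for the example and are not the portal's actual CSV layout.

```python
import csv
import io
import re


def clean_name(name):
    """Lower-case a header and turn runs of non-alphanumerics into '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_value(value):
    """Remove the '+/-' marker that prefixes margin-of-error cells."""
    return value.replace("+/-", "").strip()


def clean_ctpp_csv(text):
    """Return a list of dicts with cleaned headers and cleaned cell values."""
    reader = csv.reader(io.StringIO(text))
    header = [clean_name(h) for h in next(reader)]
    return [dict(zip(header, (clean_value(v) for v in row))) for row in reader]


# Made-up sample; real downloads will have different columns and values.
sample = (
    "RESIDENCE,Means of Transportation,Estimate,Margin of Error\n"
    "Alameda County,Drove alone,12345,+/-678\n"
)
rows = clean_ctpp_csv(sample)
print(rows[0])
```

The header cleanup here only loosely imitates what a name-cleaning helper does in R (lower case, underscores); the value cleanup is the literal “+/-” to “” substitution described above.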

My scripts are here:

Look for my R scripts starting with ctpp1721….

Now, the API.

I’ve worked with the Census Bureau’s API for certain summary levels (e.g., “geo-within-geo”: congressional districts + counties) and the R package jsonlite to convert JSON into regular R “data frames”. The Bureau has a TON of examples on how to use the API to create these JSON (JavaScript Object Notation) files. I could probably do better with these conversions, but I rely for the most part on the R package “tidycensus”.
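To make the JSON-to-data-frame step concrete: the Census Bureau's API returns a JSON array of rows, with the column names in the first row. A minimal Python sketch of that conversion is below. The response text is hard-coded with made-up estimate values so the example runs without a network call or an API key; only the overall shape is meant to match what the API returns.

```python
import json

# Made-up response in the Census API's shape: first row is the header.
response_text = """
[["NAME", "B08301_001E", "state", "county"],
 ["Alameda County, California", "100", "06", "001"],
 ["Contra Costa County, California", "200", "06", "013"]]
"""


def rows_to_records(text):
    """Turn a header-first JSON array of rows into a list of dicts."""
    header, *body = json.loads(text)
    return [dict(zip(header, row)) for row in body]


records = rows_to_records(response_text)
for record in records:
    print(record["NAME"], record["B08301_001E"])
```

Note that every value arrives as a string, including the estimates and the FIPS codes, so numeric columns need an explicit conversion afterwards while the codes should stay as text to keep their leading zeros.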

The “curl” API call that’s shown in the CTPP Data Portal API example is total Greek to me. I’ve never heard of “curl” and I have no idea how to integrate it into my R scripts. So, we need a TON of examples on how to use the CTPP Data Portal API, including examples of how to use the API in R and Python scripts.
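For context, curl is a command-line tool for sending HTTP requests, so a curl example generally boils down to three things: a URL, some query parameters, and some request headers. The sketch below shows how those three pieces map onto a Python request object. Everything specific in it is a placeholder: the base URL, the parameter names, and the “x-api-key” header name are assumptions for illustration only and are NOT taken from the CTPP Data Portal API documentation, which should be consulted for the real values.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder endpoint, NOT the real CTPP API address.
BASE_URL = "https://example.org/api/data"


def build_request(params, api_key):
    """Build (but do not send) a GET request, like `curl -H ... URL`."""
    url = BASE_URL + "?" + urlencode(params)
    headers = {
        "Accept": "application/json",
        "x-api-key": api_key,  # hypothetical header name
    }
    return Request(url, headers=headers, method="GET")


def fetch_json(request):
    """Send the request and parse the JSON body (needs a real endpoint)."""
    with urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical parameter names and values, for illustration only.
request = build_request(
    {"table": "TABLE_ID", "geography": "county"}, api_key="YOUR_KEY"
)
print(request.get_method(), request.full_url)
```

Nothing is sent over the network here; `fetch_json` is defined but only becomes useful once the placeholder URL, parameters, and header are replaced with the ones the portal's API documentation specifies.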

As an alternative to curl and long, hard-to-interpret API calls, it would be immensely useful to update the R package “ctppr” to incorporate the new 2017-2021 CTPP data.

Immensely useful.

Somehow, the older version of the ctppr R package, by Westat, is no longer working. I’m assuming this is because the API has been totally rebuilt into a newer system.

So, in sum, the CTPP Data Portal works as intended for small and medium-sized data requests. For LARGE data requests (> 30,000 data cells) the software will not complete the request and recommends using the API instead.

If the new Data Portal API were “wrapped up” in an R package, I’d be pleased.

Hope this review is of interest,

Chuck Purvis
Hayward, California