I was looking for a convenient way to download specific tables from the 2012/16 CTPP
package. I have found that there are two main methods to consider: using the web-based
Beyond 2020 software; and using the r-package CTPPr created by Westat.
The R-Package CTPPr
The CTPPr package is great for downloading simple, one-cell tables, but gets complicated
in multi-cell tables (which is most of the tables!!!) The issue is that CTPPr extracts
the CTPP data in “list” format with just a few columns: geography, characteristic #1,
estimate and margin of error. With a simple one-dimensional table, it outputs four
columns, So, a simple table such as means of transportation to work (18) will have
eighteen records (rows) of data for each piece of geography.
A simple two-way table, say Table A104201 (Vehicles Available (6) by Poverty Status (4)),
will have 24 records (rows) per piece of geography.
A detailed two-way table, say Table A102214 (Occupation (25) by Industry (15)), will have
375 records (rows) of data per unit of geography. That’s a long database.
On the other hand, the r-package TIDYCENSUS can output the decennial census and ACS
databases in either “LIST” format or “WIDE” format. See the TIDYCENSUS documentation of
when you would choose the long “LIST format versus the “WIDE” format.
The “WIDE” format would produce one record (row) of data for each piece of geography. So,
a two-way table of 6 by 4 would yield 49 columns: the geography, the 24 estimates, and the
24 margins of errors.
Unforturnately the CTPPr package does not export data in “WIDE” format i.e., one record
per each geographic area. And I don’t have the R skill sets to convert this type of data
frame from a LIST to WIDE format, and creatively updating variable names to be more
mnemonic, and memorable!
Beyond 2020 for the CTPP
The web-based software “Beyond 2020” has been developed for the Census Transportation
Planning Products (CTPP) as an AASHTO, USDOT, and the CTPP Oversight Board. It’s a State
DOT funded, cooperative program with many actors and moving pieces.
The AASHTO home page for the CTPP is here:
https://ctpp.transportation.org/ <https://ctpp.transportation.org/>
The AASHTO page for retrieving CTPP data is here:
http://data5.ctpp.transportation.org/ <http://data5.ctpp.transportation.org/>
Or bookmark this page for the introduction page to Beyond 2020 for the CTPP2012/16.
http://data5.ctpp.transportation.org/ctpp1216/Browse/browsetables.aspx
<http://data5.ctpp.transportation.org/ctpp1216/Browse/browsetables.aspx>
1. Sign in. You need to be registered (it’s free) to use the Beyond2020/CTPP software.
The “sign in” button is hiding in the upper right hand quadrant of the web page. Remember
your username and password.
2. Explore the upper middle bar. This “gray” bar from left to right shows “data set”
in a drop down menu, and Selected Geography in two drop-down menus for RESIDENCE geography
and WORKPLACE geography. This is key. The user will probably want to define several sets
of residence and workplace geographies for their geo-areas of interest.
a. What I consider key in selecting geographies is to get the labels right. There’s a
pull-down menu in the “select labels” panel in geo-selection panel that opts for “name” or
“key” or “FIPS Code” or “FIPS and Name” or “Split FIPS Code”. I like the “FIPS and Name”
option, since this returns the detailed FIPS code with a detailed “Name” of the geography,
e.g., “Census tract 4001.00 in Alameda County, California”. The FIPS code would be
“06001400100” for that particular tract!
b. Save the geographic selection. Use names that make sense. Override or delete
geographic selection sets until you get everything right!
3. Select the table of interest. A simple one-way table, say, A102103 (workers by
class of worker (9)) will have 19 columns: a column for the geography, and columns for
each of the 9 estimates, and 9 margins of error.
4. A complex two-way table, say, A102214 (workers by occupation (25) by Industry (15))
will have 750 columns!!
5. Check to be sure that the geography label is correct. Again, I prefer having both
the FIPS code and the area name showing!
6. Download the table. As with most software, there are usually more than one way to
do this: the “file” menu in the upper left hand corner of the web page, and a “green arrow
pointing down” in the middle top row.
a. I usually opt for “comma-delimited ASCII format (*.csv), with the “data format”
using “multi-dimensional”. This is most similar to what you see in your current table
view.
b. Test the “list format”… this will give you a narrow, but long database!
c. Test the “XML format”. You can open up the XML file in Excel, and the formatting is
almost the same as you see in your Beyond 2020 session. If your intent is a detailed table
that’s almost ready for publication, this is a good bet.
d. Test the “ESRI shapfile (*.shp)” file format. This works great in producing a
GIS-ready SHP file, but the variable names are basically useless, e.g., a simple two-way
table, say household size (5) by vehicles available (6) will return 60 columns of data
plus the geography column. The variables will be named “F0” through “F59”. Maybe there
are batch processes that will rename variable names in DBF file? Maybe an option is to
import the DBF file associated with SHP file into the R-package, then have a script which
renames the variables for the simple one-way or two-way tables using the analyst’s
mnemonic (memory-jogging) variable names. For me, “PHH1_VHH0_E” is a good mnemonic name
for the Estimate of one-person, zero-vehicle households!
e. I created an R-script which reads in the DBF exported from Beyond 2020, for Table
A202105 (Means of Transportation to Work (18)), for the 58 California counties-of-work. It
uses the R package “dplyr” to rename variable F0 through F35 to something more friendly,
like “dralone_est” and “athome_moe”. I add a few new variables like carpool 2+, carpool 3+
and “transit”, then export the R data frame to both a DBF (readable by the GIS!) and XLSX
Excel format.
f. Renaming variables in either Excel or this R package method is quite a chore. But
if your intent is to get Excel tables that are almost “ready to print” then I would
suggest exporting data from Beyond 2020 in XML format. If your intent is to have a simply
variable name for use in GIS or other statistical analyses, then either edit the Excel
file (add a row and create your own variable names) or use the R package to rename
variables.
Attached is my r-script to rename variables exported from Beyond 2020 in SHP file format.
Hope this is of use. I started work on this earlier this month. I forgot I wanted to share
it.
Chuck Purvis, Hayward, California.