I was looking for a convenient way to download specific tables from the 2012/16 CTPP package. I have found that there are two main methods to consider: using the web-based Beyond 2020 software; and using the r-package CTPPr created by Westat.
The R-Package CTPPr
The CTPPr package is great for downloading simple, one-cell
tables, but gets complicated in multi-cell tables (which is most of the
tables!!!) The issue is that CTPPr
extracts the CTPP data in “list” format with just a few columns: geography, characteristic
#1, estimate and margin of error. With a simple one-dimensional table, it
outputs four columns, So, a simple table such as means of transportation to
work (18) will have eighteen records (rows) of data for each piece of
geography.
A simple two-way table, say Table A104201 (Vehicles Available (6) by Poverty Status (4)), will have 24 records (rows) per piece of geography.
A detailed two-way table, say Table A102214 (Occupation (25) by Industry (15)), will have 375 records (rows) of data per unit of geography. That’s a long database.
Unforturnately the CTPPr package does not export data in “WIDE” format i.e., one record per each geographic area. And I don’t have the R skill sets to convert this type of data frame from a LIST to WIDE format, and creatively updating variable names to be more mnemonic, and memorable!
Beyond 2020 for the CTPP
The web-based software “Beyond 2020” has been developed for the Census Transportation Planning Products (CTPP) as an AASHTO, USDOT, and the CTPP Oversight Board. It’s a State DOT funded, cooperative program with many actors and moving pieces.
The AASHTO page for retrieving CTPP data is here:
http://data5.ctpp.transportation.org/
Or bookmark this page for the introduction page to Beyond 2020 for the CTPP2012/16.
http://data5.ctpp.transportation.org/ctpp1216/Browse/browsetables.aspx
1.
Sign in. You need to be registered (it’s free)
to use the Beyond2020/CTPP software. The “sign in” button is hiding in the
upper right hand quadrant of the web page. Remember your username and password.
2.
Explore the upper middle bar. This “gray” bar
from left to right shows “data set” in a drop down menu, and Selected Geography
in two drop-down menus for RESIDENCE geography and WORKPLACE geography. This is
key. The user will probably want to define several sets of residence and
workplace geographies for their geo-areas of interest.
a.
What I consider key in selecting geographies is
to get the labels right. There’s a pull-down menu in the “select labels” panel
in geo-selection panel that opts for “name” or “key” or “FIPS Code” or “FIPS
and Name” or “Split FIPS Code”. I like the “FIPS and Name” option, since this
returns the detailed FIPS code with a detailed “Name” of the geography, e.g.,
“Census tract 4001.00 in Alameda County, California”. The FIPS code would be
“06001400100” for that particular tract!
b.
Save the geographic selection. Use names that
make sense. Override or delete geographic selection sets until you get
everything right!
3.
Select the table of interest. A simple one-way
table, say, A102103 (workers by class of worker (9)) will have 19 columns: a
column for the geography, and columns for each of the 9 estimates, and 9
margins of error.
4.
A complex two-way table, say, A102214 (workers
by occupation (25) by Industry (15)) will have 750 columns!!
5.
Check to be sure that the geography label is
correct. Again, I prefer having both the FIPS code and the area name showing!
6.
Download the table. As with most software, there
are usually more than one way to do this: the “file” menu in the upper left
hand corner of the web page, and a “green arrow pointing down” in the middle
top row.
a.
I usually opt for “comma-delimited ASCII format
(*.csv), with the “data format” using
“multi-dimensional”. This is most similar to what you see in your current table
view.
b.
Test the “list format”… this will give you a
narrow, but long database!
c.
Test the “XML format”. You can open up the XML
file in Excel, and the formatting is almost the same as you see in your Beyond
2020 session. If your intent is a detailed table that’s almost ready for
publication, this is a good bet.
d.
Test the “ESRI shapfile (*.shp)” file format.
This works great in producing a GIS-ready SHP file, but the variable names are
basically useless, e.g., a simple two-way table, say household size (5) by
vehicles available (6) will return 60 columns of data plus the geography
column. The variables will be named “F0” through “F59”. Maybe there are batch processes that will
rename variable names in DBF file? Maybe
an option is to import the DBF file associated with SHP file into the
R-package, then have a script which renames the variables for the simple
one-way or two-way tables using the analyst’s mnemonic (memory-jogging)
variable names. For me, “PHH1_VHH0_E” is a good mnemonic name for the Estimate
of one-person, zero-vehicle households!
e.
I created an R-script which reads in the DBF
exported from Beyond 2020, for Table A202105 (Means of Transportation to Work
(18)), for the 58 California counties-of-work. It uses the R package “dplyr” to
rename variable F0 through F35 to something more friendly, like “dralone_est”
and “athome_moe”. I add a few new variables like carpool 2+, carpool 3+ and
“transit”, then export the R data frame to both a DBF (readable by the GIS!)
and XLSX Excel format.
f.
Renaming variables in either Excel or this R
package method is quite a chore. But if your intent is to get Excel tables that
are almost “ready to print” then I would suggest exporting data from Beyond
2020 in XML format. If your intent is to have a simply variable name for use in
GIS or other statistical analyses, then either edit the Excel file (add a row
and create your own variable names) or use the R package to rename variables.
Chuck Purvis, Hayward, California.