The Census Bureau released the five-year 2016-2020 American Community Survey tables today. The R package tidycensus worked without a hitch this morning.
In case folks are interested in using tidycensus to examine the three non-overlapping periods (2006-2010, 2011-2015 and 2016-2020), I’ve updated my R script for all states, counties, places and the US. It covers only means of transportation to work, plus a few key demographic variables.
https://gist.github.com/chuckpurvis
Note that the 2016-2020 ACS is based on 2020 Census geography, while the older datasets are based on 2010 Census geography. This isn’t a big deal if your geographic areas haven’t changed from 2010 to 2020, but I’d be careful when comparing older (pre-2020) data to the newest (2016-2020) data, especially for smaller geographies such as tracts, block groups and places.
My next step would be to run count checks on the geographies in the Census 2020 PL 94-171 file versus the 2016-2020 ACS, to make sure that the geographies (block groups, tracts, places, counties) match between the Decennial 2020 and ACS 2016-2020.
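A count check like the one described above could be sketched as follows. This is an illustration in Python rather than R, and the function name, the input lists, and the tract GEOIDs are all made up for the example; in practice the two lists would come from the PL 94-171 file and the 2016-2020 ACS pull for the same summary level.

```python
def compare_geographies(decennial_geoids, acs_geoids):
    """Return counts plus the GEOIDs found in only one of the two sources."""
    decennial = set(decennial_geoids)
    acs = set(acs_geoids)
    return {
        "decennial_count": len(decennial),
        "acs_count": len(acs),
        "only_in_decennial": sorted(decennial - acs),
        "only_in_acs": sorted(acs - decennial),
    }

# Made-up tract GEOIDs, for illustration only.
pl_tracts = ["06001400100", "06001400200", "06001400300"]
acs_tracts = ["06001400100", "06001400200", "06001400400"]

result = compare_geographies(pl_tracts, acs_tracts)
print(result)
```

Equal counts alone are not enough, which is why the sketch also lists the GEOIDs that appear on only one side.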
And to reiterate, there are no standard single-year ACS data for 2020. The only 2020 single-year tables use the “experimental weights” and are published only at the nation and state level.
Happy St Patrick’s Day to all!
Chuck Purvis,
Hayward, California
Dear Colleagues,
I am beyond excited to announce that the Census Data for Transportation Planning Conference scheduled for June 7 - 9, 2022 in Reno, NV has opened for registration!
This conference promises to be heavily technical, interactive and peer to peer, featuring many workshops and sessions on hot topics (equity, workplace geo-coding, tools for data access, geography wrangling, and more), the latest research, and policy implications for census and census derived data.
Of course, we will include a healthy dose of social activity as well!
Early bird pricing is available until April 29.
See the conference agenda and register today at: https://cvent.me/l7X2ma
Please don't hesitate to reach out to me with any questions. I can't wait to see you all.
Penelope Weinberger
She/They
CTPP Program Manager
AASHTO
Ctpp.transportation.org
Hello:
I am still a little confused about why using overlapping ACS datasets is not good practice. Can someone explain it to me?
Thank you.
David Heller, PP/AICP
Program Manager - Systems Performance and Subregional Programs
South Jersey Transportation Planning Organization
782 S Brewster Road, Unit B6
Vineland, New Jersey 08361
(856) 794-1941 | www.sjtpo.org
From: Benjamin Gruswitz <bgruswitz(a)dvrpc.org>
Sent: Tuesday, March 15, 2022 4:59 PM
To: The Census Transportation Products Program Community of Practice/Users discussion and news list <ctpp(a)listserv.transportation.org>
Subject: [CTPP News] Re: CTPP commuter flows in strong MCD states (Vermont test case)
Yes, Chuck, this is a data advantage for strong MCD regions--the workplace allocation is complete at this subcounty level that covers all areas of each county (as opposed to place, which doesn't have county-wide coverage). And our TAZs nest within our municipal boundaries, so fitting their flows to the MCD total is a good way to go for adjustments. The only issue in our region is that Philadelphia is both a county and MCD, so we don't get a subcounty control for our TAZ workplace fitting within our high pop/high employment urban center the way we do for our smallest boroughs and townships (we have one borough with a population of 10 and employment of 25).
Thanks for always pointing us to good resources and encouraging our experienced and burgeoning R users to explore CTPP data with that toolset!
Ben
Working from Home | 301.655.3170
Ben Gruswitz, AICP | Manager, Socioeconomic & Land Use Analytics
(Pronouns: he/him)
Delaware Valley Regional Planning Commission
190 N Independence Mall West, 8th Floor
Philadelphia, PA 19106-1520
215.238.2882 | www.dvrpc.org
On Tue, Mar 15, 2022 at 4:05 PM Charles Purvis <clpurvis(a)att.net> wrote:
Being a west coaster, I rarely dabble in MCDs - Minor Civil Divisions, or NECTAs (New England City and Town Areas). I thought this deserves some exploration.
I created a new version of my CTPPr R script that pulls intra-state Vermont total commuters: county-to-county, tract-to-tract, and MCD-to-MCD. I’ve shared my Vermont code on my GitHub Gist. I screwed up yesterday and left the other scripts in “secret” mode. Oops, sorry. I’ve made the correction.
https://gist.github.com/chuckpurvis
Vermont has 14 counties, 184 census tracts, and 255 MCDs (towns). The 255 MCDs provide “wall-to-wall” coverage of the entire state (i.e., no lingering unincorporated “balance of county” areas). I was surprised that there are fewer census tracts than MCDs in Vermont, but I had some notion that the MCD-to-MCD flow data could be quite valuable (in certain states!).
According to the CTPPr documentation (and probably the official CTPP documentation as well), there are MCD-to-MCD commuter flows for the twelve “strong MCD” states.
From some random US Treasury document:
"Since the government services provided by MCDs differ greatly by state, the Census Bureau refers to twelve states with MCDs that generally provide a wide range of general government services as “strong-MCD” states. In these states, MCDs are generally are treated as municipalities according to state statutes and codes. In eight other states, MCDs typically play less of a governmental role and provide more limited government services, even though they are still active governments (“weak-MCD” states). The twelve strong-MCD states are Connecticut, Maine, Massachusetts, Michigan, Minnesota, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont, and Wisconsin. The eight weak-MCD states are Illinois, Indiana, Kansas, Missouri, Nebraska, North Dakota, Ohio, and South Dakota."
Here are the highlights of this Vermont test case:
Total Workers, Intra-State, Vermont:
County-to-County = 298,422 total workers
MCD-to-MCD = 299,415 total workers
tract-to-tract = 214,970 total workers.
The county-to-county and MCD-to-MCD totals for Vermont should be very, very close, since both have the “standard allocation procedures” applied, which the Census Bureau uses to impute missing workplaces to the county and place level. I suspect the difference between county-to-county and MCD-to-MCD (993 workers, about 0.3 percent) is due to rounding, but I can never tell.
The tract-to-tract file does not have the standard allocation procedures applied: it’s the raw data, rounded of course, and its total is only about 72 percent of the county-to-county total. If I were Vermont, I’d stick with MCD-to-MCD flows as the best bet for controls, and adjust/factor any of the TAZ-to-TAZ flow data to the MCD-to-MCD totals.
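One simple way to factor TAZ-to-TAZ flows to MCD-to-MCD controls might look like the sketch below. It is written in Python rather than R, and the function name, the TAZ and MCD labels, and the small flow values are hypothetical; only the three Vermont totals come from the figures above. It assumes every TAZ nests within exactly one MCD.

```python
from collections import defaultdict

def factor_to_controls(taz_flows, taz_to_mcd, mcd_controls):
    """Scale TAZ-to-TAZ flows so they sum to MCD-to-MCD control totals."""
    # Sum the raw TAZ flows up to MCD pairs.
    raw = defaultdict(float)
    for (home, work), workers in taz_flows.items():
        raw[(taz_to_mcd[home], taz_to_mcd[work])] += workers
    # Apply one factor per MCD pair; pairs without a control stay unchanged.
    adjusted = {}
    for (home, work), workers in taz_flows.items():
        pair = (taz_to_mcd[home], taz_to_mcd[work])
        control = mcd_controls.get(pair)
        if control is not None and raw[pair] > 0:
            adjusted[(home, work)] = workers * control / raw[pair]
        else:
            adjusted[(home, work)] = workers
    return adjusted

# Hypothetical example: TAZs T1 and T2 nest in MCD "A", T3 in MCD "B".
taz_to_mcd = {"T1": "A", "T2": "A", "T3": "B"}
taz_flows = {("T1", "T3"): 30, ("T2", "T3"): 10, ("T3", "T3"): 50}
mcd_controls = {("A", "B"): 60, ("B", "B"): 50}
adjusted = factor_to_controls(taz_flows, taz_to_mcd, mcd_controls)

# Statewide shares from the Vermont totals quoted above.
tract_share = 214970 / 298422          # tract-to-tract vs county-to-county
mcd_difference = 299415 - 298422       # MCD-to-MCD minus county-to-county
```

A pair-by-pair factor like this cannot create flows where the TAZ seed is zero but the MCD control is not, so those cases would need separate handling.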
Happy Ides of March,
Chuck Purvis
Hayward, California
_______________________________________________
CTPP mailing list -- ctpp(a)listserv.transportation.org
To unsubscribe send an email to ctpp-leave(a)listserv.transportation.org
This may be of interest to users of the 2012-2016 CTPP.
The Part 3 journey-to-work “flow” tables DO INCLUDE the Census Bureau’s (primary) allocation of workers to county-of-work and place-of-work (and by extension, POWPUMA-of-work, since all POWPUMAs are single counties or groups of counties). So, the user can readily match the CTPP commuter flows with “standard” American Community Survey 2012-16 tables (retrieved via data.census.gov or the R package “tidycensus”).
On the other hand, the Part 3 CTPP tables DO NOT INCLUDE the “extended” allocation of workers, that is, allocated (imputed) to the tract, TAZ or TAD of workplace. I really can’t point to the documentation where this is made any clearer.
So, I examined the county-to-county total commuters (CTPP Table A302301) for 9 states, plus the Northern California Mega-Region (24 counties). I then compared the county-to-county total commuters to the tract-to-tract summary level, summed to county-to-county level.
The basic finding is that the tract-to-tract total commuter file is missing about 20 percent of the workers. They were *not* allocated (imputed) to the tract-of-work level. (I can think of dozens of strategies to handle this predicament.)
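The comparison described above can be sketched as follows, in Python rather than R. The function names and the flow values are made up for illustration; the one fixed fact it relies on is that the first five characters of an 11-digit tract GEOID are the state and county FIPS codes.

```python
from collections import defaultdict

def tracts_to_counties(tract_flows):
    """Sum tract-to-tract workers to county-to-county via 5-digit FIPS prefixes."""
    county_flows = defaultdict(int)
    for (home_tract, work_tract), workers in tract_flows.items():
        county_flows[(home_tract[:5], work_tract[:5])] += workers
    return dict(county_flows)

def share_missing(county_flows, tract_based_flows):
    """Share of county-to-county workers absent from the tract-based totals."""
    county_total = sum(county_flows.values())
    tract_total = sum(tract_based_flows.values())
    return 1 - tract_total / county_total

# Made-up flows, for illustration only.
tract_flows = {
    ("10001040100", "10003000200"): 40,
    ("10001040200", "10003000300"): 40,
    ("10001040100", "10001040200"): 80,
}
county_flows = {("10001", "10003"): 100, ("10001", "10001"): 100}

summed = tracts_to_counties(tract_flows)
missing = share_missing(county_flows, summed)
print(summed, round(missing, 3))
```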
Here’s a table summarizing these results.
I’ve also uploaded my R scripts for three states (Idaho, Alaska, and Delaware) to my GitHub Gist: https://gist.github.com/chuckpurvis
Can somebody double-check my work? Check it using datafiles downloaded from the Beyond 2020 CTPP and/or datafiles pulled using the R package CTPPr. Sometimes CTPPr doesn’t work for VERY large data pulls, or even for smaller areas (like the state of Alaska, for whatever reason).
Hopefully this is worth discussing.
cheers and Happy Pi Day!
Chuck
Join an Interactive CTPP Training on March 16!
Getting to Know CTPP Data
Wednesday, March 16th, 2022 - 2:00PM to 4:00PM ET
The monthly CTPP training continues with the next session: Getting to Know CTPP Data. This two-hour training session will provide an overview of the custom data tabulations in the CTPP and cover topics such as ACS data collection, important information to know about what's included - and not - in the CTPP, and how CTPP compares with other data sources. You'll learn about how the data is collected, how to interpret the data, significance testing, the margin of error, and more!
Register today: https://aashto.adobeconnect.com/ctppdata_march2022/event/registration.html
Course Format:
* Overview of ACS data collection and what to know about ACS data
* What is included - and not - in CTPP data
* Comparison with other data sources
REQUIREMENTS:
* The Adobe Connect Application (download available here<https://r20.rs6.net/tn.jsp?f=001VnO5dqPjGkUbmkouH7lKD4lbbq0XCGj5qcebMbbIggo…>)
* Upon registration, more background information will be provided for you to review on:
  * ACS Questionnaire
  * How to use Adobe Connect
Registration is limited, so please reserve a seat only if you plan to attend live.
This is part of a recurring monthly series on the third Wednesday of every month. Stay tuned for more information on the next session on April 20.
Visit https://ctpp.transportation.org/upcoming-events/ to stay up-to-date on future trainings and events.
I’m still learning everything about sharing nicely. I’ve uploaded my R script to my GitHub Gist. This might be a better way of “sharing code”?
https://gist.github.com/chuckpurvis/281a786c06593afbf256f184a567a5ce
This script uses the R package CTPPr to pull the California place-level 2012-16 data on households by household size (5) by vehicles available (5) by tenure (5). That’s a pretty complex table with 125 cells per geography. And CTPPr automatically downloads the standard error (SE, not the 90% Margin of Error), so that’s about 250 records for each piece of geography. Ouch.
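For readers who want margins of error instead of standard errors, here is a small sketch of the arithmetic, in Python rather than R. The function names are hypothetical; the 1.645 multiplier is the factor the Census Bureau uses for 90 percent margins of error.

```python
Z_90 = 1.645  # multiplier for a 90 percent margin of error

def se_to_moe90(se):
    """Convert a standard error to a 90 percent margin of error."""
    return Z_90 * se

def moe90_to_se(moe):
    """Convert a 90 percent margin of error back to a standard error."""
    return moe / Z_90

# Table size: 5 household sizes x 5 vehicle categories x 5 tenure categories.
cells = 5 * 5 * 5
records_per_geography = cells * 2  # one estimate and one SE per cell

print(cells, records_per_geography, se_to_moe90(100))
```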
The function “pivot_wider” (from the R package “tidyr”, a tidyverse companion to “dplyr”) is used to rewrite the dataset from this “long” format to more of a “wide” format. It’s pulling the “estimates” (households) separately from the “standard errors”, and then re-combining them.
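The long-to-wide step can be illustrated with the sketch below. It is plain Python, not the R script itself, and the helper, the column names ("geoid", "cell", "measure", "value"), and the numbers are all assumptions made for the example rather than the actual CTPPr output layout.

```python
def pivot_wider(records, id_key, names_key, value_key, prefix=""):
    """Spread long records into one dict of columns per id value."""
    wide = {}
    for rec in records:
        row = wide.setdefault(rec[id_key], {})
        row[prefix + rec[names_key]] = rec[value_key]
    return wide

# Made-up long-format rows: one row per geography, cell, and measure.
long_rows = [
    {"geoid": "P1", "cell": "hh1_veh0", "measure": "est", "value": 120},
    {"geoid": "P1", "cell": "hh1_veh0", "measure": "se", "value": 15},
    {"geoid": "P1", "cell": "hh2_veh1", "measure": "est", "value": 300},
    {"geoid": "P1", "cell": "hh2_veh1", "measure": "se", "value": 22},
    {"geoid": "P2", "cell": "hh1_veh0", "measure": "est", "value": 80},
    {"geoid": "P2", "cell": "hh1_veh0", "measure": "se", "value": 12},
]

# Pivot the estimates and the standard errors separately, then recombine.
estimates = pivot_wider(
    [r for r in long_rows if r["measure"] == "est"], "geoid", "cell", "value", "est_")
errors = pivot_wider(
    [r for r in long_rows if r["measure"] == "se"], "geoid", "cell", "value", "se_")
wide_table = {g: {**estimates[g], **errors.get(g, {})} for g in estimates}

print(wide_table)
```

Each geography ends up as a single row, with one column per cell for the estimate and one for its standard error.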
The result of this first phase is a data set with many fewer rows, and lots of columns with really long variable names. But there’s a solution!
The next phase is to rename variables and re-code variable values into much shorter, mnemonic variable names, and then to do a new set of pivot_wider to create a data set with much easier to read variable names!
I would STRONGLY recommend learning the R package “dplyr” if you’re going to be analyzing census data using either CTPPr or tidycensus.
Take care,
Chuck Purvis
Hayward, California
Help!
I’m trying to fratar (iterative proportional fitting, raking) a 58 by 58 commuter matrix using R packages. I could use some help.
Attached are my initial scripts and input database to create the 58 by 58 “seed” matrix (2012-16 CTPP, Total Workers, Table A302100). I’m just a little mystified as to how to implement the raking/frataring given the different IPF packages available: ipfp, mlfit, mipfp, rakeR, rake...
My goal is to rake the 58 by 58 county-to-county total commuters for California, using the 2012-2016 CTPP as the seed, to match estimates for individual years: 2012, 2013, 2014, 2015, 2016.
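For reference, the basic two-dimensional procedure is short enough to write out by hand. The sketch below is plain Python, not one of the R packages listed above, and the function name and the 2 by 2 numbers are made up. It assumes the row targets are workers by county of residence and the column targets are workers by county of work for a given year, and that the two sets of targets sum to the same total.

```python
def ipf(seed, row_targets, col_targets, max_iter=100, tol=1e-6):
    """Iterative proportional fitting (Fratar/raking) of a 2-D seed matrix."""
    matrix = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Row step: scale each row to its target total.
        for i, target in enumerate(row_targets):
            total = sum(matrix[i])
            if total > 0:
                matrix[i] = [v * target / total for v in matrix[i]]
        # Column step: scale each column to its target total.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in matrix)
            if total > 0:
                for row in matrix:
                    row[j] *= target / total
        # Stop once the row totals still match after the column step.
        worst = max(abs(sum(matrix[i]) - t) for i, t in enumerate(row_targets))
        if worst < tol:
            break
    return matrix

# Made-up 2 by 2 example; a real run would use the 58 by 58 seed matrix.
seed = [[40.0, 10.0], [20.0, 30.0]]
row_targets = [60.0, 60.0]   # assumed: workers by county of residence
col_targets = [70.0, 50.0]   # assumed: workers by county of work
fitted = ipf(seed, row_targets, col_targets)

row_sums = [sum(row) for row in fitted]
col_sums = [sum(row[j] for row in fitted) for j in range(2)]
print(fitted)
```

Cells that are zero in the seed stay zero after fitting, so any county pair with no commuters in the 2012-2016 CTPP will have none in the raked yearly matrices either.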
Chuck