Can You Trust The 2020 Census Data?

After reading up on what the US Census Bureau is doing to the 2020 Census data, I have to ask whether I can really trust the data for other applications.

You see, the US Census Bureau decided to use “differential privacy”, a mathematical technique that introduces statistical noise to “blur” the data. Its reasoning is that, with today’s powerful computers, anyone could cross-reference the census data against other data sets to find information on individuals. You can read about it from the US Census Bureau here.
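
To get a feel for what “adding noise” means, here is a minimal Python sketch of one textbook form of differential privacy, the Laplace mechanism. It is not the Census Bureau’s actual system, which is far more elaborate. The block names, the counts, the epsilon value, and the function names are all made up for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws, each with
    mean `scale`, follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_counts(true_counts, epsilon, seed=None):
    """Return a noisy copy of a {block_id: population} dict.

    Assumption: each person affects a count by at most 1, so the
    noise scale is 1 / epsilon. A smaller epsilon means more noise.
    """
    rng = random.Random(seed)
    scale = 1.0 / epsilon
    noisy = {}
    for block_id, count in true_counts.items():
        value = round(count + laplace_noise(scale, rng))
        noisy[block_id] = max(0, value)  # clamp: no negative populations
    return noisy

# Hypothetical blocks with made-up populations.
blocks = {"block_a": 0, "block_b": 12, "block_c": 350}
print(noisy_counts(blocks, epsilon=0.5, seed=1))
```

In this toy version the same amount of noise lands on every block, so a block of 12 people is distorted proportionally far more than a block of 350, and an empty block can come out with a few residents.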

However, at lower geographic levels, especially the block level, this noise starts to change the actual numbers of people and traits like ethnicity, age, and sex. For example, according to the US Census Bureau, 48 people live on New York’s Liberty Island, but actually no one has lived there since 2012. It also looks like some US states and organizations will legally challenge the census data and the differential privacy technique. To be fair, the US Census Bureau has always implemented privacy safeguards using different techniques like “swapping”, but this new technique seems pretty drastic and not really needed. As data scientists duke it out, you can read all about this issue here.

Will you trust the 2020 Census data? You will need to look at it in your own jurisdiction. Burbank’s population strangely went down, so I need to investigate and look at the data more closely. We will need to determine if we can use this data at a tract, block group, or block level to make more informed and accurate decisions. As they say, your mileage may vary. They also say buyer beware!

2020 Census Redistricting Data

It seems like we have been waiting forever for some 2020 Census data. Well, you can get started with the P.L. 94-171 2020 Census Redistricting Data, which has been available since August 12, 2021. You can find the FAQ page for the product here. Keep in mind the redistricting data includes only the following demographic characteristics, by state, county, and city, down to the block level:

  • Race and ethnicity.
  • Population 18 years and over.
  • Occupied and vacant housing units.
  • People living in group quarters like nursing homes, prisons, military barracks and college dorms.

Note that you will need to merge the data files with the tract/block group/block TIGER/Line Geodatabases, a feat that has become a little more difficult since you need to manipulate the data a bit to do it. You can read the technical document to find out more.
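
As a rough idea of what that manipulation can look like, here is a small Python sketch of a join on geographic IDs. It assumes the census rows carry a long ID with a prefix ending in "US", while the TIGER/Line layer carries only the short numeric code. The field names (GEOID, GEOID20, total_pop), the sample IDs, and the function names are assumptions for illustration, so check the technical document for the real layout.

```python
def short_geoid(long_geoid):
    """Return the part of a long GEOID after the 'US' marker.

    Assumed format: '7500000US060371234001000' -> '060371234001000'.
    IDs without the marker are returned unchanged.
    """
    _, sep, code = long_geoid.partition("US")
    return code if sep else long_geoid

def join_counts_to_shapes(census_rows, shape_rows, shape_key="GEOID20"):
    """Attach census attributes to shape records by matching IDs.

    Returns (joined_rows, unmatched_shape_ids) so you can see which
    geographies found no census record.
    """
    counts_by_code = {short_geoid(r["GEOID"]): r for r in census_rows}
    joined, unmatched = [], []
    for shape in shape_rows:
        match = counts_by_code.get(shape[shape_key])
        if match is None:
            unmatched.append(shape[shape_key])
        else:
            attrs = {k: v for k, v in match.items() if k != "GEOID"}
            joined.append({**shape, **attrs})
    return joined, unmatched

# Made-up sample records, not real census values.
census_rows = [{"GEOID": "7500000US060371234001000", "total_pop": 42}]
shape_rows = [{"GEOID20": "060371234001000"}, {"GEOID20": "060371234001001"}]

joined, unmatched = join_counts_to_shapes(census_rows, shape_rows)
print(joined)
print(unmatched)
```

Keeping the unmatched list makes it easy to spot IDs that failed to join, for example because leading zeros were lost when a file was opened in a spreadsheet.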

There is another option and probably an easier one.

