After reading up on what the US Census Bureau is doing to the 2020 Census data, I have to ask whether I can really trust the data for other applications.
You see, the US Census Bureau decided to use “differential privacy”, a mathematical technique that introduces statistical noise to “blur” the data. Their reasoning is that, with today’s powerful computers, anyone could cross-reference the census data with other data sets to find information on individuals. You can read about it from the US Census Bureau here.
However, at lower geographic levels, especially the block level, the technique ends up altering the actual counts of people and of traits like ethnicity, age, and sex. For example, according to the US Census Bureau, 48 people live on New York’s Liberty Island, but actually no one has lived there since 2012. It also looks like some US states and organizations will legally challenge the census data and the differential privacy technique. To be fair, the US Census Bureau has always implemented privacy safeguards using different techniques like “swapping”, but this new technique seems pretty drastic and really not needed. As data scientists duke it out, you can read all about this issue here.
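To get a feel for why small areas are hit hardest, here is a minimal Python sketch of adding noise to counts. To be clear, this is not the US Census Bureau's actual algorithm, which is far more elaborate. It is a textbook Laplace-noise illustration, and the `epsilon` value, the `noisy_count` helper, and the example populations are all made up for the demo. Clamping negative results to zero is also my own simplification.

```python
import random

def noisy_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise (scale 1/epsilon), rounded and clamped at zero."""
    # The difference of two independent exponentials with rate epsilon
    # follows a Laplace distribution with mean 0 and scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    # Published counts cannot be negative, so clamp at zero (a simplification).
    return max(0, round(true_count + noise))

rng = random.Random(2020)   # fixed seed so the demo is repeatable
epsilon = 0.5               # hypothetical privacy setting; smaller means more noise

# Hypothetical true populations at three geographic levels.
for label, true_count in [("block", 0), ("block group", 1200), ("tract", 4000)]:
    samples = [noisy_count(true_count, epsilon, rng) for _ in range(5)]
    print(f"{label:12s} true={true_count:5d} noisy={samples}")
```

In this toy setup the noise is about the same size at every level, so a few extra people barely matter in a tract of thousands, but an empty block can easily come out with residents it never had.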
Will you trust the 2020 Census data? You will need to look at it in your own jurisdiction. Burbank’s population strangely went down, so I need to investigate and look at the data more closely. We will need to determine whether we can use this data at the tract, block group, or block level to make more informed and accurate decisions. As they say, your mileage may vary. They also say buyer beware!