03 — Core Domain
GIScience
Why It's So Hard to Match Residence Addresses to Census Blocks — And How to Fix It
3M+: Address-level midlife death records analyzed
6 States: US states covered in the study
2 Types: Measurable problem areas (ghost blocks and multistructure settings)
1.5–5%: Midlife death records joined to unpopulated blocks per five-year period
Summary
Do you ever join point data to census polygons or other spatial units containing contextual information?
Myron Gutmann, Stefan Leyk, Hoeyun Kwon, and I examined how often this join goes wrong, using a dataset of more than 3 million address-level midlife death records from six US states joined to census blocks. We highlighted some persistent issues with these joins and suggested strategies for mitigation.
While linking addresses to census boundaries may seem straightforward, even a slight positional mismatch can place an address in the wrong census block. We defined this challenge as topological overlay uncertainty and identified two types of measurable problems:
Ghost Blocks: In each five-year period, between 1.5% and 5% of midlife death records were joined to "ghost blocks," which are unpopulated areas such as highways and railroad tracks.
Multistructure Settings: Locations like mobile home parks and large apartment complexes are particularly prone to these mismatches due to how postal addresses are assigned.
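To make the ghost-block problem concrete, here is a minimal sketch of a point-in-polygon join that flags records landing in unpopulated blocks. It is not the procedure from the paper: the block shapes, populations, address coordinates, and function names are all hypothetical toy values, and a real workflow would use a GIS library and actual census block geometries.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

def point_in_polygon(pt: Point, poly: List[Point]) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from pt."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical toy blocks: two residential blocks separated by a highway strip.
blocks: Dict[str, dict] = {
    "A": {"pop": 120, "poly": [(0, 0), (10, 0), (10, 10), (0, 10)]},
    "H": {"pop": 0, "poly": [(10, 0), (12, 0), (12, 10), (10, 10)]},  # highway
    "B": {"pop": 80, "poly": [(12, 0), (22, 0), (22, 10), (12, 10)]},
}

# Hypothetical geocoded addresses; a2 sits slightly off its true location.
addresses: Dict[str, Point] = {
    "a1": (5.0, 5.0),
    "a2": (10.4, 3.0),
    "a3": (15.0, 7.0),
}

def join_to_block(pt: Point) -> Optional[str]:
    """Return the id of the first block containing pt, or None if no block does."""
    for block_id, info in blocks.items():
        if point_in_polygon(pt, info["poly"]):
            return block_id
    return None

joined = {addr_id: join_to_block(pt) for addr_id, pt in addresses.items()}

# A record is flagged when its block has zero residents (a "ghost block").
ghost_records = [a for a, b in joined.items() if b is not None and blocks[b]["pop"] == 0]
```

In this toy setup, address a2 is only 0.4 units past the edge of block A, yet the join assigns it to the unpopulated highway strip. That is the kind of small mismatch the summary describes.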
We proposed several corrective measures, including spatial reallocation and aggregation, to reduce these errors and improve the accuracy of fine-grained spatial analysis.
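As a rough illustration of what spatial reallocation and aggregation can look like, the sketch below moves records out of zero-population blocks into the nearest populated block and then rolls blocks up to block groups. This is one possible reading, not necessarily the rule used in the paper: the nearest-centroid criterion is an assumption, and the GEOIDs, centroids, populations, and function names are invented for the example. The only census convention relied on is that a 15-character block GEOID begins with its 12-character block group GEOID.

```python
import math
from collections import Counter
from typing import Dict

# Hypothetical blocks keyed by made-up 15-character GEOIDs.
blocks: Dict[str, dict] = {
    "080130121011000": {"pop": 120, "centroid": (5.0, 5.0)},
    "080130121011001": {"pop": 0, "centroid": (11.0, 5.0)},  # ghost block
    "080130121012000": {"pop": 80, "centroid": (17.0, 5.0)},
}

# Hypothetical records that were already joined to a block.
records = [
    {"id": "r1", "xy": (5.0, 5.0), "block": "080130121011000"},
    {"id": "r2", "xy": (10.4, 3.0), "block": "080130121011001"},
    {"id": "r3", "xy": (15.0, 7.0), "block": "080130121012000"},
]

def reallocate(record: dict) -> str:
    """Keep populated joins; otherwise pick the nearest populated block.

    Assumption: "nearest" means smallest distance from the address point
    to a block centroid. Other criteria are equally plausible.
    """
    if blocks[record["block"]]["pop"] > 0:
        return record["block"]
    populated = [b for b, info in blocks.items() if info["pop"] > 0]
    return min(populated, key=lambda b: math.dist(record["xy"], blocks[b]["centroid"]))

reallocated = {r["id"]: reallocate(r) for r in records}

def block_group(block_geoid: str) -> str:
    """Aggregate a block to its block group (first 12 characters of the GEOID)."""
    return block_geoid[:12]

# Record counts per block group after reallocation.
counts_by_block_group = Counter(block_group(b) for b in reallocated.values())
```

Here the two strategies agree: the ghost block and its nearest populated neighbor fall in the same block group, so aggregating to block groups would have absorbed the error even without reallocation. The trade-off is that aggregation gives up some spatial detail.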
Citation
Why it's so hard to match residence addresses to census blocks—and how to fix it
Berg, A., Gutmann, M., Leyk, S., & Kwon, H. — DOI: 10.1111/tgis.70225