An effective and efficient approach for manually improving geocoded data
2008

Improving Geocoded Data Quality in Health Research

Sample size: 22,317 records. Evidence: moderate.

Author Information

Author(s): Daniel W. Goldberg, John P. Wilson, Craig A. Knoblock, Beate Ritz, Myles G. Cockburn

Primary Institution: University of Southern California

Hypothesis

What is the most cost-effective method for improving geocoded data quality in health-related datasets?

Conclusion

Manual geocode correction is a feasible and cost-effective method for improving the quality of geocoded data.

Supporting Evidence

  • Geocode correction improved the overall match rate from 79.3% to 95%.
  • 12,280 records (55%) were successfully improved through manual correction.
  • The average processing time per record was 69 seconds.
  • Spatial shifts averaged 9.9 km between original and corrected geocodes.
  • The number of geocodes with building-centroid accuracy increased from 0 to 2,261.
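
The spatial-shift figure above is a distance between each record's original and corrected coordinates, averaged over records. The summary does not say how the authors measured that distance, so the following is only a minimal sketch that assumes a great-circle (haversine) distance on a spherical Earth. The function names `haversine_km` and `mean_shift_km` and the sample coordinates are hypothetical illustrations, not data or code from the study.

```python
import math

# Mean Earth radius in kilometres (spherical approximation; an assumption of this sketch).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_shift_km(pairs):
    """Average shift over (original, corrected) coordinate pairs.

    Each pair is ((lat, lon), (lat, lon)); raises ValueError on empty input.
    """
    if not pairs:
        raise ValueError("no geocode pairs supplied")
    total = sum(haversine_km(*orig, *corr) for orig, corr in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only (not from the study).
    sample = [
        ((34.0224, -118.2851), (34.0522, -118.2437)),
        ((34.1000, -118.3000), (34.1000, -118.3000)),  # unchanged geocode, zero shift
    ]
    print(f"mean shift: {mean_shift_km(sample):.2f} km")
```

A planar distance in a projected coordinate system would serve equally well for shifts of a few kilometres; the haversine form is used here only because it works directly on latitude and longitude.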

Takeaway

This study shows that correcting geocoded location data by hand substantially improves its quality and is worth the time it takes.

Methodology

The study involved manually correcting geocodes in five health-related datasets using a web-based interactive approach.

Potential Biases

The results may be biased by reliance on the accuracy of the Google Maps API and of the initial geocoding process.

Limitations

The study lacks ground truth data for the addresses and relies on the accuracy of the Google Maps API.

Participant Demographics

Participants included four full-time staff and three volunteer graduate students.

Digital Object Identifier (DOI)

10.1186/1476-072X-7-60
