A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics

2010

Sample size: 4910
Publication evidence: high

Author Information

Author(s): Tonny J. Oyana

Primary Institution: Southern Illinois University

Hypothesis

The linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines.

Conclusion

The FES-k-means algorithm produces clusters similar to the original k-means method at a much faster rate and provides efficient analysis of large geospatial data.

Supporting Evidence

The FES-k-means algorithm allows for efficient analysis of large geospatial data.
It produces clusters similar to the original k-means method at a much faster rate.
The study identified a robust pattern of elevated blood lead levels among children that was missed in previous analyses.

Takeaway

This study introduced a clustering method that finds patterns in large health datasets faster than standard k-means while producing similar clusters, which helps in studying how diseases are distributed across space.

Methodology

The study tested the FES-k-means algorithm on two real datasets and one synthetic dataset using a two-step approach of data training prior to clustering.
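
The two-step idea (train first, then cluster) can be illustrated with a small sketch. This is not the author's FES-k-means: the summary does not describe how the training step works, so the sketch assumes that "training" means running standard Lloyd's k-means on a random subsample and that "clustering" means assigning every point to the nearest trained centroid. The function names, the sample_size parameter, and the synthetic two-blob data are all hypothetical.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def lloyd_kmeans(points, k, iterations=50, seed=0):
    """Standard Lloyd's k-means (NOT FES-k-means); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        updated = []
        for i, group in enumerate(groups):
            if group:
                mean_x = sum(x for x, _ in group) / len(group)
                mean_y = sum(y for _, y in group) / len(group)
                updated.append((mean_x, mean_y))
            else:
                # Keep the old centroid if no point was assigned to it.
                updated.append(centroids[i])
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids


def train_then_cluster(points, k, sample_size, seed=0):
    """Assumed two-step scheme: train on a subsample, then label all points."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    centroids = lloyd_kmeans(sample, k, seed=seed)       # step 1: training
    labels = [nearest(p, centroids) for p in points]     # step 2: clustering
    return centroids, labels


# Synthetic point data: two well-separated Gaussian blobs (illustrative only).
rng = random.Random(42)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(500)]
blob_b = [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(500)]
points = blob_a + blob_b

centroids, labels = train_then_cluster(points, k=2, sample_size=100)
```

Training on a subsample keeps the iterative part of the work small, and the full dataset is touched only once in the final assignment pass. Whether this matches the source of the speed-up reported for FES-k-means is not stated in this summary.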

Limitations

The algorithm is limited to handling point data and may not effectively cluster other data types such as lines or polygons.

Participant Demographics

The datasets included georeferenced data on adult asthma in Buffalo, New York, and elevated blood lead levels linked with housing unit ages in Chicago.

Digital Object Identifier (DOI)

10.1155/2010/746021
