Comparing Algorithms for Filling in Missing Weather Data
Author Information
Author(s): de Souza Valter Cesar, Rodrigues Sergio Augusto, Filho Luís Roberto Almeida Gabriel
Primary Institution: São Paulo State University (Unesp)
Hypothesis
This study evaluates how well alternative algorithms for principal component analysis (PCA), namely NIPALS-PCA and EM-PCA, impute missing data in meteorological time series.
Conclusion
The NIPALS-PCA and EM-PCA methods are effective for imputing missing reference evapotranspiration data, especially in scenarios with lower percentages of missing data.
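As a rough illustration of how a NIPALS-style PCA can work with incomplete data, the sketch below fits each component by alternating regressions over the observed entries only, then fills the gaps from the low-rank reconstruction. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `nipals_pca_impute`, the number of components, the synthetic data, and the 10% masking rate are all chosen here for illustration.

```python
import numpy as np

def nipals_pca_impute(X_missing, n_components=2, n_iter=500, tol=1e-10):
    """Fit PCA by NIPALS on observed entries only, then fill gaps from the reconstruction."""
    observed = ~np.isnan(X_missing)
    W = observed.astype(float)                    # 1.0 where a value is observed
    mu = np.nanmean(X_missing, axis=0)            # column means from observed values
    R = np.where(observed, X_missing - mu, 0.0)   # residuals; gaps contribute zero
    X_hat = np.tile(mu, (X_missing.shape[0], 1))
    for _ in range(n_components):
        # Start from the column with the largest observed sum of squares.
        t = R[:, np.argmax((R ** 2).sum(axis=0))].copy()
        for _iteration in range(n_iter):
            # Loadings: regress each column on t over the rows where it is observed.
            den_p = W.T @ (t ** 2)
            p = (R.T @ t) / np.where(den_p > 0, den_p, 1.0)
            p /= np.linalg.norm(p)
            # Scores: regress each row on p over the columns where it is observed.
            den_t = W @ (p ** 2)
            t_new = (R @ p) / np.where(den_t > 0, den_t, 1.0)
            converged = np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new)
            t = t_new
            if converged:
                break
        component = np.outer(t, p)
        X_hat += component
        R = np.where(observed, R - component, 0.0)  # deflate observed residuals
    # Keep observed values; use the reconstruction only where data were missing.
    return np.where(observed, X_missing, X_hat)

# Synthetic low-rank data standing in for a station-by-variable matrix (assumption).
rng = np.random.default_rng(0)
X = 5.0 + rng.normal(size=(200, 2)) @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(200, 6))
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.10] = np.nan   # roughly 10% missing at random
gaps = np.isnan(X_missing)

X_imputed = nipals_pca_impute(X_missing)
rmse = np.sqrt(np.mean((X_imputed[gaps] - X[gaps]) ** 2))
print(f"NIPALS-PCA RMSE on masked entries: {rmse:.3f}")
```

A row with no observed values keeps a zero score in this sketch, so it falls back to the column means.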
Supporting Evidence
- NIPALS-PCA had the lowest mean absolute percentage error (MAPE), 15.4%, in the 10% missing data scenario.
- EM-PCA performed best in the 50% missing data scenario, with a MAPE of 19.1%.
- Both NIPALS-PCA and EM-PCA gave good imputation results, with a normalized root mean square error (nRMSE) between 10% and 20%.
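The two error measures cited above can be computed as in the following sketch. The summary does not say how the nRMSE was normalized; dividing the RMSE by the mean of the observed values is an assumption made here, and the sample values are hypothetical.

```python
import numpy as np

def mape(observed, imputed):
    """Mean absolute percentage error, in percent."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    return 100.0 * np.mean(np.abs((observed - imputed) / observed))

def nrmse(observed, imputed):
    """RMSE divided by the mean of the observed values, in percent (assumed normalization)."""
    observed = np.asarray(observed, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    rmse = np.sqrt(np.mean((observed - imputed) ** 2))
    return 100.0 * rmse / np.mean(observed)

# Hypothetical reference evapotranspiration values (mm/day), for illustration only.
eto_true = np.array([4.0, 5.0, 2.0, 4.0])
eto_imputed = np.array([3.0, 5.5, 2.5, 4.0])

print(round(mape(eto_true, eto_imputed), 2))   # 15.0
print(round(nrmse(eto_true, eto_imputed), 2))  # 16.33
```

MAPE is undefined when an observed value is zero, so it suits strictly positive series.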
Takeaway
This study compared methods for filling in missing weather data and found that the best method depends on how much data is missing: NIPALS-PCA had the lowest error when 10% of the data was missing, and EM-PCA when 50% was missing.
Methodology
The study used simulation to create scenarios of missing data and compared the performance of NIPALS-PCA, EM-PCA, and simple mean imputation across different percentages of missing data.
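A simulation of this kind can be sketched as follows: hide a known fraction of values completely at random, impute them, and score the result against the hidden truth. This is only an illustration under stated assumptions. The data are synthetic, the helper names (`mask_at_random`, `mean_impute`, `em_pca_impute`) are hypothetical, and the EM-PCA step uses a common iterative-SVD formulation that may differ from the one in the study.

```python
import numpy as np

def mask_at_random(X, fraction, rng):
    """Return a copy of X with `fraction` of its entries set to NaN, chosen at random."""
    X_missing = X.copy()
    n_missing = int(round(fraction * X.size))
    idx = rng.choice(X.size, size=n_missing, replace=False)
    X_missing.flat[idx] = np.nan
    return X_missing

def mean_impute(X_missing):
    """Replace each missing value with the mean of its column's observed values."""
    col_means = np.nanmean(X_missing, axis=0)
    return np.where(np.isnan(X_missing), col_means, X_missing)

def em_pca_impute(X_missing, n_components=2, n_iter=200, tol=1e-8):
    """Iterative-SVD form of EM-PCA: refit a low-rank model, refill the gaps, repeat."""
    missing = np.isnan(X_missing)
    X_filled = mean_impute(X_missing)             # starting guess
    for _ in range(n_iter):
        mu = X_filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(X_filled - mu, full_matrices=False)
        X_hat = mu + (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
        change = np.max(np.abs(X_hat[missing] - X_filled[missing])) if missing.any() else 0.0
        X_filled[missing] = X_hat[missing]        # observed values are never altered
        if change < tol:
            break
    return X_filled

# Synthetic rank-2 data plus noise, standing in for a complete weather dataset (assumption).
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
X = 5.0 + scores @ loadings + 0.05 * rng.normal(size=(200, 6))

for fraction in (0.1, 0.3, 0.5):
    X_missing = mask_at_random(X, fraction, rng)
    missing = np.isnan(X_missing)
    for name, impute in (("mean", mean_impute), ("EM-PCA", em_pca_impute)):
        rmse = np.sqrt(np.mean((impute(X_missing)[missing] - X[missing]) ** 2))
        print(f"{fraction:.0%} missing, {name}: RMSE = {rmse:.3f}")
```

Because the masked entries are known, any error measure can be computed on them directly, which is what makes a simulated comparison across missing-data percentages possible.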
Limitations
The results may not generalize to cases where values are missing non-randomly, and the method used for the initial data completion may have influenced the outcomes.
Participant Demographics
Data were collected from 45 automatic weather stations in the São Paulo region, Brazil.