Solar energy prediction through machine learning models: A comparative analysis of regressor algorithms
2025

Comparing Machine Learning Models for Solar Energy Prediction

Sample size: 21,045 samples
Evidence: moderate

Author Information

Author(s): Nguyen Huu Nam, Tran Quoc Thanh, Ngo Canh Tung, Nguyen Duc Dam, Tran Van Quan

Primary Institution: Institute for Hydropower and Renewable Energy, Vietnam Academy for Water Resources, Hanoi, Vietnam

Hypothesis

Can machine learning models accurately predict solar energy output using weather-related input variables?

Conclusion

The CatBoost model outperformed other machine learning models in predicting solar energy output, but the overall predictive accuracy was limited due to the absence of specific photovoltaic panel data.

Supporting Evidence

  • The CatBoost model achieved the highest R² value of 0.608 during training.
  • The study identified ambient temperature and humidity as the most influential factors in solar energy predictions.
  • Machine learning models were evaluated using metrics such as R², MAE, and RMSE.
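
As a rough illustration of the three evaluation metrics named above, the sketch below computes the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE) in plain Python. The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the average squared error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


# Made-up values for illustration only; not data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}")
```

R² is unitless, while MAE and RMSE are in the units of the target variable. RMSE is never smaller than MAE because squaring weights large errors more heavily.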

Takeaway

This study used different computer programs to guess how much solar energy can be made based on weather conditions, and found that one program was the best at it.

Methodology

Five machine learning models (CatBoost, XGBoost, LightGBM, Gradient Boosting, and k-nearest neighbors (KNN)) were trained on a dataset of 21,045 samples, using weather-related input variables to predict solar energy output.
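
The comparison described above can be sketched as a loop that fits several regressors on the same split and scores each one. This is a minimal sketch, not the authors' pipeline. It assumes scikit-learn and NumPy are available, uses synthetic data in place of the study's dataset, and includes only the two models that scikit-learn provides (Gradient Boosting and KNN). The feature names, the 70/30 split, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for weather inputs (assumption: not the study's data).
rng = np.random.default_rng(0)
n_samples = 500
temperature = rng.uniform(5, 40, n_samples)   # hypothetical ambient temperature
humidity = rng.uniform(20, 100, n_samples)    # hypothetical relative humidity
X = np.column_stack([temperature, humidity])
# Arbitrary made-up relationship plus noise, used only to have a target.
y = 2.0 * temperature - 0.5 * humidity + rng.normal(0, 5, n_samples)

# Assumed 70/30 split; the summary does not state the split used.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Two of the five model families; default-style settings are assumptions.
models = {
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, pred))),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

CatBoost, XGBoost, and LightGBM each ship a scikit-learn-style regressor with `fit` and `predict`, so they could be added to the `models` dictionary in the same way if those packages are installed.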

Potential Biases

The dataset covers a limited geographical context and lacks diverse environmental conditions, either of which may bias the results.

Limitations

The study's predictive accuracy was limited by the lack of photovoltaic panel-specific technical data and the dataset's geographical and technological constraints.

Digital Object Identifier (DOI)

10.1371/journal.pone.0315955
