Solar energy is a key renewable energy source; however, its intermittent nature and its potential for use in distributed systems make power prediction an important aspect of grid integration. This research analyzed a variety of machine learning techniques to predict power output for horizontal solar panels using 14 months of data collected from 12 northern-hemisphere locations. We performed our data collection and analysis without irradiation data, an approach not commonly found in prior literature. Using latitude, month, hour, ambient temperature, pressure, humidity, wind speed, and cloud ceiling as independent variables, a distributed random forest regression algorithm modeled the combined dataset with an R² value of 0.94. For comparison, other machine learning algorithms yielded R² values of 0.50–0.94. Additionally, the data from each location were modeled separately, with R² values ranging from 0.91 to 0.97, indicating broadly consistent performance across all sites. Using an input variable permutation approach with the random forest algorithm, we found that the three most important variables for power prediction were ambient temperature, humidity, and cloud ceiling. The analysis showed that machine learning can potentially provide accurate power prediction while avoiding the challenges associated with modeled irradiation data.
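The input variable permutation approach mentioned above can be illustrated with a minimal, model-agnostic sketch. It is not the authors' implementation: the function names (`r2_score`, `permutation_importance`), the synthetic data, and the toy linear `predict` function are all assumptions made for illustration, and the toy model stands in for a fitted random forest. The idea is to shuffle one input column at a time and record how much the R² score drops; a larger drop suggests the model relies more on that variable.

```python
import random
from typing import Callable, List, Sequence

Row = List[float]


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(
    predict: Callable[[Row], float],
    X: List[Row],
    y: Sequence[float],
    n_repeats: int = 5,
    seed: int = 0,
) -> List[float]:
    """Mean drop in R² when each input column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = r2_score(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic, hypothetical data: three inputs, the third is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(200)]


def predict(row: Row) -> float:
    """Toy stand-in for a fitted regressor; ignores the third input."""
    return 2.0 * row[0] - 1.0 * row[1]


# Target follows the toy model plus a small amount of noise.
y = [predict(row) + data_rng.gauss(0.0, 0.05) for row in X]

importances = permutation_importance(predict, X, y)
print(importances)
```

In this toy setup the first input should rank highest and the ignored third input should show no importance. With a real fitted model, `predict` would wrap that model's prediction call, and the scores would ideally be computed on held-out data.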


This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Sourced from the published version of record cited below.



Source Publication