Time series clustering and physical implication for photovoltaic array systems with unknown working conditions
Abstract: Distinguishing the exact working conditions of a real-world solar power station operated under unclear conditions and environmental variations remains an open problem. Abnormal working conditions are rarely known beforehand. We hypothesize that the currents and voltages of a solar power system yield statistical clusters that have physical implications. Without any prior knowledge of the operating conditions, the time series of currents and voltages are divided into many subsequences, the similarities among subsequences are measured with dynamic time warping, and the similarities are then clustered with the k-means method. For comparison, clusters are also produced by the k-means method alone. Both approaches are evaluated with two external measures, which suggest that four is the best number of clusters. To acquire real-world data, a small test facility, a scaled-down version of a solar power station, is built and operated under four working conditions: normal operation, partial shading, overall shading, and open-circuit fault. The experimental results show that the four clusters of the unlabeled raw data correspond to the actual unknown working conditions, with coincidence rates between 60% and 100%. It is also observed that electrical faults are easily distinguished from other conditions involving irradiation and temperature variations. The discovery of the physical implications of time series clusters is important for developing advanced forecasting and fault diagnosis technologies for photovoltaic power systems.
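The subsequence, dynamic time warping, and k-means steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the non-overlapping windows, the absolute-difference local cost, the use of each row of the pairwise DTW distance matrix as a k-means feature vector, and the deterministic farthest-first initialization are all assumptions made for illustration. The sample current values are synthetic.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def split_subsequences(series, length):
    """Cut a series into non-overlapping windows (assumed; remainder is dropped)."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, length)]


def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's k-means with farthest-first initialization (an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_series(series, window, k):
    """Split a series, build DTW-distance features, and cluster them."""
    subseqs = split_subsequences(series, window)
    # Row i holds the DTW distances from subsequence i to every subsequence.
    features = [[dtw_distance(s, t) for t in subseqs] for s in subseqs]
    return kmeans(features, k)


# Synthetic current readings: three windows near 5 A, then three near 0 A.
current = [5.0, 5.1, 4.9, 5.0, 5.1, 5.0, 5.0, 4.9, 4.9, 5.0, 5.1, 5.0,
           0.0, 0.1, 0.0, 0.1, 0.1, 0.0, 0.0, 0.1, 0.0, 0.0, 0.1, 0.0]
print(cluster_series(current, window=4, k=2))
```

On this synthetic series the two current levels fall into separate clusters. With measured data, the window length and the number of clusters would have to be chosen and validated, for example with the external measures the abstract mentions.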