Addressing Solar’s Growing Pains with Data Standards & Analytics
Originally posted on pv magazine USA. Written by Data Engineer Paul Young.
It is hard to fathom just how much solar has been installed in the last few years: over 1.3 million systems have added 36 GW of capacity to the grid. As the industry catches up to this rapid growth, it is important to take stock of some of the challenges that lie ahead. While solar has been around for over forty years, 90% of solar systems are less than five years old. Only a handful of industry players have experience managing, monitoring, and maintaining large distributed fleets, and fewer still have done so with fleets of any significant vintage.
At kWh Analytics, we have aggregated the solar asset performance data of nearly 20% of the U.S. market, allowing us to study the challenges that these organizations face as they grow and mature.
Challenge No. 1: Scaling data management practices.
Processes that worked while managing hundreds of systems start to break down when managing thousands of systems. Building out databases and servers, handling multiple monitoring systems, and setting up robust quality control processes and pipelines require good planning and timely execution. Fast growth can wreak havoc on these processes, as teams will often find themselves continuously struggling to scale up their data management tools in order to keep up with the company’s growth. In the software engineering world, we refer to this problem as ‘technical debt.’
While there are no precise definitions of technical debt, here is the general concept: A debt is created when engineering teams sacrifice quality for speed. This concept originated in the software engineering world, but it is very applicable to data management and warehousing projects. While some technical debt is normal, if teams let it build up too much, it can create problems. Quality control issues, software bugs, and long execution times are all symptoms of excessive technical debt. Good software teams will plan regular cycles of code refactoring and maintenance in order to ‘pay’ down their debt, but this is difficult in a fast-growing environment. As new solar systems come online and more data sources get added to the pipeline, data teams find themselves overwhelmed. Carving out the time to step back and properly build out the necessary infrastructure becomes increasingly difficult, if not impossible.
Challenge No. 2: Finding cost-effective ways to diagnose O&M issues across large fleets.
This challenge will become more pronounced over the next few years. The vast majority of installed capacity is relatively new, so there is limited data available on how well current modeling techniques predict the performance of aging fleets. Revenue models usually account for module degradation, defaults, and O&M costs. But residential O&M presents unique challenges due to the distributed nature of the systems. Smaller O&M issues (I like to call them micro O&M), like moderate soiling and gradual increases in shading, can be hard to detect because the signal-to-noise ratio is often too low. And even if we could reliably detect these micro O&M issues, sending maintenance crews out to fix these problems would be cost-prohibitive. But spread across a large fleet, these micro O&M issues could add up to real revenue loss.
Data analytics can be a valuable tool to assist fleet managers in distributed O&M efforts by allowing them to remotely diagnose problems and prioritize their O&M efforts in order to maximize ROI. Analytics allow us to accurately identify underperforming PV systems and to diagnose the specific problems associated with those systems.
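To make the idea of remote diagnosis concrete, here is a minimal sketch of one simple screening approach: compare each system's measured energy to its expected energy, then flag systems whose ratio falls well below the fleet median. The field names (system_id, expected_kwh, actual_kwh), the 10% tolerance, and the ranking by estimated shortfall are all illustrative assumptions, not a description of any particular product or of how kWh Analytics does it.

```python
from statistics import median


def flag_underperformers(systems, tolerance=0.10):
    """Flag systems whose actual/expected energy ratio is well below the fleet median.

    `systems` is a list of dicts with hypothetical keys:
    system_id, expected_kwh, actual_kwh (same period for every system).
    """
    # Drop records with missing or non-positive expectations; those are
    # data quality problems, not O&M problems.
    usable = [
        s for s in systems
        if s.get("expected_kwh")
        and s.get("actual_kwh") is not None
        and s["expected_kwh"] > 0
    ]
    if not usable:
        return []

    ratios = {s["system_id"]: s["actual_kwh"] / s["expected_kwh"] for s in usable}

    # Using the fleet median as the baseline cancels out effects shared by
    # the whole fleet (for example, an unusually cloudy month).
    fleet_median = median(ratios.values())
    cutoff = fleet_median * (1 - tolerance)

    flagged = []
    for s in usable:
        ratio = ratios[s["system_id"]]
        if ratio < cutoff:
            # Estimated energy lost relative to a typical system in the fleet.
            shortfall = s["expected_kwh"] * fleet_median - s["actual_kwh"]
            flagged.append(
                {"system_id": s["system_id"], "ratio": ratio, "shortfall_kwh": shortfall}
            )

    # Largest estimated shortfall first, as a rough proxy for O&M priority.
    flagged.sort(key=lambda f: f["shortfall_kwh"], reverse=True)
    return flagged
```

A screen like this only narrows the list of systems worth a closer look; telling soiling apart from shading or equipment faults would take additional signals that this sketch does not attempt to model.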
However, in order to implement these tools, you need good data. And one of the lessons that we have learned at kWh Analytics is that data availability and data quality vary greatly across the industry. Data rarely lives in a single silo. Equipment manufacturers, installers, monitoring companies, developers, and financial institutions are all generating, sharing, and consuming data. This has resulted in a byzantine landscape of nomenclature, data quality, and data formats.
Where do we go from here?
As solar scales, data standards and analytics become more important than ever before. Making smart investments in data infrastructure can pay dividends on many fronts:
On an organizational level, investing in data enables teams to better prepare for growth and allows for smoother scaling of operations. And as the organization grows, data becomes increasingly useful for business intelligence and analytics, driving further efficiencies.
On an industry level, creating unified data standards can help the solar industry reduce market inefficiencies and lower costs for consumers. To move our industry forward, Orange Button, a program of the U.S. Department of Energy SunShot Initiative, has organized the creation and adoption of industry-led open data standards. As a part of Orange Button, kWh Analytics is creating a new data translation tool that will ease the solar industry’s transition to a unified solar dataset by translating original data formats to consistent data standards. This technology is currently being developed with lessons learned from the development of our HelioStats platform, which is capable of synthesizing thousands of data points on project payment and performance. Leading solar companies now implementing Orange Button include Wells Fargo, Sunnova, and Sunrun.
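As a rough illustration of what translating source formats to a consistent standard involves, the sketch below maps differently named fields from several hypothetical data providers onto one set of canonical names. The canonical names and aliases here are invented for the example; they are not taken from the Orange Button taxonomy or from any kWh Analytics tool.

```python
# Hypothetical canonical field names, each with the aliases different
# data providers might use for the same quantity.
FIELD_ALIASES = {
    "system_id": {"system_id", "site_id", "systemid"},
    "capacity_kw_dc": {"capacity_kw_dc", "dc_size_kw", "system_size_kw"},
    "energy_kwh": {"energy_kwh", "production_kwh", "kwh"},
}

# Invert the table once so each lookup is a single dict access.
ALIAS_TO_CANONICAL = {
    alias: canonical
    for canonical, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def normalize_key(key):
    """Lowercase a field name and unify separators so 'Site ID' matches 'site_id'."""
    return key.strip().lower().replace(" ", "_").replace("-", "_")


def translate_record(record):
    """Split a source record into (translated, unmapped) dicts.

    Fields with no known alias are returned separately rather than being
    silently dropped, so they can be reviewed and added to the alias table.
    """
    translated, unmapped = {}, {}
    for key, value in record.items():
        canonical = ALIAS_TO_CANONICAL.get(normalize_key(key))
        if canonical is None:
            unmapped[key] = value
        else:
            translated[canonical] = value
    return translated, unmapped
```

Renaming fields is only the easy part; a real translation layer would also need to reconcile units, time zones, and measurement intervals, which this sketch leaves out.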
If you would like to get involved with the Orange Button initiative, visit the Orange Button website or come meet with Orange Button participants at SPI.