Presented By:
David Wallach
Duke Energy
TechCon 2017
Abstract
Duke Energy is embarking on a project to update its Transmission Health & Risk Management (HRM) strategy (Kurant, 2016). The initial focus will be on power transformers, the most capital-intensive asset in the Transmission system. Duke Energy has used variations of HRM in the past and still does today; however, as new data mining and algorithm techniques come into play, we must adjust our strategy. Mergers create opportunities to seek new common approaches. Past methods have been primarily manual and rely on snapshots of data that quickly grow stale. New approaches for a merged fleet containing large numbers of power transformers must be automated to create a health index dashboard. This data will be used to drive maintenance plans, business cases for proactive replacement opportunities, and budgets. We will discuss how we are addressing issues with data integrity, aging infrastructure, and depleting resources as we develop this updated prioritization strategy.
Introduction
Duke Energy Transmission has an Asset Management team that includes a Special Operations Group whose role is to develop and improve strategies to reduce equipment failures, which translates to increased grid reliability. HRM is the most powerful weapon we could have in our arsenal and is therefore our focus. We have continued to hone past methods, but we are being overrun by a lack of resources, industry know-how leaving the workforce (retirements), and an inability to bring on new technologies and tools. Our methods are almost entirely manual; our data is disjointed and incomplete, producing a reactionary snapshot in time with very low confidence in its accuracy and results that are minimally effective at best. It is analogous to facing an enemy with advanced weaponry while we are armed with bows and arrows.
Today, Duke Energy Transmission uses a snapshot approach. Data is mined and a Criticality, Health and Risk (CHR) score is calculated for each transformer, where criticality represents the consequence of failure, health represents the probability of failure, and risk is criticality multiplied by health. A watch list is produced of our fleet’s worst-performing transformers that require action. Asset Management SMEs meet monthly on a conference call to evaluate the current state of problematic assets. It takes weeks of effort to generate this report, which is updated only once or twice a year, and the data becomes stale soon after it is mined. The inputs to the scoring process include:
- DGA (IEEE C57.104-2008)
- Duval Triangles (IEC Standard 60599)
- Normalized Energy Intensity (NEI) (Jakob, Dukarm, 2015)
- Normalized Energy Intensity 3 (NEI3) (Jakob, Dukarm, 2015)
- Insulation Power Factor (IEEE C57.152-2013)
- Criticality
Our process must be updated and become more automated/real-time.
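The risk calculation described above (risk equals criticality multiplied by health) and the resulting watch list can be sketched in a few lines. This is a minimal illustration, not Duke Energy's scoring implementation: the 0 to 1 scales, the asset identifiers, the scores, and the watch-list size are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transformer:
    asset_id: str        # hypothetical identifier
    criticality: float   # consequence of failure, assumed scaled 0..1
    health: float        # probability-of-failure proxy, assumed scaled 0..1

    @property
    def risk(self) -> float:
        # Risk as defined in the text: criticality multiplied by health.
        return self.criticality * self.health


def watch_list(fleet, size):
    """Return the `size` highest-risk transformers, worst first."""
    return sorted(fleet, key=lambda t: t.risk, reverse=True)[:size]


# Invented sample fleet for illustration only.
fleet = [
    Transformer("T-001", criticality=0.9, health=0.2),
    Transformer("T-002", criticality=0.5, health=0.8),
    Transformer("T-003", criticality=0.3, health=0.3),
]
for t in watch_list(fleet, size=2):
    print(t.asset_id, round(t.risk, 2))
```

Note that a highly critical unit in good health (T-001 here) can rank below a less critical unit in poor health, which is the point of multiplying the two factors rather than ranking on either alone.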
Problem Statement
We need HRM as our environment is changing and we cannot continue to do things the way we have in the past. The changes in front of us present us with both threat and opportunity. How we respond will determine our success or our failure.
The proposed HRM analytics platform is necessary not only to manage and mitigate our threats and risks today, but also to meet the evolving demands in front of us that stem from poor data integrity, an aging infrastructure, and depleting resources.
The first critical aspect of the problem is Data Integrity. The condition of our data today is disjointed, inconsistent, incomplete, and ultimately un-actionable. The process of collecting necessary data is referred to as ‘data dumpster diving’ and produces unreliable results with many assumptions, huge gaps, and low accuracy.
The second critical aspect of the problem is our Aging Infrastructure. Our transmission infrastructure experienced a growth boom in the late 1970s and 1980s, and those assets are quickly approaching theoretical end of life. In our current environment we are not able to evaluate our fleet in a comprehensive manner. We maintain our assets using engineering judgment and experience, and we respond to problems as they arise rather than preventing them from occurring. We have embraced condition-based maintenance; however, that is not a complete solution. We do not get the optimal lifecycle from our assets and we do not know which ones will fail first. Combined with an expected increase in failure rate as these assets age, this puts us at a critical juncture.
The final critical aspect is Resource Constraints. The current resources, processes, and skill sets within Transmission struggle to meet the demands of today’s workload, much less to mobilize and effectively meet the demands on our horizon.
The combination of poor data quality, aging infrastructure, and depleting knowledge and skill presents us with quite a problem. What we see on our horizon without HRM is increased failures, increased customer outages, degrading reliability, increased environmental exposure, and poor grid integrity.
Data Integrity
The information we need to run our business comes from data. Data is the foundation of all of our efforts and of any analytics solution. The integrity of the data is paramount to success, and the quality of the data is a cultural issue throughout the utility industry. Without data integrity it is impossible to forecast or predict with any confidence, or to have any impact on the looming failures.
The data necessary to support Duke Energy Transmission is located in various source systems and states throughout our operating regions. In every region within Transmission, our data is incomplete, inconsistent, of low quality, and in some cases missing altogether. Access to information from this data is necessary for Asset Management to support Transmission in its core functions: asset condition health maintenance, asset failure risk mitigation, optimization of resources, and ultimately protection of grid reliability.
Achieving the data integrity necessary to succeed will require a drastic culture shift throughout Transmission. It is essential that we as an organization view our intellectual assets as being as valuable as our physical assets. We must protect our data integrity and govern its collection and maintenance with the same vigilance and dedication with which we protect the grid itself. In reality, protecting the data is protecting the grid.
The Asset Management group will set the standards for data collection and quality. We must also embark on an aggressive educational campaign throughout all of Transmission, starting with leadership, that conveys the big picture and builds buy-in from each and every front-line engineer and technician. Setting clear and concise expectations for how data is viewed and handled throughout our organization is not optional if we are to succeed. Every employee within Transmission plays a critical role in the management of our data assets and must understand what that role is.
Some specific examples of how data should be viewed and managed include:
- Task-specific maintenance and corrective work orders in place of the higher-tier work orders used frequently today. This practice should minimize the amount of field work completed under a generic or unspecified work order code. Today, preventive maintenance is typically coded better than corrective maintenance.
- Deployment of electronic mobile data collection devices that enable (rather than inhibit) their users. Combined with task-specific work orders, this will enable us to collect consistent, fluid, and accurate field data. This data source will become our most valuable in the future and is the one that is least available today.
- Data governance “Czars” supporting the Transmission enterprise who protect and manage Transmission data coming from all sources. This includes process, procedure, tools, communications, and storage, to name a few.
Aging Infrastructure
Across all equipment classes, the vast majority of our assets are nearing end of life as a result of a growth boom from the late 1970s through the mid-1980s. This bubble of assets is degrading grid integrity, increasing the need for emergent and emergency work, and presenting an ominous threat to grid reliability and to safety. Looking at one asset type in particular, Power Transformers, we can easily see in Table 1 the threat that is upon us.

Duke Energy has 8,026 High Voltage Transformers that are over 30 years of age; this represents 70% of our fleet. Using an industry average of 0.5% fleet failures annually and the drastic decrease in transformer life expectancy after the age of 60, we projected transformer failures for Duke Energy over the next 50 years.
The projection shows that we should expect our transformer failures to rise to a rate of 40 or more a year in the next ten years; from there on, the rate almost doubles in each 10-year band. We took a very conservative approach, and it still shows a sharp increase in our very near future that will not slow down over time but will continue to grow.
This threat does not apply only to power transformers; it applies to the grid as a whole. When the transformers were installed, we also installed the circuit breakers, switches, instrument transformers, conductors, structures, bus, capacitor banks, and so on. Although each of these asset classes has a different lifecycle and different failure modes, the fact remains: we have an aging infrastructure that is approaching end of life at an accelerating pace. We have both the threat of failure and the threat of obsolescence to contend with. Without a precision tool to identify the biggest threats, select effective mitigating actions, and prioritize work, we do not have a prayer in this battle.
Resource Constraints
The power industry as a whole expects 30% of its employees to retire in the next 5 years, taking with them an estimated 60% of current knowledge. Within Duke Energy Transmission Engineering Asset Management, 30% of existing employees are retirement eligible. How much of our knowledge should we expect to lose with them? We can hire, but we cannot replace the tribal knowledge that will be lost when these employees leave Duke Energy.
We are already aware of the large number of retirement-eligible employees, as well as many who are leaving earlier in their careers for other opportunities. We cannot hire and train in our traditional manner to fill this gap. The intelligence that is leaving is largely tribal knowledge assembled over years of firsthand experience, and it has not been documented and organized in a retainable manner. It exists in the minds of our employees. This is why, in today’s world, the solution to so many things is a specific person: we have people who get things accomplished, not processes or practices that do.
So what will we do when these people are not there any longer? Many of these employees represent single points of failure in a process or activity that will disappear when they do. How do we not only keep up what we are doing today but address all of these new challenges in front of us as our human resources leave, taking with them so much of our knowledge?
Solution Plan
The HRM Platform that Duke Energy is planning will harness all of the power of our resources and provide the optimal results with the minimal input. Duke Energy Transmission will have a single source for conditional data with full situational awareness and intelligence.
We start with quality data accessed on a near-real-time basis to avoid “garbage in, garbage out.” Engineering Principles, Statistical Analysis, Strategic Planning, Financial Planning, and Real-Time Operations will be used to provide the most comprehensive, robust, and accurate risk management services possible on the market today, empowering our employees, drastically decreasing failures, extending asset lifecycle, and increasing grid reliability and our overall value to the customer.
The three key elements of the solution are machine learning, analytics, and modeling.
Machine Learning
The constant flow of data feeds the machine learning module which will find relationships and patterns unique to each component, asset, fleet, and system.
- 1st Analyze: The module will examine the historical and operating data, producing operational and performance characteristic norms and anomalies.
- 2nd Identify: The module will identify failure signatures and failure-correlated detection logic.
- 3rd Validate: The module will generate a list of at-risk assets for field investigation, and the conclusions of those investigations are fed back into the module.
- 4th Learn: The module will automatically detect new failure modes, build new analytics, establish asset signatures, and provide constant 24/7/365 monitoring of these norms, presenting them to the SMEs through visualization, alarms, and tasks.
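The Analyze and Validate steps can be illustrated with a deliberately simple statistical stand-in. The sketch below is not the machine learning module itself: it assumes a hypothetical per-asset history of a single reading (for example a dissolved-gas concentration in ppm), derives a norm from that history, and flags assets whose latest reading sits more than an assumed three standard deviations from it. The asset names, values, and threshold are invented for illustration.

```python
from statistics import mean, stdev


def find_at_risk(history, latest, z_threshold=3.0):
    """Flag assets whose latest reading deviates from their own historical norm.

    history: dict of asset id -> list of past readings (the "norm")
    latest:  dict of asset id -> most recent reading
    """
    at_risk = []
    for asset_id, readings in history.items():
        mu, sigma = mean(readings), stdev(readings)
        if sigma == 0:
            continue  # no variation in history; z-score is undefined
        z = (latest[asset_id] - mu) / sigma
        if abs(z) > z_threshold:
            at_risk.append((asset_id, round(z, 1)))
    return at_risk


# Invented sample data for illustration only.
history = {
    "T-001": [10, 12, 11, 9, 10, 11],
    "T-002": [20, 22, 19, 21, 20, 21],
}
latest = {"T-001": 11, "T-002": 35}

# Assets returned here would go to field investigation (the Validate step);
# confirmed findings would then be fed back as labeled examples.
print(find_at_risk(history, latest))
```

A production module would learn multivariate signatures rather than a single z-score, but the feedback loop has the same shape: establish a norm, flag departures from it, investigate, and feed the conclusion back.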
Analytics
Analytics begins with data discovery, visualization, analysis, and ends in action. The analytics will look at statistical diagnostic data including that produced by the machine learning element and apply known engineering principles and specifications to identify grid vulnerabilities and provide actionable insights.
The types of analytics we are deploying include but are not limited to:
- Engineering Diagnostics: Duval Triangle, Breaker Timing, Power Factor
- Asset Condition Health: Winding & Paper Condition, Pole Condition
- Grid Performance: SAIDI, SAIFI
- Degradation Models: Asset Remaining Life, Mean Time to Failure
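Two of the analytics listed above reduce to short formulas. The sketch below computes the relative percentages of methane, ethylene, and acetylene that locate a sample on the Duval Triangle (the fault-zone boundaries are not reproduced here), and the SAIDI and SAIFI reliability indices from a list of sustained interruptions. The gas concentrations and outage records are invented sample values, and the function names are hypothetical.

```python
def duval_percentages(ch4_ppm, c2h4_ppm, c2h2_ppm):
    """Relative percentages of CH4, C2H4, C2H2 used as Duval Triangle coordinates."""
    total = ch4_ppm + c2h4_ppm + c2h2_ppm
    if total <= 0:
        raise ValueError("at least one gas concentration must be positive")
    return tuple(100.0 * g / total for g in (ch4_ppm, c2h4_ppm, c2h2_ppm))


def saidi_saifi(interruptions, customers_served):
    """interruptions: list of (customers_interrupted, duration_minutes)."""
    customer_minutes = sum(n * minutes for n, minutes in interruptions)
    customers_interrupted = sum(n for n, _ in interruptions)
    saidi = customer_minutes / customers_served       # minutes per customer
    saifi = customers_interrupted / customers_served  # interruptions per customer
    return saidi, saifi


# Invented sample values for illustration only.
print(duval_percentages(ch4_ppm=60, c2h4_ppm=30, c2h2_ppm=10))
print(saidi_saifi([(1000, 90), (500, 30)], customers_served=10000))
```

With these sample inputs the gas percentages are 60, 30, and 10, and the two interruptions over 10,000 customers served give a SAIDI of 10.5 minutes and a SAIFI of 0.15.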
Modeling
The complex modeling piece of the solution provides viable business intelligence that will drive our decisions in a directly applicable manner.
- Work Prioritization: System Equipment Reliability Prioritization (SERP) based on Health & Risk
- Risk Reduction Strategies: Repair Vs. Replace
- Sparing Strategies: Inventory & Logistics
- Optimization of Resources: Return on Investment, Remaining Life, Repair Vs. Replace
The data is again part of the solution. We need to institutionalize the data so that it is presented to our new employees in an efficient, effective, and actionable manner. This allows intelligent yet inexperienced engineers to work confidently with quality, meaningful information that is provided to them in a fluid manner. If we hire people who understand how to use data and the power of analytics and who have strong educational foundations in engineering, we can train them to meet the demands of our grid with a platform such as HRM. The new talent would be guided to where the problems are rather than having to know the field equipment personally. They will be able to address problems and learn in a way that uses their skills most effectively, and to understand with confidence where they do not need to spend their time.
Scope
This Duke Energy initiative is planned as a pre-deployment phase followed by a continued full deployment phase. The pre-deployment scope will focus on one region and one asset type: Power Transformers. This phase will also include the Condition Based Maintenance (CBM) monitoring devices attached to the assets we will model. Because these devices are a primary data source for our key assets, we must be able to discern a bad sensor from a bad asset when the data we receive presents anomalies of concern. The only effective way to do this is to treat the sensor as an asset itself, with its own failure modes that can be recognized by our algorithms.
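One way to picture treating the sensor as an asset with its own failure modes is a rule-based triage that checks for sensor-type signatures before attributing an anomaly to the transformer. The sketch below is an assumption-laden illustration: the two sensor failure signatures (physically impossible values and a flatlined signal), the valid range, and the alarm level are hypothetical and are not taken from Duke Energy's algorithms. The readings stand in for something like a temperature channel.

```python
def triage(readings, valid_range=(0.0, 200.0), alarm_level=110.0, flatline_len=5):
    """Classify a window of readings as 'sensor_fault', 'asset_alarm', or 'normal'."""
    low, high = valid_range
    # Hypothetical sensor failure mode 1: physically impossible values.
    if any(r < low or r > high for r in readings):
        return "sensor_fault"
    # Hypothetical sensor failure mode 2: a stuck signal (identical recent values).
    tail = readings[-flatline_len:]
    if len(tail) == flatline_len and len(set(tail)) == 1:
        return "sensor_fault"
    # Plausible, varying data at or above the alarm level is attributed to the asset.
    if readings[-1] >= alarm_level:
        return "asset_alarm"
    return "normal"


# Invented sample windows for illustration only.
print(triage([85.1, 85.4, 999.0, 85.2, 85.0]))    # out-of-range spike
print(triage([92.0, 92.0, 92.0, 92.0, 92.0]))     # flatlined signal
print(triage([95.2, 98.7, 103.1, 107.9, 112.4]))  # rising trend past alarm level
print(triage([84.9, 85.3, 85.1, 84.7, 85.0]))     # ordinary variation
```

Checking the sensor signatures first means a failed probe is routed to sensor maintenance instead of triggering an unnecessary transformer investigation.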
The full deployment phase will extend Power Transformers throughout all 4 Regions as well as add Circuit Breakers and Line Structures to the asset groups. This will provide a complete yet skeletal view of the whole Transmission Grid from within the HRM platform.
The expanded scope beyond the first 6 years could include additional asset groups such as Batteries, Instrument Transformers, Step-voltage Regulators, Reactors, Relays (Protection Zones), Switches & Disconnects, Digital Fault Recorders/Disturbance Monitoring Equipment, Buses, Capacitor Banks, Cameras, and Phasor Measurement Units.
Duke Energy is planning to spend approximately 6 months mobilizing our data for Duke Energy Progress (DEP) into the Information Management Architecture (IMA) data hub and Pi Historians prior to standing up the HRM platform. We will begin data work in the 3 remaining regions immediately after DEP comes up and work on data collection and clean-up efforts continually until the regional HRM deployment is scheduled.
Conclusion
A list of specific Key Performance Indicators (KPI) identified for the HRM Initiative includes:
Reduce Failures & Outages
- 10% reduction in SAIDI and reduced safety risks, achieved by eliminating unnecessary work, optimally scheduling necessary work, minimizing Emergent and Emergency work, and avoiding catastrophic failures. This is estimated to result in ~$6MM in annual economic benefit during the pre-deployment and ~$36MM in annual economic benefit for the full deployment, to be realized by Duke’s customers.
Optimize O&M
- More efficient maintenance operations and improved asset visibility and reporting across the enterprise, with the elimination of truck rolls and labor-intensive data collection and evaluation processes. This is estimated to represent an overall 2% reduction in O&M, resulting in ~$300k annual economic value during the pre-deployment and ~$5MM in annual economic value during the full deployment.
Optimize Capital
- CapEx deferral and improved capital efficiency through predictive, intelligent asset management. Precision capital spending focuses on replacing those assets that will have the largest and most direct impact on risk and the threat of failure. This is estimated to be an overall 4.5% reduction in Capital spending, resulting in ~$1.5MM in annual economic value during the pre-deployment and ~$32MM in annual economic value during the full deployment.
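Adding the three estimates gives a rough combined figure for each phase. The short tally below uses only the dollar values quoted for the KPIs; treating them as additive is an assumption of this example, since the first estimate is described as a benefit realized by customers while the other two are described as economic value, and the paper does not state whether the three overlap.

```python
# Annual economic value in $MM, as quoted for each KPI in the text.
kpis = {
    "Reduce Failures & Outages": {"pre": 6.0, "full": 36.0},
    "Optimize O&M": {"pre": 0.3, "full": 5.0},
    "Optimize Capital": {"pre": 1.5, "full": 32.0},
}

# Simple sums; assumes the three estimates are independent (not stated in the paper).
pre_total = sum(v["pre"] for v in kpis.values())
full_total = sum(v["full"] for v in kpis.values())

print(f"Pre-deployment: ~${pre_total:.1f}MM per year")
print(f"Full deployment: ~${full_total:.1f}MM per year")
```

Under that assumption the estimates total roughly $7.8MM per year for the pre-deployment and roughly $73MM per year for the full deployment.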
References
[1] Kurant, Nicole. “Transmission Health & Risk Management (HRM).” Duke Energy Internal Report, 2016.
[2] IEEE. “Guide for the Interpretation of Gases Generated in Oil-Immersed Transformers.” IEEE C57.104-2008.
[3] IEC. “Mineral Oil-filled Electrical Equipment in Service – Guidance on the Interpretation of Dissolved and Free Gases Analysis.” IEC 60599:2015.
[4] Jakob, Fredi and Dukarm, James. “Thermodynamic Estimation of Transformer Fault Severity.” IEEE Transactions on Power Delivery, Vol. 30, No. 4, August 2015.
[5] IEEE. “IEEE Guide for Diagnostic Field Testing of Fluid-Filled Power Transformers, Regulators, and Reactors.” IEEE C57.152-2013.