Most of us have worked at places with some kind of ‘RCA’ effort. Each likely defined and practiced ‘RCA’ differently, but each nonetheless had something called RCA.
What made one facility better at it than another? Why was one facility’s RCA more effective than another’s?
Having been in this space for 35+ years, I have, to be honest, found this quite frustrating to watch. As consultants, our value is like ‘industrial espionage’, but not in the criminal way. We are fortunate to observe how people operate in different industries, geographical regions and cultures. We see a cross-section of how people and processes behave, and that puts us in a position to share some general observations.
I could easily make a list of what I see as the Top 10 reasons RCA efforts don’t last, but I’ll spare you the additional reading and focus on what I see as the Top 5.
The best RCA efforts I have seen revolve around a true Champion who walks the talk. These are people who understand what their RCA analysts have to go through in order to be effective. They give analysts the tools, training, support and clear expectations they need to do the analyses properly.
Unfortunately, these people are rare. They are in these jobs because they WANT to be, not because they HAVE to be. The problem companies often have is that such a rare individual will almost certainly be on the corporate fast track. They will likely be in the position for 1 to 3 years and then move on up.
A successful RCA effort will be institutionalized, meaning it will be ingrained in how we do business and will survive the loss of the Champion and leadership turnover. This should be a key element of designing an effective RCA effort. To me, the failure to institutionalize is the leading reason why RCA efforts fail!
At my company, when a request for bids on ‘RCA’ comes out and the requirements are generic (non-specific as to what is wanted), we know the requesters are just looking for the lowest bidder. RCA is a commodity to such people, and the specific RCA approach itself has no value to them. When RCA is viewed as a commodity and the approach is not valued, all approaches are viewed as equal.
So brainstorming, 5-Whys, Fishbones and evidence-based causal tree approaches are all given the same weight. That simply is NOT TRUE. Effective RCAs require appropriate breadth and depth. In looking at breadth, think of the difference between asking “How Can” and asking “Why”.
“How Can” will explore all the possibilities instead of just the single obvious observation. Where depth is concerned, stopping at the physics of failure and replacing parts will not necessarily prevent the next failure. A true RCA will drill down to find the inappropriate decisions that were made, and WHY they were made.
If a bearing fails due to fatigue, and we just replace it, is the problem solved? No. Where did the fatigue come from? If we find someone misaligned the pump, causing the fatigue, and we discipline them, does the problem go away? No. We need to understand why that person felt that aligning the pump the way they did was appropriate. We may find they didn’t know how to align properly, the procedures were obsolete or non-existent, the tools they had were inadequate, or they were simply under time pressure and took shortcuts. If we go to this depth and solve the systemic problems, then we are working on preventing recurrence.
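To make the idea of depth concrete, here is a minimal Python sketch of the bearing example as a small cause tree. The `Cause` class, the level labels (`physics`, `decision`, `systemic`) and the `unfinished_branches` helper are illustrative assumptions, not part of any particular RCA methodology or software. The sketch simply flags any branch of the tree that stops short of a systemic cause.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    """One node in a hypothetical cause tree."""
    description: str
    level: str  # assumed labels: "physics", "decision" or "systemic"
    causes: List["Cause"] = field(default_factory=list)


def unfinished_branches(cause: Cause) -> List[str]:
    """Return descriptions of leaf causes that stop short of the systemic level."""
    if not cause.causes:
        return [] if cause.level == "systemic" else [cause.description]
    found: List[str] = []
    for deeper in cause.causes:
        found.extend(unfinished_branches(deeper))
    return found


# Analysis that stops at the person who misaligned the pump.
shallow = Cause("Bearing failed due to fatigue", "physics", [
    Cause("Pump was misaligned", "decision"),
])

# Analysis that keeps going until it reaches systemic causes.
deep = Cause("Bearing failed due to fatigue", "physics", [
    Cause("Pump was misaligned", "decision", [
        Cause("Alignment procedure was obsolete", "systemic"),
        Cause("Time pressure encouraged shortcuts", "systemic"),
    ]),
])

print(unfinished_branches(shallow))  # branches that still need a "why"
print(unfinished_branches(deep))     # empty list when every branch is systemic
```

A real analysis would also attach evidence to each node and carry several competing hypotheses at each level; this sketch leaves those out to keep the depth check visible.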
When anyone is under time pressure to do anything, they will often take shortcuts. The most time-consuming task in an effective RCA is collecting the evidence to prove our hypotheses. When we are under time pressure in RCA, that is where we tend to take the shortcuts.
This is like a detective saying they don’t need evidence from the crime scene to make their case. It just doesn’t work like that; no one goes to court with hearsay and tries to make it fly as fact. An effective RCA effort will require the proper degree of evidence to support its hypotheses.
Unfortunately, a compliant RCA effort does not guarantee any improvement in Reliability or Safety. In the field I see many RCA efforts where success is defined as being compliant. Typically this happens in highly regulated settings such as high-hazard industries and hospitals.
In the US, almost all 6,000 hospitals are accredited, meaning they pass the regulatory audit and will receive their federal Medicare and Medicaid monies. This accreditation covers their RCA efforts. However, deaths due to medical error are consistently among the Top 5 killers of Americans.
This demonstrates a disconnect between a compliant RCA system and actual patient safety. An effective RCA effort will measure success by actual improvements in the process, using bottom-line metrics (financial, safety, environmental and quality) together with leading metrics focused on adherence to the proper steps of a true RCA (e.g., the adequacy of the evidence collected to support hypotheses).
A BIG missing link in the traditional application of ‘RCA’ is an understanding of why good people often make the wrong decisions at the time they do. This gets into the field of the Social Sciences. I myself did not have a great enough appreciation for this field until I started researching the correlations between Reliability and Safety.
As RCA analysts, we must have a better understanding of human reasoning and the impact of organizational systems on decision-making. Typically, engineers shine when delving into the physics of a failure. They are often lost when they delve into the ‘soft’ stuff like human reasoning. Conversely, social scientists shine in understanding decision-making and intent, but are lost when it comes to the physics of a failure.
It’s hard to rank the items on this list, as I feel they are all equally important. I find that most of the time, a formal ‘RCA’ is not conducted unless a corporate/site trigger has been met. Such triggers are thresholds based on production losses, equipment damage, injuries/fatalities and environmental excursions, to name a few. However, from an ‘RCA’ purist’s standpoint, that is too late. That is a reactive use of RCA, which is typical. Ideally, we’d like to be able to apply RCA before these catastrophic events so we can avoid the risk of them. How can we do that? We can apply the concepts of effective RCA to chronic failures (the ones that do not rise to the levels of our triggers), high-severity near-misses and unacceptable risks from risk assessments like FMEAs.
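As a rough illustration of why chronic failures deserve attention even though no single event trips a trigger, consider the minimal Python sketch below. Everything in it is an assumption made for illustration: the `FailureMode` class, the `hidden_chronic_losses` helper, the per-event trigger of 100,000 and the failure modes with their frequencies and costs are invented, and comparing annual loss to the per-event trigger is just one simple way to surface candidates.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    """A hypothetical failure mode with invented frequency and cost figures."""
    name: str
    events_per_year: float
    cost_per_event: float

    @property
    def annual_loss(self) -> float:
        return self.events_per_year * self.cost_per_event


def hidden_chronic_losses(modes: List[FailureMode], trigger_cost: float) -> List[FailureMode]:
    """Modes whose single events stay under the trigger but whose yearly total exceeds it."""
    hidden = [
        m for m in modes
        if m.cost_per_event < trigger_cost and m.annual_loss > trigger_cost
    ]
    return sorted(hidden, key=lambda m: m.annual_loss, reverse=True)


TRIGGER_COST = 100_000  # assumed per-event threshold that forces a formal RCA

modes = [
    FailureMode("Compressor trip", events_per_year=0.5, cost_per_event=150_000),
    FailureMode("Pump seal leak", events_per_year=40, cost_per_event=5_000),
    FailureMode("Conveyor jam", events_per_year=12, cost_per_event=2_000),
]

for mode in hidden_chronic_losses(modes, TRIGGER_COST):
    print(f"{mode.name}: {mode.annual_loss:,.0f} per year, never meets the trigger")
```

With these invented numbers, the seal leak never triggers a formal RCA, yet it costs more per year than the compressor trip that does.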
While this is my experience, I’d be very interested in hearing from you about places you have worked (or currently work) and what prevents their RCA efforts from realizing their potential. Feel free to contact me on LinkedIn or just email me at firstname.lastname@example.org.
About the Author
Robert (Bob) J. Latino is CEO of Reliability Center, Inc., a company that helps teams and companies do RCAs with excellence. Bob has been facilitating RCA and FMEA analyses with his clientele around the world for over 35 years and has taught over 10,000 students in the PROACT® methodology.
Bob is co-author of numerous articles, has led seminars and workshops on FMEA, Opportunity Analysis and RCA, and is co-designer of the award-winning PROACT® Investigation Management Software solution. He has authored or co-authored six (6) books related to RCA and Reliability in both manufacturing and healthcare and is a frequent speaker on the topic at domestic and international trade conferences.
Bob has applied the PROACT® methodology to a diverse set of problems and industries, including a published paper in the field of Counter Terrorism entitled, "The Application of PROACT® RCA to Terrorism/Counter Terrorism Related Events."