Abstract: Traditionally, root cause analysis (RCA) is used only after an event has occurred, so how can one call it proactive? This article explores that perception by examining the current paradigms about RCA: what it is and when it is used. Do we really have to wait for an undesirable outcome to occur in order to use RCA?
PARADIGM #1: ALL ‘RCA’ IS THE SAME
False. RCA has become a diluted acronym whose meaning has essentially been lost. Everyone says they do RCA, and in their own minds they do, because there is no universally accepted definition of what RCA is. The major providers of RCA methodologies cannot agree on such a definition, and this further confuses the marketplace.
What is the advantage to providers of coming to an agreement on such a definition? If a definition is agreed upon, their respective methodologies risk no longer being viewed as unique in the marketplace, which could hurt their business. Any company must establish its uniqueness through its marketing message. If RCA providers agree on a single definition of RCA, will they all be viewed as having the same offering and value? Could RCA then be viewed as a commodity, with the only difference being the price to deliver that commodity effectively?
This author’s opinion is that a standard RCA definition can exist and the providers can still maintain their uniqueness. The RCM community succeeded in doing so by establishing SAE Standard JA1011, Evaluation Criteria for Reliability-Centered Maintenance (RCM) Processes. Whether you agree with it or not, a standard exists that identifies the essential elements of what is considered RCM. The same can be done for RCA, which would give potential users a baseline document for evaluating the various RCA processes they may be considering.
This prospect comes with risks, as I mentioned earlier: if some RCA providers’ current methodologies do not meet such a standard, those providers would have no reason to support it. Because of this lack of cooperation, methodologies such as 5-Whys and Fishbone Diagrams are treated as equals of such noted proprietary processes as PROACT, K-T, Apollo, SoLogic, Tripod Beta, The Phoenix Approach, REASON and Taproot. This leads to RCA being equated with troubleshooting, problem solving and brainstorming.
Because of the providers’ competitive concerns, potential users are being deprived of the experience and expertise that these providers are capable of offering them.
Paradigm Shift: Providers and users need to come together and produce an unbiased baseline standard that establishes the minimum requirements for what is to be considered RCA. Attempts have been made at this but progress has been slow.
PARADIGM #2: CHRONIC FAILURES ARE AN ACCEPTABLE COST OF DOING BUSINESS
Think about it: under what circumstances is an RCA typically commissioned? Experience dictates:
- When someone is injured
- When there is significant production loss
- When there is catastrophic damage
- When there is a regulatory violation
- When an event adversely affects the community
- When there is recognized liability on the part of the company
There are others, but this list covers most of the conditions that typically trigger an RCA. When RCA is performed only under these conditions, its application is reactive in nature. This would indicate that we must wait for the adverse outcome to occur before we can do something about it.
We must realize that while these high-visibility events must be addressed using analytical tools like RCA, we should not neglect the fact that chronic events typically cause these sporadic or acute events to occur. The cost of chronic failures (high-frequency, low-impact events) far exceeds the cost of sporadic events when viewed on an annual basis.
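The annual-basis comparison can be made concrete with a little arithmetic. The following is a minimal sketch only: the failure modes, frequencies and costs are hypothetical figures invented for illustration, not data from any plant, and the names `FailureMode` and `rank_by_annual_cost` are assumptions of this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    name: str
    events_per_year: float   # how often the event occurs
    cost_per_event: float    # repair plus lost production, per occurrence

    @property
    def annual_cost(self) -> float:
        # Annualized cost = frequency x impact
        return self.events_per_year * self.cost_per_event


def rank_by_annual_cost(modes: List[FailureMode]) -> List[FailureMode]:
    """Return failure modes sorted from highest to lowest annual cost."""
    return sorted(modes, key=lambda m: m.annual_cost, reverse=True)


# Hypothetical figures, for illustration only.
modes = [
    FailureMode("Compressor wreck (sporadic)", 0.2, 500_000),  # once in 5 years
    FailureMode("Pump seal failure (chronic)", 52, 4_000),     # weekly
    FailureMode("Conveyor stoppage (chronic)", 250, 600),      # most working days
]

for m in rank_by_annual_cost(modes):
    print(f"{m.name:<30} {m.annual_cost:>10,.0f} per year")
```

With these assumed numbers, both chronic items outrank the single dramatic event once frequency is taken into account, even though each individual occurrence looks minor.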
Think about the chronic failures that often make their way into our cost of doing business: the repetitive bearing and seal failures, conveyor belt stoppages and slowdowns, reduced production rates, defective parts introduced into the field, leaks on the floor, product sent for rework, delivery delays, etc. These types of events often occur every shift, day, week, month and year. Because they are unlikely to hurt someone, cause significant short-term production loss, or involve regulatory or legal authorities, they are often accepted as a cost of doing business and built into the budget as a permanent fixture. Once such an item is in the budget, the failure is compensated for, so in the minds of the organization it is no longer a failure.
More importantly, we must recognize these types of chronic events are precursors to the more dramatic, sporadic (or acute) events. A certain bearing could fail that will stop all production. A seal could fail that would cause a release into the atmosphere. A leak on the floor could cause a person to slip and result in harm.
Knowing this, we can and should apply RCA to the chronic events in an effort to eliminate the possibility of the more sporadic type of events. This is a proactive use of RCA because if we do not act on such events, no one else will. We are being proactive by addressing the chronics and preventing the sporadics.
Paradigm Shift: Analysts need to focus on the chronic events that plague their processes. We must not accept “routine failures” as a cost of doing business and include them in the budget, where they get a cost-of-living increase each year. We must recognize and support the contention that addressing chronic events reduces the risk of sporadic events emerging, which have much more dire consequences.
PARADIGM #3: RCA IS A REACTIVE TOOL
The true concepts of real RCA can be applied to events that have not yet occurred. When doing various types of risk analyses such as HAZOPs, FMEAs or RCMs, we typically identify which events, if they were to occur, would be most critical.
Most risk analyses address these risks by implementing predictive measures that minimize the consequences of such an event. RCA, by contrast, can be applied to understand all the potential causes that could line up for such an event to occur.
Predictive techniques seek to identify a signal of impending failure in its early stages. Such signals may take the form of vibration, sound, temperature, pressure, flow, eddy currents, etc. The goal of prediction is to pick up the signal early, while the goal of RCA is to determine why the signal is occurring in the first place.
When we have identified a critical risk we can employ the RCA concepts by “assuming” the identified risk has materialized. Now we can use our RCA approaches to find out how such an undesirable event could have occurred all the way down to understanding the decision-making system that allowed it to occur.
By utilizing RCA in this proactive capacity, we can identify the potential physical, human and latent/systemic root causes that could cause such an event to occur. Knowing this, we can implement countermeasures to prevent these causes from materializing, thus preventing the error-chain from completing.
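One way to picture this proactive use is as a small cause tree built under an assumed event and then queried by cause level. This is only an illustrative sketch, not any provider’s methodology: the `Cause` class, the `collect` helper and every cause listed are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cause:
    description: str
    level: str  # "event", "physical", "human" or "latent"
    children: List["Cause"] = field(default_factory=list)


def collect(node: Cause, level: str) -> List[str]:
    """Depth-first list of all cause descriptions at the given level."""
    found = [node.description] if node.level == level else []
    for child in node.children:
        found.extend(collect(child, level))
    return found


# Hypothetical tree: assume the identified risk has already materialized.
tree = Cause("Seal failure releases product to atmosphere", "event", [
    Cause("Seal faces damaged by dry running", "physical", [
        Cause("Pump started with suction valve closed", "human", [
            Cause("Startup procedure has no valve line-up check", "latent"),
            Cause("Operator training does not cover this pump", "latent"),
        ]),
    ]),
    Cause("Seal installed misaligned", "physical", [
        Cause("Alignment check skipped during rebuild", "human", [
            Cause("Rebuild schedule leaves no time for alignment checks", "latent"),
        ]),
    ]),
])

# The latent causes are the candidates for countermeasures.
for item in collect(tree, "latent"):
    print("Countermeasure needed for:", item)
```

Listing the latent causes of an event that has not yet happened gives a concrete set of systemic weaknesses to correct before the error chain can complete.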
Paradigm Shift: RCA should be used in concert with RCM and other risk analysis tools to provide a holistic approach to eliminating failure. Dispel the paradigm that a failure must occur in order to use RCA, and start to use it on ‘potential’ events.
The moral of this story is that the only thing preventing us from learning why things go wrong (and could go wrong) is ourselves. Using RCA only for sporadic or acute events is “in-the-box” thinking. Using RCA for chronic and potential events is “out-of-the-box” thinking, because we would be using it unconventionally. Which do you prefer?
About Reliability Center
RCI is an international leader in bringing reliability engineering principles to the field of solving problems, correcting failures and preventing future human errors in the workplace. RCI has been a leader in successfully applying these principles in the manufacturing and health care fields for over 44 years. For more information, visit www.reliability.com.
Robert (Bob) J. Latino is CEO of Reliability Center, Inc. Bob has been a practitioner, trainer, author and international speaker on the topics of Reliability and Root Cause Analysis for over 30 years. He can be contacted at +1 804 458-0645 or firstname.lastname@example.org. Connect with Bob on LinkedIn at https://www.linkedin.com/in/robert-bob-latino-3411097. Visit our website at www.reliability.com to learn more.