Failure Modes and Effects Analysis (FMEA) is the most important reliability tool in a product’s life cycle. Executed effectively, an FMEA is a living, breathing document that sticks with your product from conception, through testing and production, and all the way into the field. It empowers you to design out failures, address areas of risk with targeted controls and actions, and leverage lessons learned from the field to improve future products.
When executed ineffectively, an FMEA becomes a box-checking exercise that consumes valuable engineering time with minimal impact on product reliability.
This article examines common mistakes made when implementing an FMEA programme and how to avoid them while getting the most out of your FMEAs.
Effective FMEAs rely solely on the knowledge of engineering teams. The most common mistake when implementing an FMEA programme is to launch the programme without first gaining support from the engineering teams it impacts. If these teams do not respect the why or how of the process, and if they view FMEAs as a hindrance to their work rather than an aid, the FMEAs they produce will not meaningfully impact reliability.
If you are working to implement or revamp an FMEA programme, make sure you take time to get feedback from management and end-users alike, and make sure to train them on the methodology and reasoning behind FMEAs.
This can be a difficult sell. Engineers like objective rationale and hard numbers, whereas FMEAs are often subjective and painfully number-free: Severity, Occurrence and Detection (SOD) scales and risk priority numbers (RPNs) are ordinal rankings, really pseudo-numbers. Be sure to stress that FMEAs give engineers an opportunity to have their voices heard and to leverage their unique product knowledge to make a real impact on design, testing, and process improvements.
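To make the "pseudo-number" point concrete, here is a minimal sketch of the conventional RPN calculation (Severity × Occurrence × Detection). The 1 to 10 scales, the `Cause` class, and both example causes are assumptions made for illustration; your own programme may use different scales or action priorities instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cause:
    """One failure cause with its SOD ratings (assumed 1-10 ordinal scales)."""
    description: str
    severity: int    # 1 = negligible effect, 10 = most severe effect
    occurrence: int  # 1 = remote likelihood, 10 = near certain
    detection: int   # 1 = almost certain to be detected, 10 = undetectable

    def rpn(self) -> int:
        """Return the risk priority number: severity * occurrence * detection."""
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError(f"SOD score out of range: {score}")
        return self.severity * self.occurrence * self.detection


# Hypothetical causes, invented for this example.
brake_fade = Cause("Brake fade due to undersized rotor", 10, 2, 2)
trim_rattle = Cause("Trim rattle due to loose clip tolerance", 2, 4, 5)

# Both products come out equal even though the severities differ widely.
print(brake_fade.rpn(), trim_rattle.rpn())
```

The two causes share an RPN although one is rated at the top of the severity scale and the other is a nuisance. That is why the number is best treated as a conversation starter for the team, not as a measurement.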
FMEAs are not grant submissions to be completed and filed at some arbitrary, predetermined date. They are living and breathing documents that should be conceived in parallel to a product’s design and follow that product through its life cycle.
If a design FMEA is performed post design freeze, you have missed your opportunity to address the probability of occurrence of your failure modes, and can only address detection risk through testing. If a design FMEA is performed after testing, then you have accomplished nothing beyond checking a box and wasting engineers’ time, as you can no longer address any kind of design risk (and good luck to the process team).
Whether your team is focused on a design, system, or process FMEA, the earlier you start an FMEA, the more opportunity you have to design reliability into your product. Ditch fixed due dates that encourage procrastination and minimalism, and start envisioning FMEAs as a design tool that gives you an edge over your box-checking competition.
Too often, experienced FMEA practitioners let slip a silent scream as they scroll their colleagues’ Cause lists, littered with “user error”, “operator error”, “wear”, etc. If FMEAs fail to drill down to the root causes of failures, then there is little that can be addressed through actions.
How does one design out “wear” or “corrosion” – reshape the laws of physics? How does one test for “user error” prior to design freeze? Generic causes like these are crutches and cop-outs that act as dead ends in an FMEA.
Instead, try “User error due to unintuitive design of the braking system” or “Corrosion due to poor material selection of the housing”. These are causes that can be addressed with existing controls and, if necessary, effective actions.
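A quick screen for dead-end causes can be sketched in a few lines. The list of generic terms and the `is_generic` helper below are hypothetical, built only from the examples in this article; a real check would use whatever vocabulary your own teams tend to fall back on.

```python
# Generic causes named in this article; extend to suit your own FMEAs.
GENERIC_CAUSES = {"user error", "operator error", "wear", "corrosion"}


def is_generic(cause: str) -> bool:
    """Return True when a cause is only a generic label with no root cause."""
    normalised = cause.strip().lower().rstrip(".")
    return normalised in GENERIC_CAUSES


causes = [
    "User error",
    "User error due to unintuitive design of the braking system",
    "Wear",
    "Corrosion due to poor material selection of the housing",
]

# Keep only the entries that need to be drilled down further.
flagged = [cause for cause in causes if is_generic(cause)]
print(flagged)
```

Entries that name a mechanism ("due to ...") pass the screen, while the bare labels are flagged for the team to revisit. A check like this cannot judge whether a cause is truly a root cause; it only catches the most obvious crutches.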
You have made it through the often tedious task of drilling down to root causes for your failure modes, and you have dug up every control document you can think of to assign prevention and detection type controls to those causes. Your team has debated every SOD score (is the effect a loss of secondary function or just an annoyance? Can we detect failure prior to design freeze or prior to launch?). Now, you are staring down a stream of red-, yellow-, and green-highlighted RPNs or action priorities, and you haven’t a clue what to do with all of this.
Actions are the entire reason for doing an FMEA – the reason so much time is spent laying out the system architecture, carefully delineating functions, determining the failure modes and their end effects, drilling down to root causes, and assigning controls. All of this leads to prioritising the failure causes by relative risk and mitigating that risk with targeted actions aimed at reducing failure probability or increasing detection.
You can’t address every cause with new actions. Take a hard look at causes that have safety or regulatory concerns first. If you have time, move on to the causes with the highest economic risk. Make sure responsibility for each action is assigned to a specific individual. Most importantly, reconvene the team once actions have been completed and assess their impact on risk. Which actions were the most effective? How did they address the risk and to what degree have they reduced it? This information is crucial not only for the current product but also for future products.
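The triage described above can be sketched as follows. This is an illustration under stated assumptions, not a prescribed method: the `Action` fields, the `economic_risk` figure (a relative estimate on whatever scale your team uses), the owner names, and the example data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Action:
    cause: str
    safety_or_regulatory: bool
    economic_risk: float               # hypothetical relative estimate
    owner: Optional[str] = None        # a specific individual, not a team
    rpn_before: Optional[int] = None
    rpn_after: Optional[int] = None    # filled in when the team reconvenes


def prioritise(actions: List[Action]) -> List[Action]:
    """Safety/regulatory causes first, then highest economic risk first."""
    return sorted(
        actions,
        key=lambda a: (not a.safety_or_regulatory, -a.economic_risk),
    )


def unassigned(actions: List[Action]) -> List[Action]:
    """Return the actions that still lack a named owner."""
    return [a for a in actions if not a.owner]


def risk_reduction(action: Action) -> Optional[float]:
    """Fractional RPN reduction, or None if the action was never reassessed."""
    if action.rpn_before is None or action.rpn_after is None:
        return None
    return (action.rpn_before - action.rpn_after) / action.rpn_before


actions = [
    Action("Corrosion due to poor material selection of the housing",
           False, 80.0, owner="A. Engineer", rpn_before=120, rpn_after=48),
    Action("User error due to unintuitive design of the braking system",
           True, 30.0, owner="B. Engineer", rpn_before=200, rpn_after=50),
    Action("Connector fretting due to vibration at the mounting point",
           False, 20.0),
]

for action in prioritise(actions):
    print(action.cause, action.owner, risk_reduction(action))
print("Needs an owner:", [a.cause for a in unassigned(actions)])
```

The braking cause is ranked first despite its lower economic figure because it carries a safety concern, and the unowned connector action is surfaced before it can be forgotten. Recording the before and after scores gives the team something to carry into the next product's FMEA.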
[Figure: ReliaSoft Cloud three-panel dashboard highlighting actions]
The melancholy fate of most completed FMEAs is to drift into a lonely graveyard, seven file folders deep on Rick from Quality Assurance’s laptop. After being toiled over for hours by cross-functional teams of engineers, the document, which has soaked up tens of thousands of dollars in engineering time, is forgotten and left to digitally decay.
One of the biggest complaints from engineers who are tasked with performing FMEAs is wasting time on repeated work. Most FMEAs are completed on systems that have previous iterations. Steve Jobs invented the iPhone® once in 2007; Apple® has since released 24 iterations and counting.
There is rarely any need to reinvent an entire FMEA. Current FMEAs should recycle relevant pieces of past FMEAs and focus on where problems have occurred in the field and on what has changed in the new design. Does your new chiller system reuse the same dependable control unit from the past design? Great, let’s incorporate its prior FMEA and review any safety or systems integration issues and then move on to the redesigned compressor which will require more attention.
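One way to picture this reuse is a simple split between what can be carried forward and what needs a fresh look. The function, the data layout, and the chiller entries below are hypothetical and only mirror the example in the text; they do not represent how any particular FMEA tool stores its data.

```python
from typing import Dict, List, Set, Tuple


def plan_fmea_review(
    new_design: Dict[str, bool],
    prior_fmea: Dict[str, List[str]],
    field_issues: Set[str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split components into carried-forward FMEA content and full reviews.

    new_design maps each component to True if it changed in this iteration.
    prior_fmea maps components to the causes recorded in the previous FMEA.
    field_issues holds components with problems reported from the field.
    """
    carry_forward: Dict[str, List[str]] = {}
    full_review: List[str] = []
    for component, changed in new_design.items():
        if changed or component in field_issues or component not in prior_fmea:
            full_review.append(component)
        else:
            # Reused as-is; still check safety and systems integration.
            carry_forward[component] = prior_fmea[component]
    return carry_forward, full_review


new_design = {"control unit": False, "compressor": True}
prior_fmea = {
    "control unit": ["Relay sticking due to contact arcing"],
    "compressor": ["Seal leak due to shaft misalignment"],
}

carry_forward, full_review = plan_fmea_review(new_design, prior_fmea, set())
print("Carry forward:", list(carry_forward))
print("Full review:", full_review)
```

Here the unchanged control unit inherits its prior analysis, while the redesigned compressor is queued for the team's full attention. A component with field problems would be routed to full review even if its design had not changed.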
While an effective FMEA programme relies heavily on building strong processes and a culture around reliability, having a platform for global, cross-functional teams to host and collaborate on FMEAs is a critical piece as well.
HBK is bringing decades of experience in creating powerful FMEA software to the cloud with a new Software-as-a-Service product, ReliaSoft Cloud.
Zachary Graves is an Application Engineer at HBK - Hottinger Brüel & Kjær, where he has worked for over five years. In his role, he supports clients in their reliability programmes through training, software demonstrations, and support solving reliability challenges.
Prior to his current role, he received his Bachelor’s degree in Engineering Mechanics and Astronautics from the University of Wisconsin and worked on applying statistical methods such as DOE, Gaussian Process Models, and Uncertainty Quantification to engineering problems.
Zachary is an expert in various reliability engineering techniques. He teaches and supports clients in applying methods such as FMEA, Life Data Analysis, Accelerated Life Testing, Reliability Growth Analysis, Systems Reliability Analysis, RCM, RAMS, and DFR. With a strong focus on customer success and technical communication, he is dedicated to helping clients optimise their reliability processes.
This will bring together HBM, Brüel & Kjær, nCode, ReliaSoft, and Discom brands, helping you innovate faster for a cleaner, healthier, and more productive world.