Failure Mode and Effects Analysis (FMEA)

What is Failure Mode and Effects Analysis (FMEA)?

Failure Mode and Effects Analysis (FMEA) is a structured analytical methodology used to identify the potential ways a system, design, process, or service might fail and to evaluate the consequences of those failures. This proactive risk assessment tool helps organizations prevent problems before they occur rather than fixing them after they happen.

To better understand FMEA, consider it similar to a doctor performing a thorough preventive health screening. Just as a doctor examines various body systems to identify potential health risks before they become serious problems, FMEA systematically examines all possible ways something might go wrong in a system or process.

The methodology consists of several key components that work together:

Failure Modes

Failure modes are the specific ways in which something might fail. For example, in an electric motor, failure modes might include bearing seizure, electrical short circuit, or overheating. Each failure mode describes a specific type of malfunction or problem that could occur.

Effects Analysis

Effects Analysis examines the consequences of each identified failure mode. These effects can be immediate (direct impact on the system), secondary (impact on related systems), or end effects (ultimate impact on the final user or process). For instance, a bearing seizure might lead to motor stoppage (immediate effect), damage to connected equipment (secondary effect), and production line shutdown (end effect).
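
As a rough illustration, the three levels of effect can be recorded as one structured entry per failure mode. The sketch below is only one possible layout: the class name `FailureEffects` and its field names are assumptions made for this example, not part of any FMEA standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureEffects:
    """Effects of one failure mode, grouped by how far they propagate."""

    failure_mode: str
    immediate: str   # direct impact on the system itself
    secondary: str   # impact on related systems
    end_effect: str  # ultimate impact on the final user or process

    def chain(self) -> str:
        """Render the effect chain from the failure to its end effect."""
        return " -> ".join(
            [self.failure_mode, self.immediate, self.secondary, self.end_effect]
        )


# The bearing seizure example described in the paragraph above.
bearing_seizure = FailureEffects(
    failure_mode="bearing seizure",
    immediate="motor stoppage",
    secondary="damage to connected equipment",
    end_effect="production line shutdown",
)

print(bearing_seizure.chain())
```

Keeping the three levels as separate fields makes it harder to skip the end effect, which is usually the one that matters most when severity is rated.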

Risk Priority Number

Risk Priority Number (RPN) is a numerical rating used to prioritize attention and resources. It is calculated by multiplying three factors:

  • Severity: How serious are the consequences if this failure occurs?
  • Occurrence: How likely is this failure to happen?
  • Detection: How likely are we to detect this failure before it causes problems?

What is the FMEA process?

The FMEA process typically follows these steps:

  1. System definition and breakdown into components
  2. Identification of potential failure modes for each component
  3. Analysis of effects for each failure mode
  4. Assessment of severity, occurrence, and detection
  5. Calculation of Risk Priority Numbers
  6. Development of recommended actions
  7. Implementation of improvements
  8. Re-evaluation after changes

Figure: FMEA process steps (data and illustration: WorkTrek)

For example, in manufacturing a car brake system, FMEA might identify failure modes like brake fluid leakage, pad wear, or sensor malfunction. Each would be analyzed for its effects (like reduced braking power), likelihood of occurrence, and current detection methods. This analysis helps engineers design better systems and maintenance procedures.
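
To make the brake example more concrete, the sketch below ranks the three failure modes by RPN. The ratings are hypothetical placeholders on a 1-10 scale, chosen only to show the mechanics, and the helper name `rank_by_rpn` is an assumption for this example. Detection is rated so that a higher number means the failure is harder to detect.

```python
# Hypothetical 1-10 ratings for the brake-system failure modes named above.
# These numbers are placeholders for illustration, not engineering data.
brake_failure_modes = [
    # (failure mode, severity, occurrence, detection)
    ("brake fluid leakage", 9, 3, 4),
    ("pad wear", 6, 7, 2),
    ("sensor malfunction", 7, 2, 5),
]


def rank_by_rpn(rows):
    """Return (failure mode, RPN) pairs sorted from highest to lowest RPN."""
    scored = [(name, s * o * d) for name, s, o, d in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_by_rpn(brake_failure_modes)
for name, rpn in ranked:
    print(f"{name}: RPN {rpn}")
```

With these assumed ratings, brake fluid leakage comes out on top (9 × 3 × 4 = 108), so it would be the first candidate for design or maintenance changes.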

FMEA is particularly valuable because it:

  • Provides a systematic approach to reliability improvement
  • Helps prioritize improvement efforts based on risk
  • Creates a documented history of improvement efforts
  • Promotes cross-functional team collaboration
  • Reduces warranty claims and liability issues
  • Improves customer satisfaction through better reliability

Core Terminology and Concepts

Failure Mode

This refers to the specific way in which a piece of equipment or system can fail. For example, a pump might fail through bearing seizure, impeller wear, seal leakage, or shaft misalignment. Each represents a distinct failure mode that needs separate analysis and prevention strategies.

Effects Analysis

This describes the consequences of each failure mode, both immediate and long-term. Effects might include production stoppage, safety hazards, environmental impacts, or damage to other equipment. Understanding these effects helps prioritize maintenance efforts and allocate resources effectively.

Severity (S)

This measures how serious the consequences would be if a particular failure occurred. It is typically rated on a scale of 1-10, where 10 represents catastrophic failure effects such as safety risks or complete production stoppage, and 1 represents minor inconveniences with negligible impact.

Occurrence (O)

This rates how frequently a failure mode is likely to occur, typically on a 1-10 scale. Higher ratings indicate more frequent occurrences. This rating often draws from historical maintenance data and equipment reliability records.

Detection (D)

This evaluates how likely current maintenance practices and inspection methods are to detect a potential failure before it occurs. It is also rated on a 1-10 scale, but the scale is inverted: 10 means the failure is almost impossible to detect beforehand, and 1 means detection is almost certain.

Risk Priority Number (RPN)

This is calculated by multiplying the Severity, Occurrence, and Detection ratings (RPN = S × O × D). With each rating on a 1-10 scale, the RPN ranges from 1 to 1,000. The resulting number helps prioritize which failure modes need the most urgent attention: higher RPNs indicate more critical issues requiring immediate action.
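
The calculation itself is simple enough to sketch in a few lines. The function below is a minimal example, assuming integer ratings on the 1-10 scales described above; the function name and the decision to reject out-of-range values are choices made for this illustration.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Compute RPN = S x O x D, with each rating on a 1-10 scale."""
    ratings = (
        ("severity", severity),
        ("occurrence", occurrence),
        ("detection", detection),
    )
    for label, rating in ratings:
        # Reject anything that is not a whole number from 1 to 10.
        if not isinstance(rating, int) or not 1 <= rating <= 10:
            raise ValueError(f"{label} must be an integer from 1 to 10, got {rating!r}")
    return severity * occurrence * detection


# Illustrative ratings: serious consequences, infrequent, moderately hard to detect.
print(risk_priority_number(8, 3, 5))     # 8 * 3 * 5 = 120

# The extremes of the scale give the lowest and highest possible RPN.
print(risk_priority_number(1, 1, 1))     # 1
print(risk_priority_number(10, 10, 10))  # 1000
```

Validating the inputs up front catches a common worksheet mistake, such as entering 0 for a failure that has never been observed.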

Figure: Risk Priority Number (data and illustration: WorkTrek)

Critical Characteristics

These are the specific features, dimensions, or parameters that most directly influence the function and reliability of equipment. For a bearing, critical characteristics might include clearances, lubrication requirements, and operating temperature limits.

Control Methods

These are the current inspection, testing, or monitoring procedures in place to prevent failures or detect them before they occur. Examples include vibration analysis, oil analysis, thermal imaging, and regular visual inspections.

Recommended Actions

These are the specific steps suggested to reduce the likelihood of failure or improve the ability to detect potential failures before they happen. These might include implementing condition monitoring, changing maintenance frequencies, or modifying operating procedures.
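
One way to check whether such an action is worth taking is to re-rate the failure mode and compare the RPN before and after. The sketch below assumes a hypothetical pump bearing whose detection rating improves after condition monitoring is introduced; the `Rating` class and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rating:
    severity: int
    occurrence: int
    detection: int  # 10 = almost impossible to detect, 1 = detection almost certain

    @property
    def rpn(self) -> int:
        """Risk Priority Number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


# Hypothetical ratings before any action is taken.
before = Rating(severity=8, occurrence=5, detection=7)

# Assumed outcome of adding condition monitoring: failures are caught earlier,
# so the detection rating drops. Severity and occurrence are left unchanged.
after = replace(before, detection=3)

reduction = before.rpn - after.rpn
print(f"RPN before: {before.rpn}, after: {after.rpn}, reduction: {reduction}")
```

Here the RPN falls from 280 to 120. An action aimed at the likelihood of failure, such as a changed maintenance frequency, would lower the occurrence rating instead.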

Preventive Actions

These are steps taken to reduce the likelihood of a failure mode. This could include regular lubrication schedules, alignment checks, or component replacement at specified intervals.

Detection Methods

These are the specific techniques and tools used to identify potential failures before they occur. They include predictive technologies such as vibration analysis as well as simpler methods such as visual inspection or performance monitoring.

Current Process Controls

These are the existing maintenance procedures, inspection methods, and operational controls already in place to prevent or detect failures. Understanding current controls helps identify gaps in the maintenance strategy.
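
A simple way to surface such gaps is to list each failure mode next to its current controls and flag the ones with none. The sketch below uses a hypothetical register for a pump; the pairing of failure modes with controls is an assumption made for illustration, as is the helper name `find_control_gaps`.

```python
# Hypothetical register: failure mode -> current controls in place.
# The pairings below are illustrative, not taken from a real maintenance plan.
current_controls = {
    "bearing seizure": ["vibration analysis", "oil analysis"],
    "seal leakage": ["regular visual inspection"],
    "impeller wear": [],
    "shaft misalignment": [],
}


def find_control_gaps(controls):
    """Return failure modes with no current control, in alphabetical order."""
    return sorted(mode for mode, methods in controls.items() if not methods)


gaps = find_control_gaps(current_controls)
print(gaps)
```

In this example, impeller wear and shaft misalignment have no control at all, so they would typically receive a high detection rating until a control is added.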

How is FMEA used in maintenance organizations?

In maintenance organizations, FMEA serves several crucial functions:

  • Planning preventive maintenance schedules by identifying critical components and potential failure modes
  • Prioritizing maintenance resources based on risk assessment
  • Developing inspection and monitoring programs focused on the most likely or critical failure modes
  • Creating standard operating procedures that incorporate key findings from the FMEA process
  • Training maintenance personnel on essential characteristics of equipment and failure indicators
