FMEA analysis. Analysis of failure modes and their consequences

Powerful data analysis tool to improve reliability

William Goble for InTech

Failure Mode and Effects Analysis (FMEA) is a technique for assessing the reliability and safety of systems. It was developed in the United States in the 1960s as part of the Minuteman missile program, with the aim of detecting and eliminating technical problems in complex systems.

The technique is quite simple. The failure modes of each component of a particular system are listed in a table and documented, along with their expected consequences. The method is systematic, efficient and detailed, although it is sometimes considered time consuming and repetitive. It is effective because it examines every failure mode of every individual component. The following is an example of a table described in one of the original guidelines for the application of this method, namely MIL-HNBK-1629.

Column #1 contains the name of the tested component, column #2 - the identification number of the component (serial number or code). Together, the first two columns should uniquely identify the component under study. Column #3 describes the function of the component and column #4 describes the possible failure modes. For each type of failure, as a rule, one line is used. Column #5 is used to record the reason for the failure, where applicable. Column #6 describes the consequences of each failure. The rest of the columns may differ depending on which versions of the FMEA are being used.
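
The column layout described above can be sketched as a small record type. This is only an illustration: the class name, the field names and the two sample rows are hypothetical and are not taken from the handbook table.

```python
from dataclasses import dataclass


@dataclass
class FmeaRow:
    """One line of an FMEA worksheet: a single failure mode of a single component."""
    name: str          # column 1: component name
    code: str          # column 2: identification number or code
    function: str      # column 3: what the component does
    failure_mode: str  # column 4: how it can fail
    cause: str         # column 5: reason for the failure ("" where not applicable)
    effect: str        # column 6: consequence of the failure


def component_key(row: FmeaRow) -> tuple:
    """Columns 1 and 2 together identify the component under study."""
    return (row.name, row.code)


# Hypothetical rows: one component with two failure modes, one line per mode.
rows = [
    FmeaRow("Valve", "V-1", "Opens cooling flow", "Fails closed", "Jammed stem", "No cooling"),
    FmeaRow("Valve", "V-1", "Opens cooling flow", "Fails open", "Coil burnout", "False trip"),
]
components = {component_key(r) for r in rows}
print(len(rows), "failure modes for", len(components), "component")
```

Keeping one record per failure mode, rather than one per component, mirrors the "one line per failure type" rule of the worksheet.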

FMEA allows you to find problems

The FMEA method has grown in popularity over the years and has become an important part of many development processes, especially in the automotive industry, because it has proven useful and effective despite criticism. It is during an FMEA session that one often hears an "Oh no" when it becomes clear that the failure of some component has very serious consequences that had previously gone unnoticed. If the problem is severe enough, corrective actions are also recorded, and the design is improved to detect, avoid or manage the problem.

Application in various industries

Several variants of the FMEA technique are used in various industries. In particular, FMEA is used to identify hazards that need to be considered during the design of petrochemical plants. Here it fits well with another well-known technique, the Hazard and Operability Study (HAZOP). In fact, the two techniques are almost the same: both are variations on tabular lists of system components. The main difference is that HAZOP uses keywords to help the team identify abnormal conditions, while FMEA is based on known equipment failure modes.

A variant of the FMEA technique used to analyze control systems is Control Hazards and Operability Analysis (CHAZOP). It lists the known failure modes of control system components, such as basic process control systems, valve and actuator combinations, or various transmitters, and records the consequences of these failures. Corrective actions are also described where a failure leads to serious problems.

FMEA example

This figure schematically shows a simplified "reactor" with an emergency cooling system. The system consists of a gravity water tank, a control valve, a cooling jacket around the reactor, a switch with a temperature sensor, and a power source. During normal operation, the switch is in the active (conductive) position because the reactor temperature is below the danger zone. Electric current flows from the source through the valve and the switch, and keeps the valve in the closed position. If the temperature inside the reactor becomes too high, the temperature sensitive switch opens the circuit and the control valve opens. The cooling water flows from the reservoir, through the valve, then through the cooling jacket and out through the jacket drain. This water flow cools the reactor, lowering its temperature.

The FMEA procedure requires the creation of a table that lists all failure modes for each of the system components. The "reactor" table below is an example of the use of the FMEA technique, which identified critical components that should be checked for the need for corrective action.

The designer of the system, a simple reactor in our case, may consider installing two temperature-sensitive switches in series. An IEC 61508 compliant smart transmitter with automatic diagnostics and an output signal could be used instead; a certified transmitter greatly simplifies the verification needed to locate faults. A second drain can be installed alongside the first, so that a blockage in one of them does not lead to a critical failure of the system. A level gauge in the tank can warn of an insufficient water level. Many other design changes and improvements are possible to prevent failures.

Part II

Evolution of the FMEA Method

The FMEA method was expanded in the 1970s to include semi-quantitative ratings (a number from 1 to 10) of severity, frequency of occurrence, and detection. Four columns were added to the table: three for the ratings and a fourth for the risk priority number (RPN), obtained by multiplying the three ratings. This extended method is called Failure Modes, Effects and Criticality Analysis, or FMECA. An example of a table with FMECA analysis results for a "simple reactor" is shown below.
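
The risk priority number described above is a plain product of three ratings. A minimal sketch, assuming each rating is an integer from 1 to 10; the function name and the sample ratings are illustrative only:

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Multiply the three 1-10 ratings into a risk priority number (RPN)."""
    ratings = (("severity", severity), ("occurrence", occurrence), ("detection", detection))
    for label, rating in ratings:
        if not 1 <= rating <= 10:
            raise ValueError(f"{label} rating must be between 1 and 10, got {rating}")
    return severity * occurrence * detection


# Hypothetical ratings for a single failure mode.
print(risk_priority_number(8, 2, 1))
```

With this scale the RPN ranges from 1 (all ratings at 1) to 1000 (all ratings at 10).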

FMEA techniques have continued to evolve. Some of the later variations can be used not only for designs but also for technological processes. Instead of a list of components, a list of process steps is created. Each step is accompanied by a description of all the ways the process can go wrong, which corresponds to the description of the possible failures of a system component. In all other respects the variations are the same. In the literature these methods are sometimes referred to as "design FMEA" (DFMEA) and "process FMEA" (PFMEA). Process FMEA has proven effective in detecting unanticipated problems.

Analysis of failures, their consequences and diagnostics

The continuously evolving FMEA method also gave rise to the "Failure Modes, Effects and Diagnostic Analysis" (FMEDA) method. In the late 1980s there was a need to model the automatic diagnostics of smart devices. A new safety controller architecture had appeared on the market, "one out of two" with a diagnostic switch (1oo2D), competing with the then common triple modular redundancy architecture, "two out of three" (2oo3). Because the safety and availability of the new architecture depended heavily on how the diagnostics were implemented, quantifying diagnostics became important. FMEDA does this by adding columns that show the failure rate of each failure mode and a column with the probability of detection for each line of the analysis.

As with FMEA, the FMEDA technique lists all components, their failure modes and the consequences of those failures. Columns are added to the table for the probability that diagnostics will detect a particular failure and for a quantitative estimate of how often that failure occurs. When the FMEDA is complete, a "diagnostic coverage" factor is calculated as the failure-rate-weighted average of the diagnostic coverage of all components.
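
The weighted average mentioned above can be written out in a few lines. This is a sketch under stated assumptions: each failure mode is reduced to a (failure rate, detection probability) pair, the function name is hypothetical, and the numbers are chosen only to make the arithmetic easy to follow.

```python
def diagnostic_coverage(modes):
    """Failure-rate-weighted average of per-mode detection probabilities.

    modes: iterable of (failure_rate, detection_probability) pairs.
    """
    modes = list(modes)
    total_rate = sum(rate for rate, _ in modes)
    if total_rate <= 0:
        raise ValueError("total failure rate must be positive")
    detected_rate = sum(rate * p_detect for rate, p_detect in modes)
    return detected_rate / total_rate


# Hypothetical failure modes: (failure rate, probability that diagnostics detect it).
example = [(4.0, 1.0), (4.0, 0.5), (2.0, 0.0)]
print(diagnostic_coverage(example))
```

Weighting by failure rate means that a frequently occurring but poorly diagnosed failure mode lowers the coverage more than a rare one does.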

Failure rates and failure distributions must be available for each component if an FMEDA analysis is to be carried out. Therefore, a component database is required, as seen in the FMEDA Process figure (above).

The component database must take into account the key variables that affect the component failure rate, including environmental factors. Fortunately, there are standards that characterize the environment in process industries, which makes it possible to create appropriate profiles. The table below shows the "Environmental Profiles for Process Industries" taken from the second edition of the Electrical and Mechanical Component Reliability Handbook (www.exida.com).

FMEDA Field Failure Data Analysis

Design analysis can be used to create theoretical failure databases. However, accurate information can only be obtained if the component failure rates, as well as the failure modes, are based on data collected from a study of real field equipment. Any unexplained difference between component failure rates calculated from field data and those from FMEDA should be investigated. Sometimes the field data collection process needs to be improved. Sometimes it may be necessary to update the component database with new failure modes and component types.

Fortunately, some functional safety certification organizations examine field equipment failure data when evaluating most products, making it a valuable source of real-life failure data. Some projects also collect data on field failures with the help of end customers. After more than 10 billion hours (!) of operation of various equipment, which yielded a huge amount of data on failure modes and rates collected in dozens of studies, it is difficult to overestimate the value of the FMEDA component base, especially in the aspect of functional safety. The resulting FMEDA product data is typically used for safety integrity level verification calculations.

The FMEDA technique can also be used to evaluate the effectiveness of proof tests of safety functions, in order to determine whether a particular design meets a given safety integrity level. Any particular proof test will identify some potentially dangerous failures, but not all. FMEDA makes it possible to determine which failures are and are not identified by proof testing. This is done by adding another column that gives the probability of detecting each component failure mode during the proof test. With this detailed, systematic method it becomes apparent that some potentially dangerous failure modes are not detected during proof testing.
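
One way to use the extra proof-test column is to list the failure modes that the test leaves partly or fully undetected. The sketch below assumes each mode carries a dangerous failure rate and a proof-test detection probability; the function name, the failure mode names and the numbers are hypothetical.

```python
def undetected_after_proof_test(modes):
    """Return failure modes that a proof test leaves partly or fully undetected.

    modes: iterable of (name, dangerous_failure_rate, detection_probability).
    Result: list of (name, residual_rate), largest residual first.
    """
    residuals = []
    for name, rate, p_detect in modes:
        if not 0.0 <= p_detect <= 1.0:
            raise ValueError(f"detection probability for {name!r} must be in [0, 1]")
        residual = rate * (1.0 - p_detect)  # part of the rate the test does not reveal
        if residual > 0:
            residuals.append((name, residual))
    return sorted(residuals, key=lambda item: item[1], reverse=True)


# Hypothetical failure modes and numbers, for illustration only.
modes = [
    ("valve stuck closed", 6.0, 1.0),
    ("drain partially blocked", 2.0, 0.0),
    ("switch contacts welded", 2.0, 0.5),
]
missed = undetected_after_proof_test(modes)
print(missed)
```

Sorting by residual rate puts the failure modes that most weaken the proof test at the top of the list.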

The other side of the coin

The main problem with the FMEA method (or any of its variations) is that it takes a great deal of time. Many analysts complain that the process is boring and lengthy. A strict and focused facilitator is needed to keep the analysis moving forward. It must always be remembered that problem solving is not part of the analysis: problems are solved after the analysis is completed. If these rules are followed, fairly rapid improvements in safety and reliability will result.

Dr. William Goble is Chief Engineer and Director of the Functional Safety Certification Group at exida, an accredited certification body. He has over 40 years of experience in electronics, software development and safety systems, and holds a Ph.D. in quantitative reliability and safety analysis of automation systems.

Process completeness tests.

Design completeness tests.

These tests are carried out on the first prototypes of the product. Their purpose is to show that the design of the product satisfies the requirements for reliability.

It does not matter how the prototype was built and what efforts went into its debugging. If the required level of product reliability is not achieved, the design must be improved. Testing continues until the product meets all specified requirements.

During these tests, failures in the initial period of product operation are recorded. These data are used to achieve full consistency between the product design and the processes required for its manufacture, and to determine the amount of testing needed to reach the required reliability when the product is delivered to consumers.

Tests are also carried out on the first production samples. These samples are operated for a given period (the run-in period). Their performance is carefully monitored and the decreasing failure rate is measured. After the run-in period, operating data are collected to measure and verify the performance of the product and to compare it with the results obtained during design completeness testing. Observations made during these tests make it possible to set the length of the product's run-in period.

Durability tests. During these tests, wear failures of product elements are recorded and their distribution is constructed. The data obtained are used to eliminate the causes of those failures that lead to an unacceptable reduction in the expected life of the product. Durability tests are carried out on a number of samples of the product. During these tests it is necessary to determine the point of transition from a constant to an increasing failure rate and to construct a distribution for each observed failure mode.

One of the effective means of improving the quality of technical objects is the analysis of the types and consequences of potential failures (Potential Failure Mode and Effects Analysis, FMEA). The analysis is carried out when a design or a technological process is being developed (the corresponding stages of the product life cycle are development and preparation for production), as well as when products already in production are being refined and improved. It is advisable to divide this analysis into two stages: a separate analysis at the design development stage and at the technological process development stage.

The standard (GOST R 51814.2-2001. Quality systems in the automotive industry. A method for analyzing the types and consequences of potential defects) also provides for the possibility of using the FMEA method in the development and analysis of other processes, such as sales, service, and marketing processes.



The main objectives of the analysis of the types and consequences of potential failures:

  • identification of critical failures that endanger human life and the environment, and development of measures to reduce the likelihood of their occurrence and the severity of possible consequences;
  • identification and elimination of the causes of any possible failures of the product in order to improve its reliability.

During the analysis, the following tasks are solved:

  • identification of possible failures of an object (product or process) and its elements, taking into account the experience of manufacturing and operating similar objects;
  • study of the causes of failures and quantification of the frequency of their occurrence;
  • classification of failures according to the severity of their consequences and quantitative assessment of the significance of these consequences;
  • evaluation of the sufficiency of monitoring and diagnostic tools: the possibility of detecting a failure and of preventing it when these tools are used in practice;
  • development of proposals for changing the design and manufacturing technology in order to reduce the likelihood of failures and their criticality;
  • development of rules for personnel behavior in the event of critical failures;
  • analysis of possible personnel errors.

To conduct the analysis, a group of specialists is formed who have practical experience and a high professional level in designing similar objects and who know the processes of manufacturing components and assembling the object, the technology for monitoring and diagnosing its state, and the methods of maintenance and repair. The brainstorming method is used. At the stage of qualitative analysis, a structural diagram of the object is developed: the object is considered as a system consisting of subsystems of various levels, which in turn consist of individual elements.

Possible types of failures and their consequences are analyzed from the bottom up, i.e. from elements to subsystems, and then to the object as a whole. The analysis takes into account that each failure can have several causes and several different consequences.

At the stage of quantitative analysis, the criticality of each failure is assessed by experts, in points, taking into account the probability of its occurrence, the probability of its detection and the severity of its possible consequences. The failure risk R (risk priority number) can be found using the formula:

R = O × D × S,

where O is determined in points depending on the probability of failure, D on the probability of detecting the failure, and S on the severity of the consequences of the failure.

The value of R found for each element, each cause and each possible consequence is compared with a critical value. The critical value is set in advance and is usually chosen between 100 and 125. Lowering the critical value corresponds to developing more reliable products and processes.

For each failure whose value of R exceeds the critical one, measures are developed to reduce it by improving the design and manufacturing technology. For the new version of the object, R is recalculated. If necessary, the refinement procedure is repeated.

During the development and production of various equipment, defects periodically occur. What is the result? The manufacturer incurs significant losses associated with additional tests, checks and design changes. However, this is not an uncontrolled process. You can assess possible threats and vulnerabilities, as well as analyze potential defects that may interfere with the operation of equipment, using FMEA analysis.

This method of analysis was first used in the USA in 1949. At that time it was used exclusively in the military industry when designing new weapons. By the 1970s, however, the ideas of FMEA had reached large corporations. Ford (at that time the largest car manufacturer) was one of the first to introduce the technique.

Today, the FMEA analysis method is used by almost all machine-building enterprises. The main principles of risk management and failure cause analysis are described in GOST R 51901.12-2007.

Definition and essence of the method

FMEA is an acronym for Failure Mode and Effects Analysis. It is a technique for analyzing the types and consequences of possible failures (defects that cause an object to lose the ability to perform its functions). Why is this method good? It gives the company the opportunity to anticipate possible problems and malfunctions in advance. During the analysis, the manufacturer receives the following information:

  • a list of potential defects and malfunctions;
  • analysis of their causes, severity and consequences;
  • risk mitigation recommendations in order of priority;
  • overall assessment of the safety and reliability of products and the system as a whole.

The data obtained as a result of the analysis is documented. All detected and studied failures are classified according to the degree of criticality, ease of detection, maintainability and frequency of occurrence. The main task is to identify problems before they arise and begin to affect the company's customers.

Scope of FMEA analysis

This method of research is actively used in almost all technical fields, such as:

  • automotive and shipbuilding;
  • aviation and space industry;
  • chemical and oil refining;
  • construction;
  • manufacture of industrial equipment and mechanisms.

In recent years, this method of risk assessment has been increasingly used in non-manufacturing areas - for example, in management and marketing.

FMEA can be carried out at all stages of the product life cycle. However, most often the analysis is performed during the development and modification of products, as well as when using existing designs in a new environment.

Types

FMEA is used to study not only various mechanisms and devices, but also the processes of company management, production and product operation. In each case, the method has its own specific features. The object of analysis can be:

  • technical systems;
  • designs and products;
  • processes of production, assembly, installation and maintenance of products.

When examining mechanisms, the risk of non-compliance with standards, the occurrence of malfunctions in the process of operation, as well as breakdowns and reduced service life, are determined. This takes into account the properties of materials, the geometry of the structure, its characteristics, interfaces of interaction with other systems.

FMEA analysis of a process makes it possible to detect inconsistencies that affect the quality and safety of products. Customer satisfaction and environmental risks are also taken into account. Here, problems can stem from people (in particular, employees of the enterprise), production technology, the raw materials and equipment used, measuring systems and environmental impact.

The research uses different approaches:

  • "top down" (from large systems to small details and elements);
  • "bottom up" (from individual products and their parts to larger systems).

The choice depends on the purpose of the analysis. It can be part of a comprehensive study in addition to other methods or used as a standalone tool.

Stages of the analysis

Regardless of specific tasks, FMEA analysis of the causes and consequences of failures is carried out according to a universal algorithm. Let's consider this process in more detail.

Preparation of the expert group

First of all, you need to decide who will conduct the study. Teamwork is one of the key principles of FMEA. Only such a format ensures the quality and objectivity of the examination, and also creates space for non-standard ideas. As a rule, the team consists of 5-9 people. It includes:

  • Project Manager;
  • process engineer performing the development of the technological process;
  • design engineer;
  • production representative or;
  • member of the customer service department.

If necessary, qualified specialists from outside organizations can be involved in the analysis of structures and processes. Possible problems and ways to solve them are discussed in a series of meetings lasting up to 1.5 hours. The meetings can be held with the full team or with only part of it (if certain experts are not needed to resolve the current issues).

Project study

To conduct an FMEA analysis, it is necessary to clearly identify the object of study and its boundaries. For a technological process, the initial and final events should be defined. For equipment and structures everything is simpler: they can be considered as complex systems, or the analysis can focus on specific mechanisms and elements. Nonconformities can be considered in terms of the needs of the consumer, the stage of the product's life cycle, the geography of use, and so on.

At this stage, the members of the expert group should receive a detailed description of the object, its functions and principles of operation. Explanations should be accessible and understandable to all team members. Usually presentations are given at the first session, and the experts study the instructions for manufacturing and operating the structures, planning parameters, regulatory documentation and drawings.

Listing potential defects

After the theoretical part, the team proceeds to evaluate possible failures. A complete list is compiled of all the inconsistencies and defects that may occur in the object. They can be associated with the breakdown of individual elements or with their incorrect functioning (insufficient power, inaccuracy, low performance). When analyzing processes, it is necessary to list the specific technological operations during which there is a risk of errors, for example non-execution or incorrect execution.

Description of causes and consequences

The next step is an in-depth analysis of such situations. The main task is to understand what can lead to the occurrence of certain errors, as well as how the detected defects can affect employees, consumers and the company as a whole.

The team reviews operation descriptions, approved performance requirements, and statistical reports to determine probable causes of defects. The FMEA protocol can also indicate risk factors that the company can correct.

At the same time, the team considers what can be done to eliminate the chance of defects, suggests control methods and the optimal frequency of inspections.

Expert assessments

  1. S - Severity / Significance. Determines how severe the consequences of the defect are for the consumer. It is rated on a 10-point scale (1 - practically no effect, 10 - catastrophic, where the manufacturer or supplier may face criminal liability).
  2. O - Occurrence / Probability. Indicates how often a certain violation occurs and whether the situation can recur (1 - very unlikely, 10 - failure is observed in more than 10% of cases).
  3. D - Detection / Detectability. Rates the control methods: whether they will help to identify a nonconformity in time (1 - almost guaranteed to be detected, 10 - a hidden defect that cannot be detected before the consequences occur).

Based on these ratings, a risk priority number (RPN) is determined for each failure mode. This is a generalized indicator that shows which breakdowns and violations pose the greatest threat to the company and its customers. It is calculated by the formula:

RPN = S × O × D

The higher the RPN, the more dangerous the violation and the more destructive its consequences. First of all, it is necessary to eliminate or reduce the risk of defects and malfunctions whose RPN exceeds 100-125. Violations with a medium threat level score from 40 to 100 points, and an RPN of less than 40 indicates that the failure is insignificant, occurs rarely and can be detected without difficulty.
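
The thresholds above can be turned into a simple banding function. This is a sketch, not a prescribed rule: the function name and band labels are hypothetical, the critical value is left as a parameter because the text gives a range of 100 to 125, and treating a score exactly equal to the critical value as "medium" is an assumption.

```python
def classify_rpn(rpn: int, critical: int = 100) -> str:
    """Map a risk priority number to a priority band using the thresholds in the text."""
    if not 1 <= rpn <= 1000:
        raise ValueError("the product of three 1-10 ratings must lie between 1 and 1000")
    if rpn > critical:
        return "high"    # above the critical value: act first
    if rpn >= 40:
        return "medium"  # medium threat level
    return "low"         # insignificant, rare, easy to detect


for value in (252, 72, 16):
    print(value, classify_rpn(value))
```

Lowering `critical` moves more failure modes into the "high" band, which corresponds to a stricter reliability target.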

After assessing the deviations and their consequences, the FMEA working group determines the priority areas for work. The first priority is to develop a corrective action plan for the bottlenecks: the elements and operations with the highest RPNs. To reduce the threat level, one or more of the parameters must be influenced:

  • eliminate the original cause of the failure by changing the design or process (rating O);
  • prevent the occurrence of a defect using statistical control methods (rating O);
  • mitigate negative consequences for buyers and customers, for example by reducing the price of defective products (rating S);
  • introduce new tools for early detection of faults and subsequent repair (rating D).

In order for the enterprise to immediately start implementing the recommendations, the FMEA team simultaneously develops a plan for their implementation, indicating the sequence and timing of each type of work. The same document contains information about the executors and those responsible for carrying out corrective measures, sources of funding.

Summarizing

The final stage is the preparation of a report for company executives. What sections should it contain?

  1. Overview and detailed notes on the progress of the study.
  2. Potential causes of defects in the production / operation of equipment and the performance of technological operations.
  3. List of likely consequences for employees and consumers - separately for each violation.
  4. Assessment of the level of risk (how dangerous are possible violations, which of them can lead to serious consequences).
  5. List of recommendations for the maintenance service, designers and planners.
  6. Schedule and reports on corrective actions based on the results of the analysis.
  7. A list of potential threats and consequences that were eliminated by changing the project.

The report is accompanied by all the tables, graphs and charts that help to visualize the main problems. The working group should also provide the scales it used to rate nonconformities by significance, frequency and probability of detection, with a detailed explanation of what each score means.

How to complete the FMEA protocol?

During the study, all data must be recorded in a special document, the "FMEA cause and consequence analysis protocol". It is a universal table in which all information about possible defects is entered. This form is suitable for the study of any systems, objects and processes in any industry.

The first part is completed based on personal observations of team members, study of enterprise statistics, work instructions and other documentation. The main task is to understand what can interfere with the operation of the mechanism or the performance of any task. At meetings, the working group must assess the consequences of these violations, answer how dangerous they are for workers and consumers, and what is the likelihood that a defect will be detected even at the production stage.

The second part of the protocol describes options for preventing and eliminating nonconformities, together with the list of measures developed by the FMEA team. A separate column is provided for assigning those responsible for particular tasks, and after the design or the organization of the business process has been adjusted, the manager records the work performed in the protocol. The final stage is re-rating, taking all the changes into account. Comparing the original and final indicators shows how effective the chosen strategy was.

A separate protocol is created for each object. At the very top is the name of the document - "Analysis of the types and consequences of potential defects." A little lower is the equipment model or the name of the process, the dates of the previous and next (according to the schedule) checks, the current date, as well as the signatures of all members of the working group and its leader.

An example of an FMEA analysis ("Tulinovsky Instrument-Making Plant")

Let's consider how potential risks are assessed, using the experience of a large Russian industrial company. At one time, the management of the Tulinovsky Instrument-Making Plant (JSC TVES) faced a problem with the calibration of electronic scales. The enterprise was producing a large percentage of incorrectly functioning equipment, which the quality control department had to send back.

After studying the sequence of steps and requirements for the calibration procedure, the FMEA team identified four sub-processes that had the greatest impact on the quality and accuracy of the calibration.

  • moving the device and placing it on the bench;
  • checking the position with a level (the scales must be perfectly horizontal);
  • placing weights on the platform;
  • registration of frequency signals.

What types of failures and malfunctions were recorded during these operations? The working group identified the main risks and analyzed their causes and possible consequences. Based on expert assessments, RPN values were calculated, which made it possible to identify the main problems: the lack of clear control over the performance of the work and over the condition of the equipment (bench, weights).

| Stage | Failure scenario | Causes | Consequences | S | O | D | RPN |
|---|---|---|---|---|---|---|---|
| Moving and installing scales on the stand. | Risk of the scale falling due to the heavy weight of the structure. | There is no specialized transport. | Device damage or failure. | 8 | 2 | 1 | 16 |
| Checking the horizontal position by level (the device must stand absolutely level). | Incorrect graduation. | The bench top was not level. | | 6 | 3 | 1 | 18 |
| | | Employees do not follow work instructions. | | 6 | 4 | 3 | 72 |
| Arrangement of weights at the fixed points of the platform. | Using weights of the wrong size. | Operation of old, worn-out weights. | Quality control (OTK) rejects the product due to metrological discrepancy. | 9 | 2 | 3 | 54 |
| | | Lack of control over the placement process. | | 6 | 7 | 7 | 252 |
| | The stand mechanism or sensors are out of order. | The combs of the movable frame are skewed. | From constant friction, weights wear out quickly. | 6 | 2 | 8 | 96 |
| | | The rope broke. | Suspension of production. | 10 | 1 | 1 | 10 |
| | | The gear motor has failed. | | 2 | 1 | 1 | 2 |
| | | The schedule of scheduled inspections and repairs is not observed. | | 6 | 1 | 2 | 12 |
| Registration of frequency signals of the sensor. Programming. | Loss of data that was entered into the storage device. | Power outages. | Recalibration is needed. | 4 | 2 | 3 | 24 |
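The last column of the table appears to follow the usual risk priority number convention, in which the three scores (severity S, occurrence O, detection D) are multiplied. The following is a minimal Python sketch under that assumption, using rows copied from the table; the names `FailureRow`, `rank_by_recorded_rpn` and `inconsistent_rows` are illustrative and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRow:
    cause: str
    severity: int      # S
    occurrence: int    # O
    detection: int     # D
    recorded_rpn: int  # value printed in the last column of the table

    @property
    def rpn(self) -> int:
        # Assumed convention: the priority number is the product S * O * D.
        return self.severity * self.occurrence * self.detection


ROWS = [
    FailureRow("There is no specialized transport", 8, 2, 1, 16),
    FailureRow("Employees do not follow work instructions", 6, 4, 3, 72),
    FailureRow("Operation of old, worn-out weights", 9, 2, 3, 54),
    FailureRow("Lack of control over the placement process", 6, 7, 7, 252),
    FailureRow("The combs of the movable frame are skewed", 6, 2, 8, 96),
]


def rank_by_recorded_rpn(rows):
    """Return the rows sorted from the highest to the lowest recorded value."""
    return sorted(rows, key=lambda r: r.recorded_rpn, reverse=True)


def inconsistent_rows(rows):
    """Return the rows whose recorded value differs from S * O * D."""
    return [r for r in rows if r.rpn != r.recorded_rpn]


if __name__ == "__main__":
    for row in rank_by_recorded_rpn(ROWS):
        print(f"{row.recorded_rpn:>4}  {row.cause}")
    for row in inconsistent_rows(ROWS):
        print(f"check: {row.cause}: recorded {row.recorded_rpn}, product {row.rpn}")
```

Whichever value is used, the placement-control row ranks first, which agrees with the working group's conclusion. The consistency check flags that same row, because the printed scores 6, 7 and 7 multiply to 294 rather than the recorded 252, so one of the four numbers may have been transcribed incorrectly.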

To eliminate risk factors, recommendations were developed for additional training of employees, modification of the bench top and purchase of a special roller container for transporting scales. Buying an uninterruptible power supply solved the problem with data loss. And to prevent future problems with calibration, the working group proposed new schedules for maintenance and scheduled calibration of weights - inspections began to be carried out more often, due to which damage and failures can be detected much earlier.

FEDERAL AGENCY FOR TECHNICAL REGULATION AND METROLOGY

NATIONAL STANDARD OF THE RUSSIAN FEDERATION

GOST R 51901.12-2007 (IEC 60812:2006)

Risk management

METHOD OF ANALYSIS OF FAILURE TYPES AND EFFECTS

Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)

Official edition

Foreword

The goals and principles of standardization in the Russian Federation are established by Federal Law No. 184-FZ of December 27, 2002 “On Technical Regulation”, and the rules for the application of national standards of the Russian Federation by GOST R 1.0-2004 “Standardization in the Russian Federation. Basic Provisions”.

Information about the standard

1 PREPARED BY the Open Joint Stock Company "Research Center for Control and Diagnostics of Technical Systems" (OJSC "NITs KD") and the Technical Committee for Standardization TC 10 "Advanced Production Technologies, Management and Risk Assessment" on the basis of their own authentic translation of the standard specified in paragraph 4

2 INTRODUCED by the Department for Development, Information Support and Accreditation of the Federal Agency for Technical Regulation and Metrology

3 APPROVED AND INTRODUCED BY Order No. 572-st of December 27, 2007 of the Federal Agency for Technical Regulation and Metrology

4 This standard is modified in relation to the international standard IEC 60812:2006 “Methods for analyzing the reliability of systems. Failure mode and effects analysis (FMEA) method” (IEC 60812:2006 “Analysis techniques for system reliability - Procedure for failure mode and effects analysis (FMEA)”) by introducing technical deviations, the explanation of which is given in the introduction to this standard.

The name of this standard has been changed from the name of the specified international standard to bring it in line with GOST R 1.5-2004 (subsection 3.5)

5 INTRODUCED FOR THE FIRST TIME

Information about changes to this standard is published in the annually published information index "National Standards", and the text of changes and amendments in the monthly published information indexes "National Standards". In case of revision (replacement) or cancellation of this standard, a corresponding notice will be published in the monthly published information index "National Standards". The relevant information, notices and texts are also posted in the public information system, on the official website of the Federal Agency for Technical Regulation and Metrology on the Internet.

© Standartinform, 2008

This standard cannot be fully or partially reproduced, replicated and distributed as an official publication without the permission of the Federal Agency for Technical Regulation and Metrology

1 Scope
2 Normative references
3 Terms and definitions
4 Fundamentals
5 Failure modes and effects analysis
6 Other studies
7 Applications
Annex A (informative) Brief description of FMEA and FMECA procedures
Annex B (informative) Study examples
Annex C (informative) List of English abbreviations used in the standard
Bibliography

Introduction

In contrast to the applicable International Standard, this standard does not include references to IEC 60050-191:1990 “International Electrotechnical Vocabulary. Chapter 191. Reliability and quality of services”, which it is inappropriate to include in the national standard due to the lack of an accepted harmonized national standard. Accordingly, the content of section 3 has been changed. In addition, the standard includes an additional Annex C containing a list of the English abbreviations used. References to national standards and the supplementary Annex C are in italics.

GOST R 51901.12-2007 (IEC 60812:2006)

NATIONAL STANDARD OF THE RUSSIAN FEDERATION

Risk management

METHOD OF ANALYSIS OF FAILURE TYPES AND EFFECTS

Risk management. Procedure for failure mode and effects analysis

Introduction date - 2008-09-01

1 Scope

This standard specifies methods for failure mode and effects analysis (FMEA) and for failure mode, effects and criticality analysis (FMECA), and gives recommendations on their application to achieve the stated goals by:

  • performing the necessary stages of analysis;
  • identifying the relevant terms, assumptions, criticality indicators and failure modes;
  • defining the main principles of analysis;
  • providing examples of the required worksheets or other tabular forms.

All general FMEA requirements given in this standard apply to FMECA, because the latter is an extension of FMEA.

2 Normative references

This standard uses normative references to the following standards:

GOST R 51901.3-2007 (IEC 60300-2:2004) Risk management. Reliability Management Guide (IEC 60300-2:2004 "Reliability Management - Reliability Management Guide", MOD)

GOST R 51901.5-2005 (IEC 60300-3-1:2003) Risk management. Guidelines for the application of reliability analysis methods (IEC 60300-3-1:2003 "Reliability management - Part 3-1 - Application guide - Reliability analysis methods - Methodology guide", MOD)

GOST R 51901.13-2005 (IEC 61025:1990) Risk management. Fault tree analysis (IEC 61025:1990 "Fault tree analysis (FTA)", MOD)

GOST R 51901.14-2005 (IEC 61078:1991) Risk management. Reliability Block Diagram Method (IEC 61078:2006 "Methods for Reliability Analysis - Reliability Block Diagram and Boolean Methods", MOD)

GOST R 51901.15-2005 (IEC 61165:1995) Risk management. Application of Markov methods (IEC 61165:1995 "Application of Markov methods", MOD)

Note - When using this standard, it is advisable to check the validity of the reference standards in the public information system, on the official website of the Federal Agency for Technical Regulation and Metrology on the Internet, or according to the annually published information index "National Standards", as published on January 1 of the current year, and according to the corresponding monthly information indexes published in the current year. If a reference standard has been replaced (modified), then when using this standard you should be guided by the replacing (modified) standard. If a referenced standard has been canceled without replacement, the provision in which the reference to it is given applies to the extent that it is not affected by that reference.

3 Terms and definitions

In this standard, the following terms are used with their respective definitions:

3.1 item: any part, element, device, subsystem, functional unit, apparatus or system that can be considered on its own

Notes

1 An item may consist of hardware, software or a combination thereof and may also, in particular cases, include technical staff.

2 A number of items, such as a population or a sample of them, can itself be considered as an item.

3 A process can also be considered as an item that performs a given function and for which an FMEA or FMECA is performed. Typically, a hardware FMEA does not cover people and their interaction with hardware or software, while a process FMEA usually includes analysis of people's actions.

3.2 failure

3.3 fault: state of an item in which it is unable to perform a required function, except for such inability due to maintenance or other planned activities, or due to a lack of external resources

Notes

1 A fault is often the result of a failure of the item, but may exist without one.

2 In this standard, the term “fault” is used alongside the term “failure” for historical reasons.

3.4 failure effect

3.5 failure mode

3.6 failure criticality this refusal and reduce the severity of its consequences.

3.7 system

Notes

1 With regard to reliability, the system should have:

a) defined goals, presented in the form of requirements for its functions;

b) specified operating conditions;

c) defined boundaries.

2 The structure of the system is hierarchical.

3.8 failure severity: the significance or severity of the consequences of a failure mode for the operation of the item, the environment and the operator, related to the established boundaries of the item under investigation

4 Fundamentals

4.1 Introduction

Failure Modes and Effects Analysis (FMEA) is a systematic method of system analysis for identifying potential failure modes, their causes and consequences, as well as the impact of failures on the functioning of the system (the system as a whole or its components and processes). The term "system" is used to describe hardware, software (with their interaction) or a process. It is recommended that the analysis be carried out in the early stages of development, when it is most cost effective to eliminate or reduce the consequences and number of failure modes. The analysis can be started as soon as the system can be presented in the form of a functional block diagram showing its elements.

For more details see.

The timing of the FMEA is very important. If the analysis is carried out early enough in system development, it is more cost effective to introduce design changes that eliminate the shortcomings found during the FMEA. Therefore, it is important that the goals and objectives of the FMEA are described in the plan and timeline of the development process. Thus, FMEA is an iterative process carried out concurrently with the design process.

FMEA is applicable at various levels of system decomposition - from the highest level of the system (the system as a whole) to the functions of individual components or software commands. FMEAs are constantly iterated and updated as the system design improves and changes during development. Design changes require changes to the relevant parts of the FMEA.

In general, FMEA is the result of the work of a team of qualified specialists capable of recognizing and evaluating the significance and consequences of various types of potential design and process inconsistencies that could lead to product failures. Teamwork stimulates the thinking process and ensures the required quality of expertise.

FMEA is a method for identifying the severity of the consequences of potential failure modes and for providing risk mitigation measures. In some cases FMEA also includes an assessment of the probability of occurrence of failure modes, which extends the analysis.

Before applying the FMEA, a hierarchical decomposition of the system (hardware with software, or process) into basic elements must be carried out. It is useful to use simple block diagrams illustrating the decomposition (see GOST R 51901.14). The analysis begins with the elements of the lowest level of the system. The consequence of a failure at a lower level can cause a failure of the item at a higher level. The analysis is carried out from the bottom up, until the final consequences for the system as a whole are determined. This process is shown in Figure 1.

FMECA (Failure Modes, Effects, and Criticality Analysis) extends FMEA to include methods for ranking the severity of failure modes, allowing prioritization of countermeasures. The combination of the severity of the consequences and the frequency of occurrence of failures is a measure called criticality.
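As an illustration of how severity and frequency can be combined into a criticality measure, the following Python sketch ranks failure modes on two ordinal scales. The scale labels, the product rule and the example failure modes are all assumptions made for this sketch; the text above does not prescribe a particular scale or formula.

```python
# Hypothetical ordinal scales; the choice of scales is left to the analyst.
SEVERITY = {"insignificant": 1, "marginal": 2, "critical": 3, "catastrophic": 4}
FREQUENCY = {"improbable": 1, "remote": 2, "occasional": 3, "probable": 4, "frequent": 5}


def criticality(severity: str, frequency: str) -> int:
    """Combine the two ranks into one ordinal score (assumed rule: product)."""
    return SEVERITY[severity] * FREQUENCY[frequency]


def prioritize(failure_modes):
    """Sort (name, severity, frequency) tuples, most critical first.

    Ties are broken in favour of the more severe failure mode.
    """
    return sorted(
        failure_modes,
        key=lambda fm: (criticality(fm[1], fm[2]), SEVERITY[fm[1]]),
        reverse=True,
    )


# Invented failure modes, used only to exercise the functions.
EXAMPLE_MODES = [
    ("housing crack", "catastrophic", "improbable"),
    ("valve stuck closed", "critical", "occasional"),
    ("sensor drift", "marginal", "frequent"),
    ("seal leak", "critical", "remote"),
    ("connector loose", "marginal", "occasional"),
]

if __name__ == "__main__":
    for name, sev, freq in prioritize(EXAMPLE_MODES):
        print(f"{criticality(sev, freq):>2}  {name} ({sev}, {freq})")
```

A product of ranks is only one possible combination rule. A lookup matrix indexed by the two classes is an equally common choice and avoids implying that the ranks are measured on a ratio scale.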

FMEA principles can be applied beyond project development, to all stages of a product's life cycle. The FMEA method can be applied to manufacturing or other processes, such as those in hospitals, medical laboratories, education systems, etc. When FMEA is applied to a production process, the procedure is called process FMEA (Process Failure Mode and Effects Analysis, PFMEA). For the effective application of FMEA, it is important to provide adequate resources. A complete understanding of the system is not necessary for a preliminary FMEA; however, as the design develops, a detailed analysis of failure modes and effects requires full knowledge of the characteristics and requirements of the system being designed. Complex engineering systems typically require the analysis to be applied to a large number of design factors (mechanical, electrical, systems engineering, software development, maintenance facilities, etc.).

In general, FMEA deals with individual failure modes and their consequences for the system as a whole. Each failure mode is considered as independent. Thus, this procedure is not suitable for dealing with dependent failures or failures resulting from a sequence of multiple events. To analyze such situations, it is necessary to apply other methods, such as Markov analysis (see GOST R 51901.15) or fault tree analysis (see GOST R 51901.13).

When determining the consequences of a failure, it is necessary to consider higher-level failures and failures of the same level that arose as a result of the failure that occurred. The analysis shall identify all possible combinations of failure modes and their sequences that can cause the consequences of failure modes at a higher level. In this case, additional modeling is needed to assess the severity or likelihood of such consequences occurring.

FMEA is a flexible tool that can be adapted to the specific requirements of a particular production. In some cases, the development of specialized forms and rules for keeping records is required. The severity levels of failure modes (if applicable) for different systems or different levels of the system can be defined in different ways.

Figure 1 - Interrelation of types and consequences of failures in the hierarchical structure of the system

4.2 Goals and objectives of the analysis

Reasons for applying a failure modes and effects analysis (FMEA) or a failure modes, effects and criticality analysis (FMECA) can be the following:

a) identification of failures that have undesirable consequences for the operation of the system, such as termination or significant degradation of performance or impact on user safety;

b) fulfillment of the customer's requirements specified in the contract;

c) improving the reliability or safety of the system (for example, through design changes or quality assurance activities);

d) improve system maintainability by identifying areas of risk or inconsistency with respect to maintainability.

In accordance with the above, the objectives of the FMEA (or FMECA) can be the following:

a) full identification and evaluation of all undesirable consequences within the established system boundaries, and of the sequences of events caused by each identified failure mode, whatever its cause, at various levels of the system functional structure;

b) determining criticality (see c) or prioritization to diagnose and mitigate the adverse effects of each failure mode affecting the correct operation and performance of the system or associated process;

c) classification of identified failure modes according to such characteristics as ease of detection, diagnosability, testability, operating and repair conditions (repair, operation, logistics, etc.);

d) identification of functional failures of the system and assessment of the severity of the consequences and the likelihood of failure;

e) developing a plan to improve the design by reducing the number and consequences of failure modes;

f) development of an effective maintenance plan to reduce the likelihood of failures (see IEC 60300-3-11).

NOTE When dealing with criticality and failure probabilities, it is recommended to apply the FMECA methodology.

5 Failure modes and effects analysis

5.1 Fundamentals

Traditionally, there are quite large differences in the way FMEA is conducted and presented. Typically, the analysis is performed by identifying failure modes, their causes, and their immediate and final consequences. The results can be presented in the form of a worksheet containing the most significant information about the system as a whole and about its details, taking into account its features, in particular the potential system failure paths, the components and failure modes that can cause system failure, and the causes of each failure mode.
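One way to picture such a worksheet is as a list of records with one row per failure mode. The sketch below is a hypothetical minimal layout in Python: the field names and the example row are assumptions chosen to mirror the items listed in the paragraph above (failure mode, cause, immediate and final consequences), not a form prescribed by the standard.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class WorksheetRow:
    item: str          # element of the system under analysis
    failure_mode: str  # how the item fails
    cause: str         # most likely cause of the failure mode
    local_effect: str  # immediate consequence at the item's own level
    final_effect: str  # final consequence for the system as a whole


def to_csv(rows) -> str:
    """Serialize worksheet rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(WorksheetRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative row only; the values are not taken from a real analysis.
    example = WorksheetRow(
        item="Storage device",
        failure_mode="Loss of stored data",
        cause="Power outage",
        local_effect="Calibration data lost",
        final_effect="Scale must be recalibrated",
    )
    print(to_csv([example]))
```

Keeping the worksheet as structured records rather than free text makes it easier to update individual rows when the design changes and to add further columns, such as severity or recommended actions, later.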

The application of FMEA to complex products is very difficult. These difficulties may be less if some subsystems or parts of the system are not new and coincide with or are modifications of subsystems and parts of the previous system design. A newly created FMEA should use information about existing subsystems to the greatest extent possible. It should also indicate the need for testing or full analysis of new properties and objects. Once a detailed FMEA has been developed for a system, it can be updated and improved for subsequent system modifications, requiring significantly less effort than a new FMEA development.

When using the existing FMEA of a previous version of the product, it is necessary to ensure that the design is reused in the same way and under the same loads as before. New loads or environmental influences in operation may require a review of the existing FMEA before it is reused. Differences in environmental conditions and operational loads may require the creation of a new FMEA.

The FMEA procedure consists of the following four main steps:

a) establishing ground rules for planning and scheduling FMEA work (including allocating time and ensuring that expertise is available for analysis);

b) performing FMEA using appropriate worksheets or other forms such as logic diagrams or fault trees;

c) summing up and writing a report on the results of the analysis, including all conclusions and recommendations;

d) updating the FMEA as the development of the project progresses.

5.2 Preliminary tasks

5.2.1 Planning the analysis

FMEA activities, including actions, procedures, interactions with processes in the field of reliability, actions to manage corrective actions, as well as the deadlines for the completion of these actions and their stages, should be indicated in the overall plan of the reliability program 1).

The reliability program plan should describe the FMEA methods to be used. The description of the methods may be a standalone document, or may be replaced by a link to a document containing the description.

The reliability program plan should contain the following information:

  • the purpose of the analysis and the expected results;
  • the scope of the analysis, indicating which design elements the FMEA should pay particular attention to. The scope should be appropriate to the maturity of the project and cover design elements that may be a source of risk because they perform a critical function or are manufactured using undeveloped or new technology;
  • a description of how the analysis contributes to the overall reliability of the system;
  • the actions identified to manage FMEA revisions and associated documentation. Management of revisions of analysis documents and worksheets, and the methods of their storage, should be defined;
  • the required extent of participation of project development experts in the analysis;
  • a clear indication of the key stages in the project schedule, so that the analysis is carried out in time;
  • the way to complete all the actions specified in the process of mitigating the identified failure modes that need to be considered.

The plan must be agreed upon by all project participants and approved by its management. The final FMEA at the end of the product design or manufacturing process (process FMEA) shall identify all recorded actions to eliminate or reduce the number and severity of identified failure modes, and the manner in which these actions are taken.

5.2.2 System structure

5.2.2.1 System structure information

Information about the structure of the system should include the following data:

a) description of system elements with their characteristics, operating parameters and functions;

b) a description of the logical relationships between elements;

c) extent and nature of redundancy;

d) the position and significance of the system within the device as a whole (if any);

e) system inputs and outputs;

f) changes in system structure for different operating modes.

For all levels of the system, information about functions, characteristics and parameters is needed. The levels of the system are considered from the bottom up to the highest level, investigating with the help of FMEA the failure modes that impair each of the functions of the system.

5.2.2.2 Defining system boundaries for analysis

System boundaries include the physical and functional interfaces between the system and its environment, including other systems with which the system under study interacts. The definition of the system boundary for analysis should be consistent with the system boundaries established for design and maintenance and apply to any level of the system. Systems and/or components that go beyond the boundaries should be clearly defined and excluded.

Determining the boundaries of a system depends more on its design, intended use, sources of supply, or commercial criteria than on optimal FMEA requirements. However, whenever possible, the definition of boundaries should take into account the requirements to simplify the FMEA and its integration with other related studies. This is especially important if the system is functionally complex, with numerous relationships between items inside and outside the boundaries. In such cases, it is useful to define the boundaries of the study based on the functions of the system, rather than on hardware and software. This will limit the number of inputs and outputs to other systems and may reduce the number and severity of system failures.

1) For more details on the elements of the reliability program and the reliability plan, see GOST R 51901.3.

It should be stated explicitly that all systems or components outside the boundaries of the system under study have been considered and are excluded from the analysis.
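The boundary definition can be made explicit by listing which interfaces cross the boundary and which components are excluded. The following Python sketch does this with plain sets; the component names and the interface list are invented for illustration, and the function names are hypothetical.

```python
def classify_interfaces(in_scope, interfaces):
    """Split interfaces into internal, boundary-crossing and external ones.

    `in_scope` is the set of components inside the analysis boundary and
    `interfaces` is a list of (component, component) pairs.
    """
    in_scope = set(in_scope)
    internal, boundary, external = [], [], []
    for a, b in interfaces:
        inside = (a in in_scope) + (b in in_scope)
        if inside == 2:
            internal.append((a, b))
        elif inside == 1:
            boundary.append((a, b))
        else:
            external.append((a, b))
    return internal, boundary, external


def excluded_components(in_scope, interfaces):
    """Return, in sorted order, every component that is outside the boundary."""
    everything = {component for pair in interfaces for component in pair}
    return sorted(everything - set(in_scope))


# Invented example system, used only to exercise the functions.
IN_SCOPE = {"pump", "controller", "level sensor"}
INTERFACES = [
    ("controller", "pump"),
    ("level sensor", "controller"),
    ("power supply", "controller"),
    ("pump", "tank"),
    ("power supply", "grid"),
]

if __name__ == "__main__":
    internal, boundary, external = classify_interfaces(IN_SCOPE, INTERFACES)
    print("boundary interfaces:", boundary)
    print("explicitly excluded:", excluded_components(IN_SCOPE, INTERFACES))
```

The list of boundary interfaces corresponds to the inputs and outputs to other systems mentioned above, so a shorter list is one rough indicator that the boundary has been drawn in a convenient place.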

5.2.2.3 Levels of analysis

It is important to determine the system level that will be used for the analysis. For example, a system may experience malfunctions or failures of subsystems, interchangeable items, or unique components (see Figure 1). The basic rules for choosing system levels for analysis depend on the desired results and the availability of the necessary information. It is useful to apply the following basic principles:

a) the top level of the system is selected based on the design concept and the specified output requirements;

b) the lowest level of the system at which the analysis is effective is the level for which enough information is available to define its functions. The choice of the appropriate system level depends on previous experience. For a system based on a mature design with established and high levels of reliability, maintainability and safety, a less detailed analysis is applied. A more detailed study, going down to correspondingly lower levels of the system, is applied to a newly developed system or a system with an unknown reliability history;

c) the established or expected level of maintenance and repair is a valuable guide in determining the lower levels of the system.

In FMEA, the determination of failure modes, causes, and consequences depends on the level of analysis and system failure criteria. In the analysis process, the consequences of a failure identified at a lower level can become failure modes for a higher level of the system. Failure modes at a lower level of the system can cause failures at a higher level of the system, and so on.

When a system is decomposed into its elements, the consequences of one or more failure mode causes create a failure mode, which in turn is the cause of component failures. The failure of the component is the cause of the failure of the module, which in turn is the cause of the failure of the subsystem. The impact of a failure cause at one level of the system thus becomes the cause of an impact at a higher level. The explanation given is shown in Figure 1.
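The bottom-up chain described above (component, module, subsystem, system) can be traced mechanically once the hierarchy is written down. The Python sketch below does so under a deliberately pessimistic assumption: every failure propagates to the next level, with no redundancy or protective measures. The item names are placeholders.

```python
# Hypothetical hierarchy: each item maps to the item one level above it.
PARENT = {
    "component 1": "module 1",
    "module 1": "subsystem 1",
    "subsystem 1": "system",
}


def effect_chain(item, parent):
    """Return the items affected by a failure of `item`, from the bottom up."""
    chain = [item]
    while chain[-1] in parent:
        higher = parent[chain[-1]]
        if higher in chain:
            # Guard against a malformed hierarchy that loops back on itself.
            raise ValueError(f"cycle in hierarchy at {higher!r}")
        chain.append(higher)
    return chain


def cause_effect_pairs(item, parent):
    """Pair each level with the next one: a failure at one level is the cause at the level above."""
    chain = effect_chain(item, parent)
    return list(zip(chain, chain[1:]))


if __name__ == "__main__":
    for cause, effect in cause_effect_pairs("component 1", PARENT):
        print(f"failure of {cause} -> failure of {effect}")
```

In a real analysis the propagation would be conditional on redundancy and protective measures, which is why the effect at each level has to be judged by the analyst rather than derived from the hierarchy alone.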

5.2.2.4 System structure view

The symbolic representation of the structure of the functioning of the system, especially in the form of a diagram, is very useful when conducting an analysis.

It is necessary to develop simple diagrams that reflect the main functions of the system. In the diagram, the block connection lines represent the inputs and outputs for each function. The nature of each function and each input must be accurately described. Several diagrams may be required to describe the various phases of system operation.

As system design progresses, a block diagram representing the real components or constituent parts can be developed. This representation provides additional information for identifying potential failure modes and their causes more accurately.

Block diagrams should show all elements, their redundancy and the functional relationships between them. This provides traceability of the functional failures of the system. Several block diagrams may be required to describe alternative modes of system operation; a separate diagram may be required for each mode of operation. At a minimum, each block diagram must contain:

a) the decomposition of the system into major subsystems, including their functional relationships;

b) all appropriately labelled inputs and outputs and the identification number of each subsystem;

c) all redundancy, warnings and other technical features which protect the system from failures.
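Part of this minimum content can be checked automatically if the block diagram is kept in a structured form. The following Python sketch is a hypothetical illustration: the `Block` record, the rules in `check_diagram` and the example blocks are assumptions made for this sketch, and it covers only identification numbers and labelled inputs and outputs, not redundancy.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    ident: str  # identification number of the subsystem
    name: str
    inputs: list = field(default_factory=list)   # labelled input signals
    outputs: list = field(default_factory=list)  # labelled output signals


def check_diagram(blocks, external_inputs=()):
    """Return a list of human-readable problems found in the diagram."""
    problems = []
    seen = set()
    # Every signal must come from outside the system or from some block output.
    produced = set(external_inputs)
    for block in blocks:
        produced.update(block.outputs)
    for block in blocks:
        if not block.ident:
            problems.append(f"{block.name}: missing identification number")
        elif block.ident in seen:
            problems.append(f"{block.name}: duplicate identification number {block.ident}")
        seen.add(block.ident)
        if not block.outputs:
            problems.append(f"{block.name}: no labelled outputs")
        for signal in block.inputs:
            if signal not in produced:
                problems.append(f"{block.name}: input '{signal}' has no source")
    return problems


# Invented diagram with two deliberate defects.
EXAMPLE_BLOCKS = [
    Block("10", "Power supply", ["mains"], ["24 V"]),
    Block("20", "Controller", ["24 V", "level signal"], ["pump command"]),
    Block("", "Pump", ["pump command"], ["flow"]),
]

if __name__ == "__main__":
    for problem in check_diagram(EXAMPLE_BLOCKS, external_inputs=["mains"]):
        print(problem)
```

A check of this kind only confirms that the diagram is complete enough to analyze; whether the functional relationships it shows are correct still has to be reviewed by the team.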

5.2.2.5 Start-up, operation, control and maintenance

The status of the various modes of operation of the system, as well as changes in the configuration or position of the system and its components during the various stages of operation, should be determined. The minimum requirements for system operation should be defined so that the criteria for failure and/or operability are clear and understandable. Availability or safety requirements should be established based on the specified minimum levels of performance required for operation and the maximum acceptable levels of damage. Accurate information is needed on:

a) the duration of each function performed by the system;

b) the time interval between periodic tests;

c) the time to take corrective action before serious system consequences occur;

d) any means used, environmental conditions and/or personnel, including interfaces and interactions with operators;

e) work processes during system start-up, shutdown and other transitions (repair);

f) management during the operational phases;

g) preventive and/or corrective maintenance;

h) test procedures, if applicable.

One of the important uses of FMEA is to assist in the development of a maintenance strategy. Information about the facilities, equipment and spare parts for maintenance should therefore also be available for both preventive and corrective maintenance.

5.2.2.6 System environment

The environmental conditions of the system shall be determined, including external conditions and conditions created by other nearby systems. For a system, its relationships, interdependencies or interactions with support or other systems, with interfaces and with personnel must be described.

At the design stage, not all of these data are known, and therefore approximations and assumptions must be used. As the project progresses and data accumulate, the FMEA must be changed to take account of new information or changed assumptions and approximations. Often FMEA is used to determine the necessary conditions.

5.2.3 Definition of failure modes

The successful functioning of the system depends on the functioning of the critical elements of the system. To assess the functioning of the system, it is necessary to identify its critical elements. The effectiveness of procedures for identifying failure modes, their causes and consequences can be improved by preparing a list of expected failure modes based on the following data:

a) the purpose of the system;

b) features of system elements;

c) system operation mode;

d) performance requirements;

e) time limits;

f) environmental influences;

g) workloads.

An example of a list of common failure modes is shown in Table 1.

Table 1 - Example of common failure modes

Note - This list is only an example. Different types of systems correspond to different lists.

In fact, each failure mode can be assigned to one or more of these general modes. However, these general failure modes are too broad in scope. Therefore, the list needs to be expanded in order to narrow down the group of failures assigned to the general failure mode under investigation. Requirements for input and output control parameters and potential failure modes should be identified and described in the reliability block diagram of the item. It should be noted that one failure mode can have several causes.

It is important that all items within the system boundary are assessed at the lowest level consistent with the objectives of the analysis, so as to give a picture of all potential failure modes. Studies are then carried out to determine possible failures, as well as the consequences of failures for subsystems and system functions.

Component suppliers should identify potential failure modes for their products. Typically, failure mode data can be obtained from the following sources:

a) for new objects, data from other objects with similar function and structure, as well as the results of tests of these objects with appropriate loads, can be used;

b) for new items, potential failure modes and their causes are determined in accordance with the design objectives and a detailed analysis of the features of the item. This method is preferable to the one given in item a), since the loads and actual operation may differ for similar items. An example of such a situation is the use of a signal processor different from the one used in a similar design;

c) for items in operation, data from reports relating to maintenance and failures can be used;

d) potential failure modes can be determined based on an analysis of the functional and physical parameters specific to the operation of the item.

it is important that failure modes are not missed due to missing data and initial estimates are improved based on test results and project progress data, records of the status of such estimates should be kept in accordance with the FMEA.

Identifying failure modes and, where appropriate, defining design corrective actions, preventive quality assurance actions, or product maintenance actions is of paramount importance. It is more important to identify failure modes and, where possible, mitigate their effects by design measures than to know the probability of their occurrence. If it is difficult to prioritize, a criticality analysis may be required.

5.2.4 Causes of failures

The most likely causes of each potential failure mode should be identified and described. Since a failure mode can have multiple causes, the most likely independent causes of each failure mode must be identified and described.

Identification and description of the causes of failures is not always necessary for all failure modes identified in the analysis. Identification and description of the causes of failures and proposals for their elimination should be made on the basis of a study of the consequences of failures and their severity. The more severe the consequences of the failure mode, the more accurately the causes of failures must be identified and described. Otherwise, the analyst may spend unnecessary effort identifying the causes of failure modes that do not affect system performance or have very little effect.

The causes of failures can be determined based on an analysis of operational failures or failures during testing. If the project is new and has no precedents, the reasons for failures can be established by expert methods.

After identifying the causes of failure modes, based on estimates of their occurrence and severity of consequences, the recommended actions are evaluated.

5.2.5 Consequences of failure

5.2.5.1 Determining the consequences of failure

The failure consequence is the result of the operation of the failure mode in terms of system operation, performance or status (see definition 3.4). A failure consequence can be caused by one or more failure modes of one or more objects.

The consequences of each failure mode for the performance of the elements, function or status of the system shall be identified, evaluated and recorded. Maintenance activities and system goals should also be considered whenever necessary. The consequences of a failure may affect the next level and, ultimately, the highest level of system analysis. Therefore, at each level, the consequences of failures must be evaluated for the next higher level.

5.2.5.2 Local consequences of failure

The expression "local consequences" refers to the consequences of the failure mode for the system element under consideration. The consequences of each possible failure at the output of the object must be described. The purpose of identifying local consequences is to provide a basis for assessing existing alternative conditions or developing recommended corrective actions. In some cases there may be no local consequences other than the failure itself.

5.2.5.3 Consequences of failure at the system level

When identifying the consequences for the system as a whole, the consequences of a possible failure for the highest level of the system are determined and evaluated based on the analysis at all intermediate levels. Higher level consequences can be the result of multiple failures. For example, the failure of a safety device leads to catastrophic consequences for the system as a whole only if the safety device fails at the same time as the main function of the system, for which the safety device is intended, exceeds its permissible limits. Such consequences resulting from multiple failures should be indicated in the worksheets.

5.2.6 Failure detection methods

For each failure mode, the analyst must determine the method by which the failure is detected and the means that the installer or maintenance technician uses to diagnose the failure. Failure diagnostics can be performed using technical means, by automatic means provided in the design (built-in testing), or by introducing a special control procedure before the system starts working or during maintenance. Diagnostics can be carried out at system start-up, during operation, or at set intervals. In any case, after diagnosing the failure, the dangerous mode of operation must be eliminated.

Failure modes, other than the one under consideration, that have identical manifestations shall be analyzed and listed. The need for separate diagnostics of failures of redundant elements during system operation should be considered.

For a design FMEA, failure detection considers how likely, when and where a design flaw will be identified (via analysis, simulation, testing, etc.). For a process FMEA, failure detection considers how likely and where process deficiencies and inconsistencies can be identified (e.g. by an operator, in statistical process control, in a quality control process, or later in the process).

5.2.7 Failure compensation conditions

The identification of all design features at a given system level, or other safety measures, that can prevent or mitigate the effects of failure modes is critical. The FMEA must clearly show the true effect of these safeguards under the conditions of a particular failure mode. Safety measures to prevent failure, which must be recorded in the FMEA, include the following:

a) redundant facilities that allow continued operation if one or more elements fail;

b) alternative means of work;

c) monitoring or signaling devices;

d) any other methods and means of effective operation or damage limitation.

During the design process, functional elements (hardware and software) can be repeatedly rebuilt or reconfigured, and their capabilities can also be changed. At each stage, the need to analyze the identified failure modes and apply the FMEA must be confirmed or even revised.

5.2.8 Failure severity classification

The severity of the failure is an assessment of the significance of the impact of the consequences of the failure mode on the operation of the object. The classification of failure severity depends on the specific application of the FMEA and is developed taking into account several factors:

Characteristics of the system in accordance with possible failures, characteristics of users or the environment;

Functional parameters of the system or process;

Any requirements of the customer established in the contract;

Legislative and safety requirements;

Warranty claims.

Table 2 provides an example of a qualitative classification of severity of consequences when performing one of the types of FMEA.

Table 2 — Illustrative example of failure severity classification

Failure severity class number | Name of severity class | Description of the consequences of failure for people or the environment
IV | Catastrophic | The failure mode can lead to the termination of the primary functions of the system and cause severe damage to the system and the environment and/or death and serious injury to people.
III | Critical | The failure mode can lead to the termination of the primary functions of the system and cause significant damage to the system and the environment, but does not pose a serious threat to human life or health.
II | Minimum | The failure mode may degrade the performance of the system without appreciable damage to the system or threat to human life or health.
I | Negligible | The failure mode may impair the performance of the system functions, but does not cause damage to the system and does not pose a threat to human life or health.

5.2.9 Frequency or probability of occurrence of failures

The frequency or probability of occurrence of each failure mode should be determined in order to assess the consequences or severity of failures.

To determine the probability of occurrence of a failure mode, in addition to published information on failure rates, it is very important to consider the actual operating conditions of each component (environmental, mechanical and/or electrical loads), whose characteristics contribute to the likelihood of failure. This is necessary because component failure rates, and consequently the rate of the failure mode under consideration, in most cases increase with the acting loads in accordance with a power law or exponential law. The probability of occurrence of failure modes for a system can be estimated using:

Life test data;

Available databases of failure rates;

Operational failure data;

Data on failures of similar objects or components of a similar class.

FMEA failure probability estimates are related to a certain period of time. This is usually the warranty period or the stated life of the item or product.

The use of frequency and probability of occurrence of failure is explained below in the description of the criticality analysis.

5.2.10 Analysis procedure

The flowchart shown in Figure 2 shows the general analysis procedure.

5.3 Failure Modes, Effects and Criticality Analysis (FMECA)

5.3.1 Purpose of the analysis

The letter C included in the abbreviation FMECA means that failure mode analysis also leads to criticality analysis. The definition of criticality implies the use of a qualitative measure of the consequences of failure modes. Criticality has many definitions and measurement methods, most of which have a similar meaning: the impact or significance of the failure mode that needs to be eliminated or mitigated. Some of these measurement methods are explained in 5.3.2 and 5.3.4. The purpose of criticality analysis is to qualitatively determine the relative magnitude of each consequence of failure. The values of this quantity are used to prioritize actions to eliminate or mitigate failures based on combinations of criticality and severity.

5.3.2 Risk R and risk priority value (RPN)

One method for quantifying criticality is to determine the risk priority value. The risk in this case is assessed by a subjective measure of the severity of the consequences and the probability of a failure occurring within a given period of time (used for analysis). In some cases, when this method is not applicable, it is necessary to turn to a simpler form of non-quantitative FMEA.

Figure 2 - Analysis Flowchart

As a general measure of potential risk R, some types of FMECA use the value

$$R = S \times P$$

where S is the value of the severity of the consequences, i.e. the degree of impact of the failure on the system or user (dimensionless value);

P is the probability of failure occurrence (dimensionless value). If it is less than 0.2, it can be replaced by the criticality value C, which is used in some quantitative FMEA methods described in 5.3.4 (evaluation of the probability of occurrence of failure consequences).

Some FMEA or FMECA applications additionally assess how readily a failure is detected at the level of the system as a whole. In these cases, an additional failure detection value D (also a dimensionless value) is used to form the risk priority value RPN:

$$\mathrm{RPN} = S \times O \times D$$

where O is the probability of failure occurrence for a given or set period of time (this value can be defined as a rank rather than the actual value of the probability of failure);

D characterizes the detection of a failure and is an assessment of the chance to identify and eliminate the failure before the consequences for the system or the customer appear. The D values are usually ranked in reverse order relative to the occurrence or severity values: the higher the value of D, the less likely it is that the failure will be detected. A lower detection probability corresponds to a higher RPN and a higher failure mode priority.

The RPN risk priority value can be used to prioritize the mitigation of failure modes. In addition to the RPN, the severity of the failure modes is taken into account when deciding on mitigation: with equal or close RPN values, action should first be taken on the failure modes with the higher severity values.

These values can be evaluated numerically using a continuous or discrete scale (a finite number of given values).

The failure modes are then ranked according to their RPN. High priority is assigned to high RPN values. In some cases, the consequences of failure modes with an RPN exceeding a specified limit are unacceptable, while in other cases high priority is given to high failure severity values regardless of the RPN values.

Different types of FMECA use different scales for S, O and D, for example from 1 to 4 or 5. Some types of FMECA, such as those used in the automotive industry for design and manufacturing process analysis (called DFMEA and PFMEA), use a scale from 1 to 10.
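As a rough illustration of the two measures in this subsection, the following Python sketch computes R = S × P and RPN = S × O × D and ranks failure modes by RPN, placing the higher severity first when RPN values are equal. The `FailureMode` class, the example failure modes and their rank values are hypothetical and chosen only for illustration; a scale from 1 to 10 is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One worksheet row; names and values are illustrative assumptions."""
    name: str
    severity: int    # S, rank on an assumed 1..10 scale
    occurrence: int  # O, rank on an assumed 1..10 scale
    detection: int   # D, rank on an assumed 1..10 scale (higher = harder to detect)

    @property
    def rpn(self) -> int:
        # RPN = S x O x D
        return self.severity * self.occurrence * self.detection


def risk(severity: float, probability: float) -> float:
    """General risk measure R = S x P."""
    return severity * probability


def rank_by_rpn(modes):
    """Sort by descending RPN; for equal RPN the higher severity comes first."""
    return sorted(modes, key=lambda m: (m.rpn, m.severity), reverse=True)


modes = [
    FailureMode("seal leak", severity=4, occurrence=6, detection=3),       # RPN 72
    FailureMode("sensor drift", severity=8, occurrence=3, detection=3),    # RPN 72
    FailureMode("connector open", severity=5, occurrence=2, detection=2),  # RPN 20
]

for m in rank_by_rpn(modes):
    print(f"{m.name}: RPN={m.rpn}, S={m.severity}")
```

Here "sensor drift" and "seal leak" share the same RPN, so the sort key falls back on severity, which is only one possible way of applying the tie-breaking rule described in the text.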

5.3.3 Relationship of FMECA to risk analysis

The combination of criticality and severity characterizes a risk that differs from commonly used risk indicators in being less rigorous and requiring less effort to assess. The differences lie not only in the way the failure severity is predicted, but also in the description of the interactions between the contributing factors using the usual FMECA bottom-up procedure. Moreover, FMECA usually allows for a relative ranking of contributions to total risk, while the risk analysis for a high-risk system is usually oriented towards acceptable risk. However, for systems with low risk and low complexity, FMECA may be a more cost-effective and appropriate method. Whenever FMECA reveals the likelihood of high-risk outcomes, it is preferable to use Probabilistic Risk Analysis (PRA) instead of FMECA.

For this reason, FMECA should not be used as the only method for deciding the risk acceptability of specific consequences for a high-risk or high-complexity system, even if the assessment of frequency and severity of consequences is based on reliable data. This should be a task of probabilistic risk analysis, where more influencing parameters (and their interactions) can be taken into account (e.g. dwell time, probability of avoiding consequences, latent failures of failure detection mechanisms).

According to the FMEA, each identified failure consequence is assigned to the appropriate severity class. The event rate is calculated from failure data or estimated for the component under investigation. The event rate multiplied by the specified operating time gives a criticality value, which is then applied directly to the scale or, if the scale represents the probability of occurrence of an event, this probability of occurrence is determined in accordance with the scale. The severity class and criticality class (or probability of occurrence) for each consequence together constitute the magnitude of the consequence. There are two main methods for assessing criticality: the criticality matrix and the RPN risk priority concept.

5.3.4 Determining the failure rate

If failure rates are known for failure modes of similar items, determined for environmental and operating conditions similar to those adopted for the system under study, these event rates can be directly used in FMECA. If failure rates of items (rather than of failure modes) are available, determined for environmental and operating conditions other than those required, the failure mode rate should be calculated. In this case, the following relation is usually used:

$$\lambda_i = \lambda_j \, \alpha_i \, \beta_i$$

where $\lambda_i$ is the estimate of the failure rate of the i-th failure mode (the failure rate is assumed to be constant);

$\lambda_j$ is the failure rate of the j-th component;

$\alpha_i$ is the ratio of the number of failures of the i-th mode to the total number of failures, i.e. the probability that the object will have the i-th failure mode;

$\beta_i$ is the conditional probability of the consequences of the i-th failure mode.

The main disadvantage of this method is the implicit assumption that the failure rate is constant, together with the fact that many of the parameters used are derived from predictions or assumptions. This is especially important when there is no data on the corresponding failure rates for the components of the system, but only the estimated probability of failure for a specified time of operation with the corresponding loads.

With the help of indicators that take into account changes in environmental conditions, loads, maintenance, data on failure rates obtained under conditions other than those under study can be recalculated.

Recommendations for choosing the values of these indicators can be found in the relevant reliability publications. The correctness and applicability of the selected values of these parameters for the specific system and its operating conditions should be carefully checked.

In some cases, such as the quantitative method of analysis, the failure mode criticality value $C_i$ (not related to the overall "criticality" value, which can have a different meaning) is used instead of the failure rate $\lambda_i$ of the i-th failure mode. The criticality value is related to the conditional failure rate and operating time and can be used to obtain a more realistic assessment of the risk associated with a particular failure mode over a given product use time.

$$C_i = \lambda_i \, t_j = \lambda_j \, \alpha_i \, \beta_i \, t_j$$

where $t_j$ is the operating time of the component during the entire specified time of the FMECA study for which the probability is estimated, i.e. the time of active operation of the j-th component.

The criticality value for the j-th component with m failure modes is determined by the formula

$$C_j = \sum_{i=1}^{m} \lambda_j \, \alpha_i \, \beta_i \, t_j$$

It should be noted that the value of criticality is not related to criticality as such. This is just a value calculated in some types of FMECA, which is a relative measure of the consequences of a failure mode and the probability of its occurrence. Here the criticality value is a measure of risk rather than a measure of failure occurrence.

The probability $P_i$ of occurrence of the i-th failure mode in time $t_j$ for the obtained criticality is

$$P_i = 1 - e^{-C_i}$$

If the failure mode rates and the corresponding criticality values are small, then as a rough approximation it can be said that for occurrence probabilities less than 0.2 (for which the criticality is 0.223), the criticality and failure probability values are very close.
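The relations in this subsection can be checked numerically. The sketch below is a minimal Python illustration, assuming a constant failure rate as the text does; the function names and the input values (a component failure rate of 2e-6 per hour, a 30 % share for the failure mode, 10 000 hours of operation) are hypothetical.

```python
import math


def failure_mode_rate(lambda_j: float, alpha_i: float, beta_i: float) -> float:
    """lambda_i = lambda_j * alpha_i * beta_i (constant failure rate assumed)."""
    return lambda_j * alpha_i * beta_i


def criticality(lambda_j: float, alpha_i: float, beta_i: float, t_j: float) -> float:
    """C_i = lambda_j * alpha_i * beta_i * t_j."""
    return failure_mode_rate(lambda_j, alpha_i, beta_i) * t_j


def occurrence_probability(c_i: float) -> float:
    """P_i = 1 - exp(-C_i)."""
    return 1.0 - math.exp(-c_i)


# Hypothetical inputs, for illustration only.
c = criticality(lambda_j=2e-6, alpha_i=0.3, beta_i=1.0, t_j=10_000)
p = occurrence_probability(c)
print(f"C_i = {c:.6f}, P_i = {p:.6f}")

# Near the 0.2 boundary mentioned in the text the two values start to diverge.
print(f"C_i = 0.223 gives P_i = {occurrence_probability(0.223):.3f}")
```

For small criticality values the two printed numbers are almost identical, which is the approximation the text relies on; at a criticality of 0.223 the probability is close to 0.2 and the gap is no longer negligible.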

In the case of variable failure rates or failure mode rates, it is necessary to calculate the probability of occurrence of failure rather than the criticality, which is based on the assumption of a constant failure rate.

5.3.4.1 Criticality matrix

Criticality can be represented as a criticality matrix, as shown in Figure 3. It should be kept in mind that there are no universal definitions of criticality. Criticality should be determined by the analyst and accepted by the program or project manager. Definitions can vary significantly for different tasks.

In the criticality matrix shown in Figure 3, it is assumed that the severity of the consequences increases with its class number. In this case, IV corresponds to the highest severity of consequences (death of a person and/or loss of system function, injuries to people). In addition, it is assumed that on the y-axis, the probability of occurrence of a failure mode increases from bottom to top.

Figure 3 - Criticality matrix

If the highest probability of occurrence does not exceed 0.2, then the probability of occurrence of the failure mode and the criticality value are approximately equal to each other. Often, when compiling a criticality matrix, the following scale is used:

The criticality value is 1 or E: practically improbable failure, the probability of its occurrence lies in the interval $0 \le P_i < 0.001$;

The criticality value is 2 or D: rare failure, the probability of its occurrence lies in the interval $0.001 \le P_i < 0.01$;

The criticality value is 3 or C: possible failure, the probability of its occurrence lies in the interval $0.01 \le P_i < 0.1$;

The criticality value is 4 or B: probable failure, the probability of its occurrence lies in the interval $0.1 \le P_i < 0.2$;

The criticality value is 5 or A: frequent failure, the probability of its occurrence lies in the interval $0.2 \le P_i < 1$.
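The scale listed above amounts to a simple lookup on the probability of occurrence. A minimal Python sketch follows; the function name `criticality_class` is hypothetical, and each interval is treated as closed on the left and open on the right, as in the list.

```python
def criticality_class(p: float) -> tuple[int, str]:
    """Map a probability of occurrence to the (number, letter) class of the scale."""
    if not 0.0 <= p < 1.0:
        raise ValueError("probability must satisfy 0 <= p < 1")
    # (exclusive upper bound, class number, class letter)
    scale = [
        (0.001, 1, "E"),  # practically improbable failure
        (0.01, 2, "D"),   # rare failure
        (0.1, 3, "C"),    # possible failure
        (0.2, 4, "B"),    # probable failure
        (1.0, 5, "A"),    # frequent failure
    ]
    for upper, number, letter in scale:
        if p < upper:
            return number, letter
    raise AssertionError("unreachable for 0 <= p < 1")


print(criticality_class(0.0005))
print(criticality_class(0.15))
```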

Figure 3 is for illustration purposes only. In other methods, other designations and definitions may be used for criticality and severity of consequences.

In the example shown in Figure 3, failure mode 1 has a higher probability of occurring than failure mode 2, which has a higher severity. The decision as to which failure mode has the higher priority depends on the type of scale, the severity and frequency classes and the ranking principles used. Although for a linear scale failure mode 1 (as usual in the criticality matrix) should have a higher criticality (or probability of occurrence) than failure mode 2, there may be situations where severity of consequences takes absolute precedence over frequency. In this case, failure mode 2 is the more critical failure mode. Another obvious conclusion is that only failure modes related to the same level of the system can reasonably be compared in the criticality matrix, since failure modes of low complexity systems at a lower level usually have a lower frequency.

As shown above, the criticality matrix (see Figure 3) can be used both qualitatively and quantitatively.

5.3.5 Assessment of risk acceptability

If the required result of the analysis is a criticality matrix, a distribution diagram of the severity of the consequences and the frequency of occurrence of events can be drawn up. Risk acceptability is determined subjectively or guided by professional and financial decisions, depending on the type of production. Table 3 shows some examples of acceptable risk classes and a modified criticality matrix.

Table 3 - Risk/criticality matrix

Failure rate \ Severity level | Negligible | Minimum | Critical | Catastrophic
1 Practically improbable failure | Minor consequences | Minor consequences | Tolerable consequences | Tolerable consequences
2 Rare failure | Minor consequences | Tolerable consequences | Unwanted consequences | Unwanted consequences
3 Possible failure | Tolerable consequences | Unwanted consequences | Unwanted consequences | Unacceptable consequences
4 Probable failure | Tolerable consequences | Unwanted consequences | Unacceptable consequences | Unacceptable consequences
5 Frequent failure | Unwanted consequences | Unacceptable consequences | Unacceptable consequences | Unacceptable consequences
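Table 3 can be held as a small lookup structure. The Python sketch below is one possible encoding, not a prescribed one; the names `SEVERITY_LEVELS`, `RISK_MATRIX` and `risk_class` are hypothetical, while the cell values follow Table 3.

```python
# Column order of Table 3 (severity levels).
SEVERITY_LEVELS = ["negligible", "minimum", "critical", "catastrophic"]

# Rows are the failure frequency classes 1..5; columns follow SEVERITY_LEVELS.
RISK_MATRIX = {
    1: ["minor", "minor", "tolerable", "tolerable"],
    2: ["minor", "tolerable", "unwanted", "unwanted"],
    3: ["tolerable", "unwanted", "unwanted", "unacceptable"],
    4: ["tolerable", "unwanted", "unacceptable", "unacceptable"],
    5: ["unwanted", "unacceptable", "unacceptable", "unacceptable"],
}


def risk_class(frequency_class: int, severity: str) -> str:
    """Return the risk class for a frequency class (1..5) and a severity level."""
    column = SEVERITY_LEVELS.index(severity.lower())
    return RISK_MATRIX[frequency_class][column]


print(risk_class(3, "Catastrophic"))
print(risk_class(1, "Negligible"))
```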

5.3.6 FMECA types and ranking scales

The FMECA types described in 5.3.2 and widely used in the automotive industry are commonly used to analyze the design of a product, as well as to analyze the production processes of these products.

The analysis methodology coincides with that described for the general form of FMEA/FMECA, apart from the definitions in the three tables for the severity S, occurrence O and detection D values.

5.3.6.1 Alternative definition of severity

Table 4 provides an example of a severity ranking that is commonly used in the automotive industry.

Table 4 - Failure Mode Severity

Severity of the consequences | Rank | Criterion
None | 1 | No consequences
Very minor | 2 | Finish (noise) of the item does not meet the requirements. The defect is noticed by demanding customers (less than 25%)
Minor | 3 | Finish (noise) of the item does not meet the requirements. The defect is noticed by 50% of customers
Very low | 4 | Finish (noise) of the item does not meet the requirements. The defect is noticed by the majority of customers (more than 75%)
Low | 5 | The vehicle is operational, but the comfort/convenience system operates at a reduced level of performance. The customer experiences some dissatisfaction
Moderate | 6 | The vehicle/assembly is operational, but the comfort/convenience system is not operational. The customer experiences discomfort
High | 7 | The vehicle/assembly is operational, but at a reduced level of performance. The customer is very dissatisfied
Very high | 8 | Vehicle/assembly inoperable (loss of primary function)
Dangerous with danger warning | 9 | Very high severity, where the potential failure mode affects the safe operation of the vehicle and/or causes non-compliance with mandatory safety requirements, with warning of the hazard
Dangerous without danger warning | 10 | Very high severity, where the potential failure mode affects the safe operation of the vehicle and/or causes non-compliance with mandatory safety requirements, without warning of the hazard

Note - The table is taken from SAE J1739 [3].

A severity rank is assigned for each failure mode based on the impact of the failure consequences on the system as a whole, its safety, compliance with requirements, objectives and constraints, and the type of vehicle as a system. The severity rank is indicated on the FMECA sheet. The severity rank definitions given in Table 4 are precise for severity values of 6 and above and should be used as worded. Determining a severity rank from 3 to 5 can be subjective and depends on the characteristics of the task.

5.3.6.2 Failure occurrence characteristics

Table 5 (also adapted from FMECA as used in the automotive industry) provides examples of qualitative measures characterizing the occurrence of a failure, which can be used in the RPN concept.

Table 5 - Failure modes by frequency and probability of occurrence

Failure mode occurrence characteristic | Rank | Failure rate | Probability
Very low - failure is unlikely | 1 | < 0.010 per 1000 vehicles/objects | < 1 × 10^-5
Low - relatively few failures | 2 | 0.1 per 1000 vehicles/objects | 1 × 10^-4
Low - relatively few failures | 3 | 0.5 per 1000 vehicles/objects | 5 × 10^-4
Moderate - failures are possible | 4 | 1 per 1000 vehicles/objects | 1 × 10^-3
Moderate - failures are possible | 5 | 2 per 1000 vehicles/objects | 2 × 10^-3
Moderate - failures are possible | 6 | 5 per 1000 vehicles/objects | 5 × 10^-3
High - the presence of repeated failures | 7 | 10 per 1000 vehicles/objects | 1 × 10^-2
High - the presence of repeated failures | 8 | 20 per 1000 vehicles/objects | 2 × 10^-2
Very high - failure is almost inevitable | 9 | 50 per 1000 vehicles/objects | 5 × 10^-2
Very high - failure is almost inevitable | 10 | > 100 per 1000 vehicles/objects | > 1 × 10^-1

Note - See AIAG [4].

In Table 5, "frequency" refers to the ratio of the number of favorable cases to all possible cases of the event under consideration during the mission or service life. For example, a failure mode that corresponds to an occurrence value of 9 can result in the failure of one of three systems during the mission period. Here, the definition of the probability of occurrence of failures is tied to the time period under study. It is recommended to indicate this time period in the header of the FMEA table.

Best practice is to calculate the probability of occurrence for the components and their failure modes from the respective failure rates for the expected loads (external operating conditions). If the required information is not available, an estimate can be assigned, but the specialists performing the FMEA should keep in mind that the failure occurrence value is the number of failures per 1000 vehicles during a given time interval (warranty period, vehicle service life, etc.). Thus, it is the calculated or estimated probability of a failure mode occurring over the time period under study. Unlike the severity scale, the failure occurrence scale is neither linear nor logarithmic. Therefore, it must be taken into account that the resulting RPN value is also non-linear and must be used with extreme caution.
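The non-linearity of the occurrence scale can be seen directly from the frequencies in Table 5. The Python sketch below makes two assumptions that are not stated in the table as reproduced here: the rows are numbered 1 to 10 in table order, and a frequency that falls between two rows is rounded up to the higher rank. The names `RATES_PER_1000` and `occurrence_rank` are hypothetical.

```python
import bisect

# Failure frequency per 1000 vehicles/objects for the ten rows of Table 5.
# Assumption: rows carry ranks 1..10 in order; the first and last rows are open-ended.
RATES_PER_1000 = [0.01, 0.1, 0.5, 1, 2, 5, 10, 20, 50, 100]


def occurrence_rank(failures_per_1000: float) -> int:
    """Return the lowest rank whose tabulated frequency is not below the given one."""
    if failures_per_1000 < 0:
        raise ValueError("frequency must be non-negative")
    index = bisect.bisect_left(RATES_PER_1000, failures_per_1000)
    return min(index + 1, len(RATES_PER_1000))


# Ratio between the frequencies of successive ranks: it is not constant,
# so the scale is neither linear nor logarithmic.
ratios = [b / a for a, b in zip(RATES_PER_1000, RATES_PER_1000[1:])]

print(occurrence_rank(1), occurrence_rank(0.05), occurrence_rank(500))
print([round(r, 2) for r in ratios])
```

Rounding up is a conservative choice made for this sketch; an analysis team may equally agree on a different convention.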

5.3.6.3 Ranking the probability of failure detection

The RPN concept provides for an assessment of the probability of failure detection, i.e. the probability that with the help of equipment, verification procedures provided for by the project, possible types of failures will be detected in a time sufficient to prevent failures at the system level as a whole. For a process FMEA application (PFMEA), it is the probability that a series of process control activities have the ability to detect and isolate a failure before it affects downstream processes or finished products.

In particular, for products that can be used in several other systems and applications, the probability of detection can be difficult to estimate.

Table 6 shows one of the diagnostic methods used in the automotive industry.

Table 6 - Criteria for evaluating failure mode detection

Detection characteristic | Rank | Criterion: likelihood of detecting the failure mode by the intended control operations
Practically one hundred percent | 1 | Design controls almost always detect the potential cause/mechanism and subsequent failure mode
Very good | 2 | Very high chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Good | 3 | High chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Moderately good | 4 | Moderately high chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Moderate | 5 | Moderate chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Weak | 6 | Low chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Very weak | 7 | Very low chance that design controls will detect the potential cause/mechanism and subsequent failure mode
Bad | 8 | It is unlikely that the design controls will detect the potential cause/mechanism and subsequent failure mode
Very bad | 9 | It is almost impossible that the design controls will detect the potential cause/mechanism and subsequent failure mode
Practically impossible | 10 | Design controls fail to detect the potential cause/mechanism and subsequent failure mode, or no control is provided

5.3.6.4 Risk assessment

The intuitive method described above should be accompanied by a prioritization of actions aimed at ensuring the highest level of safety for the customer (consumer, client). For example, a failure mode with a high severity value, a low occurrence value and very good detection (e.g. S = 10, O = 3 and D = 2) may have a much lower RPN (in this case 60) than a failure mode with average values of all three (e.g. 5 in each case, giving an RPN of 125). Therefore, additional procedures are often used to ensure that failure modes with a high severity rank (e.g. 9 or 10) are given priority and that remedial action is taken on them first. In this case, the decision should also be guided by the severity rank, and not just by the RPN. In all cases, the severity rank must be considered along with the RPN to make a more informed decision.
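The example in the paragraph above can be reproduced with a short Python sketch. It orders failure modes so that those at or above a severity threshold come first and the rest follow by descending RPN. The function name `prioritize`, the mode labels and the threshold of 9 are assumptions made for illustration; the text only mentions ranks of 9 or 10 as an example.

```python
def prioritize(modes, severity_threshold=9):
    """Order (name, S, O, D) tuples: high-severity modes first, then by descending RPN."""
    def key(mode):
        _name, s, o, d = mode
        high_severity = s >= severity_threshold
        return (high_severity, s * o * d)
    return sorted(modes, key=key, reverse=True)


modes = [
    ("mode A", 10, 3, 2),  # RPN = 60, high severity
    ("mode B", 5, 5, 5),   # RPN = 125, average values
]

by_rpn_only = sorted(modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
print([m[0] for m in by_rpn_only])        # ranking by RPN alone
print([m[0] for m in prioritize(modes)])  # ranking with severity taking precedence
```

Ranking by RPN alone puts mode B first, whereas the severity-first rule moves mode A to the top.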

Risk priority values are also defined in other FMEA methods, especially qualitative methods.

RPN values calculated according to the tables above are often used as a guide in mitigating failure modes. At the same time, the warnings in 5.3.2 should be taken into account.

RPN has the following disadvantages:

Gaps in the value range: 88% of the range is empty, since only 120 of the 1000 possible values occur;

RPN ambiguity: several combinations of different parameter values result in the same RPN value;

Sensitivity to small changes: a small deviation in one parameter has a large effect on the result if the other parameters have large values (e.g. 9 × 9 × 3 = 243 and 9 × 9 × 4 = 324, while 3 × 4 × 3 = 36 and 3 × 4 × 4 = 48);

Inadequate scale: the failure occurrence table is non-linear (for example, the ratio between two successive ranks can be either 2.5 or 2);

Inadequate RPN scaling: a difference in RPN values may seem insignificant when in fact it is quite significant. For example, the values S = 6, O = 4, D = 2 give RPN = 48, and the values S = 6, O = 5, D = 2 give RPN = 60. The second RPN value is not twice as large as the first, while in fact for O = 5 the probability of a failure is twice as high as for O = 4. Therefore, raw RPN values should not be compared linearly;

Erroneous conclusions based on RPN comparison, since the scales are ordinal, not ratio scales.

RPN analysis requires care and attention. Proper application of the method requires analysis of severity, occurrence and detection values ​​before forming a conclusion and taking corrective action.
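The first two disadvantages listed above can be verified by brute force. The Python sketch below enumerates every combination of S, O and D, assuming each is an integer rank from 1 to 10, and counts how many distinct RPN values occur and how many of them are produced by more than one combination. The variable names are hypothetical.

```python
from collections import Counter
from itertools import product

SCALE = range(1, 11)  # S, O and D are each assumed to be ranked 1..10

# How many (S, O, D) combinations produce each RPN value.
counts = Counter(s * o * d for s, o, d in product(SCALE, repeat=3))

distinct = len(counts)              # RPN values that can actually occur
unused_share = 1 - distinct / 1000  # share of the range 1..1000 that is never reached
ambiguous = sum(1 for n in counts.values() if n > 1)  # values with several sources

print(f"{distinct} distinct values, {unused_share:.0%} of the range unused")
print(f"{ambiguous} RPN values arise from more than one (S, O, D) combination")

# Sensitivity: the same one-step change in D shifts the RPN by very different amounts.
print(9 * 9 * 4 - 9 * 9 * 3, 3 * 4 * 4 - 3 * 4 * 3)
```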

5.4 Analysis report

5.4.1 Scope and content of the report

The FMEA report may be developed as part of a larger study report or may be a standalone document. In any case, the report should include an overview and detailed notes of the study carried out, as well as block diagrams and functional diagrams of the system structure. The report should also list the drawings (with their status) on which the FMEA is based.

5.4.2 Consequence analysis results

A list of failure consequences should be prepared for the particular system being investigated by the FMEA. Table 7 shows a typical set of failure consequences for a car engine starter and its electrical circuit.

Table 7 - Example of the consequences of failures for a car starter

Note 1 - This list is only an example. Each analyzed system or subsystem will have its own set of failure consequences.

A failure effects report may be required to determine the likelihood of system failures resulting from the listed failure effects, and to prioritize corrective and preventive actions. The failure effects report shall be based on a list of failure effects of the system as a whole and shall contain details of the failure modes contributing to each failure effect. The probability of occurrence of each failure mode is calculated for a specified period of operation of the object, as well as for the expected parameters of use and loads. Table 8 shows an example of a failure effects overview.

Table 8 - Example of failure consequence probabilities

Note 2 - Such a table can be built for various qualitative and quantitative rankings of an object or system.

The report should also contain a brief description of the method of analysis and the level at which it was conducted, the assumptions used and the underlying rules. In addition, it should include lists of:

a) failure modes that lead to serious consequences;

c) design changes that are made as a result of the FMEA;

d) impacts that are eliminated as a result of overall design changes.

6 Other studies

6.1 Common cause failure

For reliability analysis, it is not sufficient to consider only random and independent failures, since common cause failures can occur. For example, the cause of a malfunction of the system or its failure may be the simultaneous malfunction of several components of the system. This may be due to a design error (unjustified limitation of the allowable values of components), environmental influences (lightning), or human error.

The presence of common cause failure (CCF) contradicts the assumption of independence of the failure modes considered by the FMEA. The presence of CCF implies that more than one failure can occur at the same time or within a sufficiently short period of time, with the corresponding consequences of simultaneous failures.

Typically, sources of CCF can be:

Design (software development, standardization);

Production (deficiencies in batches of components);

Environment (electrical noise, temperature cycling, vibration);

Human factor (incorrect operation or incorrect maintenance actions).

The FMEA must therefore consider possible sources of CCF when analyzing a system that uses redundancy or a large number of objects to mitigate the consequences of a failure.

CCF is the result of an event that, due to logical dependencies, causes a simultaneous failure condition in two or more components (including dependent failures caused by the consequences of an independent failure). Common cause failures can occur in identical sub-assemblies that have the same failure modes and weak points and are used in different assemblies of the system, possibly as redundant elements.

The FMEA's ability to analyze CCF is quite limited. However, FMEA is a procedure for examining each failure mode and its associated causes in turn, and identifying all periodic testing, preventive maintenance, etc. This method allows you to investigate all the causes that can cause CCF.

It is useful to use a combination of several methods to prevent or mitigate the effects of CCF (system modeling, physical analysis of components), including: functional diversity, when redundant branches or parts of the system that perform the same function are not identical and have different failure modes; physical separation to eliminate environmental or electromagnetic influences that cause CCF; etc. Usually FMEA provides for review of preventive CCF measures. However, these measures should be described in the remarks column of the worksheet to assist in understanding the FMEA as a whole.

6.2 Human factor

Special design provisions are needed to prevent or reduce some human errors. Such measures include mechanical interlocking of railway signals and passwords for computer use or data retrieval. If such provisions exist in the system, the consequences of their failure will depend on the type of error. Some types of human error should be investigated using the system fault tree to verify the effectiveness of the equipment. Even a partial listing of these failure modes is useful in identifying design and procedure deficiencies. Identifying all kinds of human error is probably impossible.

Many common cause failures are based on human error. For example, improper maintenance of identical objects can defeat redundancy. To avoid this, non-identical backup elements are often used.

6.3 Software errors

An FMEA conducted for the hardware of a complex system may have implications for the software of the system. Thus, the decisions about the consequences, criticality and conditional probabilities arising from the FMEA may depend on the elements of the software, their features, sequence and timing. In this case, the relationship between hardware and software must be clearly identified, since a subsequent change or improvement of the software may change the FMEA and the estimates derived from it. Approval of the software and its modifications may be conditional on a review of the FMEA and related assessments; for example, software logic may be modified to improve safety at the expense of operational reliability.

Failures due to software errors or inconsistencies will have consequences, the meaning of which should be determined in the software and hardware design. The identification of such errors or inconsistencies and the analysis of their consequences are only possible to a limited extent. The consequences of possible errors in the software for the respective hardware should be evaluated. Recommendations for mitigating such errors for software and hardware are often the result of analysis.

6.4 FMEA and consequences of system failures

The FMEA of a system can be made independent of its particular application and can then be tailored to the particularities of the system design. This applies to small assemblies that can be viewed as components in their own right (e.g. electronic amplifier, electric motor, mechanical valve).

However, it is more typical to design an FMEA for a specific project with specific consequences of system failures. It is necessary to classify the consequences of system failures, for example: fuse failure, recoverable failure, fatal failure, deterioration in task performance, task failure, consequences for individuals, groups or society as a whole.

The ability of an FMEA to take into account the most remote consequences of a system failure depends on the design of the system and the relationship of the FMEA to other forms of analysis such as fault trees, Markov analysis, Petri nets, etc.

7 Applications

7.1 Using FMEA/FMECA

FMEA is a method that is primarily adapted to the study of material and equipment failures and can be applied to various types of systems (electrical, mechanical, hydraulic, etc.) and their combinations for parts of equipment, a system or a project as a whole.

The FMEA should include an examination of software and human actions if they affect the reliability of the system. FMEA can be a study of processes (medical, laboratory, industrial, educational, etc.). In this case, it is usually referred to as the process FMEA or PFMEA. When performing a process FMEA, the goals and objectives of the process are always taken into account and then each step of the process is examined for any adverse outcomes for other steps in the process or the achievement of process goals.

7.1.1 Application within the project

The user must determine how and for what purposes the FMEA is used. FMEA can be used on its own or serve as a complement and support for other reliability analysis methods. The requirements for an FMEA result from the need to understand the behavior of hardware and its implications for the operation of a system or equipment. FMEA requirements can vary significantly depending on the specifics of the project.

FMEA supports the concept of design analysis and should be applied as early as possible in the design of subsystems and the system as a whole. FMEA is applicable to all levels of the system, but is more suitable for lower levels characterized by a large number of objects and/or functional complexity. Special training for personnel performing FMEA is important. Close collaboration between engineers and system designers is essential. The FMEA should be updated as the project progresses and design changes. At the end of the design phase, FMEA is used to validate the design and demonstrate that the designed system meets specified user requirements, standards, guidelines, and regulatory requirements.

Information derived from FMEA identifies priorities for statistical process control, sampling inspection and incoming inspection during production and installation, as well as for qualification, approval, acceptance and start-up tests. FMEA is a source of information for diagnostic and maintenance procedures in the development of the related manuals.

When choosing the depth and methods of applying FMEA to an object or project, it is important to consider the purposes for which FMEA results are needed, to coordinate its timing with other activities, and to establish the required degree of competence and control over undesired failure modes and consequences. This leads to quality FMEA planning at the indicated levels (system, subsystem, component, object of the iterative design and development process).

For an FMEA to be effective, its place in the reliability program must be clearly established, as well as the time, labor and other resources allocated to it. It is vital that the FMEA is not cut short to save time and money. If time and money are limited, FMEA should focus on those parts of the design that are new or use new techniques. For economic reasons, FMEA may be targeted at areas identified as critical by other methods of analysis.

7.1.2 Application to processes

To perform PFMEA, the following is required:

a) a clear definition of the purpose of the process. If the process is complex, the purpose of the process may be broken down into an overall goal or a goal associated with the product of the process, the product of a series of successive processes or steps, the product of a single process step, and the corresponding particular goals;

b) understanding of the individual steps in the process;

c) understanding the potential weaknesses specific to each step of the process;

d) understanding the consequences of each individual deficiency (potential failure) for the product of the process;

e) understanding the potential causes of each of the deficiencies or potential failures and inconsistencies in the process.

If a process is associated with more than one product, then it can be analyzed for individual product types as a PFMEA. Process analysis can also be performed according to process steps and potential adverse outcomes that result in a generalized PFMEA regardless of specific product types.

7.2 Benefits of FMEA

Some of the application features and benefits of FMEA are listed below:

a) avoiding costly modifications due to early identification of design flaws;

b) identification of failures that, when occurring singly and in combination, have unacceptable or significant consequences, and identification of failure modes that could have serious consequences for the expected or required function.

NOTE 1 Such consequences may include dependent failures.

c) determination of the necessary methods to improve the reliability of the design (redundancy, optimal workloads, fault tolerance, component selection, reassembly, etc.);

d) providing a logic model for assessing the likelihood or intensity of occurrence of abnormal system operating conditions in preparation for criticality analysis;

e) identification of problem areas of safety and responsibility for the quality of products or their non-compliance with mandatory requirements.

NOTE 2 Separate studies are often necessary for safety, but overlap is unavoidable and collaboration is highly desirable during the studies;

f) development of a test program to detect potential failure modes;

g) concentration on key issues of quality management, analysis of control processes and product manufacturing;

h) assistance in defining the features of the overall preventive maintenance strategy and schedule;

i) assistance and support in the definition of test criteria, test plans and diagnostic procedures (comparative tests, reliability tests);

j) support for design defect elimination sequencing and support for scheduling alternative modes of operation and reconfiguration;

k) designers' understanding of the parameters that affect system reliability;

l) development of a final document containing evidence of the actions taken to ensure that the design results meet the requirements of the maintenance specification. This is especially important in the case of liability for product quality.

7.3 Limitations and disadvantages of FMEA

FMEA is extremely effective when it is used to analyze the elements that cause the failure of the overall system or the disruption of the system's primary function. However, FMEA can be difficult and tedious for complex systems with many functions and different sets of components. These complexities are exacerbated by multiple operating modes and multiple maintenance and repair policies.

FMEA can be a time-consuming and inefficient process if not thoughtfully applied. The intended future uses of the FMEA results should be defined in advance. Conducting an FMEA should not be included as a requirement without prior evaluation.

Complications, misunderstandings, and errors can occur when trying to cover multiple levels in the hierarchical structure of a system if the system uses redundancy.

Relationships between individual failure modes, groups of failure modes or causes of failure modes cannot be effectively represented in an FMEA, since the main assumption of this analysis is the independence of failure modes. This shortcoming becomes even more pronounced for software and hardware interactions, where the assumption of independence does not hold. The same is true for human interaction with hardware and for models of this interaction. The assumption of independence of failures diverts attention from failure modes that, when combined, can have significant consequences, while each of them individually has a low probability of occurrence. It is easier to study the interconnections of the system elements using fault tree analysis (FTA) (GOST R 51901.5).

FTA is preferred for such applications, since FMEA is limited to relationships between only two levels of the hierarchical structure, for example, the identification of failure modes of objects and the determination of their consequences for the system as a whole. These consequences then become failure modes at the next level, for example for a module, etc. However, there is experience with successful implementation of multi-level FMEAs.

In addition, the disadvantage of FMEA is its inability to evaluate the overall reliability of the system and thus assess the degree of improvement in its design or changes.

7.4 Relationship with other methods

FMEA (or FMECA) can be applied on its own. As a systematic inductive method of analysis, FMEA is most often used as an adjunct to other methods, especially deductive ones, such as FTA. At the design stage, it is often difficult to decide which method (inductive or deductive) to prefer, since both are used in the analysis. If levels of risk are identified for manufacturing equipment and systems, the deductive method is preferred, but FMEA is still a useful design tool. However, it should be used in addition to other methods. This is especially true when solutions must be found in situations with multiple failures and a chain of consequences. The method used initially should depend on the program of the project.

In the early stages of design, when only the functions, the general structure of the system and its subsystems are known, the successful functioning of the system can be depicted using a reliability block diagram or a fault tree. However, in order to construct these diagrams, the inductive FMEA process must be applied to the subsystems. Under these circumstances, the FMEA method is not comprehensive, but it displays the result in a visual tabular form. In the general case of the analysis of a complex system with several functions, numerous objects and relationships between these objects, FMEA is necessary but not sufficient.

Fault tree analysis (FTA) is a complementary deductive method for analyzing failure modes and their corresponding causes. It serves to trace low-level causes leading to high-level failures. Although logic analysis is sometimes used for qualitative analysis of fault sequences, it usually precedes high-level failure rate estimation. FTA allows you to model the interdependencies of different failure modes in cases where their interaction can lead to a high-severity event. This is especially important when the occurrence of one failure mode causes the occurrence of another failure mode with high probability and high severity. This scenario cannot be successfully modeled with FMEA, where each failure mode is considered independently and individually. One of the disadvantages of FMEA is its inability to analyze interactions and failure mode dynamics in a system.

FTA focuses on the logic of coincident (or sequential) and alternative events that cause undesirable consequences. FTA allows you to build a correct model of the analyzed system, assess its reliability and failure probability, and evaluate the impact of design improvements and of a decrease in the number of failures of a particular type on the reliability of the system as a whole. The FMEA form is more descriptive. Both methods are used in the overall safety and reliability analysis of a complex system. However, if the system is based primarily on sequential logic with little redundancy and multiple functions, then FTA is an overly complex way of representing the system logic and identifying failure modes. In such cases, FMEA and the reliability block diagram method are adequate. In other cases where FTA is preferred, it should be supplemented by descriptions of the failure modes and their consequences.
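The logic of coincident and alternative events can be illustrated with a very small fault tree. This is a sketch under stated assumptions, not a procedure taken from the standard: basic events are assumed to be statistically independent, and the component names and probabilities are hypothetical values chosen only for illustration.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all input events occur (independent inputs)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one input event occurs (independent inputs)."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical basic-event probabilities, for illustration only.
p_pump_a = 0.01
p_pump_b = 0.01
p_controller = 0.001

# Top event: both redundant pumps fail, OR the shared controller fails.
p_both_pumps = and_gate([p_pump_a, p_pump_b])
p_top = or_gate([p_both_pumps, p_controller])

print(p_both_pumps, p_top)
```

In this hypothetical case the non-redundant controller contributes far more to the top event than the redundant pumps, which is the kind of combined effect that a worksheet of independent failure modes does not show directly.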

When choosing an analysis method, it is necessary to be guided primarily by the specific requirements of the project: not only technical requirements, but also requirements for time and cost, efficiency and use of results. General guidelines:

a) FMEA is applicable when a comprehensive knowledge of the object's failure characteristics is required;

b) FMEA is more suitable for smaller systems, modules or complexes;

c) FMEA is an important tool for research, development, design or other tasks when the unacceptable consequences of failures must be identified and necessary measures found to eliminate or mitigate them;

d) FMEA may be necessary for state-of-the-art facilities where the failure characteristics may not be consistent with previous operation;

e) FMEA is more applicable to systems that have a large number of components that are linked by a common fault logic;

f) FTA is more suitable for multiple and dependent failure mode analysis with complex logic and redundancy. FTA can be used at higher levels of the system structure, the early stages of a project, and when the need for detailed FMEA is identified at lower levels during in-depth design development.

Annex A (informative)

Brief description of FMEA and FMECA procedures

A.1 Stages. Overview of Analysis Runs

During the analysis, the following steps of the procedure should be performed:

a) deciding which method, FMEA or FMECA, is needed;

b) defining system boundaries for analysis;

c) understanding the requirements and functions of the system;

d) defining failure/operability criteria;

e) defining the failure modes and the consequences of failures of each object in the report;

f) describing each failure consequence;

g) reporting.

Additional steps for FMECA:

h) determining system failure severity ranks;

i) setting the severity values of object failure modes;

j) determining the object's failure mode and the frequency of consequences;

k) determining failure mode frequency;

l) compiling criticality matrices for object failure modes;

m) describing the severity of failure consequences in accordance with the criticality matrix;

n) compiling a criticality matrix for the consequences of system failure;

o) reporting for all levels of analysis.

NOTE Evaluation of the frequency of failure modes and consequences in the FMEA can be done using steps h), i) and j).
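The compilation of a criticality matrix in the additional FMECA steps can be sketched as a grouping of failure modes by severity and frequency. The severity classes, frequency classes and failure mode codes below are hypothetical placeholders; a real study would use the scales defined for the project.

```python
from collections import defaultdict

# Hypothetical failure modes: (code, severity class, frequency class).
failure_modes = [
    ("FM-01", "IV", "remote"),
    ("FM-02", "II", "probable"),
    ("FM-03", "IV", "remote"),
    ("FM-04", "I", "frequent"),
]


def criticality_matrix(modes):
    """Map each (severity, frequency) cell to the failure mode codes it contains."""
    matrix = defaultdict(list)
    for code, severity, frequency in modes:
        matrix[(severity, frequency)].append(code)
    return dict(matrix)


matrix = criticality_matrix(failure_modes)
for cell, codes in sorted(matrix.items()):
    print(cell, codes)
```

Cells that combine high severity with high frequency are the ones that would be reviewed first.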

A.2 FMEA worksheet

A.2.1 Scope of the worksheet

The FMEA worksheet describes the details of the analysis in tabular form. Although the general FMEA procedure is fixed, the worksheet can be adapted to a specific project in accordance with its requirements.

Figure A.1 shows an example of the layout of the FMEA worksheet.

A.2.2 Worksheet head

The head of the worksheet should include the following information:

The designation of the system as an object as a whole, for which the final consequences are identified. This designation must be compatible with the terminology used in block diagrams, diagrams and figures;

Period and mode of operation selected for analysis;

The object (module, component, or part) being examined in this worksheet;

Revision level, date, name of the analyst coordinating the FMEA, as well as the names of the main team members, providing additional information for document control.

A.2.3 Completing the worksheet

Entries in the "Object" and "Description of the object and its functions" columns should identify the subject of the analysis. References to a block diagram or other document, and a brief description of the object and its function, should be given.

The description of the failure modes of the object is given in the "Type of failure" column. Clause 5.2.3 provides guidelines for identifying potential failure modes. Using a unique "Failure Mode Code" identifier for each unique object failure mode will make it easier to summarize the analysis.

The most likely causes of failure modes are listed in the "Possible Failure Causes" column. A brief description of the consequences of the failure mode is given in the "Local consequences of failure" column. Similar information for the facility as a whole is given in the “Failure Outcomes” column. For some FMEA studies, it is desirable to evaluate the consequences of a failure at an intermediate level. In this case, the consequences are indicated in the additional column "Next higher build level". Identification of the consequences of a failure mode is discussed in 5.2.5.

A brief description of the failure mode detection method is given in the "Failure Detection Method" column. The detection method may be implemented automatically by a built-in test provided by the design, or may require diagnostic procedures carried out by operations and maintenance personnel. It is important to identify the method for detecting failure modes to ensure that corrective actions are taken.

Design features that mitigate or reduce the number of failures of a particular type, such as redundancy, should be noted in the Failure Compensation Conditions column. Compensation by means of maintenance or operator actions should also be indicated here.

The "Failure Severity" column indicates the severity level set by the FMEA analysts.

The "Frequency or probability of occurrence of failure" column indicates the frequency or probability of occurrence of a particular type of failure. The scale should correspond to its value (for example, failures per million hours, failures per 1000 km, etc.).

The "Remarks" column indicates observations and recommendations in accordance with 5.3.4.
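The worksheet columns described above can be represented as a simple record type. The sketch below is one possible layout: the field names are illustrative translations of the column titles, and the example entry is entirely hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class WorksheetRow:
    """One line of an FMEA worksheet (field names are illustrative)."""

    item: str
    description_and_function: str
    failure_mode: str
    failure_mode_code: str
    possible_causes: str
    local_effect: str
    final_effect: str
    detection_method: str
    compensation: str
    severity: Optional[int] = None   # severity level set by the analysts
    frequency: Optional[str] = None  # stated together with its unit
    remarks: str = ""


# Hypothetical entry; none of these values come from the standard.
row = WorksheetRow(
    item="Pump P1",
    description_and_function="Circulates coolant through the heat exchanger",
    failure_mode="Fails to start",
    failure_mode_code="P1-01",
    possible_causes="Motor winding open circuit",
    local_effect="No coolant flow",
    final_effect="System overheats and shuts down",
    detection_method="Built-in flow sensor alarm",
    compensation="Standby pump P2 starts automatically",
    severity=3,
    frequency="2 failures per million hours",
)

record = asdict(row)  # plain dict, convenient for export to a table
print(record["failure_mode_code"], record["final_effect"])
```

Keeping the failure mode code as a separate field makes it straightforward to summarize or cross-reference rows later.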

A.2.4 Notes in the worksheet

The last column of the worksheet should contain all the necessary remarks to clarify the rest of the entries. Possible future actions, such as design improvement recommendations, can be recorded and then reported. This column may also include the following:

a) any unusual conditions;

b) consequences of failures of the redundant element;

c) description of the critical properties of the project;

d) any remarks expanding the information;

e) essential maintenance requirements;

f) dominant causes of failures;

g) dominant consequences of failure;

h) decisions made, such as during project analysis.

Header fields: End object; Period and mode of operation; Revision; Prepared by.

Columns: Object; Description of the object and its functions; Type of failure (malfunction); Code of the type of failure (malfunction); Possible reasons for failure (malfunction); Local consequences of failure (malfunction); Final consequences of failure (malfunction); Failure detection method; Failure compensation conditions; Failure severity; Frequency or probability of failure; Remarks.

Figure A.1 - Example of an FMEA worksheet

Annex B (informative)

Research examples

B.1 Example 1 - FMECA for vehicle power supply with RPN calculation

Figure B.1 shows a small portion of the extensive FMECA for a car. The power supply and its connections with the battery are analyzed.

The battery circuit includes diode D1 and capacitor C9, connecting the positive terminal of the battery to ground. A reverse polarity diode is used, which protects the object from damage if the negative terminal of the battery is connected to the case. The capacitor is an EMI filter. If either of these parts shorts to ground, the battery will also short to ground, which may result in battery failure.

Columns: Object/Function (subsystem); Potential type of failure; Potential consequences of failure (local and final); Potential cause of failure; Cause(s)/mechanism of failure.

Subsystem: Power supply.

Short circuit. Local consequences: battery terminal (+) shorts to ground. Potential cause: internal component defect. Mechanism: material destruction.

Electrical. Local consequences: no backup reverse voltage protection. Potential cause: internal component defect. Mechanism: crack in weld or semiconductor.

Short circuit. Local consequences: battery terminal (+) shorts to ground. Final consequences: battery leak; travel is not possible. Potential cause: internal component defect. Mechanism: dielectric failure or crack.

Electrical. Local consequences: no EMI filter. Final consequences: the operation of the object does not meet the requirements. Potential cause: internal component defect. Mechanism: dielectric exposure, leak, void or crack.

Electrical. Potential cause: internal component defect. Mechanism: material destruction.

Electrical. Local consequences: no voltage to turn on the electrical circuit. Final consequences: the object is inoperable; no warning indication. Potential cause: internal component defect. Mechanism: crack in weld or material.

Figure B.1 - FMEA for an automotive electronics part with RPN calculation

Such a failure, of course, gives no warning. A failure that makes travel impossible is considered dangerous in the automotive industry. Therefore, for the short-circuit failure mode of both named parts, the severity rank S is equal to 10. The occurrence rank O values were calculated from the failure rates of the parts under the corresponding loads for vehicle operation and then scaled to the O scale of the vehicle FMEA. The value of the detection rank D is very low, since a short circuit in either of the parts is detected when the object is tested for operability.

An open-circuit failure of either of the above parts does not damage the object; however, in the case of the diode, polarity reversal protection is lost. Failure of the capacitor, which then no longer filters electromagnetic interference, may cause interference to equipment in the vehicle.

If there is an open circuit in coil L1, which is located between the battery and the electrical circuit and is intended for filtering, the object is inoperable because the battery is disconnected, and no warning will be displayed. Coils have a very low failure rate, so the occurrence rank is 2.

Resistor R91 transmits the battery voltage to the switching transistors. If R91 fails, the object becomes inoperable, with a severity rank of 9. Since resistors have a very low failure rate, the occurrence rank is 2. The detection rank is 1, because the object is not operable.
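As a worked check for the resistor R91 row, and assuming that RPN is taken as the product of the three ranks, as in the numerical examples given in the discussion of RPN, the ranks stated above combine as follows:

```latex
\mathrm{RPN}_{R91} = S \times O \times D = 9 \times 2 \times 1 = 18
```

The low value comes from the low occurrence and detection ranks, although the severity rank of 9 would still call for attention on its own.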

Figure B.1 (continued). Columns: Occurrence rank; Prevention actions; Detection actions; Recommended action; Responsible and due date; Results of actions (actions taken).

For each of the rows, the prevention action is the selection of a component of higher quality and power rating, and the detection action is evaluation and control tests for reliability.

B.2 Example 2 - FMEA for an engine-generator system

The example illustrates the application of the FMEA method to an engine-generator system. The purpose of the study is limited to the system only and concerns the consequences of failures of elements associated with the power supply of the engine-generator or any other consequences of failures. This defines the boundaries of the analysis. The example partially illustrates the representation of the system in the form of a block diagram. Initially, five subsystems were identified (see Figure B.2), and one of them, the heating, ventilation and cooling system, is presented at lower levels of the structure down to the level at which it was decided to start the FMEA (see Figure B.3). The block diagrams also show the numbering system used for references in the FMEA worksheets.

For one of the engine-generator subsystems, an example of a worksheet (see Figure B.4) is shown that complies with the recommendations of this standard.

An important part of the FMEA is the definition and classification of the severity of the consequences of failures for the system as a whole. For the engine-generator system, they are presented in Table B.1.

Table B.1 — Definition and classification of failure severity for the engine-generator system as a whole

Figure B.2 - Diagram of engine-generator subsystems


Figure B.3 - Diagram of the heating, ventilation and cooling system

System 20 - Heating, ventilation and cooling system

Worksheet columns: Component; Type of failure (malfunction); Consequence of failure; Method or indication of failure detection; Redundancy; Remarks

Heating system (from 12 to 6 switches at each end), only when the mechanism is not working

Note - The mechanism can overheat if the heaters do not turn off automatically.

Component: Heaters

Type of failure: a) Heater burnout; b) Short circuit to earth due to insulation defect

Consequence of failure: Reduced heating; no heating, possible condensation

Method or indication of failure detection: a) Temperature less than 5° above ambient temperature; b) Use of a fuse or approved circuit breaker

Redundancy and remarks: A single short circuit to earth should not lead to system failure. A single short circuit to earth should not lead to a system failure for a long time.

Component: Heater terminal housing, cable (connection with heaters)

Type of failure: a) Overheating of the terminal or cable of one, six or all heaters; b) Short circuit of the terminals to earth (tracking)

Consequence of failure: No or reduced heating, condensation; lack of all heating, condensation

Method or indication of failure detection: Temperature less than 5° above ambient temperature; supply verified

Figure B.4 - FMEA for system 20
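The worksheet for system 20 can also be held as structured data, which makes it easy to keep one line per failure type. The following is a minimal Python sketch only: the class name `WorksheetRow`, its field names and the helper `failure_mode_lines` are illustrative assumptions and are not defined by the standard. The entries are taken from the worksheet for system 20.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorksheetRow:
    """One component entry of an FMEA worksheet (field names are assumptions)."""
    component: str
    failure_modes: List[str]
    effects: List[str]
    detection: List[str]
    redundancy: str = ""
    remarks: str = ""


def failure_mode_lines(rows: List[WorksheetRow]) -> List[str]:
    """Flatten the worksheet so that every failure mode gets its own line."""
    return [f"{row.component}: {mode}" for row in rows for mode in row.failure_modes]


# Entries transcribed from the system 20 worksheet; the grouping is an assumption.
system_20 = [
    WorksheetRow(
        component="Heaters",
        failure_modes=[
            "Heater burnout",
            "Short circuit to earth due to insulation defect",
        ],
        effects=["No heating, possible condensation"],
        detection=[
            "Temperature less than 5 degrees above ambient temperature",
            "Fuse or circuit breaker",
        ],
    ),
    WorksheetRow(
        component="Heater terminal housing, cable",
        failure_modes=[
            "Overheating of the terminal or cable of one, six or all heaters",
            "Short circuit of the terminals to earth",
        ],
        effects=["No or reduced heating, condensation"],
        detection=["Temperature less than 5 degrees above ambient temperature"],
    ),
]

for line in failure_mode_lines(system_20):
    print(line)
```

Keeping the failure modes as a list per component mirrors the worksheet layout, while the flattening helper gives the one-line-per-failure view used when reviewing the analysis.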

B.3 Example 3 - FMECA for a manufacturing process

The process FMECA examines each manufacturing process of the object in question. It investigates what could go wrong, what protective measures are foreseen or already exist in case of failure, how often a failure can happen, and how such situations can be eliminated by modernizing the object or the process. The goal is to focus on possible (or known) problems in order to maintain or achieve the required quality of the finished product. Enterprises that assemble complex objects, such as passenger cars, are well aware of the need to require component suppliers to perform this analysis. The main beneficiaries are the component suppliers. Carrying out the analysis forces a re-check of deviations from the manufacturing technology, and sometimes of failures, which leads to expenditure on improvement.

The worksheet form for the process FMECA is similar to the worksheet form for the product FMECA, but there are some differences (see Figure B.5). The measure of criticality is the action priority value (APN), which is very close in meaning to the risk priority value (RPN) considered above. The process FMECA examines the ways in which defects and nonconformities arise and how they may be delivered to the customer under the quality management procedures in place. It does not consider failures in service due to wear and tear or misuse.

[Worksheet column headings, partly illegible: object, failure, consequences, existing controls]

Incorrect dimensions or angles of the shoulder

Inserts without [illegible] on the die. Decreased performance

Misadjusted insert, giving the wrong thickness surrounding the insert. Reduced operability. Reduced service life

Deficiencies in production or control

Manufacturer and SAT plans

Analysis of sampling plans

Isolate defective components from good supplies

Assembly training

Insufficient gloss of nickel plating

Corrosion. Deviations at the final stage

Visual control in accordance with the statistical acceptance control plan

Introduce random control with a visual check for the correct gloss

Bad estimate of the mesh view

Insufficient metal extrusion. Incorrect wall thickness. Waste

Thin walls were found during machining

Deficiencies in production or quality management

Visual control in the statistical acceptance control plans

Introduce some sampling control to perform a visual check for the correct gloss

Reduced service life

Types of consequences: consequences for the intermediate process; consequences for the final process; consequences for assembly; consequences for the user

Criticality:

Occ = probability of occurrence × 10;

Sev = severity of consequences on a scale of 1-10;

Det = probability of non-detection before delivery to the customer;

APN = action priority value = Occ × Sev × Det.

Figure B.5 - Part of a process FMECA for a machined aluminium bar
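The legend of Figure B.5 defines the action priority value as the product of the occurrence, severity and detection ratings. As a minimal sketch of how such a ranking could be computed, the Python example below multiplies the three ratings and sorts the failure modes by the result. The numeric ratings are hypothetical, because the values in the figure did not survive. The names `Rating` and `action_priority`, and the assumption that the detection rating is also expressed on a 1-10 scale, are illustrative only.

```python
from typing import List, NamedTuple


class Rating(NamedTuple):
    failure_mode: str
    occ: int  # probability of occurrence x 10
    sev: int  # severity of consequences on a scale of 1-10
    det: int  # non-detection rating; a 1-10 scale is assumed here


def action_priority(r: Rating) -> int:
    """Action priority value: APN = Occ x Sev x Det."""
    return r.occ * r.sev * r.det


# Hypothetical ratings for failure modes named in the worksheet.
ratings: List[Rating] = [
    Rating("Incorrect dimensions or angles of the shoulder", occ=2, sev=8, det=3),
    Rating("Insufficient gloss of nickel plating", occ=3, sev=5, det=4),
    Rating("Incorrect wall thickness", occ=1, sev=9, det=5),
]

# Highest priority first: these are the failure modes to address first.
ranked = sorted(ratings, key=action_priority, reverse=True)

for r in ranked:
    print(f"{action_priority(r):4d}  {r.failure_mode}")
```

Because the three ratings are simply multiplied, a failure mode with a moderate severity but poor detectability can outrank a more severe one that is easily detected, which is why the ranking is normally reviewed alongside the individual ratings.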

Annex C (informative)

List of abbreviations in English used in the standard

FMEA - failure modes and effects analysis;

FMECA - failure modes, effects and criticality analysis;

DFMEA - FMEA used for design analysis in the automotive industry;

PRA - probabilistic risk analysis;

PFMEA - FMEA used for process analysis;

FTA - fault tree analysis;

RPN - risk priority value;

APN - action priority value.

UDC 362:621.001:658.382.3:006.354 OKS 13.110 T58

Key words: analysis of failure modes and consequences; analysis of failure modes, consequences and criticality; failure; redundancy; system structure; failure mode; failure criticality

Editor L.8 Afanasenko Technical editor of the PA. Guseva Proofreader U.C. Kvbashoea Computer layout P.A. Circles of oil

Sent for typesetting 10.04.2003. Signed for printing 16.06.2008. Format 60×84 1/8. Offset paper. Arial typeface.

Offset printing. Conventional printed sheets 4.65. Publisher's sheets 3.90. Print run 476 copies. Order 690.

FSUE "STANDARTINFORM", 123995 Moscow, Granatny per., 4. www.gostinfo.ru info@gostinfo.ru

Typeset at FSUE "STANDARTINFORM" on a PC.

Printed at the branch of FSUE "STANDARTINFORM", Moskovsky Pechatnik printing house, 105062 Moscow, Lyalin per., 6.