Failure mode and effects analysis (FMEA)

When the recovery time and the time between failures are exponentially distributed, the mathematical apparatus of Markov random processes is used to calculate the reliability indicators of repairable systems. The functioning of a system is then described by a process of changing states, and the system is depicted by a graph called a state transition graph.

A random process in a physical system S is called Markovian if it has the following property: for any moment t0, the probability of the state of the system in the future (t > t0) depends only on its state in the present (t = t0) and does not depend on when and how the system came to this state (in other words: with the present fixed, the future does not depend on the prehistory of the process, i.e., on the past, t < t0).

For a Markov process, the "future" depends on the "past" only through the "present": the future course of the process depends only on those past events that affected the state of the process at the present moment.

A Markov process is a process without aftereffect; this does not mean complete independence from the past, since the past manifests itself in the present state.

When using the method, in the general case it is necessary to have, for the system S, a mathematical model in the form of a set of system states S1, S2, …, Sn in which the system can reside during failures and restorations of its elements.

When compiling the model, the following assumptions are introduced:

Failed elements of the system (or of the object under consideration) are restored immediately (the beginning of restoration coincides with the moment of failure);

There are no restrictions on the number of restorations;

All flows of events that transfer the system (object) from state to state are Poisson (the simplest); the random process of transitions is then a Markov process with continuous time and the discrete states S1, S2, …, Sn.
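The exponential law behind these Poisson flows is what makes them "memoryless", which is the essence of the Markov property. A minimal Monte Carlo sketch (the rate and time values are assumed for illustration) checking the memoryless property P(T > s + t | T > s) = P(T > t):

import random

# Memorylessness of the exponential distribution: having already survived
# for time s changes nothing about the remaining life of the element.
lam, s, t = 0.2, 3.0, 5.0                      # assumed rate and times
samples = [random.expovariate(lam) for _ in range(200_000)]

survived_s = [x for x in samples if x > s]
p_conditional = sum(x > s + t for x in survived_s) / len(survived_s)
p_unconditional = sum(x > t for x in samples) / len(samples)

print(f"P(T>s+t | T>s) = {p_conditional:.4f}")    # both values are close to
print(f"P(T>t)         = {p_unconditional:.4f}")  # exp(-lam*t) = 0.3679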

Basic rules for compiling a model:

1. The mathematical model is depicted as a state graph, in which:

a) circles (the vertices of the graph S1, S2, …, Sn) are the possible states of the system S arising from element failures;

b) arrows are the possible directions of transitions from one state Si to another Sj.

The transition intensities are indicated above or below the arrows.

Graph examples:

S0 – working (operable) state;

S1 – failure state.

A "loop" denotes a delay in the given state; for S0 and S1, respectively, it means:

the working state continues;

the failure state continues.

The state graph reflects a finite (discrete) number of possible states of the system, S1, S2, …, Sn. Each vertex of the graph corresponds to one of the states.

2. To describe the random process of transitions between states (failure/restoration), the state probabilities

P1(t), P2(t), …, Pi(t), …, Pn(t)

are used, where Pi(t) is the probability that the system is in the i-th state at the moment t.

Obviously, for any t

P1(t) + P2(t) + … + Pn(t) = 1

(the normalization condition, since no states other than S1, S2, …, Sn exist).

3. From the state graph, a system of ordinary first-order differential equations (the Kolmogorov-Chapman equations) is compiled.

Let us consider an installation element, or an installation itself without redundancy, which can be in two states: S0 – trouble-free (operable), and S1 – failure (restoration).

Let us determine the corresponding probabilities of the element states, P0(t) and P1(t), at an arbitrary moment t under different initial conditions. We solve this problem under the condition, as already noted, that the flow of failures is the simplest with λ = const and the flow of restorations with μ = const, so that the time between failures and the recovery time are exponentially distributed.

For any moment of time, the sum of the probabilities P0(t) + P1(t) = 1, being the probability of a certain event. Let us fix the moment t and find the probability P0(t + ∆t) that at the moment t + ∆t the element is operating. This event is possible in two cases (hypotheses).

1. At time t the element was in state S0, and during the time ∆t no failure occurred. The probability of this is determined by the rule of multiplication of the probabilities of independent events. The probability that at the moment t the element was in state S0 is P0(t). The probability that during ∆t it did not fail is e^(-λ∆t). Up to terms of a higher order of smallness, we can write e^(-λ∆t) ≈ 1 - λ∆t.

Therefore, the probability of this hypothesis is equal to the product P0(t)(1 - λ∆t).

2. At the moment t the element was in state S1 (under restoration), and during the time ∆t the restoration ended and the element passed into state S0. This probability is also determined by the rule of multiplication of the probabilities of independent events. The probability that at the moment t the element was in state S1 is P1(t). The probability that the restoration ended is determined through the probability of the opposite event, i.e.,

1 - e^(-μ∆t) ≈ μ∆t

Therefore, the probability of the second hypothesis is P1(t)·μ∆t.

The probability of the operating state of the system at the moment (t + ∆t) is determined as the probability of the sum of these incompatible events, i.e., of either hypothesis being fulfilled:

P0(t + ∆t) = P0(t)(1 - λ∆t) + P1(t)·μ∆t

Transferring P0(t) to the left-hand side, dividing the resulting expression by ∆t and taking the limit as ∆t → 0, we obtain the equation for the first state:

dP0(t)/dt = -λP0(t) + μP1(t)

Carrying out similar reasoning for the second state of the element, the state of failure (restoration), we obtain the second equation of state:

dP1(t)/dt = -μP1(t) + λP0(t)

Thus, to describe the probabilities of the element states, a system of two differential equations is obtained, whose state graph is shown in Fig. 2:

dP0(t)/dt = -λP0(t) + μP1(t)

dP1(t)/dt = λP0(t) - μP1(t)

If a directed state graph is available, the system of differential equations for the state probabilities Pk (k = 0, 1, 2, …) can be written immediately using the following rule: on the left-hand side of each equation is the derivative dPk(t)/dt, and the right-hand side contains as many terms as there are edges connected directly with the given state; if an edge ends in the given state, the term has a plus sign, and if it starts from the given state, a minus sign. Each term is equal to the product of the intensity of the flow of events that transfers the element or system along the given edge into another state, and the probability of the state in which the edge begins.
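This rule mechanizes well. Below is a minimal sketch (Python; the two-state graph and the numeric rates are illustrative assumptions, not from the source) that assembles the right-hand sides from a list of graph edges and integrates the resulting equations numerically:

import numpy as np
from scipy.integrate import solve_ivp

# Edges of the state graph: (from_state, to_state) -> transition intensity.
lam, mu = 0.01, 0.5                  # assumed failure/restoration rates, 1/h
rates = {(0, 1): lam, (1, 0): mu}
n = 2

# Generator matrix Q: Q[i, j] is the rate of the edge i -> j, and each
# diagonal entry collects, with a minus sign, all rates leaving state i,
# exactly as the rule above prescribes.
Q = np.zeros((n, n))
for (i, j), r in rates.items():
    Q[i, j] += r
for i in range(n):
    Q[i, i] = -Q[i].sum()

# Kolmogorov-Chapman equations in row-vector form: dP/dt = P Q.
def rhs(t, P):
    return P @ Q

sol = solve_ivp(rhs, (0.0, 100.0), [1.0, 0.0])   # start in S0
P0, P1 = sol.y[:, -1]
print(f"P0 = {P0:.4f}, P1 = {P1:.4f}, sum = {P0 + P1:.4f}")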

The system of differential equations can be used to determine the probability of no-failure operation of electrical systems, the availability function and availability factor, the probability that several elements of the system are under repair (restoration), the average time the system spends in any state, and the failure rate of the system, taking into account the initial conditions (the states of the elements).

Under the initial conditions P0(0) = 1, P1(0) = 0 and with P0 + P1 = 1, the solution of the system of equations describing the states of one element has the form

P0(t) = μ/(λ + μ) + λ/(λ + μ)·e^(-(λ+μ)t)

The probability of the failure state is

P1(t) = 1 - P0(t) = λ/(λ + μ) - λ/(λ + μ)·e^(-(λ+μ)t)

If at the initial moment the element was in the failure (restoration) state, i.e., P0(0) = 0, P1(0) = 1, then

P0(t) = μ/(λ + μ) - μ/(λ + μ)·e^(-(λ+μ)t)

P1(t) = λ/(λ + μ) + μ/(λ + μ)·e^(-(λ+μ)t)
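A short numerical sketch of these closed-form solutions (the λ and μ values are assumed for illustration); it shows that both sets of initial conditions converge to the same limit μ/(λ + μ):

import math

lam, mu = 0.01, 0.5   # assumed failure and restoration intensities, 1/h

def p0_from_working(t):
    # P0(t) for the initial conditions P0(0) = 1, P1(0) = 0.
    return mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t)

def p0_from_failed(t):
    # P0(t) for the initial conditions P0(0) = 0, P1(0) = 1.
    return mu / (lam + mu) - mu / (lam + mu) * math.exp(-(lam + mu) * t)

for t in (0.0, 1.0, 10.0, 100.0):
    print(f"t={t:6.1f}  started working: {p0_from_working(t):.4f}  "
          f"started failed: {p0_from_failed(t):.4f}")
# Both tend to mu/(lam + mu) ≈ 0.9804 regardless of the starting state.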


Usually, in calculations of reliability indicators over sufficiently long time intervals (t ≥ (7-8)t_r, where t_r is the mean recovery time), the state probabilities can, without large error, be taken equal to the steady-state (limiting) probabilities:

P0(∞) = Kg = P0 and

P1(∞) = Kp = P1.

For the steady state (t → ∞), Pi(t) = Pi = const, and a system of algebraic equations with zero left-hand sides is compiled, since in this case dPi(t)/dt = 0. For the element considered above, the system of algebraic equations has the form:

0 = -λP0 + μP1,  P0 + P1 = 1.

Since Kg is the probability that the system is operable at an arbitrary moment t as t → ∞, the resulting system of equations yields P0 = Kg; that is, the probability of element operation equals the stationary availability factor, and the probability of failure equals the forced downtime factor (here T is the mean time between failures and t_r the mean recovery time):

lim(t→∞) P0(t) = Kg = μ/(λ + μ) = T/(T + t_r)

lim(t→∞) P1(t) = Kp = λ/(λ + μ) = t_r/(T + t_r)

i.e., the same result is obtained as in the analysis of the limiting states using the differential equations.
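A worked numeric example of these limiting formulas (the values of T and t_r are assumed for illustration):

# Stationary availability from the mean time between failures T and the
# mean recovery time t_r (assumed values):
T, t_r = 1000.0, 10.0                  # hours
lam, mu = 1 / T, 1 / t_r               # failure / restoration intensities
K_g = mu / (lam + mu)                  # availability factor
K_p = lam / (lam + mu)                 # forced downtime factor
assert abs(K_g - T / (T + t_r)) < 1e-12
print(f"K_g = {K_g:.4f}, K_p = {K_p:.4f}")   # K_g = 0.9901, K_p = 0.0099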

The method of differential equations can also be used to calculate the reliability indicators of non-repairable objects (systems).

In this case, the inoperable states of the system are "absorbing", and the intensities μ of exit from these states are excluded.

For a non-repairable object, the state graph contains a single transition, from S0 to S1 with intensity λ.

The system of differential equations is:

dP0(t)/dt = -λP0(t)

dP1(t)/dt = λP0(t)

Under the initial conditions P0(0) = 1, P1(0) = 0, applying the Laplace transform, the probability of being in the operable state, i.e., the probability of no-failure operation over the operating time t, will be P0(t) = e^(-λt).
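The same result can be checked symbolically. A minimal sketch using SymPy (the variable names are illustrative):

import sympy as sp

t = sp.symbols("t", nonnegative=True)
lam = sp.symbols("lambda", positive=True)
P0 = sp.Function("P0")

# Non-repairable object: the failure state is absorbing, so the working
# state obeys dP0/dt = -lambda * P0 with P0(0) = 1.
sol = sp.dsolve(sp.Eq(P0(t).diff(t), -lam * P0(t)), P0(t), ics={P0(0): 1})
print(sol)   # Eq(P0(t), exp(-lambda*t))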

FMEA analysis is today recognized as one of the most effective tools for improving the quality and reliability of objects under development. It is aimed primarily at preventing the occurrence of possible defects, as well as at reducing the amount of damage and the likelihood of its occurrence.

Failure modes and effects analysis (FMEA) as a means of risk reduction is successfully used all over the world at enterprises in various industries. It is a universal method, applicable not only to any production facility but to almost any activity or individual process. Wherever there is a risk of defects or failures, FMEA makes it possible to evaluate the potential threat and choose the most suitable option for dealing with it.

FMEA terminology

The basic concepts on which the analysis is built are the definitions of defect and failure. Although both produce the same general result, negative consequences, they differ significantly. A defect is a negative result of the intended use of an object, while a failure is unplanned or abnormal operation during production or operation. In addition, there is the term nonconformity, which means a failure to meet the planned conditions or requirements.

The negative outcomes whose probability the FMEA method analyzes are given ratings, which can conditionally be divided into quantitative and expert ones. Quantitative estimates include the probability of occurrence and the probability of detection of a defect, measured as percentages. Expert assessments are given in points for the probability of occurrence and of detection of a defect, as well as for its significance.

The final indicators of the analysis are the complex risk of the defect and the risk priority number (RPN), which give an overall assessment of the significance of the defect or failure.

Analysis steps

Briefly, the FMEA analysis method consists of the following steps:

  • 1. Team building.
  • 2. Choice of the object of analysis; definition of the boundaries of each part of a composite object.
  • 3. Determination of the applications of the analysis.
  • 4. Selection of the types of nonconformities to be considered, based on time limits, type of consumers, geographical conditions, etc.
  • 5. Approval of the form in which the results of the analysis will be presented.
  • 6. Designation of the elements of the object in which failures or defects may occur.
  • 7. Compilation of a list of the most significant possible defects for each element.
  • 8. Determination of the possible consequences of each defect.
  • 9. Evaluation of the probability of occurrence and of the severity of the consequences for all defects.
  • 10. Calculation of the risk priority number for each defect (a calculation sketch follows below).
  • 11. Ranking of potential failures/defects by significance.
  • 12. Development of measures to reduce the likelihood of occurrence or the severity of consequences by changing the design or the production process.
  • 13. Recalculation of the grades.

If necessary, steps 9-13 are repeated until an acceptable risk priority number is obtained for each of the significant defects.
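A minimal sketch of steps 9-11 (the defect entries and the 1-10 point scales are illustrative assumptions; a real FMEA worksheet defines its own scales and entries):

from dataclasses import dataclass

@dataclass
class Defect:
    name: str
    severity: int     # S, expert score 1..10
    occurrence: int   # O, expert score 1..10
    detection: int    # D, expert score 1..10 (10 = hardest to detect)

    @property
    def rpn(self) -> int:
        # Risk priority number: the usual FMEA aggregate S * O * D.
        return self.severity * self.occurrence * self.detection

defects = [                                  # illustrative entries only
    Defect("seal leak", severity=8, occurrence=3, detection=4),
    Defect("relay contact bounce", severity=5, occurrence=6, detection=7),
    Defect("winding overheating", severity=9, occurrence=2, detection=5),
]

# Step 11: rank potential defects by significance (descending RPN).
for d in sorted(defects, key=lambda d: d.rpn, reverse=True):
    print(f"{d.name:24s} RPN = {d.rpn}")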

Types of analysis

Depending on the stage of product development and on the object of analysis, the FMEA method is divided into the following types:

  • SFMEA – analysis of the interaction between the individual elements of the whole system
  • DFMEA – design analysis, aimed at preventing an immature design from being launched into production
  • PFMEA – process analysis, which allows processes to be worked out and brought to an applicable state

Purpose of FMEA analysis

Using the FMEA analysis method at a manufacturing plant, the following results can be achieved:

  • reducing the cost of production, as well as improving its quality by optimizing the production process;
  • reduction of after-sales costs for repairs and maintenance;
  • reduction of production preparation time;
  • reduction in the number of product improvements after the start of production;
  • increase in consumer satisfaction and, as a result, an increase in the reputation of the manufacturer.

A peculiarity of failure modes and effects analysis is that in the short term FMEA may not provide tangible financial benefits and may even be costly. In strategic planning, however, it plays a decisive role: carried out once at the pre-production stage, it subsequently brings economic benefits throughout the product life cycle. Moreover, the cost of the negative consequences of defects can often exceed the final cost of the product; the aviation industry, where hundreds of human lives depend on the reliability of every part, is a case in point.

Each main component of the system is studied in order to determine the ways it can pass into an emergency state. The analysis is predominantly qualitative and is carried out "bottom-up", assuming that emergency conditions occur "one at a time".

Failure modes, effects and criticality analysis is much more detailed than fault tree analysis, since all possible failure modes or emergency states are considered for each element of the system.

For example, a relay may fail for the following reasons:

– the contacts have not opened or closed;

– delay in closing or opening of the contacts;

– short circuit of the contacts to the housing, to the power source, between the contacts, and in the control circuits;

– bounce of the contacts (unstable contact);

– contact arc, noise generation;

– rupture of the winding;

– short circuit of the winding;

– low or high winding resistance;

– overheating of the winding.

For each type of failure, the consequences are analyzed, methods for eliminating or compensating for failures are outlined, and a list of necessary checks is compiled.

For example, for tanks, vessels and pipelines, this list may be as follows:

– variable parameters (flow rate, quantity, temperature, pressure, saturation, etc.);

– systems (heating, cooling, power supply, control, etc.);

– special states (maintenance, switching on, switching off, content replacement, etc.);

– changes in conditions or state (too large, too small, water hammer, settling, immiscibility, vibration, rupture, leakage, etc.).

The forms of the documents used in the analysis are similar to those used in preliminary hazard analysis but are considerably more detailed.

Criticality analysis provides for the classification of each element according to the degree of its influence on the performance of the overall task by the system. Criticality categories are established for the various kinds of failures:

The method does not provide a quantitative assessment of the possible consequences or damage, but it allows the following questions to be answered:

– which elements should be subjected to detailed analysis in order to eliminate hazards leading to accidents;

– which elements require special attention in the production process;

– what the standards of incoming inspection should be;

– where special procedures, safety rules and other protective measures should be introduced;

– how funds can be spent most effectively to prevent accidents.

7.3.3. Analysis of the diagram of all possible consequences of a malfunction or failure of the system ("fault tree")

This method of analysis is a combination of quantitative and qualitative techniques for recognizing the conditions and factors that can lead to an undesirable event (“top event”). The conditions and factors taken into account are built into a graphic chain. Starting from the top, the causes or emergency states of the next, lower functional levels of the system are identified. Many factors are analyzed, including human interactions and physical phenomena.

Attention is concentrated on those effects of a malfunction or accident that are directly related to the top event. The method is especially useful for the analysis of systems with many areas of contact and interaction.

Representing the event as a graphic diagram makes it easy to understand the behavior of the system and of the factors included in it. Because of the bulkiness of the "trees", their processing may require computer systems, and the bulkiness also makes a "fault tree" difficult to verify.

The method is primarily used in risk assessment to assess the probabilities or frequencies of faults and accidents. Section 7.4 gives a more detailed description of the method.

7.3.4. Analysis of the diagram of possible consequences of an event
("event tree")

"Event Tree" (ET) - an algorithm for considering events emanating from the main event (emergency). DS is used to determine and analyze the sequence (options) of the development of an accident, including complex interactions between technical safety systems. The probability of each emergency scenario is calculated by multiplying the probability of the main event by the probability of the end event. In its construction, direct logic is used. All values ​​of probability of non-failure operation P very small. "Tree" does not give numerical solutions.

Example 7.1. Suppose a preliminary hazard analysis (PHA) has revealed that the critical part of the reactor, i.e., the subsystem with which the risk begins, is the reactor cooling system. The analysis therefore begins by considering the sequence of possible events from the moment the pipeline of the cooling plant fails; this is called the initiating event, and its probability equals P(A) (Fig. 7.1). That is, the accident begins with the destruction (rupture) of the pipeline, event A.

Next, we analyze the possible scenarios of event development (B, C, D and E) that may follow the rupture of the pipeline. Fig. 7.1 shows a "tree of initiating events" displaying all possible alternatives.

The first branch examines the state of the electrical supply. If power is available, the emergency core cooling system (ECCS) is analyzed next. Failure of the ECCS leads to melting of the fuel and to various leaks of radioactive products, depending on the integrity of the structure.

For analysis using a binary system, in which the elements either perform their functions or fail, the number of potential failures is 2^N - 1, where N is the number of elements considered. In practice, the original "tree" can be simplified using engineering logic and reduced to the simpler tree shown at the bottom of Fig. 7.1.

First of all, the question of the availability of electric power is of interest. The question is what the probability P_B of a power failure is and what effect this failure has on the other protection systems. If there is no power supply, none of the actions provided for in the event of an accident, such as using sprayers to cool the reactor core, can in fact be carried out. As a result, the simplified "event tree" contains no branching in the case of a power failure, and a large leak may occur, whose probability equals P_A·P_B.
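A small sketch of this branch arithmetic for the simplified tree (all probability values are invented for illustration; the conditional character of P_B discussed below would enter through the value assigned here):

# Simplified "event tree" for the pipe-rupture example: the probability of
# each accident sequence is the product of the probabilities along its branch.
P_A = 1e-4     # initiating event: pipeline rupture (illustrative)
P_B = 1e-3     # electric power fails, given the rupture (illustrative)
P_C1 = 1e-2    # ECCS fails, given that power is available (illustrative)

sequences = {
    "rupture, power lost -> large leak":         P_A * P_B,
    "rupture, power ok, ECCS fails":             P_A * (1 - P_B) * P_C1,
    "rupture, power ok, ECCS works (base case)": P_A * (1 - P_B) * (1 - P_C1),
}
for name, p in sequences.items():
    print(f"{name:44s} {p:.3e}")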

If the failure of the electric power supply depends on the failure of the pipeline of the reactor cooling system, the probability P_B should be calculated as a conditional probability to account for this dependence. If power is available, the subsequent options in the analysis depend on the state of the ECCS. It may or may not work; its failure, with probability P_C1, leads to the sequence of events depicted in Fig. 7.1.

Fig. 7.1. "Event tree"

It should be noted that various accident development variants are possible for the system under consideration. If the radioactive material removal system is operational, the radioactive leaks are smaller than if it fails. Of course, a failure generally leads to a sequence of events with a lower probability than the no-failure case.

Fig. 7.2. Probability histogram for various leak rates

Having considered all variants of the "tree", one can obtain the range of possible leaks and the corresponding probabilities for the various accident development sequences (Fig. 7.2). The top line of the "tree" is the main variant of the reactor accident. This sequence assumes that the pipeline fails while all safety systems remain operational.


Failure Mode and Effect Analysis (FMEA)

Failure modes and effects analysis (FMEA) is an inductive risk assessment tool that considers risk as a product of the following components:

  • severity of the consequences of a potential failure (S)
  • probability of occurrence of a potential failure (O)
  • probability of not detecting the failure (D)

The risk assessment process consists of the following.

Assigning an appropriate risk level (high, medium or low) to each of the above risk components. Given detailed practical and theoretical information about the principles of design and operation of the device being qualified, risk levels can be assigned objectively both for the possibility of a failure and for the probability of not detecting it. The possibility of a failure occurring can be considered as the time interval between occurrences of the same failure.

Assigning risk levels to the probability of not detecting a failure requires knowing how a failure of a particular instrument function will manifest itself. For example, a failure of the instrument's system software means that the spectrophotometer cannot be operated at all; such a failure is easily detected and can therefore be assigned a low risk level. An error in the measurement of optical density, however, cannot be detected in a timely manner if calibration has not been performed; accordingly, a failure of the spectrophotometer's optical-density measurement function should be assigned a high level of risk of non-detection.

Assigning a risk severity level is somewhat more subjective and depends to some extent on the requirements of the particular laboratory. Here the severity level is considered as a combination of the quality (Q), compliance (C) and business (B) categories shown in Table 2.

Some suggested criteria for assigning a risk level to each of the components of the overall risk assessment discussed above are presented in Table 2. The proposed criteria are most suitable for use in a regulated product quality control environment. Other laboratory applications may require a different set of assignment criteria. For example, the impact of a failure on the performance of a forensic laboratory may ultimately affect the outcome of a criminal trial.

Table 2: suggested criteria for assigning risk levels

Risk level | Quality (Q) severity | Compliance (C) severity | Business (B) severity | Occurrence probability (O) | Detection probability (D)

High | Likely to harm the consumer | Will lead to a product recall | More than one week of downtime, or potential major loss of revenue | More than once within three months | Unlikely to be detected in most cases

Medium | Probably will not harm the consumer | Will result in a warning letter | Downtime of up to one week, or potential significant loss of income | Once every three to twelve months | May be detected in some cases

Low | Will not harm the consumer | Will lead to the discovery of a nonconformity during an audit | Downtime of up to one day, or minor loss of income | Once every one to three years | Likely to be detected


The calculation of the level of total risk involves:

  1. Assigning a numerical value to each risk severity level for each individual severity category, as shown in Table 3.
  2. Summing the numerical values of the severity levels for the three categories, which gives a cumulative quantitative severity level in the range from 3 to 9.
  3. Converting the cumulative quantitative severity level into a cumulative qualitative severity level, as shown in Table 4.

Table 3: assignment of quantitative severity levels

Qualitative severity level | Quantitative severity level
High | 3
Medium | 2
Low | 1

Table 4: cumulative severity calculation

Cumulative quantitative severity level | Cumulative qualitative severity level
7-9 | High
5-6 | Medium
3-4 | Low
  1. Multiplying the cumulative qualitative Severity level (S) by the Occurrence level (O) gives the Risk Class, as shown in Table 5.
  2. The Risk Factor is then calculated by multiplying the Risk Class by the Undetectability level (D), as shown in Table 6.

Table 5: risk class calculation (Risk class = Severity level × Occurrence level)

Occurrence level | Severity: Low | Medium | High
High | Medium | High | High
Medium | Low | Medium | High
Low | Low | Low | Medium

Table 6: risk factor calculation (Risk factor = Risk class × Undetectability level)

Risk class | Undetectability: Low | Medium | High
High | Medium | High | High
Medium | Low | Medium | High
Low | Low | Low | Medium

An important feature of this approach is that the calculation of the Risk Factor gives additional weight to the occurrence and detectability factors. For example, if a failure is of high severity but is unlikely to occur and is easy to detect, the overall risk factor will be low. Conversely, if the potential severity is low but the failure is likely to occur frequently and is not easily detected, the cumulative risk factor will be high.

Thus severity, which is often difficult or even impossible to minimize, does not dominate the total risk associated with a specific functional failure, whereas occurrence and non-detectability, which are easier to minimize, have a greater impact on the overall risk.
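Tables 3-6 can be expressed compactly in code. A sketch (the level names and matrices follow the tables above; the example inputs are assumed):

LEVELS = {"Low": 1, "Medium": 2, "High": 3}

def cumulative_severity(q: str, c: str, b: str) -> str:
    # Tables 3-4: sum the per-category scores (Low=1 .. High=3), then map
    # the 3..9 total back to a qualitative level.
    total = LEVELS[q] + LEVELS[c] + LEVELS[b]
    return "High" if total >= 7 else "Medium" if total >= 5 else "Low"

# Tables 5 and 6 share the same matrix: rows are the first factor
# (occurrence, then risk class), columns the second (severity, then
# undetectability).
MATRIX = {
    "High":   {"Low": "Medium", "Medium": "High",   "High": "High"},
    "Medium": {"Low": "Low",    "Medium": "Medium", "High": "High"},
    "Low":    {"Low": "Low",    "Medium": "Low",    "High": "Medium"},
}

def risk_factor(q, c, b, occurrence, undetectability):
    s = cumulative_severity(q, c, b)
    risk_class = MATRIX[occurrence][s]          # Table 5: S x O
    return MATRIX[risk_class][undetectability]  # Table 6: class x D

# High severity, but rare and easy to detect -> low overall risk factor,
# as in the example above.
print(risk_factor("High", "High", "High", occurrence="Low", undetectability="Low"))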

Discussion

The risk assessment process consists of four main steps, listed below:

  1. Conducting the assessment in the absence of any mitigating tools or procedures
  2. Establishing tools and procedures for minimizing the assessed risks, based on the results of the assessment
  3. Conducting a risk assessment after the implementation of the mitigation measures, to determine their effectiveness
  4. If necessary, establishing additional mitigation tools and procedures, and reassessing

The risk assessment summarized in Table 7 and discussed below is considered from the perspective of the pharmaceutical and related industries. Nevertheless, similar processes can be applied in any other sector of the economy; if other priorities are applied, different, but no less justified, conclusions may be reached.

Initial assessment

We start with the spectrophotometer's working functions: wavelength accuracy and precision, and spectral resolution, which determine whether the instrument can be used in UV/visible identity testing. Any inaccuracy or insufficient precision in the wavelength, or insufficient resolution of the spectrophotometer, can lead to erroneous identity test results.

In turn, this can lead to the release of products whose identity has not been reliably confirmed, up to their receipt by the final consumer. It can also lead to product recalls and the consequent significant costs or loss of revenue. Therefore, in each severity category these functions present a high level of risk.

Table 7: risk assessment with FMEA for UV/V spectrophotometer

                             Before minimization       After minimization
Functions                    Q  C  B  S  O  D  RF      Q  C  B  S  O  D  RF

Working functions
Wavelength accuracy          H  H  H  H  M  H  H       H  H  H  H  L  L  L
Wavelength reproducibility   H  H  H  H  M  H  H       H  H  H  H  L  L  L
Spectral resolution          H  H  H  H  M  H  H       H  H  H  H  L  L  L
Scattered (stray) light      H  H  H  H  M  H  H       H  H  H  H  L  L  L
Photometric stability        H  H  H  H  H  H  H       H  H  H  H  L  L  L
Photometric noise            H  H  H  H  H  H  H       H  H  H  H  L  L  L
Spectral baseline flatness   H  H  H  H  H  H  H       H  H  H  H  L  L  L
Photometric accuracy         H  H  H  H  H  H  H       H  H  H  H  L  L  L

Data quality and integrity functions
Access controls              H  H  H  H  L  L  L       H  H  H  H  L  L  L
Electronic signatures        H  H  H  H  L  L  L       H  H  H  H  L  L  L
Password controls            H  H  H  H  L  L  L       H  H  H  H  L  L  L
Data security                H  H  H  H  L  L  L       H  H  H  H  L  L  L
Audit trail                  H  H  H  H  L  L  L       H  H  H  H  L  L  L
Timestamps                   H  H  H  H  L  L  L       H  H  H  H  L  L  L

H = High, M = Medium, L = Low
Q = Quality, C = Compliance, B = Business, S = Severity, O = Occurrence, D = Undetectability, RF = Risk Factor

Analyzing further: scattered (stray) light affects the correctness of optical density measurements. Modern instruments can take it into account and correct the calculations accordingly, but this requires that the scattered light be determined and stored in the spectrophotometer's operating software. Any inaccuracy in the stored stray-light parameters will result in incorrect optical density measurements, with the same consequences for photometric stability, noise, accuracy and baseline flatness as indicated in the next paragraph. Therefore, in each severity category these functions present a high level of risk. Wavelength accuracy and precision, resolution and scattered light depend largely on the optical properties of the spectrophotometer. Modern diode-array instruments have no moving parts, so failures of these functions can be assigned a medium probability of occurrence. However, in the absence of special checks a failure of these functions is unlikely to be detected, so non-detection is assigned a high risk level.

Photometric stability, noise and accuracy, as well as baseline flatness, affect the accuracy of the optical density measurement. If the spectrophotometer is used for quantitative measurements, any error in the optical density measurement may lead to erroneous results being reported. If the results of these measurements are used to release a batch of a pharmaceutical product to the market, end users may receive poor-quality batches of the drug.

Such batches would have to be recalled, which in turn entails significant costs or loss of income. Therefore, in each severity category these functions present a high level of risk. In addition, these functions depend on the quality of the UV lamp. UV lamps have a standard life of approximately 1,500 hours, or about 9 weeks of continuous use, so failure is assigned a high probability of occurrence. Moreover, in the absence of any precautions, a failure of any of these functions is unlikely to be detected, which implies a high non-detection factor.

Let us now turn to the data quality and integrity functions, since test results are used to make decisions about the suitability of a pharmaceutical product for its intended use. Any compromise of the correctness or integrity of the records created could potentially result in a product of undetermined quality being released to the market, which could harm the end user, and the product might have to be recalled, resulting in large losses to the laboratory or company. Therefore, in each severity category these functions present a high level of risk. However, once the required instrument software configuration has been set up properly, these functions are unlikely to fail, and any failure can be detected in a timely manner.

For example:

  • Granting access to the relevant application only to authorized persons, before the program opens, can be implemented by requiring the system to ask for a username and password. If this function fails, the system will no longer prompt for the username and password, so the failure will be detected immediately. The risk of not detecting this failure is therefore low.
  • When a file that must be certified with an electronic signature is created, a dialog box opens that requires a username and password; accordingly, if a system failure occurs, this window will not open and the failure will be detected immediately.

Minimization

Although the severity of a failure of the working functions cannot be minimized, the possibility of failure can be significantly reduced and the probability of detecting such a failure increased. Before using the instrument for the first time, it is recommended to qualify the following functions:

  • wavelength accuracy and precision
  • spectral resolution
  • scattered light
  • photometric accuracy, stability and noise
  • flatness of the spectral baseline,

and then to re-qualify them at specified intervals, as this will significantly reduce both the possibility of a failure and the probability of not detecting one. Because photometric stability, noise and accuracy, and baseline flatness depend on the condition of the UV lamp, and standard deuterium lamps have a lifespan of approximately 1,500 hours (9 weeks) of continuous use, it is recommended that the operating procedure require the lamp(s) to be turned off while the spectrophotometer is idle, that is, when it is not in use. It is also recommended to perform preventive maintenance (PM) every six months, including lamp replacement and requalification (RQ).

The rationale for the requalification period depends on the lifetime of the standard UV lamp. This is approximately 185 weeks when the lamp is used for 8 hours once a week; the corresponding lifetimes in weeks are shown in Table 8. Thus, if the spectrophotometer is used four to five days a week, the UV lamp will last about eight to ten months.

Table 8: average lifetime of a UV lamp, depending on the average number of eight-hour days of operation of the spectrophotometer during the week

Average number of days of use per week Average lamp life (weeks)
7 26
6 31
5 37
4 46
3 62
2 92
1 185

Preventive maintenance with requalification (PM/RQ) every six months will ensure trouble-free operation of the instrument. If the spectrophotometer is operated six to seven days a week, the lamp life is expected to be about six months, so PM/RQ every three months is more appropriate to ensure adequate uptime. Conversely, if the spectrophotometer is used once or twice a week, PM/RQ every 12 months will suffice.
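Table 8 is consistent with simple arithmetic: lamp life in weeks ≈ rated life in hours divided by (8 h × days of use per week). A sketch reproducing it (the rated life of about 1,480 h is inferred from the table itself; the text rounds this to approximately 1,500 hours):

RATED_LIFE_H = 1480   # inferred from Table 8; the text quotes ~1500 h

for days_per_week in range(7, 0, -1):
    weeks = RATED_LIFE_H / (8 * days_per_week)
    print(f"{days_per_week} day(s)/week -> about {weeks:.0f} weeks of lamp life")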

In addition, because of the relatively short life of the deuterium lamp, it is recommended that the following parameters be checked, preferably on every day the spectrophotometer is used, as this will further guarantee its correct functioning:

  • lamp brightness
  • dark current
  • calibration of deuterium emission lines at wavelengths of 486 and 656.1 nm
  • filter and shutter speed
  • photometric noise
  • spectral baseline flatness
  • short-term photometric noise

Modern instruments already contain these tests within their software, and they can be performed by selecting the appropriate function. If any of the tests fail, except the dark current test and the filter and shutter speed test, the deuterium lamp must be replaced. If the dark current test or the filter and shutter speed test fails, the spectrophotometer should not be operated and should instead be sent for repair and requalification. Establishing these procedures will minimize both the risk that a working function fails and the risk that a failure goes undetected.

The risk factors for the data quality and integrity functions are already low without any mitigation. It is therefore only necessary to check the operation of these functions during OQ and PQ to confirm the correct configuration; after that, any failure can be detected in a timely manner. Personnel, however, must be trained or instructed to recognize a failure and take appropriate action.

Conclusion

Failure mode and effects analysis (FMEA) is an easy-to-use risk assessment tool that can readily be applied to assess the risks of laboratory equipment failures affecting quality, compliance and business operations. Performing such a risk assessment enables informed decisions to be made regarding the implementation of appropriate controls and procedures for economically managing the risks associated with the failure of critical instrument functions.

Failure mode and effects analysis (FMEA; Russian abbreviation AVPO) is applied for the qualitative assessment of the reliability and safety of technical systems. It is a method for identifying the severity of the consequences of potential failure modes and for providing mitigation measures. An essential feature of the method is the consideration of the system as a whole and of each of its component parts (elements) from the standpoint of how it can become faulty (failure mode and cause) and how the failure affects the technological system (failure consequences). The term "system" here is understood as a set of interrelated or interacting elements (GOST R 51901.12-2007) and is used to describe hardware (technical means), software (and their combinations) or a process. In general, AVPO is applied to individual types of failures and their consequences for the system as a whole.

It is recommended to carry out AVPO at the early stages of system (object, product) development, when eliminating or reducing the number and/or types of failures and their consequences is more cost-effective; however, the principles of AVPO can be applied at all stages of the system life cycle. Each failure mode is considered as independent, so the procedure is not suitable for dealing with dependent failures or failures resulting from a sequence of several events.

Failure mode and effects analysis is an inductive, "bottom-up" analysis method that systematically analyzes all possible failure modes or emergency states and identifies their resulting effects on the system, based on sequential consideration of one element after another. Individual emergency situations and element failure modes are identified and analyzed in order to determine their effect on other elements and on the system as a whole. The AVPO method can be performed in more detail than fault tree analysis, since it considers all possible failure modes or emergency states for each element of the system. For example, a relay may fail for the following reasons: the contacts did not open; delayed closing of the contacts; short circuit of the contacts to the case, to the power source, between the contacts, and in the control circuits; contact chatter; unstable electrical contact; contact arcing; winding rupture, etc.

Examples of general failure types are:

  • failure during operation;
  • failure to operate at the set time;
  • failure to stop operating at the set time;
  • premature activation, etc.

Additionally, a list of necessary checks should be drawn up for each category of equipment. For example, for tanks and other vessels such a list might include:

  • process parameters: volume, flow rate, temperature, pressure, etc.;
  • auxiliary systems: heating, cooling, power supply, feed, automatic control, etc.;
  • special states of the equipment: commissioning, maintenance during operation, decommissioning, catalyst change, etc.;
  • changes in the conditions or state of the equipment: excessive pressure deviation, water hammer, sediment, vibration, fire, mechanical damage, corrosion, rupture, leakage, wear, explosion, etc.;
  • characteristics of instrumentation and automation: sensitivity, tuning, delay, etc.

The method provides for consideration of all failure modes for each element. The causes and consequences of each failure (local, for the element, and general, for the system), the methods of detection, and the conditions for compensating the failure (for example, element redundancy or monitoring of the object) are subject to analysis. The assessment of the significance of the effect of a failure's consequences on the operation of the object is the severity of the failure. An example of classification by severity category when performing one type of AVPO (in qualitative form) is given in Table 5.3 (GOST R 51901.12-2007).

Table 5.3

Failure severity classification


The AVPO worksheet is a statement of the AVPO method itself, and its form is similar to those used in other qualitative methods, including expert assessment, differing in its greater detail. The AVPO method is focused on equipment and mechanical systems, is easy to understand, and does not require a mathematical apparatus. The analysis makes it possible to determine the need for design changes and to evaluate their effect on system reliability. The disadvantages of the method include the significant time required to carry it out, as well as the fact that it does not take into account combinations of failures or the human factor.