What Does MTBF Mean?
Guest post by Msc Teofilo Cortizo
Within maintenance management, MTBF (Mean Time Between Failures) is the most important KPI after Physical Availability. Unlike MTTF (Mean Time To Failure), which relates directly to the time the equipment is available, MTBF also includes the time spent in repair. That is, the count starts at a given failure and stops only after that fault has been repaired, the equipment has been restarted, and the failure has occurred again. According to ISO 12849:2013, this indicator can only be used for repairable equipment, and MTTF is the equivalent for non-repairable equipment.
The graphic below illustrates these occurrences:
To calculate the MTBF in Figure 01, we add the times T1 and T2 and divide by two. That is, we average all the times from one failure to its recurrence, repair included. It is, therefore, a simple arithmetic calculation. But what does it mean?
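As a minimal sketch of this arithmetic, assume the failures are logged as hours on a common clock. The logged values and the function name `mtbf_from_failures` are hypothetical and are not taken from the figure. Each interval between consecutive failures covers both the repair and the running time that follows it:

```python
from statistics import mean

def mtbf_from_failures(failure_times):
    """Mean of the intervals between consecutive failure events.

    failure_times: clock times (hours) at which each failure occurred,
    in ascending order. Each interval spans repair time plus uptime.
    """
    if len(failure_times) < 2:
        raise ValueError("need at least two failures to form one interval")
    intervals = [b - a for a, b in zip(failure_times, failure_times[1:])]
    return mean(intervals)

# Hypothetical log: failures at 0 h, 180 h and 400 h, so T1 = 180 and T2 = 220.
failures = [0, 180, 400]
print(mtbf_from_failures(failures))  # (180 + 220) / 2 = 200
```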
Generally speaking, this indicator is associated with the reliability of assets or asset systems, and it can even be applied to an individual repairable item, although data is rarely available at that level of detail. Maintenance managers set benchmark values and track performance on a chart over time. In general, the higher the MTBF the better, since it means fewer breakdowns and repairs over the analyzed period.
With these concepts established, some specific questions need to be answered:
1. Can we establish periodicity of a maintenance plan based on MTBF time?
2. Can I calculate my failure rates based on my MTBF?
3. Can I calculate my probability of failure based on my MTBF?
4. If the MTBF of my asset or system is 200 hours, after that time will it fail?
It is interesting to answer these questions separately:
1. Can we establish periodicity of a maintenance plan based on MTBF time?
The MTBF is an average calculated from a set of values. These values can be grouped into a histogram to generate a data distribution whose mean is the MTBF. Imagine that this distribution is Gaussian, so that a Normal curve has been fitted to the failure data. The chart below shows that the MTBF sits in the middle of the curve.
On this fitted PDF curve (probability density function of the failure times), the mean value, the MTBF, is reached after 50% of the failures have occurred. If we run the preventive plan at an interval equal to the MTBF, the asset will already have a 50% probability of having failed before the intervention. Therefore, the MTBF is not a number that indicates the optimal time for a scheduled intervention.
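The 50% figure can be checked with a short sketch using Python's standard library. The mean of 200 hours, the standard deviation of 40 hours and the 10% risk target are illustrative assumptions, not values from the article:

```python
from statistics import NormalDist

# Hypothetical failure-time model: Normal with mean (MTBF) 200 h, std dev 40 h.
mtbf_hours = 200.0
sigma_hours = 40.0
model = NormalDist(mu=mtbf_hours, sigma=sigma_hours)

# Probability that the asset has already failed when the plan runs at the MTBF.
p_fail_at_mtbf = model.cdf(mtbf_hours)
print(f"P(failure before MTBF) = {p_fail_at_mtbf:.2f}")

# Interval that would instead accept only a 10% probability of failure
# (the 10% target is an arbitrary example, not a recommendation).
interval_10pct = model.inv_cdf(0.10)
print(f"Interval for 10% failure probability = {interval_10pct:.1f} h")
```

Under these assumptions, any interval chosen to keep the failure probability below 50% falls well before the MTBF, which is why the mean alone is a poor guide for scheduling.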
2. Can I calculate my failure rates based on my MTBF?
When failure data is modeled to calculate the MTBF, only the exponential distribution yields a fixed value for the failure rate, equal to the inverse of the MTBF:
MTBF = 1 / λ
In this distribution, the MTBF time already corresponds to 63.2% probability of failure.
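The 63.2% figure follows directly from the exponential model. As a short worked derivation, writing F(t) for the cumulative probability of failure and R(t) for the reliability, with R(t) = exp(-λt):

```latex
\begin{aligned}
F(t) &= 1 - R(t) = 1 - e^{-\lambda t} \\
F(\mathrm{MTBF}) &= 1 - e^{-\lambda \cdot (1/\lambda)} = 1 - e^{-1} \approx 0.632
\end{aligned}
```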
For any model other than the exponential, the failure rate is variable and time-dependent, so its calculation also depends on the probability density function f(t) and the reliability function R(t):
λ(t) = h(t) = f(t) / R(t)
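To make the time dependence concrete, the sketch below evaluates h(t) = f(t) / R(t) for a two-parameter Weibull distribution. The Weibull model is chosen here only as an illustration and is not named in the article. The shape and scale values are hypothetical. With shape 1.0 the Weibull reduces to the exponential case and the rate is constant; with a larger shape it grows with time:

```python
import math

def weibull_pdf(t, shape, scale):
    """Probability density f(t) of a two-parameter Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1) * math.exp(-((t / scale) ** shape))

def weibull_reliability(t, shape, scale):
    """Reliability R(t): probability of surviving beyond time t."""
    return math.exp(-((t / scale) ** shape))

def hazard(t, shape, scale):
    """Failure rate h(t) = f(t) / R(t)."""
    return weibull_pdf(t, shape, scale) / weibull_reliability(t, shape, scale)

scale_hours = 200.0  # hypothetical characteristic life
for shape in (1.0, 2.5):  # 1.0 reduces to the exponential case
    rates = [hazard(t, shape, scale_hours) for t in (50.0, 100.0, 200.0)]
    print(shape, [f"{r:.5f}" for r in rates])
```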
Although the exponential distribution, which implies a constant failure rate over time, is the most widely adopted in reliability projects, most assets show a failure rate that varies along their “bathtub curve”, as exemplified by Moubray:
This means that the exponential distribution is not the best suited to reflect the behavior of most assets in an industrial plant.
3. Can I calculate my probability of failure based on my MTBF?
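As seen above, only the exponential distribution has a constant failure rate that can be calculated as the inverse of the MTBF. In this case, yes, we can calculate the probability of failure of an asset from the failure density below:
f(t) = λ·exp(-λt)
Integrating this density gives the cumulative probability of failure up to time t, F(t) = 1 - exp(-λt), which is the source of the 63.2% value at t = MTBF.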
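A minimal sketch of this calculation follows. It assumes the exponential model and reads "probability of failure" as the cumulative probability F(t) = 1 - exp(-λt) that a failure occurs by time t, which is the integral of the density f(t). The 200-hour MTBF and the function name `prob_failure_by` are illustrative:

```python
import math

def prob_failure_by(t_hours, mtbf_hours):
    """Cumulative probability of failure F(t) = 1 - exp(-t / MTBF).

    Valid only under the exponential (constant failure rate) assumption.
    """
    failure_rate = 1.0 / mtbf_hours  # lambda = 1 / MTBF
    return 1.0 - math.exp(-failure_rate * t_hours)

mtbf = 200.0  # hours
for t in (50.0, 100.0, 200.0):
    print(f"t = {t:5.0f} h  ->  F(t) = {prob_failure_by(t, mtbf):.3f}")
```

At t equal to the MTBF the result is about 0.632, regardless of the MTBF value chosen.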
For other models, where the failure rate depends on time, the probability of failure can only be calculated by modeling the data and fitting a parametric statistical curve.
4. If the MTBF of my asset or system is 200 hours, after that time will it fail?
The question is, what exactly does that number mean? It was shown above that the MTBF should not be used as a maintenance plan frequency. Taken alone, this time means little unless it is compared with its own history over the months. If the parametric model governing the behavior of the asset has not been determined in a reliability study, the 200 hours say nothing about a probability of failure, so we cannot state that the asset will fail after that time. The MTBF provided by equipment manufacturers is a different case: through life tests they fit exponential curves and thus calculate the time by which 63.2% of the samples will have failed.
I hope the article has helped us reflect on the definition of an indicator that is so widely used and yet so misunderstood within industrial maintenance management.
Msc Teofilo Cortizo
Reliability Engineer