As regular readers know, MTBF by itself is misleading. Even when it represents actual data it can be deceptive. A high MTBF value doesn’t mean the product is reliable.
In a previous article, 10 Reasons to Avoid MTBF, I mentioned that it is possible to have a relatively high MTBF value when the actual reliability is low. Ashley sent me the following note:
Hi Fred, i love reading your articles they are very informative. I have a question about something you said in a comment which i am hoping you will be able to clarify for me. You said products with higher MTBF can actually be less reliable than products with a lower MTBF
I have tried to find information on how this is possible online, and tried to do the maths myself to make this happen but i have to admit i am struggling.
No worries, Ashley, let’s work out an example to illustrate what I meant.
A Sample Set of Data
Let’s create an example data set with a decreasing hazard rate. I used R and the command of
This provided a set of 10 values drawn at random from a Weibull distribution with beta = 0.5 and eta = 500. The values are:
56, 5, 2559, 1147, 486, 931, 1, 1166, 786, 2.
Let’s say these are hours of operation until failure for a set of 10 motors. We have complete data, no censoring, nice and simple.
The MTBF Value
Let’s calculate the MTBF of these items. You may argue we should calculate MTTF here, since we are not repairing the motor, yet the calculation is the same.
We are considering buying a new type of motor, and we would like to know whether the measured reliability (MTBF) is below the manufacturer’s claim of 500 hours MTBF. The motors are used for 168 hour (1 week) runs, and we’d like to maintain a relatively high reliability over 168 hours.
The classic way to calculate MTBF is to tally up the run times and divide by the number of failures. The run times sum to 7,139 hours, and with 10 failures we estimate MTBF as 713.9 hours. This is above the vendor’s claim of 500 hours, so the data appear to support the notion that these are good motors.
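The tally above is easy to check. This is a minimal Python sketch using the ten values listed earlier; the variable names are only illustrative.

```python
# Hours to failure for the 10 motors (complete data, no censoring).
hours_to_failure = [56, 5, 2559, 1147, 486, 931, 1, 1166, 786, 2]

total_hours = sum(hours_to_failure)   # total operating time
failures = len(hours_to_failure)      # every unit failed
mtbf = total_hours / failures         # classic point estimate

print(f"total = {total_hours} hours, failures = {failures}, MTBF = {mtbf:.1f} hours")
```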
The Weibull Based MTBF
A quick inspection of the data shows a cluster of early failures, then quite a bit of time between failures as the equipment gets older. There seems to be a decreasing hazard rate at play here, so the constant hazard rate assumption underlying MTBF may be suspect.
Let’s fit a Weibull distribution to the data. Firing up Weibull++ and using the default fitting for a 2-parameter Weibull distribution, we find beta = 0.39664 and eta = 454.137744. The fitted beta is below 1, which indicates a decreasing hazard rate over time.
Using the mean of the Weibull distribution, MTBF = eta × Γ(1 + 1/beta), with the fitted parameters, we determine MTBF is 1,545 hours. See the article Determine MTBF Given a Weibull Distribution for details on the calculation.
This looks like even more evidence that performance is well above the vendor’s claim of 500 hours MTBF. Let’s double the order of these fine machines.
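For readers without Weibull++, here is a rough Python sketch of one common fitting approach: rank regression on X with median ranks. It is an assumption that this matches the software's default settings, and Benard's approximation stands in for exact median ranks, so the result should land near the fitted values above without matching them exactly. The function name `fit_weibull_rrx` is made up for this illustration.

```python
import math

hours_to_failure = [56, 5, 2559, 1147, 486, 931, 1, 1166, 786, 2]

def fit_weibull_rrx(times):
    """Fit a 2-parameter Weibull by rank regression on X (assumed method)."""
    t = sorted(times)
    n = len(t)
    xs = [math.log(v) for v in t]
    # Benard's approximation to the median rank of the i-th ordered failure.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    ys = [math.log(-math.log(1.0 - f)) for f in ranks]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)

    # Linearized Weibull CDF: ln(t) = ln(eta) + (1/beta) * ln(-ln(1 - F)).
    # Regressing x on y gives slope = 1/beta and intercept = ln(eta).
    slope = sxy / syy
    intercept = mean_x - slope * mean_y
    return 1.0 / slope, math.exp(intercept)

beta, eta = fit_weibull_rrx(hours_to_failure)
print(f"beta = {beta:.3f}, eta = {eta:.1f}")
```

A maximum likelihood fit would give somewhat different parameters. The point here is only that beta comes out well below 1.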
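The mean of a 2-parameter Weibull distribution is eta × Γ(1 + 1/beta). This short Python sketch applies that formula to the fitted parameters quoted above; the helper name `weibull_mean` is illustrative.

```python
import math

def weibull_mean(beta, eta):
    """Mean of a 2-parameter Weibull: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)

beta = 0.39664        # fitted shape parameter from the article
eta = 454.137744      # fitted scale parameter from the article

weibull_mtbf = weibull_mean(beta, eta)
print(f"Weibull-based MTBF = {weibull_mtbf:.0f} hours")
```

With beta = 1 the gamma term equals 1 and the mean reduces to eta, which is the exponential case.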
Let’s Consider Reliability Instead
We run these motors for 168 hours at a time. So what is the probability a motor will survive 168 hours once installed?
Using the exponential distribution (MTBF estimate) we find the reliability from time 0 to 168 hours is 79%. Here we use the exponential reliability function, R(t) = exp(-t/θ), with θ = 713.9 hours.
A similar question: what is the chance of successful operation over 168 hours the 10th time we run the motor (from 1,512 to 1,680 hours of lifetime operation), assuming the motor has survived the first 9 runs? Not surprisingly, given the assumed constant hazard rate and the memoryless property of the exponential distribution, the expected reliability is again 79%.
Using the Weibull distribution we find the reliability from time 0 to 168 hours is 51%, much lower than the estimate based on the MTBF calculation. We could make a decision based on the 1,545 hours MTBF value or on the estimate of a roughly 50% survival rate over the first 168 hours. A 50% chance of survival is not high reliability, yet 1,545 hours seems rather high.
The 10th run reliability using the Weibull fit likewise assumes the motor has survived 9 runs, or 1,512 hours. The reliability over the 10th run is 93%, much higher than the MTBF based estimate.
The data suggest, first, that the exponential distribution does not describe these failures. Thus an MTBF calculated under the assumption of a constant hazard rate (the exponential distribution) provides a misleading result.
The extra step of estimating MTBF after fitting a Weibull distribution just makes the motors appear ‘better’ than the initial estimate. The more than 2x increase in MTBF (from 713.9 to 1,545 hours, about 3x the vendor’s claim) is due to the slope of the fitted distribution. It is the same data, yet accounting for the decreasing hazard rate results in a higher value for the MTBF. Keep in mind that the MTBF is the mean of the distribution, and a Weibull distribution with a beta of 0.5 is heavily right skewed. (Long tail to the right…)
The Weibull fit suggests that some of the motors would run for a very, very long time without failure, even though about half fail rather quickly.
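The two exponential calculations above can be sketched in Python as follows. The conditional reliability for the 10th run is R(1,680) / R(1,512), and under a constant hazard rate it collapses to R(168). The function name is illustrative.

```python
import math

def exp_reliability(t, theta):
    """Exponential reliability R(t) = exp(-t / theta)."""
    return math.exp(-t / theta)

theta = 713.9   # MTBF estimate from the data, in hours
run = 168       # length of one run, in hours

first_run = exp_reliability(run, theta)

# Probability of surviving the 10th run, given survival through 9 runs.
tenth_run = exp_reliability(10 * run, theta) / exp_reliability(9 * run, theta)

print(f"first run: {first_run:.1%}, tenth run given nine survived: {tenth_run:.1%}")
```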
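The same two questions can be put to the Weibull fit with its reliability function, R(t) = exp(-(t/eta)^beta). A short Python sketch, using the fitted parameters quoted earlier (function and variable names are illustrative):

```python
import math

def weibull_reliability(t, beta, eta):
    """Weibull reliability R(t) = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))

beta = 0.39664      # fitted shape parameter from the article
eta = 454.137744    # fitted scale parameter from the article
run = 168           # length of one run, in hours

first_run = weibull_reliability(run, beta, eta)

# Conditional reliability for the 10th run: R(1680) / R(1512).
tenth_run = (weibull_reliability(10 * run, beta, eta)
             / weibull_reliability(9 * run, beta, eta))

print(f"first run: {first_run:.1%}, tenth run given nine survived: {tenth_run:.1%}")
```

Because the hazard rate decreases with age, a motor that has already survived 1,512 hours is much more likely to get through its next week than a new one.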
The reliability estimate depends on the time frame of interest. For the exponential distribution fit the reliability over 168 hours is 79%, while over 1,680 hours (ten runs) it is 9.5%. For the Weibull distribution fit the reliability over 168 hours is 51% and over 1,680 hours is 18.6%.
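To see how the comparison shifts with the time frame, the sketch below evaluates both models at the two horizons mentioned. It relies on the fact that the exponential distribution is a Weibull with beta = 1 and eta equal to the MTBF, so one function serves both; the dictionary layout is just a convenience for this example.

```python
import math

def weibull_reliability(t, beta, eta):
    """Weibull reliability; beta = 1 gives the exponential case."""
    return math.exp(-((t / eta) ** beta))

models = {
    "exponential": (1.0, 713.9),          # beta = 1, eta = MTBF estimate
    "weibull":     (0.39664, 454.137744), # fitted parameters from the article
}

results = {}
for name, (beta, eta) in models.items():
    for hours in (168, 1680):
        results[(name, hours)] = weibull_reliability(hours, beta, eta)
        print(f"{name:12s} R({hours:>4}) = {results[(name, hours)]:.1%}")
```

The exponential fit looks better at one week and worse at ten weeks, so neither single number tells the whole story.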
Bottom line, using just MTBF we would buy more of the same motors and ‘enjoy’ the experience of about half the motors failing within their first week of use.
Do you have an example that shows just how badly MTBF can mislead you and decision makers? Send it over or add a comment below.