Questions about MTTF
Over the past week I’ve seen or received a couple of questions about MTTF. One was on how to use failure data to calculate MTTF, another on how to estimate Weibull parameters after assuming a constant rate of failure.
It is good to see such questions, as it means the person is curious enough to take the time to ask.
How to calculate MTTF from field data is pretty straightforward. Tally up the time each part operated before failure, add any time accumulated by units that are still operating, then divide by the number of failures. Say I had to replace a part four times, each after it had operated for 25 hours, and there is one still operating at 20 hours. The total hours these parts operated is 25 + 25 + 25 + 25 + 20 = 120 hours. Divide 120 by 4 failures to find the MTTF of 30 hours.
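The arithmetic above can be sketched in a few lines of Python. The function name `mttf` and its parameter names are my own choices for illustration, not part of any library.

```python
def mttf(failure_hours, running_hours=()):
    """Total operating time of all units divided by the number of failures."""
    failures = list(failure_hours)
    if not failures:
        raise ValueError("at least one failure is needed to estimate MTTF")
    # Units still running add time to the numerator only.
    total_time = sum(failures) + sum(running_hours)
    return total_time / len(failures)

# Four replaced parts failed at 25 hours each; one unit is still running at 20 hours.
example_mttf = mttf([25, 25, 25, 25], running_hours=[20])
print(example_mttf)  # 30.0
```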
Simple, which is why MTTF and MTBF are still in widespread use.
We should ask a couple of questions and check on a few assumptions when calculating MTTF. First, is the assumption of constant failure rate justified? Second, what is it we’re trying to learn from the data?
If the only thing we do is calculate MTTF and use or report only the resulting value, we are assuming the exponential distribution and a constant failure rate. That may or may not be appropriate. Plot the data along with the fitted distribution. Does the distribution fit the data? Do we know something about the failure mechanism behind the failures? Were they truly random events, or the result of a wear-out mechanism?
At least ask, and when possible, check.
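As a numerical stand-in for the plot, here is a minimal sketch that compares a nonparametric (Kaplan-Meier) survival estimate with the survival curve implied by the exponential assumption, using the five units from the example above. The function names are hypothetical, and the comparison is only a rough check, not a formal goodness-of-fit test.

```python
import math

def kaplan_meier(times, failed):
    """Return [(t, S(t))] at each distinct failure time for right-censored data."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, f in zip(times, failed) if f}):
        at_risk = sum(1 for u in times if u >= t)  # units still under observation at t
        deaths = sum(1 for u, f in zip(times, failed) if f and u == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

def exponential_survival(t, mttf):
    """Survival probability at time t under a constant failure rate of 1/mttf."""
    return math.exp(-t / mttf)

times = [25, 25, 25, 25, 20]
failed = [True, True, True, True, False]  # the 20-hour unit is still running

for t, s_km in kaplan_meier(times, failed):
    print(t, round(s_km, 3), round(exponential_survival(t, 30.0), 3))
# prints: 25 0.0 0.435
```

At 25 hours every unit still under observation has failed, yet the exponential model with an MTTF of 30 hours expects roughly 43% to survive. A gap that large suggests the constant failure rate assumption deserves a second look for this data.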
Note the result in the simple example is 30 hours, not the average of the five values, which is 24 hours. We divide by the number of failures, and the units that haven’t failed add time to the numerator only. We do want to take into account the items that haven’t failed yet (right censored), yet doing so moves the result away from what many consider to be an average. This is one of the sources of confusion around such metrics.
How to estimate the Weibull parameters
The person asking had first assumed a constant failure rate and calculated MTTF, then wanted to know how to estimate the Weibull parameters, beta and eta. I’m not sure why, though I’m glad to see they may have found using only MTTF wanting, thus the interest in Weibull. Whatever the reason, if you start by making an assumption, that bounds what you can do with the analysis unless you go back to the raw data. By not assuming a constant rate of failure, and so avoiding the constraint of the single-parameter exponential distribution, we may be able to estimate the Weibull parameters.
The question did not include any information about the components in question or the failure mechanisms observed. If we knew the parts were fans, and the failures were bearing wear-out, we could use that knowledge to estimate the beta parameter. Then we could use the MTTF value as a very crude estimate for eta.
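One way to make that crude estimate slightly less crude is to use the relationship between the Weibull mean and its parameters, mean = eta × Γ(1 + 1/beta). The sketch below is my own illustration under two loud assumptions: that the MTTF value is a fair stand-in for the Weibull mean (it was calculated under an exponential assumption, so it may not be), and that a beta value is available from knowledge of the failure mechanism. The beta of 2.0 used here is purely illustrative, not a recommended value for fan bearings.

```python
import math

def eta_from_mean(mean_life, beta):
    """Weibull scale eta implied by a mean life and an assumed shape beta."""
    if mean_life <= 0 or beta <= 0:
        raise ValueError("mean_life and beta must be positive")
    # Weibull mean = eta * Gamma(1 + 1/beta), solved here for eta.
    return mean_life / math.gamma(1.0 + 1.0 / beta)

assumed_beta = 2.0  # illustrative assumption only
print(eta_from_mean(30.0, assumed_beta))

# With beta = 1 the Weibull reduces to the exponential and eta equals the mean.
print(eta_from_mean(30.0, 1.0))  # 30.0
```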
We have to know the failure mechanism though.
Of course, if we have sufficient data, approximately 20 units with at least 5 having failed, we could estimate the Weibull parameters directly. More units and more failures would help improve the estimates, yet I’ve found that we can find a reasonable fit with as few as five failures.
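For readers who want to try this on raw data, here is a minimal sketch of a maximum likelihood fit for the two-parameter Weibull with right-censored units, using only the Python standard library. Maximum likelihood is one common approach and is my choice for the illustration. The function names are hypothetical and the data set is made up (20 units, 6 failures, 14 still running at 500 hours); it is not from the original question.

```python
import math

def weibull_score(beta, times, failed):
    """Profile-likelihood equation for beta; it equals zero at the MLE."""
    t_max = max(times)
    scaled = [t / t_max for t in times]  # rescaling avoids overflow; the root is unchanged
    r = sum(failed)
    s0 = sum(s ** beta for s in scaled)
    s1 = sum(s ** beta * math.log(s) for s in scaled)
    mean_log_fail = sum(math.log(s) for s, f in zip(scaled, failed) if f) / r
    return s1 / s0 - 1.0 / beta - mean_log_fail

def fit_weibull(times, failed, lo=0.05, hi=50.0, tol=1e-9):
    """Return (beta, eta) by bisection on the score equation."""
    r = sum(failed)
    if r < 2:
        raise ValueError("need at least two failures")
    if weibull_score(lo, times, failed) > 0 or weibull_score(hi, times, failed) < 0:
        raise ValueError("no solution for beta inside the search bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_score(mid, times, failed) < 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    t_max = max(times)
    eta = t_max * (sum((t / t_max) ** beta for t in times) / r) ** (1.0 / beta)
    return beta, eta

# Made-up data: six failures, fourteen units still running at 500 hours.
times = [120, 210, 260, 330, 390, 460] + [500] * 14
failed = [True] * 6 + [False] * 14

beta, eta = fit_weibull(times, failed)
print(round(beta, 2), round(eta, 1))
```

Note that the earlier example of four failures at exactly 25 hours has no spread in the failure times, so this fit raises an error rather than returning a beta. With so few failures the estimates carry wide uncertainty, and confidence bounds (not shown here) are worth calculating before relying on the numbers.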
Other questions to ask about MTTF
My first question when asked to calculate MTTF is why? What is the information needed for, or what decisions are being made based on the values? Then I want to know more about the failures, the failure mechanisms, and the circumstances around the unit during operation and failure. The MTTF value by itself is just a number. Out of context, it provides very little information.
When you feel the desire or have been asked to calculate MTTF, start asking questions. The same applies if someone wants a Weibull plot, or Mean Cumulative Plot, too. Data analysis is not just arithmetic. It is analysis, and it requires a reasoned interpretation and conclusion.