What is MTBF?
The acronym MTBF is commonly known in our field as Mean Time Between Failure.
It is also associated with repairable systems in most textbooks.
It is also denoted as the theta parameter for an exponential distribution.
It is referenced as a metric for reliability, too. Oh, and it is the inverse of the failure rate.
And it is misunderstood and misused by many. I digress, as plenty has already been written on the perils of MTBF.
What is MTBF? And where and how should it be used, if at all?
According to the old Mil Std 721C (1991):
MEAN-TIME-BETWEEN-FAILURE (MTBF): A basic measure of reliability for repairable items: The mean number of life units during which all parts of the item perform within their specified limits, during a particular measurement interval under stated conditions.
The conventional method to estimate MTBF is to tally up the hours of operation of a set of equipment and divide by the number of failures. It is not the only way to calculate MTBF, yet it does provide a reasonable unbiased estimate.
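As a minimal sketch of that tally, assuming made-up operating hours for three units and a made-up failure count (the function name and all numbers here are hypothetical, not from any real data set):

```python
def estimate_mtbf(operating_hours, failure_count):
    """Point estimate of MTBF: total operating hours divided by number of failures."""
    if failure_count <= 0:
        # With zero failures the simple ratio is undefined.
        raise ValueError("need at least one failure to form the estimate")
    return sum(operating_hours) / failure_count

# Hypothetical data: hours accumulated by each of three units, 4 failures in total.
hours = [1200.0, 950.0, 1850.0]
failures = 4

mtbf = estimate_mtbf(hours, failures)
print(mtbf)  # 1000.0
```

Note that the result is a single average. It says nothing about when in those 4,000 hours the failures occurred.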
We often assume a constant failure rate for work involving parts count predictions, system reliability estimates or calculations, or just to simplify the calculations. Often the assumption is unnecessary and may lead to erroneous results.
While MTBF is the single parameter for an exponential distribution, which implies a constant failure rate, nearly all distributions also have a mean, which can be estimated and denoted as MTBF.
When MTBF is the only reliability measure provided for an item, with no other information, we have to rely on the validity of the assumed exponential distribution. Yet MTBF is not restricted to this distribution, and that leads to misuse of the measure.
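To see what relying on the exponential assumption implies, here is a small sketch using the standard exponential reliability function R(t) = exp(-t / MTBF). The 1,000-hour MTBF is a hypothetical value chosen for illustration.

```python
import math

def exponential_reliability(t, mtbf):
    """Probability of surviving to time t under a constant failure rate of 1/mtbf."""
    return math.exp(-t / mtbf)

mtbf = 1000.0  # hypothetical MTBF in hours

# Under the exponential assumption, only about 37% of units survive to the MTBF.
print(round(exponential_reliability(mtbf, mtbf), 3))   # 0.368
print(round(exponential_reliability(100.0, mtbf), 3))  # 0.905
```

These numbers hold only if the failure rate really is constant. Nothing in the MTBF value itself tells us that it is.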
MTBF is a basic measure
It is not a good measure or a useful measure when the failure rate changes with equipment use or time. The measure masks that changing failure rate and implies a simple average describes enough information to make good decisions.
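As an illustrative sketch of that masking, the snippet below builds three Weibull distributions that share the same mean (a hypothetical 1,000 hours) but have different shape parameters. The shape values, the function names, and the choice of the Weibull distribution are assumptions made for this example only.

```python
import math

def weibull_scale_for_mean(mean, shape):
    """Scale parameter that gives a Weibull distribution the requested mean."""
    return mean / math.gamma(1.0 + 1.0 / shape)

def weibull_reliability(t, shape, scale):
    """Probability of surviving to time t."""
    return math.exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate at time t."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

mtbf = 1000.0  # same hypothetical mean for every case

# shape < 1: decreasing failure rate; shape = 1: constant (exponential);
# shape > 1: increasing failure rate (wear-out).
for shape in (0.5, 1.0, 3.0):
    scale = weibull_scale_for_mean(mtbf, shape)
    print(
        f"shape={shape}: "
        f"R(100 h)={weibull_reliability(100.0, shape, scale):.3f}, "
        f"hazard at 100 h={weibull_hazard(100.0, shape, scale):.6f}, "
        f"hazard at 1000 h={weibull_hazard(1000.0, shape, scale):.6f}"
    )
```

All three cases would be reported with the same MTBF, yet the chance of surviving the first 100 hours and the way the failure rate changes over time differ substantially.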
The next time you run across an MTBF value, ask which distribution it came from, and ask
- for the supporting data and evidence that the average is sufficient to fully describe the time-to-failure behavior.
- over what time period the MTBF is valid (this one often confuses those with little knowledge of reliability engineering, of how MTBF is calculated, or of what it means).
- about the failure mechanisms and what is expected to fail and when.
How do you define MTBF? This of course is a loaded question and you should be prepared to support your definition.