“Why do you avoid MTBF?”
I got this question the other day. The person knew about the NoMTBF campaign. They didn’t quite understand why it was a big deal, especially for me, to avoid MTBF.
The tiff between MTBF and me is not personal. The metric has not been a part of my work or caused any significant problems for me personally.
It has, though, caused problems that spoil my enjoyment of products and systems. It has led to poor decisions by many organizations that create items you and I use on a regular basis.
We can do better than to settle for MTBF in our own work or in the work of those around us. Here are 10 reasons I recommend you avoid using MTBF.
1 – Failure Free Period Confusion
More often than I would like to count, someone has expressed the understanding that MTBF represents a period of time during which very few, if any, failures occur. I don’t know of any set of numbers that would represent the idea of no failures occurring up to the average time to failure.
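To put a number on this, here is a minimal sketch that assumes the exponential distribution usually paired with MTBF. The 50,000-hour MTBF is a made-up value for illustration only; the result does not depend on it.

```python
import math

def exponential_reliability(t, mtbf):
    """Probability of surviving to time t under a constant failure rate of 1/mtbf."""
    return math.exp(-t / mtbf)

mtbf_hours = 50_000.0  # hypothetical MTBF, chosen only for illustration

# Fraction of units expected to still be working when the clock reaches the MTBF.
survival_at_mtbf = exponential_reliability(mtbf_hours, mtbf_hours)
print(f"R(MTBF) = {survival_at_mtbf:.3f}")  # prints R(MTBF) = 0.368
```

Under that assumption only about 37% of units survive to the MTBF, so roughly 63% have already failed. That is hardly a failure free period.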
2 – 50th Percentile Confusion
This confusion is very common, due in large part to the many statistics courses that rely on the normal distribution. Life data is rarely normally distributed, although it can happen. Time-to-failure data is often skewed, so while the mean is defined as the center of mass, it is often not at the 50th percentile.
I guess in the grand scheme of things this isn’t too bad, yet it does rely on the mean alone, which, besides being offset from where we expect it, doesn’t represent the changing nature of the failure rate as the item ages. Often this is mixed up with the notion of a constant hazard rate, which is rarely true.
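A small sketch of the gap between the mean and the 50th percentile, using the standard two-parameter Weibull formulas. The shape and scale values are assumptions picked for illustration, not data from any product.

```python
import math

def weibull_mean(shape, scale):
    """Mean time to failure of a two-parameter Weibull distribution."""
    return scale * math.gamma(1.0 + 1.0 / shape)

def weibull_median(shape, scale):
    """Time by which half of the population has failed."""
    return scale * math.log(2.0) ** (1.0 / shape)

def weibull_cdf(t, shape, scale):
    """Fraction of the population failed by time t."""
    return 1.0 - math.exp(-((t / scale) ** shape))

scale = 1000.0  # hypothetical characteristic life in hours
for shape in (0.5, 1.0, 3.0):  # hypothetical shape parameters
    mean = weibull_mean(shape, scale)
    median = weibull_median(shape, scale)
    failed_by_mean = weibull_cdf(mean, shape, scale)
    print(f"shape={shape}: mean={mean:.0f} h, median={median:.0f} h, "
          f"failed by the mean={failed_by_mean:.1%}")
```

For a shape of 1, which is the exponential case, about 63% of units have failed by the mean. The mean lands on the 50th percentile only when the distribution happens to be symmetric.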
3 – Complete Description of Reliability Performance Confusion
MTBF is one bit of information. It is the inverse of a failure rate over an unspecified period of time and set of conditions. At best, MTBF provides inadequate information about the probability of success. It does not include a duration, environment, or function, all of which are necessary to round out our understanding of reliability performance.
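To illustrate how little a lone MTBF value says, the sketch below builds three Weibull models that share the same MTBF and compares their reliability at a stated duration. The 10,000-hour MTBF, the 1,000-hour duration, and the shape values are all hypothetical.

```python
import math

def scale_for_mtbf(mtbf, shape):
    """Weibull scale parameter that yields the requested mean (MTBF)."""
    return mtbf / math.gamma(1.0 + 1.0 / shape)

def weibull_reliability(t, shape, scale):
    """Probability of surviving to time t."""
    return math.exp(-((t / scale) ** shape))

def reliability_at(t, mtbf, shape):
    """Reliability at time t for a Weibull model with the given MTBF and shape."""
    return weibull_reliability(t, shape, scale_for_mtbf(mtbf, shape))

mtbf_hours = 10_000.0     # hypothetical: identical for every model below
duration_hours = 1_000.0  # hypothetical stated duration

for shape in (0.5, 1.0, 3.0):  # hypothetical shape parameters
    r = reliability_at(duration_hours, mtbf_hours, shape)
    print(f"shape={shape}: R({duration_hours:.0f} h) = {r:.4f}")
```

The same MTBF yields noticeably different probabilities of success over the same duration, which is why the single number falls short of describing reliability performance.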
4 – Useful Life of Bathtub Confusion
This often comes from a simplifying assumption that we have eliminated all early life failure mechanisms and that the onset of wear out is long after anyone cares. This doesn’t happen. Every product has thousands of competing failure mechanisms striving to end the item’s life. Each mechanism and cause has either a decreasing or an increasing chance over time of leading to failure. Very, very few are actually random causes, thus assuming you are in the ‘flat part of the curve’ is little more than wishful thinking built on ignorance.
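As a rough sketch of decreasing versus increasing chance of failure, the Weibull hazard function shows all three regions of the bathtub curve depending on its shape parameter. The parameter values and evaluation times are assumptions for illustration.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate at time t (t > 0) for a Weibull distribution."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

scale = 1000.0                   # hypothetical characteristic life in hours
times = (100.0, 1000.0, 5000.0)  # hypothetical ages at which to evaluate the hazard

for shape, label in ((0.5, "decreasing (early life)"),
                     (1.0, "constant (the flat part)"),
                     (3.0, "increasing (wear out)")):
    rates = ", ".join(f"{weibull_hazard(t, shape, scale):.6f}" for t in times)
    print(f"shape={shape} {label}: {rates}")
```

Only a shape of exactly 1 gives a flat hazard rate. Any mechanism with a shape even slightly above or below 1 has a failure rate that changes with age, and a single MTBF value cannot show that.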
5 – The Math is Easier
Granted, the math involving MTBF and the assumed exponential distribution is easier than working with Weibull, lognormal, gamma, or other distributions. It is easier than working out or plotting non-parametric approaches too. It is an average, and we learned how to calculate averages early in our education.
Easy comes at the price of being oversimplified to the point of being misleading, inaccurate, and just plain worthless for today’s design and business decision making.
6 – Everyone is Using It
Your competition is using MTBF and making poor decisions concerning reliability. This is your chance to make better decisions and improve your product’s reliability performance in remarkable ways. This is your chance to describe the expected and actual reliability performance of your system to your customers in a useful manner.
7 – Books Suggest Using MTBF
This is all too often true. I’ve heard from a few authors that part of why they use MTBF and the exponential distribution is to keep the math easy when explaining reliability concepts, and to facilitate learning about distributions, modeling, regression, and data analysis. Or they state their readers expect to see MTBF in the book. Writing about MTBF is easier, too.
This really blends the various problems from across this list and is a disservice to you, the reader.
8 – Customers Ask for MTBF
This they do. Always ask what they really want to know about the product in question. Do they want to know how many spares to have on hand? MTBF is a very poor method for estimating spare requirements. Do they want to know the probability of failure over a duration, mission time, etc.? MTBF is a remarkably poor approach to answering those questions.
Give customers who ask for MTBF the information they need to make better decisions. Help them understand the reliability performance, not hide it.
9 – Software Defaults to MTBF
This is especially true of parts count prediction software packages. It is also true of life data analysis packages. In discussions, software package developers cite providing what the customer wants. Apparently, market forces impede providing a useful solution.
10 – MTBF is Good Enough
If you are happy with your warranty expenses, level of customer complaints, declining market share, etc. then I guess MTBF is good enough. Using MTBF is like wearing blinders to avoid seeing and understanding your product’s reliability performance. If you want to improve your decision making and improve the reliability performance, MTBF isn’t good enough.
What would you add to the list? These are my ten reasons to avoid MTBF. Hopefully they help you explain to others why we need to move to using reliability metrics that are useful, such as probability of success over a stated time period.
Add your comment on which of the reasons helps you explain the perils of MTBF to your peers.