MTBF is a common reliability metric. It is totally useless in most applications. So, why do we use it? And what can we do as reliability professionals to lead our industries away from using MTBF?
A recent set of forum discussions raised the idea that we use MTBF because our customers require it. Another writer suggested that MTBF is useful because it has been in use for so long, and therefore it must be useful. Another writer advertised their offer to determine MTBF for you for a small charge.
MTBF is common on product and component datasheets, in specifications, and in the ASQ Certified Reliability Engineer body of knowledge (CRE BoK). At one time I argued for removing any mention of MTBF from the CRE BoK, and failed. Most at the meeting agreed that MTBF is so common that it must remain part of the document, even though they also agreed that it is commonly misunderstood, that it is less than useful in practice, and that ‘good’ reliability professionals would use better metrics.
Do we use MTBF simply because it is there? Is it so simple and widely requested that we comply, tallying total hours and dividing by the number of failures, without regard to the assumptions behind that single number and the mistakes it causes? Are we so naïve as to believe a single number reasonably describes the failure rate over the life of the product or component? No, I suspect visitors of this site reading this blog are enlightened and know better.
As reliability professionals, we have an obligation to our profession, career and organization to use our knowledge to provide value. A great way to do that is to educate those around us and apply the appropriate tools. In organizations using MTBF, you have a unique opportunity to contribute significant value.
Plot your data, and take a look at it. Does the exponential distribution fit? If the data does not have a constant failure rate or has a known wear-out mechanism I would suspect you’d soon see the divergence of the MTBF value and reality. This is basic statistics; just plot the data and the fit. Show the discrepancy to your team. Determine if the decisions based on the data would change using a better description of the data. I would suspect in many cases decisions will change, as has been my experience.
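As a rough illustration of that check, here is a minimal sketch in plain Python. The ten failure times are hypothetical, invented only to mimic a wear-out pattern, and the Weibull fit uses median rank regression with Benard's approximation, which is one common choice rather than the only one. It assumes every unit ran to failure (no censoring).

```python
import math

# Hypothetical failure times in hours (illustrative only, not real data).
failure_times = [4100, 5200, 5900, 6400, 6900, 7300, 7800, 8400, 9100, 10200]


def median_ranks(n):
    """Benard's approximation to the median rank of each ordered failure."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_fit(times):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression."""
    t = sorted(times)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in median_ranks(len(t))]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    beta = sxy / sxx                          # slope of the probability plot
    eta = math.exp(mean_x - mean_y / beta)    # from intercept = -beta * ln(eta)
    return beta, eta


# The usual MTBF tally: total hours divided by number of failures.
mtbf = sum(failure_times) / len(failure_times)
beta, eta = weibull_fit(failure_times)

# Fraction predicted to have failed by 4,000 hours under each description.
t_check = 4000.0
exp_fraction = 1.0 - math.exp(-t_check / mtbf)
weibull_fraction = 1.0 - math.exp(-((t_check / eta) ** beta))
observed_fraction = sum(t <= t_check for t in failure_times) / len(failure_times)

print(f"MTBF = {mtbf:.0f} h, Weibull beta = {beta:.2f}, eta = {eta:.0f} h")
print(f"Failed by {t_check:.0f} h: exponential {exp_fraction:.1%}, "
      f"Weibull {weibull_fraction:.1%}, observed {observed_fraction:.1%}")
```

A shape parameter well above 1 signals an increasing failure rate, and the gap between the exponential prediction and the observed early failures is the discrepancy worth showing to your team.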
Ask your customers, clients, vendors and managers what they really mean when they talk about MTBF. If you see “50,000 hours MTBF” that should immediately trigger a few questions:
- What do you really want? 50k hours of failure-free operation, or about two-thirds of units failed by 50k hours?
- Over what duration does this apply? Is this a two or five year product? MTBF, as you know, is only the inverse of the failure rate; it says nothing about duration.
- When do you expect the first failures to occur?
- What failure rate is acceptable during initial use?
- What assumptions are you making related to this metric? Where is the data to support those assumptions?
- Would you rather specify reliability clearly? Let me show you how.
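To put numbers behind the first two questions, here is a small sketch of what “50,000 hours MTBF” implies if you take the constant failure rate assumption at face value, so that reliability is R(t) = exp(-t / MTBF). The continuous 24/7 duty cycle and the two and five year durations are assumptions for illustration, and the function names are my own.

```python
import math

MTBF_HOURS = 50_000.0    # the datasheet figure under discussion
HOURS_PER_YEAR = 8_760   # assumes continuous, 24/7 operation


def reliability(t_hours, mtbf=MTBF_HOURS):
    """Probability of surviving to t_hours under a constant failure rate."""
    return math.exp(-t_hours / mtbf)


def time_to_fraction_failed(fraction, mtbf=MTBF_HOURS):
    """Hours until the given fraction of units is expected to have failed."""
    return -mtbf * math.log(1.0 - fraction)


for label, hours in [("2 years", 2 * HOURS_PER_YEAR),
                     ("5 years", 5 * HOURS_PER_YEAR),
                     ("one MTBF", MTBF_HOURS)]:
    r = reliability(hours)
    print(f"{label:>9}: {r:.1%} surviving, {1 - r:.1%} failed")

print(f"First 1% failed by about {time_to_fraction_failed(0.01):.0f} hours")
```

Under these assumptions only about 37% of units survive to the MTBF value itself, which is rarely what the person writing the requirement had in mind. Stating a probability of survival over a defined duration, for example 95% reliability over two years, removes that ambiguity.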
Let’s take charge. Let’s not accept poorly conceived requirements. Let’s use the tools and our knowledge to enable successful and reliable products.