Where MTBF Falls Short
Guest post by Chris Peterson – see her daily blog Test To Be Your Best
I have a brand new widget and I’m very excited about the design. It has features I’ve never built in before, there is a huge market need for it, and now I have to try to prove its reliability before I can sell it. What do I use?
Let’s try MTBF. It has been around for over 60 years and is a very common approach.
Here is where the possible issues start. MTBF is the calculated average time it will take for a system to fail. An average folds in everything from the highest value to the lowest. If the average is 5 on a scale of 1 to 10, that 5 alone gives you no clue whether most of the population sat close to it or was scattered from 1 to 10. An average doesn’t show variability, and that can cause problems.
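To make the point about averages concrete, here is a minimal sketch in Python. The failure times are made-up numbers chosen only for illustration, and the helper names `summarize` and `fraction_failed_before` are hypothetical, not part of any reliability standard.

```python
from statistics import mean, pstdev

# Hypothetical failure times in years for two widget populations.
# These numbers are invented for illustration; they are not test data.
tight = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
scattered = [1, 1, 2, 2, 3, 7, 8, 8, 9, 9]


def summarize(times):
    """Return the average alongside the figures the average hides."""
    return {
        "mean": mean(times),
        "min": min(times),
        "max": max(times),
        "spread": pstdev(times),  # population standard deviation
    }


def fraction_failed_before(times, t):
    """Share of units whose failure time is earlier than t."""
    return sum(1 for x in times if x < t) / len(times)


for name, times in (("tight", tight), ("scattered", scattered)):
    print(name, summarize(times), fraction_failed_before(times, 3))
```

Both lists have the same average of 5, yet in the scattered one four units out of ten have failed before year 3, while in the tight one none have. The single number cannot tell the two apart.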
Another issue is that the end-use environment is also based on averages, yet the world is full of extremes. Say that my widget is going inside a laptop computer and I decide to use the metric for typical office temperatures. That’s all well and good, but that computer will see much more extreme conditions than that. It might sit on a car seat with the windows rolled up under the blazing desert sun, or be left in a trunk in the dead of winter. People are impatient and want their electronics to work under all conditions, and they aren’t going to wait until the laptop reaches typical office temperature before they use it. What will the reliability be then? MTBF can’t give us a clue, because even if extremes were considered (which would be rare) they would go into an average (environmental conditions), which is then averaged again (the assumption of mean time between failures).
Think now of the bathtub curve with infant mortality. Does the MTBF assume that the items have been screened to catch those that would fall into that part of the curve? If that screening has been skipped, many items could fail well short of the average that the MTBF is based on.
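A rough sketch of why skipped screening matters, under loudly stated assumptions: each subpopulation is modeled with a constant failure rate (exponential lifetimes), 10% of unscreened units are weak, and the mean lives of 10 years and 0.1 years are invented for illustration. The function names are hypothetical.

```python
import math


def mixture_mtbf(groups):
    """Average life of a mixed population.

    groups is a list of (fraction, mean_life_years) pairs whose
    fractions sum to 1.
    """
    return sum(frac * mean_life for frac, mean_life in groups)


def fraction_failed_by(groups, t):
    """Share of the population failed by time t, assuming each group
    has exponentially distributed lifetimes (an assumption)."""
    return sum(
        frac * (1 - math.exp(-t / mean_life)) for frac, mean_life in groups
    )


# Hypothetical populations: all numbers are illustrative assumptions.
screened = [(1.0, 10.0)]                 # weak units removed by screening
unscreened = [(0.9, 10.0), (0.1, 0.1)]   # 10% weak units left in

for name, groups in (("screened", screened), ("unscreened", unscreened)):
    print(name, mixture_mtbf(groups), fraction_failed_by(groups, 1.0))
```

With these assumed numbers the average barely moves, from 10 years to about 9, while the share of units failing in the first year goes from roughly 10% to roughly 19%. The average looks nearly as healthy either way; the early failures do not.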
MTBF is a probability, not a guarantee. It is based on assumptions. On top of that there is a confidence level, which may be subjective and is itself an assumption. Then there is the use of an average instead of a range, which leads to yet another assumption. So we have an assumption (which could be flawed) based on an assumption (which could be subjective) based on an assumption of probability, which is an educated guess that we call engineering judgment.
Do you really want to gamble on a good guess?