Let’s say we want to characterize the reliability performance of a vendor’s device. We’re considering including the device within our system if, and only if, it will survive 5 years reasonably well.
The vendor’s data sheet lists an MTBF value of 200,000 hours. A call to the vendor and a search of their site don’t reveal any additional reliability information. MTBF is all we have.
We don’t trust it. Which is wise.
Now we want to run an ALT to estimate a time to failure distribution for the device. The intent is to use an acceleration model to accelerate the testing and a time to failure model to adjust to our various expected use conditions.
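As a rough illustration of why the data sheet figure alone is not enough, here is a minimal sketch of what a 200,000 hour MTBF would imply over 5 years if, and only if, the hazard rate were constant (an exponential time to failure distribution). The constant hazard rate and the 24/7 operating profile are assumptions made for this sketch, not anything the vendor stated, and the function name is made up for illustration.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the device runs continuously, 24/7


def reliability_constant_hazard(mtbf_hours, duration_hours):
    """Probability of surviving duration_hours, assuming a constant hazard rate.

    Under that assumption R(t) = exp(-t / MTBF).
    """
    return math.exp(-duration_hours / mtbf_hours)


mtbf = 200_000                    # hours, from the data sheet
mission = 5 * HOURS_PER_YEAR      # 43,800 hours

r_five_years = reliability_constant_hazard(mtbf, mission)
print(f"R(5 years) under a constant hazard rate: {r_five_years:.3f}")
print(f"Expected fraction failed by 5 years:     {1 - r_five_years:.3f}")
```

Under these assumptions roughly one unit in five fails within 5 years, even though the MTBF is about 23 years of continuous operation. Whether the real device behaves this way is exactly what the MTBF value cannot tell us.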
And, I’m not thinking about the common language definition either.
Plus, I may have this all wrong. Here is the way I think about the reliability of something. More than ‘it should just work’ and different than ‘one can count on it to start’. When I ask someone how reliable a product is, this is what I mean.
Have you ever wondered why we use the assumption of a constant failure rate? Or considered why we assume our system is ‘in the flat part of the curve [bathtub curve]’?
Where did this silliness first arise?
In part, I lay blame on MIL-HDBK-217 and parts count prediction practices. Yet, there is theoretical support for the notion that for large, complex systems the overall system time to failure will approach an exponential distribution.
Thanks go to Wally Tubell Jr., a professor of systems engineering and test. He recently sent me his analysis of Drenick’s theorem and its connection to the notion of a flat section of a bathtub curve.
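The idea behind the theorem can be illustrated with a small simulation. This is a sketch under stated assumptions, not Wally’s analysis and not a proof: 200 independent components, each with a wear-out (Weibull, shape 3) life distribution, each replaced immediately on failure. The component count, mean life, shape, time horizon, and function name are all hypothetical choices made for the sketch. After a long burn-in, the times between system failures should look roughly exponential, which shows up as a coefficient of variation near 1.

```python
import math
import random
import statistics


def system_failure_times(n_components, mean_life, shape, horizon, rng):
    """Pool the failure times of independently renewed components.

    Each component fails after a Weibull-distributed life and is replaced
    at once. Returns the sorted failure times of the whole system.
    """
    scale = mean_life / math.gamma(1.0 + 1.0 / shape)
    events = []
    for _ in range(n_components):
        t = 0.0
        while True:
            t += rng.weibullvariate(scale, shape)  # (scale, shape) order
            if t > horizon:
                break
            events.append(t)
    events.sort()
    return events


rng = random.Random(42)  # fixed seed so the run is repeatable
events = system_failure_times(
    n_components=200, mean_life=1000.0, shape=3.0, horizon=60_000.0, rng=rng
)

# Discard the first half as burn-in: all components start new, so early
# failures cluster around one component life before the ages mix.
steady = [t for t in events if t >= 30_000.0]
gaps = [b - a for a, b in zip(steady, steady[1:])]

mean_gap = statistics.mean(gaps)
cv = statistics.stdev(gaps) / mean_gap  # about 1 for an exponential

print(f"mean time between system failures: {mean_gap:.2f} hours")
print(f"coefficient of variation of gaps:  {cv:.2f}")
```

Note what the sketch does and does not show. Every component here wears out, with a clearly increasing hazard rate, yet the pooled system failures look like a constant-rate process. The apparent flat section is a property of the mixed-age system, and it says nothing about the individual components.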
Everyone loves a great story. Storytelling has been a long tradition to pass along knowledge and wisdom.
There are good stories, tales of inspiration. There are sad stories, tales of caution.
There are fables, ghost stories, legends, epic poems, and more. When considering the reliability performance of your product or equipment, you probably have a few stories that you can tell. “That time … “
Simply join colleagues for lunch and ask about the ‘major disasters’ of the past. The stories help us to remember and hopefully avoid repeating mistakes.
Here are three stories with MTBF as a central figure: two sad stories and a good one. This is a site and blog that does talk about MTBF, so it fits. To start, let me introduce you to Martin, a new reliability engineer reporting for his first day of work at a bicycle design and manufacturing company. Enjoy.
A simplifying assumption associated with using MTTF or MTBF implies a constant hazard rate. Some assume we’re in the useful life section of the bathtub curve. Others do not understand what assumptions they are making.
Using MTTF or MTBF has many problems and, as regular readers here know, we should avoid using these metrics.
By using MTTF or MTBF we also lose information. We are unable to measure or track the rate of change of our equipment or system’s failure rates (hazard rate). The simple average is just an average and does not contain the essential information we need to make decisions.
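To make the lost information concrete, here is a minimal sketch comparing three Weibull distributions that share exactly the same mean life but have different shape parameters. The 1,000 hour mean, the three shape values, and the evaluation times are hypothetical numbers picked for illustration. It uses the standard Weibull hazard rate h(t) = (β/η)(t/η)^(β−1) and reliability R(t) = exp(−(t/η)^β).

```python
import math


def weibull_scale_for_mean(mean_life, shape):
    """Scale (eta) that gives a Weibull with this shape the requested mean."""
    return mean_life / math.gamma(1.0 + 1.0 / shape)


def weibull_hazard(t, shape, scale):
    """Instantaneous hazard rate at time t."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_reliability(t, shape, scale):
    """Probability of surviving to time t."""
    return math.exp(-((t / scale) ** shape))


MEAN_LIFE = 1000.0  # hours; the same "MTBF" for all three cases

results = {}
for shape in (0.5, 1.0, 3.0):  # early-life, constant, and wear-out behavior
    scale = weibull_scale_for_mean(MEAN_LIFE, shape)
    results[shape] = {
        "hazard_early": weibull_hazard(100.0, shape, scale),
        "hazard_late": weibull_hazard(900.0, shape, scale),
        "reliability_at_500": weibull_reliability(500.0, shape, scale),
    }
    row = results[shape]
    print(
        f"shape={shape}: h(100)={row['hazard_early']:.5f}, "
        f"h(900)={row['hazard_late']:.5f}, "
        f"R(500)={row['reliability_at_500']:.3f}"
    )
```

All three cases would report the same average, yet one has a falling hazard rate, one a flat hazard rate, and one a rising hazard rate, and the chance of surviving to 500 hours differs widely between them. The average by itself cannot distinguish them.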
This was a follow-up question in a recent discussion with Alaa concerning using a metric other than MTBF.
The term ‘Weibull’ in some ways has become a synonym for reliability. Weibull analysis = life data (or reliability) analysis. The Weibull distribution has the capability to describe a changing failure rate, which is lacking when using just MTBF. Yet, is it suitable to use ‘Weibull’ as a metric?
Here’s a common problem. You have been tasked to peer into the future to predict when the next failure will occur.
Predictions are tough.
One way to approach this problem is to do a little analysis of the history of failures of the component or system. The problem looms larger when you have only two observed failures from the population of systems in question.
While you can fit a straight line to two failures and account for all the systems that operated without failure, it is not very satisfactory. It is at best a crude estimate.
Let’s not consider calculating MTBF. That would not provide useful information, as regular readers already know. So what can you do given just two failures to create a meaningful estimate of future failures? Let’s explore a couple of options.
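For reference, the crude straight-line fit mentioned above can be sketched as follows. This is one common way to do it, assumed here for illustration: adjust the failure ranks for the surviving units (Johnson’s method), convert the ranks to plotting positions with Benard’s median rank approximation, and pass a line through the two points on Weibull axes. The fleet data are entirely hypothetical: 10 units, failures at 400 and 900 hours, one unit removed unfailed at 500 hours, and seven still running at 1,000 hours.

```python
import math


def adjusted_ranks(records):
    """Johnson adjusted ranks for the failures in a censored data set.

    records: list of (time, is_failure). Suspensions (is_failure=False)
    do not get a rank, but they change the ranks of later failures.
    """
    n = len(records)
    ordered = sorted(records, key=lambda r: r[0])
    rank = 0.0
    ranked = []
    for i, (t, failed) in enumerate(ordered):
        if not failed:
            continue
        still_at_risk = n - i  # units at or beyond this time, this one included
        rank += (n + 1 - rank) / (1 + still_at_risk)
        ranked.append((t, rank))
    return ranked


def benard_median_rank(rank, n):
    """Benard's approximation to the median rank plotting position."""
    return (rank - 0.3) / (n + 0.4)


def weibull_line_through_two(points):
    """Shape (beta) and scale (eta) of the line through two (time, F) points."""
    (t1, f1), (t2, f2) = points
    x1, x2 = math.log(t1), math.log(t2)
    y1 = math.log(-math.log(1.0 - f1))
    y2 = math.log(-math.log(1.0 - f2))
    beta = (y2 - y1) / (x2 - x1)
    eta = math.exp(x1 - y1 / beta)
    return beta, eta


# Hypothetical fleet: (hours, True if failed / False if still working)
fleet = [(400.0, True), (500.0, False), (900.0, True)] + [(1000.0, False)] * 7

ranks = adjusted_ranks(fleet)
points = [(t, benard_median_rank(r, len(fleet))) for t, r in ranks]
beta, eta = weibull_line_through_two(points)

print(f"adjusted ranks: {[(t, round(r, 3)) for t, r in ranks]}")
print(f"beta = {beta:.2f}, eta = {eta:.0f} hours")
```

With only two points the line fits perfectly by construction, so the result carries no measure of its own uncertainty. Treat the beta and eta values as a starting point for discussion, not as a prediction.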
The Importance of the Discussions around MTBF Questions
The best way to help others understand and stop using MTBF is to engage them in a discussion. I get questions concerning MTBF or reliability a few times a week. I attempt to answer each and every one, plus adding a follow up question or two.
Stories communicate well. We have been telling stories long before the invention of writing, or the internet. The MTBF stories we tell communicate our ideas, suggestions, and recommendations.
There are differences between good and poor stories. How you tell a story matters as well as the subject of the story. Now, MTBF stories may not be the most thrilling or entertaining, yet there are stories on MTBF topics that matter.
Is Your Organization Compromising Reliability Performance Due to a Reliability Conflict of Interest?
Kirk Gray wrote the article titled Exposing a Reliability Conflict of Interest on Accendo Reliability. He talked about a recent article discussing the maintenance costs for the F-35 fighter jet program and how the companies designing the system make a significant profit selling spare parts or maintenance services.
Getting on an airplane, we think about the very low probability of failure during the flight duration. This is how we think about reliability.
When buying a car, we think about whether the vehicle will leave us stranded along a deserted stretch of highway. When buying light bulbs for the hard-to-reach fixtures, we consider paying a bit more to avoid having to drag out the ladder as often.
When we consider reliability as a customer does, we think about the possibility of failure over some duration.