At first MTBF seemed like a commonly used and useful measure of reliability. Trained as a statistician and understanding the use of the expected value that MTBF represented, I thought, ‘cool, this is useful’.
Then the discussions with engineers, technical sales folks and other professionals about reliability using MTBF started. With them came the awareness that not everyone, and at times it seems very few, truly understood MTBF and how to properly use the measure.
Continue reading First Impressions
3 Ways to Expose MTBF Problems
MTBF use and thinking is still rampant. It affects how our peers and colleagues approach solving problems.
There is a full range of problems that come from using MTBF, yet how do you spot the signs of MTBF thinking even when MTBF is not mentioned? Let’s explore three approaches that you can use to ferret out MTBF thinking and move your organization toward making informed decisions concerning reliability. Continue reading 3 Ways to Expose MTBF Problems
The Army Memo to Stop Using Mil HDBK 217
Over 20 years ago the Assistant Secretary of the Army directed the Army to not use MIL HDBK 217 in a request for proposals, even for guidance. Exceptions, by waiver only.
217 is still around and routinely called out. That is a lot of waivers.
Why are 217 and other parts count database prediction packages still in use? Let’s explore the memo a bit more, plus ponder what is maintaining the popularity of 217 and its ilk. Continue reading The Army Memo to Stop Using Mil HDBK 217
Why do we use ReliaSoft instead of JMP to Identify the Time to Failure?
This is a question someone posted to Quora and the system prompted me to answer it, which I did.
This question is part of the general question about which software tools to use for specific situations. First, my response to the question. Continue reading Why do we use Weibull++ over JMP?
Futility of Using MTBF to Design an ALT
Let’s say we want to characterize the reliability performance of a vendor’s device. We’re considering including the device within our system if, and only if, it will survive 5 years reasonably well.
The vendor’s data sheet lists an MTBF value of 200,000 hours. A call to the vendor and search of their site doesn’t reveal any additional reliability information. MTBF is all we have.
We don’t trust it. Which is wise.
Now we want to run an ALT to estimate a time to failure distribution for the device. The intent is to use an acceleration model to accelerate the testing and a time to failure model to adjust to our various expected use conditions.
Given the device, a small interface module with a few buttons, electronics, a display and enclosure, and the data sheet with MTBF, how can we design a meaningful ALT? Continue reading Futility of Using MTBF to Design an ALT
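To see how little the data sheet value says on its own, here is a minimal sketch of the only calculation an MTBF supports. It assumes a constant hazard rate (the exponential distribution) and continuous operation at 8,760 hours per year. Neither assumption comes from the vendor, and the function name is just for illustration.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the device runs around the clock


def exponential_reliability(t_hours, mtbf_hours):
    """R(t) = exp(-t / MTBF), valid only if the hazard rate is constant."""
    return math.exp(-t_hours / mtbf_hours)


mtbf = 200_000                    # hours, from the data sheet
mission = 5 * HOURS_PER_YEAR      # 43,800 hours

r_mission = exponential_reliability(mission, mtbf)
r_at_mtbf = exponential_reliability(mtbf, mtbf)

print(f"R(5 years) = {r_mission:.3f}")
print(f"R(MTBF)    = {r_at_mtbf:.3f}")
```

Under these assumptions roughly 80% of units survive 5 years, and only about 37% survive to the MTBF itself. The number says nothing about which failure mechanisms dominate or how they respond to stress, which is exactly what an ALT design needs.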
Two Ways to Think and Talk about Reliability
Neither includes using MTBF, btw.
And, I’m not thinking about the common language definition either.
Plus, I may have this all wrong. Here is the way I think about the reliability of something. More than ‘it should just work’ and different than ‘one can count on it to start’. When I ask someone how reliable a product is, this is what I mean.
By explaining my basic understanding we can compare notes. It is possible, quite possible, that I will learn something. As you may as well. Let’s see. Continue reading Two Ways to Think and Talk about Reliability
The Damage Done by Drenick’s Theorem
Have you ever wondered why we use the assumption of a constant failure rate? Or considered why we assume our system is ‘in the flat part of the curve [bathtub curve]’?
Where did this silliness first arise?
In part, I lay blame on Mil Hdbk 217 and parts count prediction practices. Yet, there is theoretical support for the notion that for large, complex systems the overall system time to failure will approach an exponential distribution.
Thanks go to Wally Tubell Jr., a professor of systems engineering and test. He recently sent me his analysis of Drenick’s theorem and its connection to the notion of a flat section of a bathtub curve.
Wally did a little research and found the theorem lacking for practical use. I agree and will explain below. Continue reading The Damage Done by Drenick’s Theorem
3 MTBF Stories
Everyone loves a great story. Storytelling has been a long tradition to pass along knowledge and wisdom.
There are good stories, tales of inspiration. There are sad stories, tales of caution.
There are fables, ghost stories, legends, epic poems, and more. When considering the reliability performance of your product or equipment, you probably have a few stories that you can tell. “That time … “
Simply join colleagues for lunch and ask about the ‘major disasters’ of the past. The stories help us to remember and hopefully avoid repeating mistakes.
Here are three stories with MTBF as a central figure. It is a site and blog that does talk about MTBF, so it fits. To start, let me introduce you to Martin, a new reliability engineer reporting for his first day of work at a bicycle design and manufacturing company. Two sad stories and a good one. Enjoy. Continue reading 3 MTBF Stories
Different Data Same Decision
Let’s say you have some time to failure data on your equipment. A common action is to calculate the MTBF. All well and good until you expect to make a meaningful decision based on the calculation.
Using just the mean of the data, the MTBF value is likely to provide you with a less than useful bit of information. Thus your decision will be rather random or worthless.
Let’s explore just how this simple calculation of perfectly good data can mislead your decision making. Continue reading Different Data Same Decision
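As a small illustration of the point, the sketch below compares two sets of times to failure. Both data sets are hypothetical, invented only to show the effect, and the helper name is an assumption for this example.

```python
from statistics import mean

# Hypothetical times to failure in hours, invented for illustration only.
early_failures = [10, 20, 30, 40, 900]
wear_out = [180, 190, 200, 210, 220]


def fraction_surviving(times, t):
    """Fraction of units whose time to failure exceeds t hours."""
    return sum(1 for x in times if x > t) / len(times)


for name, data in (("early failures", early_failures), ("wear out", wear_out)):
    mtbf = mean(data)
    surviving = fraction_surviving(data, 100)
    print(f"{name}: MTBF = {mtbf} h, surviving 100 h = {surviving:.0%}")
```

Both sets have the same mean of 200 hours, yet only one in five units in the first set survives 100 hours while every unit in the second set does. A decision based on the mean alone treats them as identical.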
5 Reasons Rate of Change is Important
A simplifying assumption associated with using MTTF or MTBF implies a constant hazard rate. Some assume we’re in the useful life section of the bathtub curve. Others do not understand what assumptions they are making.
Using MTTF or MTBF has many problems and, as regular readers here know, we should avoid using these metrics.
By using MTTF or MTBF we also lose information. We are unable to measure or track the rate of change of our equipment or system’s failure rates (hazard rate). The simple average is just an average and does not contain the essential information we need to make decisions.
Let’s explore five different reasons the rate of change of a failure rate is important to measure and track. Continue reading 5 Reasons Rate of Change is Important
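One way to picture what the single average hides is the Weibull hazard function, h(t) = (β/η)(t/η)^(β−1). The sketch below is only an illustration. The shape values and the 1,000 hour scale are assumptions chosen for the example, not data from any system.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


eta = 1000.0  # hypothetical characteristic life in hours

for beta, label in ((0.5, "decreasing"), (1.0, "constant"), (3.0, "increasing")):
    early = weibull_hazard(100.0, beta, eta)
    late = weibull_hazard(500.0, beta, eta)
    print(f"beta = {beta}: h(100) = {early:.6f}, h(500) = {late:.6f} ({label})")
```

Only the β = 1 case has a hazard rate that stays put, which is the case MTBF silently assumes. For the other two, the direction and speed of the change is the information worth tracking.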
What About Weibull, Can I Use it Instead of MTBF?
This was a follow up question in a recent discussion with Alaa concerning using a metric other than MTBF.
The term ‘Weibull’ in some ways has become a synonym for reliability. Weibull analysis = life data (or reliability) analysis. The Weibull distribution has the capability to describe a changing failure rate, which is lacking when using just MTBF. Yet, is it suitable to use ‘Weibull’ as a metric? Continue reading How About Weibull Instead of MTBF?
Let’s Demand Better Reliability Engineering Content
Teaching reliability occurs through textbooks, technical papers, peers, mentors, and courses. The many sources available tend to use MTBF as a primary vehicle to describe system reliability.
What has gone wrong with our education process? Continue reading Are We Teaching Reliability All Wrong?
Life Data Analysis with only 2 Failures
Here’s a common problem. You have been tasked to peer into the future to predict when the next failure will occur.
Predictions are tough.
One way to approach this problem is to do a little analysis of the history of failures of the component or system. The problem looms larger when you have only two observed failures from the population of systems in question.
While you can fit a straight line to two failures and account for all the systems that operated without failure, it is not very satisfactory. It is at best a crude estimate.
Let’s not consider calculating MTBF. That would not provide useful information, as regular readers already know. So what can you do given just two failures to create a meaningful estimate of future failures? Let’s explore a couple of options. Continue reading Life Data Analysis with only 2 Failures
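The crude straight line fit mentioned above can be sketched as follows. This is one possible approach, not necessarily the one used in the full post. It uses Benard's median rank approximation and assumes every unit that has not failed has already run at least as long as the later failure, so the two failures take ranks 1 and 2 without adjustment. The failure times and population size are hypothetical.

```python
import math


def benard_median_rank(i, n):
    """Benard's approximation to the median rank of the i-th failure of n units."""
    return (i - 0.3) / (n + 0.4)


def weibull_from_two_failures(t1, t2, n):
    """Fit a line through two points on Weibull axes; return (beta, eta).

    Assumes t1 < t2 and that all n - 2 survivors have run at least t2 hours.
    """
    points = []
    for rank, t in ((1, t1), (2, t2)):
        f = benard_median_rank(rank, n)
        # Weibull plot coordinates: x = ln(t), y = ln(-ln(1 - F))
        points.append((math.log(t), math.log(-math.log(1.0 - f))))
    (x1, y1), (x2, y2) = points
    beta = (y2 - y1) / (x2 - x1)       # slope is the shape parameter
    eta = math.exp(x1 - y1 / beta)     # line crosses y = 0 at ln(eta)
    return beta, eta


# Hypothetical example: failures at 400 and 900 hours among 20 fielded units.
beta, eta = weibull_from_two_failures(400.0, 900.0, 20)
print(f"beta = {beta:.2f}, eta = {eta:.0f} hours")
```

With only two points the line fits perfectly by construction, so the result carries no indication of its own uncertainty. Treat it as a starting estimate at best.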
The Importance of the Discussions around MTBF Questions
The best way to help others understand and stop using MTBF is to engage them in a discussion. I get questions concerning MTBF or reliability a few times a week. I attempt to answer each and every one, and add a follow up question or two.
In person or online, ask and answer MTBF questions. You not only improve your understanding of MTBF and reliability, you improve your skill at telling stories to help effect change across your industry. Continue reading Discussions and MTBF Questions
The MTBF Stories You Tell Can Cause Change
Stories communicate well. We have been telling stories long before the invention of writing, or the internet. The MTBF stories we tell communicate our ideas, suggestions, and recommendations.
There are differences between good and poor stories. How you tell a story matters as well as the subject of the story. Now, MTBF stories may not be the most thrilling or entertaining, yet there are stories on MTBF topics that matter.
Let’s explore using the power of story to cause those around us to better understand and avoid the use of MTBF. Continue reading 3 Types of MTBF Stories
Trying to Respond to All Questions and Comments Concerning MTBF
Over the past couple of days, like most days, I have received questions and comments concerning MTBF. I do try to respond to all questions and acknowledge the comments.
Glad to help in any way I can, so please feel free to send me your questions. I certainly do appreciate the supporting comments, or any comments for that matter.
Let’s take a look at a few such discussions that occurred over the past two days. Continue reading 3 Recent Questions and Comments Concerning MTBF