Replace After MTTF Time To Avoid Failures – Right?
Received a short question last week. The person writing seems to already know the answer, yet asked:
If we replace an item after a duration equal to the MTTF value, we would avoid failures, right?
Well, no, most likely not, was my response. What is your response? How would you answer this question?
My First Response
MTTF is the total time divided by the number of failures. It really doesn’t matter what underlying pattern of failures occurs, or which distribution may properly describe the pattern of failures. Given the question, we do not know the distribution or pattern in the time to the failures.
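To make the point concrete, here is a minimal sketch of that calculation. The function name and both data sets are made up for illustration, and it assumes every unit ran to failure. Two very different failure patterns give the same MTTF.

```python
def mttf_estimate(times_to_failure):
    """Total operating time divided by the number of failures.

    Assumes every unit in the list ran until it failed (no censoring).
    """
    if not times_to_failure:
        raise ValueError("need at least one failure time")
    return sum(times_to_failure) / len(times_to_failure)


# Hypothetical data sets, in hours: mostly early failures vs. tightly clustered
early_failures = [100, 200, 300, 400, 5000]
clustered = [1100, 1150, 1200, 1250, 1300]

print(mttf_estimate(early_failures))  # 1200.0
print(mttf_estimate(clustered))       # 1200.0
```

In the first set, four of the five items fail well before 1,200 hours; in the second, two of the five do. The MTTF value alone cannot tell these apart.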
The use of MTTF suggests we are talking about non-repairable items or only considering the time to first failures. That doesn’t really help much.
Given only MTTF, if the failure rate is constant over time, then replacing a unit would at best keep the chance of failure the same. If the failure rate decreases over time, replacement makes the chance of failure worse, since the new unit starts over at the higher early failure rate.
If the items exhibit a wear out pattern, then replacing an item at any point in time would generally decrease the chance of failure.
In any of these circumstances, we would expect a large share of items to fail on or before the duration that is the same value as the MTTF value. If a constant failure rate applies, we would expect about 63% (roughly 2/3) of items in service to have failed before operating the MTTF duration.
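As a rough check on the fraction failed by the MTTF, here is a small sketch that assumes the time to failure follows a Weibull distribution. That is my assumption for illustration; the question gives us no distribution. A shape parameter beta below 1 means a decreasing failure rate, beta equal to 1 is the constant rate (exponential) case, and beta above 1 is wear out.

```python
import math


def fraction_failed_by_mttf(beta):
    """Weibull fraction failed at t = MTTF; the scale parameter cancels out."""
    # For a Weibull, MTTF = eta * Gamma(1 + 1/beta), so MTTF / eta = Gamma(1 + 1/beta)
    mttf_over_eta = math.gamma(1.0 + 1.0 / beta)
    # Weibull CDF: F(t) = 1 - exp(-(t / eta) ** beta), evaluated at t = MTTF
    return 1.0 - math.exp(-(mttf_over_eta ** beta))


for beta in (0.5, 1.0, 3.0):
    print(f"beta = {beta}: {fraction_failed_by_mttf(beta):.3f} failed by the MTTF")
```

Under these assumptions roughly three quarters, 63%, and half of the items fail before reaching the MTTF for the three shapes shown. In none of these cases does replacing at the MTTF avoid most failures.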
Keep in mind that MTTF represents the inverse of the failure rate per unit time. So a 1,200 hour MTTF value means there is about a 1 in 1,200 chance of failure each and every hour, if and only if the actual underlying distribution that describes the failure pattern is exponential. A constant hazard rate means each and every hour has the same chance of failure as any other hour. This doesn’t occur in nature, and is thus a rather poor assumption.
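Here is a short sketch of what a constant hazard rate implies, using the 1,200 hour MTTF from above and assuming an exponential distribution. The function names are mine. The chance of failing in the next hour, given the item has survived so far, does not depend on the item's age.

```python
import math

MTTF_HOURS = 1200.0  # value used in the text


def reliability(t):
    """Exponential reliability: chance of surviving to t hours."""
    return math.exp(-t / MTTF_HOURS)


def chance_of_failure_next_hour(age):
    """Chance of failing within the next hour, given survival to `age` hours."""
    r_now = reliability(age)
    return (r_now - reliability(age + 1.0)) / r_now


for age in (0, 600, 1200, 5000):
    print(age, round(chance_of_failure_next_hour(age), 6))
```

Every age gives the same value, about 0.00083, which is close to 1 in 1,200. Swapping a 1,200 hour old item for a new one changes nothing about its chance of failing in the next hour.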
My Suggestion to Plan for Maintenance
If using a non-repairable item, and you would like to minimize unscheduled failures, using MTTF is not going to help.
Instead, we need to understand if the items are exhibiting a decreasing or increasing failure rate over time, or a mix of the two. Get the data and plot the time to failure data.
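One simple way to see whether the data shows a decreasing or increasing failure rate is a Weibull probability plot, where the slope of the fitted line is the shape parameter beta. The sketch below is one common approach (median ranks with Benard's approximation and a least squares line), not the only one. It assumes complete data with no censoring, and the failure times are invented for illustration.

```python
import math


def weibull_fit_median_ranks(times):
    """Estimate Weibull (beta, eta) from complete failure times.

    Fits a least squares line to ln(t) vs. ln(-ln(1 - F)), with F taken
    from Benard's median rank approximation (i - 0.3) / (n + 0.4).
    """
    t = sorted(times)
    n = len(t)
    if n < 2:
        raise ValueError("need at least two failure times")
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                          # slope of the line = shape
    eta = math.exp(x_mean - y_mean / beta)    # from intercept = -beta * ln(eta)
    return beta, eta


# Hypothetical times to failure, in hours
failure_times = [410, 620, 790, 880, 1010, 1150, 1290, 1460, 1700, 2100]
beta, eta = weibull_fit_median_ranks(failure_times)
print(f"beta = {beta:.2f}, eta = {eta:.0f} hours")
```

A beta below 1 suggests a decreasing failure rate and a beta above 1 suggests wear out. With a small sample the estimate is rough, and a curved pattern on the plot hints at a mix of failure mechanisms, so look at the plot as well as the number.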
If the failure rate decreases over time, replace the items only when they fail, as any preventive replacement will increase the chance of failure. Better would be to improve the ability of the item to work upon installation, thus minimizing the initial chance of failure.
If the item has an increasing failure rate over time, replace the item when the chance of failure increases to an unacceptable level. Generally, consider the cost of the replacement along with the cost of unscheduled downtime to determine the optimal time for replacement.
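To sketch how the two costs trade off, here is the classic age replacement calculation: the expected cost of one replacement cycle divided by the expected length of the cycle. The Weibull parameters and the two costs are invented for illustration, and the integral is done with a simple trapezoid rule.

```python
import math


def reliability(t, beta, eta):
    """Weibull reliability: chance of surviving to time t."""
    return math.exp(-((t / eta) ** beta))


def cost_rate(T, beta, eta, cost_planned, cost_unplanned, steps=2000):
    """Expected cost per hour when replacing at age T or at failure, whichever is first."""
    h = T / steps
    # Expected cycle length = integral of R(t) from 0 to T (trapezoid rule)
    area = 0.5 * (reliability(0.0, beta, eta) + reliability(T, beta, eta))
    area += sum(reliability(i * h, beta, eta) for i in range(1, steps))
    expected_cycle_length = area * h
    r_T = reliability(T, beta, eta)
    expected_cycle_cost = cost_planned * r_T + cost_unplanned * (1.0 - r_T)
    return expected_cycle_cost / expected_cycle_length


def best_replacement_age(beta, eta, cost_planned, cost_unplanned, candidates):
    """Candidate age with the lowest expected cost per hour."""
    return min(candidates, key=lambda T: cost_rate(T, beta, eta, cost_planned, cost_unplanned))


candidates = range(10, 3001, 10)  # hours; 3000 stands in for "run to failure"
# Hypothetical wear out item: beta = 3, eta = 1000 hours, unplanned failure costs 10x
print(best_replacement_age(3.0, 1000.0, 100.0, 1000.0, candidates))
# Same costs with a constant failure rate (beta = 1)
print(best_replacement_age(1.0, 1000.0, 100.0, 1000.0, candidates))
```

With these made-up numbers the wear out case favors replacing well before the MTTF of roughly 890 hours, while the constant failure rate case picks the longest interval offered, which amounts to running to failure. The answer depends on the shape of the distribution and the cost ratio, not on the MTTF value.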
It could be a mix of failure mechanisms that show both decreasing and increasing failure rates over time; if so, apply both approaches mentioned. Get the data. Do not assume a constant hazard rate.
What is your approach? How would you answer the posed question? Use the comment field to add your thoughts.
MTTR is the total time divided by the number of failures? Really? Since when?
Mean Time To Repair MTTR suggests we are talking about non-repairable items?
Do you proof read this stuff before you post it?
Do you mean, no pun intended, MTTR or MTTF? I’ll take a look and update to be sure I didn’t mix these up, which is all too possible.
Sometimes I proofread, yet that is not my strong suit. Thanks for the heads up.
Cheers,
Fred
This is a very interesting example of something I see a lot: many people seem to look at MTTF and MTBF as meaning that failures will not occur before this time. They do not intentionally misunderstand; this seems to be a subconscious assumption. This is the first time I have seen it expressed so explicitly, however!
Plenty of articles here, Gerald, that I hope are explicit on the many ways MTBF is confused. Cheers, Fred
Fred,
It is worse. You said “MTTF is the total time divided by the number of failures.”
Actually, that equation provides a statistical estimate of the mean uncertainty for MTTF. That estimate can be very close to, or very different from, the true value of MTTF. What is even worse is that it is impossible to tell.
And to top it all off, you are guaranteed by the math and decision theory to make any decision using any mean estimate of any quantity.
The person asking this question might be better off using a random number generator for making that decision.
Mark Powell
Getting better data and doing better analysis than simple MTBF is always in order.