There are occasions when we have either field or test data that includes the duration of operation and whether or not the unit failed. Say the units are 10 large motors. For the sake of argument, the test ran each motor for 1,000 hours, and when a motor failed it was repaired quickly and returned to the test. There were 3 failures.

Sadly, this is all we need to calculate an estimate for the motor MTBF.

MTBF is the total operating time divided by the number of failures. In this case, 10 motors times 1,000 hours gives a total time of 10,000 hours. Divide 10,000 by the three failures to find an MTBF of 3,333 hours.
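As a minimal sketch of that arithmetic in Python (the function name `mtbf_point_estimate` and its parameters are made up for illustration, and repair time is assumed to be negligible, as in the example):

```python
def mtbf_point_estimate(units, hours_per_unit, failures):
    """Total operating time divided by the number of failures.

    Assumes every unit accumulates the same operating hours and that
    repair time is short enough to ignore.
    """
    if failures <= 0:
        raise ValueError("this simple estimate needs at least one failure")
    total_hours = units * hours_per_unit
    return total_hours / failures


# The motor test: 10 motors, 1,000 hours each, 3 failures.
motor_mtbf = mtbf_point_estimate(units=10, hours_per_unit=1_000, failures=3)
print(f"{motor_mtbf:.0f} hours")
```

Note that the only inputs are total hours and a failure count. When each failure happened never enters the calculation.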

What I find interesting is that I could find the same MTBF value using 10,000 motors each run for one hour, or by running one motor for 10,000 hours. If in each case there were three failures, we would find an MTBF of 3,333 hours.

Now that works perfectly well when there is a constant failure rate, meaning there is an equal chance of failure in each hour of operation. Old motors would have the same chance of failure as brand new motors.
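To make the "old motors are as good as new" consequence concrete, here is a small sketch that assumes the exponential model usually paired with a constant failure rate, R(t) = exp(-t / MTBF). The function names and the 5,000-hour and 100-hour figures are illustrative choices, not values from the test.

```python
import math


def survival(t_hours, mtbf_hours):
    """Chance of running t_hours without failure, constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)


def conditional_survival(age_hours, extra_hours, mtbf_hours):
    """Chance of surviving extra_hours more, given survival to age_hours."""
    return (survival(age_hours + extra_hours, mtbf_hours)
            / survival(age_hours, mtbf_hours))


mtbf = 10_000 / 3  # the 3,333-hour estimate from the motor example

# Chance of getting through the next 100 hours for a new and a used motor.
new_motor = conditional_survival(0, 100, mtbf)
old_motor = conditional_survival(5_000, 100, mtbf)
print(new_motor, old_motor)
```

Under this assumption the two values agree (up to floating-point rounding), which is exactly what the simple MTBF calculation takes for granted.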

Of course, you know why I chose motors for the example: to reinforce the idea that the chance of failure is not always constant. Be sure to think about the failure mechanisms before using MTBF (or MTTF). If the failure rate is time dependent, then this simple calculation is not useful.
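One common way to describe a time-dependent failure rate is the Weibull distribution. The sketch below is my own illustration rather than anything fitted to the motor data: the shape and scale values are hypothetical, chosen only to contrast a constant rate (shape = 1) with a wear-out pattern (shape > 1).

```python
def weibull_hazard(t_hours, shape, scale_hours):
    """Instantaneous failure rate of a Weibull distribution at t_hours > 0."""
    return (shape / scale_hours) * (t_hours / scale_hours) ** (shape - 1)


# Hypothetical parameters, for illustration only.
scale = 3_000.0

# shape = 1 reduces to a constant failure rate of 1 / scale.
constant_early = weibull_hazard(100, shape=1.0, scale_hours=scale)
constant_late = weibull_hazard(3_000, shape=1.0, scale_hours=scale)

# shape = 3 is a wear-out pattern: the rate climbs as the motor ages.
wearout_early = weibull_hazard(100, shape=3.0, scale_hours=scale)
wearout_late = weibull_hazard(3_000, shape=3.0, scale_hours=scale)

print(constant_early, constant_late)
print(wearout_early, wearout_late)
```

With the wear-out shape, the failure rate at 3,000 hours is far higher than at 100 hours, so a single MTBF number hides most of what matters.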

I used this example during a class last week and it seemed to spark a good discussion. How have you explained MTBF to others? Any suggestions on how to best describe what the MTBF value really means, or doesn’t mean?

In my case, trying to calculate reliability and/or MTBF for a subsystem is very frustrating. The advertised reliability of many components is based on the OEM’s projection or engineering analysis because no one wants to spend the dollars and time required to truly test the component thoroughly. Trying to validate their projection or engineering analysis at the system level usually means that my only data is based on one or two failures for a series of tests that accumulate a total of 30 to 40 hours of operation. For simplicity, I am usually forced to ignore the conditions of testing (e.g., temperature, altitude, load). Components that require a high level of reliability (R) and confidence (C) need several hundred hours of operation to fully demonstrate R&C.

Hi Bill, I feel your pain. Vendors have to deal with many operating conditions and use cases. They tend to do what is requested by the majority. Unfortunately, so many seem happy with very poor information that those who need and request better information are thwarted. I suggest we continue to ask for meaningful information, educate our peers to do likewise, and when all else fails do the testing ourselves.

Cheers,

Fred