Given MTBF? Now What?
Let’s say you join a project as a reliability professional (or an engineer or manager of any type) and you discover that the team has a reliability goal stated as 5,000 hours MTBF.
What are you going to do with that information?
The meaning of the current goal
A 5,000-hour MTBF might be exactly the right metric and value for the project. It probably is not.
To find out what this value means you need some more information. Ask around and find:
- What is the primary function of the device?
- How long should it provide that function for customers?
- Is there a warranty period or expected lifetime duration?
- How many hours per year will the device be operated?
- How many should survive the warranty period or expected useful lifetime?
- In what environment should it work?
Note that these questions help you find each of the four elements of a full reliability goal description: function, environment, duration, and probability. Key to this discovery is the duration of useful life (or warranty), along with how many units are expected to survive each duration.
The durations are often linked to market expectations. The probabilities of survival are connected to business objectives of profitability and customer satisfaction.
Let’s say we determine that there is a one year warranty and the business objectives expect less than 2% of units will fail during the warranty period.
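A simple calculation is now possible. Assuming a constant failure rate (the exponential distribution, which is the usual assumption when only an MTBF is given) and that the unit is susceptible to failure every hour of the year, or 8,760 hours, then
$$R(t) = e^{-t/\mathrm{MTBF}}, \qquad R(8760) = e^{-8760/5000} \approx 0.17$$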
That means the goal is to have only 17% of units survive the warranty period.
That is not very good.
A bit more information
I would show this simple calculation to others and ask if the reliability goal was correct.
Let’s say we quickly learn that the most likely way the product will fail is due to fan failure. And, we discover the unit is only expected to operate 2,000 hours per year.
Changing the calculation to reflect the reliability at 2,000 hours, and again assuming a constant failure rate, we find
$$R(2000) = e^{-2000/5000} \approx 0.67$$
That is, 67% of units would be expected to survive.
Still not great.
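The two calculations above can be reproduced with a few lines of code. This is a minimal sketch that assumes a constant failure rate (exponential distribution), the usual assumption when only an MTBF is given; the function name `reliability` is illustrative.

```python
import math

def reliability(t_hours, mtbf_hours):
    """Probability of surviving t_hours, assuming a constant failure rate."""
    return math.exp(-t_hours / mtbf_hours)

MTBF = 5_000  # hours, the stated goal

# Full calendar year of exposure (8,760 hours)
print(f"{reliability(8_760, MTBF):.0%}")  # prints 17%

# Expected operating time of 2,000 hours per year
print(f"{reliability(2_000, MTBF):.0%}")  # prints 67%
```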
We may find with additional discussion a few reasons for such a poor reliability goal.
It could be
- Thinking MTBF was a failure-free period
- Thinking a 5,000-hour MTBF was 2.5 times the expected 2,000 hours of operation, and thus had plenty of margin
- 5,000 hours was the goal for the last project and we copied it to this one
- This is a new high-risk project and we set the goal very low on purpose
- This is an improvement on the last product, which had 50% fail in the warranty period
Or, something else.
Whatever the reason, if the business goal is to have fewer than 2% fail in the warranty period and our engineering goal implies that 33% of devices will fail, something needs adjusting.
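To see how far apart the two goals are, the same relationship can be inverted to find the MTBF that would correspond to 98% survival. This is a sketch under the same constant failure rate assumption, which may not hold for a wear-out item such as a fan; the function name `required_mtbf` is illustrative.

```python
import math

def required_mtbf(t_hours, target_reliability):
    """MTBF needed for target_reliability at t_hours (constant failure rate)."""
    return -t_hours / math.log(target_reliability)

# 98% surviving 2,000 operating hours
print(round(required_mtbf(2_000, 0.98)))  # roughly 99,000 hours

# 98% surviving a full year of 8,760 hours
print(round(required_mtbf(8_760, 0.98)))  # roughly 434,000 hours
```

Under this assumption, the 5,000-hour figure is about a factor of 20 short of what the business goal implies for 2,000 operating hours.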
This simple example, which is all too common, illustrates that even a simple calculation to interpret MTBF and compare the results to expectations may cause ‘some discussion’. Hopefully, it helps the organization to adjust the goal and how it is stated to be something meaningful.
Given this situation you may want to take the next steps:
- Restate the reliability goal in terms of reliability (function, environment, duration, probability). In this case, the device will function in an office setting for one year with 98% surviving. Add other duration/probability couplets as needed for clarity.
- Create an apportionment model for the device and major subsystems/components.
- Identify past product or similar product performance – also in terms of reliability (not MTBF).
- Identify high risk of failure areas and focus engineering and supply chain improvements there.
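As an illustration of the apportionment step, one simple approach treats the device as a series system and splits the system goal among subsystems with weights. The sketch below is only one of several allocation methods; the subsystem names other than the fan, and all of the weights, are hypothetical.

```python
import math

SYSTEM_GOAL = 0.98  # probability of surviving the one-year warranty

# Hypothetical subsystems and weights: a larger weight gives that
# subsystem a larger share of the allowed failures.
weights = {"fan": 0.5, "power supply": 0.3, "controller board": 0.2}

def apportion(system_goal, weights):
    """Weighted allocation for a series system: R_i = R_system ** w_i."""
    total = sum(weights.values())
    return {name: system_goal ** (w / total) for name, w in weights.items()}

goals = apportion(SYSTEM_GOAL, weights)
for name, r in goals.items():
    print(f"{name}: {r:.4f}")

# In a series system the subsystem goals multiply back to the system goal.
print(f"system: {math.prod(goals.values()):.4f}")  # prints system: 0.9800
```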
After device launch to the market
Once the product is out there, monitor its reliability performance. Again, not in terms of MTBF, as we really are not interested in the mean. Rather, we are interested in the 2nd percentile of the time-to-failure distribution, since 2% failing is our limit for the warranty period.
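A very simple way to track this is to compare the fraction of a shipped cohort that failed within warranty against the 2% target. The sketch below uses made-up numbers and assumes every unit in the cohort has already completed its warranty period; a cohort with units still in service would need a method that handles censored data.

```python
def fraction_failed(units_shipped, failure_ages_days, warranty_days=365):
    """Fraction of the cohort that failed on or before the end of warranty."""
    in_warranty = sum(1 for age in failure_ages_days if age <= warranty_days)
    return in_warranty / units_shipped

# Hypothetical cohort: 1,000 units, all shipped more than a year ago.
# Ages (in days) at which returned units failed; illustrative values only.
failure_ages = [12, 45, 88, 130, 201, 240, 299, 310, 350, 362, 401, 520]

TARGET = 0.02  # fewer than 2% failing in warranty
observed = fraction_failed(1_000, failure_ages)

print(f"{observed:.1%} failed in warranty")  # prints 1.0% failed in warranty
print("meets goal" if observed < TARGET else "misses goal")
```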
Stating reliability clearly helps. Helping others understand the meaning of MTBF also helps.
Go be useful!