Actions

Learn about MTBF

One advantage of learning about MTBF is that the same understanding applies to its many variations. Rather than engage in endless debates over what is or is not counted, shift the conversation to how to use the information for decisions. What is in service to the decision?

The Perils page provides some information about MTBF, and the web provides a wealth of information about this common metric. A basic understanding of this inverse of a failure rate goes a long way toward determining whether the data is being well represented.
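As a minimal sketch of what "inverse of a failure rate" implies, assume a constant failure rate of 1/MTBF, so the chance of surviving to time t is exp(-t/MTBF). The function name and the 50,000-hour figure are hypothetical and chosen only for illustration.

```python
import math

def reliability_constant_rate(t, mtbf):
    """Probability of surviving to time t when the failure rate is constant (1/mtbf)."""
    return math.exp(-t / mtbf)

mtbf_hours = 50_000.0  # hypothetical value, for illustration only
for t in (1_000.0, 10_000.0, mtbf_hours):
    r = reliability_constant_rate(t, mtbf_hours)
    print(f"t = {t:>8.0f} h  R(t) = {r:.3f}")
```

Under this assumption only about 36.8% of units are expected to survive to the MTBF itself, which is why reading MTBF as a failure-free period is a misunderstanding.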

Alternative Metrics

MTBF is often used to represent product life. It is neither complete nor sufficient. Product life or reliability has four elements: function, probability, duration and environment. MTBF addresses only the probability and, in most cases, assumes the duration does not matter or, worse, does not state it at all.

As an alternative, use reliability directly. State the probability of success over a specified time frame, along with the functions (which lead to an understanding of the product failure definition) and the environment. The function and environment are often abbreviated, e.g. a respirator provides life support breathing in North American intensive care facilities. The details of the functions and environment are often well stated in product development and marketing documents.

The probability and duration may include multiple statements, one for each important element of the product life. For example, since products that fail during first use damage the product brand significantly, we may want a very high probability of success during the first 3 months of product use. Say, 99.99% reliability over the first 3 months of use.

The warranty period may be another duration of interest: 98% reliability over the 1-year warranty period. The design life (how long the product should last and provide value to the customer) might be stated as 90% reliability over 5 years.

The early-life statement focuses on component, assembly, shipping and installation sources of product failure. The warranty period reliability is of interest as a business liability. The design life focuses on the longer-term failure mechanisms.
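To see why a single MTBF value cannot stand in for these three statements, here is a rough sketch that asks what MTBF each goal would imply if the failure rate really were constant. The constant-rate assumption and the names used are illustrative only; the three goals are the ones stated above.

```python
import math

# (label, reliability goal, duration in years) from the three statements above
goals = [
    ("early life", 0.9999, 0.25),
    ("warranty", 0.98, 1.0),
    ("design life", 0.90, 5.0),
]

def implied_mtbf(reliability, duration):
    """MTBF a constant failure rate would need so that R(duration) = reliability."""
    return -duration / math.log(reliability)

implied = {label: implied_mtbf(r, t) for label, r, t in goals}
for label, mtbf_years in implied.items():
    print(f"{label:<12} implied MTBF = {mtbf_years:,.1f} years")
```

The implied values are roughly 2,500, 49.5 and 47.5 years. No one number satisfies all three goals exactly, and none of them says anything about how long the product is expected to last.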

Therefore, move away from a partial statement concerning product reliability. Make full use of clear statements of expectations (goals) and measures.

Teach Others

This one is easy given your understanding of MTBF and its various pitfalls and misunderstandings. Do not permit those around you (peers, management, vendors, suppliers, and customers) to continue to misunderstand and misuse MTBF. Verify understanding and convey the actual meaning of the term.

Ask Questions

Besides asking about understanding and definitions, ask about the use of reliability statements.

  • Where did this come from?
  • How and when is this measured?
  • What decisions does this metric support?
  • How does the data support this measure?

Challenge Assumptions

There are two principal assumptions made related to MTBF, and both require your challenge.

First, what is the evidence that the underlying time to failure data supports the use of the exponential distribution? Or, the concept of a constant failure rate over the product life?
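One common way to look for that evidence is to estimate a Weibull shape parameter from the time to failure data: a shape near 1 is consistent with a constant failure rate, below 1 suggests a decreasing rate, and above 1 an increasing rate. The sketch below uses median rank regression with Benard's approximation and assumes complete data (no censoring). The failure times are hypothetical.

```python
import math

def weibull_shape_mrr(times):
    """Estimate the Weibull shape (beta) by median rank regression.

    Assumes complete data: every unit ran to failure, no censoring.
    """
    t = sorted(times)
    n = len(t)
    xs = [math.log(ti) for ti in t]
    # Benard's approximation to the median rank of the i-th ordered failure
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx  # slope of the Weibull probability plot

# Hypothetical times to failure in hours, for illustration only
failure_hours = [410, 620, 780, 905, 1010, 1130, 1260, 1400, 1590, 1880]
beta = weibull_shape_mrr(failure_hours)
print(f"estimated shape beta = {beta:.2f}")
```

With small samples the estimate is uncertain, so treat a single number as a prompt to plot the data and look at the pattern, not as proof either way.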

Even if the data supports the assumption, you still benefit by using reliability as the metric rather than MTBF, as it avoids the common misunderstandings surrounding MTBF.

Second, challenge the assumption based on ‘this is what our industry and customers always use’. A full statement of reliability can always be converted to an MTBF statement if required. To make decisions, use the data along with appropriate and accurate summaries and measures. To help your vendors and customers fully understand the reliability requirements or claims, use a complete reliability statement.
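As one possible illustration of that conversion, suppose a Weibull life distribution is assumed to pass through two stated points, 98% at 1 year and 90% at 5 years. The Weibull choice and the function names are assumptions made for this sketch, not something the reliability statement itself dictates.

```python
import math

def weibull_from_two_points(t1, r1, t2, r2):
    """Weibull shape and scale whose reliability curve passes through both points."""
    y1 = math.log(-math.log(r1))
    y2 = math.log(-math.log(r2))
    beta = (y2 - y1) / (math.log(t2) - math.log(t1))
    eta = t1 / (-math.log(r1)) ** (1.0 / beta)
    return beta, eta

def weibull_mean(beta, eta):
    """Mean time to failure of a Weibull distribution."""
    return eta * math.gamma(1.0 + 1.0 / beta)

beta, eta = weibull_from_two_points(1.0, 0.98, 5.0, 0.90)
print(f"beta = {beta:.2f}, eta = {eta:.1f} years, mean life = {weibull_mean(beta, eta):.1f} years")
```

The reverse trip is not available: a lone MTBF value does not tell you which reliability statement it came from.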

In conclusion

Years ago, while conducting factory assessments, we often asked about and inspected a supplier’s application of statistical process control (SPC). Often these programs were little more than a show of a few very poorly managed and applied SPC charts that resulted in no process improvements. One factory even used a locked display cabinet to display the charts, convenient for customer inspection. Two years later the same charts were still on display. Worthless to them and us.

The point is that MTBF and similar measures can have real value across the entire supply chain and product life cycle, but only when the measure accurately describes the data, is well or easily understood, and permits appropriate assessment of risks and tradeoffs. MTBF often fails these simple criteria.

What can you do? Do not use MTBF.

8 thoughts on “Actions”

  1. Hello Fred,

    I had a question regarding the use of MTBF.

    In the useful life phase of the product (from the bathtub curve), the failure rate remains constant and the times to failure follow an exponential distribution. Most of the calculation models assume this when they provide their results. As far as I understand from this group’s discussion, MTBF is useful when the failure data follows the exponential distribution (correct me if I am wrong on this). Also, the reliability is calculated using this MTBF/FR for the mission time required.

    Most customers are interested in knowing how their product behaves in the useful life phase, right? So then, can’t we use MTBF as a reference indicator in this case?

    Regards,

    Anup Hegde

    1. Hi Anup,

      Before you assume your system or component really has a constant hazard rate – get some data and check. In my experience, it is very rare that you will actually find a constant hazard rate – the bathtub curve in textbooks is used to explain different patterns of changing failure rates – it does not actually represent any actual data. The bathtub curve is fiction.

      Most customers want to know if your product will work for them over the duration they are interested in having it work – They would prefer no failures, yet understand they may occur. They would rather understand the actual expected pattern of failure rates over time, not some vague and very useless average.

      If you ask a customer what they want when asking for MTBF values, it may reveal they believe that MTBF is a failure-free period or some other misunderstanding of what it really means.

      Sure, assuming a constant hazard rate makes life simple and one could use the exponential distribution, yet it rarely is useful, accurate, or helpful to do so. I advise you to first understand the failure mechanisms, get the data and understand the failure rate pattern over time, then clearly represent the probability of failure over time.

      Cheers,

      Fred

  2. Hi Fred

    I deal exclusively with industrial pumps and my question is this: in the mining industry the PM usually comes out every 3 months (just an example); this happens because of what we know from repeated failures, for example. This is data we learned, right? The bad thing is that if the types of liners are replaced, then this could be unreliable. Is there a better method for centrifugal pumps that can help me? Like DAIMAC, or Weibull?
    I’m only an outsider trying to help my clients with their best options, so I don’t have access to reliability tools and data most of the time.

    1. Hi Pierre, Thanks for the comment. There are many tools or methods that may help you. The first step is to sort out what you are trying to accomplish and what data/resources you have available.

      A mean cumulative plot needs nothing special. It applies to repairable systems and helps show whether the repairs are helping or hurting the system’s reliability performance.
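A minimal sketch of how the points for such a plot might be computed, using Nelson's nonparametric estimate of the mean cumulative number of repairs per system. The data layout and the repair histories are hypothetical, and each repair time is assumed to fall within that system's observation period.

```python
def mean_cumulative_function(histories):
    """Nelson's nonparametric MCF estimate.

    histories: list of (repair_times, end_of_observation), one per system.
    Returns a list of (time, mean cumulative repairs per system).
    """
    events = sorted(t for repairs, _ in histories for t in repairs)
    mcf, points = 0.0, []
    for t in events:
        # number of systems still under observation at time t
        at_risk = sum(1 for _, end in histories if end >= t)
        mcf += 1.0 / at_risk
        points.append((t, mcf))
    return points

# Hypothetical repair histories in operating days, for illustration only
pumps = [
    ([95, 180, 250], 365),
    ([120, 300], 365),
    ([60, 200, 210, 340], 365),
]
for t, m in mean_cumulative_function(pumps):
    print(f"day {t:>3}: {m:.2f} repairs per pump")
```

Plotted against time, a roughly straight line suggests a steady repair rate, a curve bending upward suggests repairs are becoming more frequent, and a flattening curve suggests improvement.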

      A failure modes and effects analysis may reveal areas that need attention for correction, process improvements, etc.

      There are data analysis methods, like Weibull, if looking at time to first failure or non-repairable systems.

      And more, what is it you’re trying to accomplish?

      cheers,

      Fred

      1. Hi Fred
        I’m trying to choose and use the best possible liners for centrifugal pumps. Many people don’t understand the metallurgy and the problems slurry pumps have. Abrasives combined with acidic chemicals, in slurry or in water, are the worst-case combination to deal with. For abrasive slurries, high chrome usually works well; in acid, however, it’s CD4m that performs best. Here’s where it gets challenging, sir: having both at the same time. Companies lack pump experts and engineers who are knowledgeable and skilled in pumps. Usually we would present the product with the BEP and the needed flows and head for the customer. From that point they’ll make the decision, and many times it’s not the best option. This is where I can help them, by doing a system health assessment that looks at the water or slurry: the pH, SG, temperature, solids %, and so on. Great results happen if they listen and move forward with the recommendation. My question now is this: is there something other than MTBF that can help me keep track of the newly installed liners? Would an RCA or DAIMAC help to show the improvements? The final goal is reliability and predictability, to avoid any unplanned maintenance. I can do system health checks that show me the pump’s actual performance, so it’s more the data that I’m trying to document so I can show the improvement and the best options for a specific pump. A detailed description and breakdown of the pump performance and savings to share with the customer is what I’m looking for. This is very important because just words fall on deaf ears, so it has to be documented and shown to them. Thanks for your time
        Pierre

        1. Hi Pierre,

          You may want to combine the knowledge of how the different liner materials respond to the various slurry compositions. Anything, in my opinion, is better than MTBF or related measures. Instead, start with images or chemical analysis of failed liners, including the time they ran with x slurry.

          For example, in a pH 4.5 acidic environment, how long will the CD4m last and what does it look like after, say, 3 months? Likewise with stainless: it won’t last as long and will look very different at the same time period. Build a portfolio of images of what liners look like in different situations.

          Use the time to failure data to estimate the probability of failure over x time range. If PMs are every 3 months, then what is the chance of y liner material lasting 3 months? This analysis hinges on having a clear definition of failure.
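One way such an estimate could be made is a Kaplan-Meier survival estimate, which can use liners pulled at a PM while still serviceable as censored observations. The sketch below is a minimal version under that assumption. The run times are hypothetical, and 90 days stands in for the 3-month PM interval.

```python
def kaplan_meier(observations):
    """Kaplan-Meier survival estimate.

    observations: list of (run_time, failed); failed is False when the liner
    was removed while still serviceable (a censored observation).
    Returns a list of (time, estimated probability of surviving past time).
    """
    failure_times = sorted({t for t, failed in observations if failed})
    survival, curve = 1.0, []
    for t in failure_times:
        at_risk = sum(1 for rt, _ in observations if rt >= t)
        failures = sum(1 for rt, failed in observations if failed and rt == t)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Step-function lookup: estimated probability of lasting beyond t."""
    prob = 1.0
    for time, s in curve:
        if time <= t:
            prob = s
    return prob

# Hypothetical liner run times in days; False = replaced at a PM before failing
liners = [(45, True), (70, True), (90, False), (90, False), (110, True),
          (130, True), (150, True), (180, False)]
curve = kaplan_meier(liners)
print(f"estimated chance of lasting 90 days: {survival_at(curve, 90):.2f}")
```

For this made-up data the estimate at 90 days is 0.75. Running the same calculation per liner material and slurry condition gives a like-for-like comparison, provided failure is defined the same way each time.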

          Root cause analysis is a solid process to use with every failure. Always collect run time, and conditions of the slurry involved if possible.

          cheers,

          Fred
