In Response to ‘What was the Original Purpose of MTBF Predictions?’
Guest Post by Andrew Rowland, Executive Consultant, ReliaQual Associates, LLC, www.reliaqual.com in response to the ‘Reliability Predictions’ article.
Hi Fred,
In the section on predictions you mention Dr. Box’s oft-quoted
statement that “…all models are wrong, but some are useful.” In the
same book Dr. Box also wrote, “Remember that all models are wrong; the
practical question is how wrong do they have to be to not be useful.” [see these and other quotes by Dr. George Box here]
Reliability predictions are intended to be used as risk and resource
management tools. For example, a prediction can be used to:
- Compare alternative designs.
- Guide improvement by showing the highest contributors to failure.
- Evaluate the impact of proposed changes.
- Evaluate the need for environmental controls.
- Evaluate the significance of reported failures.
None of these requires that the model provide an accurate prediction of
field reliability. The absolute values aren’t important for any of the
above tasks; the relative values are. This is true whether you express
the result as a hazard rate/MTBF or as a reliability. Handbook methods
provide a common basis for calculating these relative values; a
standard as it were. The model is wrong, but if used properly it can
be useful.
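As a minimal sketch of that relative use, consider two design alternatives scored with a parts-count style model. The part names and failure rates below are invented for illustration and are not taken from any handbook; the summation assumes a series system with constant part failure rates.

```python
# Hypothetical part failure rates in failures per million hours (FPMH).
# The numbers are invented for illustration, not taken from any handbook.
design_a = {"power supply": 4.0, "controller": 1.5, "connector": 0.5}
design_b = {"power supply": 2.0, "controller": 1.5, "connector": 0.5}


def total_rate(parts):
    """Series system with constant part failure rates: the rates add."""
    return sum(parts.values())


def ranked_contributors(parts):
    """Parts sorted by their share of the total predicted failure rate."""
    total = total_rate(parts)
    shares = ((name, rate / total) for name, rate in parts.items())
    return sorted(shares, key=lambda item: item[1], reverse=True)


# Only the ratio and the ranking are used; the absolute totals are not
# treated as estimates of field reliability.
ratio = total_rate(design_b) / total_rate(design_a)
top_name, top_share = ranked_contributors(design_a)[0]
print(f"B/A predicted failure-rate ratio: {ratio:.2f}")
print(f"Largest contributor in A: {top_name} ({top_share:.0%})")
```

The ratio and the ranking would drive the design decision here, even if neither total matches what the field eventually reports.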
Think about the use of RPNs in certain FMEAs. The absolute value of
the RPN is meaningless; the relative value is what’s important. For
sure, an RPN of 600 is high, unless every other RPN is greater than
600. Similarly, an RPN of 100 isn’t very large, unless every other RPN
is less than 100. The RPN is wrong as a model of risk, but it can be
useful.
I once worked at an industrial facility where the engineers would dump
a load of process data into a spreadsheet. Then they would fit a
polynomial trend line to the raw data. They would increase the order
of the polynomial until R^2 = 1 or they reached the maximum order
supported by the spreadsheet software. The engineers and management
used these “models” to support all sorts of decision making. They were
often frustrated because they seemed to be dealing with the same
problems over and over. The problem wasn’t with the method; it was
with the organization’s misunderstanding, and subsequent misuse, of
regression and model building. In this case, the model was so wrong it
wasn’t just useless, it was often a detriment.
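A small sketch of why chasing R^2 = 1 goes wrong, using invented data: a straight-line trend (y = 2x + 1) with small fixed disturbances. The data values, the choice of a Lagrange interpolating polynomial as the "maximum order" fit, and the prediction point are all assumptions made for illustration.

```python
# Hypothetical process data: a straight-line trend (y = 2x + 1) plus small,
# fixed "noise" values. All numbers are invented for illustration.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.3, 2.6, 5.5, 6.8, 9.1, 10.5, 13.4, 14.7]


def interpolating_poly(x):
    """Degree-7 polynomial through all 8 points (Lagrange form), so R^2 = 1."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def fit_line():
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def r_squared(model):
    """Coefficient of determination of a model on the observed data."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


slope, intercept = fit_line()


def line(x):
    return slope * x + intercept


x_new = 9  # just beyond the observed range; the underlying trend gives 19
poly_pred, line_pred = interpolating_poly(x_new), line(x_new)
print(f"R^2: polynomial {r_squared(interpolating_poly):.4f}, line {r_squared(line):.4f}")
print(f"Prediction at x={x_new}: polynomial {poly_pred:.1f}, line {line_pred:.1f}")
```

The polynomial reproduces every data point, yet its prediction just outside the data lands far from the trend value of 19, while the lower-R^2 straight line stays close to it.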
Reliability predictions often get bad press. In my experience, this is
mostly the result of misunderstanding of their purpose and misuse of
the results. I haven’t used every handbook method out there, but each
one that I have used states somewhere that the prediction is not intended to
represent actual field reliability. For example, MIL-HDBK-217 states,
“…a reliability prediction should never be assumed to represent the expected field reliability.”
I think the term “prediction” misleads
the consumer into believing the end result is somehow an accurate
representation of fielded reliability. When this ends up not being the
case, rather than reflecting internally, we prefer to conclude the
model must be flawed.
All that said, I would be one of the first to admit the handbooks could
and should be updated and improved. We should strive to make the
models less wrong, but we should also strive to use them properly.
Using them as estimators of field reliability is wrong whether the
results are expressed as MTBF or reliability.
Best Regards,
Andrew
Well said Andrew.
Thanks for posting it Fred.
I just recently had the same discussion.
The wrong use of “prediction” here is one to keep in mind.
Andrew, I agree with you that some models, although they do not provide an absolute correlation, can be useful. This is only true and valid if there is a link between the model and the actual physical mechanisms that cause failures. As you are familiar with MIL-HDBK-217, you know that most of its models are based on the Arrhenius equation, which does not describe the cause of the vast majority of electronics failures, and that was already the case when the handbook was last updated in the mid-1990s. A most excellent paper was presented at a recent RAMS, and it again shows why reliability predictions are a misleading approach. I have posted this public domain document for all to download at http://www.acceleratedreliabilitysolutions.com/images/Reliability_Predictions_Continued_Reliance_on_a_Misleading_Approach.pdf .
Kirk,
I agree with the criticisms outlined in the paper, but even the paper acknowledges that predictions can have value and that they are often improperly performed, misused, and/or misunderstood. All too often predictions are (mis)used or (mis)understood to represent the field reliability of the system under development. Criticism of handbook methods is also often directed from this viewpoint. If that were the purpose of handbook predictions, I would agree we should “throw out the baby with the bathwater.”
However, I was simply pointing out that handbook predictions are not intended to estimate the ultimate field reliability. If we shift our perspective and discuss handbook prediction results as relative rather than absolute, we may find more value in them. Of course, we may not, but we should be evaluating the tool from the viewpoint of its intended purpose, not our desired purpose.
Andrew
Hi Andrew,
My main issue with parts count predictions is that they tend to focus on providing MTBF only and generally assume a constant failure rate, despite the clear evidence in some cases of wear-out mechanisms. Plus, these methods tend to ignore early failures or just average them across some undefined period.
Not only is the information from parts count predictions misunderstood; the results, often expressed as MTBF, are also widely misunderstood and often very misleading.
We can and should do better.
Cheers,
Fred
If we divide the part hazard rate into three regions (early life, useful life, and wearout), we can separate the causes and identify corrective actions. Standards-based predictions can be used for early design estimates and scoring of alternatives. If wearout occurs before the required system service life, preventive maintenance or redesign to extend part life needs to be recommended. It would be nice to have prediction models capable of handling Weibull distributions, but that hasn’t happened yet in the major reliability tools.
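For readers less familiar with the Weibull distribution, here is a minimal sketch of how its shape parameter maps onto the three regions named above. The shape values, the characteristic life, and the two evaluation times are assumptions picked only for illustration.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard rate h(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


# Hypothetical shape values chosen only to illustrate the three regions:
# beta < 1 gives a falling hazard, beta = 1 a constant one, beta > 1 a rising one.
regions = {"early life": 0.5, "useful life": 1.0, "wearout": 3.0}
eta = 1000.0  # assumed characteristic life in hours

for name, beta in regions.items():
    early = weibull_hazard(100.0, beta, eta)
    late = weibull_hazard(900.0, beta, eta)
    trend = "falling" if late < early else "rising" if late > early else "constant"
    print(f"{name:12s} beta={beta}: h(100)={early:.6f}, h(900)={late:.6f} ({trend})")
```

A constant-failure-rate model corresponds to the beta = 1 case only, which is why it cannot represent early failures or wearout.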
Hi Vic, I agree that limiting parts count predictions to early comparison of design options is fine. When they are used to advertise reliability performance to customers, that’s wrong. I’ve not seen any product show three distinct phases as many textbooks and you have described. Rather, I recommend looking at the various sources of failure, as doing so tends to help identify causes and thus what can be done about them. I agree that running failures down to root cause and eliminating, mitigating, or minimizing the causes is all good. Cheers, Fred