First Impressions

At first MTBF seemed like a commonly used and useful measure of reliability. Trained as a statistician and understanding the use of the expected value that MTBF represents, I thought, ‘cool, this is useful’.

Then the discussions with engineers, technical sales folks, and other professionals about reliability using MTBF started. With them came the awareness that not everyone, and at times it seems very few, truly understood MTBF and how to use the measure properly.


The Fear of Reliability


MTBF is a symptom of a bigger problem. It is possibly a lack of interest in reliability. Which I doubt is the case. Or it is a bit of fear of reliability.

Many shy away from the statistics involved. Some simply do not want to know the currently unknown. It could be the fear of potential bad news that the design isn’t reliable enough. Some do not care to know about problems that will require solving.

Whatever the source of the uneasiness, you may know one or more coworkers who would rather not deal with reliability in any direct manner.

Being In The Flat Part of the Curve

What Does Being In The Flat Part of the Curve Mean?

To me it means very little, as it rarely occurs. Products fail for a wide range of reasons and each failure follows its own path to failure.

As you may understand, some failures tend to occur early, some later. Some we call early life failures, out-of-box failures, etc. Some we deem end of life or wear out failures. There are a few that are truly random in nature, such as a drop or accident causing an overstress fracture.

A Series of Unfortunate MTBF Assumptions


The calculation of MTBF results in a larger number if we make a series of MTBF assumptions. We just need more time in the operating hours and fewer failures in the count of failures.

While we really want to understand the reliability performance of field units, we often make a series of small assumptions that impact the accuracy of MTBF estimates.

Here are just a few of these MTBF assumptions that I’ve seen, in some cases nearly all of them with one team. Reliability data has useful information if we gather and treat it well.
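The effect described above, where more operating hours and fewer counted failures raise the result, can be sketched in a few lines. This is only an illustration: the unit count, usage hours, and failure counts are hypothetical numbers, and the `mtbf` helper is a made-up name for the usual point estimate of total operating hours divided by the number of failures.

```python
def mtbf(total_operating_hours: float, failures: int) -> float:
    """Point estimate: total operating hours divided by the failure count."""
    if failures <= 0:
        raise ValueError("need at least one failure for a point estimate")
    return total_operating_hours / failures


# Hypothetical field population (all numbers invented for illustration).
units = 1000
hours_actual = 8 * 250      # assumed real usage: 8 hours/day, 250 days/year
hours_assumed = 24 * 365    # optimistic assumption: every unit runs 24/7

# Same population, three sets of assumptions.
baseline = mtbf(units * hours_actual, failures=10)
more_hours = mtbf(units * hours_assumed, failures=10)
fewer_failures = mtbf(units * hours_assumed, failures=6)  # e.g. 4 returns dismissed

print(f"baseline:             {baseline:>12,.0f} hours")
print(f"assume 24/7 use:      {more_hours:>12,.0f} hours")
print(f"and drop 4 failures:  {fewer_failures:>12,.0f} hours")
```

Nothing about the hardware changed between the three lines of output; only the assumptions did.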

Time to Update the Reliability Metric Book

It is Time to Update the Reliability Metric Book with Your Help

Let’s think of this as a crowdsourced project. The first version of this book is a compilation of articles. It lays out why we do not want to use MTBF and what to do instead (to some extent).

With your success stories, advice on how to make progress using better metrics, examples, and case studies, the next version of the book will be much better and much more practical.

We Need to Try Harder to Avoid MTBF


Just back from the Reliability and Maintainability Symposium and not happy. While there are encouraging signs, such as a proudly worn button and regular mentions of progress and support, we still talk about reliability using MTBF too often. We need to avoid MTBF actively, no, I mean aggressively.

Let’s get the message out there concerning the folly of using MTBF as a surrogate to discuss reliability. We need to work relentlessly to avoid MTBF on all occasions.

Teaching reliability statistics does not require the teaching of MTBF.

Describing product reliability performance does not benefit from using MTBF.

Creating reliability predictions that produce MTBF values doesn’t make sense in most if not all cases.

3 Ways to Expose MTBF Problems


MTBF use and thinking is still rampant. It affects how our peers and colleagues approach solving problems.

There is a full range of problems that come from using MTBF, yet how do you spot the signs of MTBF thinking even when MTBF is not mentioned? Let’s explore three approaches that you can use to ferret out MTBF thinking and move your organization toward making informed decisions concerning reliability.

The Army Memo to Stop Using Mil HDBK 217


Over 20 years ago the Assistant Secretary of the Army directed the Army not to use MIL HDBK 217 in a request for proposals, even for guidance. Exceptions, by waiver only.

217 is still around and routinely called out. That is a lot of waivers.

Why are 217 and other parts count database prediction packages still in use? Let’s explore the memo a bit more, plus ponder what is maintaining the popularity of 217 and its ilk.

Why do we use Weibull++ over JMP?

Why do we use ReliaSoft instead of JMP to Identify the Time to Failure?

This is a question someone posted to Quora and the system prompted me to answer it, which I did.

This question is part of the general question about which software tools to use for specific situations. First, my response to the question.

Futility of Using MTBF to Design an ALT


Let’s say we want to characterize the reliability performance of a vendor’s device. We’re considering including the device within our system, if and only if, it will survive 5 years reasonably well.

The vendor’s data sheet lists an MTBF value of 200,000 hours. A call to the vendor and search of their site doesn’t reveal any additional reliability information. MTBF is all we have.

We don’t trust it. Which is wise.

Now we want to run an ALT to estimate a time to failure distribution for the device. The intent is to use an acceleration model to accelerate the testing and a time to failure model to adjust to our various expected use conditions.

Given the device, a small interface module with a few buttons, electronics, a display and enclosure, and the data sheet with MTBF, how can we design a meaningful ALT?
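To see how little the data sheet value says on its own, here is a rough sketch of the only calculation MTBF supports by itself. It assumes a constant hazard rate (an exponential time to failure distribution) and continuous 24/7 operation; neither assumption comes from the vendor, and the function name is made up for this example.

```python
import math

HOURS_PER_YEAR = 8760  # assumption: the device runs continuously


def exponential_reliability(t_hours: float, mtbf_hours: float) -> float:
    """R(t) = exp(-t / MTBF); meaningful only if the hazard rate is constant."""
    return math.exp(-t_hours / mtbf_hours)


mtbf_hours = 200_000                 # data sheet value quoted in the text
mission_hours = 5 * HOURS_PER_YEAR   # five years of continuous use

r_5yr = exponential_reliability(mission_hours, mtbf_hours)
print(f"Implied 5-year reliability: {r_5yr:.3f}")
```

Under those assumptions roughly 80% of units would survive five years. The number says nothing about which failure mechanisms dominate or how they respond to stress, which is the information an acceleration model needs.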

Two Ways to Think and Talk about Reliability


Neither includes using MTBF, btw.

And, I’m not thinking about the common language definition either.

Plus, I may have this all wrong. Here is the way I think about the reliability of something. More than ‘it should just work’ and different than ‘one can count on it to start’. When I ask someone how reliable a product is, this is what I mean.

By explaining my basic understanding we can compare notes. It is possible, quite possible, that I will learn something. As you may as well. Let’s see.

The Damage Done by Drenick’s Theorem


Have you ever wondered why we use the assumption of a constant failure rate? Or considered why we assume our system is ‘in the flat part of the curve [bathtub curve]’?

Where did this silliness first arise?

In part, I lay blame on Mil Hdbk 217 and parts count prediction practices. Yet, there is theoretical support for the notion that for large, complex systems the overall system time to failure will approach an exponential distribution.

Thanks go to Wally Tubell Jr., a professor of systems engineering and test. He recently sent me his analysis of Drenick’s theorem and its connection to the notion of a flat section of a bathtub curve.

Wally did a little research and found the theorem lacking for practical use. I agree and will explain below.

3 MTBF Stories


Everyone loves a great story. Storytelling has been a long tradition to pass along knowledge and wisdom.

There are good stories, tales of inspiration. There are sad stories, tales of caution.

There are fables, ghost stories, legends, epic poems, and more. When considering the reliability performance of your product or equipment, you probably have a few stories that you can tell. “That time …”

Simply join colleagues for lunch and ask about the ‘major disasters’ of the past. The stories help us to remember and hopefully avoid repeating mistakes.

Here are three stories with MTBF as a central figure. This is a site and blog that does talk about MTBF, so it fits. To start, let me introduce you to Martin, a new reliability engineer reporting for his first day of work at a bicycle design and manufacturing company. Two sad stories and a good one. Enjoy.

Different Data Same Decision


Let’s say you have some time to failure data on your equipment. A common action is to calculate the MTBF. All well and good until you expect to make a meaningful decision based on the calculation.

Using just the mean of the data, the MTBF value is likely to provide you with a less than useful bit of information. Thus your decision will be rather random or worthless.

Let’s explore just how this simple calculation on perfectly good data can mislead your decision making.
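A small sketch makes the point concrete. The two data sets below are invented for illustration, chosen so that both have the same mean time to failure while describing very different behavior; the 500 hour checkpoint is likewise an arbitrary choice.

```python
from statistics import mean

# Hypothetical times to failure in hours (invented for illustration).
early_failures = [10, 20, 30, 40, 4900]      # mostly early life failures
wear_out = [900, 950, 1000, 1050, 1100]      # tight wear-out cluster


def fraction_failed_by(times, t):
    """Fraction of observed units that failed at or before time t."""
    return sum(1 for x in times if x <= t) / len(times)


for name, data in (("early failures", early_failures), ("wear out", wear_out)):
    print(
        f"{name:>14}: MTBF = {mean(data):.0f} h, "
        f"failed by 500 h = {fraction_failed_by(data, 500):.0%}"
    )
```

Both sets report an MTBF of 1,000 hours, yet in one case most units are gone within 500 hours and in the other none are. A decision based on the mean alone treats them as identical.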

5 Reasons Rate of Change is Important


A simplifying assumption associated with using MTTF or MTBF implies a constant hazard rate. Some assume we’re in the useful life section of the bathtub curve. Others do not understand what assumptions they are making.

Using MTTF or MTBF has many problems and, as regular readers here know, we should avoid using these metrics.

By using MTTF or MTBF we also lose information. We are unable to measure or track the rate of change of our equipment or system’s failure rates (hazard rate). The simple average is just an average and does not contain the essential information we need to make decisions.

Let’s explore five different reasons the rate of change of a failure rate is important to measure and track.

How About Weibull Instead of MTBF?

What About Weibull, Can I Use it Instead of MTBF?

This was a follow-up question in a recent discussion with Alaa concerning using a metric other than MTBF.

The term ‘Weibull’ in some ways has become a synonym for reliability. Weibull analysis = life data (or reliability) analysis. The Weibull distribution has the capability to describe a changing failure rate, which is lacking when using just MTBF. Yet, is it suitable to use ‘Weibull’ as a metric?
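The changing failure rate mentioned above can be sketched with the standard two-parameter Weibull hazard function, h(t) = (β/η)(t/η)^(β−1). The names `beta` (shape) and `eta` (scale) follow common convention rather than anything in this post, and the parameter values are hypothetical.

```python
def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull hazard rate at time t (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1)


eta = 1000.0  # hypothetical scale parameter, in hours

# beta < 1: decreasing hazard, beta = 1: constant, beta > 1: increasing.
for beta in (0.5, 1.0, 2.0):
    rates = [weibull_hazard(t, beta, eta) for t in (100.0, 500.0, 1000.0)]
    formatted = ", ".join(f"{r:.5f}" for r in rates)
    print(f"beta = {beta}: h(100), h(500), h(1000) = {formatted}")
```

Only the beta = 1 case gives the constant rate that a lone MTBF value implies; the other shapes carry information about how the rate changes over time.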