In some cases we have to conduct testing and are asked not to break the product. Now, that isn’t all that fun for a reliability engineer. We want to find what fails and understand it. Or we want to confirm that what we expect to fail actually does.
So, what do we do when confronted with a very small sample size (that is one issue) and are expected to conduct failure free testing (second issue)? Let’s explore each issue separately and come up with a few suggestions on how to proceed.
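One common starting point for the failure-free testing question is the success-run relation, n = ln(1 - C) / ln(R). It assumes each unit either passes or fails one full test "lifetime" and that no failures are allowed. The sketch below is only an illustration under that assumption; the function names are my own and do not come from any particular library.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Units that must all pass one test 'lifetime' to demonstrate the
    given reliability at the given confidence (success-run relation)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


def demonstrated_reliability(sample_size: int, confidence: float) -> float:
    """Lower bound on reliability supported by sample_size units, zero failures."""
    if sample_size < 1 or not 0.0 < confidence < 1.0:
        raise ValueError("need at least one unit and a confidence between 0 and 1")
    return (1.0 - confidence) ** (1.0 / sample_size)


if __name__ == "__main__":
    # Hypothetical goal: 90% reliability demonstrated with 90% confidence.
    print(zero_failure_sample_size(0.90, 0.90))
    # Hypothetical constraint: only five units available for test.
    print(round(demonstrated_reliability(5, 0.90), 3))
```

With these assumptions, demonstrating 90% reliability at 90% confidence takes 22 failure-free units, while five failure-free units support only about 63% reliability at the same confidence. That gap is the small sample size problem in one line.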
As reliability engineers we are the local experts. We know the arcane arts of product life and equipment uptime design and maintenance. We are sought after to estimate useful life and time to first failure, and we are consulted when failures occur. Continue reading Are You Doing Your Professional Reading?→
Take Action Today to Improve How Your Organization Talks About Reliability
You know the perils of MTBF use: the widespread misunderstanding and misuse. You know how poorly MTBF treats your data.
You also know everyone around you uses MTBF. Your industry uses MTBF. And no one likes change, least of all about metrics concerning reliability.
As I said to a friend this morning, “The madness has to stop.”
And you feel the same way. So, what are you going to do about it? Here are five things you can do today.
Use the data to calculate reliability (probability of success) over a duration of interest along with calculating MTBF, then share the results.
Encourage five of your colleagues to check out and subscribe to this site, www.nomtbf.com.
Ask a vendor how they determined the MTBF value they are presenting on the data sheet. What evidence supports that claim and what assumptions are included (often unstated)?
The next time you hear someone mention MTBF, ask them what they mean. Then ask what percentage of items should survive a year. If the answers are not consistent, you have found a learning opportunity.
Write a blog post for the www.nomtbf.com site. What have you done to encourage better understanding of reliability concepts in your world? Share your hints, tips, stories, and advice here.
Pick one for today and do as many as you can. What would you add to this list? What kind of responses are you receiving when you speak out about the perils of MTBF?
Keep up the effort. Together we are making progress. Thanks for the support.
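To make the first and fourth suggestions concrete, here is a minimal sketch of turning an MTBF value into the probability of surviving a duration of interest. It assumes a constant failure rate (an exponential distribution), which is the assumption hidden inside most MTBF claims and may not hold for your product. The 50,000 hour MTBF and the function name are hypothetical, for illustration only.

```python
import math

# Assumes continuous operation: 24 hours x 365 days.
HOURS_PER_YEAR = 8760


def reliability_from_mtbf(mtbf_hours: float, duration_hours: float) -> float:
    """Probability of surviving duration_hours, ASSUMING a constant
    failure rate equal to 1 / mtbf_hours (exponential distribution)."""
    if mtbf_hours <= 0 or duration_hours < 0:
        raise ValueError("mtbf_hours must be positive and duration non-negative")
    return math.exp(-duration_hours / mtbf_hours)


if __name__ == "__main__":
    mtbf = 50_000  # hypothetical data sheet value, in hours
    one_year = reliability_from_mtbf(mtbf, HOURS_PER_YEAR)
    at_mtbf = reliability_from_mtbf(mtbf, mtbf)
    print(f"Expected to survive one year: {one_year:.1%}")
    print(f"Expected to survive to the MTBF: {at_mtbf:.1%}")
```

Under that assumption, a 50,000 hour MTBF means roughly 84% of units survive one year of continuous use, and only about 37% survive to the MTBF itself. If someone quotes that MTBF and also expects 99% to survive the year, you have found the inconsistency.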
How to Translate Customer Expectations About Reliability
As a customer when I purchase a new car, a toaster, or a pump for my production line, I expect it to work. To Just Work. As a reliability professional, I also have the language to specify what I mean by, ‘just work’.
Customers who are not reliability engineers do not accurately specify what they mean by ‘it should just work’. So, we have to do a little extra to help translate what they want into specifications that we (the manufacturer of the item) can create and deliver. Continue reading How to Translate Customer Expectations About Reliability→
I endured a difficult conversation with a project manager yesterday. The meeting agenda included an initial discussion about the product development reliability plan. She agreed that we needed to identify risks and provide feedback to the team concerning product reliability. Continue reading Is Reliability Just Testing?→
Mean time between failures (or mean time before failure) is a very common measure. The common definition describes MTBF as a reliability measure that is calculated by tallying operating hours and dividing by the number of failures. Intuitively this is the average time until a failure occurs. Mathematically it is the inverse of the failure rate. It is generally used for repairable systems. Continue reading Popular Reliability Measures and Their Problems→
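The common MTBF calculation described in that excerpt can be sketched in a few lines. The operating hours and failure count below are made-up values, and the function name is my own.

```python
def mtbf_from_tally(operating_hours: list[float], failures: int) -> float:
    """Common MTBF calculation: total operating hours divided by failure count."""
    if failures <= 0:
        raise ValueError("at least one failure is needed for this calculation")
    return sum(operating_hours) / failures


if __name__ == "__main__":
    # Hypothetical field data: hours logged by four units, four failures in total.
    hours = [1200.0, 3400.0, 800.0, 2600.0]
    mtbf = mtbf_from_tally(hours, failures=4)
    failure_rate = 1.0 / mtbf  # failures per operating hour
    print(f"MTBF: {mtbf:.0f} hours")
    print(f"Failure rate: {failure_rate:.4f} per hour")
```

Notice what the tally throws away: when each failure occurred. Four early failures and four late failures produce the same 2,000 hour result.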
It’s not MTBF. It’s not just the period of time the product does not fail. It’s not just a probability.
It’s a bit more. Reliability means it ‘just works’.
HP calculators are reliable. They work and keep on working. Apparently Lexus makes reliable cars (according to the current car rankings by Consumer Reports, 2015). My coffee maker is reliable.
The dictionary on my Mac says reliable is:
And, according to O’Connor and Kleyner in Practical Reliability Engineering, 5th ed., reliability is:
The probability that an item will perform a required function without failure under stated conditions for a stated period of time.
This is a definition we can use as engineers. It has four parts:
Function
Environment
Probability
Duration
And we certainly can define and measure each well.
BTW: MTBF covers only the probability part (and is actually stated as an inverse failure rate), so it does not fully define reliability.
Consistent, trustworthy? Yes, a reliable product or system should possess these essential qualities, too.
Reliability conjures many images and thoughts. The examples you envision are different than mine. That is fine. The concept remains the same. When an item is reliable, it just works. I like to add that it just keeps on working.
When setting goals, estimating, predicting, or measuring reliability, use all four elements of the definition laid out by O’Connor and Kleyner. Be clear and complete. Keep it simple and make it reliable.
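One way to keep a goal honest about all four elements is to refuse to write it down until each one is filled in. The sketch below is just an illustration of that idea; the class name, field names, and the coffee maker values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReliabilityGoal:
    """A reliability goal stated with all four elements of the definition."""
    function: str       # what the item must do
    environment: str    # the stated conditions of use
    probability: float  # probability of success, between 0 and 1
    duration: str       # the stated period of time

    def __post_init__(self) -> None:
        if not 0.0 < self.probability < 1.0:
            raise ValueError("probability must be between 0 and 1")
        for name in ("function", "environment", "duration"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must be stated")

    def statement(self) -> str:
        return (f"{self.probability:.0%} probability that the item will "
                f"{self.function} {self.environment} for {self.duration}.")


if __name__ == "__main__":
    # Hypothetical goal for a coffee maker.
    goal = ReliabilityGoal(
        function="brew a full pot of coffee",
        environment="in a home kitchen",
        probability=0.95,
        duration="5 years of daily use",
    )
    print(goal.statement())
```

A bare MTBF value cannot be written in this form, because it leaves the function, environment, and duration unstated.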
What comes to mind when you think of reliability? Leave a comment and share what you consider reliable.
When my son was young he asked a lot of questions that were difficult to answer. For example:
Why is the sky blue?
Why do I have to go to school?
What is a conspiracy theory?
The first two were expected, yet the third set me back a little. How do you explain a conspiracy theory to a 5th grader? The dictionary-type definitions just seemed to confuse everyone. So, I made up a conspiracy theory.
I said, “Did you know North Dakota is not really a state?”
For those that haven’t heard of North Dakota, which on many maps is in the north central part of the US, that just reinforces the theory that it doesn’t exist.
My son, having recently memorized all fifty US states and their capital cities in school, said I was wrong, and that he knew it was a state because he still recalled the name of its capital city.
“Prove it,” was all I said in response.
“Well, it’s on the map of the country as a state.” My reply included how maps change and are arbitrary. Anyone could have drawn the map, so how do we know it is accurate? Maybe the good folks in South Dakota paid the map maker to draw in the fictitious state of North Dakota.
“It’s listed in Wikipedia!” My reply was about how anyone can create a posting on the site, so what is the proof it’s actually true? “Have you ever seen a car with ND plates or met someone from there?” He hadn’t.
My son knew I was only demonstrating the idea of a conspiracy theory. We had fun with it for years.
I was glad he never asked me,
“Why do people use MTBF?”
Just as with the blue sky, a shrug and a smile wouldn’t be a good enough answer. There has to be a rational reason people use MTBF.
After writing about perils of MTBF use for a few years, my current theory is it has to be a conspiracy.
The MTBF conspiracy theory revealed
Here’s what I think happened.
A bright engineer was tasked with estimating the reliability of a nuclear submarine’s electronics. He was given about a month to achieve this task, which is not enough time to conduct any testing. So, he gathered all the component failure rate data, tallied it up and reported the expected failure rate. {Parts count prediction}
The marketing department noticed the failure rate value and the word failure. The admission that the submarine might fail didn’t help to sell submarines, so they flipped the failure rate over, creating the average time between failures, or mean time between failures, MTBF.
The lower the failure rate the higher the MTBF went. Up was good. Failure is bad. {That’s how I think marketing folks think – sorry}
The engineers understood failure rates, and the math to create MTBF was pretty simple. So whatever, ’tis the same thing. Then management got involved.
The management team only wanted to read and talk about MTBF {again, the word ‘failure’ is bad thinking}. They set MTBF goals, they expected glowing reports of increasing MTBF values, and so on.
Then something really bad happened.
The US Military created a standard. And a company used a computer to automate the standard’s estimate of MTBF. Others did too. Now there was profit to be made by estimating MTBF, not reliability. So, they sold MTBF estimations. After all, that is what the management team wants: MTBF.
The military standard spawned many industry standards. The standards became part of purchase contracts. MTBF flourished.
“What is your MTBF?” became an acceptable way to ask about reliability performance.
The murky bit of the theory involves why very few stood up to say, “Let’s not use MTBF, it is not very useful. Let’s use the probability of success over a duration (reliability) instead.” You may have said these very words or words to the same effect. And you felt the resistance.
We always use MTBF.
Everyone in our industry uses MTBF.
The vendor only provides MTBF values.
My theory is we all know better {maybe not the marketing folks – sorry} and we just do not feel able to overcome the resistance to change. We know we could do much better with better metrics, yet the backlash is unrelenting.
Just as that first engineer figured out a quick way to come up with a failure rate estimate, we too face the necessity to use MTBF. We do not have the time or energy to change our company or industry to stop using MTBF. So, we just do it.
It’s easy.
I don’t know if the spread of MTBF use is organized by a secret group or not. I suspect not. Yet the ease of use and avoidance of the word failure (or anything that smells like we would have to do statistics) conspired to trap us into using MTBF.
That’s my theory. If you know of any critical bits of information to support this theory, let me know. If we expose the conspiracy for what it is, it may just fade away. We then may get back to work doing reliability engineering and creating reliable products.
Does a Certification Make You a Professional Reliability Engineer?
No, it doesn’t.
It’s just a piece of paper that conveys you mastered some body of knowledge. You most likely also committed to abide by a code of ethics. Plus you may have committed to continuing education to maintain the certification.
Having a certification means you know the terms, definitions, techniques and concepts concerning reliability engineering. That’s all.
Does it mean you are a professional? No.
Being Professional
The dictionary describes professional as being associated or involved with a profession. You are professional by working in or studying the profession of reliability engineering. Yet, we commonly consider a professional as being more than just a person with a job title.
A professional, in my mind, exemplifies the essence of a noble, caring, capable engineer. One who works for the greater good. Someone who strives to make the world a better place. (Insert pedestal here.)
This is the nature of the engineering code of ethics that professional societies draft and encourage members to live. The following are just examples of the many similar codes that exist:
There are many others and they are all similar. Be honest, forthright and fair in your work.
You probably already adhere to these various codes of ethics. You do not have to pay membership dues to demonstrate you are ethical. It’s how you work, behave and conduct your life.
You are a professional reliability engineer by the way you solve problems, continue to learn, assist others willingly, and exemplify how the reliability engineering profession makes the world a better place.
Certifications are Good, too.
There are different types of certifications and many organizations offer certificates. For reliability engineering there are three professional societies that I know about that offer certifications.
American Society for Quality Certified Reliability Engineer
Some engineers have all three certifications. Some only one. Many professional engineers do not have any certification. It’s a personal decision. You can strive to work as a professional with or without securing one or more of the certifications offered by professional societies.
I should mention there are many other certifications offered in our industry. Conferences, software companies, and consulting and training organizations offer certifications. These, like the ones offered by professional societies, are not licenses (state license or charter). The various certifications simply mean the person met some level of experience or course work, demonstrated a body of work, or passed a test.
It doesn’t mean they are a professional.
If you are pursuing a certification, why? Please add a comment on what certification means to you and your career.
Just a quick note. A good friend forwarded me a course listing with the note, “This is not the traditional MTBF course, despite the title.” or something like that.
So, I looked into the course offered by The MIRCE Akademy, titled Mean Time Between Failures — MTBF: Scientific method for the accurate predictions of Mean Time Between Failures.
Looks interesting.
So, this may be a good one to take. You can find more information at the course page.
If you have taken or plan to take the course, please leave a comment here to let others know how it went and what you learned.
Sometimes making an assumption is a good thing. You can achieve more with less. A well placed assumption saves you time, work, and worry. The right assumption may even be left unstated, it’s so good.
Have you ever assumed the failures for a system follow an exponential distribution? Did you assume tallying up the total hours and dividing by the number of failures was appropriate? Did you even check? (You don’t need to answer.) Continue reading The Convenient Use of MTBF→
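To see why the check matters, here is a small sketch comparing three hypothetical populations that share the same mean time to failure but follow different Weibull distributions. A shape parameter of 1 is the exponential case; the other shapes, the 10,000 hour mean, and the function names are my own assumptions for illustration.

```python
import math


def weibull_scale_for_mean(mean_hours: float, shape: float) -> float:
    """Weibull scale (eta) that gives the requested mean for a given shape (beta)."""
    return mean_hours / math.gamma(1.0 + 1.0 / shape)


def weibull_reliability(t_hours: float, shape: float, scale: float) -> float:
    """Probability of surviving t_hours under a two-parameter Weibull."""
    return math.exp(-((t_hours / scale) ** shape))


if __name__ == "__main__":
    mean_hours = 10_000.0  # the same "MTBF" for every case (hypothetical)
    mission_hours = 1_000.0
    # 0.5: early failures dominate, 1.0: exponential, 3.0: wear-out dominates
    for shape in (0.5, 1.0, 3.0):
        scale = weibull_scale_for_mean(mean_hours, shape)
        r = weibull_reliability(mission_hours, shape, scale)
        print(f"shape {shape}: reliability at {mission_hours:.0f} hours = {r:.3f}")
```

All three cases report the same 10,000 hour mean, yet the chance of surviving the first 1,000 hours ranges from about 64% to about 99.9%. The exponential assumption (about 90%) is only right if the data support it.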
Speaking of Reliability: a podcast of good friends sitting down with you to talk about reliability engineering.
A new reliability engineering focused podcast is now available on iTunes. Give it a listen and please leave a rating and review.
The intent is to publish two times a week. The conversational format is inspired by a chance lunch with two young engineers new to reliability engineering. The questions they asked and our conversation helped them get started and improve their programs. So, let us know your questions.
Dare to Know: Interviews with Quality and Reliability Thought Leaders with Host Tim Rodgers.
Meet the people that shape our profession. Authors, bloggers, consultants, scholars, business leaders. Learn about their insights and motivations.
Now available on iTunes. Please leave a rating and review. Help others find this new podcast.
Enjoy the shows and contact us with any ideas or thought leaders you want us to engage for a future show.
It’s not. In this case the customer is probably not asking for MTBF. What they most likely want to know is something meaningful about the expected reliability performance of the item in question. They want to know if what they will or did purchase will last as long as they expect. Continue reading Just Because the Customer Requests MTBF→