As reliability engineers we are the local experts. We know the arcane arts of product life and equipment uptime design and maintenance. We are sought after to estimate useful life and time to first failure, and we are consulted when failures occur. Continue reading Are You Doing Your Professional Reading?→
How to Translate Customer Expectations About Reliability
As a customer when I purchase a new car, a toaster, or a pump for my production line, I expect it to work. To Just Work. As a reliability professional, I also have the language to specify what I mean by, ‘just work’.
Customers who are not reliability engineers do not accurately specify what they mean by ‘it should just work’. So, we have to do a little extra to help translate what they want into specifications that we (the manufacturer of the item) can create and deliver. Continue reading How to Translate Customer Expectations About Reliability→
I endured a difficult conversation with a project manager yesterday. The meeting agenda included an initial discussion about the product development reliability plan. She agreed that we needed to identify risks and provide feedback to the team concerning product reliability. Continue reading Is Reliability Just Testing?→
Mean time between failure or mean time before failure is very common. The common definition describes MTBF as a reliability measure that is calculated by tallying operating hours and dividing by the number of failures. Intuitively this is the average time until a failure occurs. Mathematically it is the inverse of the failure rate. Generally used for repairable systems. Continue reading Popular Reliability Measures and Their Problems→
Does a Certification Make You a Professional Reliability Engineer?
No, it doesn’t.
It’s just a piece of paper that conveys you mastered some body of knowledge. You most likely also committed to abide by a code of ethics. Plus you may have committed to continuing education to maintain the certification.
Having a certification means you know the terms, definitions, techniques and concepts concerning reliability engineering. That’s all.
Does it mean you are a professional? No.
Being Professional
The dictionary describes professional as being associated or involved with a profession. You are a professional by working in or studying the profession of reliability engineering. Yet, we commonly consider a professional as being more than just a person with a job title.
A professional, in my mind, exemplifies the essence of a noble, caring, capable engineer. One that works for the greater good. Someone who strives to make the world a better place. (Insert pedestal here.)
This is the nature of the engineering code of ethics that professional societies draft and encourage members to live. The following are just examples of the many similar codes that exist:
There are many others and they are all similar. Be honest, forthright and fair in your work.
You probably already adhere to these various codes of ethics. You do not have to pay membership dues to demonstrate you are ethical. It’s how you work, behave and conduct your life.
You are a professional reliability engineer by the way you solve problems, continue to learn, assist others willingly, and exemplify how the reliability engineering profession makes the world a better place.
Certifications are Good, too.
There are different types of certifications and many organizations offer certificates. For reliability engineering there are three professional societies that I know about that offer certifications.
American Society for Quality Certified Reliability Engineer
Some engineers have all three certifications. Some only one. Many professional engineers do not have any certification. It’s a personal decision. You can strive to work as a professional with or without securing one or more of the certifications offered by professional societies.
I should mention there are many other certifications offered in our industry. Conferences, software companies and consulting & training organizations offer certifications. These, like the ones offered by professional societies, are not licenses (state license or charter). The various certifications simply mean the person met some level of experience, course work, or demonstrated body of work, or passed a test.
It doesn’t mean they are a professional.
If you are pursuing a certification, why? Please add a comment on what certification means to you and your career.
Speaking of Reliability: a podcast of good friends sitting down with you to talk about reliability engineering.
A new reliability engineering focused podcast now available on iTunes. Give it a listen and please leave a rating and review.
The intent is to publish two times a week. The conversational format is inspired by a chance lunch with two young engineers new to reliability engineering. The questions they asked and our conversation helped them get started and improve their programs. So, let us know your questions.
Dare to Know: Interviews with Quality and Reliability Thought Leaders with Host Tim Rodgers.
Meet the people that shape our profession. Authors, bloggers, consultants, scholars, business leaders. Learn about their insights and motivations.
Now available on iTunes. Please leave a rating and review. Help others find this new podcast.
Enjoy the shows and contact us with any ideas or thought leaders you want us to engage for a future show.
Say you have gathered some time-to-failure data. You have the breakdown dates for a piece of equipment. You review your car maintenance records and note the dates of repairs. You may have some data from field returns. You have a group of numbers and you need to make some sense of it.
Take the average
That seems like a great first step. Let’s just summarize the data in some fashion. So, let’s say I have the number of hours each fan motor ran before failure. I can tally up the hours, TT, and divide by the number of failures, r. This is the mean time to failure.
Or, if the data were on my car and I had the days between failures, I could also tally up the time, TT, and divide by the number of repairs, r. Same formula, and we call the result the mean time between failures.
And I have a number. Say it’s 34,860 hours MTBF. What does that mean (no pun intended), other than that on average my car operated for about 34k hours between failures? Sometimes more, sometimes less.
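The tally-and-divide step can be sketched in a few lines of Python. This is only a minimal illustration: the function name and the hours in the list are made up for the example and are not data from this post.

```python
def mean_time_between_failures(times_between_failures):
    """Return total time divided by the number of failures (TT / r)."""
    r = len(times_between_failures)
    if r == 0:
        raise ValueError("need at least one failure to compute an MTBF")
    total_time = sum(times_between_failures)  # TT
    return total_time / r


# Hypothetical hours between failures, invented for illustration only.
hours = [1200.0, 450.0, 3100.0, 800.0, 2450.0]
mtbf = mean_time_between_failures(hours)
print(f"MTBF = {mtbf:.0f} hours")
```

Notice that the single result says nothing about the spread from 450 to 3,100 hours, or about the order in which the values occurred.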
Any pattern? Is my car getting better with age, or worse?
A Histogram
In school we used to use histograms to display the data. Let’s try that. Here’s an example plot.
In this case the plot is of service and repair times (most likely similar to the times the garage has my car for an oil change and tune up). Right away we see more than just a number. The values range from about 50 up to about 350 with most of the data on the lower side. Just a couple of service times take over 250 minutes.
Using just an average doesn’t provide very much information compared to a histogram.
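If you do not have plotting software handy, a rough text histogram takes only a few lines. The sketch below is one way to do it; the service times are invented to loosely resemble the shape described above (they are not the plotted data), and the function name and 50-minute bin width are my own choices.

```python
from collections import Counter


def text_histogram(values, bin_width):
    """Count values per bin; bin k covers [k*bin_width, (k+1)*bin_width)."""
    counts = Counter(int(v // bin_width) for v in values)
    rows = []
    for k in range(min(counts), max(counts) + 1):
        low = k * bin_width
        rows.append((low, low + bin_width, counts.get(k, 0)))
    return rows


# Hypothetical service times in minutes, invented for illustration only.
service_minutes = [55, 62, 70, 81, 88, 95, 104, 110, 126, 140, 155, 190, 260, 340]
for low, high, n in text_histogram(service_minutes, 50):
    print(f"{low:>3}-{high:<3} {'#' * n}")
```

Even this crude picture shows the pile of short service times and the long tail that an average hides.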
Mean Cumulative Function Plot
Over time, count the number of failures. If the repair time is short compared to operating time, then this simple plot may reveal interesting patterns that a histogram cannot.
Here is a piece of equipment and each dot represents a call for service. The x-axis is time and the vertical axis is the count of service calls. While it’s not clear what happened shortly after about 3,000 hours, it may be worth learning more about what was going on then.
Even the first three or four points after 3,000 hours would have signaled that something different was happening here.
MCF plots show when something is getting worse (more frequent repairs) by curving upward, or getting better (longer spans between repairs) by flattening out. Again, a lot more information than with just a number.
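For a single piece of equipment, the plot is just the running count of service calls against time. Here is a minimal sketch, assuming one system and made-up call times (with several systems you would average the counts across the units still under observation, which this sketch does not attempt). The function names are hypothetical.

```python
def cumulative_failure_counts(event_times):
    """Return (time, cumulative count) pairs for one repairable system."""
    return [(t, i) for i, t in enumerate(sorted(event_times), start=1)]


def interval_rate(event_times, start, end):
    """Service calls per 1,000 hours within the window [start, end)."""
    n = sum(1 for t in event_times if start <= t < end)
    return 1000.0 * n / (end - start)


# Hypothetical service-call times in operating hours, invented for illustration.
calls = [400, 1100, 1900, 2700, 3100, 3250, 3400, 3500, 3650, 3800]
for t, count in cumulative_failure_counts(calls):
    print(f"{t:>5} h  {'*' * count}")

# Compare the call rate before and after 3,000 hours.
print(interval_rate(calls, 0, 3000), interval_rate(calls, 3000, 4000))
```

With these invented numbers the rate jumps from a little over one call per 1,000 hours to six, which is the upward curve an MCF plot would show.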
Plot the Fitted Distribution
Let’s say we really want to assume the data is from an exponential distribution. We can happily calculate the MTBF value and continue with the day. Or, we can plot the data and the fitted exponential distribution.
Let’s say we have about five failure times based on customer returns out of the 100 units placed into service. We can calculate the MTBF value including the time the remaining 95 units operated, which is about 172,572 hours MTBF. And, we can plot the data, too.
Here’s an example. What do you notice, even with a fuzzy plot image?
The line intersects the point where the F(t) is 0.63 or about the 63rd percentile of the distribution, and the time is at the point we calculated as the MTBF value (off to the right of the plot area).
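Here is a small sketch of both steps: the MTBF estimate that counts the running time of the units that have not failed, and the check that the exponential F(t) is about 0.63 at t equal to the MTBF. The failure times and the 9,000 hours of survivor time are hypothetical; they are not the field data behind the 172,572 hour figure above.

```python
import math


def exponential_mtbf(failure_times, censored_times):
    """Total time on test (failed plus still-running units) over failures."""
    r = len(failure_times)
    if r == 0:
        raise ValueError("need at least one failure")
    return (sum(failure_times) + sum(censored_times)) / r


def exponential_cdf(t, mtbf):
    """F(t) = 1 - exp(-t / MTBF) for the exponential distribution."""
    return 1.0 - math.exp(-t / mtbf)


# Hypothetical: 5 failures, 95 units still running at 9,000 hours.
failures = [3000.0, 5000.0, 6000.0, 7000.0, 8000.0]
survivors = [9000.0] * 95
mtbf = exponential_mtbf(failures, survivors)
print(f"MTBF = {mtbf:.0f} hours")
print(f"F(MTBF) = {exponential_cdf(mtbf, mtbf):.3f}")
```

Whatever the data, F(MTBF) = 1 - exp(-1), so under the exponential assumption roughly 63% of units are expected to have failed by the time the MTBF is reached.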
Like me, you may notice the line doesn’t seem to describe the data very well. It seems to have a different pattern than that described by the exponential distribution. Let’s add a fit of a Weibull distribution that also was fit to the data, including the units that have not failed.
The Weibull fit at least appears to represent the pattern of the failures. The slope is much steeper than the exponential fit. The Weibull tells a different story. A story that represents the story within the data.
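I do not show how the fit in the plot was made, so treat the following as one possible approach and not the method behind the figure. It is a minimal sketch of a two-parameter Weibull maximum likelihood fit that includes the units that have not failed (right-censored), solving the standard likelihood equation for the shape by bisection. The data, function name, and search bounds are all assumptions for illustration.

```python
import math


def weibull_mle(failure_times, censored_times, lo=0.01, hi=50.0, tol=1e-9):
    """Maximum likelihood Weibull (shape, scale) with right-censored units."""
    r = len(failure_times)
    if r < 2:
        raise ValueError("need at least two failures")
    all_times = list(failure_times) + list(censored_times)
    longest = max(all_times)
    s_all = [t / longest for t in all_times]  # rescale to avoid overflow
    mean_log_fail = sum(math.log(t / longest) for t in failure_times) / r

    def score(beta):
        # Zero at the maximum likelihood shape; increases with beta.
        num = sum(s ** beta * math.log(s) for s in s_all)
        den = sum(s ** beta for s in s_all)
        return num / den - 1.0 / beta - mean_log_fail

    while hi - lo > tol:  # bisection on the shape parameter
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = longest * (sum(s ** beta for s in s_all) / r) ** (1.0 / beta)
    return beta, eta


# Hypothetical: 5 failures, 95 units still running at 9,000 hours.
failures = [3000.0, 5000.0, 6000.0, 7000.0, 8000.0]
survivors = [9000.0] * 95
beta, eta = weibull_mle(failures, survivors)
print(f"shape = {beta:.2f}, scale = {eta:.0f} hours")
```

A shape value near 1 would say the exponential assumption is reasonable. A shape well above 1, as these invented numbers give, says the failure rate is climbing with age, which is the steeper slope on the plot.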
Again, just plot the data. Let the data show you what it has to say. What does your data say today?
A new podcast show featuring discussions with reliability experts about a wide range of reliability engineering topics is in development. We’ve recorded a few episodes and are editing them now.
I expect to launch the podcast in the next week or two.
The show is in large part based on the questions received over the past few years from you. You being reliability minded folks that would like to solve problems, improve reliability performance and advance your career. Continue reading Speaking of Reliability — a new podcast series→
When reading a report and there is a large complex formula, maybe a derivation, do you just skip over it? Does a phrase, 95% confidence of 98% reliability over 2 years, not help your understanding of the result?
Hypothesis testing, confidence intervals, point estimates, parameters, independent identically distributed, random sample, orthogonal array, …
I suspect the reliability of the products and services in your world plays an important role in your day to day existence. For me, maybe I just pay attention to reliability, yet today in particular I tried to notice when things were just working as expected.
In a recent reliability seminar I learned that the younger engineers did not have to take a statistics course, nor was it part of other courses, in their undergraduate engineering education. They didn’t dislike their stats class as so many before them have, they just didn’t have the pleasure.
Generally I ask how many ‘enjoyed’ their stats class. That generally gets a chuckle and opens an introduction to the statistics that we need to use for reliability engineering. I’ll have to change my line as more engineers just do not have any background with statistics.
I suspect this is good news for Las Vegas and other gambling based economies.
Statistics are hard
On average there are a few folks that get statistics. Not me. There are those that intuitively understand probability and statistics, and demonstrate a mastery of the theory and application. Not me.
I, like many others who successfully use statistical tools, think carefully, consider the options, check assumptions, recheck the approach, ask for help, and still check and recheck the work. Statistics is a tool and allows us to make better decisions. With practice you can get better at selecting the right tool and master the application of a range of tools.
Sure, it’s not easy, yet as many have found, mastering the use of statistics allows them to move forward faster.
Statistics are abused
Politicians, marketers, and others have a message to support and citing an interesting statistic helps. It doesn’t matter that the information is out of context or unclear. When someone claims 89% of those polled like brand x, what does that mean? Did they ask a random sample? Did they stop asking when they got the result they wanted? What was the poll selection process and what were the specific questions? What was the context?
The number may have been a simple count of positive responses vs all those questioned. The math results in a statistic, a percentage. It implies the sample represented the entire population. It may or may not, that is not clear.
We hear and read this type of statistic all too often. We discount even the well crafted and supported statistic. We associate distrust with statistics in general given the widespread poor or misleading use.
To me, that means we just need to be sure we are clear, honest and complete with our use of statistics. State the relevant information so others can fully understand. Statistics isn’t just the resulting percentage, it’s the context, too.
Statistics can be wrong
Even when we work to apply a statistical tool appropriately, there is a finite chance that the laws of random selection will provide a faulty result. If we test 10 items, there is a chance that our conclusion will show a 50% failure rate even though the actual population failure rate is less than 1%. Not likely to happen, yet it could.
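To put a number on "not likely", here is a quick binomial calculation. It is a sketch that assumes the 10 items are independent and that the population failure rate is exactly 1% (the upper end of the "less than 1%" above). The function names are my own.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k failures among n independent items."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least(k, n, p):
    """Probability of k or more failures among n independent items."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


# Assumed true failure rate of 1%, sample of 10 items.
p_half_or_more = prob_at_least(5, 10, 0.01)
p_one_or_more = prob_at_least(1, 10, 0.01)
print(f"P(5 or more failures of 10) = {p_half_or_more:.2e}")
print(f"P(1 or more failures of 10) = {p_one_or_more:.3f}")
```

Five or more failures has a chance on the order of a few in a hundred million. Seeing at least one failure, which would read as a 10% or higher failure rate in the sample, happens close to one time in ten.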
We often do not have the luxury of the law of large numbers with our observations.
So, given the reality that we need to make a decision and that using a sample has risk, does that justify not using the sample’s results? No. The alternative of using no data doesn’t seem appealing to me, nor should it to you.
So, what can we do? We:
Do the best we can with the data we have.
Do exercise due care to minimize and quantify measurement error.
Strive to select samples randomly.
Apply the best analysis available, and,
Extract as much information from the experiment and analysis as possible.
As with woodworking, where there are many ways to cut a board, with statistics there are many tools. Learn the ones that help you characterize and understand the data you have before you. Master the tools one at a time and use them safely and with confidence.
How Many Assumptions Are Too Many Concerning Reliability?
When I buy a product, say a laptop, I am making an educated guess that Apple has done the due diligence to create a laptop that will work as long as I expect it to last. The trouble is I don’t know how long I want it to last thus creating some uncertainty for the folks at Apple. How long should a product last to meet customer expectations when customers are not sure themselves? Continue reading How Many Assumptions Are Too Many Concerning Reliability?→
In English there is a lot of confusion on what reliability, availability and other ‘ilities mean in a technical way. Reliability as used in advertising and common discussions often means dependable or trustworthy. If talking about a product or system it may mean it will work as expected. Continue reading Reliability and Availability→
Do you check assumptions? Not all assumptions are equal as some may lead you to a costly decision.
We regularly make assumptions about the uniformity of material, the consistency of part to part performance, and many other engineering elements of a design or process. We have to simplify the problems we face in order to work out solutions and make decisions. Continue reading How to Justify Using the Exponential Distribution→