Samples for Testing
Normally, we life test a sample of products in order to make sure the products will last as long as expected. We assume that the sample we select will represent the total population of products that we eventually ship. It is not a perfect system, and there is some risk involved.
For example, the risk the sample doesn’t represent the population, or that the process for making the product changes and alters the final reliability. Or, the testing doesn’t reflect the use conditions and over or under estimates the resulting product reliability. Or, the test isn’t designed well.
A bad way to start planning a test is to begin with a MTBF and wish to demonstrate the product meets or exceeds that MTBF value.
The Sample Size for MTBF Demonstration
Lets say we have a product and we want to demonstrate it has a 5 hour MTBF. And, let’s say we have five samples available for testing. If we test the products for 25 hours, that is a total of 5 x 25 = 125 hours of testing. We count the number of failures (fix and return the product to testing) and if we get 25 or fewer failures, we can calculate the MTBF estimate based on the data, as 125 hours / 25 failures = 5 hours MTBF. Pretty simple.
A common way to estimate the sample size needed to show a particular MTBF with a particular sampling confidence is done with the Χ2 distribution and the following formula.
θ is our desired MTBF to demonstrate. X2 is the chi-squared distribution with degrees of freedom determined by α which is related to the Type I confidence, and r, which is the critical number of failures. T is the total hours of operation across all units in the test. Working with this formula is left as an exercise for those interested.
We can design the test based on the constraints we have, such as sample size or test duration. A little algebra permits solving with either constraint.
Constant Failure Rate Assumption Issue
The assumption made when using this approach is that the failure rate is constant. The chance to fail is the same at hour one and hour 100. This permits the accumulation of operating hours across various machines to tally the total hours.
If the test is designed to tally 200 hours, one unit could run 199 hours and the other 1 resulting in a T = 200 hours. That’s ok. Or we could have 200 units each operate one hour. Or, one unit run the entire time. As long as some number of units run for T time in total, this approach is fine.
It is this ‘feature’ that first caused me to look at this approach a bit more closely. While it if fine when the assumption is valid, it will produce meaningless results if the constant failure rate is not valid. While we all feel more confident when a reliability figure is supported by test data, it is generally not comforting when one realizes what it really means. Nothing.
It is common to see MTBF values reported without a corresponding duration over which it is valid. That confused look you see when asking for a duration along with an MTBF values is a topic for another post. Yet, the idea that the MTBF value (the constant failure rate) is valid indefinitely is laughable.
It is possible to support with test data nearly any MTBF value. For example, component suppliers often run short duration testing on many units. Over time that accumulated test time, T, becomes quite large and the number of failures is often zero or very, very low. Thus, they can support a very high MTBF value – which is meaningless beyond the operating time units experience during testing.
How do you determine sample size for life testing? If it starts with the notion of demonstrating an MTBF value, you may need to reevaluate your approach.