One of the Twitter notes I sent out a few weeks ago read, in part, “Celebrate failures”. One reader replied that it was a wonderful approach that she had not thought of before. Failure will occur, and when it does, it is our chance to learn.
And we need to learn. As reliability professionals, we continue to learn throughout our entire careers. New materials fail in novel manners. New assemblies fail in an assortment of ways. New designs fail due to unknown sources of variation. We will see failures. So rather than simply focusing on the next try and hoping to find success, let’s learn from each failure as we move toward success.
Do you work in a fire department?
It’s an expression that I use, as have others, to describe an organization that quickly responds to failures. A customer calls with a problem, and the team jumps into action to solve the issue. A phone call on Friday afternoon brings a chance to work all weekend. The line is down, all hands on deck.
The better fire departments actually do a good job responding and solving problems. They may even work to prevent other similar problems from occurring. They are not very good at determining where the next failure will occur, so they remain diligent and ready to respond.
Heroes are born in a fire department organization. The one who saves the big customer account gets a prime parking spot. The engineer who pulls an all-nighter to get the line running gets noticed for promotion. The message is: get good at solving crises. The problems you solve as part of your day job don’t really count. The spoils go to the solution found at the last minute, under duress, and often after hours.
What have you done of value lately?
In some fire-department-like organizations, unless you personally abated a major crisis, you’re not noticed. Let’s say you do your work well. You craft durable products, work with teams to create reliable solutions, and meet your cost, time to market, performance, and quality targets.
Not one bit of recognition or notice. You did your job. Did it well, even, yet that is expected, isn’t it?
Let’s say the same brilliant folks who stamp out raging rates of failure find the time to actually design a product that doesn’t fail unexpectedly. Let’s say your organization does a full root cause analysis, including where the set of life cycle systems failed to avert the chain of events leading to the field escalation.
Once we understand that failures will occur, we might take steps to anticipate and resolve those failures before a customer has the luxury of a failure experience. During the design and development stages, we start to balance the final design decisions based on an acceptable risk of failure and a minimum risk of unknown failures. We begin to build certainty in knowing what will fail, when it will fail, and how often… and work to create a low enough failure rate.
The idea is to predict, anticipate, forecast, estimate, and celebrate failures. Running a test that has no failures is a lost opportunity to learn something that allows us to improve the design. Finding a design flaw early permits a routine amount of work to fix it. (If all-night and all-weekend pushes to meet shipping deadlines are your normal, you need to see what’s possible.)
Everything will fail. It’s a matter of when and how. We regularly think about limitations, constraints, and failure modes. Now I’m asking you to consider the failure mechanisms. What chemistry or physics or elemental shift in the design would lead to failure?
It’s not someone else’s job to add reliability to a system. It’s the job of anyone making decisions about part or vendor selection, anyone sizing a bolt, anyone thinking through or performing maintenance. It pretty much is anyone who touches the product in some way. It’s your job.
What could fail? Have you discovered new failure mechanisms today? Let’s reward those that discover the first failure, the most failures, or the biggest cost avoidance failure. Let’s give the prime parking spot to someone that finds and fixes a critical flaw before it’s an emergency.
Physics of failure, HALT, and risk analysis are just a few of the tools available to predict what will fail. Coupled with someone willing to consider and prove what will fail, and when (we call these folks reliability engineers), the team can shift out of firefighting and into fire prevention. The team can then celebrate the lack of failures seen by customers and rarely be surprised by what does fail.
Give it a try: step off alert status and think about what could fail. Sort out how you can find out before starting the line or shipping. Put in the hours now to carefully find what will fail, so you and your team can work to avert those many pending Friday afternoon phone calls from yet another irate customer.