<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>No MTBF</title>
	<atom:link href="http://nomtbf.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://nomtbf.com</link>
	<description>A site devoted to the eradication of the misuse of MTBF</description>
	<lastBuildDate>Wed, 16 May 2012 05:27:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>Arrhenius or Erroneous</title>
		<link>http://nomtbf.com/2012/05/arrhenius-or-erroneous/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=arrhenius-or-erroneous</link>
		<comments>http://nomtbf.com/2012/05/arrhenius-or-erroneous/#comments</comments>
		<pubDate>Tue, 15 May 2012 04:21:44 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=228</guid>
		<description><![CDATA[The following is a recent discussion from the sister LinkedIn NoMTBF group. It was, and may continue to be, a great discussion. Please take a look and comment on where you stand. Do you use some form of the Arrhenius reaction &#8230; <a href="http://nomtbf.com/2012/05/arrhenius-or-erroneous/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<table cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td valign="middle">The following is a recent discussion from the sister LinkedIn NoMTBF group. It was, and may continue to be, a great discussion. Please take a look and comment on where you stand. Do you use some form of the Arrhenius reaction rate equation in your reliability engineering work?</p>
<p>Join the discussion here with a comment, or in the LinkedIn group conversation.</p>
<p>Fred</p>
<p><span id="more-228"></span></p>
<hr />
<p>&nbsp;</p>
<p>Enrico Donà has started a discussion: <a href="http://www.linkedin.com/e/-mpj7v-h0aokz0w-44/vaq/103727401/1857182/-1/view_disc/?hs=false&amp;tok=3k7YO5YKgVZl81">Where does &#8220;0.7eV&#8221; come from?</a></p>
<p>&#8220;Most manufacturers are still using an Arrhenius law (shouldn&#8217;t we rather call it the &#8220;Erroneous law&#8221;? <img src='http://nomtbf.com/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />) with an activation energy of 0.7 eV to extrapolate High Temperature Operating Life test data to use conditions for every kind of electronic component (from complex ICs to simple passives). It is often claimed that the 0.7 eV figure is based on &#8220;historical data&#8221;. I have never actually seen any paper or publication where this activation energy was really measured. The use of a constant hazard rate (lambda) was originally justified by the fact that electronic boards have many different components with different failure mechanisms. The argument was that different Weibull distributions with different activation energies yield, on average, a roughly constant hazard rate for which an apparent activation energy can be defined. Many manufacturers now seem convinced that it must work the other way round too: since a constant hazard rate with Ea = 0.7 eV was once claimed for electronic boards, every single component must follow the same law! It is just amazing how errors propagate!&#8221;</p>
<p>&#8211;</td>
<hr /></tr>
</tbody>
</table>
<p>Enrico, very good observation. I have seen no evidence of any valid basis for the 0.7 eV figure, and I have never heard which physical mechanism the activation energy refers to. Is it oxide breakdown, electromigration, diffusion? It makes no sense when applied to solder crack propagation or component package delamination, as those mechanisms are driven by thermal cycles and vibration. So much of the reliability prediction of electronics is smoke and mirrors, while the real causes of unreliability are mistakes or overlooked design margin errors, errors in manufacturing, or abuse by customers. These causes are not predictable, nor do they follow a pattern that can be modeled. A much better use of engineering resources for making a reliable electronics system is to apply stress to rapidly discover these overlooked margins and manufacturing errors before products are produced in mass quantities or shipped to the customer. The vast majority of electronics have more life than we need, so we really just need to remove the unreliable elements (from the causes previously mentioned) to have a robust system that will exceed its technologically useful life.</p>
<p>Posted by Kirk Gray</p>
<hr />
<p>I would recommend the following-</p>
<p>1. Check Dimitri Kececioglu&#8217;s &#8220;Burn-In Testing: Its Quantification and Optimization&#8221;. I referred to the text about a year ago for activation energy figures for different components.</p>
<p>2. ASM International&#8217;s EDFAS references. I read certain references on this subject on their website some time back.</p>
<p>Posted by Vinod Pal Singh</p>
<p>&nbsp;</p>
<hr />
<p>Hi Fred,<br />
&#8220;10 °C higher &#8211;&gt; 2 times faster&#8221; is the rule of thumb for chemical reaction rates. Why it has been applied indiscriminately to every kind of wear-out process remains a mystery&#8230;</p>
<p>Posted by Enrico Donà</p>
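<p>The rule of thumb above can be turned around to ask which activation energy it implies. The following is a minimal Python sketch, assuming a pure Arrhenius rate dependence; the function name and the temperatures are illustrative and do not come from the discussion. It solves for the Ea at which a 10 °C rise exactly doubles the rate.</p>

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def implied_activation_energy(temp_c, step_c=10.0, rate_ratio=2.0):
    """Return the Ea (eV) for which an Arrhenius rate grows by rate_ratio
    when the temperature rises from temp_c to temp_c + step_c."""
    t_low = temp_c + 273.15   # convert to kelvin
    t_high = t_low + step_c
    return BOLTZMANN_EV_PER_K * math.log(rate_ratio) / (1.0 / t_low - 1.0 / t_high)


# Illustrative starting temperatures (assumed, not from the discussion).
for temp_c in (25, 55, 85, 125):
    ea = implied_activation_energy(temp_c)
    print(f"{temp_c} C: doubling per 10 C implies Ea = {ea:.2f} eV")
```

<p>Under these assumptions the implied Ea is about 0.55 eV near 25 °C and close to 1 eV near 125 °C, so a fixed &#8220;2 times per 10 °C&#8221; rule and a fixed activation energy cannot both hold over a wide temperature range.</p>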
<hr />
<p>To obtain the activation energy, the Arrhenius model is simply fit to time to failure versus temperature, and the activation energy is solved for. Specific failure mechanisms are typically looked for, except in the case of basic material evaluations, where failure is defined as a 50% loss of tensile strength. The value of 0.7 eV is a common rule of thumb; remember that the lower this number, the less time compression one gets for an increase in stress. For polymer materials, suppliers will often perform aging tests and publish activation energies. The 0.7 eV value is reasonable for highly glass-filled nylon (45%), but for unfilled nylon the activation energy is 1.0 eV, and for low to medium levels of glass fill it is 0.8 to 0.9 eV.<br />
For electronics there are many studies that have looked at various failure mechanisms, most of them chemical in nature, and documented the activation energy; as expected, there are ranges of values. High temperature does produce grain growth in solder, which reduces strength and can reduce thermal cycle life. Research is ongoing for lead-free solders compared to tin-lead. See Joe Smetana&#8217;s published work in this area. Some published values (activation energy in eV):<br />
Silicon semiconductor devices<br />
Silicon Oxide 1-1.05<br />
Electromigration 0.5-1.2<br />
Corrosion 0.3-0.6, 0.45 typ.<br />
Intermetallic Growth Al/Au 1-1.05<br />
FAMOS Transistors<br />
Charge Loss 0.8<br />
Contamination 1.4<br />
Oxide Effects 0.3<br />
IC MOSFETs, Threshold Voltage Shift 1.2<br />
Plastic Encapsulated Transistors 0.5<br />
MOS Devices 1.1-1.3, Weak populations 0.3-0.9<br />
Flexible Printed Circuits below 75 °C 0.4<br />
Flexible Printed Circuits above 75 °C 1.4<br />
Optoelectronic Devices 0.4<br />
Photo Transistors 1.3<br />
Carbon Resistors 0.6<br />
LEDs 0.8<br />
Linear Op Amps 1.6-1.8, Weak populations 0.7-1.1<br />
In general, damage models provide some technical basis for accelerated tests but never decimal-point accuracy, even though we get precise answers. It is quite easy for people to get so comfortable with a common number that it becomes a sacred cow, and nobody knows where the information came from or how to apply it properly.</p>
<p>Posted by Dustin Aldridge</p>
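<p>The fitting procedure described above can be sketched in a few lines of Python. This is only an illustration: the three temperatures and median lives are hypothetical, the function name is made up for the example, and an ordinary least-squares line through ln(life) versus 1/(kT) stands in for the fuller life-data analysis a real study would use.</p>

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def fit_activation_energy(temps_c, median_lives_h):
    """Least-squares fit of ln(life) = ln(A) + Ea / (k*T).

    Returns (Ea in eV, prefactor A in hours)."""
    x = [1.0 / (BOLTZMANN_EV_PER_K * (t + 273.15)) for t in temps_c]
    y = [math.log(life) for life in median_lives_h]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    ea = sxy / sxx  # the slope of the line is the activation energy
    prefactor = math.exp(y_mean - ea * x_mean)
    return ea, prefactor


# Hypothetical median times to failure (hours) at three oven temperatures.
temps_c = [105.0, 125.0, 150.0]
median_lives_h = [9000.0, 3200.0, 950.0]

ea, prefactor = fit_activation_energy(temps_c, median_lives_h)
print(f"fitted Ea = {ea:.2f} eV")
```

<p>With only a handful of scattered points, the fitted slope moves noticeably when any one point changes, which is one way to see why published values come as ranges.</p>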
<hr />
<p>The 0.7 eV figure originates from the characterization of electromigration failures in metal interconnects, and it depends on the type of metal interconnect. These failures are measured at wafer level reliability (WLR), and the value carries over to packaged products and propagates all the way to the board level.</p>
<p>Posted by John Nkwuo PE</p>
<hr />
<p>There are several papers published on this; check the IRPS symposium and the JEDEC standards for electromigration.</p>
<p>Posted by John Nkwuo PE</p>
<hr />
<p>Dustin and John, so how would you use these published eV values in a PWBA with hundreds or thousands of components? Have you observed component wear-out as a cause of system unreliability in the field? In the vast majority of systems, most components, when properly selected for the application, have more life than needed before the system is technologically obsolete. Intrinsic wear-out mechanisms in components are not typically why systems fail in 0-7 years. How does the sometimes wide range of eV values (&#8220;MOS Devices 1.1-1.3, Weak populations 0.3-0.9&#8221;?) help make a reliable electronic system? Why do so many companies and individuals still believe that these numbers help build a reliable electronics system? What do the root causes of verified field failures tell you about why your electronics systems fail?</p>
<p>Posted by Kirk Gray</p>
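<p>To put numbers on how much the choice of eV value matters, the sketch below computes the Arrhenius acceleration factor for several activation energies taken from the ranges quoted earlier. The 125 °C stress and 55 °C use temperatures are assumptions chosen only for illustration, as is the function name.</p>

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor between a use and a stress temperature."""
    t_use = use_temp_c + 273.15       # convert to kelvin
    t_stress = stress_temp_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_stress))


# Assumed conditions: 125 C stress test extrapolated to 55 C use.
for ea_ev in (0.3, 0.7, 0.9, 1.3):
    af = arrhenius_af(ea_ev, use_temp_c=55.0, stress_temp_c=125.0)
    print(f"Ea = {ea_ev:.1f} eV: acceleration factor = {af:,.0f}")
```

<p>With these assumed temperatures the factor runs from roughly 6 at 0.3 eV to a few thousand at 1.3 eV, so the same test hours translate into very different claimed field lives depending on which value is picked.</p>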
<hr />
<p>I must say that I reviewed this 0.7 eV figure myself against an extensive DOE that we did in the past to fine-tune these numbers. Our number was slightly higher but fairly close to that. Yet I do agree that activation energy numbers need to be used carefully.</p>
<p>Posted by Meny Nahon</p>
<hr />
<p>There is still a lot of ignorance in the world.<br />
I agree electronic components are not a major issue. The eV value is determined by simply fitting a model to a scattered set of measurements to get an average value. One can define a strategy to use them all, but it is just a waste of effort.<br />
These can be used for specific issues noted in the field. For a test program, one might choose the conservative grand average. If an aging issue exists, it is often due to a weak subpopulation, thus the need to understand whether there is a difference (which there can be), but it can just be a scaling factor of life with a consistent activation energy.<br />
For a test program one considers where the most risk is, and you are correct that it is not in the components nor in aging damage. It appears that some blindly believe that Arrhenius is all you need to consider. Wrong. Aging can reduce the strength of solder joints and thus reduce solder joint thermal cycle life. This is not consistently true for lead-free solder, where it is a mixed bag still being researched. Hence, to guide the aging target and exercise due care, a conservative average activation energy is often chosen. Aging does change the strength of engineering polymers, so for these needs this is a valid use.<br />
Activation energies are also valuable after root cause failure analysis. If a particular failure mode appears that could be caused by an aging mechanism, analyses are possible to determine the population risk for that failure, considering what we know about the failed part, where it is used, how it was used, etc., and analyzing in light of usage and environmental variation.<br />
These values do not help make reliable products, but they can help prioritize where design enhancement focus should occur. A mechanism with a high activation energy is more sensitive from a stress perspective. There is false precision with any of these models. The weakest things fail first, and if everything fails for a non-aging reason one will never observe an aging failure in the field. Hence, if the thermal cycling mode occurs much earlier, it dominates, and one concentrates on minimizing CTE mismatch, strain relief, or other design strategies. Experience often gives a hint as to which stressors will produce failures within the design life. Poor quality can be detected using burn-in. Here we just want to be sure we do not remove too much life from the product, and for that an Arrhenius analysis can be useful.<br />
For the most part thermal cycling is consistently the most effective stressor for electronics. Focusing too much on Arrhenius is incorrect, as the aging mechanisms are not the primary cause of field failure. However, to be true to a simulation need, it is reasonable to provide a requirement with some merit for the aging portion in addition to the thermal cycling requirement, if this is a major stressor in the field environment. The thermal cycling requirement is also subject to similar hand waving with various models, exponents, material constants, dependency upon dwell times, rates, catalytic effects, etc. In real life one often has a mix of failure modes that occur randomly, resulting in the overall observation of an exponential failure distribution. One lumps them all together, and possibly the Arrhenius model fits; even though it is not true to the physics, it may still be useful for an engineering need. Attach a named formula to it, communicate confidently, and credibility goes way up with management.<br />
Consideration of aging, thermal fatigue, structural fatigue, corrosion, etc. damage mechanisms is prudent for any reliability engineer. With all these competing modes we want to focus on the ones with the highest risk. The models help us decide where to focus our efforts, and sometimes comprehend the physics, but they really only tell us within an order of magnitude when the product might fail. Analysis simply bounds our uncertainty to a perceived tolerable level.</p>
<p>Posted by Dustin Aldridge</p>
<hr />
<p>Kirk, these are tools. In place of experience, engineers (or shall I say analysts) take a cut at things using &#8220;formulas&#8221;; with experience they understand the limits. These practices are fostered when they have limited budgets or time and are told to do one or two things before we ship &#8230; In the face of management review, use of these formulas leads to confidence, as Dustin puts it &#8230; The challenges come in smaller operations with limited time and dollars.</p>
<p>Posted by Eric Drobny</p>
<hr />
<p>Eric, I understand that they are considered tools, but if they are based on invalid assumptions, the answers they provide are invalid and misleading, and they can be costly through invalid solutions. Dustin references Arrhenius many times, yet it is &#8220;erroneous&#8221; for most failure mechanisms, so using Arrhenius to calculate &#8220;burn-in&#8221; life removal is invalid.<br />
Maybe I am wrong, but I think most manufacturers (at least their managers and leadership, if not their stockholders) in competitive markets want to produce the most reliable products at the lowest cost. To do this most effectively, the priority SHOULD be to discover weaknesses as fast as possible during development and to guard against latent defects and process excursions with the most efficient stress screens in manufacturing. This means using STIMULATION with stress, not simulation (which may be used at the end of development for qualification), to find those weaknesses and eliminate their causes early. Rarely do more than one or two elements need to change to turn a weak product into a strong one. Sometimes just software changes in a digital system can add significant temperature margin.<br />
As long as &#8220;reliability engineering&#8221; is focused on the back end of the bathtub curve, the wear-out phase, it is not dealing with the reality of unreliability in electronics. Just look at the causes of your own company&#8217;s field returns. What does that tell you?</p>
<p>Posted by Kirk Gray</p>
<hr />
<p>I believe in balance. The question related to Arrhenius, hence the response; it was not a diatribe on why it is worthless, nor a passionate call for stimulation. The number does come from something; how it is applied is a choice. Eric is correct in pointing out the tool aspect. Stimulation is also a tool that does not always produce the desired result when used inappropriately, or on designs that are highly robust to begin with. People do the best they can within their knowledge and capability. Arrhenius can be erroneous, but the generalization is patently incorrect. Stimulation is not a panacea either; it is a best practice that in many cases has proven to be more efficient, but there are cases where Arrhenius is appropriate and useful as well. Who makes the choice, and on what basis?</p>
<p>Posted by Dustin Aldridge</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/05/arrhenius-or-erroneous/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why HALT is not HALT</title>
		<link>http://nomtbf.com/2012/05/why-halt-is-not-halt/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=why-halt-is-not-halt</link>
		<comments>http://nomtbf.com/2012/05/why-halt-is-not-halt/#comments</comments>
		<pubDate>Fri, 11 May 2012 23:32:42 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[MTBF]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=225</guid>
		<description><![CDATA[An excellent short white paper by Craig Hillman that is worth reading. It underscores why I claim HALT is the second worst four-letter acronym in our profession. See the paper at http://www.dfrsolutions.com/uploads/white-papers/Why_HALT_Is_Not_HALT.pdf]]></description>
			<content:encoded><![CDATA[<p>An excellent short white paper by Craig Hillman that is worth reading. It underscores why I claim HALT is the second worst four-letter acronym in our profession. See the paper at http://www.dfrsolutions.com/uploads/white-papers/Why_HALT_Is_Not_HALT.pdf</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/05/why-halt-is-not-halt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Acceleration factors</title>
		<link>http://nomtbf.com/2012/05/acceleration-factors/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=acceleration-factors</link>
		<comments>http://nomtbf.com/2012/05/acceleration-factors/#comments</comments>
		<pubDate>Sun, 06 May 2012 22:48:43 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[Testing]]></category>
		<category><![CDATA[acceleration]]></category>
		<category><![CDATA[Arrhenius]]></category>
		<category><![CDATA[factor]]></category>
		<category><![CDATA[reliability]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=219</guid>
		<description><![CDATA[Temperature acceleration factor for ALT planning (question posted to LinkedIn Society of Reliability Engineers group, 5/7/12) Hello, can anyone advise me how to calculate temperature acceleration factor for a complex system including cards, RF elements, cables, motors and moving parts? &#8230; <a href="http://nomtbf.com/2012/05/acceleration-factors/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div id="attachment_221" class="wp-caption alignleft" style="width: 490px"><a href="http://nomtbf.com/wp-content/uploads/2012/05/2007-09-14-RockClimbing-1.jpg"><img class="size-full wp-image-221" title="2007-09-14 RockClimbing a system of gear" src="http://nomtbf.com/wp-content/uploads/2012/05/2007-09-14-RockClimbing-1.jpg" alt="gear used for a rock climbing anchor" width="480" height="640" /></a><p class="wp-caption-text">Gear for anchor during rock climbing</p></div>
<h3>Temperature acceleration factor for ALT planning (question posted to LinkedIn Society of Reliability Engineers group, 5/7/12)</h3>
<p>Hello, can anyone advise me how to calculate temperature acceleration factor for a complex system including cards, RF elements, cables, motors and moving parts? Is the Arrhenius model valid for such systems, or are there more precise models? Thank you!</p>
<p>&#8230;and my response&#8230;</p>
<p>The acceleration factor equations are commonly tied very closely to a specific failure mechanism. For example, SAC solder joint fatigue uses the modified Norris-Landzberg model. And for metal migration/corrosion within plastic encapsulated packages, Peck&#8217;s equation is useful.</p>
<p>Each failure mechanism reacts to stresses differently, some more so than others, which makes testing at a system level with accelerated stresses more difficult.</p>
<p>So be careful: an acceleration factor is only meaningful if you know which failure mechanisms are most likely to occur during use and you accelerate them appropriately.</p>
<p>If you know that temperature is the stress of interest for the dominant failure mechanisms, again be careful. As you will quickly find, the activation energy is important and is generally associated with a specific failure mechanism.</p>
<p>And just temperature may not be sufficient &#8211; for moving parts, load and frequency of motion may be more useful for acceleration. For connectors and cables, maybe thermal cycling is more important.</p>
<p>If I don&#8217;t run a motor while it sits in a hot and humid environment, it may fail due to corrosion. If I simply run it with an unbalanced load, it may wear out the bearing quickly. Both lead to failure, yet they have completely different failure mechanisms and acceleration factors, and one test or one AF is just not sufficient.</p>
<p>In short, there are more precise models &#8211; depending on the failure mechanisms involved.</p>
<p>Hope that helps.</p>
<p>Fred</p>
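<p>As one illustration of a mechanism-specific model, Peck&#8217;s relation for humidity-driven failures can be sketched in Python as below. The activation energy, the humidity exponent, and the test and use conditions are placeholder values assumed for the example, and the function name is made up; real values have to come from data on the specific failure mechanism.</p>

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def peck_af(use_temp_c, use_rh_pct, test_temp_c, test_rh_pct, ea_ev, rh_exponent):
    """Acceleration factor for a temperature-humidity test under Peck's relation,
    where life is proportional to RH**(-n) * exp(Ea / (k*T))."""
    t_use = use_temp_c + 273.15    # convert to kelvin
    t_test = test_temp_c + 273.15
    humidity_term = (test_rh_pct / use_rh_pct) ** rh_exponent
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_test))
    return humidity_term * thermal_term


# Illustrative inputs only: an 85 C / 85 %RH test versus 40 C / 50 %RH use,
# with assumed Ea = 0.8 eV and humidity exponent n = 2.7.
af_total = peck_af(40.0, 50.0, 85.0, 85.0, ea_ev=0.8, rh_exponent=2.7)
af_temp_only = peck_af(40.0, 50.0, 85.0, 50.0, ea_ev=0.8, rh_exponent=2.7)
print(f"temperature and humidity AF: {af_total:.0f}")
print(f"temperature alone AF:        {af_temp_only:.0f}")
```

<p>The gap between the two printed factors is the humidity contribution, which a temperature-only Arrhenius calculation would miss entirely for this mechanism.</p>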
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/05/acceleration-factors/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>System or component testing</title>
		<link>http://nomtbf.com/2012/05/system-or-component-testing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=system-or-component-testing</link>
		<comments>http://nomtbf.com/2012/05/system-or-component-testing/#comments</comments>
		<pubDate>Tue, 01 May 2012 16:43:05 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[Testing]]></category>
		<category><![CDATA[ALT]]></category>
		<category><![CDATA[component]]></category>
		<category><![CDATA[FMEA]]></category>
		<category><![CDATA[HALT]]></category>
		<category><![CDATA[life testing]]></category>
		<category><![CDATA[system]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=215</guid>
		<description><![CDATA[Fred, I was asked this question and wanted to know your thoughts on it. R and D asked me what the criteria are for deciding whether to test at a component level or at a system level, &#8230; <a href="http://nomtbf.com/2012/05/system-or-component-testing/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Fred, I was asked this question and wanted to know your thoughts on it. R and D asked me what the criteria are for deciding whether to test at a component level or at a system level. My answer was that it should depend on the reliability and confidence level of the component.<br />
your thoughts?<br />
thanks<br />
sd</p>
<p>&nbsp;</p>
<p>Hi SD,</p>
<p>Good question &#8211; and it&#8217;s not only a matter of reliability and confidence. While those are important to have in mind prior to designing a life test, they are not the only considerations.</p>
<p>Often the decision to test at the component level is because it has a unique or new failure mechanism which is possible to evaluate and characterize with life testing directly on the component or test structure with the component. It&#8217;s often less expensive, easier to accelerate and to accomplish.</p>
<p>Testing at the system level is more expensive, and it is more difficult to focus the testing on a specific component or component failure mechanism. Yet it is often the only way to evaluate interactions between elements of a product, or to evaluate the product while it is operating.</p>
<p>In the rare case when a single component and its failure mechanism dominate the system&#8217;s failures, testing the component at the system level makes sense.</p>
<p>Ok, backing up a little to your question.</p>
<p>Test at the component level when you want to learn about the life of the component and the component&#8217;s specific failure mechanisms.</p>
<p>Test at the system level when you are exploring system life during expected use conditions (which may be possible to accelerate, such as the effect of daily temperature changes on product life), or when the failure mechanisms are related to component interactions and operation.</p>
<p>If you do not have a clear failure mechanism as the target for the testing, then it may be difficult to design the appropriate test. Using FMEA, HALT or some other discovery method is useful to uncover the product&#8217;s failure mechanisms. Then move into life testing at the appropriate level with the test focused on specific failure mechanisms.</p>
<p>Hope that helps, please do let me know if you have any questions.</p>
<p>cheers,</p>
<p>Fred</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/05/system-or-component-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parts count variation</title>
		<link>http://nomtbf.com/2012/02/parts-count-variation/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=parts-count-variation</link>
		<comments>http://nomtbf.com/2012/02/parts-count-variation/#comments</comments>
		<pubDate>Sun, 26 Feb 2012 18:01:38 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[Predictions]]></category>
		<category><![CDATA[error]]></category>
		<category><![CDATA[parts count]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[product reliability]]></category>
		<category><![CDATA[reliability]]></category>
		<category><![CDATA[variation]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=210</guid>
		<description><![CDATA[Just a short post to point to a newly added paper to the reference section. A few years ago I recalled seeing a paper that studied the difference to expect between various parts count methods and actual results. Jeff and &#8230; <a href="http://nomtbf.com/2012/02/parts-count-variation/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Just a short post to point to a newly added paper to the reference section. A few years ago I recalled seeing a paper that studied the difference to expect between various parts count methods and actual results.</p>
<p>Jeff and colleagues did this work some time ago, and in most cases the underlying parts count methods haven&#8217;t changed too much, so I suspect the results are still very relevant.</p>
<p>The bottom line &#8211; expect as much as a -100% to +500% difference between the prediction and the actual result.</p>
<p>To see a draft of the paper, visit the References section of the site, click <a href="http://nomtbf.com/wp-content/uploads/2011/12/Draft-Comparison-of-Electronic-Reliability-Prediction-Methodologies.pdf">Draft Comparison of Electronic Reliability Prediction Methodologies</a>, or via the slideshare window below.</p>
<object style="margin:0px" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=id=11756810&amp;doc=draftcomparisonofelectronicreliabilitypredictionmethodologies-120226115557-phpapp01&amp;type=d" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="wmode" value="transparent" /><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=id=11756810&amp;doc=draftcomparisonofelectronicreliabilitypredictionmethodologies-120226115557-phpapp01&amp;type=d" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355" wmode="transparent"></embed></object>
<p>Enjoy. And if you know of other studies of this nature, please let me know and we&#8217;ll at least post a link to the work.</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/02/parts-count-variation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>No Evidence of Correlation: Field failures and Traditional Reliability Engineering</title>
		<link>http://nomtbf.com/2012/02/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering</link>
		<comments>http://nomtbf.com/2012/02/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/#comments</comments>
		<pubDate>Fri, 10 Feb 2012 18:20:45 +0000</pubDate>
		<dc:creator>Kirk Gray</dc:creator>
				<category><![CDATA[Predictions]]></category>
		<category><![CDATA[field failure correlations]]></category>
		<category><![CDATA[HALT]]></category>
		<category><![CDATA[HASS]]></category>
		<category><![CDATA[MIL HNBK 217]]></category>
		<category><![CDATA[Reliability predictions]]></category>
		<category><![CDATA[root cause]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=198</guid>
		<description><![CDATA[Historically Reliability Engineering of Electronics has been dominated by the belief that 1) The life or percentage of complex hardware failures that occur over time can be estimated, predicted, or modeled and 2) Reliability of electronic systems can be calculated &#8230; <a href="http://nomtbf.com/2012/02/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Historically Reliability Engineering of Electronics has been dominated by the belief that 1) The life or percentage of complex hardware failures that occur over time can be estimated, predicted, or modeled and 2) Reliability of electronic systems can be calculated or estimated through statistical and probabilistic methods to improve hardware reliability. The amazing thing is that during the many decades in which reliability engineers have been taught this and have believed it to be true, there has been little if any empirical field data from the vast majority of verified failures that shows any correlation with calculated predictions of failure rates.</p>
<p>The probabilistic statistical predictions based on broad assumptions about the underlying physical causes began with the first electronics reliability prediction guide in November 1956, with the publication of the RCA release TR-1100, &#8220;Reliability Stress Analysis for Electronic Equipment&#8221;, which presented models for computing rates of component failures. This publication was followed by the &#8220;RADC Reliability Notebook&#8221; in October 1959, and by the publication of a military reliability prediction handbook format known as MIL-HDBK-217.</p>
<p>It still continues today with various software applications that are descendants of MIL-HDBK-217. Underlying these “reliability prediction assessment” methods and calculations is the assumption that the main driver of unreliability is components that have intrinsic failure rates moderated by the absolute temperature. It has been assumed that component failure rates follow the Arrhenius equation and approximately double for every 10 °C rise in temperature.</p>
<p>MIL-HDBK-217 was removed as a military reference document in 1996 and has not been updated since that time. It is still being referenced unofficially by military contractors and is still believed to have some validity, even without any supporting evidence.</p>
<p>Much of the slow change in the industry is due to the fact that electronics reliability engineering has a fundamental “knowledge distribution” problem in that real field failure data, and the root causes of those failures can never be shared with the larger reliability engineering community. Reliability data is some of the most confidential sensitive data a manufacturer has, and short of a court order will never be published. Without this real data and information being disseminated and shared, one can expect little change in the beliefs of the vast majority of the electronics reliability engineering community.</p>
<p>Even though the probabilistic prediction approach to reliability has been practiced and applied for decades, any engineer who has seen the root causes of verified field failures will observe that almost all failures that occur before the electronic system is technologically obsolete are caused by 1) errors in manufacturing, 2) overlooked design margins, or 3) accidental overstress or abuse by the customer. The timing of these root causes, which are often driven by multiple events or stresses, is random and inconsistent. Therefore there is no basis for applying statistical or probabilistic predictive methods. Most users of predictions have observed the lack of correlation between estimated and actual failure rates.</p>
<p>It is long past time for electronics design and manufacturing organizations to abandon these invalid and misleading approaches, acknowledge that reliability cannot be estimated from assumptions and calculations, and start using “stress to limits” to find latent failure mechanisms before a product is released to market. It is true that you cannot derive a time to failure for most systems, but then no test can provide an actual field “life” estimate for a complex electronic system, nor do we need one. There is more life than needed in most electronics for most applications.</p>
<p>Fortunately, there is an alternative. A much more pragmatic and effective approach is to put most engineering and testing resources into discovering overlooked design margins or a weakest link early in the design process (HALT), and then to use that strength and durability to quickly screen (HASS) for errors during manufacturing. HALT and HASS have little to do with a specific type of chamber or chamber capabilities. They represent a fundamental change in the frame of reference for reliability development, moving from time metrics to stress/limit metrics. Many have already adopted this new frame of reference. Since they have found these methods much more efficient and cost effective for developing robust electronics systems, it gives them a competitive advantage. They are not about to let the world or their competitors know how successful these methods are.</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/02/no-evidence-of-correlation-field-failures-and-traditional-reliability-engineering/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Graphical Analysis of Repair Data</title>
		<link>http://nomtbf.com/2012/02/graphical-analysis-of-repair-data/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=graphical-analysis-of-repair-data</link>
		<comments>http://nomtbf.com/2012/02/graphical-analysis-of-repair-data/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 22:03:44 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[Maintainability]]></category>
		<category><![CDATA[cost]]></category>
		<category><![CDATA[graphical]]></category>
		<category><![CDATA[mcf]]></category>
		<category><![CDATA[mean cumulative function]]></category>
		<category><![CDATA[plots]]></category>
		<category><![CDATA[renewal]]></category>
		<category><![CDATA[repair]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=182</guid>
		<description><![CDATA[With the kind permission of Wayne Nelson and Robert Abernethy we are posting an article on the analysis of repair data. As you may know, the assumptions made when using simple time to failure analysis of repairable systems may provide &#8230; <a href="http://nomtbf.com/2012/02/graphical-analysis-of-repair-data/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>With the kind permission of Wayne Nelson and Robert Abernethy, we are posting an article on the analysis of repair data. As you may know, the assumptions made when using simple time-to-failure analysis of repairable systems may produce misleading results. Using the analysis method outlined by Wayne is one way to avoid those costly mistakes.</p>
<p>Here are the opening elements of the work by Wayne, followed by a link to the full paper.</p>
<p>Appendix M: Repair Data Analysis of Abernethy, R.B. (2006), The New Weibull Handbook, 5th ed., available from Dr. R.B. Abernethy, weibull@worldnet.att.net, 536 Oyster Road, North Palm Beach, FL 33408. May 5, 2006</p>
<p>AN APPLICATION OF GRAPHICAL ANALYSIS OF REPAIR DATA</p>
<p>Wayne Nelson, consultant<br />
WNconsult@aol.com, 739 Huntingdon Drive, Schenectady, NY 12309, USA</p>
<p>SUMMARY. This expository article presents a simple and informative non-parametric plot of repair data on a sample of systems. The plot is illustrated with transmission repair data from cars on a preproduction road test.</p>
<p>KEY WORDS: repair data; reliability data; graphical analysis.</p>
<p>1. INTRODUCTION</p>
<p>Purpose. This article presents a simple and informative plot for analyzing data on numbers or costs of repeated repairs of a sample of systems. The plotting method provides a non-parametric graphical estimate of the population mean cumulative number or cost of repairs per system versus age. This estimate can be used to:</p>
<p>1. Evaluate whether the population repair (or cost) rate increases or decreases with age (this is useful for system retirement and burn-in decisions),<br />
2. Compare two samples from different designs, production periods, maintenance policies, environments, operating conditions, etc.,<br />
3. Predict future numbers and costs of repairs,<br />
4. Reveal unexpected information and insight, an important advantage of plots.</p>
<blockquote><p>Overview. Section 2 describes typical repair data. Section 3 defines the basic population model and its mean cumulative function (MCF) for the number or cost of repairs. Section 4 shows how to calculate and plot a sample estimate of the MCF from data from systems with a mix of ages. Section 5 explains how to use and interpret such plots.<br />
Dr. Wayne Nelson is a leading expert on analysis of reliability and accelerated test data. He consults and gives training courses for companies and professional societies. For 24 years he consulted across the General Electric Co. and received the Dushman Award of GE Corp. R&amp;D for developments and applications of product reliability data analysis. He was elected a Fellow of the Amer. Statistical Assoc. (1973), the Amer. Soc. for Quality (1983), and the Institute of Electrical and Electronics Engineers (1988) for his innovative developments. He was awarded the 2003 Shewhart Medal and the 2010 Shainin Medal of ASQ and the 2005 Lifetime Achievement Award of IEEE for outstanding developments of reliability methodology and contributions to reliability education. He authored three highly regarded books, Applied Life Data Analysis (Wiley 1982, 2004), Accelerated Testing (Wiley 1990, 2004), and Recurrent Events Data Analysis (SIAM 2003), as well as two ASQ booklets and 130 journal articles. He can be contacted via WNconsult@aol.com.<br />
Dr. Robert B. Abernethy, www.bobabernethy.com</p></blockquote>
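<p>For readers who want a feel for the calculation before opening the full paper, here is a minimal sketch of a non-parametric MCF estimate for the number of repairs. It is not taken from the paper: the function name, the input layout, and the three example systems are hypothetical, and it assumes the usual approach in which each repair adds one divided by the number of systems still under observation at that age.</p>

```python
def mcf_estimate(histories):
    """Non-parametric estimate of the mean cumulative number of repairs.

    histories: list of (repair_ages, end_age) pairs, one per system, where
    repair_ages is a list of ages at repair and end_age is the age at which
    observation of that system stopped (its current age).
    Returns a list of (age, mcf) points, one per repair, in age order.
    """
    REPAIR, END = 0, 1  # repairs sort before end-of-observation at tied ages
    events = []
    for repair_ages, end_age in histories:
        for age in repair_ages:
            events.append((age, REPAIR))
        events.append((end_age, END))
    events.sort()

    at_risk = len(histories)  # systems still under observation
    mcf = 0.0
    points = []
    for age, kind in events:
        if kind == REPAIR:
            mcf += 1.0 / at_risk  # mean repairs per system at this age
            points.append((age, mcf))
        else:
            at_risk -= 1  # this system contributes no information beyond end_age
    return points


# Hypothetical sample: three systems with different current ages (in months).
example_histories = [
    ([5, 12], 20),  # two repairs, observed to 20 months
    ([8], 15),      # one repair, observed to 15 months
    ([], 10),       # no repairs, observed to 10 months
]

for age, mcf in mcf_estimate(example_histories):
    print(f"age {age:>3}: MCF = {mcf:.3f}")
```

<p>Plotting the returned points as a step function against age gives the kind of plot the article describes. A curve that bends upward suggests an increasing repair rate, and one that flattens suggests a decreasing rate.</p>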
<object style="margin:0px" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=id=11487782&amp;doc=weibhdbkrecurupdate506-120208150255-phpapp01&amp;type=d" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><param name="wmode" value="transparent" /><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=id=11487782&amp;doc=weibhdbkrecurupdate506-120208150255-phpapp01&amp;type=d" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355" wmode="transparent"></embed></object>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/02/graphical-analysis-of-repair-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Problems survey</title>
		<link>http://nomtbf.com/2012/02/problems-survey/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=problems-survey</link>
		<comments>http://nomtbf.com/2012/02/problems-survey/#comments</comments>
		<pubDate>Sun, 05 Feb 2012 19:47:32 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[MTBF]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=158</guid>
		<description><![CDATA[Quipol]]></description>
			<content:encoded><![CDATA[<p><iframe src="http://quipol.com/H5BGGkwv" width="400" height="600" frameborder="0" scrolling="no" id="qpl_H5BGGkwv">Quipol</iframe><script src="http://quipol.com/javascripts/embed_quipol.js?qpl_H5BGGkwv"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/02/problems-survey/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Shaping Organizational Behavior</title>
		<link>http://nomtbf.com/2012/01/shaping-organizational-behavior/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=shaping-organizational-behavior</link>
		<comments>http://nomtbf.com/2012/01/shaping-organizational-behavior/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 19:52:00 +0000</pubDate>
		<dc:creator>Pete Stuart</dc:creator>
				<category><![CDATA[MTBF]]></category>
		<category><![CDATA[Behavior]]></category>
		<category><![CDATA[metrics]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=145</guid>
		<description><![CDATA[When conducting a Human Reliability Assessment (HRA) we use the terminology: errors of commission or errors of omission. It behoves every professional to question why we focus upon one metric in preference to all others, in an objective and constructive &#8230; <a href="http://nomtbf.com/2012/01/shaping-organizational-behavior/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>When conducting a Human Reliability Assessment (HRA) we use the terminology: errors of commission or errors of omission. It behoves every professional to question why we focus upon one metric in preference to all others, in an objective and constructive manner in order to discern whether we are exposing our organization to errors of professional omission or commission. Obviously the other conclusion is that we are doing the right thing and this is also an empowering piece of knowledge.</p>
<p>When an organization uses one reliability metric predominantly, there is a propensity to subsequently force organizational behavior in a certain direction, which in turn becomes the social norm. If we fail to question the underlying rationale that motivates our organization to rely on certain metrics in favor of all others, then we also fail to fully comprehend our organization&#8217;s collective mindset. Such comprehension is crucial not only as the foundation of any potential change management initiative but also to break into management&#8217;s decision making cycle, which drives every tier of an organization. Getting inside our executives&#8217; decision making cycle is the first step in adding value at the core business level, rather than being relegated to a second tier employee who provides little more than a process compliance function. Reliability engineering is often viewed as an &#8220;add-on&#8221; mandated process by many middle to executive management staff, and I proffer that it is not their shortcomings that lead to such a mindset&#8230; it is ours. The reliability professional has the task of not only providing highly skilled analysis but also educating our organization on how we can do business more efficiently, carry less risk forward and use existing resources more effectively by simply understanding how to integrate our skills into core business processes rather than as an &#8220;add-on&#8221;.</p>
<p>So, how is any of this related to something as seemingly innocuous as MTBF?</p>
<p>Well, if we have to ask ourselves that question then, quite simply, we are not there yet. We are not at the level of professional and organizational understanding that allows us to instantly realize a potential disconnect between what we can truly offer and what we are currently doing. I concede that MTBF may be the right metric; however, unless you &#8220;know&#8221; why it is the right metric, it should be questioned.</p>
<p>The real concern is not necessarily whether MTBF is being used but rather how its use is shaping organizational behavior, which in turn potentially manifests as redundant activity, wasted resources, extended schedules, increased risk taking and, more importantly&#8230; our own inability to truly step into our management&#8217;s decision making cycle and ply our craft in a meaningful, effectual and professional manner.</p>
<p>So, are you exposing your organization to professional errors of commission or omission by not questioning the metrics being used? How is this then shaping your organizational behavior and, more importantly&#8230; what can &#8220;we&#8221; do to treat ill-informed behavior?</p>
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2012/01/shaping-organizational-behavior/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What should we use instead of MTBF?</title>
		<link>http://nomtbf.com/2011/10/what-instead-mtbf/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=what-instead-mtbf</link>
		<comments>http://nomtbf.com/2011/10/what-instead-mtbf/#comments</comments>
		<pubDate>Sat, 22 Oct 2011 18:22:57 +0000</pubDate>
		<dc:creator>Fred Schenkelberg</dc:creator>
				<category><![CDATA[MTBF]]></category>
		<category><![CDATA[availability]]></category>
		<category><![CDATA[duration]]></category>
		<category><![CDATA[life]]></category>
		<category><![CDATA[metric]]></category>
		<category><![CDATA[probability]]></category>
		<category><![CDATA[reliability]]></category>

		<guid isPermaLink="false">http://nomtbf.com/?p=83</guid>
		<description><![CDATA[While giving a presentation last week, I asked if anyone uses an 85/85 type test, and a couple indicated they did. I asked why. The response was &#8211; just because. We have always done it, or it’s a standard, or customers &#8230; <a href="http://nomtbf.com/2011/10/what-instead-mtbf/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://nomtbf.com/wp-content/uploads/2011/12/group-monitor.jpg"><img class="alignleft size-full wp-image-84" title="group monitor" src="http://nomtbf.com/wp-content/uploads/2011/12/group-monitor.jpg" alt="" width="800" height="638" /></a></p>
<p>While giving a presentation last week, I asked if anyone uses an 85/85 type test, and a couple indicated they did. I asked why.</p>
<p>The response was &#8211; just because. We have always done it, or it’s a standard, or customers expected it. The most honest response was ‘I don’t know’.</p>
<p>Then why is the test being done? Who is using the information for a decision? What is the value of the test results? If ‘just because’ is the best you can say about a test, why do it?<span id="more-83"></span></p>
<p>The same applies to MTBF. Why is it being used, for what purpose, and with what value? If the response you find is basically ‘just because’, stop using MTBF!</p>
<p>The basic question that then arises is what we should use instead. The answer is, or should be, obvious &#8211; what matters to your customer and your business. If your customer wants uptime &#8211; use availability. If your customer wants durability, then use reliability.</p>
<p>Reliability is the probability of successfully operating over a stated period of time. As you may know from my previous posts, some take MTBF to mean the same thing. And, as you know, MTBF is a statement about the failure rate, and not a couplet of probability and time. It’s really only half of what’s needed.</p>
<p>Use Reliability. State the probability or percent that survive and state the period of time. 98% survive one year. Easy.</p>
<p>No assumptions about distributions or statistics. No simplifications or distortions. And, it’s straightforward to understand. It means what it means. 98 out of 100 units operate successfully for one year. Easy.</p>
<p>Based on this metric, we can determine or assume life distributions and answer all manner of queries. It’s just a start, yet directly useful and meaningful.</p>
<p>Why? Not just because. Reliability is a measure of what the customer or business needs. It directly relates the number of units that work to a period of time. For example, if we have a one-year warranty period and want about 2% or fewer failures during the warranty period, then saying 98% reliable over 1 year (a bit more positive statement than 2% failures) works just fine.</p>
<p>Sure this could be converted to MTBF &#8211; and again I would ask why?</p>
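<p>To see what that conversion involves, here is a small sketch. It rests on an assumption the reliability statement itself does not need: a constant failure rate (exponential life distribution), so that R(t) = exp(-t / MTBF). The function names are hypothetical and chosen only for this example.</p>

```python
import math


def mtbf_from_reliability(reliability, duration):
    """MTBF implied by a reliability over a duration.

    ASSUMES a constant failure rate: R(t) = exp(-t / MTBF).
    The result is in the same time units as duration.
    """
    return -duration / math.log(reliability)


def reliability_from_mtbf(mtbf, duration):
    """Probability of surviving the duration under the same assumption."""
    return math.exp(-duration / mtbf)


# The statement from the post: 98% survive one year.
mtbf_years = mtbf_from_reliability(0.98, 1.0)

# Fraction expected to survive an operating time equal to the MTBF itself.
surviving_at_mtbf = reliability_from_mtbf(mtbf_years, mtbf_years)

print(f"Implied MTBF: {mtbf_years:.1f} years")
print(f"Fraction surviving to the MTBF: {surviving_at_mtbf:.3f}")
```

<p>Under that assumption the implied MTBF is roughly 49.5 years, and only about 37% of units would be expected to survive that long. The figure says nothing directly about the one-year warranty period, which is the point of stating the probability and the duration instead.</p>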
]]></content:encoded>
			<wfw:commentRss>http://nomtbf.com/2011/10/what-instead-mtbf/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

