Achieving reliability with lead-free solders

By Keith Gurnett and Tom Adams

Editor’s note: This is the second of a two-part series. The first part ran on page 1 of the January 2008 issue of Military & Aerospace Electronics.

Prof. Nihal Sinnadurai, a well-known industry expert in reliability assurance methodology, believes that lead-free solders can achieve the same high levels of component and system reliability that military and aerospace users have become accustomed to during 50 years of tin-lead solder use.

Sinnadurai acknowledges, however, that over the next 15 years or so, while the industry is gaining experience with lead-free solders, there will be unavoidable field failures, both in civilian and non-civilian applications, some of which may have significant impact. In this article, he describes four strategies that can make it easier and quicker for manufacturers to achieve high reliability when using lead-free solders in military and aerospace applications.

Supply chain management

One of the problems that has plagued the OEMs of military/aerospace products for years is the relatively low volume of their production. Total production of a specialized military radio, for example, or of a missile’s GPS guidance system may amount to a few thousand units, only a tiny percentage of the volume of popular consumer items, such as an iPod.

“This is where the defense and aerospace lacks leverage,” Sinnadurai says. “They may be strong in the weapons they have, but they are very weak in their purchasing power at the component level, because they’re not buying huge numbers of components...this has been the problem now for the past 20 years.” Component manufacturers are happy to negotiate with high-volume commercial manufacturers, but there is really no incentive for them to give a break to a military and aerospace customer that represents a much smaller volume of business.

OEMs of military and aerospace systems can gain more control over suppliers by banding together in cooperative groups, Sinnadurai says. One such group is STACK-International, a non-profit organization owned by its members, which include companies such as Crane Aerospace & Electronics, Honeywell, Northrop Grumman, Rockwell Collins, Smiths Aerospace, and United Technologies.

Through their membership in STACK-International, the member companies effectively outsource their semiconductor supply chain audits to the cooperative organization. “STACK-International is essentially a club of high-reliability application companies, who do not individually have great purchasing volumes despite being multi-billion-dollar companies; the quantity of components that they buy is quite small. By clubbing together they have much more clout,” Sinnadurai says.

STACK-International audits semiconductor houses, companies that, Sinnadurai notes, “wouldn’t otherwise acquiesce to being audited by a company that buys only a thousand components from them.” Thus companies such as Freescale, Texas Instruments, and ST Microelectronics agree to routine audits by STACK. The auditors are not employees of STACK, but are highly qualified auditors employed by the member companies of STACK. “For example, companies such as Smiths Aerospace provide auditors, who undertake the audits.”

Flexible automated assembly

Flexibility, in this case, means automated production that can handle a variety of product assemblies on the same line through programmed control of the various pieces of process equipment. This involves design-for-manufacture (DfM) at the outset. Such flexible manufacturing readily processes printed circuit boards (PCBs) of very different types; for example, two of the many boards that go into an unmanned aerial vehicle (UAV). A UAV design evolves continually to improve functionality and to exploit the latest hardware technologies, and small lots are a natural consequence of that evolution. “There are a number of flexible automated manufacturers globally, including the United Kingdom,” Sinnadurai says. One obvious consequence of automated manufacture is that dependence on labor costs is minimal: these companies have demonstrated that their costs can be lower than they would be if they outsourced assembly to a low-labor-cost country.

Essential for automated flexible assembly are integrated design-for-manufacture and capture of design and assembly parameters. “The parameters of the robotics systems that, for example, pick-and-place components or component recognition or vision systems, are captured and linked into the design process,” Sinnadurai says. “Clearly this requires an intelligent approach to design and manufacture. This starts from hands-on knowledge of the processes. So, there has to be a significant amount of investment in preparation involving experimental study and measurement—thereby obtaining, for instance, the optimum arrangement for pick-and-place sequencing.”

Successful automation allows a machine to assemble, for example, as many as 10 varieties of a board over a period of two hours. The machine recognizes an incoming bare board by a bar code and then adjusts to that type of board. The tapes with all the various needed components will already have been loaded. “And then it moves on to the reflow process, which again has been profiled to meet the solders that will be used,” Sinnadurai says. “What will NOT be done in such an assembly line is to have a mixture of lead and lead-free in the same assembly process, because that would be a recipe for error. There will be different board densities and different designs going through on the same assembly line.”

Some electronics manufacturing services (EMS) providers offer completely flexible automation, he notes. To maintain closer control of its products, an OEM can outsource assembly to a local EMS provider. This makes it much easier for the OEM to oversee the assembly process and to manage the product’s reliability, and thereby its own liability.

Design review

“Design for reliability comes, in my view, from design review process,” Sinnadurai says. “When you introduce a new product, an essential part of product development is the design review which engages with the key engineering teams.”

The design review process aims to uncover all the development issues, including reliability glitches, that may arise from the design, materials, technology, manufacturing, and operation of the product. These can be subtle things: the susceptibility of a given IC package to moisture and corrosion, the right place on the board for a component that is susceptible to thermal damage during reflow, and the possibility that de-paneling a certain board type may result in internal cracks in ceramic chip capacitors.

Design review, Sinnadurai explains, is a systematic procedure. “The original designers, the R&D engineers, the manufacturing engineers, and the reliability and quality engineers may apply their respective skills to review the pros and cons of the proposed product. Issues and challenges must be solved systematically as the development proceeds and the different aspects of the development are progressively made more robust and the product becomes fit-for-purpose.”

Design review gives key engineers the opportunity to look at all of the details and to say, “Don’t do it that way, this is a better solution” or “This is the likely hazard arising from this [material, process, assembly].” Any objection or proposed solution may itself be challenged, Sinnadurai says. “If they say, ‘Don’t do it that way because I don’t like it’ or ‘because it’s never been done that way,’ that’s not good enough. What they need to do is consider, ‘This is the probable consequence if we do it that way, and can we therefore do something in the [design, material, process] to avoid that?’”

Design review is a procedure that Sinnadurai has routinely instituted in corporations, including during his time as vice president and corporate director of a photonics corporation. The various engineering teams first “brainstormed the product without invoking criticism.” The design review then followed an iterative procedure to identify and solve real issues.

“You may have to do some experimental work to de-risk some issues. Not all issues are slap-bang, yes-here’s-the-solution type,” Sinnadurai says. “You have to go through due process. Due process doesn’t have to be bureaucratic. It needs to be competent, and to have intelligent and experienced people involved in it. You can get a product out fast, but not too fast.”

Reliability assurance testing

The culture of cheapness, Sinnadurai points out, has led companies to perform too few reliability tests on too few components. “My observations are that companies also don’t do the tests for long enough because COOs and CEOs are keen to get product out the door. ‘Do you really need to take six months over that?’ or ‘I want you to release the product now because that is good enough.’ And of course the poor reliability engineer who is scared of losing his job will sign off a product against his/her better judgment, and out goes this product with about a quarter of the work that should have been done on it.”

Sinnadurai notes, by way of example, that a full qualification of an electronics product should include somewhere between 12 and 20 accelerated aging stress tests. These would probably include thermal shock, thermal cycling, elevated temperature overstress, humidity tests, and others. They add up to a plan that says, “These are the stress tests that you have to pass with no failures and with enough components to demonstrate confidence in the result.”

A practical way for a company to establish an effective reliability development regime is to divide the product into the “building blocks” from which it is constructed. The overall reliability testing regime can then be divided along the same lines, so that exhaustive reliability tests are carried out first on the smaller assemblies or subassemblies. When that testing has been completed, testing the full system is much simpler. “Even with the sub-assemblies or the smaller assemblies, you need to do a pretty good job if they’re looking for a 20-year reliability product, which is what the aerospace, military, and telecoms sectors still require, i.e. 20-year reliability requirements.”

The most serious omission in reliability testing, as Sinnadurai sees it, concerns sample size: the number of components being tested. Stress-testing a given number of components for a particular duration will produce an estimate of the FIT (Failure unIT) rate. The FIT rate is defined as the number of component failures that would be expected in 10^9 (one billion) device hours of operation. FIT is a system-level alternative to the MTTF (Mean Time To Failure). The FIT rate is calculated from the total number of failures, the sample size (the number of parts tested), and the test duration in hours. The duration at the test stress condition is related to the operating environmental condition by a previously proven accelerated aging model and equation. Statistically, the sample size is critically related to the confidence level of the predicted failure rate.
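
The FIT calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Sinnadurai’s own method: it assumes a constant failure rate, takes the acceleration factor as a given input rather than deriving it from any particular aging model, and uses a one-sided Poisson upper confidence bound so that a zero-failure test still yields a usable estimate. The function names, the 90 percent confidence level, the 1,000-hour test duration, and the acceleration factor of 10 are all hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """Return P(X <= k) for X ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def failure_upper_bound(failures, confidence):
    """Upper confidence bound on the expected number of failures.

    Solves P(X <= failures | mu) = 1 - confidence for mu by bisection.
    For zero failures this reduces to -ln(1 - confidence).
    """
    target = 1.0 - confidence
    lo, hi = 0.0, 1.0
    while poisson_cdf(failures, hi) > target:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if poisson_cdf(failures, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def fit_upper_bound(failures, sample_size, test_hours,
                    acceleration_factor, confidence=0.90):
    """Upper-bound FIT rate (failures per 1e9 device hours).

    acceleration_factor converts hours at the stress condition into
    equivalent hours at the operating condition (assumed known here).
    """
    device_hours = sample_size * test_hours * acceleration_factor
    return failure_upper_bound(failures, confidence) / device_hours * 1e9


# Hypothetical test: zero failures, 1,000 stress hours, acceleration factor 10.
for n in (235, 11):
    fit = fit_upper_bound(0, n, test_hours=1000, acceleration_factor=10)
    print(f"sample size {n:3d}: FIT upper bound = {fit:,.0f}")
```

With the test duration and acceleration factor held fixed, the bound scales inversely with sample size, so a smaller sample can only demonstrate a proportionally worse FIT rate. The inputs are illustrative and are not meant to reproduce the figures quoted in this article.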

Commercial and survival pressures have caused many corporations to “dumb down” the number of devices being tested, which was typically 235, resulting in an LTPD (Lot Tolerance Percent Defective) better than 1 percent. Sample size is crucial to LTPD and to the estimate of FIT rate. Telcordia [formerly Bellcore] now allows a sample size of as few as 11 components. A sample size of 11 results in a very coarse LTPD of 20 percent! A test of the same duration yields a FIT estimate of 1,000 with a sample size of 235, compared with a FIT estimate of 17,000 with a sample size of 11. “Telcordia has dumbed down,” Sinnadurai says, “and allows companies to do so. Less conscientious component companies leaped for joy when Telcordia produced this completely inadequate, valueless sample sizing.”
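
The link between sample size and LTPD can be checked with a short calculation. The sketch below assumes a zero-failure acceptance plan, a 90 percent confidence level, and a lot large enough for the binomial model to apply; none of these assumptions is stated in the article, and the function names are hypothetical.

```python
import math


def ltpd_zero_failures(sample_size, confidence=0.90):
    """LTPD demonstrated by a zero-failure test of sample_size parts.

    This is the lot fraction defective p at which the chance of seeing
    no failures, (1 - p) ** sample_size, falls to (1 - confidence).
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


def sample_size_for_ltpd(ltpd, confidence=0.90):
    """Smallest zero-failure sample size that demonstrates the given LTPD."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - ltpd))


for n in (235, 11):
    print(f"sample size {n:3d}: LTPD = {ltpd_zero_failures(n):.1%}")

for target in (0.01, 0.20):
    print(f"LTPD {target:.0%}: needs {sample_size_for_ltpd(target)} parts")
```

Under these assumptions, a 20 percent LTPD needs only 11 parts while a 1 percent LTPD needs 230, which is consistent with the sample sizes quoted above.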

The difference is enormous: even if all 11 devices pass the tests, the result can support a predicted time to failure of only a few thousand hours, far short of the required hundreds of thousands of hours. “Reliability assurance is a means of proving whether you’ve got a good product. When they go down to 11 components, they may well have a good product, but they cannot prove it. So they take shortcuts to get a product out the door without proving whether it’s any good.”

Bringing components and systems that use lead-free solder up to the levels of reliability that military and aerospace customers need will require diligence in reliability assurance and not a “dumbed-down” approach. Reliability assurance that demonstrates FIT rates below 1,000 is one of the keys to proving high reliability, no matter which solder type is being used.