Thermal management for high-performance embedded computing

Oct. 1, 2018
Liquid cooling is becoming more efficient and affordable, while hybrid approaches that blend air, liquid, and conduction cooling are coming into their own to give systems designers a leg-up in the battle against electronics heat.

It’s long been a given in the electronics industry that computing power doubles roughly every two years — sometimes even faster. Although this maxim, referred to as Moore’s Law, still holds true today, some industry experts worry that an inability to remove heat from electronics eventually could bring Moore’s Law to an abrupt halt.

Integrated circuits in high-performance embedded computing (HPEC) systems are shrinking; that’s also part of Moore’s Law, which says the number of transistors in a dense integrated circuit doubles about every two years. Increasing numbers of transistors in relatively small and cramped spaces means waste heat, and lots of it. If Moore’s Law is to continue further into the 21st Century, then it’s up to engineers who specialize in a variety of disciplines to safeguard the ability to remove heat from embedded systems, while keeping pace with ever-more-powerful computer processors.

Curtiss-Wright Defense Solutions specializes in Air Flow-Through (AFT) technology for high-performance 3U VPX embedded computing systems.

On the face of it, removing waste heat from high-performance embedded computers is a straightforward process; it involves transferring heat from processors and other hot components inside an enclosure to a cold wall nearby using conduction, blown air, liquid, or a combination of these approaches.

Electronics cooling is straightforward only in the abstract, however; the real engineering challenges are in the details, where factors like shock and vibration, high altitudes, dust and dirt, salt spray, and humidity come into play. Thermal-management techniques that work at sea level might not work at 40,000 feet. Blown air might be insufficient in desert heat or on an airport tarmac. The list goes on.

Across the many aerospace and defense applications with pressing thermal-management issues, and the growing number of approaches for removing heat, one constant remains: waste heat keeps increasing, and so does the imperative to get rid of it.

“In HPEC it’s really the same old story it has been in the past 15 to 20 years: power is going up, and power density is going up,” says Ivan Straznicky, chief technology officer of advanced packaging at the Curtiss-Wright Corp. Defense Solutions division in Ashburn, Va. More power in tighter spaces means the thermal-management problem is here to stay.

On a hopeful note, embedded computing architectures themselves do not pose the dire heat threat today that they did only a few years ago, Straznicky points out. Central processing units (CPUs), graphics processing units (GPUs), and field-programmable gate arrays (FPGAs) — all crucial technologies in today’s HPEC architectures — simply may not be generating heat at the rates they used to. “It’s not as much of a problem because there are more and more CPU, GPU, and FPGA cores, so the heat load is spread over more cores. It is one glimmer of hope in thermal management.”

Cooling-agnostic OpenVPX embedded computing technology from Mercury Systems can be packaged in circuit-board-only, Air Flow-By, Liquid Flow-By, conduction-cooled, Liquid Flow-Through, Air Flow-Through, or convection-cooled configurations.

Approaches to thermal management

Perhaps the most common way to cool military embedded computing systems is conduction cooling, where designers package electronics components in sealed enclosures. Waste heat moves from hot components on circuit boards to the sides of boards, through the enclosures, and out to large metal surfaces like the armor of a main battle tank, where the heat disperses to the ambient air. Conduction cooling is handy in military environments where dust, dirt, and other contaminants make it impossible to use blown air to cool hot components.

Embedded systems designers are hard-pressed to deal with rising amounts of heat in ever-smaller packages with conventional thermal-management techniques. Still, companies like General Micro Systems (GMS) Inc. in Rancho Cucamonga, Calif., are pushing the bounds of conduction cooling with new, and often-patented ways of removing heat in this traditional manner.

“We can use things on the edges of the board, which are the VME chassis clamps, or wedge locks,” explains Chris Ciufo, the GMS chief technology officer and vice president of product marketing. Conventional wisdom has it that these clamps can handle roughly 25 Watts of waste heat apiece. With two clamps per board, that offers capacity to cool about 50 Watts per processor board. This approach only goes so far, though, and “people really do need to cool more than that,” Ciufo says.

Air Flow-By cooling technology surrounds circuit boards in metallic clam shells that can be exposed to air and liquid cooling.

For reference, the 22-core Intel Xeon E5 processor, which is becoming popular for HPEC military and aerospace applications, can generate as much as 145 Watts of heat, while the 14-core version of the same processor generates as much as 120 Watts. That’s just too much heat for conventional VME wedge locks. “Clearly, people are finding other ways to do this,” Ciufo says.

GMS has patented a version of the VME card-edge clamp that doubles the conventional VME clamp’s capacity from 50 Watts to 100 Watts per card. “We contact both sides of our 6U VME board, so instead of just one side of the board, we can increase the heat the clamp pushes out to the chassis,” Ciufo says. “This clamp is broken into multiple segments, which move toward each other as the screw tightens. We feed heat from more surface area on both sides of the board to the clamp.”
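The heat budgets quoted above reduce to simple arithmetic; a minimal sketch, using the article's figures as illustrative values rather than vendor specifications:

```python
# Back-of-the-envelope wedge-lock heat budget, using the figures quoted
# in the article. Values are illustrative, not vendor specifications.

WATTS_PER_CONVENTIONAL_CLAMP = 25   # rough capacity of one VME card-edge clamp
CLAMPS_PER_BOARD = 2                # one clamp on each card edge

def board_budget_watts(watts_per_clamp, clamps=CLAMPS_PER_BOARD):
    """Total conduction-cooling budget for one board, in Watts."""
    return watts_per_clamp * clamps

conventional = board_budget_watts(WATTS_PER_CONVENTIONAL_CLAMP)   # 50 W
double_sided = 2 * conventional                                   # 100 W: clamps contact both board faces

# Even the doubled 100 W budget falls short of the 22-core Xeon E5's
# roughly 145 W, which is why designers look beyond conduction cooling.
print(conventional, double_sided)  # 50 100
```

Even the doubled clamp leaves a gap against a 145-Watt processor, which is the point Ciufo makes about needing other techniques.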

There are other ways of squeezing more performance out of conduction cooling. GMS has patented a socket that fastens the Xeon E5 processor to a 3U or 6U VPX computer board while creating an efficient heat path from the processor to the cold plate, Ciufo says. “It allows us to get a maximum amount of heat off the processor with less than 10 degrees of heat rise to the cold plate of the system.”

Reducing the processor’s heat rise can enable GMS systems designers to run the Xeon E5 processor and other high-performance chips at their maximum clock speeds. While some designers deal with thermal management by throttling-back processor clock speeds, at GMS “we can run processors at their maximum speed and heat with no compromise in reliability,” Ciufo says.
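The quoted heat-rise figure implies a thermal resistance, which is how packaging engineers usually express such a path. A quick sketch under the assumption that the 10-degree rise applies at the 145-Watt load mentioned earlier (the pairing is an illustration, not a GMS specification):

```python
# Illustrative thermal-resistance estimate from the numbers in the article:
# a heat rise of under 10 degrees C at an assumed 145 W load implies a
# processor-to-cold-plate thermal resistance below about 0.07 C/W.

def thermal_resistance_c_per_w(delta_t_c, power_w):
    """Theta = deltaT / Q: temperature rise per Watt of heat flow."""
    return delta_t_c / power_w

theta = thermal_resistance_c_per_w(10.0, 145.0)
print(round(theta, 3))  # about 0.069 C/W
```

The lower this resistance, the cooler the junction runs at a given clock speed, which is what lets designers avoid throttling.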

General Micro Systems uses a proprietary thermal-management technology called RuggedCool for HPEC applications to cool 300-Watt Intel Xeon processors.

Convection cooling

Another typical way to cool embedded computing components is with fan-blown air, also called convection cooling. This approach places heat sinks with fins on hot components, and blows air over the heat sinks to remove excess heat to the ambient air.

This approach can be a problem for aerospace and defense applications because shock, vibration, and air contaminants can cause system failures in convection-cooled systems. Fans also are notorious single points of failure, so military systems designers use them only when they must.

There’s a second kind of convection cooling, however, that doesn’t use fans. It’s called natural convection cooling, which moves air by creating currents based on the temperature difference between the cooling fins and the surrounding air. While it at least partially solves the problem of fan reliability, natural convection cooling is limited in the amount of heat it can remove. “You would not use natural convection cooling for HPEC,” says Curtiss-Wright’s Straznicky. “It’s mostly for the lower-end stuff.”
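The gap between natural and fan-blown convection can be sketched with Newton's law of cooling, Q = h·A·ΔT. The convection coefficients below are standard textbook ranges for air, and the fin area and temperature difference are assumed values for illustration:

```python
# Why natural convection tops out quickly: Q = h * A * deltaT (Newton's law
# of cooling). Textbook convection coefficients for air are roughly
# 5-25 W/m^2-K for natural convection and 25-250 W/m^2-K for forced air.
# The fin area and temperature difference here are illustrative assumptions.

def convective_heat_w(h_w_per_m2k, area_m2, delta_t_c):
    """Heat removed by convection, in Watts."""
    return h_w_per_m2k * area_m2 * delta_t_c

FIN_AREA_M2 = 0.05   # assumed total fin surface area
DELTA_T_C = 40.0     # assumed fin-to-ambient temperature difference

natural = convective_heat_w(10.0, FIN_AREA_M2, DELTA_T_C)    # 20 W
forced = convective_heat_w(100.0, FIN_AREA_M2, DELTA_T_C)    # 200 W
print(natural, forced)  # 20.0 200.0
```

With the same fins and the same temperature difference, a fan buys roughly an order of magnitude, which is why fanless natural convection stays at the low-power end.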

Liquid cooling

In today’s high-power HPEC applications, sometimes liquid cooling is one of only a few viable options for removing large amounts of heat. Systems designers can use a variety of liquids, ranging from jet fuel to inert liquids. “We are seeing a lot more liquid cooling these days than we did in the past,” says Shaun McQuaid, director of product management at the Mercury Systems Sensor and Mission Processing (SMP) segment in Andover, Mass. “The processing and performance requirements are pushing in that direction.”

Liquid cooling for high-performance embedded systems can involve channeling liquid through small pipes that snake their way throughout processing boards or through card clamps to move heat away from card edges. Systems designers adapt their liquid cooling techniques to match the demands of their applications.

Liquid cooling historically has been one of the most-expensive and least-reliable electronics cooling techniques available, but depending on the application, the added expense can be worth it — not only to make the most of system performance, but also for long-term system, card, and processor reliability. The longer a processor is subject to high temperatures near the top of its specifications, the more likely it is to fail at a critical moment. “To get the longest life and best efficiency in power-hungry systems, you go with liquid,” McQuaid says.

Liquid cooling is becoming a viable option for many high-performance embedded computing systems for demanding applications like radar, electronic warfare, and signals intelligence.

Liquid cooling in the recent past has gotten a bad rap for systems reliability. Leaky connectors once threatened system performance and longevity. That’s not the case today, says Curtiss-Wright’s Straznicky. “A lot of the reliability issues of liquid cooling have been solved,” he says. “The leaks often used to be associated with quick disconnects. Leaks were an issue, but what we have seen is, if you know which vendors to use, the leaks should not be an issue.”

Mercury’s McQuaid says the cooling necessary for extreme applications of 300 to 400 Watts at the board level typically requires liquid cooling. “It’s become a lot more of an available thing,” he says. “It’s not as exotic as it used to be. Technology has advanced in quick disconnects and leak-proofing to put liquid cooling in the realm of deployment.”

Despite its advantages and growing availability, however, liquid cooling still isn’t for everybody. “We don’t use liquid cooling or spray cooling because it adds the kind of complexity that doesn’t fit with small form factors,” says GMS’s Ciufo.

Hybrid cooling techniques

Some of the most intriguing new developments in electronics cooling involve blends of conduction, convection, and liquid cooling. Often these hybrid approaches offer to keep costs down, as well as to capitalize on the existing electronics infrastructure available on military systems and platforms.

Two notable industry-standard hybrid cooling approaches are ANSI/VITA 48.8 Air Flow Through (AFT) cooling, and ANSI/VITA 48.7 Air Flow-By cooling. Both approaches are for 3U and 6U VPX plug-in embedded computing boards. AFT cooling was pioneered by Curtiss-Wright Defense Solutions and Northrop Grumman Corp., while Air Flow-By cooling started at Mercury Systems.

General Micro Systems has adapted several thermal-management approaches to handle high-power embedded computing based on the Intel Xeon microprocessor.

AFT offers cooling capacity of as much as 200 Watts per card slot to support high-power embedded computing applications like sensor processing; it’s environmentally sealed to accommodate harsh military operating conditions. AFT passes air through the chassis heat frame, preventing the ambient air from contacting the electronics, but decreasing the thermal path to the cooling air dramatically, Curtiss-Wright officials say.

A gasket mounted inside the chassis seals the card’s internal air passage to the chassis side walls, and shields the internal electronics from the blown air. Each card has an isolated thermal path, rather than sharing cooling air among several cards.

Air Flow-By cooling, meanwhile, encapsulates circuit boards in heat-exchanger shells that cool both sides of the board by flowing air across both sides. The heat exchanger shell protects against airborne contaminants, electromagnetic interference (EMI), electrostatic discharge (ESD), and provides an extra layer of physical security.

Air Flow-By maintains the card’s standard 1-inch pitch, and offers a 25-percent reduction in processor temperature for dual Intel Xeon processors; a 33-percent increase in processor frequency at that reduced temperature; a five-times increase in mean time between failures (MTBF); and a 25-percent reduction in the weight of the processor module, according to Mercury.

The AFT and Air Flow-By techniques can offer the next logical step when conventional conduction cooling no longer can meet system requirements. “If the limits of air cooling and conduction cooling are reached, the next step is Air Flow-Through,” says Curtiss-Wright’s Straznicky. In addition, the AFT and Air Flow-By cooling approaches also can offer options for liquid cooling if systems designers need it.

Cost tradeoffs

It’s true that up-front costs increase when systems designers look beyond conventional conduction and convection cooling for high-performance embedded computing. “Cost is a big concern,” admits Mercury’s McQuaid. “People always are looking for the most cost-efficient method of meeting these challenges.”

Still, relatively high initial costs often can be justified when systems integrators consider costs over the lifetimes of these technologies. “Total cost of ownership is coming into play,” McQuaid says. “It’s not how much the board costs today, but it involves calculating reliability figures into the system.” It’s accepted that running high-performance processors at relatively cool temperatures can increase their life cycles substantially.
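The link between junction temperature and life cycle is commonly modeled with the Arrhenius acceleration factor. A hedged sketch: the 0.7 eV activation energy below is a widely used rule-of-thumb for silicon wearout mechanisms, not a figure from the article, and the temperatures are assumed examples:

```python
import math

# Arrhenius model for temperature-accelerated failure: a rough illustration
# of why running processors cooler extends their life. The 0.7 eV activation
# energy is a common rule-of-thumb assumption, not a figure from the article.

BOLTZMANN_EV_PER_K = 8.617e-5
ACTIVATION_ENERGY_EV = 0.7  # assumed typical value for silicon wearout

def acceleration_factor(t_cool_c, t_hot_c, ea_ev=ACTIVATION_ENERGY_EV):
    """How many times faster failures accumulate at t_hot_c vs t_cool_c."""
    t_cool_k = t_cool_c + 273.15
    t_hot_k = t_hot_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_cool_k - 1.0 / t_hot_k))

# Running 20 degrees cooler (65 C vs. 85 C junction) cuts the
# failure-acceleration factor by roughly 3-4x under these assumptions.
print(round(acceleration_factor(65.0, 85.0), 1))
```

A multiple-fold difference in expected life from a 20-degree reduction is the kind of figure that makes the total-cost-of-ownership case for more expensive cooling.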

“Standard cooling technologies like Liquid Flow-Through have not only caught up with embedded computing, but they also have given us significant headroom with the new computing devices,” says Curtiss-Wright’s Straznicky. “Air Flow-Through, for example, is still very feasible for modern and future designs.”

Company list

Aavid Thermacore Inc.
Lancaster, Pa.
https://www.thermacore.com/default.aspx

Abaco Systems
Huntsville, Ala.
https://www.abaco.com

Advanced Thermal Solutions Inc.
Norwood, Mass.
https://www.qats.com

Aitech Defense Systems Inc.
Chatsworth, Calif.
http://www.rugged.com

Behlman Electronics Inc.
Hauppauge, N.Y.
http://www.behlman.com

Crane Aerospace & Electronics
Lynnwood, Wash.
http://www.craneae.com

Curtiss-Wright Defense Solutions
Ashburn, Va.
https://www.curtisswrightds.com

Data Device Corp.
Bohemia, N.Y.
http://www.ddc-web.com

Elma Electronic
Fremont, Calif.
https://www.elma.com

Extreme Engineering Solutions (X-ES) Inc.
Verona, Wis.
https://www.xes-inc.com

General Micro Systems Inc.
Rancho Cucamonga, Calif.
https://www.gms4sbc.com

Kontron America Inc.
San Diego, Calif.
https://www.kontron.com

Meggitt Defense Systems Inc.
Irvine, Calif.
https://www.meggittdefense.com

Mercury Systems Inc.
Andover, Mass.
https://www.mrcy.com

Milpower Source Inc.
Belmont, N.H.
https://milpower.com

Parker Hannifin Corp. Aerospace Group
Alexandria, Va.
https://www.parker.com

Vicor Corp.
Andover, Mass.
http://www.vicorpower.com
