Four Ways to Solve AI’s Heat Problem


Walk into a typical data center and one of the first things that jumps out at you is the noise—the low, buzzing sound of thousands of fans: fans next to individual computer chips, fans on the back panels of server racks, fans on the network switches. All of those fans are pushing hot air away from the temperature-sensitive computer chips and toward air-conditioning units.

But those fans, whirr as they might, are no longer cutting it. Over the past decade, the power density of the most advanced computer chips has exploded. In 2017, Nvidia came out with the V100 GPU, which draws 300 watts of power. Most of that power dissipates back out as heat. Three years later, in 2020, Nvidia’s A100 came out, drawing up to 400 W. The now-popular H100 arrived in 2022 and consumes up to 700 W. The newest Blackwell GPUs, revealed in 2024, consume up to 1,200 W.

“Road maps are looking at over 2,000 watts [per chip] over the next year or two,” says Drew Matter, president and CEO of the liquid-cooling company Mikros Technologies. “In fact, the industry is preparing for 5-kilowatt chips and above in the foreseeable future.”

This power explosion is driven by the obvious culprit: AI. And all the extra computation consuming that added power from advanced chips is generating unmanageable amounts of heat.

“The average power density in a rack was around 8 kW,” says Josh Claman, CEO of the startup Accelsius. “For AI, that’s growing to 100 kW per rack. That’s an order of magnitude. It’s really AI adoption that’s creating this real urgency” to figure out a better way to cool data centers.
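To put rough numbers on why fans can't keep up, here's a quick back-of-the-envelope sketch in Python; the air properties and the 15 °C allowable air temperature rise are assumed, illustrative values.

```python
# Back-of-the-envelope: airflow needed to carry away a rack's heat with air alone.
# Assumed values: air density ~1.2 kg/m^3, specific heat ~1005 J/(kg.K),
# and a 15 K allowable rise in air temperature through the rack.

AIR_DENSITY = 1.2         # kg/m^3
AIR_SPECIFIC_HEAT = 1005  # J/(kg.K)
DELTA_T = 15.0            # K, assumed inlet-to-outlet air temperature rise

def airflow_for_rack(heat_watts: float) -> float:
    """Volumetric airflow (m^3/s) needed to absorb `heat_watts` of heat."""
    mass_flow = heat_watts / (AIR_SPECIFIC_HEAT * DELTA_T)  # kg/s
    return mass_flow / AIR_DENSITY

for rack_kw in (8, 100):
    m3_per_s = airflow_for_rack(rack_kw * 1000)
    cfm = m3_per_s * 2118.88  # convert to cubic feet per minute
    print(f"{rack_kw:>3} kW rack: {m3_per_s:4.1f} m^3/s (~{cfm:,.0f} CFM)")
```

With those assumptions, an 8 kW rack needs on the order of 1,000 cubic feet per minute of air, while a 100 kW rack needs more than 11,000, which is why simply adding more and bigger fans stops being practical.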

Specifically, the urgency is to move away from fans and toward some sort of liquid cooling. For example, water has roughly four times the specific heat of air and is about 800 times as dense, meaning it can absorb around 3,200 times as much heat as a comparable volume of air can. What’s more, the thermal conductivity of water is 23.5 times as high as that of air, meaning that heat transfers to water much more readily.
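The arithmetic behind those ratios is easy to check. This short sketch uses typical room-temperature property values (approximate textbook figures, assumed here) and lands close to the factors cited above.

```python
# Typical room-temperature properties (approximate textbook values).
water = {"cp": 4186.0, "rho": 997.0, "k": 0.61}   # J/(kg.K), kg/m^3, W/(m.K)
air   = {"cp": 1005.0, "rho": 1.2,   "k": 0.026}

specific_heat_ratio = water["cp"] / air["cp"]              # roughly 4x
density_ratio       = water["rho"] / air["rho"]            # roughly 800x
volumetric_ratio    = specific_heat_ratio * density_ratio  # heat absorbed per unit volume
conductivity_ratio  = water["k"] / air["k"]                # how readily heat transfers in

print(f"specific heat:        {specific_heat_ratio:.1f}x")
print(f"density:              {density_ratio:.0f}x")
print(f"heat per unit volume: {volumetric_ratio:,.0f}x")
print(f"thermal conductivity: {conductivity_ratio:.1f}x")
```

The exact ratios shift a little with temperature and pressure, which is why the figures above are approximate.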

“You can stick your hand into a hot oven and you won’t get burned. You stick your hand into a pot of boiling water and you can instantly get third-degree burns,” says Seamus Egan, general manager of immersion cooling at Airedale by Modine. “That’s because the liquid transfers heat much, much, much, much more quickly.”

The data-center industry by and large agrees that cooling chips with liquid is the future, at least for AI-focused data centers. “As AI has made racks denser and hotter, liquid cooling has become the de facto solution,” Karin Overstreet, president of Nortek Data Center Cooling, said via email.

But there are a number of ways to do liquid cooling, from the simple and straightforward to the complex and slightly weird.

At the simple end, there's circulating chilled water through cold plates attached to the hottest chips. Then there's circulating not water but a special dielectric fluid that boils inside the cold plate to take away the heat. A third approach is dunking the entire server into a fluid that keeps it cool. And last, and most splashy, there's dunking the server into a vat of boiling liquid.

Which method will end up being the industry standard for the high-end AI factories of the future? At this point, it’s anyone’s guess. Here’s how the four methods work, and where they might find the most use.

#1: Single-Phase Direct-to-Chip Cooling

The most technologically mature approach is to use water. Already, many AI data centers are employing such direct-to-chip liquid cooling for their hottest chips.

In this scheme, metal blocks, called cold plates, with channels in them for coolant to circulate, are placed directly on top of the chips. The cold plates match the size of the chips and go inside the server. The liquid is usually water, with some glycol added to prevent bacterial growth, stabilize the temperature, protect against freezing and corrosion, and increase the viscosity of the liquid. The glycol-water mixture is forced through the cold plate, whisking away heat right from the source.

Companies like Mikros Technologies are pursuing single-phase direct-to-chip liquid cooling. In this technique, a cold plate is placed on top of the hottest chips. Liquid is circulated through the cold plate, whisking away heat. Marvell Technology

The glycol water is normally kept in a closed loop, circulating from the cold plates to a heat-exchange unit, which cools the liquid back down, and then back to the cold plate. Inside the heat exchanger, a separate loop of “facility water” is used to cool down the glycol water. The facility water is in turn cooled by either a chiller—an electrically powered refrigeration unit—or a dry cooler, an outdoor unit that uses fans to blow ambient air over the water as it moves through pipes. A dry cooler is much simpler and more energy efficient than a chiller, but it works only in cooler climates—it can’t cool the water below the ambient temperature.
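Whether a dry cooler is enough comes down to whether outdoor air can bring the facility water to the temperature the loop needs. The sketch below is a simplified illustration of that decision; the 5 °C approach temperature and the example setpoints are assumed values.

```python
# Simplified choice of heat-rejection equipment for the facility-water loop.
# Assumption: a dry cooler can bring water to roughly ambient temperature plus
# an "approach" of a few degrees; below that, a chiller has to do the work.

DRY_COOLER_APPROACH_C = 5.0  # assumed approach temperature, in degrees C

def heat_rejection(ambient_c: float, required_water_c: float) -> str:
    """Return which equipment can hold the facility-water setpoint."""
    achievable_c = ambient_c + DRY_COOLER_APPROACH_C
    return "dry cooler" if achievable_c <= required_water_c else "chiller"

# A cool day and a hot day, for a cold setpoint and a warm one.
for ambient in (10.0, 32.0):
    for setpoint in (20.0, 40.0):
        print(f"ambient {ambient:4.1f} C, water setpoint {setpoint:4.1f} C -> "
              f"{heat_rejection(ambient, setpoint)}")
```

The warmer the facility water is allowed to be, the more of the year a dry cooler alone can cover, a point that comes up again with the two-phase approaches below.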

One difficulty with this approach is that putting a cold plate on every single heat-producing component in a server isn't feasible. It only makes sense to put cold plates on the most energy-dense components—namely GPUs and some CPUs—leaving smaller components, like power supplies and memory units, to be cooled the old-fashioned way, with fans.

“The trend is moving toward a hybrid-cooling solution,” Overstreet says. “So liquid cooling does about 80 percent of the cooling for the server room or the data hall, and about 20 percent is the existing air-cooling solution.”

#2: Two-Phase Direct-to-Chip Cooling

With GPU power densities showing no signs of leveling off, direct-to-chip water cooling is hitting a limit. You can, of course, increase the flow of water, but that will use more energy. Or you can operate the chips at a higher temperature, which will cut into their performance and in the long run degrade the chips. Fortunately, there’s a third option: to squeeze a bit more out of the physics of heat exchange.

The extra cooling power offered by physics comes from latent heat—that is, the energy it takes to change phase, in this case from liquid to gas. As the liquid boils atop the GPU, it absorbs that extra latent heat without its temperature rising.
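A quick comparison shows how big that latent-heat bonus is. The sketch uses water purely because its textbook values are familiar; the engineered dielectric fluids in these systems have different (and lower) numbers, and the 10 K temperature rise is assumed.

```python
# Sensible heat vs. latent heat, per kilogram of fluid (water used for illustration).
CP_WATER = 4.186      # kJ/(kg.K), specific heat of liquid water
H_VAP_WATER = 2257.0  # kJ/kg, latent heat of vaporization at 100 C

delta_t = 10.0  # K, assumed temperature rise for the no-boiling (single-phase) case

sensible_kj = CP_WATER * delta_t  # heat absorbed by warming 1 kg of water by 10 K
latent_kj = H_VAP_WATER           # heat absorbed by boiling 1 kg, with no temperature rise

print(f"warming 1 kg by {delta_t:.0f} K absorbs {sensible_kj:7.1f} kJ")
print(f"boiling 1 kg absorbs             {latent_kj:7.1f} kJ "
      f"({latent_kj / sensible_kj:.0f}x as much)")
```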

Companies like Accelsius are proposing two-phase direct-to-chip liquid cooling. Here, a cold plate is also placed on top of the hottest chips, and the liquid circulating through the cold plate boils directly atop the chip. Big Idea Productions

That’s basically how two-phase direct-to-chip cooling works. In this scheme, a specially formulated dielectric liquid circulates through cold plates sitting atop high-energy chips and boils into vapor. The vapor is then fed back to a heat exchanger, which cools the fluid using facility water.

“It’s really boiling to cool,” says My Truong, chief technology officer of the startup ZutaCore, which makes two-phase direct-to-chip cooling systems.

Water boils at 100 °C (at atmospheric pressure), which is too high for proper chip operation. So you need a specially formulated fluid with a lower boiling point. ZutaCore’s chief evangelist, Shahar Belkin, explains that the fluid they use is sourced from chemical suppliers like Honeywell and Chemours, and boils at a temperature as low as 18 °C, which can be adjusted up or down by tweaking the pressure in the loop. In addition, the fluid is dielectric, meaning it doesn’t conduct electricity. So, unlike water, if some of the fluid spills onto the electronics, it won’t damage the costly equipment.
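That pressure tuning follows from the Clausius-Clapeyron relation: changing the absolute pressure in the loop shifts the temperature at which the fluid boils. The sketch below is only a rough illustration; the 18 °C boiling point at atmospheric pressure comes from the paragraph above, but the latent heat and molar mass are made-up placeholder values, not ZutaCore's fluid data.

```python
import math

# Rough Clausius-Clapeyron estimate of how loop pressure shifts the boiling point.
# Fluid properties below are assumed placeholders, not a real product's data.
H_VAP = 150_000.0     # J/kg, assumed latent heat of vaporization
MOLAR_MASS = 0.250    # kg/mol, assumed molar mass of the dielectric fluid
R = 8.314             # J/(mol.K), universal gas constant
T_BOIL_1ATM = 291.15  # K (18 C), boiling point at atmospheric pressure
P_ATM = 101_325.0     # Pa

def boiling_point_c(pressure_pa: float) -> float:
    """Estimated boiling temperature (deg C) at a given absolute pressure."""
    inv_t = (1.0 / T_BOIL_1ATM
             - (R / (H_VAP * MOLAR_MASS)) * math.log(pressure_pa / P_ATM))
    return 1.0 / inv_t - 273.15

for fraction in (0.8, 1.0, 1.2):  # fractions of atmospheric pressure
    print(f"{fraction:.1f} atm -> boils at about {boiling_point_c(fraction * P_ATM):4.1f} C")
```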

With water, the temperature increases drastically as it flows over the hot chips. That means the incoming water needs to be kept cold, and so the facility water requires cooling with chillers in most climates.

With boiling dielectric fluid, however, the fluid remains roughly the same temperature and simply changes phase into a vapor. That means both the liquid and the facility water can be kept at a higher temperature, resulting in significant energy savings.

When liquid boils on top of a hot chip, the chip is cooled not only through contact with the cooler liquid, but also through the latent heat it takes to induce a phase change. Accelsius

“Because of the really efficient boiling process that happens on the cold plate, we can accept facility water that’s 6 to 8 degrees warmer than [with] single phase,” says Lucas Beran, director of product marketing at Accelsius, another startup working on two-phase direct-to-chip liquid cooling.

The two-phase setup also requires lower liquid flow rates than the traditional single-phase water approach, so it uses less energy and runs less risk of damaging the equipment. The flow rate of two-phase cooling is about one-fifth that of single-phase cooling, Belkin says.

With single-phase water cooling, he says, “you’ll have to flow a gallon per minute into the cold plate” for the most advanced chips running at 2,000 W. “This means very, very high pressure, very, very high flow. It means that pumping will be expensive, and [the cooling system] will actually harm itself with the high flow.”
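Those flow figures follow from a basic heat balance: in a single-phase loop, the heat removed is flow times specific heat times temperature rise; in a two-phase loop, it's flow times latent heat. The sketch below pairs water properties with an assumed latent heat and density for the dielectric fluid and an assumed 8 K water temperature rise, so treat it as an order-of-magnitude check rather than vendor data.

```python
# Coolant flow needed to remove 2,000 W from a single chip: single-phase water
# vs. a boiling dielectric fluid. Dielectric properties are assumed placeholders.
CHIP_HEAT_W = 2000.0

# Single-phase water: heat = mass flow * specific heat * temperature rise
CP_WATER = 4186.0   # J/(kg.K)
RHO_WATER = 997.0   # kg/m^3
DELTA_T = 8.0       # K, assumed allowable water temperature rise across the cold plate
water_kg_s = CHIP_HEAT_W / (CP_WATER * DELTA_T)
water_lpm = water_kg_s / RHO_WATER * 1000 * 60   # liters per minute

# Two-phase dielectric: heat = mass flow * latent heat (no temperature rise)
H_VAP = 150_000.0   # J/kg, assumed latent heat of the dielectric fluid
RHO_FLUID = 1600.0  # kg/m^3, assumed liquid density
fluid_kg_s = CHIP_HEAT_W / H_VAP
fluid_lpm = fluid_kg_s / RHO_FLUID * 1000 * 60   # liters per minute

print(f"single-phase water:   {water_lpm:.2f} L/min (~{water_lpm / 3.785:.2f} gal/min)")
print(f"two-phase dielectric: {fluid_lpm:.2f} L/min "
      f"(~{water_lpm / fluid_lpm:.0f}x less flow)")
```

With these placeholder numbers, the water side comes out close to Belkin's gallon per minute, and the boiling fluid needs several times less flow, in the same ballpark as the reduction he describes.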

#3: Single-Phase Immersion Cooling

Direct-to-chip liquid cooling offers much more cooling capacity than just blowing air, but it still relies on cold plates as intermediaries to do the cooling.

What if you could bypass the cold plate altogether and just dunk the entire computer server in coolant? Some companies are doing just that.

In this approach, the data center is arranged around immersion tanks rather than racks, each tank roughly the size of a refrigerator. The immersion tanks are filled with a dielectric fluid, usually an oil, which must be nonconductive and have strong thermal transfer properties, says Rachel Bielstein, global sales manager of immersion cooling at Baltimore Aircoil Co. The fluid also requires long-term stability and low environmental and fire risk.

Sustainable Metal Cloud is advocating for single-phase immersion cooling, in which an entire server is submerged in a vat of liquid to keep it cool. Firmus Technologies

With immersion cooling, everything gets cooled by the same fluid. After the oil has whisked away the heat, there are various approaches to cooling the immersion fluid. Baltimore Aircoil, for one, has designed a heat exchanger that circulates facility water through coils and plates inside the tank, Bielstein explains. “The heated water is then pumped to an outside cooler that releases the heat into the air, cools the water, and sends it back to the heat exchanger to absorb more heat from the tank. This process uses up to 51 percent less energy versus traditional designs.”

The team at Singapore-based Sustainable Metal Cloud (SMC), which builds immersion-cooling systems for data centers, has figured out the modifications that need to be made to servers to make them compatible with this cooling method. Beyond removing the built-in fans, the company swaps out the thermal-interface materials that connect chips to their heat sinks, as some of those materials degrade in the oil. Oliver Curtis, co-CEO of SMC and its sister company Firmus, told IEEE Spectrum the modifications they make are small but important to the functioning of SMC’s setup.

“We’ve created the perfect operating environment for a computer,” Curtis says. “There’s no dust, no movement, no vibration, because there’s no fans. And it’s a perfect operating temperature.”

There are some chips whose power density is still too high to be completely cooled by the slow-moving oil. In those cases, it’s necessary to add cold plates to increase the oil flow over them. “Single-phase immersion has already hit the limits” for cooling these advanced chips, says Egan of Airedale by Modine. Adding cold plates to immersion cooling, he says, “will definitely provide support for more advanced chip architectures and reduce the heat load on the single-phase dielectric fluid. The new challenge is that I now need two separate cooling-loop systems.”

#4: Two-Phase Immersion Cooling

If no one cooling method is enough on its own, how about putting all of them together, and dunking your data center into a vat of boiling oil?

Some companies already are.

“Two-phase immersion is probably the most moon-shot technology when it comes to data-center liquid cooling,” says Beran, of Accelsius.

But Brandon Marshall, global marketing manager of data-center liquid cooling at Chemours, says this is where the industry is headed. “We believe from the research that we’ve done that two-phase immersion is going to come up in a pretty reasonable way.”

At their lab in Newark, Del., the Chemours team is developing a specially formulated liquid for two-phase immersion cooling. In this approach, the server is dunked into a vat of liquid, and the liquid boils atop the hot components, cooling the system. Chemours

Marshall argues that a two-phase—also known as boiling—liquid has 10 to 100 times as much cooling capacity as a single-phase liquid, due to its latent heat. And while two-phase direct-to-chip cooling may work for the chips of today, it still leaves many components, such as memory modules and power supplies, to be air cooled. As CPUs and GPUs grow more powerful, these memory modules and power supplies will also require liquid cooling.

“That list of problems is not going anywhere,” Marshall says. “I think the immersion-cooling piece is going to continue to grow in interest as we move forward. People are going to get more comfortable with having a two-phase fluid inside of a rack just like they have [with] putting water in a rack through single-phase direct-to-chip technology.”

In their lab in Newark, Del., the Chemours team has placed several high-power servers in tanks filled with a proprietary, specially formulated fluid. The fluid is dielectric, so as not to cause shorts, and it’s also noncorrosive and designed to boil at the precise temperature at which the chips are to be held. The fluid boils directly on top of the hot chips. Then the vapor condenses on a cooled surface, either at the top or the back panel of the tank.

In their lab in Newark, Del., the Chemours team is testing their two-phase immersion-cooling fluid. In this approach, the whole server is dunked into a tank of dielectric liquid. The heat from the server boils the liquid, resulting in cooling. Chemours

That condenser is cooled with circulating facility water. “All we need is water sent directly to the tank that’s about 6 degrees lower than our boiling point, so about 43 °C,” Marshall says. “The fluid condenses [back to a liquid] right inside of the tank. The temperature required to condense our fluid can eliminate the need for chillers and other complex mechanical infrastructure in most cases.”
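The arithmetic behind that claim is simple. If the fluid boils near 49 °C (inferred from the quoted figures), then 43 °C facility water can condense it, and a dry cooler with a few degrees of approach can hold that setpoint in most climates. A minimal check, with the dry-cooler approach temperature assumed:

```python
# Why chillers can drop out of the two-phase immersion design described above.
FLUID_BOILING_POINT_C = 49.0  # inferred: about 6 C above the quoted 43 C water temperature
CONDENSER_APPROACH_C = 6.0    # per the quote: water roughly 6 C below the boiling point
DRY_COOLER_APPROACH_C = 5.0   # assumed: how close a dry cooler gets water to ambient air

required_water_c = FLUID_BOILING_POINT_C - CONDENSER_APPROACH_C
max_ambient_c = required_water_c - DRY_COOLER_APPROACH_C

print(f"facility water needed:     ~{required_water_c:.0f} C")
print(f"dry cooler suffices below: ~{max_ambient_c:.0f} C ambient")
```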

According to a recent case study by Chemours researchers, two-phase immersion cooling is more cost effective than single-phase immersion or single-phase direct-to-chip in most climates. For example, in Ashburn, Va., the 10-year total cost of ownership was estimated at US $436 million for a single-phase direct-to-chip setup, $491 million for a single-phase immersion setup, and $433 million for a two-phase immersion-cooling setup, mostly due to lower power requirements and a simplified mechanical system.

Critics argue that two-phase immersion makes it hard to maintain the equipment, especially since the fluids are so specialized, expensive, and prone to evaporating. “When you’re in an immersion tank, and there’s dollar signs evaporating from it, that can make it a bit of a challenge to service,” Beran says.

However, Egan of Airedale by Modine says his company has developed a way to mostly avoid this issue with its immersion tanks, which are intended for edge applications. “Our EdgeBox is specifically designed to maintain the vapor layer lower down in the tank with a layer of air above it and closer to the tank lid. When the tank is opened (for a short maintenance period), the vapor layer does not ‘flow out’ of the tank,” Egan wrote via email. “The vapor is much heavier than air and therefore stays lower in the tank. The minimal vapor loss is offset by a buffer tank of fluid within the system.”

For the foreseeable future, people in the industry agree that the power demands of AI will keep going up, and the need for cooling along with them.

“Unless the floor falls out from under AI and everybody stops building these AI clusters, and stops building the hardware to perform training for large language models, we’re going to need to keep advancing cooling, and we’re going to need to solve the heat problem,” Marshall says.

Which cooling technology will dominate in the coming AI factories? It’s too soon to say. But the rapidly changing nature of data centers is opening up the field to a lot of inventiveness and innovation.

“There’s not only a great market for liquid cooling,” says Drew Matter, of Mikros Technologies, “but it’s also a fun engineering problem.”
