Moving the Data Center to Liquid Cooling
04 March, 2019
Data center facilities are under increasing pressure to incorporate higher and higher wattage nodes, and this confronts them with many challenges. In particular, delivering high-reliability cooling for high-wattage nodes without reducing rack density is no easy task. For HPC sites especially, reducing rack density can mean longer interconnect distances, resulting in greater latency and lower cluster throughput. This makes liquid cooling of high-density CPU/GPU nodes appealing, but it can be a daunting prospect for air-cooled data centers. Because many liquid cooling approaches are one-size-fits-all solutions, it can be difficult for data center operators to find a path that lets them move to liquid cooling first on an as-needed basis and then evolve the overall facility to liquid cooling in a manageable fashion.

At its best, liquid cooling demands an architecture that is flexible across a variety of heat rejection scenarios, is cost effective and can be implemented without disruption. This allows a smooth transition from air cooling to liquid cooling. Heat capture with distributed liquid cooling, combined with options for heat rejection into either data center air or facilities liquid, provides the flexibility to address high-wattage nodes on an as-needed basis while evolving toward liquid cooling across the data center.

Asetek’s direct-to-chip liquid cooling provides a distributed cooling architecture to address the full range of heat rejection scenarios. It is based on low-pressure, redundant pumps and a sealed liquid path within each server node. Coolers (integrated pumps/cold plates) are placed within server or blade nodes, where they replace the CPU/GPU air heat sinks and remove heat with hot water. The liquid cooling circuit in the server can also incorporate memory, VRs and other high-wattage components into this low-PSI pumping circuit.
Unlike centralized pumping systems, placing the pumping within each server node enables very low pressures (4 psi typical). This mitigates failure risk, reduces complexity and expense, and avoids the high pressures required by centralized pumping. In a node with multiple CPUs or GPUs, each cooler carries its own pump, yet only a single pump is required for pumping, so the node has redundancy at the server level. On the heat rejection side, the Asetek architecture enables adaptation to existing air-cooled data centers and evolution to fully liquid-cooled facilities.

Liquid cooling can be added with no impact on data center infrastructure using Asetek’s ServerLSL™, a server-level Liquid Assisted Air Cooling (LAAC) solution. With ServerLSL, the redundant liquid pumps/cold plates are paired with a HEX (radiator) that is also in the node. Via the HEX, the captured heat is exhausted into the data center, where existing HVAC systems handle it. ServerLSL can be viewed as a transitional stage or as a tool to quickly incorporate the highest wattage CPUs/GPUs. Further, racks can contain a mix of liquid-cooled and air-cooled nodes.

The next step in LAAC allows for even higher density nodes. Available in 2018, InRackLAAC™ places a shared HEX in a 2U “box” that is connected to up to 12 servers. Because the HEX is removed from the individual nodes, greater component density is possible.

Later, when facilities water is routed to the racks, Asetek’s RackCDU™ options enable a much greater impact on OPEX for the data center. RackCDU D2C (Direct-to-Chip) captures between 60 percent and 80 percent of server heat into liquid, reducing data center cooling costs by over 50 percent and allowing 2.5x-5x increases in data center server density. It is used by all of the current sites in the TOP500 using Asetek liquid cooling.

RackCDU addresses both the node level and the facility overall. It uses the same redundant pumps/cold plates on server CPUs/GPUs while optionally cooling memory and other high heat components.
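
To make the 60 to 80 percent heat capture range concrete, the following minimal Python sketch splits a rack's heat load between the liquid loop and the room air. The 30 kW rack load and the function name `split_rack_heat` are illustrative assumptions, not figures or tools from Asetek. Only the capture range comes from the article.

```python
def split_rack_heat(rack_kw, capture_fraction):
    """Split a rack's heat load into (liquid_kw, air_kw).

    rack_kw          -- total rack heat load in kW (hypothetical input)
    capture_fraction -- share of heat captured into liquid, 0.0 to 1.0
    """
    if rack_kw < 0:
        raise ValueError("rack_kw must be non-negative")
    if not 0.0 <= capture_fraction <= 1.0:
        raise ValueError("capture_fraction must be between 0 and 1")
    liquid_kw = rack_kw * capture_fraction
    air_kw = rack_kw - liquid_kw  # heat still handled by room air / HVAC
    return liquid_kw, air_kw


# Assumed example rack load; not a figure from the article.
RACK_KW = 30.0

# 0.60 and 0.80 are the ends of the capture range quoted for RackCDU D2C.
for fraction in (0.60, 0.80):
    liquid, air = split_rack_heat(RACK_KW, fraction)
    print(f"capture {fraction:.0%}: {liquid:.1f} kW to liquid, {air:.1f} kW to air")
```

Under these assumptions, a 30 kW rack would leave between 6 kW and 12 kW for the existing air-handling systems, which gives a rough sense of how much HVAC load the liquid loop takes over.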
Rather than being exhausted into data center air, the collected heat is moved via a sealed liquid path to heat exchangers in the RackCDU, which transfer it into facilities water.

RackCDUs come in two types to give data center operators additional flexibility. InRackCDU™ is mounted in the rack along with the servers. Using 4U, it connects to nodes via Zero-U, PDU-style manifolds in the rack. Alternatively, VerticalRackCDU™ consists of a Zero-U rack-level CDU (Cooling Distribution Unit) mounted as a 10.5” extension at the rear of the rack.

RackCDUs have additional advantages in the OPEX of heat management. Because hot water (up to 40°C) is used for cooling, the data center does not require expensive CRACs and cooling towers and can use inexpensive dry coolers. However, when unused cooling capacity is available, data centers may choose to use facilities water coming from the CDU with existing CRACs and cooling towers.

Distributed pumping at the server, rack, cluster and site levels delivers flexibility in heat capture, coolant distribution and heat rejection that other approaches do not. As silicon wattage continues to grow in 2018 and beyond, this flexibility provides options for both immediate needs and long-term adoption.

Visit Asetek.com to learn more.