Translation: 宋媛媛 (Song Yuanyuan); proofreading: 劉海峰 (Liu Haifeng)

To cool datacenter servers, Microsoft turns to boiling liquid

Emails and other communications sent between Microsoft employees are literally making liquid boil inside a steel holding tank packed with computer servers at this datacenter on the eastern bank of the Columbia River.

Unlike water, the fluid inside the couch-shaped tank is harmless to electronic equipment and engineered to boil at 122 degrees Fahrenheit, 90 degrees lower than the boiling point of water.

The boiling effect, which is generated by the work the servers are doing, carries heat away from laboring computer processors. The low-temperature boil enables the servers to operate continuously at full power without risk of failure due to overheating.

Inside the tank, the vapor rising from the boiling fluid contacts a cooled condenser in the tank lid, which causes the vapor to change to liquid and rain back onto the immersed servers, creating a closed-loop cooling system.

“We are the first cloud provider that is running two-phase immersion cooling in a production environment,” said Husam Alissa, a principal hardware engineer on Microsoft’s team for datacenter advanced development in Redmond, Washington.
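As a quick check on the temperatures quoted above, the short sketch below converts the fluid's 122 degrees Fahrenheit boiling point to Celsius and compares it with water. The 212 degrees Fahrenheit figure for water assumes sea-level pressure, and the function and constant names are illustrative only.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


FLUID_BOILING_F = 122.0  # boiling point of the engineered fluid, from the article
WATER_BOILING_F = 212.0  # boiling point of water, assuming sea-level pressure

gap_f = WATER_BOILING_F - FLUID_BOILING_F
fluid_boiling_c = fahrenheit_to_celsius(FLUID_BOILING_F)
water_boiling_c = fahrenheit_to_celsius(WATER_BOILING_F)
gap_c = water_boiling_c - fluid_boiling_c

print(f"Fluid boils at {fluid_boiling_c:.0f} C, water at {water_boiling_c:.0f} C")
print(f"Gap: {gap_f:.0f} Fahrenheit degrees, {gap_c:.0f} Celsius degrees")
```

The 90-degree gap in the article is measured in Fahrenheit degrees; on the Celsius scale the same gap is smaller because a Celsius degree is 1.8 times as large.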
Ioannis Manousakis, a principal software engineer with Azure (left), and Husam Alissa, a principal hardware engineer on Microsoft’s team for datacenter advanced development (right), inspect the inside of a two-phase immersion cooling tank at a Microsoft datacenter. Photo by Gene Twedt for Microsoft.

Moore’s Law for the datacenter

The production environment deployment of two-phase immersion cooling is the next step in Microsoft’s long-term plan to keep up with demand for faster, more powerful datacenter computers at a time when reliable advances in air-cooled computer chip technology have slowed.

For decades, chip advances stemmed from the ability to pack more transistors onto the same size chip, roughly doubling the speed of computer processors every two years without increasing their electric power demand.

This doubling phenomenon is called Moore’s Law after Intel co-founder Gordon Moore, who observed the trend in 1965 and predicted it would continue for at least a decade. It held through the 2010s and has now begun to slow.

That’s because transistor widths have shrunk to the atomic scale and are reaching a physical limit. Meanwhile, the demand for faster computer processors for high-performance applications such as artificial intelligence has accelerated, Alissa noted.
To meet the need for performance, the computing industry has turned to chip architectures that can handle more electric power. Central processing units, or CPUs, have increased from 150 watts to more than 300 watts per chip, for example. Graphics processing units, or GPUs, have increased to more than 700 watts per chip.

The more electric power pumped through these processors, the hotter the chips get. The increased heat has ramped up cooling requirements to prevent the chips from malfunctioning.

“Air cooling is not enough,” said Christian Belady, distinguished engineer and vice president of Microsoft’s datacenter advanced development group in Redmond. “That’s what’s driving us to immersion cooling, where we can directly boil off the surfaces of the chip.”

Heat transfer in liquids, he noted, is orders of magnitude more efficient than in air.

What’s more, he added, the switch to liquid cooling brings a Moore’s Law-like mindset to the whole of the datacenter.

“Liquid cooling enables us to go denser, and thus continue the Moore’s Law trend at the datacenter level,” he said.

Christian Belady, distinguished engineer and vice president of Microsoft’s datacenter advanced development group, stands next to a two-phase immersion cooling tank at a Microsoft datacenter. Photo by Gene Twedt for Microsoft.
Lesson learned from cryptocurrency miners

Liquid cooling is a proven technology, Belady noted. Most cars on the road today rely on it to prevent engines from overheating. Several technology companies, including Microsoft, are experimenting with cold plate technology, in which liquid is piped through metal plates, to chill servers.

Participants in the cryptocurrency industry pioneered liquid immersion cooling for computing equipment, using it to cool the chips that log digital currency transactions.

Microsoft investigated liquid immersion as a cooling solution for high-performance computing applications such as AI. Among other things, the investigation revealed that two-phase immersion cooling reduced power consumption for any given server by 5% to 15%.

The findings motivated the Microsoft team to work with Wiwynn, a datacenter IT system manufacturer and designer, to develop a two-phase immersion cooling solution. The first solution is now running at Microsoft’s datacenter in Quincy.

That couch-shaped tank is filled with an engineered fluid from 3M. 3M’s liquid cooling fluids have dielectric properties that make them effective insulators, allowing the servers to operate normally while fully immersed in the fluid.
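To put the reported 5% to 15% per-server power reduction into concrete terms, the sketch below applies that range to a baseline draw. The 1,000-watt baseline is a hypothetical figure chosen for easy arithmetic, not a number from Microsoft, and the function name is illustrative.

```python
def immersion_power_range(
    baseline_watts: float,
    min_reduction: float = 0.05,  # lower bound of the reported reduction
    max_reduction: float = 0.15,  # upper bound of the reported reduction
) -> tuple[float, float]:
    """Return (lowest, highest) expected server power after the reduction."""
    if baseline_watts < 0:
        raise ValueError("baseline_watts must be non-negative")
    lowest = baseline_watts * (1.0 - max_reduction)
    highest = baseline_watts * (1.0 - min_reduction)
    return lowest, highest


# Hypothetical baseline draw for one air-cooled server (an assumption).
HYPOTHETICAL_BASELINE_WATTS = 1000.0

low, high = immersion_power_range(HYPOTHETICAL_BASELINE_WATTS)
print(f"Expected draw under immersion cooling: {low:.0f} W to {high:.0f} W")
```

The same function can be applied to a measured baseline to estimate per-server savings, which then scale with the number of servers in a tank.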
This shift to two-phase liquid immersion cooling enables increased flexibility for the efficient management of cloud resources, according to Marcus Fontoura, a technical fellow and corporate vice president at Microsoft who is the chief architect of Azure compute.

For example, software that manages cloud resources can allocate sudden spikes in datacenter compute demand to the servers in the liquid-cooled tanks. That’s because these servers can run at elevated power, a process called overclocking, without risk of overheating.

“For instance, we know that with Teams when you get to 1 o’clock or 2 o’clock, there is a huge spike because people are joining meetings at the same time,” Fontoura said. “Immersion cooling gives us more flexibility to deal with these burst-y workloads.”

Boiling liquid carries away heat generated by computer servers at a Microsoft datacenter. Microsoft is the first cloud provider to run two-phase immersion cooling in a production environment. Photo by Gene Twedt for Microsoft.

Sustainable datacenters

Adding the two-phase immersion cooled servers to the mix of available compute resources will also allow machine learning software to manage these resources more efficiently across the datacenter, from power and cooling to maintenance technicians, Fontoura added.
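The burst handling Fontoura describes can be illustrated with a toy allocator: fill every server at its normal clock first, then send any overflow to the immersion-cooled servers by overclocking them. This is only a sketch of the idea and not Microsoft's scheduler; the `Server` class, the `allocate` function, and the 20% overclock headroom are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    nominal_capacity: float  # work units the server handles at normal clock speed
    immersion_cooled: bool   # True if the server sits in a liquid-cooled tank


# Hypothetical extra capacity an immersion-cooled server gains when overclocked.
OVERCLOCK_HEADROOM = 0.2


def overclock_extra(server: Server) -> float:
    """Extra capacity available by overclocking; zero for air-cooled servers."""
    if server.immersion_cooled:
        return server.nominal_capacity * OVERCLOCK_HEADROOM
    return 0.0


def allocate(demand: float, servers: list[Server]) -> tuple[dict[str, float], float]:
    """Assign demand to servers and return (allocation per server, unmet demand)."""
    allocation = {s.name: 0.0 for s in servers}
    remaining = demand

    # Pass 1: use every server at its normal clock speed.
    for s in servers:
        share = min(remaining, s.nominal_capacity)
        allocation[s.name] += share
        remaining -= share

    # Pass 2: send the overflow to overclocked immersion-cooled servers.
    for s in servers:
        if remaining <= 0:
            break
        share = min(remaining, overclock_extra(s))
        allocation[s.name] += share
        remaining -= share

    return allocation, max(remaining, 0.0)


fleet = [
    Server("air-1", 100.0, immersion_cooled=False),
    Server("tank-1", 100.0, immersion_cooled=True),
    Server("tank-2", 100.0, immersion_cooled=True),
]
allocation, unmet = allocate(330.0, fleet)
print(allocation, "unmet:", unmet)
```

In this toy model the air-cooled server never exceeds its nominal capacity, so a spike above the fleet's normal total is absorbed only by the servers in the tank.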
“We will have not only a huge impact on efficiency, but also a huge impact on sustainability because you make sure that there is not wastage, that every piece of IT equipment that we deploy will be well utilized,” he said.

Liquid cooling is also a waterless technology, which will help Microsoft meet its commitment to replenish more water than it consumes by the end of this decade.

The cooling coils that run through the tank and enable the vapor to condense are connected to a separate closed-loop system that uses fluid to transfer heat from the tank to a dry cooler outside the tank’s container. Because the fluid in these coils is always warmer than the ambient air, there’s no need to spray water to condition the air for evaporative cooling, Alissa explained.

Microsoft, together with infrastructure industry partners, is also investigating how to run the tanks in ways that mitigate fluid loss and will have little to no impact on the environment.

“If done right, two-phase immersion cooling will attain all our cost, reliability and performance requirements simultaneously with essentially a fraction of the energy spend compared to air cooling,” said Ioannis Manousakis, a principal software engineer with Azure.
A Microsoft team is exploring two-phase immersion cooling technology. Pictured from left to right: Dave Starkenburg, datacenter operations management; Christian Belady, distinguished engineer and vice president of Microsoft’s datacenter advanced development group; Ioannis Manousakis, principal software engineer with Azure; and Husam Alissa, principal hardware engineer on Microsoft’s team for datacenter advanced development. Photo by Gene Twedt for Microsoft.

‘We brought the sea to the servers’

Microsoft’s investigation into two-phase immersion cooling is part of the company’s multi-pronged strategy to make datacenters more sustainable and efficient to build, operate and maintain.

For example, the datacenter advanced development team is also exploring the potential to use hydrogen fuel cells instead of diesel generators for backup power generation at datacenters.

The liquid cooling project is similar to Microsoft’s Project Natick, which is exploring the potential of underwater datacenters that are quick to deploy and can operate for years on the seabed sealed inside submarine-like tubes without any onsite maintenance by people.

Instead of an engineered fluid, the underwater datacenter is filled with dry nitrogen air. The servers are cooled with fans and a heat exchange plumbing system that pumps piped seawater through the sealed tube.
A key finding from Project Natick is that the servers on the seafloor experienced one-eighth the failure rate of replica servers in a land datacenter. Preliminary analysis indicates that the absence of humidity and of the corrosive effects of oxygen was primarily responsible for the superior performance of the servers underwater.

Alissa anticipates the servers inside the liquid immersion tank will experience similar superior performance. “We brought the sea to the servers rather than put the datacenter under the sea,” he said.

Ioannis Manousakis, a principal software engineer with Azure, removes a server blade from a two-phase immersion cooling tank at a Microsoft datacenter. Photo by Gene Twedt for Microsoft.

The future

If the servers in the immersion tank experience reduced failure rates as anticipated, Microsoft could move to a model where components are not immediately replaced when they fail. This would limit vapor loss as well as allow tank deployment in remote, hard-to-service locations.

What’s more, the ability to densely pack servers in the tank enables a re-envisioned server architecture that’s optimized for low-latency, high-performance applications as well as low-maintenance operation, Belady noted.

Such a tank, for example, could be deployed under a 5G cellular communications tower in the middle of a city for applications such as self-driving cars.
For now, Microsoft has one tank running workloads in a hyperscale datacenter. For the next several months, the Microsoft team will perform a series of tests to prove the viability of the tank and the technology.

“This first step is about making people feel comfortable with the concept and showing we can run production workloads,” Belady said.