
Carbon Footprint

From Andrew A. Chien, Liuzixuan Lin, Hai Nguyen, Varsha Rao, Tristan Sharma, and Rajini Wijayawardana. 2023. Reducing the Carbon Impact of Generative AI Inference (today and in 2035). In 2nd Workshop on Sustainable Computer Systems (HotCarbon '23), July 9, 2023, Boston, MA, USA. ACM, New York, NY, USA, 7 pages.

Parameters used in the paper:
  • TDP = 0.428 kW per GPU (1/8 of the 3.43 kW instance power) x 1.1 PUE
  • OI = 0.35 TFLOPs per inference, assuming a GPT-3 model (around 175 billion weights) processed with BF16 operations
  • IW = 5 inferences per output word (assumed window/sampling of 5 for each output word)
  • WC = output word count (measured average of 185 output words per request)
  • C = 156 TFLOPS GPU capacity, assuming 50% efficiency
  • E_hw = per-GPU embodied emission, calculated as 1/8 of the estimated per-instance emissions: E_hw = (1/8)(PF + E_GPU + E_CPU + E_DRAM + E_SSD + E_HDD), where PF is the IC packaging carbon footprint and E_GPU, E_CPU, E_DRAM, E_SSD, and E_HDD are the GPU, CPU, DRAM, SSD, and HDD emissions, respectively. These are estimated from previous reports [26] and instance hardware specifications [1, 3, 11], yielding E_hw = 318 kgCO2 per GPU
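A minimal sketch of how these parameters might combine into a per-request estimate. The combining formulas (GPU time = OI x IW x WC / C, operational energy = TDP x time) are assumptions inferred from the parameter definitions above, not formulas quoted from the paper:

```python
# Sketch: per-request GPU time and operational energy from the parameters above.
# The combining formulas are assumptions, not quoted from the HotCarbon '23 paper.

TDP_KW = 0.428      # kW per GPU (1/8 of 3.43 kW instance) x 1.1 PUE
OI_TFLOP = 0.35     # TFLOPs per inference (GPT-3, ~175B weights, BF16)
IW = 5              # inferences per output word
WC = 185            # measured average output words per request
C_TFLOPS = 156      # effective GPU throughput at 50% efficiency

tflop_per_request = OI_TFLOP * IW * WC        # ~323.8 TFLOPs per request
gpu_seconds = tflop_per_request / C_TFLOPS    # ~2.08 s of GPU time
energy_kwh = TDP_KW * gpu_seconds / 3600      # ~0.00025 kWh operational energy

print(f"{tflop_per_request:.1f} TFLOPs, {gpu_seconds:.2f} GPU-s, "
      f"{energy_kwh * 1000:.3f} Wh per request")
```

The embodied figure E_hw = 318 kgCO2 per GPU would additionally be amortized over the GPU's service life, which this sketch does not attempt since no lifetime is given above.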

Water Use

Data for A100:
  • Water consumption per wafer mask layer, from the TSMC 2022 ESG report: 137.3 L per 12-inch-equivalent wafer mask layer
  • Mask layers for the TSMC 7nm process: 87
  • A100 chips per wafer: the A100 die is 826 mm², similar to the H100's 814 mm², and the H100 yields 29 dies per 12-inch wafer
Using the manufacturing water use formula:
(water use per chip) = (water use per wafer mask layer) x (mask layers) / (chips per wafer)

(water use per A100) = (137.3 L/layer) x (87 layers/wafer) / (29 chips/wafer) = 411.9 L/chip
Note: this doesn’t include the memory chips that are also on the A100… need to find a source for the water use there
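A quick sketch of the water-use arithmetic above, using only the figures already listed (the per-layer water figure, the 7nm mask count, and the dies-per-wafer estimate borrowed from the similarly sized H100):

```python
# Sketch of the manufacturing water-use estimate for one A100 die,
# reproducing the calculation above. Excludes the on-package memory,
# as noted.

WATER_PER_MASK_LAYER_L = 137.3   # L per 12-inch-equivalent wafer mask layer (TSMC 2022 ESG)
MASK_LAYERS_7NM = 87             # mask layers, TSMC 7nm process
CHIPS_PER_WAFER = 29             # A100-sized dies per 12-inch wafer (H100 estimate)

water_per_chip_l = WATER_PER_MASK_LAYER_L * MASK_LAYERS_7NM / CHIPS_PER_WAFER
print(f"{water_per_chip_l:.1f} L of water per A100 die")   # ~411.9 L
```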