Carbon Footprint

From Andrew A. Chien, Liuzixuan Lin, Hai Nguyen, Varsha Rao, Tristan Sharma, and Rajini Wijayawardana. 2023. Reducing the Carbon Impact of Generative AI Inference (today and in 2035). In 2nd Workshop on Sustainable Computer Systems (HotCarbon ’23), July 9, 2023, Boston, MA, USA. ACM, New York, NY, USA, 7 pages.

TDP = 0.428 kW per GPU (1/8 of 3.43 kW for the instance) × 1.1 PUE

OI = 0.35 TFLOPs per inference, assuming a GPT-3-scale model (around 175 billion weights) processed with BF16 operations.

πΌπ‘Š = 5 is the number of inferences per output word (assumed window/sampling of 5 for each output word)

π‘ŠπΆ is the output word count (measured average of 185 output words/request)

C = 156 TFLOPS is the usable GPU capacity, assuming 50% efficiency (half of the A100's 312 TFLOPS BF16 peak)

E_HW is the per-GPU embodied emission, calculated as 1/8 of the estimated per-instance emissions: E_HW = 1/8 (PF + E_GPU + E_CPU + E_DRAM + E_SSD + E_HDD), where PF is the IC packaging carbon footprint and E_GPU, E_CPU, E_DRAM, E_SSD, and E_HDD are the GPU, CPU, DRAM, SSD, and HDD embodied emissions, respectively. We estimate these emissions based on previous reports [26] and instance hardware specifications [1, 3, 11], yielding E_HW = 318 kgCO2 per GPU
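
As a sanity check on how these parameters fit together, here is a minimal back-of-envelope sketch (not the paper's code). It assumes per-request GPU time is OI × IW × WC / C and operational energy is that time × TDP × PUE; the grid carbon intensity and the 5-year lifetime used to amortize E_HW are illustrative assumptions, not values from this document.

```python
# Back-of-envelope sketch (not from the paper): combine the parameters above
# into per-request operational energy/carbon and an amortized embodied share.
# CARBON_INTENSITY and LIFETIME_YEARS are illustrative assumptions only.

TDP_KW = 0.428      # kW per GPU (1/8 of the instance's 3.43 kW)
PUE = 1.1           # power usage effectiveness multiplier
OI_TFLOP = 0.35     # TFLOPs per inference (GPT-3-scale model, BF16)
IW = 5              # inferences per output word
WC = 185            # average output words per request
C_TFLOPS = 156      # usable GPU throughput at 50% efficiency
E_HW_KGCO2 = 318    # embodied emissions per GPU (kgCO2)

CARBON_INTENSITY = 0.4   # kgCO2 per kWh -- assumed, not from the document
LIFETIME_YEARS = 5       # GPU service life for amortization -- assumed

# GPU-seconds of compute per request: total TFLOPs / sustained TFLOPS
gpu_seconds = OI_TFLOP * IW * WC / C_TFLOPS            # ~2.08 s

# Operational energy and carbon per request
energy_kwh = TDP_KW * PUE * gpu_seconds / 3600         # ~0.27 Wh
operational_gco2 = energy_kwh * CARBON_INTENSITY * 1000

# Embodied carbon per request, amortized over back-to-back requests
# served during the assumed lifetime
lifetime_seconds = LIFETIME_YEARS * 365 * 24 * 3600
requests_per_gpu = lifetime_seconds / gpu_seconds
embodied_gco2 = E_HW_KGCO2 * 1000 / requests_per_gpu

print(f"GPU time per request:    {gpu_seconds:.2f} s")
print(f"Energy per request:      {energy_kwh * 1000:.2f} Wh")
print(f"Operational CO2:         {operational_gco2:.3f} gCO2/request")
print(f"Amortized embodied CO2:  {embodied_gco2:.4f} gCO2/request")
```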

Water Use

Data for A100:

  • Water consumption per wafer mask layer, from the TSMC 2022 ESG report: 137.3 L per 12-inch-equivalent wafer mask layer
  • Mask layers for the TSMC 7nm process: 87
  • A100 chips per wafer: the A100 die is 826 mm², which is similar to the H100’s 814 mm², and the H100 yields about 29 dies per 12-inch wafer, so assume ~29 A100 dies per wafer

Using the manufacturing water use formula:

(water use per chip) = (water use per wafer mask layer) × (mask layers) / (chips per wafer)

(water use per A100) = (137.3 L/layer) × (87 layers/wafer) / (29 chips/wafer) = 411.9 L/chip
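
The same arithmetic as a small sketch, with the chips-per-wafer figure (borrowed from the H100) exposed as a parameter so the assumption is easy to vary:

```python
# Minimal sketch of the manufacturing water-use estimate above, with the
# chips-per-wafer figure (borrowed from the H100's ~29 dies per 12-inch
# wafer) left as a parameter.

def water_per_chip(liters_per_wafer_layer: float,
                   mask_layers: int,
                   chips_per_wafer: int) -> float:
    """Estimated fab water use per chip, in liters."""
    return liters_per_wafer_layer * mask_layers / chips_per_wafer

# A100 estimate using the numbers above (TSMC 2022 ESG report, 7nm process)
print(f"~{water_per_chip(137.3, 87, 29):.1f} L per A100 die")   # ~411.9 L
```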

Note: this doesn’t include the memory chips that are also on the A100… need to find a source for the water use there