Carbon Footprint
From Andrew A. Chien, Liuzixuan Lin, Hai Nguyen, Varsha Rao, Tristan Sharma, and Rajini Wijayawardana. 2023. Reducing the Carbon Impact of Generative AI Inference (today and in 2035). In 2nd Workshop on Sustainable Computer Systems (HotCarbon '23), July 9, 2023, Boston, MA, USA. ACM, New York, NY, USA, 7 pages.
- P_DC = 0.428 kW per GPU (1/8 of 3.43 kW for the instance) x 1.1 PUE
- T_I = 0.35 is TFLOPs per inference, assuming a GPT-3 model (around 175 billion weights) processed with BF16 operations
- I_W = 5 is the number of inferences per output word (assumed window/sampling of 5 for each output word)
- W_C is the output word count (measured average of 185 output words per request)
- C = 156 TFLOPS is the GPU capacity, assuming 50% efficiency
- E_hw is the per-GPU embodied emission, calculated as 1/8 of the estimated per-instance emissions:
E_hw = 1/8 (P_F + E_GPU + E_CPU + E_DRAM + E_SSD + E_HDD)
where P_F is the IC packaging carbon footprint, while E_GPU, E_CPU, E_DRAM, E_SSD, and E_HDD are the GPU, CPU, memory, SSD, and HDD emissions, respectively. These emissions are estimated from previous reports [26] and instance hardware specifications [1, 3, 11], yielding E_hw = 318 kgCO2 per GPU.
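As a sanity check, the GPU time and operational energy per request implied by these parameters can be computed directly. This is a sketch with our own variable names; multiplying the resulting energy by a grid carbon intensity (not fixed in this excerpt) would give the operational carbon per request:

```python
# Per-request GPU time and energy from the parameters above.
# Variable names are ours, mirroring the paper's symbols.
P_DC = 0.428   # kW per GPU, including the 1.1 PUE factor
T_I = 0.35     # TFLOPs per inference (GPT-3-class model, BF16)
I_W = 5        # inferences per output word
W_C = 185      # measured average output words per request
C = 156        # TFLOPS GPU capacity at 50% efficiency

# GPU-seconds per request: total TFLOPs of work divided by TFLOPS capacity.
gpu_seconds = T_I * I_W * W_C / C          # roughly 2.1 s

# Operational energy per request, in kWh.
energy_kwh = P_DC * gpu_seconds / 3600     # roughly 2.5e-4 kWh

# Multiplying energy_kwh by a carbon intensity (gCO2/kWh) would yield
# operational gCO2 per request; embodied carbon (E_hw) amortizes separately.
```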
Water Use
Data for A100:
- Water consumption per wafer-layer, from the TSMC 2022 ESG report: 137.3 L per 12-inch-equivalent wafer mask layer
- Mask layers for the TSMC 7nm process: 87
- A100 chips per wafer: the A100 die is 826 mm², similar to the H100's 814 mm², and the H100 yields 29 sets per 12-inch wafer
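Combining the three data points above gives a rough estimate of fab water use per A100 die. This is a sketch under the stated assumptions: it carries the H100's 29 dies per wafer over to the A100 and ignores yield loss:

```python
# Estimated TSMC fab water use per A100 die (assumption-laden sketch).
water_per_layer_l = 137.3  # L per 12-inch-equivalent wafer mask layer (TSMC 2022 ESG)
mask_layers = 87           # TSMC 7nm process
dies_per_wafer = 29        # H100 yield per 12-inch wafer, used as an A100 proxy

water_per_wafer_l = water_per_layer_l * mask_layers    # ~11,945 L per wafer
water_per_die_l = water_per_wafer_l / dies_per_wafer   # ~412 L per die
```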