It’s no secret that the current AI boom is using up immense amounts of energy. Now we have a better idea of how much. 

A new paper, from a team at the Harvard T.H. Chan School of Public Health, examined 2,132 data centers operating in the United States (78% of all facilities in the country). These facilities—essentially buildings filled to the brim with rows of servers—are where AI models get trained, and they also get “pinged” every time we send a request through models like ChatGPT. They require huge amounts of energy both to power the servers and to keep them cool. 

Since 2018, carbon emissions from data centers in the US have tripled. For the 12 months ending August 2024, data centers were responsible for 105 million metric tons of CO2, accounting for 2.18% of national emissions (for comparison, domestic commercial airlines are responsible for about 131 million metric tons). About 4.59% of all the energy used in the US goes toward data centers, a figure that’s doubled since 2018.

It’s difficult to put a number on how much AI in particular, which has been booming since ChatGPT launched in November 2022, is responsible for this surge. That’s because data centers process lots of different types of data—in addition to training or pinging AI models, they do everything from hosting websites to storing your photos in the cloud. However, the researchers say, AI’s share is certainly growing rapidly as nearly every segment of the economy attempts to adopt the technology.

“It’s a pretty big surge,” says Eric Gimon, a senior fellow at the think tank Energy Innovation, who was not involved in the research. “There’s a lot of breathless analysis about how quickly this exponential growth could go. But it’s still early days for the business in terms of figuring out efficiencies, or different kinds of chips.”

Notably, the sources for all this power are particularly “dirty.” Since so many data centers are located in coal-producing regions, like Virginia, the “carbon intensity” of the energy they use is 48% higher than the national average. The paper, which was published on arXiv and has not yet been peer-reviewed, found that 95% of data centers in the US are built in places with sources of electricity that are dirtier than the national average. 

There are causes other than simply being located in coal country, says Falco Bargagli-Stoffi, an author of the paper. “Dirtier energy is available throughout the entire day,” he says, and plenty of data centers require that to maintain peak operation 24/7. “Renewable energy, like wind or solar, might not be as available.” Political or tax incentives, and local pushback, can also affect where data centers get built.

One key shift in AI right now means that the field’s emissions are soon likely to skyrocket. AI models are rapidly moving from fairly simple text generators like ChatGPT toward highly complex image, video, and music generators. Until now, many of these “multimodal” models have been stuck in the research phase, but that’s changing. 

OpenAI released its video generation model Sora to the public on December 9, and its website has been so flooded with traffic from people eager to test it out that it is still not functioning properly. Competing models, like Veo from Google and Movie Gen from Meta, have still not been released publicly, but if those companies follow OpenAI’s lead as they have in the past, they might be soon. Music generation models from Suno and Udio are growing (despite lawsuits), and Nvidia released its own audio generator last month. Google is working on its Astra project, which will be a video-AI companion that can converse with you about your surroundings in real time. 

“As we scale up to images and video, the data sizes increase exponentially,” says Gianluca Guidi, a PhD student in artificial intelligence at the University of Pisa and IMT Lucca, who is the paper’s lead author. Combine that with wider adoption, he says, and emissions will soon jump.

One of the goals of the researchers was to build a more reliable way to get snapshots of just how much energy data centers are using. That’s been a more complicated task than you might expect, given that the data is dispersed across a number of sources and agencies. They’ve now built a portal that shows data center emissions across the country. The long-term goal of the data pipeline is to inform future regulatory efforts to curb emissions from data centers, which are predicted to grow enormously in the coming years. 

“There’s going to be increased pressure, between the environmental and sustainability-conscious community and Big Tech,” says Francesca Dominici, director of the Harvard Data Science Initiative and another coauthor. “But my prediction is that there is not going to be regulation. Not in the next four years.”
