Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East

Hassan, ElZeftawy, Mahmoud

As the Middle East emerges as a strategic hub for artificial intelligence (AI) infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become a topic of growing relevance. This paper presents an empirical study analyzing the energy consumption and carbon footprint of large language model (LLM) inference across four countries: the United Arab Emirates, Iceland, Germany, and the United States of America, using DeepSeek Coder 1.3B and the HumanEval dataset on the task of code generation. We use the CodeCarbon library to track energy and carbon emissions and compare geographical trade-offs for climate-aware AI deployment. Our findings highlight both the challenges and potential of datacenters in desert regions and provide a balanced outlook on their role in global AI expansion.

Basic Information

Paper ID: 2511.17683
Title: Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East
Authors: Lara Hassan, Mohamed ElZeftawy, Abdulrahman Mahmoud (MBZUAI)
Classification: cs.CY (Computers and Society), cs.AI (Artificial Intelligence)
Publication Date: November 21, 2025 (arXiv preprint)
Paper Link: https://arxiv.org/abs/2511.17683

Abstract

As the Middle East emerges as a strategic hub for AI infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become an increasingly important issue. This paper presents an empirical study analyzing energy consumption and carbon footprint of large language model (LLM) inference across four countries (United Arab Emirates, Iceland, Germany, and the United States) using the DeepSeek Coder 1.3B model and HumanEval dataset for code generation tasks. The research employs the CodeCarbon library to track energy and carbon emissions, comparing geographic trade-offs in climate-aware AI deployment. The findings reveal both challenges and potential for desert datacenters, providing a balanced perspective on their role in global AI expansion.

Research Background and Motivation

1. Core Research Questions

This study focuses on the feasibility and sustainability of deploying AI datacenters in desert environments, particularly in the Middle East. Specific questions include:

Energy efficiency of datacenters under desert climate conditions
Carbon emission variations across different geographic locations
Trade-offs between economic costs and environmental impact

2. Significance of the Problem

Surging AI Computational Demand: AI computing capacity grows 10-fold every six months, placing enormous environmental pressure on datacenters
Middle East Strategic Development: The United Arab Emirates and Saudi Arabia have announced multi-gigawatt AI datacenter projects involving substantial investments
Global Infrastructure Diversification: Need to evaluate the role of emerging markets in global AI infrastructure
Sustainability Challenges: Extreme temperatures and fossil fuel-dominated power grids pose challenges to environmental sustainability

3. Limitations of Existing Research

Lack of empirical carbon emission studies for desert environment datacenters
Absence of systematic comparison of energy-cost-carbon trade-offs across different geographic locations
Insufficient assessment of sustainability potential for Middle Eastern datacenters

4. Research Motivation

Economic Incentives: Significantly lower electricity costs in the Middle East (some solar facilities in Abu Dhabi at only $0.014/kWh)
Policy Drivers: G42-NVIDIA partnership agreement (annual GPU quota of 500,000, with 20% retained locally)
Clean Energy Investment: 5GW AI park planned with hybrid power supply from solar, natural gas, and nuclear
Technological Innovation Needs: Advanced cooling technologies required to address extreme temperatures

Core Contributions

First Empirical Study of LLM Inference Carbon Footprint in the Middle East: Provides quantitative comparison between UAE datacenters and traditional cold-climate hubs (Iceland, Germany) and the United States
Multi-dimensional Trade-off Analysis Framework: Systematically evaluates geographic variations across three dimensions: energy consumption, carbon emissions, and operational costs
Real Workload Testing: Uses actual LLM inference tasks (DeepSeek Coder 1.3B + HumanEval) rather than theoretical models
Policy Insights: Provides data-driven recommendations for sustainable development pathways for Middle Eastern datacenters, including clean energy integration and advanced cooling technology adoption
Balanced Perspective: Acknowledges both challenges (high carbon emissions) and potential (low costs, rapid deployment capability, renewable energy potential) of desert datacenters

Methodology Details

Task Definition

Research Task: Quantitatively assess environmental impact and economic costs of executing identical LLM inference workloads across different geographic locations

Inputs:

Fixed hardware configuration (NVIDIA RTX 5000 ADA GPU + Intel Xeon w7-2495X CPU)
Standardized inference task (code generation using DeepSeek Coder 1.3B on HumanEval dataset)
Energy grid data for four geographic locations (2023)

Outputs:

Energy consumption (kWh)
Carbon emissions (kgCO2)
Operational costs (based on local electricity rates)

Experimental Design

1. Model Selection

DeepSeek Coder 1.3B: Large language model specifically designed for code generation
Rationale: Moderate scale, suitable for inference tasks, representative of practical applications

2. Dataset

HumanEval: Standard code generation evaluation benchmark
Purpose: Provides consistent inference workload

3. Monitoring Tools

CodeCarbon Library: Open-source carbon emission tracking tool
Capabilities:
- Monitors CPU, GPU, and RAM power consumption
- Calculates CO2 emissions based on regional power grid carbon intensity
- Uses publicly available datasets through 2023

4. Geographic Location Selection

The study selected four representative regions, with Texas standing in for the United States:

Region	Climate Characteristics	Energy Structure	Representativeness
UAE	Desert climate	Natural gas-dominated, emerging solar and nuclear	Emerging Middle East AI hub
Iceland	Subarctic climate	~100% renewable (geothermal + hydroelectric)	Sustainability benchmark
Germany	Temperate climate	Mixed grid (renewables + fossil fuels)	European representative
Texas	Semi-arid to humid subtropical	Diversified (wind, natural gas, solar)	Important US AI infrastructure region

Technical Innovations

1. Rigorous Application of Control Variable Method

Fixed Hardware: All experiments use identical hardware configuration
Consistent Workload: Same model, dataset, and task across all locations
Isolated Geographic Factors: Only variable is geographic location (grid carbon intensity and electricity rates)

2. Real-world Scenario Simulation

Uses actual running LLM inference tasks rather than synthetic workloads
Reflects computational patterns of real datacenters

3. Multi-dimensional Evaluation Framework

Considers not only carbon emissions but also:

Environmental impact (CO2 emissions)
Economic costs (electricity rates)
Energy efficiency (PUE values)
Infrastructure potential (deployment speed, scalability)

Experimental Setup

Dataset

HumanEval Dataset: Benchmark set containing 164 programming problems
Purpose: Evaluates functional correctness of code generation models
Processing: Complete dataset used for inference testing without train/validation/test splits

Hardware Configuration

GPU: NVIDIA RTX 5000 ADA Generation
CPU: Intel(R) Xeon(R) w7-2495X
Consistency Assurance: All regions simulated with identical hardware specifications

Evaluation Metrics

Energy Consumption (kWh)
- Measurement: Total power consumption of CPU, GPU, and RAM
- Significance: Direct energy cost of datacenter operations
Carbon Emissions (kgCO2)
- Calculation: Energy consumption × Regional power grid carbon intensity
- Significance: Core indicator of environmental impact
Electricity Cost ($/kWh)
- Data Source: Public electricity rate data for each region
- Significance: Assessment of operational economics
PUE (Power Usage Effectiveness)
- Definition: Total facility energy consumption / IT equipment energy consumption
- Significance: Datacenter efficiency metric
- Ideal Value: Close to 1.0 (all energy used for computation)
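
The emissions and PUE definitions above reduce to two short formulas. The following is a minimal sketch, not the paper's code: the function names are hypothetical, and the numeric inputs are placeholder values chosen for illustration, not measurements from the study.

```python
def emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Carbon emissions = energy consumption x grid carbon intensity."""
    return energy_kwh * intensity_kg_per_kwh


def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """PUE = total facility energy / IT equipment energy (ideal is close to 1.0)."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Placeholder inputs for illustration only (NOT values from the paper).
run_energy_kwh = 2.0
low_carbon_grid = 0.05   # kgCO2 per kWh, hypothetical
high_carbon_grid = 0.50  # kgCO2 per kWh, hypothetical

low = emissions_kg(run_energy_kwh, low_carbon_grid)
high = emissions_kg(run_energy_kwh, high_carbon_grid)
print(f"same energy, different grids: {low:.2f} vs {high:.2f} kgCO2")
print(f"PUE for 150 kWh facility / 100 kWh IT: {pue(150.0, 100.0):.2f}")
```

Because energy is held constant across regions, the emissions figure scales linearly with the grid intensity passed in, which is why the regional comparison depends only on the grid data.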

Implementation Details

Monitoring Frequency: Real-time monitoring of energy consumption during inference
Data Source: CodeCarbon library uses publicly available energy data through 2023
Simulation Method: Simulates different geographic locations by configuring CodeCarbon's regional parameters
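
A location-by-location simulation of this kind might look like the sketch below. It assumes CodeCarbon's `OfflineEmissionsTracker` with its `country_iso_code` parameter; the helper names (`measure_for_country`, `run_all`) and the country-level mapping are hypothetical, and the summary does not state how the authors configured the tracker (in particular, the sub-national Texas setting is not reproduced here).

```python
# ISO 3166-1 alpha-3 codes for the four countries named in the study.
COUNTRY_ISO = {
    "UAE": "ARE",
    "Iceland": "ISL",
    "Germany": "DEU",
    "United States": "USA",
}


def measure_for_country(iso_code, workload):
    """Run `workload` under an offline tracker pinned to one country.

    Returns the tracker's emissions estimate in kg CO2-equivalent.
    """
    # Imported lazily so the mapping above is usable without the package.
    from codecarbon import OfflineEmissionsTracker

    tracker = OfflineEmissionsTracker(country_iso_code=iso_code)
    tracker.start()
    try:
        workload()  # e.g. one pass of model inference over the benchmark
    finally:
        emissions = tracker.stop()
    return emissions


def run_all(workload):
    """Repeat the same workload once per country setting."""
    return {name: measure_for_country(code, workload)
            for name, code in COUNTRY_ISO.items()}
```

Since the hardware and workload are the same in every run, measured energy should be nearly identical, and the differences in the returned values come from the grid data associated with each country code.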

Experimental Results

Main Findings

1. Carbon Emissions Comparison (Figure 1 Key Findings)

Key Data:

Consistent Energy Consumption: Energy usage is identical across all regions, as expected with hardware and workload held fixed
Massive Carbon Emission Variations:
- UAE and Texas: Significantly higher carbon emissions than other regions
- Iceland: Nearly negligible emissions (~100% renewable energy)
- Germany: Intermediate level (partially decarbonized grid)
- UAE slightly higher than Texas

Order of Magnitude Differences: CO2 emissions in UAE are orders of magnitude higher than Iceland, highlighting the decisive role of grid composition on environmental impact

2. Electricity Cost Comparison

Region	Electricity Rate ($/kWh)	Cost Ranking	Relative to UAE
UAE	$0.077	Lowest	1.0×
Texas	$0.109	Second	1.42×
Iceland	$0.156	Third	2.03×
Germany	$0.323	Highest	4.19×

Key Findings:

UAE offers lowest operational costs, approximately 76% cheaper than Germany
For large-scale LLM inference, cost advantages may outweigh environmental disadvantages
Economic incentives may drive datacenter concentration toward low-cost regions
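
The relative-cost column and the 76% figure can be checked directly from the listed rates. This is a small arithmetic sketch using only the rates in the table above; the variable names are illustrative.

```python
# Electricity rates in USD per kWh, copied from the table above.
RATES_USD_PER_KWH = {
    "UAE": 0.077,
    "Texas": 0.109,
    "Iceland": 0.156,
    "Germany": 0.323,
}

baseline = RATES_USD_PER_KWH["UAE"]

# Each region's rate as a multiple of the UAE rate.
relative = {region: rate / baseline
            for region, rate in RATES_USD_PER_KWH.items()}

# Fraction by which the UAE rate undercuts the German rate.
savings_vs_germany = 1 - baseline / RATES_USD_PER_KWH["Germany"]

for region, ratio in relative.items():
    print(f"{region:8s} {ratio:.2f}x")
print(f"UAE vs Germany: {savings_vs_germany:.0%} cheaper")
```

The comparison covers electricity rates only; it says nothing about capital or cooling costs.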

3. PUE (Power Usage Effectiveness) Analysis

Desert Climate Challenges:

Traditional air cooling systems: PUE > 1.8 (extreme temperatures)
Advanced cooling technologies: PUE ≈ 1.3-1.5
- Evaporative cooling
- Liquid immersion cooling
- Seawater cooling systems

Middle East Improvement Targets:

Major cloud and hosting providers target: PUE < 1.5
Local deployments have achieved PUE reductions exceeding 0.4
Implementation of hot/cold aisle containment, liquid cooling, and AI-optimized HVAC systems
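
To see what these PUE ranges mean in energy terms, the sketch below scales a normalized IT load by each PUE value quoted above. The 1.0 kWh IT load is an illustrative assumption, and 1.8 is treated as the air-cooling baseline even though the text gives it as a lower bound, so the savings shown are conservative.

```python
def facility_energy_kwh(it_energy_kwh: float, pue_value: float) -> float:
    """Total facility energy implied by a PUE, from PUE = total / IT."""
    return it_energy_kwh * pue_value


IT_ENERGY_KWH = 1.0  # normalized IT load; illustrative assumption

# PUE values quoted in the section above.
SCENARIOS = {
    "traditional air cooling": 1.8,
    "advanced cooling (upper end)": 1.5,
    "advanced cooling (lower end)": 1.3,
}

baseline = facility_energy_kwh(IT_ENERGY_KWH, SCENARIOS["traditional air cooling"])

for label, pue_value in SCENARIOS.items():
    total = facility_energy_kwh(IT_ENERGY_KWH, pue_value)
    saving = 1 - total / baseline  # fraction saved relative to air cooling
    print(f"{label}: {total:.2f} kWh total, {saving:.1%} saved")
```

Because emissions are proportional to total energy on a given grid, the same fractions apply to facility-level emissions.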

Experimental Findings

Finding 1: Grid Composition is the Decisive Factor

With identical energy consumption, carbon emissions are entirely determined by grid carbon intensity, not datacenter efficiency itself.

Finding 2: Fundamental Trade-off Between Cost and Sustainability

Most Environmentally Friendly ≠ Most Economical: Iceland is the cleanest, but its electricity rate ($0.156/kWh) is about twice the UAE's
Most Economical = High Carbon Emissions: UAE cheapest but high emissions
This trade-off is critical for AI infrastructure decisions

Finding 3: Necessity of Region-Specific Strategies

No "one-size-fits-all" optimal solution exists; deployment location selection must align with organizational priorities (cost vs. environment).
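
The cost-versus-emissions trade-off can be framed as a Pareto comparison. The sketch below is one way to do that, under stated assumptions: the rates come from the electricity cost table, while the emission ranks are ordinal only, inferred from the qualitative ordering reported above (Iceland lowest, then Germany, then Texas, then UAE). The `Region` type and function names are hypothetical.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    rate_usd_per_kwh: float  # from the electricity cost table
    emission_rank: int       # 1 = lowest emissions; ordinal, not measured


REGIONS = [
    Region("UAE", 0.077, 4),
    Region("Texas", 0.109, 3),
    Region("Germany", 0.323, 2),
    Region("Iceland", 0.156, 1),
]


def dominates(a: Region, b: Region) -> bool:
    """True if `a` is no worse than `b` on both axes and better on one."""
    no_worse = (a.rate_usd_per_kwh <= b.rate_usd_per_kwh
                and a.emission_rank <= b.emission_rank)
    better = (a.rate_usd_per_kwh < b.rate_usd_per_kwh
              or a.emission_rank < b.emission_rank)
    return no_worse and better


def pareto_front(regions):
    """Regions that no other region beats on both cost and emissions."""
    return [r for r in regions
            if not any(dominates(other, r) for other in regions)]


front = [r.name for r in pareto_front(REGIONS)]
print(front)
```

Under these inputs, UAE, Texas, and Iceland all remain on the front, so none of them is best on both axes; Germany drops out because Iceland is both cheaper and cleaner. Which point on the front to choose depends on how an organization weighs cost against emissions.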

Finding 4: Potential for Clean Energy Integration

Despite current high carbon emissions, Middle Eastern regions have improvement potential through:

Natural alignment between solar availability and cooling demand peaks (daytime solar peak = cooling demand peak)
Nuclear energy providing stable baseload
Ongoing large-scale clean energy investments

Related Work

1. Datacenter Sustainability Research

The paper cites industry reports on Middle Eastern datacenter markets (PwC, Mordor Intelligence), emphasizing growth trends in regional datacenter opportunities and cooling technology markets.

2. AI Environmental Impact Assessment

The emergence of tools like CodeCarbon has enabled precise tracking of AI workload carbon footprints, with this research representing an application of such tools in geographic comparison studies.

3. Regional AI Infrastructure Development

SemiAnalysis reports detail the trilateral agreement between the US, UAE, and Saudi Arabia, including:

G42's annual NVIDIA GPU quota of 500,000
20% local retention for regional AI development
5GW AI park planning

4. LLM Code Generation Evaluation

DeepSeek Coder: Specialized code intelligence model
HumanEval: OpenAI-developed standard code generation benchmark

Unique Contributions of This Paper

Compared to existing work, this paper is the first to:

Combine environmental impact of LLM inference with Middle Eastern datacenter feasibility
Provide multi-region empirical carbon emission comparison data
Integrate economic, environmental, and infrastructure factors

Conclusions and Discussion

Main Conclusions

1. Existence of Fundamental Trade-offs

Environment vs. Economics:

Iceland: Most sustainable, but its electricity rate is about twice the UAE's
UAE/Texas: Most economically attractive but high emissions
No single solution optimizes both dimensions simultaneously

2. Dual Nature of Middle Eastern Datacenters

Challenges:

Current grid dominated by fossil fuels
Extreme temperatures increase cooling burden
Significantly higher carbon emissions than cold-climate regions

Potential:

Lowest electricity costs ($0.077/kWh)
Large-scale clean energy investments underway
Rapid deployment capability and policy support
Abundant solar resources

3. Sustainable Development Pathways are Feasible

LLM deployment sustainability in the Middle East is not a question of "if possible" but "how to implement responsibly":

Co-deployment with solar and nuclear facilities
Adoption of advanced cooling technologies
Continuous energy efficiency improvements

4. Value of Global Infrastructure Diversification

Geographic Resilience: Distributed risk, enhanced global AI infrastructure stability
Latency Optimization: Serve rapidly growing regional markets
Capacity Supplementation: Alleviate regulatory and land constraints in Western markets

Limitations

1. Simulation Method Limitations

Uses 2023 data, not reflecting latest grid improvements
Simulation rather than field measurement may introduce bias
Does not account for complex operational conditions of actual datacenters

2. Single Model and Task

Tests only one model, DeepSeek Coder 1.3B
Larger models (70B+ parameters) may show different performance
Evaluates inference only, excludes training workloads

3. Lack of Temporal Dimension

Static snapshot, does not evaluate seasonal variations
Does not predict impact of future grid decarbonization
Lacks long-term trend analysis

4. Unquantified Trade-offs

Water consumption (evaporative cooling) not detailed
Land use efficiency not compared
Infrastructure complexity not quantified

5. Limited Geographic Coverage

Evaluates only four regions, excludes other important AI markets (China, Singapore, etc.)

Future Directions

1. Technological Innovation

Next-generation Cooling Systems: Further development of modular and liquid immersion cooling
AI-optimized Energy Management: Using AI to optimize datacenter energy consumption
Renewable Energy Integration: Intelligent scheduling of on-site solar generation with AI workloads

2. Policy and Incentives

Regional cooperation to accelerate low-carbon datacenter development
Carbon credit and offset mechanisms
Sustainability certification and standards

3. Research Extensions

Evaluation of larger-scale models
Carbon footprint research for training workloads
Seasonal and temporal variation analysis
Detailed water resource impact assessment

4. Capacity Forecasting

Middle East projected to contribute over 6GW additional capacity by 2030
Requires continuous monitoring and sustainability progress evaluation

In-Depth Evaluation

Strengths

1. Rigorous Research Design

Control Variable Method: Fixed hardware and workload effectively isolate geographic factors
Real Workloads: Uses actual LLM inference tasks rather than synthetic benchmarks
Multi-dimensional Assessment: Considers not only carbon emissions but also costs and efficiency

2. High Practical Application Value

Policy Reference: Provides data support for Middle Eastern AI infrastructure investment
Business Decision Guidance: Helps enterprises balance cost and sustainability
Technology Roadmap Recommendations: Clarifies importance of cooling technology and clean energy

3. Balanced and Objective Perspective

Neither overly pessimistic nor blindly optimistic
Acknowledges challenges while demonstrating potential
Avoids simplistic binary "good/bad" judgments

4. Strong Timeliness

Tracks latest Middle Eastern AI infrastructure developments (G42-NVIDIA agreement)
Addresses current hot topics in AI energy consumption
Uses relatively recent data (2023 power grid data)

5. Method Reproducibility

Uses open-source tools (CodeCarbon)
Clear experimental setup description
Transparent data sources

Weaknesses

1. Limited Experimental Scale

Single Model: Tests only 1.3B parameter model with limited representativeness
- Current mainstream models range from 70B-405B parameters
- Small model energy patterns may differ significantly from large models
Short-term Testing: Single inference task, does not evaluate long-term operations

2. Lack of Field Verification

All data based on simulation, not measured in actual Middle Eastern datacenters
PUE values cited from industry reports, not original measurements
Cooling system effectiveness lacks first-hand data

3. Insufficient Analysis Depth

Water Resources: Mentions evaporative cooling water consumption but does not quantify
- Critical limiting factor in water-scarce desert regions
Peak Load: Does not analyze grid peak-valley differences
Reliability: Does not evaluate extreme weather impact on operations

4. Simplified Economic Analysis

Considers only electricity rates, excludes:
- Infrastructure construction costs
- Operations and maintenance personnel costs
- Land costs
- Cooling system capital expenditure
Does not calculate total cost of ownership (TCO)

5. Lack of Dynamic Perspective

Grid decarbonization process not modeled
Does not predict changes across different time scales
Seasonal effects not considered (summer vs. winter)

6. Geographic Selection Representativeness Issues

Texas represents US, but states vary significantly
Excludes major Asian AI markets (China, Singapore)
Lacks other desert climate comparisons (Australia, Chile)

Impact Assessment

1. Academic Contribution

Pioneering: First systematic study of Middle Eastern LLM inference carbon footprint
Methodology: Provides reproducible framework for geographic comparison research
Data Contribution: Fills gap in environmental impact data for the region

2. Practical Value

Highly Relevant: Directly serves multi-billion-dollar investment decisions
Policy Impact: May influence UAE and Saudi Arabia datacenter policies
Enterprise Application: Helps tech companies optimize global datacenter layouts

3. Impact Limitations from Weaknesses

Single model size limits conclusion generalizability
Lack of field data reduces credibility
Simplified economic analysis may lead to decision bias

4. Reproducibility

Open-source Tools: CodeCarbon freely available
Clear Methods: Experiment setup sufficiently described
Accessible Data: Uses public datasets
Challenge: Exact replication requires the same hardware configuration, which is costly

Applicable Scenarios

1. Directly Applicable Scenarios

LLM Inference Services: Code generation, text generation, and other inference-intensive applications
Small-scale Model Deployment: Models in 1-10B parameter range
Geographic Site Selection: Initial assessment for datacenter location

2. Scenarios Requiring Adjustment

Large-scale Models (70B+): Energy patterns may differ, requires additional verification
Training Workloads: Energy characteristics significantly different from inference
Mixed Workloads: Actual datacenters run multiple task types

3. Inapplicable Scenarios

Edge Computing: Small distributed deployments
Real-time Systems: Applications extremely sensitive to latency
Non-AI Workloads: Traditional cloud computing services

4. Extension Application Potential

Other Desert Regions: Methods transferable to similar climate zones
Other AI Tasks: Image generation, speech recognition, etc.
Comprehensive Evaluation Framework: Can serve as foundation for more complete assessment

Summary

This is a timely and important study that fills a gap in environmental impact assessment for Middle Eastern AI infrastructure. The research design is rigorous, effectively using control variable methods to isolate geographic factors, reaching clear conclusions: Middle Eastern datacenters are highly competitive economically but currently have higher carbon emissions; future sustainability depends on clean energy integration and cooling technology innovation.

Main Strengths include balanced objective perspective, high practical application value, and method reproducibility. The research not only identifies problems but also demonstrates solutions, providing valuable reference for policymakers and business decision-makers.

Main Limitations include limited experimental scale (single small model), lack of field verification, simplified economic analysis, and insufficient assessment of critical factors like water resources. These shortcomings somewhat limit conclusion generalizability and credibility.

Future Research Directions should include:

Extension to larger-scale models and training workloads
Field measurements in actual Middle Eastern datacenters
Detailed assessment of water consumption and other environmental impacts
Dynamic modeling to predict clean energy transition effects
Comprehensive total cost of ownership analysis

Overall, this is a high-quality applied research paper providing valuable empirical data and insights for a rapidly developing field with major economic and environmental significance. As Middle Eastern AI infrastructure investment continues to grow, the value of this research will become increasingly apparent.

Key References Cited in the Paper

PwC Middle East (2025): "Unlocking the data centre opportunity in the Middle East" - Middle East datacenter market analysis
SemiAnalysis (2025): "AI Arrives in the Middle East: US Strikes a Deal with UAE and KSA" - Details of US-UAE-Saudi trilateral AI agreement
Mordor Intelligence (2025): Middle East datacenter cooling market size and trend forecasting report
Guo et al. (2024): "DeepSeek-coder: When the large language model meets programming" - Code generation model used in this research
Chen et al. (2021): "Evaluating large language models trained on code" - Original HumanEval dataset paper
CodeCarbon (2024): v2.4.1 - Open-source carbon emission tracking library