Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East

Hassan, ElZeftawy, Mahmoud
As the Middle East emerges as a strategic hub for artificial intelligence (AI) infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become a topic of growing relevance. This paper presents an empirical study analyzing the energy consumption and carbon footprint of large language model (LLM) inference across four countries (the United Arab Emirates, Iceland, Germany, and the United States of America), using DeepSeek Coder 1.3B and the HumanEval dataset on the task of code generation. We use the CodeCarbon library to track energy and carbon emissions and compare geographical trade-offs for climate-aware AI deployment. Our findings highlight both the challenges and potential of datacenters in desert regions and provide a balanced outlook on their role in global AI expansion.

Basic Information

  • Paper ID: 2511.17683
  • Title: Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East
  • Authors: Lara Hassan, Mohamed ElZeftawy, Abdulrahman Mahmoud (MBZUAI)
  • Classification: cs.CY (Computers and Society), cs.AI (Artificial Intelligence)
  • Publication Date: November 21, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2511.17683

Abstract

As the Middle East emerges as a strategic hub for AI infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become an increasingly important issue. This paper presents an empirical study analyzing energy consumption and carbon footprint of large language model (LLM) inference across four countries (United Arab Emirates, Iceland, Germany, and the United States) using the DeepSeek Coder 1.3B model and HumanEval dataset for code generation tasks. The research employs the CodeCarbon library to track energy and carbon emissions, comparing geographic trade-offs in climate-aware AI deployment. The findings reveal both challenges and potential for desert datacenters, providing a balanced perspective on their role in global AI expansion.

Research Background and Motivation

1. Core Research Questions

This study focuses on the feasibility and sustainability of deploying AI datacenters in desert environments, particularly in the Middle East. Specific questions include:

  • Energy efficiency of datacenters under desert climate conditions
  • Carbon emission variations across different geographic locations
  • Trade-offs between economic costs and environmental impact

2. Significance of the Problem

  • Surging AI Computational Demand: AI computing capacity grows 10-fold every six months, placing enormous environmental pressure on datacenters
  • Middle East Strategic Development: The United Arab Emirates and Saudi Arabia have announced multi-gigawatt AI datacenter projects involving substantial investments
  • Global Infrastructure Diversification: Need to evaluate the role of emerging markets in global AI infrastructure
  • Sustainability Challenges: Extreme temperatures and fossil fuel-dominated power grids pose challenges to environmental sustainability

3. Limitations of Existing Research

  • Lack of empirical carbon emission studies for desert environment datacenters
  • Absence of systematic comparison of energy-cost-carbon trade-offs across different geographic locations
  • Insufficient assessment of sustainability potential for Middle Eastern datacenters

4. Research Motivation

  • Economic Incentives: Significantly lower electricity costs in the Middle East (some solar facilities in Abu Dhabi at only $0.014/kWh)
  • Policy Drivers: G42-NVIDIA partnership agreement (annual GPU quota of 500,000, with 20% retained locally)
  • Clean Energy Investment: 5GW AI park planned with hybrid power supply from solar, natural gas, and nuclear
  • Technological Innovation Needs: Advanced cooling technologies required to address extreme temperatures

Core Contributions

  1. First Empirical Study of LLM Inference Carbon Footprint in the Middle East: Provides quantitative comparison between UAE datacenters and traditional cold-climate hubs (Iceland, Germany) and the United States
  2. Multi-dimensional Trade-off Analysis Framework: Systematically evaluates geographic variations across three dimensions: energy consumption, carbon emissions, and operational costs
  3. Real Workload Testing: Uses actual LLM inference tasks (DeepSeek Coder 1.3B + HumanEval) rather than theoretical models
  4. Policy Insights: Provides data-driven recommendations for sustainable development pathways for Middle Eastern datacenters, including clean energy integration and advanced cooling technology adoption
  5. Balanced Perspective: Acknowledges both challenges (high carbon emissions) and potential (low costs, rapid deployment capability, renewable energy potential) of desert datacenters

Methodology Details

Task Definition

Research Task: Quantitatively assess environmental impact and economic costs of executing identical LLM inference workloads across different geographic locations

Inputs:

  • Fixed hardware configuration (NVIDIA RTX 5000 ADA GPU + Intel Xeon w7-2495X CPU)
  • Standardized inference task (code generation using DeepSeek Coder 1.3B on HumanEval dataset)
  • Energy grid data for four geographic locations (2023)

Outputs:

  • Energy consumption (kWh)
  • Carbon emissions (kgCO2)
  • Operational costs (based on local electricity rates)

Experimental Design

1. Model Selection

  • DeepSeek Coder 1.3B: Large language model specifically designed for code generation
  • Rationale: Moderate scale, suitable for inference tasks, representative of practical applications

2. Dataset

  • HumanEval: Standard code generation evaluation benchmark
  • Purpose: Provides consistent inference workload

3. Monitoring Tools

  • CodeCarbon Library: Open-source carbon emission tracking tool
  • Capabilities:
    • Monitors CPU, GPU, and RAM power consumption
    • Calculates CO2 emissions based on regional power grid carbon intensity
    • Uses publicly available datasets through 2023

4. Geographic Location Selection

The study selected four representative regions:

Region | Climate Characteristics | Energy Structure | Representativeness
UAE | Desert climate | Natural gas-dominated, emerging solar and nuclear | Emerging Middle East AI hub
Iceland | Subarctic climate | ~100% renewable (geothermal + hydroelectric) | Sustainability benchmark
Germany | Temperate climate | Mixed grid (renewables + fossil fuels) | European representative
Texas | Semi-arid to humid subtropical | Diversified (wind, natural gas, solar) | Important US AI infrastructure region

Technical Innovations

1. Rigorous Application of Control Variable Method

  • Fixed Hardware: All experiments use identical hardware configuration
  • Consistent Workload: Same model, dataset, and task across all locations
  • Isolated Geographic Factors: Only variable is geographic location (grid carbon intensity and electricity rates)

2. Real-world Scenario Simulation

  • Uses actual running LLM inference tasks rather than synthetic workloads
  • Reflects computational patterns of real datacenters

3. Multi-dimensional Evaluation Framework

Considers not only carbon emissions but also:

  • Environmental impact (CO2 emissions)
  • Economic costs (electricity rates)
  • Energy efficiency (PUE values)
  • Infrastructure potential (deployment speed, scalability)

Experimental Setup

Dataset

  • HumanEval Dataset: Benchmark set containing 164 programming problems
  • Purpose: Evaluates functional correctness of code generation models
  • Processing: Complete dataset used for inference testing without train/validation/test splits

Hardware Configuration

  • GPU: NVIDIA RTX 5000 ADA Generation
  • CPU: Intel(R) Xeon(R) w7-2495X
  • Consistency Assurance: All regions simulated with identical hardware specifications

Evaluation Metrics

  1. Energy Consumption (kWh)
    • Measurement: Total energy drawn by CPU, GPU, and RAM during the run
    • Significance: Direct energy cost of datacenter operations
  2. Carbon Emissions (kgCO2)
    • Calculation: Energy consumption × Regional power grid carbon intensity
    • Significance: Core indicator of environmental impact
  3. Electricity Cost ($/kWh)
    • Data Source: Public electricity rate data for each region
    • Significance: Assessment of operational economics
  4. PUE (Power Usage Effectiveness)
    • Definition: Total facility energy consumption / IT equipment energy consumption
    • Significance: Datacenter efficiency metric
    • Ideal Value: Close to 1.0 (all energy used for computation)
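
The four metrics above reduce to a few multiplications. The following is a minimal sketch of that arithmetic, not the paper's implementation: the function and field names are hypothetical, and applying PUE as a multiplier on the measured IT energy is an assumption (the default `pue=1.0` reproduces the plain "energy × carbon intensity" formula stated above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunMetrics:
    energy_kwh: float    # facility-level energy for the run
    emissions_kg: float  # kg CO2 attributed to the run
    cost_usd: float      # electricity cost of the run


def evaluate_run(it_energy_kwh, carbon_intensity_kg_per_kwh,
                 rate_usd_per_kwh, pue=1.0):
    """Derive emissions and cost from measured IT energy.

    Assumption: facility energy = IT energy * PUE. With pue=1.0 the
    result is simply energy * grid carbon intensity and energy * rate.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    facility_energy = it_energy_kwh * pue
    return RunMetrics(
        energy_kwh=facility_energy,
        emissions_kg=facility_energy * carbon_intensity_kg_per_kwh,
        cost_usd=facility_energy * rate_usd_per_kwh,
    )
```

For example, `evaluate_run(2.0, 0.5, 0.077)` uses the UAE rate from this summary together with a placeholder intensity of 0.5 kg/kWh, which is illustrative only and not a figure from the paper.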

Implementation Details

  • Monitoring Frequency: Real-time monitoring of energy consumption during inference
  • Data Source: CodeCarbon library uses publicly available energy data through 2023
  • Simulation Method: Simulates different geographic locations by configuring CodeCarbon's regional parameters
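
One way the simulation method could look in practice is sketched below. This is an assumption about the setup, not the authors' code: it uses CodeCarbon's `OfflineEmissionsTracker` with its `country_iso_code` parameter, and the `COUNTRY_ISO` mapping and `measure_emissions` helper are hypothetical names introduced for illustration. The paper's tables report Texas rather than the whole United States; how the authors selected that sub-region is not described here, so the sketch stays at country level.

```python
# ISO 3166-1 alpha-3 codes for the four countries named in the study.
COUNTRY_ISO = {
    "United Arab Emirates": "ARE",
    "Iceland": "ISL",
    "Germany": "DEU",
    "United States": "USA",
}


def measure_emissions(workload, country_iso_code):
    """Run workload() under an offline CodeCarbon tracker.

    Returns the tracker's emissions estimate in kg CO2-equivalent,
    computed with the grid carbon intensity of the given country.
    """
    # Imported lazily so the mapping above works without CodeCarbon installed.
    from codecarbon import OfflineEmissionsTracker

    tracker = OfflineEmissionsTracker(country_iso_code=country_iso_code)
    tracker.start()
    try:
        workload()
    finally:
        # Always stop the tracker, even if the workload raises.
        emissions_kg = tracker.stop()
    return emissions_kg
```

Calling `measure_emissions(run_humaneval, COUNTRY_ISO["Iceland"])`, where `run_humaneval` is a hypothetical function that runs the 164-problem inference pass, would repeat the same workload under a different grid profile while the hardware stays fixed.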

Experimental Results

Main Findings

1. Carbon Emissions Comparison (Figure 1 Key Findings)

Key Data:

  • Consistent Energy Consumption: Energy usage identical across all regions (control variables effective)
  • Massive Carbon Emission Variations:
    • UAE and Texas: Significantly higher carbon emissions than other regions
    • Iceland: Nearly negligible emissions (~100% renewable energy)
    • Germany: Intermediate level (partially decarbonized grid)
    • UAE slightly higher than Texas

Order of Magnitude Differences: CO2 emissions in UAE are orders of magnitude higher than Iceland, highlighting the decisive role of grid composition on environmental impact

2. Electricity Cost Comparison

Region | Electricity Rate ($/kWh) | Cost Ranking | Relative to UAE
UAE | $0.077 | Lowest | 1.0×
Texas | $0.109 | Second | 1.42×
Iceland | $0.156 | Third | 2.03×
Germany | $0.323 | Highest | 4.19×

Key Findings:

  • UAE offers lowest operational costs, approximately 76% cheaper than Germany
  • For large-scale LLM inference, cost advantages may outweigh environmental disadvantages
  • Economic incentives may drive datacenter concentration toward low-cost regions
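
The "Relative to UAE" column and the 76% figure can be rechecked directly from the listed rates. The short sketch below does that; the helper names are hypothetical and the only inputs are the four rates from the table.

```python
# Electricity rates in USD per kWh, as listed in the table above.
RATES_USD_PER_KWH = {
    "UAE": 0.077,
    "Texas": 0.109,
    "Iceland": 0.156,
    "Germany": 0.323,
}


def relative_to(baseline, rates):
    """Express each region's rate as a multiple of the baseline region's rate."""
    base = rates[baseline]
    return {region: round(rate / base, 2) for region, rate in rates.items()}


def percent_cheaper(cheap, expensive, rates):
    """Percentage by which the cheap region undercuts the expensive one."""
    return round((1 - rates[cheap] / rates[expensive]) * 100, 1)
```

`relative_to("UAE", RATES_USD_PER_KWH)` returns 1.0, 1.42, 2.03, and 4.19 for the four regions, and `percent_cheaper("UAE", "Germany", RATES_USD_PER_KWH)` returns 76.2, consistent with the "approximately 76%" stated above.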

3. PUE (Power Usage Effectiveness) Analysis

Desert Climate Challenges:

  • Traditional air cooling systems: PUE > 1.8 (extreme temperatures)
  • Advanced cooling technologies: PUE ≈ 1.3-1.5
    • Evaporative cooling
    • Liquid immersion cooling
    • Seawater cooling systems

Middle East Improvement Targets:

  • Major cloud and hosting providers target: PUE < 1.5
  • Local deployments have achieved PUE reductions exceeding 0.4
  • Implementation of hot/cold aisle containment, liquid cooling, and AI-optimized HVAC systems
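
To make the PUE ranges concrete, the sketch below converts a PUE value into facility energy and overhead share. It relies only on the standard definition (total facility energy divided by IT energy); the function names are hypothetical, and the 1.8 and 1.4 inputs in the note that follows are taken from the ranges quoted above rather than from measurements.

```python
def facility_energy_kwh(it_energy_kwh, pue):
    """Total facility energy implied by an IT load and a PUE value."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_energy_kwh * pue


def overhead_share(pue):
    """Fraction of facility energy that does not reach IT equipment."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1 - 1 / pue


def savings_from_pue_drop(pue_before, pue_after):
    """Fractional cut in facility energy for the same IT load."""
    return 1 - pue_after / pue_before
```

At a PUE of 1.8, `overhead_share(1.8)` is about 0.44, so roughly 44% of facility energy goes to cooling and other overhead. A 0.4 reduction from 1.8 to 1.4 gives `savings_from_pue_drop(1.8, 1.4)` of about 0.22, that is, about 22% less facility energy for the same compute.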

Experimental Findings

Finding 1: Grid Composition is the Decisive Factor

With identical energy consumption, carbon emissions are entirely determined by grid carbon intensity, not datacenter efficiency itself.
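
A short derivation shows why this follows from the emissions formula in the Evaluation Metrics section. The symbols are introduced here for illustration and are not the paper's notation: $E$ is the energy consumed by the run (identical in every region), $I_r$ is the grid carbon intensity of region $r$, and $C_r$ is the resulting emissions.

```latex
C_r = E \cdot I_r
\quad\Longrightarrow\quad
\frac{C_a}{C_b} = \frac{E \cdot I_a}{E \cdot I_b} = \frac{I_a}{I_b}
```

Because $E$ cancels, the emissions ratio between any two regions $a$ and $b$ equals the ratio of their grid intensities, regardless of how much energy the workload uses.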

Finding 2: Fundamental Trade-off Between Cost and Sustainability

  • Most Environmentally Friendly ≠ Most Economical: Iceland is cleanest but costs about twice the UAE rate ($0.156 vs. $0.077/kWh)
  • Most Economical = High Carbon Emissions: UAE cheapest but high emissions
  • This trade-off is critical for AI infrastructure decisions

Finding 3: Necessity of Region-Specific Strategies

No "one-size-fits-all" optimal solution exists; deployment location selection must align with organizational priorities (cost vs. environment).

Finding 4: Potential for Clean Energy Integration

Despite current high carbon emissions, Middle Eastern regions have improvement potential through:

  • Natural alignment between solar availability and cooling demand peaks (daytime solar peak = cooling demand peak)
  • Nuclear energy providing stable baseload
  • Ongoing large-scale clean energy investments

Related Work

1. Datacenter Sustainability Research

The paper cites industry reports on Middle Eastern datacenter markets (PwC, Mordor Intelligence), emphasizing growth trends in regional datacenter opportunities and cooling technology markets.

2. AI Environmental Impact Assessment

The emergence of tools like CodeCarbon has enabled precise tracking of AI workload carbon footprints, with this research representing an application of such tools in geographic comparison studies.

3. Regional AI Infrastructure Development

SemiAnalysis reports detail the trilateral agreement between the US, UAE, and Saudi Arabia, including:

  • G42's annual NVIDIA GPU quota of 500,000
  • 20% local retention for regional AI development
  • 5GW AI park planning

4. LLM Code Generation Evaluation

  • DeepSeek Coder: Specialized code intelligence model
  • HumanEval: OpenAI-developed standard code generation benchmark

Unique Contributions of This Paper

Compared to existing work, this paper is the first to:

  1. Combine environmental impact of LLM inference with Middle Eastern datacenter feasibility
  2. Provide multi-region empirical carbon emission comparison data
  3. Integrate economic, environmental, and infrastructure factors

Conclusions and Discussion

Main Conclusions

1. Existence of Fundamental Trade-offs

Environment vs. Economics:

  • Iceland: Most sustainable, but electricity costs about twice the UAE rate (Germany is the most expensive at $0.323/kWh)
  • UAE/Texas: Most economically attractive but high emissions
  • No single solution optimizes both dimensions simultaneously

2. Dual Nature of Middle Eastern Datacenters

Challenges:

  • Current grid dominated by fossil fuels
  • Extreme temperatures increase cooling burden
  • Significantly higher carbon emissions than cold-climate regions

Potential:

  • Lowest electricity costs ($0.077/kWh)
  • Large-scale clean energy investments underway
  • Rapid deployment capability and policy support
  • Abundant solar resources

3. Sustainable Development Pathways are Feasible

LLM deployment sustainability in the Middle East is not a question of "if possible" but "how to implement responsibly":

  • Co-deployment with solar and nuclear facilities
  • Adoption of advanced cooling technologies
  • Continuous energy efficiency improvements

4. Value of Global Infrastructure Diversification

  • Geographic Resilience: Distributed risk, enhanced global AI infrastructure stability
  • Latency Optimization: Serve rapidly growing regional markets
  • Capacity Supplementation: Alleviate regulatory and land constraints in Western markets

Limitations

1. Simulation Method Limitations

  • Uses 2023 data, not reflecting latest grid improvements
  • Simulation rather than field measurement may introduce bias
  • Does not account for complex operational conditions of actual datacenters

2. Single Model and Task

  • Tests only DeepSeek Coder 1.3B
  • Larger models (70B+ parameters) may show different energy profiles
  • Evaluates inference only, excludes training workloads

3. Lack of Temporal Dimension

  • Static snapshot, does not evaluate seasonal variations
  • Does not predict impact of future grid decarbonization
  • Lacks long-term trend analysis

4. Unquantified Trade-offs

  • Water consumption (evaporative cooling) not detailed
  • Land use efficiency not compared
  • Infrastructure complexity not quantified

5. Limited Geographic Coverage

Evaluates only four regions, excludes other important AI markets (China, Singapore, etc.)

Future Directions

1. Technological Innovation

  • Next-generation Cooling Systems: Further development of modular and liquid immersion cooling
  • AI-optimized Energy Management: Using AI to optimize datacenter energy consumption
  • Renewable Energy Integration: Intelligent scheduling of on-site solar generation with AI workloads

2. Policy and Incentives

  • Regional cooperation to accelerate low-carbon datacenter development
  • Carbon credit and offset mechanisms
  • Sustainability certification and standards

3. Research Extensions

  • Evaluation of larger-scale models
  • Carbon footprint research for training workloads
  • Seasonal and temporal variation analysis
  • Detailed water resource impact assessment

4. Capacity Forecasting

  • Middle East projected to contribute over 6GW additional capacity by 2030
  • Requires continuous monitoring and sustainability progress evaluation

In-Depth Evaluation

Strengths

1. Rigorous Research Design

  • Control Variable Method: Fixed hardware and workload effectively isolate geographic factors
  • Real Workloads: Uses actual LLM inference tasks rather than synthetic benchmarks
  • Multi-dimensional Assessment: Considers not only carbon emissions but also costs and efficiency

2. High Practical Application Value

  • Policy Reference: Provides data support for Middle Eastern AI infrastructure investment
  • Business Decision Guidance: Helps enterprises balance cost and sustainability
  • Technology Roadmap Recommendations: Clarifies importance of cooling technology and clean energy

3. Balanced and Objective Perspective

  • Neither overly pessimistic nor blindly optimistic
  • Acknowledges challenges while demonstrating potential
  • Avoids simplistic binary "good/bad" judgments

4. Strong Timeliness

  • Tracks latest Middle Eastern AI infrastructure developments (G42-NVIDIA agreement)
  • Addresses current hot topics in AI energy consumption
  • Uses relatively recent data (2023 power grid data)

5. Method Reproducibility

  • Uses open-source tools (CodeCarbon)
  • Clear experimental setup description
  • Transparent data sources

Weaknesses

1. Limited Experimental Scale

  • Single Model: Tests only 1.3B parameter model with limited representativeness
    • Current mainstream models range from 70B-405B parameters
    • Small model energy patterns may differ significantly from large models
  • Short-term Testing: Single inference task, does not evaluate long-term operations

2. Lack of Field Verification

  • All data based on simulation, not measured in actual Middle Eastern datacenters
  • PUE values cited from industry reports, not original measurements
  • Cooling system effectiveness lacks first-hand data

3. Insufficient Analysis Depth

  • Water Resources: Mentions evaporative cooling water consumption but does not quantify
    • Critical limiting factor in water-scarce desert regions
  • Peak Load: Does not analyze grid peak-valley differences
  • Reliability: Does not evaluate extreme weather impact on operations

4. Simplified Economic Analysis

  • Considers only electricity rates, excludes:
    • Infrastructure construction costs
    • Operations and maintenance personnel costs
    • Land costs
    • Cooling system capital expenditure
  • Does not calculate total cost of ownership (TCO)

5. Lack of Dynamic Perspective

  • Grid decarbonization process not modeled
  • Does not predict changes across different time scales
  • Seasonal effects not considered (summer vs. winter)

6. Geographic Selection Representativeness Issues

  • Texas represents US, but states vary significantly
  • Excludes major Asian AI markets (China, Singapore)
  • Lacks other desert climate comparisons (Australia, Chile)

Impact Assessment

1. Academic Contribution

  • Pioneering: First systematic study of Middle Eastern LLM inference carbon footprint
  • Methodology: Provides reproducible framework for geographic comparison research
  • Data Contribution: Fills gap in environmental impact data for the region

2. Practical Value

  • Highly Relevant: Directly serves multi-billion-dollar investment decisions
  • Policy Impact: May influence UAE and Saudi Arabia datacenter policies
  • Enterprise Application: Helps tech companies optimize global datacenter layouts

3. Impact Limitations from Weaknesses

  • Single model size limits conclusion generalizability
  • Lack of field data reduces credibility
  • Simplified economic analysis may lead to decision bias

4. Reproducibility

  • Open-source Tools: CodeCarbon freely available
  • Clear Methods: Experiment setup sufficiently described
  • Accessible Data: Uses public datasets
  • Challenge: Exact replication requires the same hardware configuration, which is costly

Applicable Scenarios

1. Directly Applicable Scenarios

  • LLM Inference Services: Code generation, text generation, and other inference-intensive applications
  • Small-scale Model Deployment: Models in 1-10B parameter range
  • Geographic Site Selection: Initial assessment for datacenter location

2. Scenarios Requiring Adjustment

  • Large-scale Models (70B+): Energy patterns may differ, requires additional verification
  • Training Workloads: Energy characteristics significantly different from inference
  • Mixed Workloads: Actual datacenters run multiple task types

3. Inapplicable Scenarios

  • Edge Computing: Small distributed deployments
  • Real-time Systems: Applications extremely sensitive to latency
  • Non-AI Workloads: Traditional cloud computing services

4. Extension Application Potential

  • Other Desert Regions: Methods transferable to similar climate zones
  • Other AI Tasks: Image generation, speech recognition, etc.
  • Comprehensive Evaluation Framework: Can serve as foundation for more complete assessment

Summary

This is a timely and important study that fills a gap in environmental impact assessment for Middle Eastern AI infrastructure. The research design is rigorous, effectively using control variable methods to isolate geographic factors, reaching clear conclusions: Middle Eastern datacenters are highly competitive economically but currently have higher carbon emissions; future sustainability depends on clean energy integration and cooling technology innovation.

Main Strengths include balanced objective perspective, high practical application value, and method reproducibility. The research not only identifies problems but also demonstrates solutions, providing valuable reference for policymakers and business decision-makers.

Main Limitations include limited experimental scale (single small model), lack of field verification, simplified economic analysis, and insufficient assessment of critical factors like water resources. These shortcomings somewhat limit conclusion generalizability and credibility.

Future Research Directions should include:

  1. Extension to larger-scale models and training workloads
  2. Field measurements in actual Middle Eastern datacenters
  3. Detailed assessment of water consumption and other environmental impacts
  4. Dynamic modeling to predict clean energy transition effects
  5. Comprehensive total cost of ownership analysis

Overall, this is a high-quality applied research paper providing valuable empirical data and insights for a rapidly developing field with major economic and environmental significance. As Middle Eastern AI infrastructure investment continues to grow, the value of this research will become increasingly apparent.

Key References Cited in the Paper

  1. PwC Middle East (2025): "Unlocking the data centre opportunity in the middle east" - Middle East datacenter market analysis
  2. SemiAnalysis (2025): "AI Arrives in the Middle East: US Strikes a Deal with UAE and KSA" - Details of US-UAE-Saudi trilateral AI agreement
  3. Mordor Intelligence (2025): Middle East datacenter cooling market size and trend forecasting report
  4. Guo et al. (2024): "DeepSeek-coder: When the large language model meets programming" - Code generation model used in this research
  5. Chen et al. (2021): "Evaluating large language models trained on code" - Original HumanEval dataset paper
  6. CodeCarbon (2024): v2.4.1 - Open-source carbon emission tracking library