As the Middle East emerges as a strategic hub for artificial intelligence (AI) infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become a topic of growing relevance. This paper presents an empirical study analyzing the energy consumption and carbon footprint of large language model (LLM) inference across four countries (the United Arab Emirates, Iceland, Germany, and the United States of America), using DeepSeek Coder 1.3B and the HumanEval dataset on the task of code generation. We use the CodeCarbon library to track energy and carbon emissions and compare geographical trade-offs for climate-aware AI deployment. Our findings highlight both the challenges and potential of datacenters in desert regions and provide a balanced outlook on their role in global AI expansion.
Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East
- Paper ID: 2511.17683
- Title: Datacenters in the Desert: Feasibility and Sustainability of LLM Inference in the Middle East
- Authors: Lara Hassan, Mohamed ElZeftawy, Abdulrahman Mahmoud (MBZUAI)
- Classification: cs.CY (Computers and Society), cs.AI (Artificial Intelligence)
- Publication Date: November 21, 2025 (arXiv preprint)
- Paper Link: https://arxiv.org/abs/2511.17683
As the Middle East emerges as a strategic hub for AI infrastructure, the feasibility of deploying sustainable datacenters in desert environments has become an increasingly important issue. This paper presents an empirical study analyzing energy consumption and carbon footprint of large language model (LLM) inference across four countries (United Arab Emirates, Iceland, Germany, and the United States) using the DeepSeek Coder 1.3B model and HumanEval dataset for code generation tasks. The research employs the CodeCarbon library to track energy and carbon emissions, comparing geographic trade-offs in climate-aware AI deployment. The findings reveal both challenges and potential for desert datacenters, providing a balanced perspective on their role in global AI expansion.
This study focuses on the feasibility and sustainability of deploying AI datacenters in desert environments, particularly in the Middle East. Specific questions include:
- Energy efficiency of datacenters under desert climate conditions
- Carbon emission variations across different geographic locations
- Trade-offs between economic costs and environmental impact
- Surging AI Computational Demand: AI computing capacity grows 10-fold every six months, placing enormous environmental pressure on datacenters
- Middle East Strategic Development: The United Arab Emirates and Saudi Arabia have announced multi-gigawatt AI datacenter projects involving substantial investments
- Global Infrastructure Diversification: Need to evaluate the role of emerging markets in global AI infrastructure
- Sustainability Challenges: Extreme temperatures and fossil fuel-dominated power grids pose challenges to environmental sustainability
- Lack of empirical carbon emission studies for desert environment datacenters
- Absence of systematic comparison of energy-cost-carbon trade-offs across different geographic locations
- Insufficient assessment of sustainability potential for Middle Eastern datacenters
- Economic Incentives: Significantly lower electricity costs in the Middle East (some solar facilities in Abu Dhabi at only $0.014/kWh)
- Policy Drivers: G42-NVIDIA partnership agreement (annual GPU quota of 500,000, with 20% retained locally)
- Clean Energy Investment: 5GW AI park planned with hybrid power supply from solar, natural gas, and nuclear
- Technological Innovation Needs: Advanced cooling technologies required to address extreme temperatures
- First Empirical Study of LLM Inference Carbon Footprint in the Middle East: Provides quantitative comparison between UAE datacenters and traditional cold-climate hubs (Iceland, Germany) and the United States
- Multi-dimensional Trade-off Analysis Framework: Systematically evaluates geographic variations across three dimensions: energy consumption, carbon emissions, and operational costs
- Real Workload Testing: Uses actual LLM inference tasks (DeepSeek Coder 1.3B + HumanEval) rather than theoretical models
- Policy Insights: Provides data-driven recommendations for sustainable development pathways for Middle Eastern datacenters, including clean energy integration and advanced cooling technology adoption
- Balanced Perspective: Acknowledges both challenges (high carbon emissions) and potential (low costs, rapid deployment capability, renewable energy potential) of desert datacenters
Research Task: Quantitatively assess environmental impact and economic costs of executing identical LLM inference workloads across different geographic locations
Inputs:
- Fixed hardware configuration (NVIDIA RTX 5000 ADA GPU + Intel Xeon w7-2495X CPU)
- Standardized inference task (code generation using DeepSeek Coder 1.3B on HumanEval dataset)
- Energy grid data for four geographic locations (2023)
Outputs:
- Energy consumption (kWh)
- Carbon emissions (kgCO2)
- Operational costs (based on local electricity rates)
- DeepSeek Coder 1.3B: Large language model specifically designed for code generation
- Rationale: Moderate scale, suitable for inference tasks, representative of practical applications
- HumanEval: Standard code generation evaluation benchmark
- Purpose: Provides consistent inference workload
- CodeCarbon Library: Open-source carbon emission tracking tool
- Capabilities:
- Monitors CPU, GPU, and RAM power consumption
- Calculates CO2 emissions based on regional power grid carbon intensity
- Uses publicly available datasets through 2023
The study selected four representative regions:
| Region | Climate Characteristics | Energy Structure | Representativeness |
|---|---|---|---|
| UAE | Desert climate | Natural gas-dominated, emerging solar and nuclear | Emerging Middle East AI hub |
| Iceland | Subarctic climate | ~100% renewable (geothermal + hydroelectric) | Sustainability benchmark |
| Germany | Temperate climate | Mixed grid (renewables + fossil fuels) | European representative |
| Texas | Semi-arid to humid subtropical | Diversified (wind, natural gas, solar) | Important US AI infrastructure region |
- Fixed Hardware: All experiments use identical hardware configuration
- Consistent Workload: Same model, dataset, and task across all locations
- Isolated Geographic Factors: Only variable is geographic location (grid carbon intensity and electricity rates)
- Uses actual running LLM inference tasks rather than synthetic workloads
- Reflects computational patterns of real datacenters
Considers not only carbon emissions but also:
- Environmental impact (CO2 emissions)
- Economic costs (electricity rates)
- Energy efficiency (PUE values)
- Infrastructure potential (deployment speed, scalability)
- HumanEval Dataset: Benchmark set containing 164 programming problems
- Purpose: Evaluates functional correctness of code generation models
- Processing: Complete dataset used for inference testing without train/validation/test splits
- GPU: NVIDIA RTX 5000 ADA Generation
- CPU: Intel(R) Xeon(R) w7-2495X
- Consistency Assurance: All regions simulated with identical hardware specifications
- Energy Consumption (kWh)
- Measurement: Total power consumption of CPU, GPU, and RAM
- Significance: Direct energy cost of datacenter operations
- Carbon Emissions (kgCO2)
- Calculation: Energy consumption × Regional power grid carbon intensity
- Significance: Core indicator of environmental impact
- Electricity Cost ($/kWh)
- Data Source: Public electricity rate data for each region
- Significance: Assessment of operational economics
- PUE (Power Usage Effectiveness)
- Definition: Total facility energy consumption / IT equipment energy consumption
- Significance: Datacenter efficiency metric
- Ideal Value: Close to 1.0 (all energy used for computation)
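The carbon and cost metrics above reduce to simple products of the measured energy and a per-region factor. The following is a minimal sketch of that arithmetic, not the authors' code; the function names and the example inputs are hypothetical and chosen only for illustration.

```python
import math


def carbon_emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Carbon emissions = energy consumption x regional grid carbon intensity."""
    return energy_kwh * intensity_kg_per_kwh


def electricity_cost_usd(energy_kwh: float, rate_usd_per_kwh: float) -> float:
    """Operating cost = energy consumption x local electricity rate."""
    return energy_kwh * rate_usd_per_kwh


# Hypothetical inputs for illustration only; these are NOT values from the paper.
example_energy_kwh = 2.0
example_intensity_kg_per_kwh = 0.5
example_rate_usd_per_kwh = 0.077  # the UAE rate quoted in this summary

emissions = carbon_emissions_kg(example_energy_kwh, example_intensity_kg_per_kwh)
cost = electricity_cost_usd(example_energy_kwh, example_rate_usd_per_kwh)
print(f"emissions: {emissions:.3f} kgCO2, cost: ${cost:.3f}")
```

Because energy is held constant across regions in this study, both outputs scale linearly with the regional factor alone.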
- Monitoring Frequency: Real-time monitoring of energy consumption during inference
- Data Source: CodeCarbon library uses publicly available energy data through 2023
- Simulation Method: Simulates different geographic locations by configuring CodeCarbon's regional parameters
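One plausible way to set up such a simulation is CodeCarbon's offline tracker, which takes the country as a parameter instead of detecting it from the network. The sketch below is an assumption about how this could be done, not the authors' script: the `REGION_PARAMS` mapping, the `measure_emissions_kg` helper, and the `"texas"` region string are illustrative and may need adjusting to the installed CodeCarbon version.

```python
from typing import Callable, Dict

# ISO 3166-1 alpha-3 codes for the four locations compared in the study.
# The "region" value for Texas is an assumption about CodeCarbon's naming.
REGION_PARAMS: Dict[str, Dict[str, str]] = {
    "UAE": {"country_iso_code": "ARE"},
    "Iceland": {"country_iso_code": "ISL"},
    "Germany": {"country_iso_code": "DEU"},
    "Texas": {"country_iso_code": "USA", "region": "texas"},
}


def measure_emissions_kg(region_name: str, workload: Callable[[], None]) -> float:
    """Run `workload` once and return emissions (kg CO2eq) attributed to a region."""
    # Imported lazily so this module loads even without CodeCarbon installed.
    from codecarbon import OfflineEmissionsTracker

    tracker = OfflineEmissionsTracker(**REGION_PARAMS[region_name])
    tracker.start()
    try:
        workload()
    finally:
        # stop() returns the emissions accumulated since start().
        emissions_kg = tracker.stop()
    return emissions_kg
```

A call such as `measure_emissions_kg("UAE", run_humaneval_inference)` would then wrap one inference pass, where `run_humaneval_inference` is a hypothetical function that runs the model over the benchmark.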
Key Data:
- Consistent Energy Consumption: Energy usage identical across all regions (control variables effective)
- Massive Carbon Emission Variations:
- UAE and Texas: Significantly higher carbon emissions than other regions
- Iceland: Nearly negligible emissions (~100% renewable energy)
- Germany: Intermediate level (partially decarbonized grid)
- UAE slightly higher than Texas
Order of Magnitude Differences: CO2 emissions in the UAE are orders of magnitude higher than in Iceland, highlighting the decisive role of grid composition in environmental impact
| Region | Electricity Rate ($/kWh) | Cost Ranking | Relative to UAE |
|---|---|---|---|
| UAE | $0.077 | Lowest | 1.0× |
| Texas | $0.109 | Second | 1.42× |
| Iceland | $0.156 | Third | 2.03× |
| Germany | $0.323 | Highest | 4.19× |
Key Findings:
- UAE offers lowest operational costs, approximately 76% cheaper than Germany
- For large-scale LLM inference, cost advantages may outweigh environmental disadvantages
- Economic incentives may drive datacenter concentration toward low-cost regions
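The "Relative to UAE" column and the 76% figure can be re-derived from the quoted electricity rates. The short sketch below does so; the rates are taken from the table above, while the helper names are hypothetical.

```python
# Electricity rates in $/kWh, as listed in the cost table.
RATES_USD_PER_KWH = {
    "UAE": 0.077,
    "Texas": 0.109,
    "Iceland": 0.156,
    "Germany": 0.323,
}


def relative_to(baseline: str, rates: dict) -> dict:
    """Express every rate as a multiple of the baseline region's rate."""
    base = rates[baseline]
    return {region: rate / base for region, rate in rates.items()}


def percent_cheaper(cheap: str, expensive: str, rates: dict) -> float:
    """Percentage by which `cheap` undercuts `expensive`."""
    return (1.0 - rates[cheap] / rates[expensive]) * 100.0


ratios = relative_to("UAE", RATES_USD_PER_KWH)
for region, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{region:8s} {ratio:.2f}x")

saving = percent_cheaper("UAE", "Germany", RATES_USD_PER_KWH)
print(f"UAE vs. Germany: {saving:.0f}% cheaper")
```

Since every region consumes the same energy for the workload, these rate ratios are also the ratios of total electricity cost.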
Desert Climate Challenges:
- Traditional air cooling systems: PUE > 1.8 (extreme temperatures)
- Advanced cooling technologies: PUE ≈ 1.3-1.5
- Evaporative cooling
- Liquid immersion cooling
- Seawater cooling systems
Middle East Improvement Targets:
- Major cloud and hosting providers target: PUE < 1.5
- Local deployments have achieved PUE reductions exceeding 0.4
- Implementation of hot/cold aisle containment, liquid cooling, and AI-optimized HVAC systems
With identical energy consumption, carbon emissions are entirely determined by grid carbon intensity, not datacenter efficiency itself.
- Most Environmentally Friendly ≠ Most Economical: Iceland is the cleanest, but its electricity costs roughly twice the UAE rate ($0.156 vs. $0.077 per kWh)
- Most Economical = High Carbon Emissions: UAE cheapest but high emissions
- This trade-off is critical for AI infrastructure decisions
No "one-size-fits-all" optimal solution exists; deployment location selection must align with organizational priorities (cost vs. environment).
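One way to read this trade-off is as a two-objective comparison: a region is a reasonable candidate if no other region is both cheaper and cleaner. The sketch below applies that Pareto test using the electricity rates quoted in this summary and an ordinal emission rank (1 = lowest) derived from the qualitative ordering reported above. The ranks are an assumption for illustration, not measured emission values.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    name: str
    rate_usd_per_kwh: float
    emission_rank: int  # ordinal only: 1 = lowest emissions, 4 = highest


REGIONS = [
    Region("UAE", 0.077, 4),
    Region("Texas", 0.109, 3),
    Region("Germany", 0.323, 2),
    Region("Iceland", 0.156, 1),
]


def dominates(a: Region, b: Region) -> bool:
    """True if `a` is no worse than `b` on both axes and better on at least one."""
    no_worse = (a.rate_usd_per_kwh <= b.rate_usd_per_kwh
                and a.emission_rank <= b.emission_rank)
    better = (a.rate_usd_per_kwh < b.rate_usd_per_kwh
              or a.emission_rank < b.emission_rank)
    return no_worse and better


def pareto_front(regions: List[Region]) -> List[Region]:
    """Regions that no other region dominates on cost and emissions."""
    return [r for r in regions if not any(dominates(o, r) for o in regions)]


for region in pareto_front(REGIONS):
    print(region.name)
```

On these inputs the UAE, Texas, and Iceland remain on the front, which matches the point that the choice among them depends on how an organization weighs cost against emissions.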
Despite current high carbon emissions, Middle Eastern regions have improvement potential through:
- Natural alignment between solar availability and cooling demand peaks (daytime solar peak = cooling demand peak)
- Nuclear energy providing stable baseload
- Ongoing large-scale clean energy investments
The paper cites industry reports on Middle Eastern datacenter markets (PwC, Mordor Intelligence), emphasizing growth trends in regional datacenter opportunities and cooling technology markets.
The emergence of tools like CodeCarbon has enabled precise tracking of AI workload carbon footprints, with this research representing an application of such tools in geographic comparison studies.
SemiAnalysis reports detail the trilateral agreement between the US, UAE, and Saudi Arabia, including:
- G42's annual NVIDIA GPU quota of 500,000
- 20% local retention for regional AI development
- 5GW AI park planning
- DeepSeek Coder: Specialized code intelligence model
- HumanEval: OpenAI-developed standard code generation benchmark
Compared to existing work, this paper is the first to:
- Combine environmental impact of LLM inference with Middle Eastern datacenter feasibility
- Provide multi-region empirical carbon emission comparison data
- Integrate economic, environmental, and infrastructure factors
Environment vs. Economics:
- Iceland: Most sustainable, but electricity costs about twice the UAE rate
- UAE/Texas: Most economically attractive but high emissions
- No single solution optimizes both dimensions simultaneously
Challenges:
- Current grid dominated by fossil fuels
- Extreme temperatures increase cooling burden
- Significantly higher carbon emissions than cold-climate regions
Potential:
- Lowest electricity costs ($0.077/kWh)
- Large-scale clean energy investments underway
- Rapid deployment capability and policy support
- Abundant solar resources
LLM deployment sustainability in the Middle East is not a question of "if possible" but "how to implement responsibly":
- Co-deployment with solar and nuclear facilities
- Adoption of advanced cooling technologies
- Continuous energy efficiency improvements
- Geographic Resilience: Distributed risk, enhanced global AI infrastructure stability
- Latency Optimization: Serve rapidly growing regional markets
- Capacity Supplementation: Alleviate regulatory and land constraints in Western markets
- Uses 2023 data, not reflecting latest grid improvements
- Simulation rather than field measurement may introduce bias
- Does not account for complex operational conditions of actual datacenters
- Tests only DeepSeek Coder 1.3B, a single small model
- Larger models (70B+ parameters) may show different performance
- Evaluates inference only, excludes training workloads
- Static snapshot, does not evaluate seasonal variations
- Does not predict impact of future grid decarbonization
- Lacks long-term trend analysis
- Water consumption (evaporative cooling) not detailed
- Land use efficiency not compared
- Infrastructure complexity not quantified
Evaluates only four regions, excludes other important AI markets (China, Singapore, etc.)
- Next-generation Cooling Systems: Further development of modular and liquid immersion cooling
- AI-optimized Energy Management: Using AI to optimize datacenter energy consumption
- Renewable Energy Integration: Intelligent scheduling of on-site solar generation with AI workloads
- Regional cooperation to accelerate low-carbon datacenter development
- Carbon credit and offset mechanisms
- Sustainability certification and standards
- Evaluation of larger-scale models
- Carbon footprint research for training workloads
- Seasonal and temporal variation analysis
- Detailed water resource impact assessment
- Middle East projected to contribute over 6GW additional capacity by 2030
- Requires continuous monitoring and sustainability progress evaluation
- Control Variable Method: Fixed hardware and workload effectively isolate geographic factors
- Real Workloads: Uses actual LLM inference tasks rather than synthetic benchmarks
- Multi-dimensional Assessment: Considers not only carbon emissions but also costs and efficiency
- Policy Reference: Provides data support for Middle Eastern AI infrastructure investment
- Business Decision Guidance: Helps enterprises balance cost and sustainability
- Technology Roadmap Recommendations: Clarifies importance of cooling technology and clean energy
- Neither overly pessimistic nor blindly optimistic
- Acknowledges challenges while demonstrating potential
- Avoids simplistic binary "good/bad" judgments
- Tracks latest Middle Eastern AI infrastructure developments (G42-NVIDIA agreement)
- Addresses current hot topics in AI energy consumption
- Uses relatively recent data (2023 power grid data)
- Uses open-source tools (CodeCarbon)
- Clear experimental setup description
- Transparent data sources
- Single Model: Tests only 1.3B parameter model with limited representativeness
- Current mainstream models range from 70B to 405B parameters
- Small model energy patterns may differ significantly from large models
- Short-term Testing: Single inference task, does not evaluate long-term operations
- All data based on simulation, not measured in actual Middle Eastern datacenters
- PUE values cited from industry reports, not original measurements
- Cooling system effectiveness lacks first-hand data
- Water Resources: Mentions evaporative cooling water consumption but does not quantify
- Critical limiting factor in water-scarce desert regions
- Peak Load: Does not analyze grid peak-valley differences
- Reliability: Does not evaluate extreme weather impact on operations
- Considers only electricity rates, excludes:
- Infrastructure construction costs
- Operations and maintenance personnel costs
- Land costs
- Cooling system capital expenditure
- Does not calculate total cost of ownership (TCO)
- Grid decarbonization process not modeled
- Does not predict changes across different time scales
- Seasonal effects not considered (summer vs. winter)
- Texas represents US, but states vary significantly
- Excludes major Asian AI markets (China, Singapore)
- Lacks other desert climate comparisons (Australia, Chile)
- Pioneering: First systematic study of Middle Eastern LLM inference carbon footprint
- Methodology: Provides reproducible framework for geographic comparison research
- Data Contribution: Fills gap in environmental impact data for the region
- Highly Relevant: Directly serves multi-billion-dollar investment decisions
- Policy Impact: May influence UAE and Saudi Arabia datacenter policies
- Enterprise Application: Helps tech companies optimize global datacenter layouts
- Single model size limits conclusion generalizability
- Lack of field data reduces credibility
- Simplified economic analysis may lead to decision bias
- Open-source Tools: CodeCarbon freely available
- Clear Methods: Experiment setup sufficiently described
- Accessible Data: Uses public datasets
- Challenge: Requires identical hardware configuration, high cost
- LLM Inference Services: Code generation, text generation, and other inference-intensive applications
- Small-scale Model Deployment: Models in 1-10B parameter range
- Geographic Site Selection: Initial assessment for datacenter location
- Large-scale Models (70B+): Energy patterns may differ, requires additional verification
- Training Workloads: Energy characteristics significantly different from inference
- Mixed Workloads: Actual datacenters run multiple task types
- Edge Computing: Small distributed deployments
- Real-time Systems: Applications extremely sensitive to latency
- Non-AI Workloads: Traditional cloud computing services
- Other Desert Regions: Methods transferable to similar climate zones
- Other AI Tasks: Image generation, speech recognition, etc.
- Comprehensive Evaluation Framework: Can serve as foundation for more complete assessment
This is a timely and important study that fills a gap in environmental impact assessment for Middle Eastern AI infrastructure. The research design is rigorous: it uses controlled variables to isolate geographic factors and reaches clear conclusions. Middle Eastern datacenters are highly competitive economically but currently have higher carbon emissions, and their future sustainability depends on clean energy integration and cooling technology innovation.
Main Strengths include a balanced, objective perspective, high practical value, and a reproducible method. The research not only identifies problems but also points to solutions, providing a valuable reference for policymakers and business decision-makers.
Main Limitations include limited experimental scale (single small model), lack of field verification, simplified economic analysis, and insufficient assessment of critical factors like water resources. These shortcomings somewhat limit conclusion generalizability and credibility.
Future Research Directions should include:
- Extension to larger-scale models and training workloads
- Field measurements in actual Middle Eastern datacenters
- Detailed assessment of water consumption and other environmental impacts
- Dynamic modeling to predict clean energy transition effects
- Comprehensive total cost of ownership analysis
Overall, this is a high-quality applied research paper providing valuable empirical data and insights for a rapidly developing field with major economic and environmental significance. As Middle Eastern AI infrastructure investment continues to grow, the value of this research will become increasingly apparent.
- PwC Middle East (2025): "Unlocking the data centre opportunity in the middle east" - Middle East datacenter market analysis
- SemiAnalysis (2025): "AI Arrives in the Middle East: US Strikes a Deal with UAE and KSA" - Details of US-UAE-Saudi trilateral AI agreement
- Mordor Intelligence (2025): Middle East datacenter cooling market size and trend forecasting report
- Guo et al. (2024): "DeepSeek-coder: When the large language model meets programming" - Code generation model used in this research
- Chen et al. (2021): "Evaluating large language models trained on code" - Original HumanEval dataset paper
- CodeCarbon (2024): v2.4.1 - Open-source carbon emission tracking library