We introduce GO-Diff, a diffusion-based method for global structure optimization that learns to directly sample low-energy atomic configurations without requiring prior data or explicit relaxation. GO-Diff is trained from scratch using a Boltzmann-weighted score-matching loss, leveraging only the known energy function to guide generation toward thermodynamically favorable regions. The method operates in a two-stage loop of self-sampling and model refinement, progressively improving its ability to target low-energy structures. Compared to traditional optimization pipelines, GO-Diff achieves competitive results with significantly fewer energy evaluations. Moreover, by reusing pretrained models across related systems, GO-Diff supports amortized optimization - enabling faster convergence on new tasks without retraining from scratch.
GO-Diff: Data-free and amortized global structure optimization
- Paper ID: 2510.13448
- Title: GO-Diff: Data-free and amortized global structure optimization
- Authors: Nikolaj Rønne, Tejs Vegge, Arghya Bhowmik (Technical University of Denmark)
- Classification: physics.comp-ph cond-mat.dis-nn cond-mat.mtrl-sci cs.CE
- Publication Date: October 15, 2025 (Preprint)
- Paper Link: https://arxiv.org/abs/2510.13448
This paper introduces GO-Diff, a diffusion-based global structure optimization method that directly samples low-energy atomic configurations without prior data or explicit relaxation. GO-Diff is trained from scratch with a Boltzmann-weighted score-matching loss, using only a known energy function to guide generation toward thermodynamically favorable regions. The method alternates between self-sampling and model refinement, progressively improving its ability to target low-energy structures. Compared to traditional optimization workflows, GO-Diff achieves competitive results with significantly fewer energy evaluations. By reusing pretrained models across related systems, GO-Diff also enables amortized optimization: it converges faster on new tasks without retraining from scratch.
This research addresses the global structure optimization problem for atomic systems, namely finding low-energy stable atomic configurations on the potential energy surface (PES). The potential energy surface is a high-dimensional, non-convex function mapping atomic positions to corresponding potential energies. Exploring this surface to identify low-energy structures is a fundamental challenge in computational materials science, chemistry, and catalysis.
Global structure optimization underpins a range of applications, including:
- Discovery of novel catalytic surfaces
- Design of functional materials
- Prediction of stable atomic configurations
- Understanding of material properties
Traditional global optimization methods suffer from the following issues:
- High computational cost: Methods such as random structure search (RSS), basin hopping, genetic algorithms, and simulated annealing rely on local relaxations and gradient-based optimizers, requiring numerous energy and force evaluations
- Limited to local optimization: Prone to becoming trapped in local optima, restricting exploration of complex energy landscapes
- Data dependency: Machine learning interatomic potentials require carefully selected training data to capture relevant minima, otherwise risking self-reinforcing local minima
- Lack of transferability: Existing methods struggle to reuse learned knowledge across related systems
Diffusion models have shown promise in structure generation for molecular and materials science, but applying them to global optimization tasks is challenging because the objective is to sample rare low-energy configurations corresponding to PES global minima, yet the data distribution of such structures is typically unknown or unavailable.
- Proposed a data-free generative optimization method: Directly samples potential energy surface minima without prior data or explicit relaxation
- Developed a Boltzmann-weighted loss function: Combined with a temperature-annealing strategy, it guides sampling toward low-energy regions while maintaining exploration
- Enabled amortized optimization: Achieves knowledge reuse through model transfer across related systems
- Demonstrated superior sample efficiency: Finds target structures with fewer energy evaluations than the classical search baseline (random structure search)
Input: Energy function E(x) of an atomic system, where x represents atomic configuration
Output: Low-energy stable atomic configurations
Objective: Sample from the Boltzmann distribution $\pi_T(x) = \frac{1}{Z_T}\exp\left(-E(x)/T\right)$, where $T$ is the temperature and $Z_T$ is the normalizing constant
GO-Diff runs an iterative self-sampling loop:
- Model generates atomic structures through reverse diffusion
- Evaluates energy of generated structures
- Uses resulting samples to refine the model
The method maintains a replay buffer $\mathcal{B} = \{(x_0^{(i)}, E^{(i)})\}$ that stores generated configurations and their energies.
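The loop and buffer described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_energy`, `sample_structures`, and `refine_model` are hypothetical stand-ins for the real energy function, reverse diffusion, and model training, and the policy of keeping only the lowest-energy entries in the buffer is an assumption the source does not confirm.

```python
import numpy as np

def toy_energy(x):
    """Hypothetical stand-in for E(x): a double well in each coordinate."""
    return float(np.sum((x**2 - 1.0) ** 2))

class ReplayBuffer:
    """Stores (x0, E) pairs and keeps the `capacity` lowest-energy ones (assumed policy)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []  # list of (configuration, energy), sorted by energy

    def add(self, x, energy):
        self.items.append((x, energy))
        self.items.sort(key=lambda pair: pair[1])
        del self.items[self.capacity:]

def sample_structures(center, spread, n, rng):
    """Placeholder for reverse diffusion: Gaussian draws around `center`."""
    return center + spread * rng.standard_normal((n, center.size))

def refine_model(buffer):
    """Placeholder for training: recentre the sampler on the best buffer entry."""
    return buffer.items[0][0]

def run_loop(n_iterations=10, n_samples=32, capacity=16, seed=0):
    rng = np.random.default_rng(seed)
    buffer = ReplayBuffer(capacity)
    center = np.zeros(2)
    best_per_iteration = []
    for _ in range(n_iterations):
        for x in sample_structures(center, 0.5, n_samples, rng):  # 1. generate
            buffer.add(x, toy_energy(x))                          # 2. evaluate energy
        center = refine_model(buffer)                             # 3. refine the model
        best_per_iteration.append(buffer.items[0][1])
    return buffer, best_per_iteration

buffer, best = run_loop()
```

The defaults mirror the Pt adatom settings quoted later (buffer size 16, 32 samples per iteration, 10 iterations); everything else is illustrative.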
The core innovation is the Boltzmann-weighted score matching loss:
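$$\mathcal{L}_\theta^{\text{Boltzmann}} = \mathbb{E}_{t \sim U(0,1)}\left[\lambda(t)\, \mathbb{E}_{x_0 \sim q,\; x_t \sim p_{t|0}(x_t \mid x_0)}\left[ w(E)\, \left\| s_\theta(x_t, t) - \nabla_{x_t} \log p_{t|0}(x_t \mid x_0) \right\|_2^2 \right]\right]$$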
where the Boltzmann weight is:
$$w(E) = \frac{\exp(-E/T)}{\sum_{E^{(i)} \in \mathcal{B}} \exp\left(-E^{(i)}/T\right)}$$
By reweighting buffer samples in the manner of importance sampling, this design avoids the need to sample directly from the true Boltzmann distribution.
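A minimal NumPy sketch of the weight and the weighted loss may make the construction concrete. Several pieces are assumptions rather than details from the source: the perturbation kernel is taken to be the VE-SDE Gaussian $p_{t|0}(x_t \mid x_0) = \mathcal{N}(x_0, \sigma(t)^2 I)$, whose score is $-(x_t - x_0)/\sigma(t)^2$; the endpoints `sigma_min` and `sigma_max` are hypothetical; $\lambda(t) = \sigma(t)^2$ is a common choice, not one the source states; and `score_fn` is a toy placeholder for $s_\theta$.

```python
import numpy as np

def boltzmann_weights(energies, temperature):
    """w(E_i) = exp(-E_i/T) / sum_j exp(-E_j/T), computed stably."""
    logits = -np.asarray(energies, dtype=float) / temperature
    logits -= logits.max()  # shifting by the max leaves the ratio unchanged
    w = np.exp(logits)
    return w / w.sum()

def sigma(t, sigma_min=0.01, sigma_max=1.0):
    """Linear noise schedule; the endpoint values are assumptions."""
    return sigma_min + t * (sigma_max - sigma_min)

def boltzmann_weighted_loss(score_fn, x0, energies, temperature, rng):
    """Monte Carlo estimate of the weighted score-matching loss over one buffer batch."""
    n = x0.shape[0]
    t = rng.uniform(0.0, 1.0, size=(n, 1))        # t ~ U(0, 1)
    s = sigma(t)
    xt = x0 + s * rng.standard_normal(x0.shape)   # x_t ~ p_{t|0}(x_t | x_0)
    target = -(xt - x0) / s**2                    # grad_{x_t} log p_{t|0}(x_t | x_0)
    sq_err = np.sum((score_fn(xt, t) - target) ** 2, axis=1)
    lam = s[:, 0] ** 2                            # lambda(t) = sigma(t)^2 (assumed)
    w = boltzmann_weights(energies, temperature)  # weights sum to 1 over the batch
    return float(np.sum(w * lam * sq_err))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((16, 3))                 # toy buffer of 16 configurations
energies = np.sum(x0**2, axis=1)                  # toy energies
weights = boltzmann_weights(energies, temperature=0.5)
loss = boltzmann_weighted_loss(lambda xt, t: -xt, x0, energies, 0.5, rng)
```

Subtracting the largest logit before exponentiating matters at low temperature, where `exp(-E/T)` would otherwise underflow or overflow.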
Temperature T is annealed from a high initial value to a low final value, balancing exploration and exploitation:
- Early stage: High temperature encourages broad exploration
- Late stage: Low temperature converges to deep minima
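As a rough illustration of this schedule, the sketch below interpolates the temperature geometrically between two endpoints and shows how the Boltzmann weights of two buffer entries sharpen as the temperature falls. The final temperature of 0.02 and the 10 iterations are taken from the experimental settings; the initial temperature of 1.0, the exact functional form, and the 0.1 energy gap are assumptions made for the example.

```python
import numpy as np

def exponential_annealing(t_initial, t_final, n_iterations):
    """Geometric interpolation from t_initial to t_final (requires n_iterations >= 2)."""
    k = np.arange(n_iterations)
    return t_initial * (t_final / t_initial) ** (k / (n_iterations - 1))

def two_state_weights(energies, temperature):
    """Normalized Boltzmann weights for a small set of energies."""
    w = np.exp(-(energies - energies.min()) / temperature)
    return w / w.sum()

temperatures = exponential_annealing(t_initial=1.0, t_final=0.02, n_iterations=10)

energies = np.array([0.0, 0.1])                        # two entries 0.1 energy units apart
w_hot = two_state_weights(energies, temperatures[0])   # nearly uniform: exploration
w_cold = two_state_weights(energies, temperatures[-1]) # dominated by the lower energy
```

At the assumed initial temperature the lower-energy entry carries a weight of about 0.52; at the final temperature it carries about 0.99, which is the shift from exploration to exploitation described above.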
Force-field guidance (FFG) leverages the atomic forces that are typically available alongside the energy:
- Attaches a force prediction head on the shared representation backbone of the score network
- Uses predicted forces in a predictor-corrector sampling scheme:
$$\Delta x = \alpha\,(1-t)^{\zeta}\,F_\theta(x)$$
- As diffusion time t→0, the correction term provides stronger guidance
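The corrector displacement can be sketched in a few lines. The sketch reads ζ as an exponent on (1−t), which is an interpretation of the formula above; `toy_force` is a hypothetical harmonic force standing in for the learned force head, and the values of `alpha` and `zeta` are arbitrary.

```python
import numpy as np

def toy_force(x, x_min):
    """Hypothetical force model: harmonic pull toward x_min with unit stiffness."""
    return -(x - x_min)

def force_correction(x, t, force_fn, alpha=0.1, zeta=2.0):
    """Corrector displacement dx = alpha * (1 - t)**zeta * F(x)."""
    return alpha * (1.0 - t) ** zeta * force_fn(x)

x_min = np.array([1.0, 0.0, 0.0])   # location of the toy energy minimum
x = np.array([2.0, 1.0, -1.0])      # current sample position

def force(y):
    return toy_force(y, x_min)

early = force_correction(x, t=0.9, force_fn=force)  # noisy stage: weak guidance
late = force_correction(x, t=0.0, force_fn=force)   # end of sampling: full guidance
```

With this reading, the prefactor grows from near zero at t close to 1 to `alpha` at t = 0, so the force only steers samples once they are close to physically meaningful structures.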
- Direct Boltzmann weighting: Avoids force evaluations and Monte Carlo estimation using direct Boltzmann-weighted score matching loss
- Self-supervised learning: Learns from its own generations without external data
- Model transfer: Demonstrates capability to transfer pretrained models across related systems
- Physics-guided approach: Incorporates force field information to accelerate convergence
Two atomic optimization tasks using the MACE-MP0 universal potential:
- Pt adatom optimization on Pt step surface: 3D system, visualizable as 2D through projection along surface normal
- Pt heptamer discovery on 6×6 Pt(111) surface: More complex system for benchmarking and amortized optimization validation
- Success rate in finding target structures
- Average number of energy evaluations required to find target structure
- Best energy over time
- Random Structure Search (RSS): Traditional method implemented using AGOX package
- GO-Diff variants: Without FFG, with FFG, with model transfer
Universal hyperparameters:
- Diffusion sampling steps: 500
- Noise schedule: Linear (VE-SDE)
- Score model architecture: PaiNN GNN (4 blocks), 6Å cutoff
- Final temperature: 0.02
- Learning rate: 10^-4
- Optimizer: AdamW
Task-specific parameters:
- Pt adatom: Buffer size 16, 32 samples per iteration, 10 iterations with exponential annealing
- Pt heptamer: Buffer size 64, 128 samples per iteration, 20 iterations with exponential annealing
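For reference, the settings listed above can be collected into a small configuration object, from which the energy-evaluation budget of each task follows. The class and field names are hypothetical, and the budget assumes one energy evaluation per generated sample.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskConfig:
    name: str
    buffer_size: int
    samples_per_iteration: int
    iterations: int

    @property
    def energy_evaluations(self):
        """Total budget, assuming one energy evaluation per generated sample."""
        return self.samples_per_iteration * self.iterations

# Settings shared by both tasks, as listed in the text
SHARED = {
    "diffusion_steps": 500,
    "noise_schedule": "linear (VE-SDE)",
    "score_model": "PaiNN, 4 blocks, 6 Å cutoff",
    "final_temperature": 0.02,
    "learning_rate": 1e-4,
    "optimizer": "AdamW",
}

TASKS = [
    TaskConfig("Pt adatom", buffer_size=16, samples_per_iteration=32, iterations=10),
    TaskConfig("Pt heptamer", buffer_size=64, samples_per_iteration=128, iterations=20),
]

budgets = {task.name: task.energy_evaluations for task in TASKS}
```

Under that assumption the Pt adatom run uses 320 evaluations and the Pt heptamer run uses 2,560, the latter matching the GO-Diff budget in the results table.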
- Successfully demonstrated progressive concentration of sampling in low-energy basins
- Validated effectiveness of Boltzmann-weighted loss and annealing schedule
| Method | Energy evaluation budget | Success rate | Avg. evaluations to success |
|---|---|---|---|
| RSS | 10,000 | 1/8 | 7,816 |
| GO-Diff | 2,560 | 5/8 | 1,667 |
| GO-Diff + FFG | 2,560 | 8/8 | 1,994 |
| GO-Diff + Transfer | 1,280 | 7/8 | 591 |
- Sample efficiency: GO-Diff achieves better success rates with significantly fewer energy evaluations
- Force field guidance effectiveness: FFG raises the success rate from 5/8 to 8/8, although the average number of evaluations to success is somewhat higher (1,994 vs. 1,667)
- Transfer learning advantage: Model transfer reduces the average evaluations needed for success roughly 2.8-fold (from 1,667 to 591) while using half the evaluation budget (1,280 vs. 2,560)
- Robustness: Stochasticity of the diffusion process enables GO-Diff to robustly escape local minima
The acceleration from transfer learning is expected, as the transferred model has already captured bonding preferences (e.g., stability of hollow sites below step edges), reducing the optimization task to adjusting inter-atomic geometry rather than learning bonding from scratch.
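The efficiency ratios implied by the results table can be checked with a few lines of arithmetic. The numbers are copied from the table; the dictionary layout and helper names are illustrative, and the last table column is read as the average number of energy evaluations at the point of success.

```python
results = {
    "RSS":                {"budget": 10_000, "successes": 1, "runs": 8, "avg_evals": 7_816},
    "GO-Diff":            {"budget": 2_560,  "successes": 5, "runs": 8, "avg_evals": 1_667},
    "GO-Diff + FFG":      {"budget": 2_560,  "successes": 8, "runs": 8, "avg_evals": 1_994},
    "GO-Diff + Transfer": {"budget": 1_280,  "successes": 7, "runs": 8, "avg_evals": 591},
}

def success_rate(method):
    """Fraction of independent runs that found the target structure."""
    row = results[method]
    return row["successes"] / row["runs"]

def speedup(baseline, method):
    """Ratio of average evaluations to success, baseline over method."""
    return results[baseline]["avg_evals"] / results[method]["avg_evals"]

transfer_vs_scratch = speedup("GO-Diff", "GO-Diff + Transfer")  # 1667 / 591
transfer_vs_rss = speedup("RSS", "GO-Diff + Transfer")          # 7816 / 591
scratch_vs_rss = speedup("RSS", "GO-Diff")                      # 7816 / 1667

for name in results:
    print(f"{name}: success rate {success_rate(name):.3f}")
```

The transferred model needs about 2.8 times fewer evaluations than GO-Diff trained from scratch and about 13 times fewer than RSS. With only eight runs per method and a single RSS success, these ratios are indicative rather than statistically tight.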
- Random structure search, basin hopping, genetic algorithms, simulated annealing
- Machine learning interatomic potentials (pretrained or online learning)
- Structure generation in molecular and materials science
- Diffusion models for black-box optimization (DDOM)
- Boltzmann samplers (iDEM, BNEM, Adjoint Sampling)
- Avoids Monte Carlo estimation and force labeling
- Simpler and more sample-efficient training loop
- First demonstration of transfer learning capability across systems
- GO-Diff is an effective data-free global structure optimization framework
- Boltzmann-weighted score matching loss effectively guides low-energy configuration generation
- Amortized optimization significantly improves efficiency through model transfer
- Outperforms traditional methods in sample efficiency and success rate
- Hyperparameter sensitivity: Sample quantity, temperature curve, and training steps are critical hyperparameters requiring careful tuning
- Scalability constraints: Current atomic diffusion models are primarily validated on systems with <20 atoms
- System scale: Further research needed to make GO-Diff applicable to very large realistic-scale systems
- Extension to multi-objective or multi-component design optimization
- Dynamic temperature adjustment and adaptive sampling
- Improved scalability for large systems
- Surrogate acceleration and multi-objective optimization
- Methodological novelty: First successful application of diffusion models to data-free global structure optimization
- Technical sophistication: Boltzmann-weighted score matching loss design is elegant, avoiding complexity of existing methods
- Practical value: Amortized optimization demonstrates significant advantages in real applications
- Comprehensive experiments: Thorough testing on systems of varying complexity
- Solid theoretical foundation: Rigorous theoretical derivation based on importance sampling
- System size limitations: Validation only on relatively small atomic systems (≤20 atoms)
- Hyperparameter tuning: Method sensitivity to multiple hyperparameters may limit generalizability
- Limited benchmarking: Comparison only with RSS, lacking comparison with other modern methods
- Insufficient theoretical analysis: Lacks theoretical guarantees on convergence and sample complexity
- Academic contribution: Introduces new generative modeling paradigm to global optimization field
- Practical value: Potential applications in materials discovery and catalyst design
- Reproducibility: Provides complete code and implementation details
- Inspirational significance: Opens new directions for diffusion model applications in optimization problems
- Materials discovery: Structure prediction for novel catalysts and functional materials
- Surface science: Investigation of adsorption sites and surface reconstruction
- Small molecule optimization: Molecular conformation search and drug design
- Related systems: Particularly suitable for scenarios requiring multiple optimizations across similar systems
This paper cites 38 relevant references covering key works in global optimization, diffusion models, and machine learning potentials, providing a solid theoretical foundation for method development.