
GO-Diff: Data-free and amortized global structure optimization

Rønne, Vegge, Bhowmik

We introduce GO-Diff, a diffusion-based method for global structure optimization that learns to directly sample low-energy atomic configurations without requiring prior data or explicit relaxation. GO-Diff is trained from scratch using a Boltzmann-weighted score-matching loss, leveraging only the known energy function to guide generation toward thermodynamically favorable regions. The method operates in a two-stage loop of self-sampling and model refinement, progressively improving its ability to target low-energy structures. Compared to traditional optimization pipelines, GO-Diff achieves competitive results with significantly fewer energy evaluations. Moreover, by reusing pretrained models across related systems, GO-Diff supports amortized optimization - enabling faster convergence on new tasks without retraining from scratch.



Basic Information

Paper ID: 2510.13448
Title: GO-Diff: Data-free and amortized global structure optimization
Authors: Nikolaj Rønne, Tejs Vegge, Arghya Bhowmik (Technical University of Denmark)
Classification: physics.comp-ph cond-mat.dis-nn cond-mat.mtrl-sci cs.CE
Publication Date: October 15, 2025 (Preprint)
Paper Link: https://arxiv.org/abs/2510.13448

Abstract

This paper introduces GO-Diff, a diffusion-based global structure optimization method that directly samples low-energy atomic configurations without prior data or explicit relaxation. GO-Diff is trained from scratch with a Boltzmann-weighted score-matching loss, using only the known energy function to guide generation toward thermodynamically favorable regions. The method alternates between self-sampling and model refinement in a two-stage loop, progressively improving its ability to target low-energy structures. Compared to traditional optimization workflows, GO-Diff achieves competitive results with significantly fewer energy evaluations. Furthermore, by reusing pretrained models across related systems, GO-Diff enables amortized optimization: it converges faster on new tasks without retraining from scratch.

Research Background and Motivation

Problem Statement

This research addresses the global structure optimization problem for atomic systems, namely finding low-energy stable atomic configurations on the potential energy surface (PES). The potential energy surface is a high-dimensional, non-convex function mapping atomic positions to corresponding potential energies. Exploring this surface to identify low-energy structures is a fundamental challenge in computational materials science, chemistry, and catalysis.

Problem Significance

Global structure optimization underpins applications ranging from catalysis to functional materials design, including:

Discovery of novel catalytic surfaces
Design of functional materials
Prediction of stable atomic configurations
Understanding of material properties

Limitations of Existing Methods

Traditional global optimization methods suffer from the following issues:

High computational cost: Methods such as random structure search (RSS), basin hopping, genetic algorithms, and simulated annealing rely on local relaxations and gradient-based optimizers, requiring numerous energy and force evaluations
Limited to local optimization: Prone to becoming trapped in local optima, restricting exploration of complex energy landscapes
Data dependency: Machine learning interatomic potentials require carefully selected training data to capture relevant minima, otherwise risking self-reinforcing local minima
Lack of transferability: Existing methods struggle to reuse learned knowledge across related systems

Research Motivation

Diffusion models have shown promise for structure generation in molecular and materials science, but applying them to global optimization is challenging. The goal is to sample rare low-energy configurations near the global minima of the PES, yet data drawn from that distribution is typically unknown or unavailable.

Core Contributions

Proposed a data-free generative optimization method: Directly samples potential energy surface minima without prior data or explicit relaxation
Developed a Boltzmann-weighted loss function: Combined with temperature annealing, it guides sampling toward low-energy regions while maintaining exploration
Enabled amortized optimization: Achieves knowledge reuse through model transfer across related systems
Demonstrated superior sample efficiency: Reaches target structures with fewer energy evaluations than classical search methods

Methodology Details

Task Definition

Input: Energy function $E(x)$ of an atomic system, where $x$ denotes the atomic configuration
Output: Low-energy stable atomic configurations
Objective: Sample from the Boltzmann distribution $\pi_T(x) = \frac{\exp(-E(x)/T)}{Z_T}$

Model Architecture

Training Loop

GO-Diff runs an iterative self-sampling loop. In each iteration, it:

Generates atomic structures through reverse diffusion
Evaluates the energy of the generated structures
Uses the resulting samples to refine the model

Throughout, it maintains a replay buffer $B = \{(x_0^{(i)}, E^{(i)})\}$ that stores generated configurations and their energies.
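The loop described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_fn`, `energy_fn`, and `refine_fn` are hypothetical callables standing in for reverse diffusion, the energy evaluation, and the model update, and keeping only the lowest-energy entries in the buffer is an assumed retention policy that this summary does not specify.

```python
import numpy as np

def self_sampling_loop(sample_fn, energy_fn, refine_fn,
                       n_iterations, n_samples, buffer_size):
    """Toy self-sampling loop: generate, evaluate, store, refine."""
    buffer = []  # list of (x0, E) pairs, playing the role of the replay buffer B
    for it in range(n_iterations):
        # 1) Generate candidate structures (stand-in for reverse diffusion).
        candidates = sample_fn(n_samples)
        # 2) One energy evaluation per candidate.
        energies = [float(energy_fn(x)) for x in candidates]
        buffer.extend(zip(candidates, energies))
        # ASSUMPTION: retain only the lowest-energy entries.
        buffer.sort(key=lambda pair: pair[1])
        buffer = buffer[:buffer_size]
        # 3) Refine the model on the buffer (stand-in for a training step).
        refine_fn(buffer, it)
    return buffer

# Toy usage: a Gaussian "sampler" that is pulled toward the buffer mean.
rng = np.random.default_rng(0)
state = {"center": np.zeros(2), "scale": 2.0}

def sample_fn(n):
    return [state["center"] + state["scale"] * rng.standard_normal(2)
            for _ in range(n)]

def energy_fn(x):
    return np.sum((x - 1.0) ** 2)  # toy quadratic well with minimum at (1, 1)

def refine_fn(buffer, it):
    state["center"] = np.mean([x for x, _ in buffer], axis=0)
    state["scale"] *= 0.8

final_buffer = self_sampling_loop(sample_fn, energy_fn, refine_fn,
                                  n_iterations=10, n_samples=32, buffer_size=16)
```

The iteration count, sample count, and buffer size in the usage example mirror the Pt adatom settings listed under Implementation Details; everything else is a toy stand-in.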

Boltzmann-Weighted Score Matching

The core innovation is the Boltzmann-weighted score matching loss:

$\mathcal{L}_{\theta}^{\mathrm{Boltzmann}} = \mathbb{E}_{t\sim U(0,1)}\left[\lambda(t)\,\mathbb{E}_{x_0\sim q,\, x_t\sim p_{t|0}(x_t|x_0)}\, w(E)\, \|s_\theta(x_t,t) - \nabla_{x_t}\log p_{t|0}(x_t|x_0)\|_2^2\right]$

where the Boltzmann weight is: $w(E) = \frac{\exp(-E/T)}{\sum_{E^{(i)}\in B} \exp(-E^{(i)}/T)}$

By relying on importance sampling, this design avoids the need to sample directly from the true Boltzmann distribution.
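As a rough sketch of how the weights and the weighted loss might be computed over a buffer, consider the NumPy example below. It rests on several assumptions that are not taken from the paper: a Gaussian perturbation kernel $x_t = x_0 + \sigma_t \epsilon$ (a variance-exploding form), for which $\nabla_{x_t}\log p_{t|0}(x_t|x_0) = -(x_t - x_0)/\sigma_t^2$; the choice $\lambda(t) = \sigma_t^2$; a single noise level per call; and a hypothetical `score_fn` in place of the score network. Subtracting the minimum energy before exponentiating is a standard stabilization that leaves the normalized weights unchanged.

```python
import numpy as np

def boltzmann_weights(energies, temperature):
    """Normalized weights w_i = exp(-E_i/T) / sum_j exp(-E_j/T) over the buffer."""
    e = np.asarray(energies, dtype=float)
    logits = -(e - e.min()) / temperature  # shift by min(E) for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def weighted_dsm_loss(score_fn, x0, energies, temperature, sigma_t, rng):
    """Single-noise-level estimate of a Boltzmann-weighted score matching loss.

    ASSUMPTIONS: Gaussian kernel x_t = x0 + sigma_t * eps, lambda(t) = sigma_t**2.
    """
    w = boltzmann_weights(energies, temperature)
    eps = rng.standard_normal(x0.shape)
    x_t = x0 + sigma_t * eps
    target = -(x_t - x0) / sigma_t**2  # grad of log p_{t|0}(x_t | x0)
    residual = score_fn(x_t, sigma_t) - target
    per_sample = np.sum(residual**2, axis=tuple(range(1, x0.ndim)))
    return float(sigma_t**2 * np.sum(w * per_sample))

energies = np.array([-1.0, -0.9, 0.5])
w_hot = boltzmann_weights(energies, temperature=10.0)    # nearly uniform
w_cold = boltzmann_weights(energies, temperature=0.02)   # dominated by the minimum

# Sanity check: with all x0 at the origin, score(x) = -x / sigma^2 is exact.
rng = np.random.default_rng(0)
x0 = np.zeros((3, 2))
exact_score = lambda x_t, sigma: -x_t / sigma**2
loss_exact = weighted_dsm_loss(exact_score, x0, energies, 0.02, sigma_t=0.5, rng=rng)
```

At high temperature the three buffer entries receive similar weight, while at the final temperature of 0.02 almost all of the weight falls on the lowest-energy entry, which is the exploration-to-exploitation behavior the annealing schedule is meant to produce.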

Annealing Strategy

Temperature T is annealed from a high initial value to a low final value, balancing exploration and exploitation:

Early stage: High temperature encourages broad exploration
Late stage: Low temperature converges to deep minima
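One possible form of such a schedule is a geometric interpolation between the initial and final temperatures, sketched below. The Implementation Details report exponential annealing and a final temperature of 0.02, but the initial temperature is not given in this summary, so `t_start=1.0` is a placeholder assumption, and the authors' exact schedule may differ.

```python
import numpy as np

def exponential_annealing(t_start, t_final, n_iterations):
    """Geometric interpolation from t_start to t_final over n_iterations steps."""
    if n_iterations < 2:
        return np.array([t_final], dtype=float)
    k = np.arange(n_iterations) / (n_iterations - 1)  # 0 ... 1
    return t_start * (t_final / t_start) ** k

# ASSUMPTION: t_start is a placeholder; 0.02 and 20 iterations come from the text.
temperatures = exponential_annealing(t_start=1.0, t_final=0.02, n_iterations=20)
```

Each step multiplies the temperature by the same factor, so the early iterations spend relatively more time at temperatures where many buffer entries carry non-negligible weight.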

Force Field Guidance (FFG)

FFG exploits the atomic forces that are typically available alongside the energy:

Attaches a force prediction head on the shared representation backbone of the score network
Uses predicted forces in a predictor-corrector sampling scheme: $\Delta x = \alpha(1-t)\zeta F_\theta(x)$
As diffusion time t→0, the correction term provides stronger guidance
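The correction term can be illustrated with a short sketch. The symbols $\alpha$ and $\zeta$ are not defined in this summary, so they are treated here simply as scalar step-size parameters, and `harmonic_force` is a hypothetical toy force standing in for the learned force head $F_\theta$.

```python
import numpy as np

def ffg_correction(x, t, force_fn, alpha, zeta):
    """Corrector displacement dx = alpha * (1 - t) * zeta * F(x)."""
    return alpha * (1.0 - t) * zeta * force_fn(x)

def harmonic_force(x, x_min=np.zeros(3), k=1.0):
    # Toy force F = -dE/dx for E = 0.5 * k * |x - x_min|^2 (illustration only).
    return -k * (x - x_min)

x = np.array([1.0, -2.0, 0.5])
# ASSUMPTION: alpha and zeta values are arbitrary placeholders.
dx_early = ffg_correction(x, t=0.9, force_fn=harmonic_force, alpha=0.1, zeta=1.0)
dx_late = ffg_correction(x, t=0.1, force_fn=harmonic_force, alpha=0.1, zeta=1.0)
```

Because of the $(1-t)$ factor, the displacement late in sampling (small $t$) is larger than the one early in sampling for the same position, which matches the statement that guidance strengthens as $t \to 0$.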

Technical Innovations

Direct Boltzmann weighting: The Boltzmann-weighted score matching loss needs neither force labels nor Monte Carlo estimation
Self-supervised learning: Learns from its own generations without external data
Model transfer: Demonstrates capability to transfer pretrained models across related systems
Physics-guided approach: Incorporates force field information to accelerate convergence

Experimental Setup

Datasets

Two atomic optimization tasks using the MACE-MP0 universal potential:

Pt adatom optimization on Pt step surface: 3D system, visualizable as 2D through projection along surface normal
Pt heptamer discovery on 6×6 Pt(111) surface: More complex system for benchmarking and amortized optimization validation

Evaluation Metrics

Success rate in finding target structures
Average number of energy evaluations required to find target structure
Best energy over time

Comparison Methods

Random Structure Search (RSS): Traditional method implemented using AGOX package
GO-Diff variants: Without FFG, with FFG, with model transfer

Implementation Details

Universal hyperparameters:

Diffusion sampling steps: 500
Noise schedule: Linear (VE-SDE)
Score model architecture: PaiNN GNN (4 blocks), 6 Å cutoff
Final temperature: 0.02
Learning rate: $10^{-4}$
Optimizer: AdamW

Task-specific parameters:

Pt adatom: Buffer size 16, 32 samples per iteration, 10 iterations with exponential annealing
Pt heptamer: Buffer size 64, 128 samples per iteration, 20 iterations with exponential annealing
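For reference, the settings listed here could be gathered into a small configuration object, as in the sketch below. The class and field names are hypothetical, and the `energy_evaluations` property assumes exactly one energy evaluation per generated sample, which is an assumption rather than something this summary states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical container for the hyperparameters listed in the text."""
    name: str
    buffer_size: int
    samples_per_iteration: int
    iterations: int
    final_temperature: float = 0.02
    learning_rate: float = 1e-4
    diffusion_steps: int = 500

    @property
    def energy_evaluations(self) -> int:
        # ASSUMPTION: one energy evaluation per generated sample.
        return self.samples_per_iteration * self.iterations

PT_ADATOM = TaskConfig("Pt adatom", buffer_size=16,
                       samples_per_iteration=32, iterations=10)
PT_HEPTAMER = TaskConfig("Pt heptamer", buffer_size=64,
                         samples_per_iteration=128, iterations=20)
```

Under that assumption the heptamer run uses 128 × 20 = 2,560 energy evaluations, which agrees with the evaluation count reported for GO-Diff in the Pt heptamer results.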

Experimental Results

Main Results

Pt Adatom Optimization

Successfully demonstrated progressive concentration of sampling in low-energy basins
Validated effectiveness of Boltzmann-weighted loss and annealing schedule

Pt Heptamer Discovery

Method	Energy evaluations (budget)	Success rate	Avg. evaluations to success
RSS	10,000	1/8	7,816
GO-Diff	2,560	5/8	1,667
GO-Diff + FFG	2,560	8/8	1,994
GO-Diff + Transfer	1,280	7/8	591

Key Findings

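Sample efficiency: Within a budget of 2,560 energy evaluations, GO-Diff succeeds in 5/8 runs (8/8 with FFG), whereas RSS succeeds in 1/8 runs with 10,000 evaluations
Force field guidance effectiveness: FFG raises the success rate from 5/8 to 8/8, although the average number of evaluations to success is somewhat higher (1,994 versus 1,667)
Transfer learning advantage: Model transfer reduces the average number of evaluations to success roughly 2.8-fold (from 1,667 to 591)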
Robustness: Stochasticity of the diffusion process enables GO-Diff to robustly escape local minima
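The ratios behind these findings follow directly from the Pt heptamer table, as the short calculation below shows. The dictionary layout and helper name are illustrative only. Since each average is taken over the successful runs of that method, the ratios are indicative rather than a controlled comparison.

```python
# Values copied from the Pt heptamer results table.
results = {
    "RSS":                {"budget": 10_000, "successes": 1, "runs": 8, "avg_evals": 7_816},
    "GO-Diff":            {"budget": 2_560,  "successes": 5, "runs": 8, "avg_evals": 1_667},
    "GO-Diff + FFG":      {"budget": 2_560,  "successes": 8, "runs": 8, "avg_evals": 1_994},
    "GO-Diff + Transfer": {"budget": 1_280,  "successes": 7, "runs": 8, "avg_evals": 591},
}

def speedup(baseline, method):
    """Ratio of average evaluations to success, baseline over method."""
    return results[baseline]["avg_evals"] / results[method]["avg_evals"]

success_rate = {m: r["successes"] / r["runs"] for m, r in results.items()}
transfer_vs_scratch = speedup("GO-Diff", "GO-Diff + Transfer")  # about 2.82
scratch_vs_rss = speedup("RSS", "GO-Diff")                      # about 4.69
```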

Amortized Optimization Analysis

The acceleration from transfer learning is expected, as the transferred model has already captured bonding preferences (e.g., stability of hollow sites below step edges), reducing the optimization task to adjusting inter-atomic geometry rather than learning bonding from scratch.

Related Work

Traditional Global Optimization Methods

Random structure search, basin hopping, genetic algorithms, simulated annealing
Machine learning interatomic potentials (pretrained or online learning)

Diffusion Model Applications

Structure generation in molecular and materials science
Diffusion models for black-box optimization (DDOM)
Boltzmann samplers (iDEM, BNEM, Adjoint Sampling)

Relative to these approaches, GO-Diff:

Avoids Monte Carlo estimation and force labeling
Offers a simpler and more sample-efficient training loop
Provides the first demonstration of transfer learning capability across systems

Conclusions and Discussion

Main Conclusions

GO-Diff is an effective data-free global structure optimization framework
Boltzmann-weighted score matching loss effectively guides low-energy configuration generation
Amortized optimization significantly improves efficiency through model transfer
Outperforms traditional methods in sample efficiency and success rate

Limitations

Hyperparameter sensitivity: Sample quantity, temperature curve, and training steps are critical hyperparameters requiring careful tuning
Scalability constraints: Current atomic diffusion models are primarily validated on systems with <20 atoms
System scale: Further research is needed before GO-Diff can be applied to very large, realistic-scale systems

Future Directions

Extension to multi-objective or multi-component design optimization
Dynamic temperature adjustment and adaptive sampling
Improved scalability for large systems
Surrogate-model acceleration

In-Depth Evaluation

Strengths

Methodological novelty: First successful application of diffusion models to data-free global structure optimization
Technical sophistication: Boltzmann-weighted score matching loss design is elegant, avoiding complexity of existing methods
Practical value: Amortized optimization demonstrates significant advantages in real applications
Comprehensive experiments: Thorough testing on systems of varying complexity
Solid theoretical foundation: Rigorous theoretical derivation based on importance sampling

Weaknesses

System size limitations: Validation only on relatively small atomic systems (≤20 atoms)
Hyperparameter tuning: Method sensitivity to multiple hyperparameters may limit generalizability
Limited benchmarking: Comparison only with RSS, lacking comparison with other modern methods
Insufficient theoretical analysis: Lacks theoretical guarantees on convergence and sample complexity

Impact

Academic contribution: Introduces new generative modeling paradigm to global optimization field
Practical value: Potential applications in materials discovery and catalyst design
Reproducibility: Provides complete code and implementation details
Inspirational significance: Opens new directions for diffusion model applications in optimization problems

Applicable Scenarios

Materials discovery: Structure prediction for novel catalysts and functional materials
Surface science: Investigation of adsorption sites and surface reconstruction
Small molecule optimization: Molecular conformation search and drug design
Related systems: Particularly suitable for scenarios requiring multiple optimizations across similar systems

References

This paper cites 38 relevant references covering key works in global optimization, diffusion models, and machine learning potentials, providing a solid theoretical foundation for method development.