Leveraging recurrence in neural network wavefunctions for large-scale simulations of Heisenberg antiferromagnets on the triangular lattice

Moss, Wiersema, Hibat-Allah et al.
Variational Monte Carlo simulations have been crucial for understanding quantum many-body systems, especially when the Hamiltonian is frustrated and the ground-state wavefunction has a non-trivial sign structure. In this paper, we use recurrent neural network (RNN) wavefunction ansätze to study the triangular-lattice antiferromagnetic Heisenberg model (TLAHM) for lattice sizes up to $30\times30$. In a recent study [M. S. Moss et al. arXiv:2502.17144], the authors demonstrated how RNN wavefunctions can be iteratively retrained in order to obtain variational results for multiple lattice sizes with a reasonable amount of compute. That study, which looked at the sign-free, square-lattice antiferromagnetic Heisenberg model, showed favorable scaling properties, allowing accurate finite-size extrapolations to the thermodynamic limit. In contrast, our present results illustrate in detail the relative difficulty in simulating the sign-problematic TLAHM. We find that the accuracy of our simulations can be significantly improved by transforming the Hamiltonian with a judicious choice of basis rotation. We also show that a similar benefit can be achieved by using variational neural annealing, an alternative optimization technique that minimizes a pseudo free energy. Ultimately, we are able to obtain estimates of the ground-state properties of the TLAHM in the thermodynamic limit that are in close agreement with values in the literature, showing that RNN wavefunctions provide a powerful toolbox for performing finite-size scaling studies for frustrated quantum many-body systems.

Basic Information

  • Paper ID: 2505.20406
  • Title: Leveraging recurrence in neural network wavefunctions for large-scale simulations of Heisenberg antiferromagnets on the triangular lattice
  • Authors: M. Schuyler Moss, Roeland Wiersema, Mohamed Hibat-Allah, Juan Carrasquilla, Roger G. Melko
  • Classification: cond-mat.str-el cond-mat.dis-nn quant-ph
  • Publication Date: October 13, 2025 (arXiv version v3)
  • Paper Link: https://arxiv.org/abs/2505.20406

Abstract

This paper investigates the triangular lattice antiferromagnetic Heisenberg model (TLAHM) using recurrent neural network (RNN) wavefunction ansätze for system sizes up to 30×30. Unlike previously studied square lattice models without sign problems, the TLAHM exhibits complex sign structures that render numerical simulations significantly more challenging. The study demonstrates that through appropriate basis transformations and variational neural annealing techniques, simulation accuracy can be substantially improved. The resulting thermodynamic limit ground state properties are highly consistent with literature values, establishing the efficacy of RNN wavefunctions for finite-size scaling studies of frustrated quantum many-body systems.

Research Background and Motivation

Importance of the Problem

The triangular lattice antiferromagnetic Heisenberg model (TLAHM) represents a canonical example of frustrated quantum magnetism. Although the ground state is now known to exhibit 120° magnetic ordering, numerical investigation of this system remains exceptionally challenging due to geometric frustration. Unlike the square lattice, the TLAHM exhibits a sign problem, rendering quantum Monte Carlo (QMC) simulations difficult.

Limitations of Existing Methods

  1. Exact Diagonalization: Restricted to small system sizes with severe finite-size effects
  2. Traditional Variational Monte Carlo: Dependent on ansatz selection with limited accuracy
  3. QMC Methods: Plagued by sign problems, difficult to achieve controllable errors

Research Motivation

Neural quantum states (NQS) have recently garnered significant attention as highly expressive variational ansätze. However, frustration and non-trivial sign structures have been considered potential obstacles to NQS optimization. The TLAHM thus serves as an important benchmark for testing NQS performance, and this work aims to validate the effectiveness of RNN wavefunctions in such challenging systems.

Core Contributions

  1. First successful application of iteratively retrained RNN wavefunctions to TLAHM, achieving large-scale simulations up to 30×30 systems
  2. Systematic investigation of basis transformation effects on simulation accuracy, revealing that 120° transformations significantly outperform Marshall-Peierls sign rules
  3. Introduction of variational neural annealing (VNA) technique, effectively overcoming optimization difficulties arising from frustration through pseudo-free energy minimization
  4. Extraction of thermodynamic limit properties via finite-size scaling, with ground state energy and sublattice magnetization highly consistent with literature benchmarks
  5. Detailed computational complexity and runtime analysis, demonstrating practical feasibility of the method

Methodology Details

Problem Definition

Investigation of the ground-state properties of the TLAHM: $\hat{H} = \sum_{\langle i,j \rangle} \vec{S}_i \cdot \vec{S}_j$, where $\langle i,j \rangle$ denotes nearest-neighbor pairs on the triangular lattice and $\vec{S}_i$ represents spin-1/2 operators.
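As an illustration of the sum over nearest-neighbor pairs, the following minimal sketch enumerates the bonds of an $L \times L$ triangular lattice with periodic boundaries. The site indexing, the choice of the three forward neighbors, and the helper names `triangular_bonds` and `ising_part` are assumptions made for this example, not taken from the paper.

```python
def triangular_bonds(L):
    """Nearest-neighbor bonds of an L x L triangular lattice, periodic boundaries.

    Assumed convention: site (x, y) has index x + L * y, and its three
    forward neighbors are (x+1, y), (x, y+1) and (x+1, y+1).
    """
    bonds = []
    for y in range(L):
        for x in range(L):
            i = x + L * y
            for dx, dy in ((1, 0), (0, 1), (1, 1)):
                j = (x + dx) % L + L * ((y + dy) % L)
                bonds.append((i, j))
    return bonds


def ising_part(spins, bonds):
    """Diagonal S^z_i S^z_j part of the energy; spins are +1/-1, so S^z = spin / 2."""
    return sum(0.25 * spins[i] * spins[j] for i, j in bonds)


bonds = triangular_bonds(6)
print(len(bonds))  # 3 bonds per site, so 3 * 36 = 108 for L = 6
```

Each site ends up with six neighbors, which is what makes the lattice geometrically frustrated for antiferromagnetic couplings.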

Model Architecture

RNN Wavefunction Design

Two-dimensional recurrent neural network construction of the wavefunction: $p(\sigma) = p(\sigma_1)\,p(\sigma_2|\sigma_1)\cdots p(\sigma_N|\sigma_{N-1},\ldots,\sigma_1)$

Key Components:

  1. Gated Recurrent Units (GRU): Process hidden vector information propagation
  2. Complex Phase Parameterization: Handle non-trivial sign structures via $\Psi_W(\sigma) = \exp[i\phi_W(\sigma)]\sqrt{p_W(\sigma)}$
  3. Pseudo-Periodic Boundary Conditions: Maintain causality while simulating periodic systems
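The chain-rule factorization and the amplitude-phase form can be illustrated with a toy autoregressive model. In the sketch below, `conditional_up` and `phase` are hypothetical stand-ins for the GRU-based conditionals and the learned phase; they are not the paper's architecture, and the zero-magnetization constraint is not enforced.

```python
import cmath
import itertools
import math
import random


def conditional_up(prefix, w=0.7, b=0.1):
    """Toy stand-in for the recurrent cell: P(sigma_n = 1 | earlier spins)."""
    field = w * sum(2 * s - 1 for s in prefix) + b
    return 1.0 / (1.0 + math.exp(field))


def log_prob(sigma):
    """Chain rule: log p(sigma) = sum_n log p(sigma_n | sigma_1 .. sigma_{n-1})."""
    total = 0.0
    for n, s in enumerate(sigma):
        p_up = conditional_up(sigma[:n])
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total


def sample(n_sites, rng):
    """Draw one exact sample site by site, with no Markov chain."""
    sigma = []
    for _ in range(n_sites):
        sigma.append(1 if rng.random() < conditional_up(sigma) else 0)
    return tuple(sigma)


def phase(sigma):
    """Hypothetical phase function phi(sigma), chosen only for illustration."""
    return math.pi * sum(n * s for n, s in enumerate(sigma)) / len(sigma)


def amplitude(sigma):
    """Psi(sigma) = exp(i * phi(sigma)) * sqrt(p(sigma))."""
    return cmath.exp(1j * phase(sigma)) * math.exp(0.5 * log_prob(sigma))


rng = random.Random(0)
print(sample(6, rng))
# The squared amplitudes sum to one because every conditional is normalized.
norm = sum(abs(amplitude(s)) ** 2 for s in itertools.product((0, 1), repeat=6))
print(norm)
```

Because each conditional is normalized on its own, the full distribution is normalized by construction and can be sampled exactly.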

Basis Transformation Techniques

Marshall-Peierls Transformation ($U_{sq}$): $U_{sq} = \exp\left(-i\pi\sum_{j\in B_{sq}}\hat{S}^z_j\right)$

120° Transformation ($U_{tri}$): $U_{tri} = \exp\left(-\frac{2\pi i}{3}\left[\sum_{b\in B_{tri}}\hat{S}^z_b - \sum_{c\in C_{tri}}\hat{S}^z_c\right]\right)$
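Both rotations are diagonal in the $\hat{S}^z$ basis, so each one multiplies a basis configuration by a pure phase. The sketch below evaluates those phases for a given spin configuration. The sublattice assignments (a three-coloring by $(x+y) \bmod 3$ and a checkerboard for $B_{sq}$) and the function names are assumptions for illustration; the paper's own sublattice conventions may differ.

```python
import cmath
import math


def sublattice(x, y):
    """Three-sublattice label 0/1/2 (A/B/C).

    Assumes nearest neighbors at (x+1, y), (x, y+1), (x+1, y+1); the coloring
    is compatible with periodic boundaries only when L is a multiple of 3.
    """
    return (x + y) % 3


def u_tri_phase(spins, L):
    """Diagonal element <sigma|U_tri|sigma> for spins[y][x] = +1 or -1."""
    sz_b = sum(0.5 * spins[y][x] for y in range(L) for x in range(L)
               if sublattice(x, y) == 1)
    sz_c = sum(0.5 * spins[y][x] for y in range(L) for x in range(L)
               if sublattice(x, y) == 2)
    return cmath.exp(-2j * math.pi / 3 * (sz_b - sz_c))


def u_sq_phase(spins, L):
    """Diagonal element of the Marshall-Peierls rotation.

    B_sq is taken to be the checkerboard sublattice with (x + y) odd,
    which is an assumption of this sketch.
    """
    sz_b = sum(0.5 * spins[y][x] for y in range(L) for x in range(L)
               if (x + y) % 2 == 1)
    return cmath.exp(-1j * math.pi * sz_b)


L = 6
all_up = [[1] * L for _ in range(L)]
print(u_tri_phase(all_up, L), u_sq_phase(all_up, L))
```

All lattice sizes used in the study ($L = 6, 12, 18, 24, 30$) are multiples of 3, so a three-sublattice coloring of this kind fits the periodic lattice.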

Variational Neural Annealing

Minimization of pseudo-free energy: $F_W(t) = E_W - T(t)\,S_{classical}(p_W)$, where $T(t)$ is the annealing temperature and $S_{classical}$ is the Shannon entropy.
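A minimal sketch of how the pseudo-free energy could be estimated from Monte Carlo samples, assuming the samples are drawn from $p_W$ so that the Shannon entropy is estimated by the mean of $-\log p_W(\sigma)$. The function name and the use of real-valued local energies are simplifications for illustration.

```python
def pseudo_free_energy(local_energies, log_probs, temperature):
    """Sample estimate of F = E - T * S.

    local_energies: real parts of the local energies at the sampled configurations
    log_probs:      log p_W(sigma) at the same configurations
    temperature:    current annealing temperature T(t)
    """
    n = len(local_energies)
    energy = sum(local_energies) / n
    entropy = -sum(log_probs) / n  # Monte Carlo estimate of the Shannon entropy
    return energy - temperature * entropy


# At T = 0 the objective reduces to the plain variational energy.
print(pseudo_free_energy([-2.0, -2.0], [-1.0, -1.0], 0.0))
print(pseudo_free_energy([-2.0, -2.0], [-1.0, -1.0], 0.5))
```

A positive temperature rewards high-entropy distributions, which discourages the sampler from collapsing onto a few configurations early in training.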

Technical Innovations

  1. Weight Sharing Mechanism: RNN parameter count independent of system size, enabling iterative retraining
  2. Symmetry Averaging: Apply $C_{6v}$ group averaging only to wavefunction amplitude, avoiding numerical instability from phase averaging
  3. Parameterized Training Schedule: $N_{steps}(L,s,r;L_0,C,F) = s \times [C\exp(-r(L-L_0)) + F]$
  4. Zero-Variance Extrapolation: Utilize improved variational state sequences for more accurate energy estimates
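The parameterized training schedule listed above can be evaluated directly. In this sketch every numerical value is a placeholder chosen for illustration, except that `r = 0.158` is borrowed from the decay rate quoted under Limitations on the assumption that it refers to the same parameter $r$.

```python
import math


def n_steps(L, s, r, L0, C, F):
    """Training steps at lattice size L: s * (C * exp(-r * (L - L0)) + F)."""
    return round(s * (C * math.exp(-r * (L - L0)) + F))


# Placeholder parameters (hypothetical, not the paper's values).
for L in (6, 12, 18, 24, 30):
    print(L, n_steps(L, s=1, r=0.158, L0=6, C=1000, F=100))
```

The exponential term concentrates most of the training at the smallest size $L_0$, while the floor $F$ keeps a minimum number of steps at every larger size.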

Experimental Setup

System Parameters

  • Lattice Sizes: L = 6, 12, 18, 24, 30 (periodic boundary conditions)
  • Hidden Vector Dimension: $d_h$ = fixed value (ensuring sufficient expressivity)
  • Symmetries: Enforce U(1) symmetry (zero magnetization), apply $C_{6v}$ point group symmetry

Training Strategy

Four-Stage Training (L=6):

  1. Fixed learning rate $\gamma = 5 \times 10^{-4}$, temperature $T_0$
  2. Variational neural annealing: Linear cooling to zero
  3. Learning rate decay: $\gamma(t) = \gamma_0 \times (1+(t/\delta))^{-1}$
  4. Apply symmetries, final optimization

Iterative Retraining: Initialize large-size training using optimized results from smaller sizes
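The first three training stages can be sketched as a pair of schedules for the temperature and the learning rate. The stage lengths `n_hold` and `n_anneal`, the value of `delta`, and the choice to keep the learning rate constant during annealing are assumptions for this example; only $\gamma_0 = 5 \times 10^{-4}$ comes from the text above.

```python
def temperature(step, t0, n_hold, n_anneal):
    """Stage 1 holds T0, stage 2 cools linearly to zero, later stages stay at zero."""
    if step < n_hold:
        return t0
    if step < n_hold + n_anneal:
        return t0 * (1.0 - (step - n_hold) / n_anneal)
    return 0.0


def learning_rate(step, n_hold, n_anneal, gamma0=5e-4, delta=1000.0):
    """Constant gamma0 until annealing ends, then gamma0 / (1 + t / delta).

    t counts steps from the start of stage 3; delta is a hypothetical value.
    """
    t = max(0, step - n_hold - n_anneal)
    return gamma0 / (1.0 + t / delta)


for step in (0, 100, 200, 300, 1300):
    print(step, temperature(step, 1.0, 100, 200), learning_rate(step, 100, 200))
```

For iterative retraining, the same schedules would be restarted at each larger lattice size with the network weights carried over from the previous size.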

Evaluation Metrics

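  1. Variational Energy: $E_W = \langle\Psi_W|\hat{H}|\Psi_W\rangle/\langle\Psi_W|\Psi_W\rangle$
  2. Energy Variance: Measures proximity to eigenstates
  3. V-score: $V = N\,\text{var}(E)/(E-E_\infty)^2$, where $E_\infty$ in this formula is the V-score reference energy (the infinite-temperature energy), not the thermodynamic-limit energy used in the finite-size scaling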
  4. Sublattice Magnetization: Computed via momentum-space correlation functions
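The energy, variance, and V-score can all be estimated from the same set of sampled local energies. The sketch below assumes real-valued local energies of the total Hamiltonian and passes the reference energy written $E_\infty$ in the V-score formula as `e_ref`; the function names and default value are illustrative.

```python
def energy_stats(local_energies):
    """Sample mean and variance of the local energy."""
    n = len(local_energies)
    mean = sum(local_energies) / n
    var = sum((e - mean) ** 2 for e in local_energies) / n
    return mean, var


def v_score(energy, variance, n_sites, e_ref=0.0):
    """V = N * var(E) / (E - e_ref)^2, dimensionless and zero for an exact eigenstate."""
    return n_sites * variance / (energy - e_ref) ** 2


# Toy numbers only: an eigenstate would give identical local energies and V = 0.
mean, var = energy_stats([-19.0, -21.0, -20.0, -20.0])
print(mean, var, v_score(mean, var, n_sites=36))
```

The variance vanishes exactly when the variational state is an eigenstate, which is why both quantities serve as accuracy indicators that need no reference solution.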

Experimental Results

Main Results

Basis Transformation Comparison (L=6)

  • No Transformation/Marshall-Peierls: Requires high-temperature annealing ($T_0 = 1.0$) for accurate results
  • 120° Transformation: Insensitive to annealing temperature, achieves excellent results at $T_0 = 0$
  • Optimal Energy: -0.5562(2) (approaching exact diagonalization result -0.5603734)

Finite-Size Scaling Results

Energy Scaling (using $E(L) = E_\infty + e_1/L^3$):

  • Zero-variance extrapolated energy: $E_\infty = -0.5517569(9)$
  • DMRG benchmark: $E_\infty^{DMRG} = -0.5503(8)$
  • iPEPS benchmark: $E_\infty^{iPEPS} = -0.55161(6)$
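The scaling form $E(L) = E_\infty + e_1/L^3$ is linear in $1/L^3$, so the extrapolation amounts to an ordinary least-squares line fit. The sketch below demonstrates this on synthetic data generated from made-up coefficients; it does not use or reproduce the paper's energies, and an unweighted fit is an assumption of the example.

```python
def fit_energy(sizes, energies):
    """Least-squares fit of E(L) = e_inf + e1 / L**3; returns (e_inf, e1)."""
    xs = [1.0 / L**3 for L in sizes]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(energies) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, energies))
    e1 = sxy / sxx
    e_inf = y_mean - e1 * x_mean  # intercept at 1 / L**3 -> 0
    return e_inf, e1


sizes = [6, 12, 18, 24, 30]
# Synthetic energies from made-up coefficients e_inf = -0.5 and e1 = -1.0.
synthetic = [-0.5 - 1.0 / L**3 for L in sizes]
e_inf, e1 = fit_energy(sizes, synthetic)
print(e_inf, e1)
```

In practice each data point carries a statistical error bar, so a weighted fit would be the natural refinement of this sketch.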

Sublattice Magnetization:

  • $M_\infty = 0.192(2)$ (from $M^2$ extrapolation)
  • $M_\infty = 0.198(2)$ (from $M^2_C$ extrapolation)
  • DMRG benchmark: $M_\infty^{DMRG} = 0.208(8)$

Computational Complexity Analysis

  • Single-Step Training Time: $O(L^4)$ scaling
  • Total Runtime: Maximum 1700 GPU hours for simulation (covering six system sizes)
  • Parameterized training schedule effectively controls computational costs for large systems

Key Findings

  1. SU(2) Symmetry Breaking: Learned RNN states represent superpositions of Anderson tower states rather than true singlets
  2. Importance of Sign Structure: Success of 120° transformation demonstrates critical role of basis choice in learning non-trivial sign structures
  3. VNA Effectiveness: Achieves good results through appropriate annealing even in suboptimal bases

Numerical Methods for Quantum Many-Body Systems

  • DMRG: Significant progress in cylindrical geometry
  • iPEPS: Direct parameterization of thermodynamic limit ground states
  • Traditional VMC: Using projected wavefunction ansätze

Neural Quantum State Development

  • RBM: Earliest NQS architecture
  • CNN: Exploiting translational invariance
  • Transformer: Handling long-range correlations
  • RNN: Focus of this work, supporting iterative retraining

Specialized TLAHM Research

The ground state of the TLAHM was long debated. It was ultimately established to be a 120° antiferromagnetically ordered state through Green's function Monte Carlo and other methods.

Conclusions and Discussion

Main Conclusions

  1. RNN wavefunctions successfully simulate TLAHM despite frustration and non-trivial sign structures
  2. Basis transformations and VNA are key techniques, substantially enhancing optimization
  3. Iterative retraining strategy is effective, enabling efficient large-scale simulations
  4. Thermodynamic limit results consistent with benchmarks, validating method reliability

Limitations

  1. Requires more computational resources than the square lattice: The minimum decay rate of the training schedule had to be reduced from 0.25 to 0.158, meaning more training steps at larger sizes
  2. Poor V-scores: Indicates TLAHM is indeed a more challenging optimization problem
  3. Incomplete SU(2) symmetry preservation: May affect accuracy of certain observables
  4. Still requires Adam optimizer: Advanced optimization methods like SR perform poorly on RNNs

Future Directions

  1. Systematic study of sign structures: Understanding deeper mechanisms of basis transformation success
  2. Advanced optimization algorithms: Exploring SR variants suitable for RNNs
  3. Other frustrated systems: Extension to kagome lattice and other geometries
  4. Quantum phase transitions: Leveraging scalability to study critical phenomena

In-Depth Evaluation

Strengths

  1. Strong Technical Innovation: First successful application of iteratively retrained RNNs to challenging frustrated systems
  2. Comprehensive Experimental Design: Systematic comparison of different basis transformations and optimization strategies
  3. High Result Credibility: Multiple verification methods with high consistency to independent benchmarks
  4. Significant Practical Value: Provides effective tools for large-scale frustrated quantum systems
  5. Thorough Analysis: Understanding sign problem effects from optimization perspective

Weaknesses

  1. Limited Theoretical Understanding: Lacks deep analysis of 120° transformation success mechanisms
  2. High Computational Cost: Still requires more resources compared to square lattice
  3. Symmetry Treatment: SU(2) breaking may affect precision of certain observables
  4. Unknown Generalizability: Performance on other frustrated systems remains to be verified

Impact

  1. Methodological Contribution: Important precedent for NQS applications to frustrated systems
  2. Technical Transferability: Iterative retraining strategy applicable to other quantum many-body problems
  3. Benchmark Value: Provides new high-precision numerical results for TLAHM
  4. Inspirational Significance: Reveals importance of basis transformations in quantum machine learning

Applicable Scenarios

  1. Two-Dimensional Frustrated Quantum Magnets: Particularly suited for geometrically frustrated systems
  2. Finite-Size Scaling Studies: RNN scalability advantages evident
  3. Ground State Property Calculations: Energy, magnetization and other ground state observables
  4. Methodological Research: Benchmark problem for testing new NQS architectures

References

This paper cites important literature in the field, including:

  • Anderson's pioneering resonating valence bond theory
  • Bernu et al.'s exact diagonalization benchmark results
  • Capriotti et al.'s Green's function Monte Carlo studies
  • Carleo-Troyer's foundational neural quantum state work
  • Recent high-precision DMRG and iPEPS results

Overall Assessment: This is a high-quality computational physics paper with significant contributions at both methodological and application levels. By cleverly combining basis transformations, variational annealing, and iterative retraining techniques, it successfully tackles the challenging TLAHM problem, opening new pathways for neural quantum state applications to frustrated systems. Despite some theoretical understanding limitations, its practical value and inspirational significance make it an important advance in the field.