Leveraging recurrence in neural network wavefunctions for large-scale simulations of Heisenberg antiferromagnets on the triangular lattice

Moss, Wiersema, Hibat-Allah et al.

Variational Monte Carlo simulations have been crucial for understanding quantum many-body systems, especially when the Hamiltonian is frustrated and the ground-state wavefunction has a non-trivial sign structure. In this paper, we use recurrent neural network (RNN) wavefunction ansätze to study the triangular-lattice antiferromagnetic Heisenberg model (TLAHM) for lattice sizes up to $30\times30$. In a recent study [M. S. Moss et al. arXiv:2502.17144], the authors demonstrated how RNN wavefunctions can be iteratively retrained in order to obtain variational results for multiple lattice sizes with a reasonable amount of compute. That study, which looked at the sign-free, square-lattice antiferromagnetic Heisenberg model, showed favorable scaling properties, allowing accurate finite-size extrapolations to the thermodynamic limit. In contrast, our present results illustrate in detail the relative difficulty in simulating the sign-problematic TLAHM. We find that the accuracy of our simulations can be significantly improved by transforming the Hamiltonian with a judicious choice of basis rotation. We also show that a similar benefit can be achieved by using variational neural annealing, an alternative optimization technique that minimizes a pseudo free energy. Ultimately, we are able to obtain estimates of the ground-state properties of the TLAHM in the thermodynamic limit that are in close agreement with values in the literature, showing that RNN wavefunctions provide a powerful toolbox for performing finite-size scaling studies for frustrated quantum many-body systems.

Basic information

Paper ID: 2505.20406
Title: Leveraging recurrence in neural network wavefunctions for large-scale simulations of Heisenberg antiferromagnets on the triangular lattice
Authors: M. Schuyler Moss, Roeland Wiersema, Mohamed Hibat-Allah, Juan Carrasquilla, Roger G. Melko
Categories: cond-mat.str-el cond-mat.dis-nn quant-ph
Published: October 13, 2025 (arXiv version v3)
Paper link: https://arxiv.org/abs/2505.20406

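Summary

This work uses recurrent neural network (RNN) wavefunction ansätze to study the triangular-lattice antiferromagnetic Heisenberg model (TLAHM) on lattices of up to $30\times30$ sites. Unlike the sign-problem-free square-lattice model studied previously, the TLAHM ground state has a non-trivial sign structure, which makes numerical simulation harder. The authors find that a judicious basis rotation and variational neural annealing each significantly improve simulation accuracy. The resulting thermodynamic-limit estimates of the ground-state properties agree closely with literature values, showing that RNN wavefunctions are a powerful tool for finite-size scaling studies of frustrated quantum many-body systems.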

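Background and motivation

Importance of the problem

The triangular-lattice antiferromagnetic Heisenberg model (TLAHM) is one of the standard examples of frustrated quantum magnetism. Although its ground state is now known to exhibit 120° magnetic order, geometric frustration makes the system very challenging to study numerically. Unlike the square-lattice model, the TLAHM has a sign problem, which makes quantum Monte Carlo (QMC) simulation difficult.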

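Limitations of existing methods

Exact diagonalization: limited to small system sizes, where finite-size effects are severe
Conventional variational Monte Carlo: accuracy is limited by the choice of ansatz
QMC methods: hampered by the sign problem, so controlled errors are hard to obtain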

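Research motivation

Neural quantum states (NQS) have attracted considerable attention in recent years as highly expressive variational ansätze, but frustration and non-trivial sign structure are regarded as potential obstacles to NQS optimization. The TLAHM is therefore an important benchmark for NQS performance, and this paper aims to test how well RNN wavefunctions work on such a difficult system.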

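Core contributions

First successful application of iteratively retrained RNN wavefunctions to the TLAHM, enabling large-scale simulations of systems up to $30\times30$
Systematic study of how basis transformations affect simulation accuracy, finding that the 120° transformation significantly improves results compared with the Marshall-Peierls sign rule
Use of variational neural annealing (VNA), which minimizes a pseudo free energy and helps overcome the optimization difficulties caused by frustration
Thermodynamic-limit properties obtained through finite-size scaling, with a ground-state energy and sublattice magnetization in close agreement with literature benchmark values
Detailed analysis of computational complexity and runtime, demonstrating the practicality of the method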

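Methods in detail

Task definition

The goal is to study the ground-state properties of the TLAHM: $\hat{H} = \sum_{\langle i,j \rangle} \vec{S}_i \cdot \vec{S}_j$, where $\langle i,j \rangle$ denotes nearest-neighbor pairs on the triangular lattice and $\vec{S}_i$ is the spin-1/2 operator on site $i$.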
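To make the sum over $\langle i,j \rangle$ concrete, the sketch below enumerates the nearest-neighbor bonds of an $L\times L$ triangular lattice with periodic boundaries and evaluates only the diagonal part $\sum_{\langle i,j \rangle} S^z_i S^z_j$ for a given configuration. The site indexing, the choice of the three "forward" neighbor directions, and the function names are assumptions made for illustration, not the paper's implementation; the off-diagonal spin-flip terms are omitted.

```python
def triangular_bonds(L):
    """Nearest-neighbor bonds of an L x L triangular lattice (periodic boundaries).

    Assumed convention: site (x, y) has index x * L + y, and its three
    'forward' neighbors are (x+1, y), (x, y+1) and (x+1, y+1).
    """
    bonds = []
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for dx, dy in ((1, 0), (0, 1), (1, 1)):
                j = ((x + dx) % L) * L + (y + dy) % L
                bonds.append((i, j))
    return bonds


def diagonal_energy(spins, bonds):
    """Sum of S^z_i S^z_j over bonds for spins given as +1/-1 (S^z = spin / 2)."""
    return sum(0.25 * spins[i] * spins[j] for i, j in bonds)


bonds = triangular_bonds(6)
n_bonds = len(bonds)                         # three bonds per site
e_ferro = diagonal_energy([1] * 36, bonds)   # all spins up
```

Each site appears in six bonds, which is the coordination number of the triangular lattice, and the bond count is $3N$ for $N = L^2$ sites.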

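Model architecture

RNN wavefunction design

A two-dimensional recurrent neural network defines the wavefunction through an autoregressive factorization of the probability of a spin configuration: $p_W(\sigma) = p_W(\sigma_1)p_W(\sigma_2|\sigma_1)\cdots p_W(\sigma_N|\sigma_{N-1},\ldots,\sigma_1)$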

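Key components:

Gated recurrent unit (GRU): passes information between sites through the hidden vectors
Complex phase parameterization: handles the non-trivial sign structure, $\Psi_W(\sigma) = \exp[i\phi_W(\sigma)]\sqrt{p_W(\sigma)}$
Pseudo-periodic boundary conditions: mimic a periodic system while preserving causality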
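The autoregressive factorization and the phase parameterization above can be sketched with a toy model. The conditional rule `conditional_up_prob` and the phase `toy_phase` below are hypothetical stand-ins for the outputs of the GRU-based network, and the chain is one-dimensional rather than the two-dimensional RNN used in the paper. The point is only that sampling proceeds site by site and that $\Psi(\sigma)$ is assembled from $\sqrt{p(\sigma)}$ and a phase.

```python
import cmath
import math
import random


def conditional_up_prob(prefix):
    """Toy stand-in for the RNN output: P(next spin = up | earlier spins).

    Hypothetical rule that favors anti-alignment with the previous spin.
    """
    if not prefix:
        return 0.5
    return 0.2 if prefix[-1] == 1 else 0.8


def log_prob(sigma):
    """log p(sigma) from the chain rule p(s1) p(s2|s1) ... p(sN|s<N)."""
    total = 0.0
    for n, s in enumerate(sigma):
        p_up = conditional_up_prob(sigma[:n])
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total


def sample(n_sites, rng):
    """Draw one configuration site by site; no Markov chain is needed."""
    sigma = []
    for _ in range(n_sites):
        p_up = conditional_up_prob(sigma)
        sigma.append(1 if rng.random() < p_up else -1)
    return sigma


def toy_phase(sigma):
    """Hypothetical phase function standing in for the network's phi_W."""
    return math.pi * sigma.count(1)


def amplitude(sigma, phase):
    """Psi(sigma) = exp(i * phi(sigma)) * sqrt(p(sigma))."""
    return cmath.exp(1j * phase(sigma)) * math.exp(0.5 * log_prob(sigma))


rng = random.Random(0)
config = sample(4, rng)
psi = amplitude(config, toy_phase)
```

Because every conditional is normalized, the product is normalized over all configurations, so $|\Psi(\sigma)|^2 = p(\sigma)$ can be sampled exactly.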

Basis transformation techniques

Marshall-Peierls transformation ($U_{sq}$): $U_{sq} = \exp\left(-i\pi\sum_{j\in B_{sq}}\hat{S}^z_j\right)$

120° transformation ($U_{tri}$): $U_{tri} = \exp\left(-\frac{2\pi i}{3}\left[\sum_{b\in B_{tri}}\hat{S}^z_b - \sum_{c\in C_{tri}}\hat{S}^z_c\right]\right)$
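Both unitaries are diagonal in the $S^z$ basis, so each contributes a pure phase per configuration. The sketch below evaluates those phases under assumed sublattice labelings: $B_{sq}$ is taken to be the sites with odd $x+y$, and the three triangular sublattices are labeled by $(x+y) \bmod 3$, which is consistent when the neighbor directions are $(1,0)$, $(0,1)$ and $(1,1)$ and $L$ is a multiple of 3. These labelings and the function names are illustrative assumptions, not taken from the paper.

```python
import cmath
import math


def tri_sublattice(x, y):
    """Three-sublattice label 0/1/2 (A/B/C) for neighbors (1,0), (0,1), (1,1)."""
    return (x + y) % 3


def u_sq_phase(sz, L):
    """Diagonal entry of U_sq for a list of S^z values (+0.5 or -0.5)."""
    total_b = sum(sz[x * L + y] for x in range(L) for y in range(L)
                  if (x + y) % 2 == 1)
    return cmath.exp(-1j * math.pi * total_b)


def u_tri_phase(sz, L):
    """Diagonal entry of U_tri for a list of S^z values (+0.5 or -0.5)."""
    total_b = sum(sz[x * L + y] for x in range(L) for y in range(L)
                  if tri_sublattice(x, y) == 1)
    total_c = sum(sz[x * L + y] for x in range(L) for y in range(L)
                  if tri_sublattice(x, y) == 2)
    return cmath.exp(-2j * math.pi / 3 * (total_b - total_c))


L = 6
all_up = [0.5] * (L * L)
phase_sq = u_sq_phase(all_up, L)
phase_tri = u_tri_phase(all_up, L)

# Flip one spin on sublattice B (site (0, 1)) and recompute the 120° phase.
one_flip = list(all_up)
one_flip[1] = -0.5
phase_tri_flip = u_tri_phase(one_flip, L)
```

Flipping a single spin on sublattice B changes the 120° phase by a factor $e^{2\pi i/3}$, which is the sense in which the rotation winds spins by 120° between sublattices.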

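Variational neural annealing

The optimization minimizes a pseudo free energy, $F_W(t) = E_W - T(t)S_{classical}(p_W)$, where $T(t)$ is the annealing temperature and $S_{classical}$ is the Shannon entropy.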
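As a rough illustration of how this objective can be estimated from samples, the sketch below uses the fact that the Shannon entropy of $p_W$ equals $-\mathbb{E}_{\sigma\sim p_W}[\ln p_W(\sigma)]$. The function name and the inputs (real-valued local energies and log-probabilities of samples drawn from $p_W$) are assumptions for illustration; the gradient estimator used in actual training is not shown.

```python
import math


def pseudo_free_energy(local_energies, log_probs, temp):
    """Monte Carlo estimate of F = E - T * S from samples drawn from p_W.

    local_energies: real parts of the local energies of the samples.
    log_probs: log p_W(sigma) for the same samples.
    The entropy S is estimated as -mean(log p_W(sigma)).
    """
    n = len(local_energies)
    energy = sum(local_energies) / n
    entropy = -sum(log_probs) / n
    return energy - temp * entropy


# Two samples from a distribution that assigns each probability 1/2.
f_hot = pseudo_free_energy([-1.0, -2.0], [math.log(0.5)] * 2, temp=1.0)
f_cold = pseudo_free_energy([-1.0, -2.0], [math.log(0.5)] * 2, temp=0.0)
```

At $T = 0$ the estimate reduces to the plain variational energy, while at $T > 0$ the entropy term rewards broad distributions, which is what lets annealing explore before it commits.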

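Technical innovations

Weight sharing: the number of RNN parameters is independent of system size, which enables iterative retraining
Symmetry averaging: the $C_{6v}$ group average is applied only to the wavefunction amplitude, avoiding the numerical instability of averaging phases
Parameterized training schedule: $N_{steps}(L,s,r;L_0,C,F) = s \times [C\exp(-r(L-L_0)) + F]$
Zero-variance extrapolation: a sequence of systematically improved variational states is used to obtain a more accurate energy estimate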
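The training-schedule formula in the list above is straightforward to evaluate. The sketch below does so for the five lattice sizes; the numerical values of `s`, `r`, `C` and `F` are hypothetical and chosen only to show the shape of the schedule, not the values used in the paper.

```python
import math


def n_steps(L, s, r, L0, C, F):
    """N_steps(L, s, r; L0, C, F) = s * [C * exp(-r * (L - L0)) + F]."""
    return s * (C * math.exp(-r * (L - L0)) + F)


# Hypothetical parameter values, for illustration only.
sizes = (6, 12, 18, 24, 30)
schedule = {L: round(n_steps(L, s=1.0, r=0.25, L0=6, C=10000, F=1000))
            for L in sizes}
```

With $r > 0$ the number of steps decays exponentially from $s(C+F)$ at $L = L_0$ toward the floor $sF$, so most of the training budget goes to the smallest lattice and the larger lattices inherit the result through retraining.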

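Experimental setup

System parameters

Lattice sizes: L = 6, 12, 18, 24, 30 (periodic boundary conditions)
Hidden vector dimension: $d_h$ is held fixed (chosen large enough for sufficient expressivity)
Symmetries: U(1) symmetry is enforced (zero magnetization) and $C_{6v}$ point-group symmetry is applied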

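Training strategy

Four-stage training (L=6):

Fixed learning rate $\gamma = 5 \times 10^{-4}$ at temperature $T_0$
Variational neural annealing: the temperature is lowered linearly to 0
Learning-rate decay: $\gamma(t) = \gamma_0 \times (1+t/\delta)^{-1}$
Symmetries are applied, followed by a final optimization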
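The first three stages can be summarized as a pair of schedules for the learning rate and the temperature. In the sketch below, the stage lengths `n_warm` and `n_anneal`, the decay constant `delta`, and the choice to keep the learning rate fixed during annealing are all assumptions for illustration; the final symmetrized stage is not modeled.

```python
def learning_rate(t, gamma0=5e-4, delta=1000.0):
    """gamma(t) = gamma0 / (1 + t / delta); delta here is a hypothetical value."""
    return gamma0 / (1.0 + t / delta)


def linear_temperature(t, n_anneal, t0):
    """Temperature lowered linearly from t0 at t = 0 to 0 at t = n_anneal."""
    return t0 * max(0.0, 1.0 - t / n_anneal)


def stage_settings(step, n_warm, n_anneal, t0, gamma0=5e-4, delta=1000.0):
    """Return (learning rate, temperature) at a global training step.

    Stage 1: fixed rate, fixed temperature t0.
    Stage 2: fixed rate (an assumption), temperature annealed linearly to 0.
    Stage 3: temperature 0, decaying learning rate.
    """
    if step < n_warm:
        return gamma0, t0
    if step < n_warm + n_anneal:
        return gamma0, linear_temperature(step - n_warm, n_anneal, t0)
    return learning_rate(step - n_warm - n_anneal, gamma0, delta), 0.0


start = stage_settings(0, n_warm=100, n_anneal=200, t0=1.0)
mid_anneal = stage_settings(200, n_warm=100, n_anneal=200, t0=1.0)
late = stage_settings(1300, n_warm=100, n_anneal=200, t0=1.0)
```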

Iterative retraining: the optimized parameters from a smaller lattice initialize training on the next larger lattice

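Evaluation metrics

Variational energy: $E_W = \langle\Psi_W|\hat{H}|\Psi_W\rangle/\langle\Psi_W|\Psi_W\rangle$
Energy variance: measures how close the state is to an eigenstate
V-score: $V = N\,\mathrm{var}(E)/(E-E_\infty)^2$
Sublattice magnetization: computed from momentum-space correlation functions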
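The first three metrics can be estimated directly from sampled local energies. The sketch below assumes real-valued local energies of the total (not per-site) energy and takes $E_\infty = 0$ as a default, which is an assumption here rather than a value stated in this summary; the sample values are made up for illustration.

```python
def energy_stats(local_energies):
    """Mean and variance of real local energies sampled from |Psi_W|^2."""
    n = len(local_energies)
    mean = sum(local_energies) / n
    var = sum((e - mean) ** 2 for e in local_energies) / n
    return mean, var


def v_score(n_sites, mean, var, e_inf=0.0):
    """V = N * var(E) / (E - E_inf)^2; e_inf = 0.0 is an assumed default."""
    return n_sites * var / (mean - e_inf) ** 2


# Made-up local energies for a 6 x 6 lattice (N = 36).
samples = [-20.0, -20.2, -19.8, -20.0]
mean_e, var_e = energy_stats(samples)
score = v_score(36, mean_e, var_e)
```

For an exact eigenstate every local energy equals the eigenvalue, so both the variance and the V-score vanish; smaller values therefore indicate a better variational state.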