The exploitation of space group symmetries in numerical calculations of periodic crystalline solids accelerates calculations and provides physical insight. We present results for a space-group symmetry adaptation of electronic structure calculations within the finite-temperature self-consistent GW method along with an efficient parallelization scheme on accelerators. Our implementation employs the simultaneous diagonalization of the Dirac characters of the orbital representation. Results show that symmetry adaptation in self-consistent many-body codes results in substantial improvements of the runtime, and that block diagonalization on top of a restriction to the irreducible wedge results in additional speedup.
Symmetry adaptation for self-consistent many-body calculations
- Paper ID: 2405.09494
- Title: Symmetry adaptation for self-consistent many-body calculations
- Authors: Xinyang Dong (AI for Science Institute Beijing & University of Michigan), Emanuel Gull (University of Michigan)
- Classification: physics.comp-ph
- Publication Date: May 16, 2024 (Preprint submitted to Computer Physics Communications)
- Paper Link: https://arxiv.org/abs/2405.09494
This paper investigates how space group symmetry can be used to accelerate numerical calculations of periodic crystalline solids and to provide physical insight. The authors implement space group symmetry adaptation in finite-temperature self-consistent GW electronic structure calculations and propose an efficient parallelization scheme on accelerators. The implementation relies on the simultaneous diagonalization of the Dirac characters of the orbital representation. Results show that symmetry adaptation significantly improves the runtime of self-consistent many-body codes, and that block diagonalization on top of the restriction to the irreducible wedge yields additional speedup.
- Problem to be addressed: Modern many-body theoretical calculations (such as self-consistent GW methods) face enormous computational burdens when processing periodic crystalline materials, requiring repeated calculations of complex objects such as frequency-dependent propagators, vertex functions, and screened interactions.
- Problem significance:
- Space group symmetry is fundamental to understanding crystalline materials and provides physical insights
- Exploitation of symmetry can significantly accelerate numerical computations
- Modern computational architectures such as GPUs can effectively utilize the parallelism exposed by group structures
- Limitations of existing methods:
- Standard electronic structure codes (Hartree-Fock, DFT, non-self-consistent GW) are primarily based on single-particle density matrices, for which a mature symmetry adaptation formalism exists
- Modern many-body techniques require objects beyond the density matrix (frequency-dependent propagators, vertex functions, screened interactions), for which the symmetry adaptation formalism is far less developed
- Research motivation: Extend the symmetry adaptation formalism pioneered by Dovesi et al. in Hartree-Fock and DFT theory to self-consistent GW methods, and implement efficient parallelization on modern GPU architectures.
- Method extension: Extend the symmetry adaptation method based on simultaneous diagonalization of Dirac characters from single-particle theory to self-consistent many-body GW calculations
- Efficient implementation: Develop an efficient parallelization scheme on GPU accelerators, implementing hybrid MPI and CUDA parallelization
- Performance improvement: Demonstrate that symmetry adaptation combined with block diagonalization can achieve approximately one order of magnitude reduction in floating-point operations
- Algorithm optimization: Propose a complete numerical algorithm for handling non-symmorphic space groups and projective representations
This paper studies how to exploit space group symmetry to accelerate electronic structure calculations of periodic crystalline solids at finite temperature, particularly in self-consistent GW methods. The input consists of crystal structure and Hamiltonian, with output being self-consistent Green's functions and self-energy.
- Space group operations: Represented as $\hat{\alpha} = \{\alpha \mid \mathbf{v}(\alpha)\}$, where $\alpha$ is a point group operation and $\mathbf{v}(\alpha)$ is a translation
- Orbital transformation: The action of symmetry operations on orbitals is:
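$$\hat{\alpha}\, g^{\mathbf{k}}_{(xj)}(\mathbf{r}) = \exp[-i\tilde{\mathbf{k}} \cdot \mathbf{v}(\alpha x)] \times [O(\alpha)\, g^{\tilde{\mathbf{k}}}_{(xj)}(\mathbf{r})]$$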
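For momentum $\mathbf{k}$, the projective representation matrix and the associated factor are defined as:
$$D^{\mathbf{k}}(\alpha) = \exp[i\mathbf{k} \cdot \mathbf{v}(\alpha)]\, O^{\mathbf{k}}(\hat{\alpha}), \qquad \lambda^{\mathbf{k}}(\alpha, \beta) = \exp\{i\mathbf{k} \cdot [\mathbf{v}(\beta) - \alpha \mathbf{v}(\beta)]\}$$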
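The factor $\lambda^{\mathbf{k}}$ can be rewritten in a form that makes its role easier to see. The step below is a short worked sketch, not taken from the paper. It assumes that $\alpha$ acts on vectors as an orthogonal matrix and that $\lambda^{\mathbf{k}}$ enters as the usual factor system of a projective representation, $D^{\mathbf{k}}(\alpha) D^{\mathbf{k}}(\beta) = \lambda^{\mathbf{k}}(\alpha,\beta) D^{\mathbf{k}}(\alpha\beta)$; this multiplication rule is an assumption here.

```latex
\mathbf{k} \cdot \alpha \mathbf{v}(\beta) = (\alpha^{-1}\mathbf{k}) \cdot \mathbf{v}(\beta)
\quad\Longrightarrow\quad
\lambda^{\mathbf{k}}(\alpha, \beta)
  = \exp\{ -i\, (\alpha^{-1}\mathbf{k} - \mathbf{k}) \cdot \mathbf{v}(\beta) \}
```

In this form the factor is 1 whenever $\mathbf{v}(\beta) = 0$ or $\alpha^{-1}\mathbf{k} = \mathbf{k}$ exactly. A non-trivial phase can appear only when $\alpha^{-1}\mathbf{k}$ differs from $\mathbf{k}$ by a reciprocal lattice vector and $\beta$ carries a fractional translation, which is the non-symmorphic case.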
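- Dirac character definition: for a conjugacy class $c$ with $n_c$ elements and representative $\gamma$, in a group $G$ of order $h$:
$$\Omega_c = \frac{n_c}{h} \sum_{\beta \in G} D(\beta) \cdot D(\gamma) \cdot D(\beta)^{-1}$$
- Simultaneous diagonalization: Obtain the transformation matrix $U^{\mathbf{k}}$ through simultaneous diagonalization of all relevant Dirac characters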
- Extension to many-body theory: First systematic application of simultaneous diagonalization of Dirac characters to self-consistent GW calculations
- Tensor transformation: Develop symmetry transformation formulas for three-index interaction tensors:
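$$V^{\tilde{\mathbf{k}}_i \tilde{\mathbf{k}}_j} = \bar{O}^{\mathbf{q}}(\hat{\alpha})\, O^{\mathbf{k}_i}(\hat{\alpha})\, V^{\mathbf{k}_i \mathbf{k}_j}\, O^{\mathbf{k}_j \dagger}(\hat{\alpha})$$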
- GPU optimization: Design GPU acceleration scheme with asynchronous stream processing and batched ZGEMM calls
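The Dirac character definition and the simultaneous diagonalization step listed above can be illustrated on a toy problem. The sketch below is not the paper's implementation. It assumes an ordinary (non-projective) representation, namely the six permutations of three equivalent orbitals, and it swaps in a simple technique for the simultaneous diagonalization: diagonalizing one random linear combination of the commuting Dirac characters. All function names are hypothetical.

```python
import itertools
import numpy as np

def permutation_matrix(p):
    """Matrix D with D[p[j], j] = 1: basis function j is mapped to p[j]."""
    n = len(p)
    m = np.zeros((n, n))
    m[list(p), np.arange(n)] = 1.0
    return m

# Toy orbital representation (assumption): all permutations of 3 orbitals.
group = [permutation_matrix(p) for p in itertools.permutations(range(3))]
h = len(group)  # group order

def key(m):
    """Hashable fingerprint of an integer-valued matrix."""
    return tuple(np.rint(m).astype(int).ravel())

def dirac_characters(group):
    """One matrix per class: (n_c / h) * sum_b D(b) D(gamma) D(b)^-1."""
    seen, omegas = set(), []
    for gamma in group:
        if key(gamma) in seen:
            continue  # this class was already handled
        # b.T equals b^-1 because permutation matrices are orthogonal
        conjugates = [b @ gamma @ b.T for b in group]
        cls = {key(c) for c in conjugates}
        seen |= cls
        omegas.append(len(cls) / h * sum(conjugates))
    return omegas

def simultaneous_eigenbasis(omegas, seed=0, tol=1e-8):
    """Diagonalize a random mix of the commuting characters.

    Assumes every character is real symmetric, which holds for this toy
    group. Returns U and an integer block label for each column of U.
    """
    rng = np.random.default_rng(seed)
    mix = sum(rng.normal() * w for w in omegas)
    evals, U = np.linalg.eigh(mix)
    # Columns with (numerically) equal eigenvalues belong to the same block
    labels = np.concatenate([[0], np.cumsum(np.diff(evals) > tol)])
    return U, labels

omegas = dirac_characters(group)
U, labels = simultaneous_eigenbasis(omegas)

# In the adapted basis, every group element is block diagonal.
off_block = labels[:, None] != labels[None, :]
for D in group:
    assert np.allclose((U.T @ D @ U)[off_block], 0.0)
print("block sizes:", np.bincount(labels))
```

Because the Dirac characters commute with every representation matrix, their common eigenspaces are invariant subspaces, which is why the transformed matrices have no entries connecting different blocks. Handling projective representations at general $\mathbf{k}$ would require complex matrices and the phase factors defined earlier.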
Four group-IV and III-V materials were tested:
- Si (space group 227, non-symmorphic)
- BN (space group 194, non-symmorphic)
- AlP (space group 216, symmorphic)
- GaAs (space group 216, symmorphic)
- Basis set: gth-dzvp basis set with def2-svp-ri auxiliary basis set
- Imaginary-time and frequency grids: 114 imaginary time points, 103 bosonic frequency points
- Momentum grid: $n_k \times n_k \times n_k$ with $n_k$ = 1, 2, 4, 6
- Floating-point operations (FLOP)
- GPU acceleration ratio
- Memory usage
- Full Brillouin zone calculation (Full)
- Irreducible wedge rotation only (Rotation)
- Rotation + block diagonalization (Block Diag)
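The restriction to the irreducible wedge used in the comparison above can be made concrete by counting symmetry-inequivalent k-points. The following is a minimal sketch under stated assumptions: a Γ-centered $n_k \times n_k \times n_k$ grid, an fcc lattice (as for Si), and the full cubic point group of 48 operations acting on the grid. The function names are hypothetical and the code is not taken from the paper.

```python
import itertools
import numpy as np

def cubic_point_group():
    """All 48 signed 3x3 permutation matrices (cubic point group, Cartesian)."""
    ops = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            R = np.zeros((3, 3), dtype=int)
            for row in range(3):
                R[row, perm[row]] = signs[row]
            ops.append(R)
    return ops

def irreducible_kpoints(nk):
    """One representative per symmetry orbit of a Gamma-centered nk^3 grid."""
    # Columns: reciprocal basis vectors of an fcc lattice, in units of 2*pi/a
    B = np.array([[-1, 1, 1], [1, -1, 1], [1, 1, -1]], dtype=float)
    # Express each rotation in fractional (grid) coordinates: M = B^-1 R B
    ops_frac = [np.rint(np.linalg.solve(B, R @ B)).astype(int)
                for R in cubic_point_group()]
    visited, representatives = set(), []
    for k in itertools.product(range(nk), repeat=3):
        if k in visited:
            continue  # equivalent to an earlier representative
        representatives.append(k)
        for M in ops_frac:
            image = (M @ np.array(k)) % nk  # fold back into the grid
            visited.add(tuple(int(x) for x in image))
    return representatives

for nk in (1, 2, 4, 6):
    print(nk, len(irreducible_kpoints(nk)))
```

Only the point-group part of each space group operation acts on $\mathbf{k}$, so the fractional translations of a non-symmorphic group do not change this count; they enter through the phase factors of the representation matrices instead.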
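Taking Si as an example, FLOP counts at different $n_k$ values ($n_{ik}$ is the number of k-points in the irreducible wedge):
| $n_k$ | $n_{ik}$ | Full (FLOP) | Rotation (FLOP) | Block Diag (FLOP) | Speedup (Full / Block Diag) |
|---|---|---|---|---|---|
| 1 | 1 | $1.31 \times 10^{10}$ | $1.31 \times 10^{10}$ | $1.50 \times 10^{9}$ | 8.7× |
| 2 | 3 | $1.73 \times 10^{12}$ | $1.01 \times 10^{12}$ | $2.24 \times 10^{11}$ | 7.7× |
| 4 | 8 | $1.10 \times 10^{14}$ | $2.13 \times 10^{13}$ | $8.55 \times 10^{12}$ | 12.9× |
| 6 | 16 | $1.25 \times 10^{15}$ | $1.43 \times 10^{14}$ | $6.87 \times 10^{13}$ | 18.2× |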
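The total speedup in the table can be split into the part gained from rotation alone and the part gained from block diagonalization on top of it. The short sketch below does this arithmetic using only the FLOP counts from the table; the dictionary layout and the function name are illustrative choices.

```python
# FLOP counts for Si from the table: nk -> (Full, Rotation, Block Diag)
flops = {
    1: (1.31e10, 1.31e10, 1.50e9),
    2: (1.73e12, 1.01e12, 2.24e11),
    4: (1.10e14, 2.13e13, 8.55e12),
    6: (1.25e15, 1.43e14, 6.87e13),
}

def speedups(full, rotation, block_diag):
    """Split the total FLOP reduction into its two multiplicative factors."""
    return {
        "rotation": full / rotation,          # irreducible wedge only
        "block_diag": rotation / block_diag,  # extra gain from block structure
        "total": full / block_diag,           # the table's Speedup column
    }

for nk, counts in flops.items():
    s = speedups(*counts)
    print(f"nk={nk}: rotation {s['rotation']:.2f}x, "
          f"block diag {s['block_diag']:.2f}x, total {s['total']:.2f}x")
```

The two factors move in opposite directions as the grid is refined: the rotation factor grows with $n_k$, while the additional factor from block diagonalization shrinks.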
- Achieved near-ideal linear speedup on 16 V100 GPUs
- Both P0 and Σ̃ computation kernels demonstrate excellent scalability
- Effect of k-point count: The advantage of rotating between symmetry-equivalent k-points becomes more pronounced as the total number of k-points increases
- Block diagonalization advantage: The relative gain from block diagonalization is larger for coarse k-grids, where a larger fraction of the k-points lie on the boundary of the irreducible Brillouin zone (IBZ)
- Non-symmorphic group advantage: Materials with non-symmorphic space groups (Si, BN) show greater speedup than those with symmorphic groups
- Traditional symmetry adaptation: Pioneering work by Dovesi et al. in the CRYSTAL code
- Many-body theory: Hedin's GW method and its self-consistent implementation
- GPU computing: Accelerator optimization for electronic structure calculations
- First systematic extension of symmetry adaptation to self-consistent many-body calculations
- Provides complete treatment of non-symmorphic space groups
- Implements efficient GPU parallelization
- Symmetry adaptation brings significant runtime improvements in self-consistent many-body codes
- Block diagonalization provides additional acceleration on top of the restriction to the irreducible wedge
- GPU architecture effectively utilizes the parallelism exposed by symmetry
- Current implementation is limited to standard space groups, excluding magnetic space groups
- For systems with very large k-point counts, the advantage of block diagonalization diminishes
- Requires sufficient GPU memory to store critical data structures
- Magnetic space groups: Extension to Shubnikov groups for handling magnetic and relativistic systems
- Optical response: Exploit symmetry knowledge to interpret optical response functions
- Higher-order methods: Application to more accurate simulation methods including vertex functions
- Theoretical rigor: Based on mature group theory foundations with complete mathematical derivations
- Practical value: Achieves approximately one order of magnitude computational acceleration, significant for large-scale calculations
- Technical completeness: Provides complete solution from theory to implementation
- Performance verification: Method effectiveness validated across multiple material systems
- Scope of applicability: Currently limited to periodic systems; extensibility to surface or defect systems remains unclear
- Memory requirements: GPU implementation has high memory demands, potentially limiting application to large systems
- Algorithm stability: Simultaneous diagonalization may encounter numerical stability issues for large orbital representation matrices
- Academic contribution: Provides a reference approach for symmetry exploitation in many-body calculations
- Practical value: Significantly reduces computational cost of self-consistent GW calculations, enabling calculations on larger systems
- Reproducibility: Implemented based on open-source software, facilitating community adoption and improvement
- Periodic crystalline materials with high symmetry
- Electronic structure calculations requiring accurate many-body effects
- Large-scale parallel computing environments, particularly GPU clusters
This paper is primarily based on the following key works:
- Dovesi et al.'s symmetry adaptation theory (Int. J. Quantum Chem. 1986, 1998)
- Hedin's GW method (Phys. Rev. 1965)
- Bradley & Cracknell's mathematical theory of solid symmetry
- Lax's symmetry principles in solid and molecular physics
This paper represents an important contribution to computational physics, successfully combining symmetry theory with modern many-body calculations and GPU acceleration technology, providing a new solution for efficient electronic structure calculations.