2025-11-18T22:43:13.755250

Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms

Mukunoki, Ozaki
To obtain accurate results in numerical computation, high-precision arithmetic is a straightforward approach. However, most processors lack hardware support for floating-point formats beyond double precision (FP64). Double-word arithmetic (Dekker 1971) extends precision by using standard floating-point operations to represent numbers with twice the mantissa length. Building on this concept, various multi-word arithmetic methods have been proposed to further increase precision by combining additional words. Simplified variants, known as quasi algorithms, have also been introduced, which trade a certain loss of accuracy for reduced computational cost. In this study, we investigate the performance of quasi algorithms for double- and triple-word arithmetic in sparse iterative solvers based on the Conjugate Gradient method, and compare them with both non-quasi algorithms and standard FP64. We evaluate execution time on an x86 processor, the number of iterations to convergence, and solution accuracy. Although quasi algorithms require appropriate normalization to preserve accuracy - without it, convergence cannot be achieved - they can still reduce runtime when normalization is applied correctly, while maintaining accuracy comparable to full multi-word algorithms. In particular, quasi triple-word arithmetic can yield more accurate solutions without significantly increasing execution time relative to double-word arithmetic and its quasi variant. Furthermore, for certain problems, a reduction in iteration count contributes to additional speedup. Thus, quasi triple-word arithmetic can serve as a compelling alternative to conventional double-word arithmetic in sparse iterative solvers.
academic

Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms

Basic Information

  • Paper ID: 2510.13536
  • Title: Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms
  • Authors: Daichi Mukunoki (Nagoya University), Katsuhisa Ozaki (Shibaura Institute of Technology)
  • Classification: cs.MS (Mathematical Software)
  • Publication Date: October 15, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.13536

Abstract

To obtain accurate results in numerical computations, high-precision arithmetic is a direct approach. However, most processors lack hardware support for floating-point formats beyond double precision (FP64). Double-word arithmetic (Dekker 1971) extends precision by representing numbers with twice the mantissa length using standard floating-point operations. Building on this concept, various multi-word arithmetic methods have been proposed to further increase precision by combining additional words. Simplified variants, namely quasi-algorithms, have also been introduced, trading certain precision loss for reduced computational cost. This study investigates the performance of quasi-algorithms for double-word and triple-word arithmetic in sparse iterative solvers based on the conjugate gradient method, comparing them with non-quasi variants and standard FP64.

Research Background and Motivation

Core Problems

  1. Hardware Limitations: Most processors lack hardware support for floating-point formats beyond double precision (FP64), limiting the implementation of high-precision numerical computations
  2. Precision Requirements of Sparse Iterative Solvers: When solving large-scale sparse linear systems, rounding errors increase the number of iterations required for convergence, affecting solution precision and efficiency
  3. Trade-off Between Performance and Precision: Traditional multi-word arithmetic methods can improve precision but incur significant computational overhead

Research Significance

  • Sparse iterative solvers are widely applied in scientific computing and engineering applications
  • High-precision arithmetic can improve convergence and reduce iteration counts
  • In memory-constrained applications, the additional overhead of multi-word arithmetic may be masked by memory latency

Limitations of Existing Methods

  • Traditional multi-word arithmetic (e.g., DW, TW) has high computational cost
  • Quasi-algorithms reduce computational cost but may result in precision loss
  • Lack of systematic evaluation of quasi-algorithm performance in iterative solvers

Core Contributions

  1. Systematic Performance Evaluation of Quasi-Algorithms: First systematic evaluation of QDW and QTW algorithm performance in sparse iterative solvers
  2. Discovery of Normalization's Critical Role: Demonstrates the importance of appropriate normalization for quasi-algorithm convergence
  3. Proposal of QTW as Effective Alternative: Proves that quasi-triple-word arithmetic (QTW) can serve as an effective alternative to traditional double-word arithmetic
  4. Comprehensive Performance Analysis: Comprehensive evaluation from three dimensions: execution time, iteration count, and solution precision

Methodology Details

Problem Definition

Solve the symmetric positive definite linear system Ax = b, where:

  • A is an n×n symmetric positive definite sparse matrix
  • b is the right-hand side vector
  • x is the solution vector to be computed

The conjugate gradient (CG) method is used for iterative solving, evaluating performance across different precision arithmetic.

Multi-Word Arithmetic Architecture

Fundamental Algorithms

Error-Free Transformation Algorithms:

  • TwoSum(a,b): Decomposes a+b into floating-point result x and rounding error y
  • QuickTwoSum(a,b): Efficient variant of TwoSum, requiring |a|≥|b|
  • TwoProdFMA(a,b): Decomposes a×b into result and error using FMA operations

Double-Word Arithmetic (DW)

DWadd: [c1,c2] = DWadd(a1,a2,b1,b2)
- Operands: 11 FP64 operations
- Includes normalization step (QuickTwoSum)

DWmul: [c1,c2] = DWmul(a1,a2,b1,b2)  
- Operands: 7 FP64 operations
- Includes normalization step

Quasi Double-Word Arithmetic (QDW)

  • Omits normalization step, allowing high and low word overlap
  • QDWadd: 8 operations, QDWmul: 4 operations
  • Significantly reduced computational cost

Quasi Triple-Word Arithmetic (QTW)

QTWadd: [c1,c2,c3] = QTWadd(a1,a2,a3,b1,b2,b3)
- Operands: 21 FP64 operations
- Does not enforce fl(c1+c2)=c1 and fl(c2+c3)=c2

QTWmul: [c1,c2,c3] = QTWmul(a1,a2,a3,b1,b2,b3)
- Operands: 24 FP64 operations

Technical Innovations

  1. SIMD Vectorization Optimization:
    • Vectorization using AVX2 and AVX-512 instruction sets
    • QTW algorithm eliminates conditional branches, better suited for vectorization
  2. Normalization Strategy:
    • Normalization performed after residual vector update in CG method
    • VecSum3 algorithm used to mitigate bit overlap in triple-word arithmetic
  3. Mixed-Precision Implementation:
    • Coefficient matrix A and right-hand side vector b stored in FP64
    • Internal computations use corresponding multi-word arithmetic

Experimental Setup

Datasets

Eight symmetric positive definite matrices from the SuiteSparse collection:

MatrixDimension nNon-zeros nnzApplication Domain
Hook_14981,498,02360,917,445Structural problems
bone010986,70347,851,783Model reduction
nd24k72,00028,715,6342D/3D problems
crankseg_263,83814,148,858Structural problems

Evaluation Metrics

  1. Execution Time: Per-iteration time and total convergence time
  2. Iteration Count: Number of iterations required to achieve convergence
  3. Solution Precision: Relative error norm ||xk-x*||2/||x*||2

Comparison Methods

  • CG-FP64: Standard double-precision implementation
  • CG-DW: Double-word arithmetic implementation
  • CG-QDW: Quasi double-word arithmetic implementation
  • CG-TW: Triple-word arithmetic implementation
  • CG-QTW: Quasi triple-word arithmetic implementation

Implementation Details

  • Hardware Platform: Intel Xeon Gold 6230 CPU (20 cores, 2.10-3.90 GHz)
  • Compiler: g++ 11.3.0, optimization flags -O3 -march=native
  • Parallelization: OpenMP + SIMD vectorization
  • Convergence Tolerance: ε = 10^-16, 10^-24, 10^-32

Experimental Results

Main Results

Performance Overhead Analysis

Execution time overhead relative to FP64 (100 iterations):

  • CG-QDW: approximately 1.3×
  • CG-DW: approximately 2.1×
  • CG-QTW: approximately 2.4×
  • CG-TW: up to 67×

Convergence Performance Comparison

Typical results under ε=10^-16 convergence tolerance:

MatrixFP64 Time(s)/IterationsQDW Time(s)/IterationsQTW Time(s)/Iterations
bone010170/21780120/12547150/11352
pdb1HYS5.4/128073.4/62854.9/5346

Key Findings

  1. Necessity of Normalization:
    • Quasi-algorithms fail to converge without normalization
    • Normalization after residual vector update ensures convergence
  2. Advantages of QTW:
    • Significantly reduced computational overhead compared to TW
    • Achieves comparable precision to TW
    • Supports SIMD vectorization for better performance
  3. Benefits of Reduced Iteration Count:
    • High-precision arithmetic reduces iteration count
    • Total execution time may be lower than FP64 implementation

Throughput Analysis

SpMV operation throughput (GB/s):

  • FP64 and QDW: Close to memory bandwidth limit (approximately 90 GB/s)
  • DW and QTW: Reach memory-bound performance after SIMD optimization
  • TW: Significantly degraded performance due to branch effects

Multi-Word Arithmetic Development

  • Foundational Theory: Dekker (1971) double-word arithmetic
  • Extended Methods: Triple-word (TW), quad-word (QW) arithmetic
  • Simplified Variants: Quasi-algorithms (QDW, QTW, QQW)

High-Precision Linear Algebra Libraries

  • QD Library: Fortran/C++ implementation of double-word and quad-word arithmetic
  • XBLAS: BLAS routines based on DW arithmetic
  • MPLAPACK: High-precision BLAS and LAPACK

Sparse Iterative Solver Applications

  • Research on quad-precision CG solvers
  • Mixed-precision methods
  • Ozaki scheme for accurate sparse matrix-vector multiplication

Conclusions and Discussion

Main Conclusions

  1. Feasibility of Quasi-Algorithms: Through appropriate normalization, quasi-algorithms can be effectively applied in sparse iterative solvers
  2. Advantages of QTW: Quasi-triple-word arithmetic provides a good balance between precision and performance
  3. Performance Improvement Potential: On certain problems, reduced iteration counts can provide additional acceleration

Limitations

  1. Normalization Overhead: Requires trade-off between precision and execution time
  2. Problem Dependency: Performance improvement depends on specific problem characteristics
  3. Evaluation Scope: Only evaluates basic CG method, does not include preconditioning techniques

Future Directions

  1. Normalization Strategy Optimization: Study effects of more frequent normalization on precision
  2. Extension to Other Iterative Methods: Evaluate application in other solvers
  3. Distributed Environment Applications: Potential in communication-latency-dominated environments
  4. Low-Precision Format Implementation: Multi-word arithmetic implementation using FP16/FP32 on AI processors

In-Depth Evaluation

Strengths

  1. Systematic Study: First systematic evaluation of quasi-algorithm performance in iterative solvers
  2. High Practical Value: QTW algorithm provides a practical high-precision computation scheme
  3. Comprehensive Experiments: Thorough evaluation from multiple dimensions (time, precision, convergence)
  4. Sound Technical Innovation: Well-designed SIMD optimization and normalization strategies

Weaknesses

  1. Insufficient Theoretical Analysis: Lacks theoretical analysis of error accumulation in quasi-algorithms
  2. Limited Evaluation Scope: Only evaluates CG method, lacks verification on other iterative methods
  3. Single Normalization Strategy: Only attempts one normalization location and frequency

Impact

  • Academic Contribution: Provides new algorithm options for high-precision numerical computation
  • Practical Value: QTW algorithm can be directly applied to practical scientific computing problems
  • Reproducibility: Sufficient implementation details for reproduction

Applicable Scenarios

  1. Scientific Computing: Large-scale sparse linear systems requiring high-precision solutions
  2. Engineering Simulation: Structural analysis, electromagnetic field computation, etc.
  3. Resource-Constrained Environments: Systems lacking hardware quad-precision support

References

This paper cites 29 relevant references covering key works in multi-word arithmetic theory, high-precision linear algebra libraries, and sparse iterative solvers, providing a solid theoretical foundation for the research.