Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms

Mukunoki, Ozaki

To obtain accurate results in numerical computation, high-precision arithmetic is a straightforward approach. However, most processors lack hardware support for floating-point formats beyond double precision (FP64). Double-word arithmetic (Dekker 1971) extends precision by using standard floating-point operations to represent numbers with twice the mantissa length. Building on this concept, various multi-word arithmetic methods have been proposed to further increase precision by combining additional words. Simplified variants, known as quasi algorithms, have also been introduced, which trade a certain loss of accuracy for reduced computational cost. In this study, we investigate the performance of quasi algorithms for double- and triple-word arithmetic in sparse iterative solvers based on the Conjugate Gradient method, and compare them with both non-quasi algorithms and standard FP64. We evaluate execution time on an x86 processor, the number of iterations to convergence, and solution accuracy. Although quasi algorithms require appropriate normalization to preserve accuracy - without it, convergence cannot be achieved - they can still reduce runtime when normalization is applied correctly, while maintaining accuracy comparable to full multi-word algorithms. In particular, quasi triple-word arithmetic can yield more accurate solutions without significantly increasing execution time relative to double-word arithmetic and its quasi variant. Furthermore, for certain problems, a reduction in iteration count contributes to additional speedup. Thus, quasi triple-word arithmetic can serve as a compelling alternative to conventional double-word arithmetic in sparse iterative solvers.

Basic Information

Paper ID: 2510.13536
Title: Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms
Authors: Daichi Mukunoki (Nagoya University), Katsuhisa Ozaki (Shibaura Institute of Technology)
Category: cs.MS (Mathematical Software)
Published: October 15, 2025 (arXiv preprint)
Paper link: https://arxiv.org/abs/2510.13536

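Hardware limitations: most processors lack hardware support for floating-point formats beyond double precision (FP64), which constrains how high-precision numerical computation can be implemented.
Precision requirements of sparse iterative solvers: when solving large sparse linear systems, rounding errors can increase the number of iterations needed for convergence, which affects both solution accuracy and efficiency.
Performance versus accuracy trade-off: conventional multi-word arithmetic improves precision, but at a considerable computational cost.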

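Research Significance

Sparse iterative solvers are widely used in scientific computing and engineering applications.
High-precision arithmetic can improve convergence and reduce the number of iterations.
In memory-bound applications, the extra arithmetic cost of multi-word operations may be hidden behind memory access latency.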

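Limitations of Existing Approaches

Conventional multi-word arithmetic (e.g., DW and TW) is computationally expensive.
Quasi algorithms lower the computational cost but may lose accuracy.
A systematic evaluation of quasi algorithms inside iterative solvers has been lacking.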

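Core Contributions

Systematic evaluation of quasi algorithms: the first systematic evaluation of the QDW and QTW algorithms in sparse iterative solvers.
Key role of normalization: shows that appropriate normalization is essential for quasi algorithms to converge.
QTW as a practical alternative: shows that quasi triple-word arithmetic (QTW) can serve as an effective alternative to conventional double-word arithmetic.
Comprehensive performance analysis: evaluates execution time, iteration count, and solution accuracy together.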

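Method Details

Task Definition

Solve the symmetric positive definite linear system Ax = b, where:

A is an n×n sparse symmetric positive definite matrix
b is the right-hand-side vector
x is the unknown solution vector

The system is solved iteratively with the Conjugate Gradient (CG) method, and the performance of arithmetic at different precisions is evaluated.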
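
For orientation, the following is a minimal sketch of the CG iteration in plain FP64 on a matrix stored in CSR form. It is not the paper's implementation: the function names (csr_matvec, cg, laplacian_1d_csr), the stopping rule (relative residual below tol), and the 1D Laplacian test matrix are all assumptions chosen for illustration.

```python
import math

def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cg(values, col_idx, row_ptr, b, tol=1e-12, max_iter=1000):
    """Unpreconditioned CG for an SPD matrix; returns (x, iterations used)."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for it in range(max_iter):
        if math.sqrt(rr) <= tol * b_norm:   # assumed stopping rule
            return x, it
        ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter

def laplacian_1d_csr(n):
    """Hypothetical test matrix: tridiagonal (-1, 2, -1), which is SPD."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr

n = 5
A = laplacian_1d_csr(n)
b = csr_matvec(*A, [1.0] * n)   # right-hand side whose exact solution is all ones
x, iters = cg(*A, b)
print(iters, x)
```

The dot products and the vector updates are where rounding errors accumulate, so a higher-precision variant would replace the FP64 operations in those places with multi-word ones.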

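Multi-Word Arithmetic Architecture

Basic Algorithms

Error-free transformations:

TwoSum(a,b): decomposes a+b into the floating-point result x and the rounding error y
QuickTwoSum(a,b): a cheaper variant of TwoSum that requires |a|≥|b|
TwoProdFMA(a,b): uses an FMA operation to decompose a×b into the result and its error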
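
The three error-free transformations can be sketched as follows, in their textbook forms. The Python names are illustrative, and the fma helper is an assumption: it emulates a correctly rounded fused multiply-add with exact rational arithmetic so that the sketch does not depend on a hardware FMA.

```python
from fractions import Fraction

def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c rounded once (finite inputs only)."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def two_sum(a, b):
    """Return (x, y) with x = fl(a + b) and x + y == a + b exactly. 6 operations."""
    x = a + b
    t = x - a
    y = (a - (x - t)) + (b - t)
    return x, y

def quick_two_sum(a, b):
    """Same result as two_sum, but only valid when |a| >= |b|. 3 operations."""
    x = a + b
    y = b - (x - a)
    return x, y

def two_prod_fma(a, b):
    """Return (x, y) with x = fl(a * b) and x + y == a * b exactly. 2 operations."""
    x = a * b
    y = fma(a, b, -x)
    return x, y

# 2**-60 is far below the rounding threshold of 1.0, so it is lost in x
# and recovered exactly in y.
print(two_sum(1.0, 2.0**-60))

# (1 + 2**-30)**2 = 1 + 2**-29 + 2**-60; the last term does not fit in FP64.
a = 1.0 + 2.0**-30
print(two_prod_fma(a, a))
```

In both examples the second component is exactly the part that a plain FP64 operation would have discarded.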

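Double-Word Arithmetic (DW)

DWadd: [c1,c2] = DWadd(a1,a2,b1,b2)
- Operation count: 11 FP64 operations
- Includes a normalization step (QuickTwoSum)

DWmul: [c1,c2] = DWmul(a1,a2,b1,b2)
- Operation count: 7 FP64 operations
- Includes a normalization step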
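
One common formulation that matches these operation counts is sketched below. It is an assumption that the paper uses exactly this sequence; the sketch only shows how 11 and 7 operations can arise from the error-free transformations plus a final QuickTwoSum. The fma helper emulates a fused multiply-add and is not part of the source.

```python
from fractions import Fraction

def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c rounded once (finite inputs only)."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def two_sum(a, b):
    x = a + b
    t = x - a
    return x, (a - (x - t)) + (b - t)

def quick_two_sum(a, b):
    x = a + b
    return x, b - (x - a)

def two_prod_fma(a, b):
    x = a * b
    return x, fma(a, b, -x)

def dw_add(a1, a2, b1, b2):
    """Double-word addition: 6 + 2 + 3 = 11 FP64 operations."""
    s, e = two_sum(a1, b1)          # exact sum of the high words (6 ops)
    e = e + (a2 + b2)               # fold in the low words (2 ops)
    return quick_two_sum(s, e)      # normalization (3 ops)

def dw_mul(a1, a2, b1, b2):
    """Double-word multiplication: 2 + 1 + 1 + 3 = 7 FP64 operations."""
    p, e = two_prod_fma(a1, b1)     # exact product of the high words (2 ops)
    e = fma(a1, b2, e)              # cross term a1*b2 (1 op)
    e = fma(a2, b1, e)              # cross term a2*b1 (1 op)
    return quick_two_sum(p, e)      # normalization (3 ops)

# 0.1 + 0.2 is inexact in FP64; the low word c2 holds the rounding error.
c1, c2 = dw_add(0.1, 0.0, 0.2, 0.0)
print(c1, c2)

a = 1.0 + 2.0**-30
print(dw_mul(a, 0.0, a, 0.0))
```

The term a2*b2 is dropped in dw_mul because it falls below the precision a double-word result can hold.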

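Quasi Double-Word Arithmetic (QDW)

Omits the normalization step, so the high and low words are allowed to overlap
QDWadd: 8 operations; QDWmul: 4 operations
Each is 3 operations cheaper than its DW counterpart, which is the cost of one QuickTwoSum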
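
The effect of skipping normalization can be illustrated with a small sketch. The quasi operations below are assumed to be the usual double-word sequences with the final QuickTwoSum removed, which matches the counts of 8 and 4 operations. The normalize helper is hypothetical and uses TwoSum rather than QuickTwoSum, because after cancellation the high word is not guaranteed to be the larger one.

```python
from fractions import Fraction

def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c rounded once (finite inputs only)."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def two_sum(a, b):
    x = a + b
    t = x - a
    return x, (a - (x - t)) + (b - t)

def two_prod_fma(a, b):
    x = a * b
    return x, fma(a, b, -x)

def qdw_add(a1, a2, b1, b2):
    """Quasi double-word addition: 6 + 2 = 8 operations, no normalization."""
    s, e = two_sum(a1, b1)
    e = e + (a2 + b2)
    return s, e

def qdw_mul(a1, a2, b1, b2):
    """Quasi double-word multiplication: 2 + 1 + 1 = 4 operations, no normalization."""
    p, e = two_prod_fma(a1, b1)
    e = fma(a1, b2, e)
    e = fma(a2, b1, e)
    return p, e

def normalize(c1, c2):
    """Hypothetical explicit normalization, applied only where it is needed."""
    return two_sum(c1, c2)

# The high words cancel, so the whole value ends up in the low word.
c = qdw_add(1.0, 2.0**-53, -1.0, 2.0**-54)
print(c)                 # high word is 0.0 and the low word carries the value
print(normalize(*c))     # the value moves back into the high word
```

In this example both inputs satisfy fl(a1+a2)=a1, yet the unnormalized result does not. The value is still represented exactly, but the words overlap until normalize is called.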

Quasi Triple-Word Arithmetic (QTW)

QTWadd: [c1,c2,c3] = QTWadd(a1,a2,a3,b1,b2,b3)
- Operation count: 21 FP64 operations
- Does not enforce fl(c1+c2)=c1 and fl(c2+c3)=c2
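
The two conditions that QTW does not enforce can be checked directly. The helper below is a hypothetical diagnostic, not something taken from the paper: it tests whether each word is left unchanged by adding the next one in FP64.

```python
def is_normalized(words):
    """True if fl(w[i] + w[i+1]) == w[i] holds for every adjacent pair of words."""
    return all(hi + lo == hi for hi, lo in zip(words, words[1:]))

# Each word lies well below the rounding threshold of the previous one.
print(is_normalized((1.0, 2.0**-60, 2.0**-120)))

# 2**-40 still changes 1.0 when added, so the first two words overlap.
print(is_normalized((1.0, 2.0**-40, 2.0**-80)))
```

A triple such as the second one is a valid quasi triple-word value, since its three words still sum to the intended number, but the words overlap and so fewer significant bits are carried than the format could hold.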

QTWmul: [c1,c2,c3] = QTWmul(a1,a2,a3,b1,b2,b3)
- Operation count: 24 FP64 operations