Sparse Iterative Solvers Using High-Precision Arithmetic with Quasi Multi-Word Algorithms
Mukunoki, Ozaki
High-precision arithmetic is a straightforward way to obtain accurate results in numerical computation. However, most processors lack hardware support for floating-point formats beyond double precision (FP64). Double-word arithmetic (Dekker 1971) extends precision using only standard floating-point operations, representing each number with twice the mantissa length. Building on this concept, various multi-word arithmetic methods have been proposed that increase precision further by combining additional words. Simplified variants, known as quasi algorithms, trade some accuracy for reduced computational cost. In this study, we investigate the performance of quasi algorithms for double- and triple-word arithmetic in sparse iterative solvers based on the Conjugate Gradient method, and compare them with both non-quasi algorithms and standard FP64. We evaluate execution time on an x86 processor, the number of iterations to convergence, and solution accuracy. Quasi algorithms require appropriate normalization to preserve accuracy; without it, convergence cannot be achieved. With normalization applied correctly, they still reduce runtime while maintaining accuracy comparable to full multi-word algorithms. In particular, quasi triple-word arithmetic can yield more accurate solutions without significantly increasing execution time relative to double-word arithmetic and its quasi variant. Furthermore, for certain problems, a reduction in iteration count contributes additional speedup. Thus, quasi triple-word arithmetic can serve as a compelling alternative to conventional double-word arithmetic in sparse iterative solvers.
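To make the idea of double-word arithmetic concrete, the following minimal Python sketch stores a value as an unevaluated pair (hi, lo) of FP64 numbers and adds such pairs using the classical error-free transformations (Knuth's TwoSum and Dekker's Fast2Sum). The function names `dw_add`, `quasi_dw_add`, and `normalize` are hypothetical and chosen only for illustration. In particular, `quasi_dw_add` is an assumed simplification that merely skips the final renormalization; the abstract does not specify the authors' quasi algorithms, so this should not be read as their method.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e

def fast_two_sum(a, b):
    """Dekker's Fast2Sum: same result as two_sum, valid when |a| >= |b|."""
    s = a + b
    e = b - (s - a)
    return s, e

def dw_add(x, y):
    """Add two double-word numbers (hi, lo) and renormalize the result."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return fast_two_sum(s, e)  # normalization: lo becomes small relative to hi

def quasi_dw_add(x, y):
    """Assumed simplification: same as dw_add but without the final
    normalization, so hi and lo may overlap (illustrative only)."""
    s, e = two_sum(x[0], y[0])
    return s, e + (x[1] + y[1])

def normalize(x):
    """Restore a non-overlapping (hi, lo) pair; assumes |hi| >= |lo|."""
    return fast_two_sum(x[0], x[1])

def exact(x):
    """Exact rational value represented by a (hi, lo) pair."""
    return Fraction(x[0]) + Fraction(x[1])

# Sum 1.0 plus ten terms that are each too small to change 1.0 in FP64.
terms = [1.0] + [1e-16] * 10
plain = 0.0
dw = (0.0, 0.0)
quasi = (0.0, 0.0)
for t in terms:
    plain += t
    dw = dw_add(dw, (t, 0.0))
    quasi = quasi_dw_add(quasi, (t, 0.0))
quasi = normalize(quasi)  # one deferred normalization at the end

reference = sum(Fraction(t) for t in terms)  # exact rational sum
print("FP64 error:       ", float(abs(Fraction(plain) - reference)))
print("double-word error:", float(abs(exact(dw) - reference)))
print("quasi + normalize:", float(abs(exact(quasi) - reference)))
```

In this toy accumulation the plain FP64 sum discards every small term, while both pair-based variants retain them. The sketch only illustrates the cost/accuracy trade-off at the level of a single operation: the quasi variant saves work per addition but leaves the pair unnormalized until `normalize` is called. It says nothing about behavior inside a Conjugate Gradient solver, which is what the study evaluates.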