2025-11-10T02:34:56.080990

Verification Challenges in Sparse Matrix Vector Multiplication in High Performance Computing: Part I

Zhang
Sparse matrix vector multiplication (SpMV) is a fundamental kernel in scientific codes that rely on iterative solvers. In this first part of our work, we present both a sequential and a basic MPI parallel implementations of SpMV, aiming to provide a challenge problem for the scientific software verification community. The implementations are described in the context of the PETSc library.
academic

Verification Challenges in Sparse Matrix Vector Multiplication in High Performance Computing: Part I

Basic Information

  • Paper ID: 2510.13427
  • Title: Verification Challenges in Sparse Matrix Vector Multiplication in High Performance Computing: Part I
  • Author: Junchao Zhang (Argonne National Laboratory)
  • Classification: cs.LO cs.DC cs.MS
  • Conference: International Workshop on Verification of Scientific Software (VSS 2025)
  • Publication Information: EPTCS 432, 2025, pp. 98–105
  • Paper Link: https://arxiv.org/abs/2510.13427

Abstract

Sparse matrix vector multiplication (SpMV) is a fundamental kernel in scientific codes that rely on iterative solvers. In this first part of our work, we present both a sequential and a basic MPI parallel implementations of SpMV, aiming to provide a challenge problem for the scientific software verification community. The implementations are described in the context of the PETSc library.

Research Background and Motivation

Problem Definition

This research addresses software verification challenges for sparse matrix vector multiplication (SpMV) in high performance computing. SpMV is the core operation for solving sparse linear systems Ax=b and is widely employed in scientific computing codes based on iterative solvers, particularly in large-scale Krylov subspace methods.

Significance

  1. Fundamental Nature: SpMV is a foundational algorithm in scientific computing, and its correctness directly impacts the reliability of higher-level applications
  2. Complexity: Although mathematically simple (yi = Σ(aij·xj)), storage formats, data distribution, parallelization, and optimization make implementation complex
  3. Verification Challenges: Existing highly optimized implementations present significant challenges for software verification

Limitations of Existing Approaches

  • The PETSc library contains highly optimized MPI parallel SpMV implementations, but their complexity makes verification difficult
  • Lack of standardized challenge problems specifically designed for the software verification community

Research Motivation

To provide a structured challenge problem for the scientific software verification community by offering SpMV implementations ranging from simple to complex, thereby facilitating the development and evaluation of verification tools and methods.

Core Contributions

  1. Provision of Standardized Verification Challenge Problem: Designed standard test cases for SpMV specifically for the scientific software verification community
  2. Implementation of Two SpMV Algorithms with Different Complexity Levels:
    • Sequential implementation (seq.c)
    • Basic MPI parallel implementation (mpibasic.c)
  3. Establishment of Complete Verification Framework: Including input data generation, correctness checking, and error detection mechanisms
  4. Clear Definition of Verification Objectives: Provided specific verification requirements and challenges for each implementation

Methodology Details

Task Definition

Input:

  • Sparse matrix A (M×N with NNZ nonzero elements) stored in CSR format
  • Vector x (N-dimensional)
  • Expected result z = Ax (M-dimensional)

Output:

  • Computed result y = Ax
  • Correctness verification: ||y-z||² should be 0 (ignoring floating-point rounding errors)

Constraints:

  • Use Compressed Sparse Row (CSR) format
  • Support MPI parallel distributed computation

Data Structure Design

CSR Format Representation

Sparse matrix A is represented using three arrays:

  • a[nnz]: Stores nonzero element values
  • j[nnz]: Stores column indices of nonzero elements
  • i[m+1]: Row pointer; ik points to the starting position of row k in a and j

Data Type Definition

// Sequential version
typedef struct {
    int m, n;        // Matrix dimensions
    int *i, *j;      // CSR format indices
    double *a;       // Nonzero element values
} Mat;

// MPI parallel version
typedef struct {
    int m, n, M, N;  // Local and global dimensions
    int rstart, cstart; // Starting row and column indices
    int *i, *j;
    double *a;
} Mat;

Algorithm Implementation

Sequential Implementation

for (i = 0; i < A.m; i++) {
    y.a[i] = 0.0;
    for (j = A.i[i]; j < A.i[i + 1]; j++)
        y.a[i] += A.a[j] * x.a[A.j[j]];
}

MPI Parallel Implementation

  1. Data Distribution Strategy:
    • Matrix distributed by row blocks: m = M/size + (M%size > rank ? 1 : 0)
    • Vector x distributed by column layout: n = N/size + (N%size > rank ? 1 : 0)
  2. Communication Patterns:
    • Use MPI_Allgather or MPI_Allgatherv to collect complete vector x
    • Use MPI_Allreduce to compute global norm
  3. Computational Flow:
    • Compute local matrix layout (rstart, cstart)
    • Extract local data from global arrays
    • Gather distributed vector x into local complete copy X
    • Execute local SpMV computation
    • Compute local error norm and global reduction

Technical Innovations

  1. Progressive Complexity Design: From simple sequential implementation to basic parallel implementation, facilitating progressive testing of verification tools
  2. Standardized Verification Interface: Provides unified input/output format and correctness checking mechanisms
  3. Real-World Application Context: Based on actual implementation patterns from the PETSc library with practical significance
  4. Extensible Framework: Establishes foundation for future more complex optimized versions (Part II)

Experimental Setup

Datasets

  • Matrix Scale: 32×36 matrix with 50 nonzero elements
  • Data Generation: Test data generated using accompanying Python script csr.py
  • Hardcoded Input: Test data embedded directly in source code for simplified usage
  • Customizability: Users can modify Python script parameters to generate different test cases

Evaluation Metrics

  • Correctness Verification: Compute squared norm ||y-z||²
  • Success Criterion: Norm ≤ 1e-6 (accounting for floating-point rounding errors)
  • Return Code: Returns 0 on success, non-zero on error

Implementation Details

  • Programming Language: C
  • Parallel Framework: MPI
  • Compilation Requirements: Only C compiler or MPI compiler needed
  • Code Availability: Publicly released on GitHub repository

Experimental Results

Verification Objectives

Sequential Implementation Verification Requirements

Verify that computed result y satisfies: yi = Σ(aij·xj), where aij not stored in CSR representation is treated as 0

MPI Parallel Implementation Verification Requirements

  1. Layout Correctness: Verify Σm = M, Σn = N
  2. Local Computation Correctness: Verify that y on each process is the correct product of the corresponding local submatrix and complete vector x

Test Cases

The paper provides specific test data:

  • Input matrix: 32×36 sparse matrix (50 nonzero elements)
  • Input vector: 36-dimensional vector
  • Expected output: 32-dimensional result vector
  • All data provided as integer and floating-point arrays
  1. Krylov Subspace Methods: SpMV is the core component of iterative solvers
  2. PETSc Library: Provides rich suite of Krylov methods and SpMV implementations
  3. Scientific Software Verification: Correctness verification for high-performance scientific computing software

Paper Positioning

  • Fills the gap of lacking standardized SpMV verification challenges in the scientific software verification community
  • Provides foundational reference for verification of complex optimized implementations
  • Bridges theoretical verification methods and practical HPC application needs

Conclusions and Discussion

Main Conclusions

  1. Successfully established standardized challenge problem for SpMV software verification
  2. Provided two implementations with different complexity levels suitable for progressive verification tool testing
  3. Based on real PETSc library patterns with practical application value

Limitations

  1. Scale Limitation: Current test cases are relatively small (32×36 matrix)
  2. Missing Optimizations: Basic MPI implementation does not include complex optimizations present in production environments
  3. Verification Scope: Covers only basic correctness, not addressing performance and numerical stability verification

Future Directions

  1. Part II Development: Plan to provide more complex optimized MPI implementations in subsequent work
  2. Extended Test Cases: Add test matrices with larger scales and different sparsity patterns
  3. Verification Tool Integration: Integration testing with existing verification tools

In-Depth Evaluation

Strengths

  1. High Practical Value: Addresses actual needs of the scientific software verification community
  2. Reasonable Design: Progressive complexity design facilitates verification tool development and testing
  3. High Standardization: Provides complete input/output specifications and correctness checking mechanisms
  4. Strong Reproducibility: Code publicly available, test data embedded, easy to reproduce
  5. Real-World Context: Based on widely-used PETSc library with genuine application significance

Weaknesses

  1. Lack of Theoretical Analysis: No algorithm complexity analysis or theoretical property discussion
  2. Limited Test Coverage: Only single test case provided, insufficient diversity
  3. Shallow Verification Depth: Primarily focuses on functional correctness, lacking numerical precision and boundary condition analysis
  4. Performance Considerations: Does not address performance verification-related challenges

Impact

  1. Field Contribution: Provides important standardized testing platform for scientific software verification
  2. Practical Value: Can be directly used for verification tool development and evaluation
  3. Extensibility: Establishes foundational framework for subsequent more complex implementations
  4. Community Value: Promotes communication and collaboration between HPC and software verification communities

Applicable Scenarios

  1. Verification Tool Development: As standard test case for verifying tool effectiveness
  2. Educational Applications: Suitable as teaching case for parallel computing and software verification
  3. Benchmark Testing: Can serve as reference benchmark for SpMV implementation correctness
  4. Research Platform: Provides standardized platform for related algorithm and optimization research

References

  1. S Balay et al. (2025): PETSc/TAO Users Manual. Technical Report ANL-21/39 - Revision 3.23, Argonne National Laboratory
  2. Yousef Saad (2003): Iterative Methods for Sparse Linear Systems, second edition. Society for Industrial and Applied Mathematics

Overall Assessment: This is a highly practical work paper that, while contributing limited theoretical innovation, provides the scientific software verification community with much-needed standardized challenge problems. The paper is well-structured, complete in implementation, and demonstrates good reproducibility and extensibility, making it valuable for advancing the development of HPC software verification.