2025-11-18T07:43:13.662683

A direct PinT algorithm for higher-order nonlinear time-evolution equations

Zhong, Zhao, Shu
Higher-order nonlinear time-evolution equations have widespread applications in science and engineering, such as in solid mechanics, materials science, and fluid mechanics. This paper mainly studies a direct time-parallel algorithm for solving time-dependent differential equations of orders 1 to 3. Different from the traditional time-stepping approach, we directly solve the all-at-once system from higher-order evolution equations by diagonalization the time discretization matrix $B$. Based on the connection between the characteristic equation and Chebyshev polynomials, we give explicit formulas for the eigenvector matrix $V$ of $B$ and its inverse $V^{-1}$. We prove that $Cond_2\left( V \right) =\mathcal{O} \left( n^3 \right)$, where $n$ is the number of time steps. A direct parallel-in-time algorithm is designed by exploring the structure of the spectral decomposition of $B$. Numerical experiments are provided to show the significant computational speedup of the proposed algorithm.
academic

A direct PinT algorithm for higher-order nonlinear time-evolution equations

Basic Information

  • Paper ID: 2507.05743
  • Title: A direct PinT algorithm for higher-order nonlinear time-evolution equations
  • Authors: Shun-Zhi Zhong, Yong-Liang Zhao, Qian-Yu Shu (School of Mathematical Sciences, Sichuan Normal University)
  • Classification: math.NA cs.NA
  • Publication Date: October 12, 2025 (arXiv v2)
  • Paper Link: https://arxiv.org/abs/2507.05743v2

Abstract

Higher-order nonlinear time-evolution equations have widespread applications in solid mechanics, materials science, and fluid mechanics. This paper investigates direct parallel-in-time (PinT) algorithms for solving first to third-order time-dependent differential equations. Unlike traditional time-stepping methods, this work directly solves the monolithic system of higher-order evolution equations by diagonalizing the temporal discretization matrix BB. Based on the connection between characteristic equations and Chebyshev polynomials, explicit formulas are provided for the eigenvector matrix VV of BB and its inverse V1V^{-1}. It is proven that Cond2(V)=O(n3)\text{Cond}_2(V) = \mathcal{O}(n^3), where nn is the number of time steps. By exploring the spectral decomposition structure of BB, a direct parallel-in-time algorithm is designed. Numerical experiments demonstrate significant computational acceleration.

Research Background and Motivation

Problem Background

Temporal parallelization of time-evolution problems is a recent hot research topic. Traditional time-stepping methods often fail to obtain ideal solutions efficiently on modern supercomputers. Introducing parallelization can significantly reduce computational cost and improve efficiency.

Limitations of Existing Methods

  1. Limitations of iterative PinT algorithms: While existing parallel algorithms (such as MGRiT, PFASST) perform well for strongly dissipative problems, they show poor performance for wave propagation problems, as convergence speed largely depends on dissipativity.
  2. Challenges in diagonalization methods:
    • Traditional backward Euler discretization leads to non-diagonalizable matrices
    • Using different time step sizes enables diagonalization but may result in large condition numbers for the eigenvector matrix, increasing rounding errors
    • Existing methods impose restrictions on the number of time steps nn (typically nn can only be between 20-25)

Research Motivation

This work aims to eliminate unfavorable restrictions on nn, extend special second-order partial differential equations to more general first to third-order PDE forms, and design a direct PinT algorithm for solving the monolithic system.

Core Contributions

  1. Theoretical Proof: Theoretically proves that matrix BB can be diagonalized as B=VDV1B = VDV^{-1}
  2. Explicit Expressions: Provides analytical expressions for VV, V1V^{-1}, and DD, rigorously proving that the condition number of matrix VV satisfies Cond2(V)=O(n3)\text{Cond}_2(V) = \mathcal{O}(n^3)
  3. Fast Algorithm: Proposes a fast algorithm for computing V1V^{-1} that is faster than MATLAB's built-in eig function
  4. Algorithm Extension: Extends the direct PinT algorithm to first to third-order nonlinear differential equations

Methodology Details

Problem Formulation

Solving higher-order nonlinear time-evolution equations of the following forms:

  • First-order problem: u(t)+f(u(t))=0u'(t) + f(u(t)) = 0, u(0)=u0u(0) = u_0
  • Second-order problem: u(t)+a1u(t)+f(u(t))=0u''(t) + a_1u'(t) + f(u(t)) = 0, u(0)=u0u(0) = u_0, u(0)=u~0u'(0) = \tilde{u}_0
  • Third-order problem: u(t)+a1u(t)+a2u(t)+f(u(t))=0u'''(t) + a_1u''(t) + a_2u'(t) + f(u(t)) = 0, with additional initial conditions

Core Algorithm Framework

Temporal Discretization Scheme

A hybrid temporal discretization scheme is employed:

  • Central finite difference formulas for the first n1n-1 steps
  • BDF2 formula for the final step

{uj+1uj12Δt+Auj=fj,j=1,2,,n132un2un1+12un2Δt+Aun=fn\begin{cases} \frac{u_{j+1}-u_{j-1}}{2\Delta t} + Au_j = f_j, & j = 1,2,\ldots,n-1 \\ \frac{\frac{3}{2}u_n - 2u_{n-1} + \frac{1}{2}u_{n-2}}{\Delta t} + Au_n = f_n \end{cases}

The corresponding temporal discretization matrix is: B=1Δt(012120121201212232)B = \frac{1}{\Delta t}\begin{pmatrix} 0 & \frac{1}{2} & & & \\ -\frac{1}{2} & 0 & \frac{1}{2} & & \\ & \ddots & \ddots & \ddots & \\ & & -\frac{1}{2} & 0 & \frac{1}{2} \\ & & \frac{1}{2} & -2 & \frac{3}{2} \end{pmatrix}

Spectral Decomposition Theory

Theorem 3.1: The eigenvalues of matrix BB are λj=ixj\lambda_j = ix_j, where {xj}j=1n\{x_j\}_{j=1}^n are the nn roots of the equation: Un1(x)+iUn2(x)iTn(x)+Tn1(x)=0U_{n-1}(x) + iU_{n-2}(x) - iT_n(x) + T_{n-1}(x) = 0

The corresponding eigenvectors are Pj=[pj,0,,pj,n1]TP_j = [p_{j,0}, \ldots, p_{j,n-1}]^T, where: pj,k=ikUk(xj),k=0,,n1p_{j,k} = i^k U_k(x_j), \quad k = 0,\ldots,n-1

Here Tn(x)T_n(x) and Un(x)U_n(x) are Chebyshev polynomials of the first and second kind, respectively.

Direct PinT Algorithm

For nonlinear problems, simplified Newton iteration (SNI) is used: (BIx+ItAk)uk+1=b+[(ItAk)ukF(uk)](B \otimes I_x + I_t \otimes A^k)u^{k+1} = b + [(I_t \otimes A^k)u^k - F(u^k)]

where Ak=1nj=1nf(ujk)A^k = \frac{1}{n}\sum_{j=1}^n \nabla f(u_j^k) is the average Jacobian matrix.

Through spectral decomposition B=VDV1B = VDV^{-1}, the system can be solved in parallel:

  1. g~=(V1Ix)rk\tilde{g} = (V^{-1} \otimes I_x)r^k (Step a)
  2. (λjIx+Ak)zj=g~j(\lambda_j I_x + A^k)z_j = \tilde{g}_j, j=1,2,,nj = 1,2,\ldots,n (Step b)
  3. uk+1=(VIx)zu^{k+1} = (V \otimes I_x)z (Step c)

Technical Innovations

  1. Chebyshev Polynomial Connection: Establishes the connection between characteristic equations and Chebyshev polynomials to obtain explicit spectral decomposition
  2. Condition Number Control: Proves Cond2(V)=O(n3)\text{Cond}_2(V) = \mathcal{O}(n^3), significantly improving upon existing methods
  3. Fast Algorithm: Designs an O(n2)\mathcal{O}(n^2) complexity algorithm for computing V1V^{-1}
  4. Higher-order Extension: Extends the algorithm to second and third-order nonlinear equations

Experimental Setup

Numerical Experiment Configuration

  • Computing Environment: Intel(R) Core(TM) i7-14700K 3.40GHz processor, 32GB memory
  • Software Platform: MATLAB 2022a
  • Parallel Cores: Up to 20 cores for acceleration testing

Evaluation Metrics

  1. CPU Time: Measured using MATLAB's tic/toc functions
  2. Relative Error: ω=BVDV1FBF\omega = \frac{\|B - VDV^{-1}\|_F}{\|B\|_F}
  3. Condition Number: Cond2(V)\text{Cond}_2(V)
  4. Speedup Ratio: Comparison of computation times across different core counts

Comparison Methods

  • MATLAB built-in eig function for spectral decomposition
  • Traditional time-stepping methods (as baseline)

Experimental Results

Fast Spectral Decomposition Performance

nMATLAB eig+mrdivideFast AlgorithmSpeedup
320.002s0.003s0.67×
2560.050s0.023s2.17×
10241.285s0.306s4.20×
409667.599s8.626s7.84×
8192580.663s62.270s9.32×

Parallel Acceleration Performance

Experiments demonstrate:

  1. More pronounced acceleration effects when the number of time steps NtN_t is larger
  2. At Nt=29=512N_t = 2^9 = 512, using 20 cores shows significant CPU time reduction compared to single-core execution
  3. When core count exceeds 8, acceleration gradually diminishes (likely due to increased communication overhead)

Numerical Test Cases

Four numerical examples were tested:

  • Example 1: Two-dimensional nonlinear equation (Dirichlet boundary conditions)
  • Example 2: Two-dimensional Sine-Gordon equation
  • Example 3: Third-order linear evolution equation
  • Example 4: Third-order nonlinear evolution equation

All examples validate the algorithm's effectiveness and parallel acceleration capability.

Temporal Parallel Methods

  1. Iterative PinT algorithms: Parareal, MGRiT, PFASST and other methods perform well on strongly dissipative problems
  2. Diagonalization methods: Maday and Rønquist first proposed diagonalization-based PinT algorithms
  3. Improved methods: Including space-time discretization, low-rank approximation techniques, domain decomposition algorithms, etc.

Advantages of This Work

Compared to existing work, this paper:

  1. Eliminates restrictions on the number of time steps nn
  2. Provides explicit spectral decomposition formulas
  3. Extends the method to higher-order nonlinear equations
  4. Provides rigorous condition number analysis

Conclusions and Discussion

Main Conclusions

  1. Successfully extends the diagonalization PinT algorithm to first to third-order nonlinear time-evolution equations
  2. Provides explicit diagonalization formulas B=VDV1B = VDV^{-1} for the temporal discretization matrix BB
  3. Proves that the condition number of the eigenvector matrix is O(n3)\mathcal{O}(n^3)
  4. Designs a fast algorithm with O(n2)\mathcal{O}(n^2) complexity

Limitations

  1. Condition Number Growth: Although improved compared to existing methods, the condition number still grows as n3n^3
  2. Communication Overhead: Large-scale parallelization may be limited by communication overhead
  3. Applicable Scope: Primarily applicable to problems with certain dissipative properties

Future Directions

  1. Further optimize the computational algorithm for V1V^{-1}
  2. Investigate extensions to higher-order differential equations
  3. Explore methods to reduce condition number growth
  4. Application research in wave equations, fluid dynamics, and other fields

In-depth Evaluation

Strengths

  1. Rigorous Theory: Provides complete mathematical theoretical analysis, including explicit expressions for eigenvalues, eigenvectors, and condition number estimates
  2. Methodological Innovation: Cleverly utilizes Chebyshev polynomials to establish connections in characteristic equations, obtaining analytical solutions
  3. Practical Value: The algorithm demonstrates significant computational acceleration on large-scale problems
  4. Strong Extensibility: Extension from first-order to third-order nonlinear equations demonstrates good generality

Weaknesses

  1. Condition Number Issue: Despite improvements over existing methods, the O(n3)\mathcal{O}(n^3) condition number growth may still cause numerical instability in extremely large-scale problems
  2. Experimental Limitations: Numerical experiments focus mainly on relatively simple model problems, lacking verification on complex engineering applications
  3. Parallel Efficiency: Parallel efficiency decreases with increasing core count, requiring further optimization of communication strategies

Impact

  1. Academic Contribution: Provides new theoretical tools and methods for the temporal parallel algorithms field
  2. Application Prospects: Has important application value in scientific computing, engineering simulation, and other fields requiring solutions to large-scale time-evolution problems
  3. Reproducibility: The paper provides detailed algorithm descriptions and mathematical derivations, facilitating reproduction and further research

Applicable Scenarios

  1. Large-scale time-evolution problems: Particularly suitable for physical simulations requiring long-time integration
  2. High-performance computing environments: Can fully leverage parallel advantages in multi-core or cluster environments
  3. Scientific and engineering applications: Numerical simulations in solid mechanics, materials science, fluid mechanics, and other fields

References

The paper cites 44 related references, primarily including:

  • Lions, Maday, Turinici (2001): Pioneering work on the Parareal algorithm
  • Gander, Halpern et al.: Theoretical analysis of temporal parallel methods
  • Liu, Wu, Zhou et al.: Recent research on diagonalization PinT algorithms
  • Classical textbooks on Chebyshev polynomials and numerical linear algebra

Overall Assessment: This is a high-quality numerical analysis paper with significant contributions in both theoretical analysis and algorithm design. The paper addresses important limitations of existing diagonalization PinT algorithms and provides an effective parallel solution scheme for higher-order nonlinear time-evolution equations. Despite some limitations, its theoretical and practical value are outstanding, making important contributions to advancing the development of temporal parallel algorithms.