2025-11-21T15:28:16.335445

Statistical Rounding Error Analysis for Random Matrix Computations

Fang, Chen
The conventional rounding error analysis provides worst-case bounds with an associated failure probability and ignores the statistical property of the rounding errors. In this paper, we develop a new statistical rounding error analysis for random matrix computations. Such computations have numerous applications in the field of wireless communications, signal processing, and machine learning. By assuming the relative errors are independent random variables, we derive the approximate closed-form expressions for the expectation and variance of the rounding errors in various key computations for random matrices. Numerical experiments validate the accuracy of our derivations and demonstrate that our analytical expressions are generally at least two orders of magnitude tighter than alternative worst-case bounds, exemplified through the inner products.
academic

Statistical Rounding Error Analysis for Random Matrix Computations

Basic Information

  • Paper ID: 2405.07537
  • Title: Statistical Rounding Error Analysis for Random Matrix Computations
  • Authors: Yiming Fang, Li Chen (University of Science and Technology of China)
  • Classification: math.NA cs.NA
  • Publication Date: arXiv v4, November 1, 2025
  • Paper Link: https://arxiv.org/abs/2405.07537

Abstract

Traditional rounding error analysis provides worst-case bounds and associated failure probabilities, but overlooks the statistical characteristics of rounding errors. This paper develops a novel statistical rounding error analysis method for random matrix computations, which have widespread applications in wireless communications, signal processing, and machine learning. By assuming relative errors are independent random variables, the authors derive approximate closed-form expressions for the expectation and variance of rounding errors in various critical computations of random matrices. Numerical experiments validate the accuracy of the derived expressions and demonstrate that the analytical expressions are typically tighter than alternative worst-case bounds by at least two orders of magnitude.

Research Background and Motivation

1. Problem to be Solved

Classical rounding error analysis (such as bounds involving the constant γₙ = nu/(1-nu)) is overly pessimistic for large dimensions and low-precision arithmetic. Existing probabilistic rounding error analysis still approaches from the perspective of worst-case bounds, which is overly conservative for applications involving random matrix computations (such as precoding and detection in wireless communications).

2. Problem Significance

Random matrix computations have important applications in multiple critical domains:

  • Wireless Communications: Channel matrices are typically treated as random vectors or matrices, with precoding and detection involving random matrix computations
  • Signal Processing: Covariance estimation algorithms and radar waveform design
  • Machine Learning: Random matrix computations in various machine learning tasks

3. Limitations of Existing Methods

  • Traditional methods provide deterministic loose bounds or probabilistic bounds relying on pessimistic failure probabilities
  • Worst-case analysis ignores the statistical characteristics of rounding errors
  • When inputs are random variables, worst-case scenarios occur rarely in statistics
  • Existing bounds are often not closed-form expressions, containing higher-order terms such as "+O(u²)"

4. Research Motivation

Conducting rounding error analysis from a statistical perspective can yield more accurate and tighter results for random matrix computations. While Constantinides et al. and Dahlqvist et al. derived closed-form expressions for scalar computations, the expectation and variance for random matrix computations remain unknown.

Core Contributions

  1. General Random Matrix Rounding Error Analysis:
    • Analyze rounding errors in random matrix computations from a statistical perspective without knowing specific distributions
    • Derive approximate closed-form expressions for the expectation and variance of inner product rounding errors
    • Analysis results can degenerate to probabilistic bounds via the Bienaymé-Chebyshev inequality
    • Extend the analysis to matrix-vector and matrix-matrix products
  2. Specific Rounding Error Analysis for Wishart Matrices:
    • Provide examples of zero-forcing (ZF) detection and least squares (LS) problems
    • Provide rounding error analysis for matrix decomposition and triangular system solving
    • Derive approximate closed-form expressions under Wishart matrix conditions
  3. Tighter Analytical Expressions:
    • Tighter than worst-case bounds by at least two orders of magnitude
    • Provide true closed-form expressions without higher-order remainder terms
    • Use mean squared error (MSE) as the comparison metric

Detailed Methodology

Task Definition

For random matrix computations in floating-point arithmetic, derive the statistical characteristics (expectation and variance) of rounding errors, including:

  • Input: Random matrices/vectors following some probability distribution
  • Output: Expectation E(Δ) and variance V(Δ) of the rounding error in the computed result
  • Constraints: Floating-point arithmetic model based on IEEE 754 standard

Core Theoretical Framework

1. Probabilistic Floating-Point Arithmetic Model (Model 2)

Probabilistic Model of Relative Errors: Assuming input signals are independent random variables, the relative error δ associated with each pair of operands is an independent random variable with probability density function:

\frac{3}{4u}t & t \in [-\frac{u}{2}, \frac{u}{2}] \\ \frac{1}{2u}(\frac{u}{t}-1) + \frac{1}{4u}(\frac{u}{t}-1)^2 & t \in [-u, -\frac{u}{2}) \cup (\frac{u}{2}, u] \end{cases}$$ where u is the unit rounding error. Through calculation: - **Expectation**: E(δ) ≈ 0 - **Variance**: V(δ) ≈ u²/6 ≜ σ² **Probabilistic Floating-Point Arithmetic Definition**: $$fl(x \text{ op } y) = (x \text{ op } y)(1 + δ) = (x \text{ op } y) + Δ$$ where Δ = (x op y)δ is the rounding error. #### 2. Rounding Error Analysis for Inner Products (Theorem 1) For inner product s = x^T y, where x, y ∈ ℝ^(n×1) are independent random vectors: **Expectation**: $$E(Δ_s) = 0$$ **Variance (Complete Form)**: $$V(Δ_s) \approx \tau\left[(1+σ^2)^n + \frac{(1+σ^2)^2[(1+σ^2)^{n-1}-1]}{σ^2} - n\right] + 2μ_x^2μ_y^2\left[\frac{(1+σ^2)^2[(1+σ^2)^{n-1}-1]}{σ^4} - \frac{(n-1)(1+σ^2)}{σ^2} - \frac{n(n-1)}{2}\right]$$ where τ = σ_x²σ_y² + σ_x²μ_y² + σ_y²μ_x² + μ_x²μ_y² **Asymptotic Approximation**: $$V(Δ_s) \approx \frac{τ}{2}n^2σ^2 + \frac{μ_x^2μ_y^2}{3}n^3σ^2$$ **Key Insights**: - For zero-mean variables, variance grows quadratically with dimension n - For non-zero-mean variables, variance grows cubically with dimension n - Can degenerate to the classical O(√nu) probabilistic bound #### 3. Matrix-Vector and Matrix-Matrix Products (Theorems 2-3) **Matrix-Vector Product** y = Ab: - E(Δ_y) = 0_(m×1) - R_Δy ≈ diag(ℏ, ..., ℏ), where ℏ is given by the inner product variance formula **Matrix-Matrix Product** C = AB: - E(Δ_C) = 0_(m×p) - R_ΔC = diag(pℏ, ..., pℏ) ### Specific Analysis for Wishart Matrices #### 1. Triangular System Solving (Theorem 4) For triangular system Tx = b, where elements of T satisfy: - t²_ii ~ χ²_(m-i+1) - t_ij ~ N(0,1) (i > j) **Rounding Error Variance (Recursive Form)**: $$V(Δ_{x_i}) \approx \frac{(1+σ^2)^i + \sum_{j=1}^{i-1}V(x_j)(1+σ^2_{\psi_j})(1+σ^2)^{i-j+2}}{m-i-1} - V(x_i)$$ where σ²_ψj = V(Δx_j)/V(x_j) represents the relative error variance. #### 2. LU Decomposition (Theorem 5) For LU decomposition of Wishart matrix A ~ W_n(m, I_n): **Error in Upper Triangular Matrix U**: - Diagonal elements u_kk: variance involves (m²-4) terms and iterative accumulation - Off-diagonal elements u_kj: variance involves (m-2) terms **Error in Lower Triangular Matrix L**: $$V(Δ_{l_{ik}}) \approx \frac{(m-6)[(1+σ^2_{\eta_k})(1+σ^2)^k-1]}{(m-k-1)(m-k-3)} + \text{accumulated terms}$$ ## Experimental Setup ### Experimental Environment - **Software**: MATLAB R2023b - **Precision**: Primarily single precision (fp32), with some experiments using fp16 and bfloat16 - **Simulation Tool**: chop.m function to simulate low-precision arithmetic - **Repetitions**: 10,000 repetitions per experiment - **Random Seed**: rng(1) for reproducibility ### Data Distributions Multiple input distributions tested: - Uniform distribution: U(0,1), U(-1,1) - Gaussian distribution: N(0,1), N(1,1) - Chi-square distribution: χ²_m ### Evaluation Metrics - **Primary Metric**: Mean squared error MSE = E(|Δ|²) = V(Δ) - **Comparison Methods**: - DB1: Deterministic bound [Higham 2002] - PB1: Probabilistic bound [Higham & Mary 2019] - PB2: Probabilistic bound [Higham & Mary 2020] - DB2, PB3: Probabilistic bounds [Ipsen & Zhou 2020] ### Experimental Parameters - **Dimension Range**: n = 10¹ to 10⁴ - **Degrees of Freedom**: m = 10 to 10³ (Wishart matrices) - **Failure Probability**: λ = 1, ζ = 10⁻¹⁶ (for probabilistic bounds) ## Experimental Results ### Main Results #### 1. Inner Product Computation Verification **Performance with Different Input Distributions (Figure 1)**: - **U(0,1)**: Analytical curve matches simulation curve perfectly, error variance grows from 10⁻¹⁴ to 10⁻⁴ - **U(-1,1)**: Zero-mean distribution, significantly lower variance (approximately 10⁻¹⁴ to 10⁻⁸) - **N(0,1)**: Similar low variance characteristics as U(-1,1) - **N(1,1)**: Non-zero mean, rapidly increasing variance (10⁻¹⁰ to 10⁵) **Key Finding**: Zero-mean inputs have variance several orders of magnitude lower than non-zero-mean inputs, validating theoretical predictions. #### 2. Comparison with Worst-Case Bounds (Figure 2) For single-precision inner product computation: | Method | Tightness (Relative to Actual MSE) | Order of Magnitude Difference | |--------|-----------------------------------|------------------------------| | This Work | Nearly coincident | 0 | | DB1 (γ_n²) | Very loose | 2-8 orders | | PB1 (γ_n²(λ)) | Loose | 2-6 orders | | PB2 | Moderately loose | 1-4 orders | | DB2, PB3 | Loose | 2-5 orders | **Conclusion**: The analytical expressions in this work are **at least 2 orders of magnitude tighter** than existing worst-case bounds, reaching **8 orders of magnitude** in some cases. #### 3. Low-Precision Arithmetic Verification (Figure 3) **fp16 Arithmetic**: - Analytical and simulation curves are highly consistent - Variance range: 10⁻⁶ to 10⁻² **bfloat16 Arithmetic**: - Similarly maintains high-precision matching - Variance range: 10⁻⁴ to 10² **Conclusion**: The statistical model remains accurate even under low precision. #### 4. Model Failure Cases (Figure 4) For **large-dimension strongly correlated inputs** (n=10⁸, y_i = x_i h): - i ≤ 10⁵: Model is accurate - i > 10⁵: Significant deviation occurs - **Reason**: Distribution of relative error δ changes with large correlated inputs **Implication**: Model 2 is effective for independent random variables but may fail for strongly correlated large-scale inputs. ### Ablation Studies #### 1. Dimension Effects in Matrix-Matrix Products (Figure 5) Varying a single dimension while fixing others: | Varying Dimension | Effect on R_ΔC(2,2) | Conclusion | |------------------|-------------------|-----------| | n (10→10⁴) | 10⁻¹²→10⁻⁶ | Strong correlation, exponential growth | | p (10→10⁴) | 10⁻¹³→10⁻⁹ | Linear growth | | m (10→10⁴) | Remains 10⁻¹⁴ | No effect | **Conclusion**: Rounding error is primarily affected by inner product dimension n, not by external dimensions m. #### 2. Triangular System Solving (Figure 6) **Effect of Degrees of Freedom m**: - As m increases, V(Δx_3) decreases from 10⁻¹⁵ to 10⁻¹⁸ - **Reason**: Higher degrees of freedom lead to increased variance in t_ii, reducing relative error **Effect of Dimension n**: - As n increases from 10 to 10³, variance remains nearly unchanged - **Conclusion**: Variance is independent of input dimension, depending only on degrees of freedom #### 3. LU Decomposition Verification (Figure 7) Verification of u_33, u_35, l_43: - **All Elements**: Analytical and simulation results match perfectly - **Effect of Degrees of Freedom**: - U elements: As m increases, variance increases (10⁻¹³→10⁻⁸) - L elements: As m increases, variance decreases (10⁻¹⁸→10⁻¹⁵) - **Dimension Independence**: Changing n does not affect variance ### Summary of Experimental Findings 1. **Accuracy of Statistical Model**: Model 2 is highly accurate under independent random inputs 2. **Tightness Advantage**: 2-8 orders of magnitude tighter than worst-case bounds 3. **Zero-Mean Advantage**: Zero-mean inputs have significantly lower errors than non-zero-mean inputs 4. **Precision Robustness**: Model is effective from fp64 to bfloat16 5. **Dimension Characteristics**: - Inner products: Error grows as n² (zero-mean) or n³ (non-zero-mean) - Wishart matrices: Error is independent of n, depending only on degrees of freedom m 6. **Applicability Boundaries**: Model may fail for strongly correlated large-scale inputs ## Related Work ### 1. Classical Rounding Error Analysis - **Wilkinson (1971)**, **Higham (2002)**: Deterministic bounds γ_n = nu/(1-nu) - **Limitations**: Overly pessimistic for large dimensions and low precision ### 2. Probabilistic Rounding Error Analysis - **Neumann & Goldstine (1947)**: Utilizing central limit theorem - **Higham & Mary (2019)**: Concentration inequalities, O(√nu) bounds - **Higham & Mary (2020)**: Assuming both data and relative errors are random variables - **Ipsen & Zhou (2020)**: Forward error bounds for inner products - **Limitations**: Still from worst-case perspective, no closed-form expectation/variance provided ### 3. Probabilistic Models for Scalar Computations - **Constantinides et al. (2019)**, **Dahlqvist et al. (2021)**: Rounding error distributions for scalar computations - **This Work's Extension**: From scalars to vectors/matrices, considering error accumulation ### 4. Application Domain Related Work - **Wireless Communications**: Tulino & Verdú, Ngo et al., Jiang et al. - **Signal Processing**: Chen et al., Wei & Zhao - **Machine Learning**: Couillet & Liao, Pennington & Worah ### Advantages of This Work 1. First to provide closed-form expressions for expectation and variance in random matrix computations 2. At least 2 orders of magnitude tighter than existing probabilistic bounds 3. No need to assume bounded inputs or sufficiently large dimensions 4. Can degenerate to classical probabilistic bounds, demonstrating theoretical consistency ## Conclusions and Discussion ### Main Conclusions 1. **Theoretical Contributions**: - Establish a statistical rounding error analysis framework for random matrix computations - Derive closed-form expressions for expectation and variance of inner products and matrix products - Provide specific analysis for triangular systems and LU decomposition of Wishart matrices 2. **Practical Value**: - Analytical expressions are 2-8 orders of magnitude tighter than worst-case bounds - Provide more accurate error estimates for wireless communications, signal processing, and machine learning - Support multiple precisions from fp64 to bfloat16 3. **Key Insights**: - Zero-mean inputs can significantly reduce rounding errors - Error growth rate depends on input mean, variance, dimension, and precision - Wishart matrix errors are independent of dimension, depending only on degrees of freedom ### Limitations 1. **Model Assumptions**: - Assumes relative errors δ are independent, which may not hold in practice - Assumes inputs are random variables, not applicable to deterministic inputs - Model 2 may fail for strongly correlated large-scale inputs (e.g., correlated vectors with n=10⁸) 2. **Scope of Applicability**: - Primarily for IEEE 754 standard floating-point arithmetic - Requires inputs to satisfy certain statistical independence - Does not consider algorithmic optimizations (e.g., Kahan summation) affecting errors 3. **Theoretical Completeness**: - Some expressions are asymptotic approximations, ignoring higher-order terms - Lacks rigorous convergence proofs - Insufficient analysis for extreme cases (e.g., m ≤ n+3) 4. **Experimental Limitations**: - Primarily verified in MATLAB environment, actual hardware may differ - Not all possible distribution types tested - Large-scale experiments (n > 10⁴) limited by computational resources ### Future Directions 1. **Theoretical Extensions**: - Relax independence assumptions, study error propagation for correlated inputs - Extend to other matrix distributions (complex Wishart, generalized Wishart) - Study non-IEEE standard arithmetic (e.g., stochastic rounding) 2. **Algorithm Applications**: - Apply to mixed-precision algorithm design - Guide error control in low-precision training and inference - Optimize quantization strategies in communication systems 3. **Practical Systems**: - Verify on real hardware (GPU/TPU) - Consider implementation details (caching, pipelining) - Integrate into numerical software libraries 4. **Other Computations**: - Extend to QR decomposition, SVD, and other decompositions - Analyze cumulative errors in iterative algorithms (e.g., conjugate gradient) - Study error propagation in nonlinear operations ## In-Depth Evaluation ### Strengths #### 1. Methodological Innovation (9/10) - **Breakthrough Contribution**: First to provide closed-form statistical error analysis expressions for random matrix computations - **Theoretical Rigor**: Based on probabilistic models with complete derivation (see Appendices A-D) - **Strong Generality**: Applicable to random matrices with unknown distributions, can degenerate to classical bounds - **High Practicality**: 2 orders of magnitude tighter than existing methods with real application value #### 2. Experimental Sufficiency (8.5/10) - **Comprehensive Coverage**: Tests multiple distributions (uniform, Gaussian, chi-square) and precisions (fp64 to bfloat16) - **Good Reproducibility**: 10,000 repetitions per experiment with fixed random seed - **Sufficient Comparisons**: Compares with 5 existing bounds, demonstrating significant advantages - **Rich Case Studies**: Includes inner products, matrix products, triangular systems, LU decomposition **Room for Improvement**: - Could include larger-scale experiments (n > 10⁴) - Could test more matrix types (sparse, structured matrices) #### 3. Result Convincingness (9/10) - **Numerical Verification**: Analytical and simulation curves nearly perfectly match - **Quantified Advantages**: Clearly states "2 orders of magnitude" improvement - **Theoretical Consistency**: Can degenerate to Higham & Mary's O(√nu) bound - **Honest Presentation**: Shows model failure cases (Figure 4), enhancing credibility #### 4. Writing Clarity (8/10) - **Logical Structure**: Progresses from general to specific, deepening gradually - **Clear Notation**: Well-defined symbols with summary tables of floating-point parameters - **Rich Visualizations**: 12 figures intuitively present results - **Complete Proofs**: Core theorem proofs included in appendix **Improvement Suggestions**: - Some formulas are complex; could add intuitive explanations - Could include algorithm pseudocode ### Weaknesses #### 1. Theoretical Limitations - **Independence Assumption**: Strong assumption of independent relative errors, may not hold in practice - **Asymptotic Approximation**: Ignores higher-order terms, may be inaccurate in extreme cases - **Distribution Dependence**: PDF formula (3) in Model 2 lacks sufficient universal verification #### 2. Experimental Defects - **MATLAB Limitations**: Uses loop implementation rather than optimized BLAS, may not reflect actual performance - **Scale Limitations**: Maximum dimension 10⁴, not tested for ultra-large scale (e.g., 10⁶) - **Single Hardware**: Not verified on specialized hardware like GPU/TPU #### 3. Insufficient Application Analysis - **Few Real Cases**: Only ZF detection as example, other applications not demonstrated - **Missing Performance Comparison**: No comparison of actual system performance after optimization using this method - **Parameter Selection Guidance**: Lacks guidance on choosing parameters like m, n #### 4. Literature Review - Fewer citations for machine learning related work - Insufficient discussion of relationship with stochastic rounding ### Impact Assessment #### 1. Contribution to Field (8.5/10) - **Theoretical Value**: Fills gap in statistical analysis of rounding errors for random matrix computations - **Methodological Significance**: Provides paradigm shift from worst-case to statistical analysis - **Interdisciplinary Impact**: Connects numerical analysis, probability theory, and applications #### 2. Practical Value (8/10) - **Wireless Communications**: Can optimize quantization strategies in massive MIMO systems - **Machine Learning**: Guides mixed-precision training, reducing computational costs - **Signal Processing**: Improves error control in covariance estimation **Potential Applications**: - Low-precision algorithm design for edge computing devices - Error analysis for quantum computing (by analogy) - Communication error modeling for federated learning #### 3. Reproducibility (7.5/10) - **Strengths**: - Detailed mathematical derivations - Clear experimental setup (random seed, parameters) - Uses public tools (MATLAB, chop.m) - **Weaknesses**: - Complete code not publicly available - Some implementation details (e.g., vpa.m usage) not fully described - Requires high numerical computation skills to reproduce ### Applicable Scenarios #### 1. Most Suitable Scenarios - **Random Inputs**: Input data are independent random variables (e.g., communication channels, sensor noise) - **Medium Dimensions**: n = 10²-10⁴, balancing accuracy and computational cost - **Low-Precision Arithmetic**: fp16, bfloat16, etc., where error analysis is more critical - **Statistical Guarantees**: Applications needing expectation/variance rather than worst-case bounds #### 2. Unsuitable Scenarios - **Deterministic Inputs**: Known exact values (e.g., identity matrices) - **Strongly Correlated Data**: Highly correlated or specially structured inputs - **Extreme Dimensions**: n > 10⁶ or n < 10, model may be inaccurate - **Real-Time Systems**: Scenarios requiring online error bound computation (closed-form expressions still complex) #### 3. Recommended Application Domains 1. **5G/6G Communications**: Error budget for massive MIMO precoding/detection 2. **Deep Learning**: Error propagation analysis for quantized neural networks 3. **Scientific Computing**: Precision assessment for large-scale linear system solving 4. **Financial Engineering**: Rounding error control in Monte Carlo simulations 5. **Radar Signal Processing**: Precision guarantees for covariance matrix estimation ## Selected References ### Core Theoretical Foundations 1. **Higham (2002)**: "Accuracy and Stability of Numerical Algorithms" - Classical rounding error analysis 2. **Higham & Mary (2019)**: "A New Approach to Probabilistic Rounding Error Analysis" - Probabilistic bounds O(√nu) 3. **Dahlqvist et al. (2021)**: "Rigorous Roundoff Error Analysis of Probabilistic Floating-Point Computations" - Theoretical basis for Model 2 ### Application Domains 4. **Tulino & Verdú (2004)**: "Random Matrix Theory and Wireless Communications" - Applications of random matrices in communications 5. **Gupta & Nagar (2018)**: "Matrix Variate Distributions" - Mathematical foundations of Wishart distribution ### Methodological Related Work 6. **Ipsen & Zhou (2020)**: "Probabilistic Error Analysis for Inner Products" - Probabilistic error analysis for inner products 7. **Higham & Mary (2020)**: "Sharper Probabilistic Backward Error Analysis" - Backward error for random data --- ## Overall Assessment | Dimension | Score | Remarks | |-----------|-------|---------| | Innovation | 9/10 | First systematic statistical analysis, theoretical breakthrough | | Rigor | 8.5/10 | Complete derivation, but strong assumptions | | Practicality | 8/10 | Significant improvement, needs further verification | | Completeness | 8/10 | Comprehensive coverage, some details could be deepened | | Clarity | 8/10 | Clear writing, but formulas are complex | | **Overall Score** | **8.3/10** | **Excellent theoretical work with important application prospects** | ### Recommendation Index - **Numerical Analysis Researchers**: ⭐⭐⭐⭐⭐ Must read - **Wireless Communication Engineers**: ⭐⭐⭐⭐ Highly recommended - **Machine Learning Practitioners**: ⭐⭐⭐⭐ Recommended (especially for quantization) - **General Readers**: ⭐⭐⭐ Requires strong mathematical background