This paper focuses on the problem of the mean square optimal estimation of linear functionals which depend on the unknown values of a multidimensional stationary stochastic sequence.
Estimates are based on observations of the sequence with an additive stationary noise sequence.
The aim of the paper is to develop methods of finding the optimal estimates of the functionals in the case of missing observations.
The problem is investigated in the case of spectral certainty where the spectral densities of the sequences are exactly known.
Formulas for calculating the mean-square errors and the spectral characteristics of the optimal linear estimates of functionals are derived under the condition of spectral certainty.
The minimax (robust) method of estimation is applied in the case of spectral uncertainty, where spectral densities of the sequences are not known exactly while sets of admissible spectral densities are given. Formulas that determine the least favorable spectral densities and the minimax spectral characteristics of the optimal estimates of functionals are proposed for some special sets of admissible densities.
Paper ID: 2511.07228
Title: Extrapolation Problem for Multidimensional Stationary Sequences with Missing Observations
Authors: Oleksandr Masyutka, Mikhail Moklyachuk, Maria Sidei
Institution: Taras Shevchenko National University of Kyiv
Classification: math.ST (Statistics Theory), stat.TH
Published Journal: Statistics, Optimization and Information Computing, Vol. 7, March 2019, pp. 97-117
Paper Link: https://arxiv.org/abs/2511.07228

This paper investigates the mean-square optimal extrapolation problem for multidimensional stationary random sequences with missing observations. Estimation is based on observations of the sequence with additive stationary noise. The study covers two scenarios: spectral certainty and spectral uncertainty. Under spectral certainty, formulas are derived for computing the mean-square error and the spectral characteristics of the optimal linear estimates. Under spectral uncertainty, the minimax-robust method is applied to obtain formulas for the least favorable spectral densities and the minimax spectral characteristics.
The core problem addressed in this paper is: How can one optimally estimate linear functionals of multidimensional stationary random sequences when missing observations are present? Specifically:
Observation Model: The observed sequence is $\xi(j) + \eta(j)$, where $\xi(j)$ is the signal sequence and $\eta(j)$ is the noise sequence.
Missing Pattern: Observation points are $j \in \mathbb{Z}^- \setminus S$, where $S = \bigcup_{l=1}^{s}\{-M_l-N_l, \ldots, -M_l\}$ represents the missing observation segments.
Estimation Target: The linear functional $A\xi = \sum_{j=0}^{\infty} a(j)^\top \xi(j)$.

Theoretical Value: Extends classical Kolmogorov-Wiener prediction theory to the missing observation scenario.
Practical Importance: In real applications, sensor failures and data transmission interruptions frequently cause missing observations.
Robustness Requirements: In practice, the spectral density is often unknown or imprecisely known, necessitating robust estimation methods.

Complete Observation Assumption: Traditional methods (Wiener, Yaglom, Rozanov, etc.) assume complete observations.
Spectral Certainty Assumption: Most methods require a precisely known spectral density, which is difficult to satisfy in practice.
Univariate Limitations: Theory and methods for multidimensional cases are relatively underdeveloped.

The innovation of this paper lies in:
  Extending Hilbert space projection methods to the missing observation scenario
  Developing minimax robust estimation theory under spectral uncertainty
  Providing a complete theoretical framework and computational formulas for the multidimensional case

Theoretical Framework: Establishes a complete theoretical system for the extrapolation problem of multidimensional stationary sequences with missing observations.
Spectral Certainty Case:
  Derives explicit spectral characteristic formulas for optimal linear estimates (Formula 10)
  Provides exact computational formulas for the mean-square error (Formula 11)
Spectral Uncertainty Case:
  Develops minimax robust estimation methods
  Proposes characterization equations for the least favorable spectral density
  Provides specific solutions for multiple special admissible spectral density classes
Special Cases: Provides corollaries for noise-free observations, uncorrelated noise, and other special cases.
Computational Methods: Establishes a computable framework through operator equations and Fourier coefficients.

Input:
  Observation sequence: $\{\xi(j) + \eta(j),\ j \in \mathbb{Z}^- \setminus S\}$
  Missing set: $S = \bigcup_{l=1}^{s}\{-M_l-N_l, \ldots, -M_l\}$
  Functional coefficients: $\{a(j),\ j=0,1,\ldots\}$ satisfying $\sum_{j=0}^{\infty}\sum_{k=1}^{T}|a_k(j)| < \infty$
Output:
  Optimal estimate: $\hat{A}\xi = \int_{-\pi}^{\pi} h(e^{i\lambda})^\top (Z_\xi(d\lambda) + Z_\eta(d\lambda))$
  Mean-square error: $\Delta(h; F, G) = E|A\xi - \hat{A}\xi|^2$
Constraints:
  Minimality condition: $\int_{-\pi}^{\pi} \text{Tr}\,(F(\lambda) + G(\lambda))^{-1}\,d\lambda < \infty$

The core method is based on Kolmogorov's Hilbert space projection theory:

Hilbert Space Construction:
  $H = L_2(\Omega, \mathcal{F}, P)$: the Hilbert space of zero-mean, finite-variance random variables
  $H_s(\xi + \eta)$: the closed linear subspace generated by the observed values $\{\xi_k(j) + \eta_k(j): j \in \mathbb{Z}^- \setminus S,\ k=1,\ldots,T\}$
Optimal Estimate Characterization: The optimal estimate $\hat{A}\xi$ is the orthogonal projection of $A\xi$ onto $H_s(\xi+\eta)$, satisfying:
  $\hat{A}\xi \in H_s(\xi + \eta)$
  $A\xi - \hat{A}\xi \perp H_s(\xi + \eta)$

Using spectral decomposition:
$$\xi(j) = \int_{-\pi}^{\pi} e^{ij\lambda} Z_\xi(d\lambda), \quad A\xi = \int_{-\pi}^{\pi} A(e^{i\lambda})^\top Z_\xi(d\lambda)$$
where $A(e^{i\lambda}) = \sum_{j=0}^{\infty} a(j)e^{ij\lambda}$.
Through the orthogonality conditions, the spectral characteristic $h(e^{i\lambda})$ satisfies:
$$(A(e^{i\lambda}))^\top(F(\lambda) + F_{\xi\eta}(\lambda)) - (h(e^{i\lambda}))^\top F_\zeta(\lambda) = (C(e^{i\lambda}))^\top$$
where $F_\zeta(\lambda) = F(\lambda) + F_{\xi\eta}(\lambda) + F_{\eta\xi}(\lambda) + G(\lambda)$ is the spectral density of the observed sequence $\zeta = \xi + \eta$, $C(e^{i\lambda}) = \sum_{j \in U} c(j)e^{ij\lambda}$, and $U = S \cup \{0,1,\ldots\}$.
Introducing Fourier coefficients:
$$B(k-j) = \frac{1}{2\pi}\int_{-\pi}^{\pi} (F_\zeta(\lambda))^{-1}e^{-i(k-j)\lambda}\,d\lambda$$
$$R(k-j) = \frac{1}{2\pi}\int_{-\pi}^{\pi} (F(\lambda) + F_{\xi\eta}(\lambda))(F_\zeta(\lambda))^{-1}e^{-i(k-j)\lambda}\,d\lambda$$
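The coefficients $B(k)$ and $R(k)$ can be approximated by averaging the integrands over a uniform frequency grid. The following is a minimal sketch, not the paper's procedure. It assumes a hypothetical test model: the signal density matrix is of the AR(1) type used in the paper's two-dimensional example, the noise is white with density $G = \sigma^2 I$, and signal and noise are uncorrelated ($F_{\xi\eta} = 0$). The function names and parameter values are illustrative only.

```python
import numpy as np

def spectral_matrices(lam, b1=0.5, b2=0.3, sigma2=0.2):
    """Return (F, F_zeta) at frequency lam for an assumed test model."""
    f = 1.0 / abs(1.0 - b1 * np.exp(1j * lam)) ** 2
    g = 1.0 / abs(1.0 - b2 * np.exp(1j * lam)) ** 2
    F = np.array([[f, f], [f, f + g]], dtype=complex)
    G = sigma2 * np.eye(2)  # assumption: white noise, F_xi_eta = 0
    return F, F + G

def fourier_coefficients(ks, n_grid=2048, **model):
    """Approximate B(k) and R(k) by a Riemann sum over a uniform grid."""
    lam = -np.pi + 2.0 * np.pi * np.arange(n_grid) / n_grid
    B = {k: np.zeros((2, 2), dtype=complex) for k in ks}
    R = {k: np.zeros((2, 2), dtype=complex) for k in ks}
    for l in lam:
        F, F_zeta = spectral_matrices(l, **model)
        inv = np.linalg.inv(F_zeta)
        for k in ks:
            # (1 / 2 pi) * d(lambda) equals 1 / n_grid on this grid
            w = np.exp(-1j * k * l) / n_grid
            B[k] += inv * w
            R[k] += (F @ inv) * w
    return B, R

B, R = fourier_coefficients(ks=(-1, 0, 1))
```

Two identities give a quick sanity check in this assumed setting: $B(-k) = B(k)^*$ because $F_\zeta^{-1}$ is Hermitian, and $R(0) + \sigma^2 B(0) = I$ because $F F_\zeta^{-1} + G F_\zeta^{-1} = I$ at every frequency.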
The unknown coefficients $c(k)$, $k \in U$, are determined by the operator equation:
$$Ra = Bc$$
where the operators $B$ and $R$ are defined by block matrices built from the Fourier coefficients above and account for the missing observation structure.
The spectral characteristic of the optimal estimate (Formula 10) is:
$$(h(e^{i\lambda}))^\top = (A(e^{i\lambda}))^\top(F(\lambda) + F_{\xi\eta}(\lambda))(F_\zeta(\lambda))^{-1} - \left(\sum_{k \in U}(B^{-1}Ra)(k)e^{ik\lambda}\right)^\top(F_\zeta(\lambda))^{-1}$$
The mean-square error (Formula 11) is:
$$\Delta(h; F, G) = \langle Ra, B^{-1}Ra \rangle + \langle Qa, a \rangle$$
where $Q$ is a linear operator defined by Fourier coefficients $Q(k-j)$.
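To make the operator equation concrete, here is a hedged sketch of a truncated solve in the noise-free case ($G = 0$, $F_{\xi\eta} = 0$). The summary does not spell out the block layout of $B$ and $R$, so the layout below is an assumption derived directly from the orthogonality condition: blocks $B(k-j)^\top$ indexed by $k, j \in U$, with the equation reducing to $Bc = a$ and the error to $\langle c, a \rangle$. The test density, the truncation length `n_future`, and all function names are illustrative; the coefficients $B(0)$, $B(\pm 1)$ are those of the paper's two-dimensional example with $S = \{-3, -2\}$.

```python
import numpy as np

def B_coeff(k, b1, b2):
    """Fourier coefficients of F^{-1} for the assumed AR(1)-type test density."""
    if k == 0:
        return np.array([[2 + b1**2 + b2**2, -1 - b2**2],
                         [-1 - b2**2, 1 + b2**2]])
    if abs(k) == 1:
        return np.array([[-b1 - b2, b2], [b2, -b2]])
    return np.zeros((2, 2))

def solve_truncated(b1, b2, missing=(-3, -2), n_future=40):
    """Solve the truncated system B c = a on U = S u {0, ..., n_future - 1}."""
    a = {0: np.ones(2), 1: np.ones(2)}           # a(0) = a(1) = (1, 1)
    U = list(missing) + list(range(n_future))    # truncated index set
    Bmat = np.block([[B_coeff(k - j, b1, b2).T for j in U] for k in U])
    a_vec = np.concatenate([a.get(k, np.zeros(2)) for k in U])
    c = np.linalg.solve(Bmat, a_vec).reshape(len(U), 2)
    mse = float(a_vec @ c.ravel())               # <c, a> in the noise-free case

    def h_coeff(m):
        """Coefficient of exp(i m lambda) in h, from h^T = A^T - C^T F^{-1}."""
        total = a.get(m, np.zeros(2)).copy()
        for idx, j in enumerate(U):
            total -= B_coeff(m - j, b1, b2).T @ c[idx]
        return total

    return mse, h_coeff

mse, h_coeff = solve_truncated(0.5, 0.3)
h_minus1 = h_coeff(-1)   # weight on the observation at time -1
h_minus4 = h_coeff(-4)   # weight on the observation at time -4
```

For this banded test density the truncation error decays geometrically in `n_future`, so a modest truncation is enough here. In general the choice of truncation length is a separate numerical question that this sketch does not address.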
Least Favorable Spectral Density (Definition 3.1): $(F^0, G^0) \in \mathcal{D}$ is called least favorable if
$$\Delta(h(F^0, G^0); F^0, G^0) = \max_{(F,G) \in \mathcal{D}} \Delta(h(F,G); F, G)$$
Minimax Spectral Characteristic (Definition 3.2): $h^0 \in H_{\mathcal{D}}$ is called minimax if
$$\min_{h \in H_{\mathcal{D}}} \max_{(F,G) \in \mathcal{D}} \Delta(h; F, G) = \max_{(F,G) \in \mathcal{D}} \Delta(h^0; F, G)$$
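The two definitions can be illustrated on a toy problem that is much simpler than the paper's setting. The sketch below assumes a scalar sequence, complete noise-free observations of the past, one-step prediction, and a hypothetical admissible class consisting of AR(1) densities with a fixed power $p$. The optimal one-step error is computed with the classical Kolmogorov-Szegő formula $\exp\left(\frac{1}{2\pi}\int \log f\right)$. All names and parameter values are illustrative.

```python
import numpy as np

N_GRID = 4096
LAM = -np.pi + 2.0 * np.pi * np.arange(N_GRID) / N_GRID

def density(b, p=1.0):
    """AR(1)-type density normalized so that (1 / 2 pi) * integral of f equals p."""
    return p * (1.0 - b * b) / np.abs(1.0 - b * np.exp(1j * LAM)) ** 2

def optimal_mse(f):
    """One-step prediction error of the optimal predictor (Kolmogorov-Szego)."""
    return float(np.exp(np.mean(np.log(f))))

def mse_of_zero_predictor(f):
    """Error of the predictor h = 0, which is optimal for white noise."""
    return float(np.mean(f))

candidates = np.linspace(-0.9, 0.9, 19)          # assumed admissible class
errors = [optimal_mse(density(b)) for b in candidates]
b_least_favorable = candidates[int(np.argmax(errors))]
worst_case_of_h0 = max(mse_of_zero_predictor(density(b)) for b in candidates)
```

In this toy class the white-noise density ($b = 0$) maximizes the optimal error, and the predictor designed for it attains the same worst-case error over the whole class. That is the saddle-point behavior the definitions describe, shown here only for the assumed scalar class.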
The minimax problem is equivalent to the constrained optimization problem:
$$\max_{(F,G) \in \mathcal{D}} \left(\langle Ra, B^{-1}Ra \rangle + \langle Qa, a \rangle\right)$$
which is transformed into the unconstrained optimization problem:
$$\Delta_{\mathcal{D}}(F,G) = -\Delta(h(F^0, G^0); F, G) + \delta((F,G)\,|\,\mathcal{D}) \to \inf$$
where $\delta(\cdot\,|\,\mathcal{D})$ is the indicator function of the set $\mathcal{D}$.
The least favorable spectral density is determined by the subdifferential condition:
$$0 \in \partial \Delta_{\mathcal{D}}(F^0, G^0)$$
Using the Lagrange multiplier method and the form of the subdifferential, specific characterization equations can be derived.
The paper considers several special classes of admissible densities, for example:
$$\mathcal{D}^1_0 = \left\{F(\lambda) \,\middle|\, \frac{1}{2\pi}\int_{-\pi}^{\pi} \text{Tr}\,F(\lambda)\,d\lambda = p\right\}$$
$$\mathcal{D}^{UV}_1 = \left\{G(\lambda) \,\middle|\, \text{Tr}\,V(\lambda) \leq \text{Tr}\,G(\lambda) \leq \text{Tr}\,U(\lambda),\ \frac{1}{2\pi}\int_{-\pi}^{\pi}\text{Tr}\,G(\lambda)\,d\lambda = q\right\}$$
Least Favorable Spectral Density Equation (Theorem 4.1):
$$(r^0_G(\lambda))^*(r^0_G(\lambda))^\top = \alpha^2(F^0(\lambda) + G^0(\lambda))^2$$
$$(r^0_F(\lambda))^*(r^0_F(\lambda))^\top = (\beta^2 + \gamma_1(\lambda) + \gamma_2(\lambda))(F^0(\lambda) + G^0(\lambda))^2$$
where $\alpha^2, \beta^2$ are Lagrange multipliers, $\gamma_1(\lambda) \leq 0$ (equal to 0 when $\text{Tr}\,G^0(\lambda) > \text{Tr}\,V(\lambda)$), and $\gamma_2(\lambda) \geq 0$ (equal to 0 when $\text{Tr}\,G^0(\lambda) < \text{Tr}\,U(\lambda)$).
The paper also considers:
  $\mathcal{D}^2_0 \times \mathcal{D}^{UV}_2$: Diagonal element constraints
  $\mathcal{D}^3_0 \times \mathcal{D}^{UV}_3$: Weighted trace constraints
  $\mathcal{D}^4_0 \times \mathcal{D}^{UV}_4$: Matrix inequality constraints
  $\mathcal{D}_\epsilon \times \mathcal{D}^1_\delta$: $\epsilon$-contamination and $\delta$-neighborhood models
Characterization equations are provided for each class.
The paper provides a concrete two-dimensional sequence extrapolation example:
Problem Setup:
  Functional: $A_1\xi = a(0)^\top\xi(0) + a(1)^\top\xi(1)$, where $a(0) = a(1) = (1,1)^\top$
  Sequence: $\xi_1(n) = \xi(n)$, $\xi_2(n) = \xi(n) + \eta(n)$
  Missing set: $S = \{-3, -2\}$
  Spectral density:
$$f(\lambda) = \frac{1}{|1-b_1e^{i\lambda}|^2}, \quad g(\lambda) = \frac{1}{|1-b_2e^{i\lambda}|^2}$$
  Spectral density matrix:
$$F(\lambda) = \begin{pmatrix} f(\lambda) & f(\lambda) \\ f(\lambda) & f(\lambda) + g(\lambda) \end{pmatrix}$$
Inverse Spectral Density Matrix:
$$(F(\lambda))^{-1} = \begin{pmatrix} \frac{1}{f(\lambda)} + \frac{1}{g(\lambda)} & -\frac{1}{g(\lambda)} \\ -\frac{1}{g(\lambda)} & \frac{1}{g(\lambda)} \end{pmatrix} = B(-1)e^{-i\lambda} + B(0) + B(1)e^{i\lambda}$$
Fourier Coefficients:
$$B(0) = \begin{pmatrix} 2+b_1^2+b_2^2 & -1-b_2^2 \\ -1-b_2^2 & 1+b_2^2 \end{pmatrix}, \quad B(1) = B(-1) = \begin{pmatrix} -b_1-b_2 & b_2 \\ b_2 & -b_2 \end{pmatrix}$$
Operator Matrix: Construct the block matrix $B$ accounting for the missing positions $\{-3, -2\}$ and the future positions $\{0, 1, 2, \ldots\}$.
Spectral Factorization: Utilize the factorization
$$(F(\lambda))^{-1} = \left(\sum_{j=0}^{\infty}\psi(j)e^{-ij\lambda}\right) \cdot \left(\sum_{j=0}^{\infty}\psi(j)e^{-ij\lambda}\right)^*$$
where $\psi(0) = \begin{pmatrix} 1 & 1 \\ 0 & -1 \end{pmatrix}$, $\psi(1) = \begin{pmatrix} -b_1 & -b_2 \\ 0 & b_2 \end{pmatrix}$.
Inverse Operator Computation:
$$B^{-1}_{11}(i,j) = (\Theta^*\Theta)(i,j) = \sum_{l=0}^{\min(i,j)}(\theta(i-l))^*\theta(j-l)$$
Spectral Characteristic:
$$(h_1(e^{i\lambda}))^\top = -\left(b_2 + b_2^2 - 2(b_1 + b_1^2),\ -b_2 - b_2^2\right)e^{-i\lambda}$$
Mean-Square Error:
$$\Delta(h_1; F) = 10 + 8b_1 + 4b_1^2 + 2b_2 + b_2^2$$
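The closed-form error can be cross-checked in the time domain by ordinary least squares on a finite stretch of the past. The sketch below is an independent sanity check, not the paper's method. It assumes that $f$ and $g$ are the densities of two uncorrelated AR(1) sequences with unit innovation variance, so that their autocovariances are $b^{|k|}/(1-b^2)$; the truncation depth `depth` and the function names are illustrative.

```python
import numpy as np

def ar1_cov(b, k):
    """Autocovariance of an AR(1) sequence with unit innovation variance."""
    return b ** abs(k) / (1.0 - b * b)

def block_cov(b1, b2, k):
    """Covariance of (xi_1, xi_2) at lag k for xi_1 = xi, xi_2 = xi + eta."""
    rf, rg = ar1_cov(b1, k), ar1_cov(b2, k)
    return np.array([[rf, rf], [rf, rf + rg]])

def extrapolation_mse(b1, b2, missing=(-3, -2), depth=12):
    """Project A xi onto the observed past {-depth, ..., -1} minus the missing set."""
    obs = [t for t in range(-depth, 0) if t not in missing]
    fut = [0, 1]
    a = np.ones(2)                                   # a(0) = a(1) = (1, 1)
    gram = np.block([[block_cov(b1, b2, s - t) for t in obs] for s in obs])
    cross = np.concatenate(
        [sum(block_cov(b1, b2, s - j) @ a for j in fut) for s in obs])
    var_target = sum(a @ block_cov(b1, b2, i - j) @ a for i in fut for j in fut)
    weights = np.linalg.solve(gram, cross)           # normal equations
    return float(var_target - cross @ weights), weights

b1, b2 = 0.5, 0.3
mse, weights = extrapolation_mse(b1, b2)
closed_form = 10 + 8 * b1 + 4 * b1**2 + 2 * b2 + b2**2
weights_at_minus1 = weights[-2:]   # weights on xi_1(-1) and xi_2(-1)
```

Because AR(1) sequences are Markov, only the observation at time $-1$ receives a nonzero weight, which matches the single $e^{-i\lambda}$ term in $h_1$. Under the stated assumptions the regression error agrees with the closed-form polynomial up to rounding.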
This example demonstrates:
  How to handle the block structure of missing observations
  How to utilize spectral factorization to simplify computations
  The explicit form of the optimal spectral characteristic

The paper verifies the feasibility of the theoretical framework through Example 2.1:
Simplicity of Spectral Characteristics: The optimal spectral characteristic has finite support (only the $e^{-i\lambda}$ term is non-zero), reflecting that the impact of missing observations is local.
Computability of Error: The mean-square error is a simple polynomial in the parameters $b_1, b_2$, which facilitates analysis and optimization.
Parameter Effects:
  Larger $b_1, b_2$ lead to larger errors (enhanced autocorrelation of signal and noise)
  The error is more sensitive to $b_1$ (signal autocorrelation has the more significant impact)

Compared to existing methods:
  Completeness: Provides a complete framework from problem modeling to concrete computation
  Generality: Applicable to multidimensional sequences and arbitrary missing patterns
  Robustness: The minimax method handles spectral uncertainty
  Computability: Implemented through operator equations and Fourier coefficients

The paper provides multiple theorems guaranteeing:
  Theorem 2.1: Existence and uniqueness of the optimal solution under spectral certainty
  Theorems 4.1, 5.1: Characterization of the least favorable spectral density under different admissible classes
  Corollaries 2.1-2.4, 4.1-4.2, 5.1-5.2: Simplified results for special cases

Kolmogorov (1941): First proposed spectral methods for stationary sequence prediction
Wiener (1949): Developed continuous-time filtering theory
Yaglom (1955, 1987): Systematically studied related theory of stationary processes
Rozanov (1967): Multidimensional stationary process theory
Hannan (1970): Multivariate time series analysis

Bondon (2002, 2005): Prediction with incomplete past
Cheng & Pourahmadi (1996, 1998): Extremal problems and interpolation in $L^p(w)$ spaces
Kasahara, Pourahmadi & Inoue (2009): Dual methods for missing value prediction
Pelagatti (2015): Time series modeling with unobservable components

Grenander (1957): First proposed minimax methods for stationary process extrapolation
Kassam & Poor (1985): Survey of robust techniques in signal processing
Franke (1984, 1985): Robust prediction and interpolation for time series
Franke & Poor (1984): Minimax robust filtering
Vastola & Poor (1983): Analysis of spectral uncertainty effects on Wiener filtering

Moklyachuk (2008, 2015): Robust estimation of stationary sequence functionals
Moklyachuk & Masyutka (2008-2012): Minimax prediction for multidimensional stationary processes
Moklyachuk & Sidei (2015-2017): Interpolation, extrapolation and filtering with missing observations
Luz & Moklyachuk (2015-2016): Estimation for stationary increment processes

Compared to existing work:
  Systematicity: First systematic study of extrapolation for multidimensional sequences with missing observations
  Completeness: Addresses both spectral certainty and uncertainty cases
  Generality: Considers multiple missing patterns and admissible spectral density classes
  Operability: Provides explicit computational formulas and operator equations

Theoretical Framework: Successfully establishes a complete theoretical system for extrapolation of multidimensional stationary sequences with missing observations.
Spectral Certainty Results:
  The optimal spectral characteristic is determined by the operator equation $Ra = Bc$ and formula (10)
  The mean-square error can be precisely computed via formula (11)
  The method applies to correlated and uncorrelated noise
Spectral Uncertainty Results:
  The least favorable spectral density is characterized by the subdifferential condition $0 \in \partial\Delta_{\mathcal{D}}(F^0, G^0)$
  Explicit Lagrange equations are provided for multiple special admissible classes
  The minimax estimate possesses the saddle-point property
Computational Methods: Achieves a computable framework through Fourier coefficients and operator matrices.

Computational Complexity:
  Requires solving infinite-dimensional operator equations (truncation needed in practice)
  Computation of the inverse operator $B^{-1}$ may be difficult
  Matrix dimension increases with the number of missing segments
Theoretical Assumptions:
  Requires minimality condition (1) or (12) to hold
  Assumes the operator $B$ is invertible (see Salehi 1979)
  Functional coefficients must satisfy the absolute summability condition (3)
Spectral Uncertainty:
  Only considers specific admissible spectral density classes
  Numerical solution for the least favorable spectral density may be complex
  Does not discuss how to estimate admissible classes from data
Practical Applicability:
  Lacks large-scale numerical experiments
  Not combined with real data applications
  Lacks numerical comparison with other methods

Research directions suggested by the paper:
Algorithm Development:
  Efficient numerical algorithms for solving operator equations
  Approximation methods for large-scale problems
  Adaptive truncation dimension selection
Theoretical Extensions:
  Generalization to non-stationary sequences
  Periodically correlated sequences (partial work exists)
  Stationary increment sequences (partial work exists)
Application Research:
  Real problems in signal processing
  Financial time series analysis
  Sensor network data fusion
Statistical Inference:
  Spectral density estimation from data
  Methods for admissible class selection
  Confidence intervals and hypothesis testing

Solid Mathematical Foundation: Based on Hilbert space theory and convex optimization theory
Complete Proofs: Clear logical flow of theorems and corollaries with explicit conditions
Standardized Notation: Mathematical symbols used consistently and clearly
Missing Observation Handling: Cleverly embeds the missing structure into operator matrices
Minimax Framework: Systematically develops robust estimation under spectral uncertainty
Multidimensional Generalization: Successfully handles the complexity of multidimensional cases
Multiple Cases: Covers correlated/uncorrelated noise, with/without noise observations
Multiple Spectral Classes: Considers 8 different admissible spectral density classes
Explicit Formulas: Provides computable explicit expressions
Clear Historical Context: From Kolmogorov to the latest work
Comprehensive References: Includes 41 references
Accurate Positioning: Clearly states the relationship to existing work

Only One Example: Example 2.1 is too simple (two-dimensional, simple missing pattern)
Lacks Numerical Comparison: No numerical comparison with other methods
No Real Data: Not validated on real datasets
Heavy Notation: Abundant matrix and operator symbols, high reading threshold
Complex Structure: The block matrix structure description lacks intuitive presentation
Missing Visualizations: No figures or diagrams to aid understanding
Computational Cost: Algorithm complexity and computational efficiency not discussed
Parameter Selection: Lacks practical guidance for choosing admissible class parameters
Software Implementation: No code or software package provided
Invertibility Assumption: The invertibility condition for the operator $B$ is not sufficiently clear
Convergence Analysis: Truncation error analysis for infinite-dimensional problems is missing
Stability: Numerical stability not discussed

Theoretical Contribution: ★★★★☆
  Fills a gap in extrapolation theory with missing observations
  Provides a systematic framework for subsequent research
Method Innovation: ★★★★☆
  The operator equation method for missing observations is innovative
  The systematic development of the minimax framework has value
Application Potential: ★★★☆☆
  The theory is complete but practical applicability needs verification
  Requires more real application cases
Reproducibility: ★★☆☆☆
  Theoretical formulas are complete but algorithm details are insufficient
  Lacks code and numerical experiments

Time Series Analysis: Provides theoretical tools for missing data handling
Signal Processing: Applicable to sensor data fusion
Financial Engineering: Missing data handling in high-frequency trading
Statistics: Development of robust estimation theory

Sensor Networks: Data loss due to sensor failures
Communication Systems: Signal reconstruction from packet loss
Financial Time Series: Prediction with irregular trading times
Environmental Monitoring: Imputation of missing weather station data

Non-stationary Processes: The method assumes stationarity
Nonlinear Systems: Only considers linear functionals
High-dimensional Large-scale: Computational complexity may be prohibitive
Completely Unknown Spectrum: Requires some prior information

Time Series Theory Researchers: ★★★★★
  Provides a systematic theoretical framework
Signal Processing Engineers: ★★★☆☆
  Theory-heavy, requires mathematical background
Statistics Researchers: ★★★★☆
  Robust estimation methods have reference value
Applied Data Scientists: ★★☆☆☆
  Lacks practical algorithms and code

The paper cites classical and cutting-edge works in the field:
Foundational Works:
  Kolmogorov (1992): Random process prediction theory
  Wiener (1966): Filtering and prediction theory
  Yaglom (1987): Related theory
Methodology:
  Grenander (1957): Minimax methods
  Franke (1984, 1985): Robust prediction
  Pshenichnyj (1971): Convex optimization
Missing Observations:
  Bondon (2002, 2005)
  Pourahmadi et al. (2007, 2009)
Authors' Series Work: Demonstrates continuity and depth of research

This is a high-quality academic paper with rigorous theory and systematic methodology. Main strengths include:
  Establishes a complete theoretical framework for extrapolation with missing observations
  Addresses both spectral certainty and uncertainty cases
  Provides explicit solutions for multiple special cases
Main weaknesses are:
  Weak experimental verification with only one simple example
  Insufficient practical considerations, lacking algorithms and code
  Readability issues with heavy notation
Recommendation Index: ★★★★☆ (for theory researchers) / ★★★☆☆ (for applied researchers)
The paper makes important theoretical contributions to time series analysis and robust estimation, but requires follow-up work in algorithm implementation and practical applications for supplementation and verification.