This paper studies distributed convex optimization with both affine equality and nonlinear inequality couplings through the duality analysis. We first formulate the dual of the coupling-constraint problem and reformulate it as a consensus optimization problem over a connected network. To efficiently solve this dual problem and hence the primal problem, we design an accelerated linearized algorithm that, at each round, a look-ahead linearization of the separable objective is combined with a quadratic penalty on the Laplacian constraint, a proximal step, and an aggregation of iterations. On the theory side, we prove non-ergodic rates for both the primal optimality error and the feasibility error. On the other hand, numerical experiments show a faster decrease of optimality error and feasibility residual than augmented-Lagrangian tracking and distributed subgradient baselines under the same communication budget.
An Accelerated Distributed Algorithm with Equality and Inequality Coupling Constraints Paper ID : 2511.19708Title : An Accelerated Distributed Algorithm with Equality and Inequality Coupling ConstraintsAuthors : Chenyang Qiu, Yangyang Qian, Zongli Lin, Yacov A. ShamashAffiliations : University of Virginia (Qiu, Qian, Lin), Stony Brook University (Shamash)Classification : math.OC (Optimization and Control), cs.SY (Systems and Control), eess.SY (Systems and Control)Submission Date : November 24, 2025Paper Link : https://arxiv.org/abs/2511.19708 This paper investigates distributed convex optimization problems with affine equality constraints and nonlinear inequality constraints. Through dual analysis, the coupled constraint problem is transformed into a consensus optimization problem over a connected network. To efficiently solve the dual problem and subsequently the primal problem, an accelerated linearized algorithm is designed that combines lookahead linearization of separable objective functions, quadratic penalty terms for Laplacian constraints, proximal steps, and iterative aggregation in each iteration. Theoretically, non-ergodic convergence rates for both optimality error and feasibility error of the primal problem are established. Numerical experiments demonstrate that under the same communication budget, the proposed algorithm outperforms state-of-the-art methods in terms of convergence speed for optimality error and feasibility residuals.
Distributed optimization aims to minimize a global objective function in multi-agent systems through local computation and communication. This paper focuses on the Coupling-Constraint Problem (CCP) , which is particularly challenging because agents must coordinate local decisions while satisfying global coupling constraints.
Such problems are widely encountered in practical applications:
Smart Grids : In economic dispatch problems, global affine equality constraints represent power balance conditions (total generation meets total demand)Resource Allocation : Requires simultaneously satisfying individual and global constraintsEmission Constraints : Network capacity limitations and other physical constraints modeled as coupling inequality constraintsEquality Constraint Handling : Existing methods such as ADMM, mirror methods, and gradient tracking primarily target equality constraintsInequality Constraint Handling : Methods for affine inequality constraints are inapplicable to nonlinear constraintsConvergence Rate Issues : Algorithms addressing global coupling nonlinear inequality constraints face the following limitations:
Asymptotic convergence 13,17,18 Ergodic convergence rates: O(ln N/√N) 14 , O(1/√N) 15 , O(1/N) 16 Lack of acceleration and non-ergodic convergence guarantees Most existing distributed algorithms do not consider accelerated convergence, resulting in relatively slow convergence rates. This paper aims to develop a distributed algorithm with provable accelerated non-ergodic convergence rates, extending theoretical guarantees of classical first-order methods to the CCP framework with general (possibly non-smooth) cost functions.
Algorithm Innovation : Proposes a novel accelerated distributed optimization algorithm capable of simultaneously handling affine equality constraints and nonlinear inequality coupling constraintsTheoretical Breakthrough : Establishes non-ergodic convergence rates :Optimality error of primal problem: O(1/N²) + O(1/N) Constraint violation error: O(1/N²) + O(1/N) Significantly improves upon existing ergodic or asymptotic convergence guarantees Dual Reformulation : Transforms CCP into a dual problem, leveraging separability to interpret it as a consensus optimization problemExperimental Validation : Numerical experiments demonstrate that under the same iteration budget, the algorithm outperforms state-of-the-art methods such as ALT and distributed subgradient algorithms in terms of optimality error and feasibility residual descent speedPrimal Problem (Problem 1) :
min x ∈ X f ( x ) = ∑ i = 1 n f i ( x i ) \min_{x \in X} f(x) = \sum_{i=1}^{n} f_i(x_i) min x ∈ X f ( x ) = ∑ i = 1 n f i ( x i )
Subject to:
Equality coupling constraint: ∑ i = 1 n B i x i = ∑ i = 1 n b i \sum_{i=1}^{n} B_i x_i = \sum_{i=1}^{n} b_i ∑ i = 1 n B i x i = ∑ i = 1 n b i Inequality coupling constraint: ∑ i = 1 n h i ( x i ) ≤ 0 \sum_{i=1}^{n} h_i(x_i) \leq 0 ∑ i = 1 n h i ( x i ) ≤ 0 Local constraint: x i ∈ X i ⊆ R p x_i \in X_i \subseteq \mathbb{R}^p x i ∈ X i ⊆ R p Where:
x = [ x 1 T , x 2 T , … , x n T ] T ∈ R n p x = [x_1^T, x_2^T, \ldots, x_n^T]^T \in \mathbb{R}^{np} x = [ x 1 T , x 2 T , … , x n T ] T ∈ R n p B i ∈ R d × p B_i \in \mathbb{R}^{d \times p} B i ∈ R d × p , b i ∈ R d b_i \in \mathbb{R}^d b i ∈ R d h i : R p → R m h_i: \mathbb{R}^p \to \mathbb{R}^m h i : R p → R m is a possibly nonlinear functionKey Assumptions :
Assumption 1 : f i f_i f i is a proper μ f \mu_f μ f -strongly convex function; h i h_i h i is convex and l h l_h l h -Lipschitz continuousAssumption 2 : X i X_i X i is a compact convex set; Slater condition holds (a strictly feasible point exists)Introduce Lagrange multipliers μ ∈ R d \mu \in \mathbb{R}^d μ ∈ R d (equality constraints) and δ ∈ R + m \delta \in \mathbb{R}_+^m δ ∈ R + m (inequality constraints). The Lagrangian function is:
L ( x , μ , δ ) = ∑ i = 1 n ( F i ( x i ) + ⟨ μ , B i x i − b i ⟩ + ⟨ δ , h i ( x i ) ⟩ ) L(x, \mu, \delta) = \sum_{i=1}^{n} \left( F_i(x_i) + \langle \mu, B_i x_i - b_i \rangle + \langle \delta, h_i(x_i) \rangle \right) L ( x , μ , δ ) = ∑ i = 1 n ( F i ( x i ) + ⟨ μ , B i x i − b i ⟩ + ⟨ δ , h i ( x i )⟩ )
where F i = f i + 1 X i F_i = f_i + \mathbb{1}_{X_i} F i = f i + 1 X i (1 X i \mathbb{1}_{X_i} 1 X i is the indicator function).
Dual Problem :
min μ ∈ R d , δ ∈ R + m ∑ i = 1 n g i ( μ , δ ) \min_{\mu \in \mathbb{R}^d, \delta \in \mathbb{R}_+^m} \sum_{i=1}^{n} g_i(\mu, \delta) min μ ∈ R d , δ ∈ R + m ∑ i = 1 n g i ( μ , δ )
where g i ( μ , δ ) = − min x i L i ( x i , μ , δ ) g_i(\mu, \delta) = -\min_{x_i} L_i(x_i, \mu, \delta) g i ( μ , δ ) = − min x i L i ( x i , μ , δ ) .
Each agent i i i maintains copies of dual variables y i = [ μ i T , δ i T ] T ∈ Y = R d × R + m y_i = [\mu_i^T, \delta_i^T]^T \in Y = \mathbb{R}^d \times \mathbb{R}_+^m y i = [ μ i T , δ i T ] T ∈ Y = R d × R + m . The dual problem is reformulated as:
min y ∈ Y G ( y ) = ∑ i = 1 n g i ( y i ) \min_{y \in \mathcal{Y}} G(y) = \sum_{i=1}^{n} g_i(y_i) min y ∈ Y G ( y ) = ∑ i = 1 n g i ( y i ) s.t. y 1 = y 2 = ⋯ = y n \text{s.t. } y_1 = y_2 = \cdots = y_n s.t. y 1 = y 2 = ⋯ = y n
Using the Laplacian matrix H H H and W = H ⊗ I d + m W = H \otimes I_{d+m} W = H ⊗ I d + m , the consensus constraint is equivalent to W 1 / 2 y = 0 W^{1/2}y = 0 W 1/2 y = 0 , yielding the compact form (Problem 4):
min y ∈ Y G ( y ) s.t. W 1 / 2 y = 0 \min_{y \in \mathcal{Y}} G(y) \quad \text{s.t. } W^{1/2}y = 0 min y ∈ Y G ( y ) s.t. W 1/2 y = 0
Augmented Lagrangian Function :
L ρ ( y , v ) = G ( y ) − ⟨ v , W 1 / 2 y ⟩ + ρ 2 ∥ W 1 / 2 y ∥ 2 \mathcal{L}_\rho(y, v) = G(y) - \langle v, W^{1/2}y \rangle + \frac{\rho}{2} \|W^{1/2}y\|^2 L ρ ( y , v ) = G ( y ) − ⟨ v , W 1/2 y ⟩ + 2 ρ ∥ W 1/2 y ∥ 2
Algorithm Iteration (Algorithm 1) :
Initialization: ŷ_{i,1} = y_{i,1} ∈ Y, λ_{i,1} = 0
For k = 1, 2, ..., N:
1. Extrapolation step:
ỹ_{i,k} = (1 - α_k)ŷ_{i,k} + α_k y_{i,k}
2. Local optimization (gradient computation):
x_{i,k} = argmin_x {F_i(x) + ⟨[B_i x - b_i; h_i(x)], ỹ_{i,k}⟩}
∇g_i(ỹ_{i,k}) = -[B_i x_{i,k} - b_i; h_i(x_{i,k})]
3. Information exchange:
t_{i,k} = Σ_{j∈N_i} H_{ij}(y_{i,k} - y_{j,k})
4. Proximal update:
y_{i,k+1} = P_Y{y_{i,k} - 1/η_k(∇g_i(ỹ_{i,k}) - λ_{i,k} - θ_k t_{i,k})}
5. Aggregation step:
ŷ_{i,k+1} = (1 - α_k)ŷ_{i,k} + α_k y_{i,k+1}
6. Dual variable update:
λ_{i,k+1} = λ_{i,k} - β_k t_{i,k}
Parameter Settings :
α k = 2 k + 1 \alpha_k = \frac{2}{k+1} α k = k + 1 2 (Nesterov acceleration parameter)θ k = ρ N k \theta_k = \frac{\rho N}{k} θ k = k ρN (adaptive Laplacian penalty)β k = ρ k N \beta_k = \frac{\rho k}{N} β k = N ρ k (dual step size)η k = 2 l g + ρ N ∥ W ∥ k \eta_k = \frac{2l_g + \rho N \|W\|}{k} η k = k 2 l g + ρN ∥ W ∥ (proximal parameter)where l g = 2 μ f 2 ( ∥ B i ∥ 2 + l h 2 ) ⋅ max { ∥ B i ∥ 2 , l h 2 } l_g = \sqrt{\frac{2}{\mu_f^2}(\|B_i\|^2 + l_h^2)} \cdot \max\{\|B_i\|^2, l_h^2\} l g = μ f 2 2 ( ∥ B i ∥ 2 + l h 2 ) ⋅ max { ∥ B i ∥ 2 , l h 2 } is the Lipschitz constant of g i g_i g i .
Three-Variable Coordination Mechanism :y ~ k \tilde{y}_k y ~ k : Extrapolated prediction point for gradient evaluation, introducing momentum effectsy k y_k y k : Proximal correction point ensuring stabilityy ^ k \hat{y}_k y ^ k : Smoothed trajectory point enabling optimal convergence analysisAdaptive Parameter Scheduling :θ k \theta_k θ k and β k \beta_k β k adaptively adjust with iteration count, balancing convergence speed and stabilityParameter design ensures non-ergodic O(1/N²) acceleration rate Linearization Strategy :Linearizes the non-separable quadratic term ρ 2 ∥ W 1 / 2 y ∥ 2 \frac{\rho}{2}\|W^{1/2}y\|^2 2 ρ ∥ W 1/2 y ∥ 2 Combines lookahead gradient ∇ G ( y ~ k ) \nabla G(\tilde{y}_k) ∇ G ( y ~ k ) rather than current point gradient Distributed Implementation :Each node only solves local subproblems (Equation 14) Requires only one round of neighbor information exchange (Step 6: t i , k t_{i,k} t i , k ) No global coordinator needed Synthetic Optimization Problem :
min x i ∈ X i ∑ i = 1 n ( x i T A i x i + b i T x i + ∥ x i ∥ 1 ) \min_{x_i \in X_i} \sum_{i=1}^{n} \left( x_i^T A_i x_i + b_i^T x_i + \|x_i\|_1 \right) min x i ∈ X i ∑ i = 1 n ( x i T A i x i + b i T x i + ∥ x i ∥ 1 )
Subject to:
Equality: ∑ i = 1 n C i x i = 0 p \sum_{i=1}^{n} C_i x_i = 0_p ∑ i = 1 n C i x i = 0 p Inequality: ∑ i = 1 n ∥ x i − r i ∥ 1 ≤ ∑ i = 1 n d i \sum_{i=1}^{n} \|x_i - r_i\|_1 \leq \sum_{i=1}^{n} d_i ∑ i = 1 n ∥ x i − r i ∥ 1 ≤ ∑ i = 1 n d i Parameter Settings :
Number of agents: n = 20 n = 20 n = 20 Local dimension: p = 5 p = 5 p = 5 Box constraints: x i ∈ X i = { x ∈ R p ∣ x ‾ i ≤ x ≤ x ˉ i } x_i \in X_i = \{x \in \mathbb{R}^p | \underline{x}_i \leq x \leq \bar{x}_i\} x i ∈ X i = { x ∈ R p ∣ x i ≤ x ≤ x ˉ i } x ‾ i ∼ U [ − 10 , − 9 ] \underline{x}_i \sim U[-10, -9] x i ∼ U [ − 10 , − 9 ] , x ˉ i ∼ U [ 9 , 10 ] \bar{x}_i \sim U[9, 10] x ˉ i ∼ U [ 9 , 10 ] Matrix A i = U i Λ i U i T A_i = U_i \Lambda_i U_i^T A i = U i Λ i U i T :
U i U_i U i is a random orthogonal matrixEigenvalues of Λ i \Lambda_i Λ i linearly distributed in [ 1 , 100 ] [1, 100] [ 1 , 100 ] (condition number κ = 100 \kappa = 100 κ = 100 ) C i , b i ∼ N ( 0 , I p ) C_i, b_i \sim \mathcal{N}(0, I_p) C i , b i ∼ N ( 0 , I p ) d i ∼ U ( 1 , 6 ) d_i \sim U(1, 6) d i ∼ U ( 1 , 6 ) Communication Network :
Connected undirected graph: each node connected to nearest and second-nearest neighbors Edge set: ( i , i + 1 ) (i, i+1) ( i , i + 1 ) for 1 ≤ i ≤ 19 1 \leq i \leq 19 1 ≤ i ≤ 19 , plus ( 1 , 20 ) (1, 20) ( 1 , 20 ) Primal Problem Optimality Error :
∣ f ( x k ) − f ( x ∗ ) ∣ 2 ∣ f ( x 1 ) − f ( x ∗ ) ∣ 2 \frac{|f(x_k) - f(x^*)|^2}{|f(x_1) - f(x^*)|^2} ∣ f ( x 1 ) − f ( x ∗ ) ∣ 2 ∣ f ( x k ) − f ( x ∗ ) ∣ 2 Constraint Violation Absolute Error :
∥ ∑ i = 1 n C i x i , k ∥ + [ ∑ i = 1 n ( ∥ x i , k − r i ∥ 1 − d i ) ] + \left\| \sum_{i=1}^{n} C_i x_{i,k} \right\| + \left[ \sum_{i=1}^{n} (\|x_{i,k} - r_i\|_1 - d_i) \right]_+ ∥ ∑ i = 1 n C i x i , k ∥ + [ ∑ i = 1 n ( ∥ x i , k − r i ∥ 1 − d i ) ] + Distributed Subgradient 14 : Distributed subgradient algorithmALT (Augmented Lagrangian Tracking) 17 : Augmented Lagrangian tracking algorithmIPLUX (Integrated Primal-Dual Proximal) 16 : Integrated primal-dual proximal algorithmBenchmark Solution : Optimal solution x ∗ x^* x ∗ obtained using YALMIP with MOSEK solver
All algorithms use identical initialization Number of iterations: N = 1200 N = 1200 N = 1200 Proposed algorithm parameters set according to Theorem 1 Figure 1: Primal Problem Optimality Error
Proposed Algorithm : Achieves 10 − 6 10^{-6} 1 0 − 6 precision at k = 1200 k=1200 k = 1200 ALT : Monotonic decrease but slower, ending at approximately 10 − 2 10^{-2} 1 0 − 2 Distributed Subgradient : Slowest descent, remaining in 10 − 1 10^{-1} 1 0 − 1 -10 0 10^0 1 0 0 rangeIPLUX : Performance between ALT and proposed algorithmFigure 2: Constraint Violation Absolute Error
Proposed Algorithm : Earliest to reach below 10 − 4 10^{-4} 1 0 − 4 Other Algorithms : Significantly slower convergenceConvergence Speed : The proposed algorithm converges significantly faster than all comparison methods under the same iteration budgetAccuracy Advantage :Optimality error reduced by approximately 4 orders of magnitude (from 10 − 2 10^{-2} 1 0 − 2 to 10 − 6 10^{-6} 1 0 − 6 ) Feasibility error reduced by approximately 2 orders of magnitude Clear Acceleration Effect : Theoretical advantages of non-ergodic convergence rate are verified in experimentsRobustness : Algorithm performs stably with non-smooth objective functions (containing ℓ 1 \ell_1 ℓ 1 norm) and nonlinear constraintsADMM Methods 6,7 : Alternating Direction Method of MultipliersMirror Methods 8 : Distributed algorithms based on mirror descentGradient Tracking 9 : Gradient tracking for dual problemsAffine Inequality 10-12 : Distributed proximal algorithms, aggregated optimizationNonlinear Inequality 13-18 :
Dual subgradient method 13 Operator splitting primal-dual framework 14 Dynamic average consensus 15 Sparse/dense constraint handling 16 ALT algorithm 17 Nesterov Acceleration 19 : O(1/N²) rate for unconstrained convex optimizationFISTA 20 : Fast Iterative Shrinkage-Thresholding AlgorithmFast Lagrangian Methods 21,22 : Accelerated Lagrangian methods for convex optimizationDistributed Acceleration 23,24 : DCatalyst, energy conservation principleFirst to extend Nesterov acceleration to distributed CCP with simultaneous equality and nonlinear inequality coupling constraintsProvides non-ergodic convergence guarantees (independent of averaging), improving existing ergodic or asymptotic results Applicable to non-smooth objective functions Lipschitz Smoothness of Dual Function :
∥ ∇ g i ( z 1 ) − ∇ g i ( z 2 ) ∥ ≤ l g ∥ z 1 − z 2 ∥ \|\nabla g_i(z_1) - \nabla g_i(z_2)\| \leq l_g \|z_1 - z_2\| ∥∇ g i ( z 1 ) − ∇ g i ( z 2 ) ∥ ≤ l g ∥ z 1 − z 2 ∥
where l g = 2 μ f 2 ( ∥ B i ∥ 2 + l h 2 ) ⋅ max { ∥ B i ∥ 2 , l h 2 } l_g = \sqrt{\frac{2}{\mu_f^2}(\|B_i\|^2 + l_h^2)} \cdot \max\{\|B_i\|^2, l_h^2\} l g = μ f 2 2 ( ∥ B i ∥ 2 + l h 2 ) ⋅ max { ∥ B i ∥ 2 , l h 2 }
Proof Strategy :
Utilize strong convexity of F i F_i F i and convexity of h i h_i h i Obtain gradient expression via Danskin's theorem Establish inequality combining strong convexity and Lipschitz continuity Convergence Rate :
Feasibility Error :
∥ ∑ i = 1 n B i x i , N + 1 − b i ∥ + ∥ [ ∑ i = 1 n h i ( x i , N + 1 ) ] + ∥ ≤ ε c \left\| \sum_{i=1}^{n} B_i x_{i,N+1} - b_i \right\| + \left\| \left[ \sum_{i=1}^{n} h_i(x_{i,N+1}) \right]_+ \right\| \leq \varepsilon_c ∥ ∑ i = 1 n B i x i , N + 1 − b i ∥ + [ ∑ i = 1 n h i ( x i , N + 1 ) ] + ≤ ε c where:
ε c = ( 2 l g N ( N + 1 ) + ρ N + 1 ∥ W ∥ ) ∥ y 1 − y ∗ ∥ 2 + 1 ρ ( N + 1 ) λ 2 ( W ) \varepsilon_c = \left( \frac{2l_g}{N(N+1)} + \frac{\rho}{N+1}\|W\| \right) \|y_1 - y^*\|^2 + \frac{1}{\rho(N+1)\lambda_2(W)} ε c = ( N ( N + 1 ) 2 l g + N + 1 ρ ∥ W ∥ ) ∥ y 1 − y ∗ ∥ 2 + ρ ( N + 1 ) λ 2 ( W ) 1
Optimality Error :
− ε p ≤ f ( x N + 1 ) − f ( x ∗ ) ≤ ε ˉ p -\varepsilon_p \leq f(x_{N+1}) - f(x^*) \leq \bar{\varepsilon}_p − ε p ≤ f ( x N + 1 ) − f ( x ∗ ) ≤ ε ˉ p where ε p \varepsilon_p ε p and ε ˉ p \bar{\varepsilon}_p ε ˉ p have similar O(1/N²) + O(1/N) form.
Proof Key Steps :
Energy Function Construction :
Φ k = G ( y ^ k ) − G ( y ∗ ) − ⟨ λ , y ^ k − y ∗ ⟩ \Phi_k = G(\hat{y}_k) - G(y^*) - \langle \lambda, \hat{y}_k - y^* \rangle Φ k = G ( y ^ k ) − G ( y ∗ ) − ⟨ λ , y ^ k − y ∗ ⟩ Recursive Inequality : Using convexity and smoothness:
k ( k + 1 ) Φ k + 1 − k ( k − 1 ) Φ k ≤ 2 k [ telescoping terms ] k(k+1)\Phi_{k+1} - k(k-1)\Phi_k \leq 2k[\text{telescoping terms}] k ( k + 1 ) Φ k + 1 − k ( k − 1 ) Φ k ≤ 2 k [ telescoping terms ] Summation Technique : Sum from k = 1 k=1 k = 1 to N N N , utilizing telescoping propertyParameter Selection : Achieve acceleration through carefully designed α k , θ k , β k , η k \alpha_k, \theta_k, \beta_k, \eta_k α k , θ k , β k , η k Algorithm Contribution : Proposes the first accelerated distributed algorithm for simultaneous affine equality and nonlinear inequality coupling constraintsTheoretical Guarantee : Establishes non-ergodic O(1/N²) + O(1/N) convergence rate, significantly improving existing resultsPracticality : Each iteration involves simple computation (one local subproblem + one round of neighbor communication), suitable for large-scale deploymentExperimental Validation : On representative test sets, the algorithm achieves higher feasibility and lower error under the same iteration budgetStrong Convexity Assumption : Algorithm and theoretical analysis depend on strong convexity of objective functions (Assumption 1), limiting applicabilitySlater Condition : Requires existence of strictly feasible points (Assumption 2), which may not hold in some practical problemsCompact Set Assumption : Assumption 2 requires local constraint sets X i X_i X i to be compact, excluding unbounded constraintsParameter Tuning : While theoretical parameter settings are provided, practical applications may require problem-specific adjustmentsCommunication Complexity : Does not explicitly analyze communication complexity, focusing only on iteration complexityNon-convex Extension : Theoretical and algorithmic framework does not cover non-convex optimization problemsRelax Strong Convexity : Extend to general convex or even non-convex problemsStochastic/Online Versions : Develop stochastic gradient versions for large-scale dataAsynchronous Communication : Study convergence under asynchronous communication protocolsTime-Varying Networks : Extend to dynamically changing communication topologiesPractical Applications : Validate in real systems such as smart grids and multi-agent coordinationStrong Theoretical Innovation :First to achieve O(1/N²) acceleration in distributed optimization with simultaneous equality and nonlinear inequality coupling constraints Non-ergodic convergence guarantee superior to existing ergodic or asymptotic results Rigorous mathematical proofs with clear logic Clever Algorithm Design :Three-variable coordination mechanism (y ~ k , y k , y ^ k \tilde{y}_k, y_k, \hat{y}_k y ~ k , y k , y ^ k ) effectively achieves acceleration Adaptive parameter scheduling balances convergence speed and stability Linearization strategy maintains computational separability Comprehensive Experiments :Comparison with three state-of-the-art algorithms Clear experimental results demonstrating acceleration effect High-quality figures with clear conclusions High Practical Value :Completely distributed algorithm suitable for large-scale deployment Reasonable computational load per iteration Applicable to non-smooth objective functions Clear Writing :Well-structured with rigorous logic Clear symbol definitions Detailed and understandable proofs Strong Assumptions :Strong convexity assumption limits applicability (many practical problems are only convex or non-convex) Compact set and Slater conditions difficult to verify in some applications Experimental Limitations :Testing only on synthetic data, lacking real application scenario validation No testing on large-scale networks (n=20 is relatively small) No analysis of communication overhead and computation time Parameter Dependence :Algorithm performance depends on problem parameters (μ f , l h , ∥ B i ∥ \mu_f, l_h, \|B_i\| μ f , l h , ∥ B i ∥ , etc.) These parameters may be unknown or difficult to estimate in practical applications Convergence Constants :Constants in theoretical convergence rate may be large No lower bounds or optimality analysis provided Missing Analysis :No discussion of algorithm sensitivity to initialization No analysis of parameter selection impact on convergence Lacks discussion of failure cases or difficult scenarios Academic Value :Provides new theoretical tools for distributed constrained optimization Acceleration techniques may inspire other distributed algorithm designs Expected high citation in optimization and control fields Practical Value :Directly applicable to smart grid economic dispatch Extensible to multi-robot coordination, sensor networks, etc. Algorithm 1 provides clear implementation guidelines Reproducibility :Detailed algorithm description, easy to implement Clear experimental setup Authors encouraged to open-source code for broader application Strongly Recommended Scenarios :
Smart grid economic dispatch (satisfies strong convexity and compact set assumptions) Resource allocation problems (convex cost functions) Distributed machine learning (strongly convex regularization) Use with Caution Scenarios :
Non-convex optimization problems (theory inapplicable) Unbounded constraint sets (violates compact set assumption) Real-time systems (may require many iterations) Scenarios Requiring Improvement :
Large-scale networks (scalability needs verification) Time-varying environments (algorithm extension needed) Communication-limited settings (communication efficiency consideration needed) 6 T.-H. Chang et al., "Multi-agent distributed optimization via inexact consensus ADMM," IEEE Trans. Signal Process., 2014.
14 S. Liang and G. Yin, "Distributed dual subgradient algorithms with iterate-averaging feedback," IEEE Trans. Cybernetics, 2019.
16 X. Wu et al., "Distributed optimization with coupling constraints," IEEE Trans. Automatic Control, 2022.
17 A. Falsone and M. Prandini, "Augmented Lagrangian tracking for distributed optimization," Automatica, 2023.
19 Y. Nesterov, "A method for unconstrained convex minimization problem with the rate of convergence O(1/k²)," Dokl. Akad. Nauk. SSSR, 1983.
Overall Assessment : This is a high-quality theoretical paper making important contributions to distributed optimization. The algorithm design is clever, theoretical analysis rigorous, and experimental results convincing. While certain assumption limitations exist, the algorithm demonstrates significant advantages within its applicable scope. Further validation in practical systems is recommended, along with exploration of relaxing the strong convexity assumption.