2025-11-25T01:19:18.327955

Distributed Thompson sampling under constrained communication

Zerefa, Ren, Ma et al.
In Bayesian optimization, a black-box function is maximized via the use of a surrogate model. We apply distributed Thompson sampling, using a Gaussian process as a surrogate model, to approach the multi-agent Bayesian optimization problem. In our distributed Thompson sampling implementation, each agent receives sampled points from neighbors, where the communication network is encoded in a graph; each agent utilizes their own Gaussian process to model the objective function. We demonstrate theoretical bounds on Bayesian average regret and Bayesian simple regret, where the bound depends on the structure of the communication graph. Unlike in batch Bayesian optimization, this bound is applicable in cases where the communication graph amongst agents is constrained. When compared to sequential single-agent Thompson sampling, our bound guarantees faster convergence with respect to time as long as the communication graph is connected. We confirm the efficacy of our algorithm with numerical simulations on traditional optimization test functions, demonstrating the significance of graph connectivity on improving regret convergence.
academic

Distributed Thompson sampling under constrained communication

Basic Information

  • Paper ID: 2410.15543
  • Title: Distributed Thompson sampling under constrained communication
  • Authors: Saba Zerefa, Zhaolin Ren, Haitong Ma, Na Li (Harvard School of Engineering and Applied Sciences)
  • Classification: cs.LG cs.SY eess.SY math.OC stat.ML
  • Publication Date: January 1, 2025 (arXiv v3)
  • Paper Link: https://arxiv.org/abs/2410.15543

Abstract

This paper investigates multi-agent Bayesian optimization under communication constraints. The authors propose a distributed Thompson sampling algorithm using Gaussian processes as surrogate models. In this implementation, each agent receives sampling points from neighbors, with the communication network encoded as a graph structure; each agent utilizes its own Gaussian process to model the objective function. The paper establishes theoretical bounds on Bayesian average regret and Bayesian simple regret, which depend on the structure of the communication graph. Unlike batch Bayesian optimization, these bounds apply to scenarios with restricted communication graphs between agents. Compared to sequential single-agent Thompson sampling, the algorithm guarantees faster convergence rates whenever the communication graph is connected.

Research Background and Motivation

Problem Definition

The core problem addressed in this paper is black-box function optimization in multi-agent systems with communication constraints. Specifically:

  1. Black-box Stochastic Optimization Challenge: Finding the maximum of a function when the objective function is unknown and can only be accessed through noisy evaluations
  2. Multi-agent Collaboration Requirements: Multiple agents can sample the objective function in parallel, but their mutual communication may be restricted
  3. Communication Constraint Realism: In practical applications (e.g., multi-robot source seeking, sensor networks), agents may not have access to information from all other agents

Research Significance

This problem has broad applications in several important domains:

  • Hyperparameter tuning in machine learning
  • Simulation-based optimization
  • Experimental design
  • Multi-robot systems
  • Sensor network optimization

Limitations of Existing Methods

  1. Centralized Approaches Inapplicable: Require a central coordinator to manage all agent data, impractical in distributed scenarios
  2. Batch Bayesian Optimization Assumptions Too Strong: Assumes all agents have access to identical information, inconsistent with communication-constrained reality
  3. Existing Theoretical Guarantees Demanding: Prior literature providing theoretical guarantees for distributed Bayesian optimization requires fully connected communication graphs

Research Motivation

The authors' starting point is to develop a distributed Bayesian optimization algorithm that works under arbitrary communication graph structures and provides corresponding theoretical guarantees.

Core Contributions

  1. Proposed Distributed Thompson Sampling Algorithm: Designed a novel algorithm for multi-agent Bayesian optimization under communication constraints
  2. Established Theoretical Bounds:
    • Bayesian average regret bound: O~(θ(G)Mt)\tilde{O}\left(\sqrt{\frac{\theta(G)}{\sqrt{Mt}}}\right)
    • Bayesian simple regret bound: O~(1tVmax)\tilde{O}\left(\sqrt{\frac{1}{t|V_{max}|}}\right)
  3. Graph Structure Dependency Analysis: Bounds depend on the clique cover number θ(G)\theta(G) and maximum clique size Vmax|V_{max}| of the communication graph
  4. Convergence Guarantees: Proved faster convergence compared to sequential single-agent methods when the communication graph is connected
  5. Numerical Verification: Validated algorithm effectiveness on standard optimization test functions

Methodology Details

Task Definition

For a compact set XRdX \subset \mathbb{R}^d, consider an unknown continuous function f:XRf: X \rightarrow \mathbb{R}, with the goal of finding its maximum. There are MM agents, each capable of querying ff and receiving noisy observations y=f(x)+ϵy = f(x) + \epsilon, where ϵN(0,σϵ2)\epsilon \sim \mathcal{N}(0, \sigma_\epsilon^2).

The communication network is described by a graph G=(V,E)G = (V,E), where V=M|V| = M, and an edge (i,j)E(i,j) \in E indicates that agents ii and jj can communicate. The data accessible to agent ii at time tt is Dt,i={(xτ,j,yτ,j)}jN(i){i},τ<tD_{t,i} = \{(x_{\tau,j}, y_{\tau,j})\}_{j \in \mathcal{N}(i) \cup \{i\}, \tau < t}.

Model Architecture

Gaussian Process Modeling

Each agent ii uses an independent Gaussian process GPt,iGP_{t,i} to model the objective function: fFDt,iGPt,i(μDt,i(x),kDt,i(x,x))f | \mathcal{F}_{D_{t,i}} \sim GP_{t,i}(\mu_{D_{t,i}}(x), k_{D_{t,i}}(x,x'))

where:

  • μDt(x)=kt(x)T(KDt+σn2I)1yDt\mu_{D_t}(x) = k_t(x)^T(K_{D_t} + \sigma_n^2 I)^{-1}y_{D_t}
  • kDt(x,x)=k(x,x)kDt(x)T(KDt+σn2I)1kDt(x)k_{D_t}(x,x') = k(x,x') - k_{D_t}(x)^T(K_{D_t} + \sigma_n^2 I)^{-1}k_{D_t}(x')

Distributed Thompson Sampling Algorithm

Algorithm 1: Distributed Thompson Sampling

1. Set GP prior on f
2. Initialization: For i=1,...,M, set initial data D_{1,i} and GP_{0,i}
3. For t=1,...,T:
   For i=1,...,M:
   a) Update posterior GP_{t,i} based on D_{t,i}
   b) Sample function realization from GP_{t,i}: f̂_{t,i} ~ GP_{t,i}
   c) Select query point: x_{t,i} = argmax_x f̂_{t,i}(x)
   d) Observe y_{t,i}
   e) Broadcast (x_{t,i}, y_{t,i}) to neighbors
   f) Collect evaluations C_{t,i} from neighbors
   g) Update data history: D_{t+1,i} = D_{t,i} ∪ C_{t,i} ∪ {(x_{t,i}, y_{t,i})}

Technical Innovations

  1. Coordinator-Free Design: Each agent independently maintains its own GP model, avoiding bottlenecks of centralized approaches
  2. Communication Graph Structure Exploitation: Theoretical analysis cleverly decomposes the communication graph into disjoint cliques and analyzes the performance of each clique separately
  3. Information-Theoretic Analysis Framework: Utilizes information-theoretic concepts such as maximum information gain (MIG) to bound algorithm performance

Experimental Setup

Test Functions

Two standard optimization test functions are used:

  1. Rosenbrock Function: f(x,y)=(1x)2+100(yx2)2f(x,y) = (1-x)^2 + 100(y-x^2)^2
    • Characteristic: Contains a large valley with the global minimum located within it
  2. Ackley Function: f(x,y)=20exp(0.2x2+y22)exp(12(cos(2πx)+cos(2πy)))+20+ef(x,y) = -20\exp(-0.2\sqrt{\frac{x^2+y^2}{2}}) - \exp(\frac{1}{2}(\cos(2\pi x) + \cos(2\pi y))) + 20 + e
    • Characteristic: Has many local maxima and one global minimum

Communication Networks

Erdős-Rényi random graphs are used with 20 agents and connection probabilities of 0.2, 0.4, and 0.6 respectively.

Evaluation Metrics

  1. Instantaneous Average Regret: RA(t)=1Mi=1M(ff(xt,i))R^A(t) = \frac{1}{M}\sum_{i=1}^M (f^* - f(x_{t,i}))
  2. Instantaneous Simple Regret: RS(t)=fmaxi,τf(xt,i)R^S(t) = f^* - \max_{i,\tau} f(x_{t,i})
  3. Cumulative Regret: Time accumulation of the above metrics

Implementation Details

  • BOTorch package for implementation
  • Gaussian processes use Matérn kernel (ν=5/2\nu = 5/2)
  • 50 time steps per run
  • Argmax computed via grid search

Experimental Results

Main Results

Experimental results strongly support theoretical predictions:

  1. Connectivity Impact: On both Rosenbrock and Ackley functions, graphs with higher connection probability (0.6 > 0.4 > 0.2) achieve better regret convergence performance
  2. Consistent Performance: This trend is verified across both instantaneous simple regret and average regret metrics
  3. Algorithm Effectiveness: Distributed Thompson sampling successfully locates extrema of both test functions

Theoretical Verification

Numerical results validate core predictions of theoretical analysis:

  • Higher connectivity communication graphs yield better performance
  • Graph structure significantly impacts algorithm convergence speed

Theoretical Analysis

Main Theorems

Theorem 3.1 (Bayesian Average Regret Bound): Let {Gk}k{1,...,n}\{G_k\}_{k \in \{1,...,n\}} be a set of nn disjoint cliques of communication graph GG. Then the Bayesian average regret after tt steps satisfies: RAB(t)1Mk=1nVk(C1tVk+C2ξVkβtΨtVktVk)R_{AB}(t) \leq \frac{1}{M}\sum_{k=1}^n |V_k|\left(\frac{C_1}{t|V_k|} + \sqrt{\frac{C_2\xi_{|V_k|}\beta_t\Psi_{t|V_k|}}{t|V_k|}}\right)

Corollary 3.2: Choosing nn as the clique cover number θ(G)\theta(G) of graph GG, we obtain: RAB(t)=O~(θ(G)Mt)R_{AB}(t) = \tilde{O}\left(\sqrt{\frac{\theta(G)}{\sqrt{Mt}}}\right)

Theorem 3.3 (Bayesian Simple Regret Bound): Let Gs=(Vs,Es)G_s = (V_s, E_s) be a clique of GG. Then: RSB(t)C1tVs+C2ξVsβtΨtVstVsR_{SB}(t) \leq \frac{C_1}{t|V_s|} + \sqrt{\frac{C_2\xi_{|V_s|}\beta_t\Psi_{t|V_s|}}{t|V_s|}}

Corollary 3.4: Choosing GmaxG_{max} as the maximum clique, we obtain: RSB(t)=O~(1tVmax)R_{SB}(t) = \tilde{O}\left(\sqrt{\frac{1}{t|V_{max}|}}\right)

Convergence Analysis

Compared to sequential single-agent Thompson sampling with O~(1/t)\tilde{O}(\sqrt{1/t}) regret:

  • Average regret improvement factor: θ(G)/M\sqrt{\theta(G)/M}
  • Simple regret improvement factor: 1/Vmax\sqrt{1/|V_{max}|}

Bayesian Optimization Field

  1. Single-Agent Methods: GP-UCB, Expected Improvement, Thompson Sampling
  2. Batch Methods: Parallel knowledge gradient, batch Thompson sampling
  3. Multi-Agent Methods: Primarily focused on centralized or batch methods under complete connectivity assumptions

Positioning of This Work

This paper is the first to provide theoretical guarantees for distributed Bayesian optimization under communication constraints (non-fully connected graphs), filling an important gap in the field.

Conclusions and Discussion

Main Conclusions

  1. Algorithm Effectiveness: The proposed distributed Thompson sampling algorithm effectively solves multi-agent Bayesian optimization under communication constraints
  2. Theoretical Guarantees: Established regret bounds dependent on communication graph structure, proving convergence advantages under connected graphs
  3. Graph Structure Importance: Communication graph connectivity significantly impacts algorithm performance

Limitations

  1. Synchronization Assumption: The algorithm assumes a global synchronous clock, which may be unrealistic in practical applications
  2. Computational Complexity: Efficiency issues of argmax computation in high-dimensional spaces remain incompletely resolved
  3. Kernel Function Selection: Theoretical analysis depends on specific kernel function assumptions

Future Directions

  1. Asynchronous Versions: Develop algorithm variants not requiring global synchronization
  2. Efficient Optimization: Research efficient computation methods for argmax in high-dimensional Thompson sampling
  3. Tighter Bounds: Seek tighter regret bounds
  4. Practical Applications: Validate the algorithm in real multi-robot or sensor network systems

In-Depth Evaluation

Strengths

  1. Significant Theoretical Contribution: First to provide theoretical guarantees for distributed Bayesian optimization under communication constraints
  2. Practical Problem Setting: Addresses the important real-world issue of communication constraints
  3. Rigorous Analysis: Clear theoretical proof structure with deep analysis using information-theoretic tools
  4. Sufficient Experimental Support: Numerical experiments well validate theoretical predictions

Weaknesses

  1. Limited Experimental Scale: Validation only on 2D test functions and relatively small-scale networks
  2. Insufficient Practical Considerations: Synchronization assumptions and argmax computation efficiency issues limit practical applicability
  3. Lack of Comparative Experiments: Missing direct comparisons with other distributed optimization methods

Impact

  1. High Theoretical Value: Makes important contributions to distributed Bayesian optimization theory
  2. Broad Application Prospects: Potential applications in multi-robot systems, IoT, and other domains
  3. Strong Extensibility: Provides solid theoretical foundation for subsequent research

Applicable Scenarios

  • Multi-robot collaborative optimization tasks
  • Distributed sensor network parameter tuning
  • Collaborative learning in edge computing environments
  • Parallel optimization problems with limited communication bandwidth

Overall Assessment: This is a high-quality paper with significant theoretical contributions to the field of distributed Bayesian optimization. The authors cleverly combine graph theory, information theory, and Bayesian optimization to provide theoretical guarantees for communication-constrained scenarios commonly encountered in practice. While there is room for improvement in practical applicability, its theoretical value and guidance for future research are substantial.