
The Runtime of Random Local Search on the Generalized Needle Problem

Doerr, Kelley

In their recent work, C. Doerr and Krejca (Transactions on Evolutionary Computation, 2023) proved upper bounds on the expected runtime of the randomized local search heuristic on generalized Needle functions. Based on these upper bounds, they deduce in a not fully rigorous manner a drastic influence of the needle radius $k$ on the runtime. In this short article, we add the missing lower bound necessary to determine the influence of parameter $k$ on the runtime. To this aim, we derive an exact description of the expected runtime, which also significantly improves the upper bound given by C. Doerr and Krejca. We also describe asymptotic estimates of the expected runtime.

Basic Information

Paper ID: 2403.08153
Title: The Runtime of Random Local Search on the Generalized Needle Problem
Authors: Benjamin Doerr, Andrew James Kelley
Categories: cs.NE (Neural and Evolutionary Computing), cs.AI (Artificial Intelligence), cs.DS (Data Structures and Algorithms)
Published: March 21, 2024
Paper link: https://arxiv.org/abs/2403.08153

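Abstract

This paper complements and improves on the 2023 work of C. Doerr and Krejca on upper bounds for the expected runtime of the randomized local search heuristic on generalized Needle functions. That work used its upper bounds to infer a strong influence of the needle radius k on the runtime, but without a rigorous proof. By deriving an exact expression for the expected runtime, this paper supplies the missing lower-bound analysis, significantly improves the earlier upper bounds, and gives asymptotic estimates of the expected runtime.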

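Background and Motivation

Problem addressed: Determine the exact runtime of the randomized local search (RLS) algorithm on the generalized Needle problem, and in particular how the parameter k (the needle radius) affects performance.
Why the problem matters:
- The generalized Needle problem is an important benchmark for understanding how randomized search heuristics cope with plateaus of constant fitness
- It ties together research on classic problems such as royal road functions, plateau problems, and BlockLeadingOnes
- It provides a theoretical basis for designing and analyzing benchmarks with tunable features
Limitations of existing approaches:
- The work of C. Doerr and Krejca gives only upper bounds and no lower-bound analysis
- It relies on intricate drift analysis, the optional stopping theorem, and a generalized Wald's equation
- For k = o(n), the upper bound is super-exponential and clearly too loose
Motivation: Complete the theoretical analysis and simplify the proof technique by providing an exact runtime expression together with asymptotic estimates.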

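Core Contributions

Exact formula for the expected runtime: If the initial solution has i ones, the expected runtime is $\sum_{j=i}^{n-k-1} \binom{n}{\leq j} / \binom{n-1}{j}$
Substantially improved upper bound: In particular for k = o(n), the super-exponential upper bound is replaced by the asymptotically tight bound $2^n \binom{n}{k}^{-1}$
Simpler analysis: Classic Markov chain arguments replace the intricate drift analysis
Complete asymptotic analysis: Covers the different ranges of k, including sublinear k, linear k, and k close to n/2
Correction of an error in the earlier work: Identifies and corrects the erroneous conclusion that the runtime is constant when k = n/2 - Θ(1)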
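
The exact formula above is easy to evaluate with exact rational arithmetic. The following is a minimal sketch, not code from the paper; the function names `step_time` and `expected_runtime` are chosen here for illustration, and $\binom{n}{\leq j}$ is read as $\sum_{m=0}^{j} \binom{n}{m}$. It evaluates a case small enough to check by hand and then compares the exact value with $2^n \binom{n}{k}^{-1}$ for a small radius.

```python
from fractions import Fraction
from math import comb


def step_time(n: int, j: int) -> Fraction:
    """Summand of the exact formula: C(n, <=j) / C(n-1, j)."""
    cumulative = sum(comb(n, m) for m in range(j + 1))
    return Fraction(cumulative, comb(n - 1, j))


def expected_runtime(n: int, k: int, i: int) -> Fraction:
    """Exact expected runtime when the initial solution has i ones."""
    return sum((step_time(n, j) for j in range(i, n - k)), Fraction(0))


# Hand-checkable case: n = 4, k = 1, i = 0 gives 1 + 5/3 + 11/3.
print(expected_runtime(4, 1, 0))  # 19/3

# Small radius relative to n: ratio of the exact value to 2^n / C(n, k).
n, k = 200, 3
ratio = expected_runtime(n, k, 0) / Fraction(2**n, comb(n, k))
print(float(ratio))
```

For this choice of n and k the ratio is only slightly above 1, which is what one would expect if $2^n \binom{n}{k}^{-1}$ is the right order for small k.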

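Method Details

Task Definition

Generalized Needle function: For $n \in \mathbb{N}$ and $k \in [0..n]$, the generalized Needle function $\text{Needle}_{n,k}$ is defined as:

$\text{Needle}_{n,k}(x) = \begin{cases} 0, & \text{if } \|x\|_1 < n-k \\ 1, & \text{if } \|x\|_1 \geq n-k \end{cases}$

Here $\|x\|_1$ denotes the number of ones in the bit string x. The global optima are the all-ones string and every bit string that differs from it in at most k positions.

Randomized local search (RLS): In each iteration, flip one bit of the current solution chosen uniformly at random, and accept the new solution if it is not worse than the current one.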
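
The definitions above can be sketched in a few lines of Python. This is an illustrative simulation, not the authors' code: the uniformly random initial solution, the iteration cap `max_iters`, and the function names are assumptions made here. The returned value counts iterations until a global optimum is first reached, so a run that starts on an optimum returns 0.

```python
import random


def needle(x: list[int], k: int) -> int:
    """Needle_{n,k}: 1 if x has at least n-k ones, else 0."""
    return 1 if sum(x) >= len(x) - k else 0


def rls(n: int, k: int, rng: random.Random, max_iters: int = 10**7) -> int:
    """Run RLS from a uniformly random start (an assumption of this sketch)."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = needle(x, k)
    t = 0
    while fx < 1 and t < max_iters:
        pos = rng.randrange(n)
        x[pos] ^= 1              # flip one uniformly chosen bit
        fy = needle(x, k)
        if fy >= fx:             # accept if not worse
            fx = fy
        else:                    # never reached on the plateau; kept for fidelity
            x[pos] ^= 1
        t += 1
    return t


rng = random.Random(0)
runs = [rls(6, 1, rng) for _ in range(2000)]
print(sum(runs) / len(runs))     # empirical mean number of iterations
```

Because every non-optimal string has fitness 0, each proposed flip is accepted until an optimum is hit, so the search is an unbiased random walk on the plateau.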

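Model Structure

Markov chain model:

The random walk of RLS on the hypercube $\{0,1\}^n$ is reduced to a Markov chain on $[0..n]$
The state is the number of ones in the current solution
Transition probabilities (on the plateau, where every move is accepted):
- From state i to i-1: $p_i^- = i/n$
- From state i to i+1: $p_i^+ = (n-i)/n$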
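
The reduction works because the effect of a single bit flip on the number of ones depends only on that number, not on where the ones are. As a small sanity check (a sketch with a helper name chosen here, `projected_step_probs`), the following enumerates all n single-bit flips of every string of length 5 and confirms the stated transition probabilities.

```python
from fractions import Fraction
from itertools import product


def projected_step_probs(x):
    """Return (p_minus, p_plus) for one uniformly random bit flip applied to x."""
    n, i = len(x), sum(x)
    down = up = 0
    for pos in range(n):
        y = list(x)
        y[pos] ^= 1
        if sum(y) == i - 1:
            down += 1            # flipped a 1 to 0
        else:
            up += 1              # flipped a 0 to 1
    return Fraction(down, n), Fraction(up, n)


n = 5
ok = all(
    projected_step_probs(x) == (Fraction(sum(x), n), Fraction(n - sum(x), n))
    for x in product((0, 1), repeat=n)
)
print(ok)  # True
```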

Key lemma: By a classic result of Droste, Jansen, and Wegener, the expected first hitting time of state i+1 when starting in state i is: $E[T_i^+] = \sum_{j=0}^i \frac{1}{p_j^+} \prod_{\ell=j+1}^i \frac{p_\ell^-}{p_\ell^+}$
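
To see the lemma in action, the sum-product formula can be evaluated directly with $p_\ell^- = \ell/n$ and $p_\ell^+ = (n-\ell)/n$ and compared with the closed form $\binom{n}{\leq i} / \binom{n-1}{i}$ that this summary reports for $E[T_i^+]$. This is a rough sketch using exact fractions; the function names are illustrative, and the check is only meaningful for $i < n$, where $p_i^+ > 0$.

```python
from fractions import Fraction
from math import comb


def hitting_time_up(n: int, i: int) -> Fraction:
    """E[T_i^+] from the sum-product formula, for 0 <= i < n."""
    def p_plus(s: int) -> Fraction:
        return Fraction(n - s, n)

    def p_minus(s: int) -> Fraction:
        return Fraction(s, n)

    total = Fraction(0)
    for j in range(i + 1):
        term = 1 / p_plus(j)
        for l in range(j + 1, i + 1):
            term *= p_minus(l) / p_plus(l)
        total += term
    return total


def closed_form(n: int, i: int) -> Fraction:
    """C(n, <=i) / C(n-1, i)."""
    return Fraction(sum(comb(n, m) for m in range(i + 1)), comb(n - 1, i))


n = 12
print(all(hitting_time_up(n, i) == closed_form(n, i) for i in range(n)))  # True
print(hitting_time_up(4, 1))  # 5/3
```

The second line matches the hand computation for n = 4, i = 1: $\frac{1}{p_1^+} + \frac{1}{p_0^+} \cdot \frac{p_1^-}{p_1^+} = \frac{4}{3} + \frac{1}{3} = \frac{5}{3}$.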

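Technical Innovations

Derivation of the exact formula: The Markov chain analysis yields: $E[T_i^+] = \binom{n}{\leq i} / \binom{n-1}{i}$
Framework for the asymptotic analysis:
- Different ranges of k are handled with different analysis strategies
- Asymptotic properties of binomial coefficients and Jensen's inequality are used
Concavity: The expected runtime is shown to be concave as a function of the starting state, which makes Jensen's inequality applicable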
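
The concavity claim can be checked numerically for concrete parameters. The sketch below is an illustration only: it tabulates the exact expected runtime $f(i)$ for starting states $i = 0, \dots, n-k$ (with $f(n-k) = 0$, a restriction of the starting range assumed here) and tests that all second differences are non-positive. The name `runtime_profile` is hypothetical.

```python
from fractions import Fraction
from math import comb


def runtime_profile(n: int, k: int) -> list[Fraction]:
    """f[i] = exact expected runtime from i ones, for i = 0..n-k."""
    f = [Fraction(0)] * (n - k + 1)
    cumulative = sum(comb(n, m) for m in range(n - k))   # C(n, <= n-k-1)
    for j in range(n - k - 1, -1, -1):
        f[j] = f[j + 1] + Fraction(cumulative, comb(n - 1, j))
        cumulative -= comb(n, j)                         # now C(n, <= j-1)
    return f


f = runtime_profile(30, 5)
second_differences = [f[i + 1] - 2 * f[i] + f[i - 1] for i in range(1, len(f) - 1)]
print(all(d <= 0 for d in second_differences))
```

Since $f(i) - f(i+1) = E[T_i^+]$, concavity on this range is the same as saying that the step times $E[T_i^+]$ are non-decreasing in $i$.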

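Experimental Setup

The paper is primarily theoretical. It has no experimental section in the usual sense; the results are established by mathematical proof.

Scope of the Analysis

Sublinear k: k = o(n)
Linear k: k = n/2 - εn, where ε > 0 is a constant
k close to n/2: n/2 - k = o(n)
k larger than n/2: k ≥ n/2 + √n log n

Theorem 11 (linear k): If k = n/2 - εn with 0 < ε < 1/2, then: $E[T] = \Theta\left(2^n \binom{n}{k}^{-1}\right)$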
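
As a numerical illustration of the linear regime, one can evaluate the exact runtime formula for ε = 1/4, that is k = n/4, and watch the ratio to $2^n \binom{n}{k}^{-1}$ as n grows. This is a sketch under assumptions made here: the run starts from the all-zeros string (the summary does not state the initialization used in the theorem), and `runtime_from_zero` is an illustrative name.

```python
from fractions import Fraction
from math import comb


def runtime_from_zero(n: int, k: int) -> Fraction:
    """Sum of C(n, <=j) / C(n-1, j) over j = 0..n-k-1 (start with zero ones)."""
    total = Fraction(0)
    cumulative = 0
    for j in range(n - k):
        cumulative += comb(n, j)                 # C(n, <= j)
        total += Fraction(cumulative, comb(n - 1, j))
    return total


ratios = {}
for n in (40, 80, 160):
    k = n // 4
    ratios[n] = runtime_from_zero(n, k) / Fraction(2**n, comb(n, k))
    print(n, k, round(float(ratios[n]), 3))
```

A ratio that stays within a constant band over these values of n is consistent with the Θ statement, though a finite table is of course not a proof.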

Theorem 13 (k close to n/2):

If k = n/2 - g(n), where g(n) = ω(√n) and g(n) = o(n): $E[T] = O\left(g(n)2^n \binom{n}{k}^{-1}\right) \text{ and } E[T] = \Omega\left(2^n \binom{n}{k}^{-1}\right)$
If k = n/2 - O(√n): $E[T] = \Theta(n)$