2025-11-22T22:52:16.673046

Finite sample learning of moving targets

Vertovec, Margellos, Prandini
We consider a moving target that we seek to learn from samples. Our results extend randomized techniques developed in control and optimization for a constant target to the case where the target is changing. We derive a novel bound on the number of samples that are required to construct a probably approximately correct (PAC) estimate of the target. Furthermore, when the moving target is a convex polytope, we provide a constructive method of generating the PAC estimate using a mixed integer linear program (MILP). The proposed method is demonstrated on an application to autonomous emergency braking.
academic

Finite Sample Learning of Moving Targets

Basic Information

  • Paper ID: 2408.04406
  • Title: Finite sample learning of moving targets
  • Authors: Nikolaus Vertovec (University of Oxford), Kostas Margellos (University of Oxford), Maria Prandini (Politecnico di Milano)
  • Classification: math.OC (Optimization and Control), cs.LG (Machine Learning)
  • Submission Date: August 2024 (v3: November 10, 2025)
  • Paper Link: https://arxiv.org/abs/2408.04406

Abstract

This paper investigates the problem of learning moving targets from finite samples. The work extends randomization techniques developed in control and optimization for constant targets to scenarios where targets change over time. The paper derives new bounds on the number of samples required to construct Probably Approximately Correct (PAC) target estimates. Furthermore, when moving targets are convex polyhedra, it provides a constructive method using Mixed Integer Linear Programming (MILP) to generate PAC estimates. The approach is validated in an Automatic Emergency Braking (AEB) application.

Research Background and Motivation

Problem Statement

Traditional statistical learning theory (such as PAC learning) assumes that the target labeling function remains fixed over time. However, in many practical applications, the learning target changes dynamically. This paper investigates how to learn such structured time-varying labeling mechanisms from finite samples while providing probabilistic guarantees.

Problem Significance

  1. Practical Necessity: In control systems, robotics, autonomous driving, and other fields, environmental and system parameters drift over time (e.g., brake performance degradation, vehicle mass variation)
  2. Theoretical Challenge: Classical PAC theory cannot be directly applied to moving targets, requiring a new theoretical framework
  3. Safety-Critical Applications: In safety-critical systems like autonomous driving, strict probabilistic guarantees are essential

Limitations of Existing Methods

  1. Scenario Approach: Primarily addresses uncertainty in optimization for constant targets, not accounting for time-varying targets
  2. VC Theory: Requires finite VC dimension and is mainly designed for static targets
  3. Existing Moving Target Learning: Works such as 2,3,15,20,22,23 mostly provide expectation-based evaluations rather than PAC-type double-sided probabilistic guarantees
  4. Lack of Constructive Methods: Existing analyses are primarily existence proofs without practical algorithms for constructing hypotheses

Research Motivation

This paper aims to:

  1. Provide PAC-type finite sample complexity bounds for moving target learning
  2. Develop constructive algorithms (MILP) to generate hypotheses satisfying theoretical guarantees
  3. Correct mathematical gaps in reference 20 regarding target change bounds

Core Contributions

  1. A Priori Sample Complexity Bounds: Provides a priori bounds on the minimum number of samples needed to generate PAC hypotheses in Section 3 (Theorem 2), extending the work of 20 but using PAC-type results rather than expectation-based evaluation
  2. Mathematical Correction: Corrects mathematical gaps in 20, Theorem 1 by introducing a lower bound μ on target variation (not just upper bound μ̄)
  3. Constructive MILP Method: Proposes the first constructive method in Section 4, using Mixed Integer Linear Programming to generate minimum disagreement hypotheses for convex polyhedra classes (the first constructive method for tracking problems)
  4. Practical Application Validation: Validates theoretical results through an Automatic Emergency Braking (AEB) system case study in Section 5, proposing a sample elimination strategy that improves computational efficiency (eliminating 95% of redundant samples)

Methodology Details

Task Definition

Input:

  • Labeled m-sample: {(x₁, f₁(x₁)), ..., (xₘ, fₘ(xₘ))}
  • Samples xᵢ ∈ X ⊆ ℝⁿ are drawn i.i.d. from probability distribution P
  • Each sample is labeled by a different target function fᵢ: X → {0,1}

Output:

  • Hypothesis hₘ: X → {0,1} to predict the label fₘ₊₁(x) for the next sample x

Constraints:

  • All target and hypothesis functions belong to the same class H with finite VC dimension (Assumption 1)
  • Target variation satisfies a structured assumption: average disagreement probability μ = (1/m)∑ᵢ₌₁ᵐ er(fᵢ, fₘ₊₁) satisfies μ ≤ μ ≤ μ̄ (Assumption 2)

Objective: Find m₀(ε, δ) such that for m ≥ m₀, the constructed hypothesis satisfies:

Pᵐ{(x₁,...,xₘ) ∈ Xᵐ : P{x ∈ X : hₘ(x) ≠ fₘ₊₁(x)} ≤ ε₀ + ε} ≥ 1-δ

where ε₀ depends on the target movement speed.

Theoretical Framework

Key Definitions

  1. Probabilistic Disagreement:
er(f, h) := P{x ∈ X : h(x) ≠ f(x)}
  1. Empirical Disagreement:
êrₘ(f, h) := (1/m)∑ᵢ₌₁ᵐ |f(xᵢ) - h(xᵢ)|
  1. Minimum Disagreement Hypothesis Set (Definition 1):
Mₘ := argminₕ∈H (1/m)∑ᵢ₌₁ᵐ |fᵢ(xᵢ) - h(xᵢ)|

Main Theoretical Result (Theorem 2)

For ε, δ ∈ (0,1), if μ < 1/4 and m ≥ m₀(ε, δ), where:

m₀(ε, δ) = max{
    (1/(2μ²))ln(2/δ),
    (5(4μ̄ + ε)/ε²)(ln(8/δ) + d·ln(40(4μ̄ + ε)/ε²))
}

then for any hₘ ∈ Mₘ:

Pᵐ{(x₁,...,xₘ) ∈ Xᵐ : er(fₘ₊₁, hₘ) ≤ 4μ̄ + ε} ≥ 1-δ

Proof Sketch:

  1. Define two events:
    • E = {true error > 4μ̄ + ε} (error set)
    • A = {empirical average disagreement > 2μ̄} (approximation set)
  2. Decompose probability: Pᵐ{E} ≤ Pᵐ{A} + Pᵐ{E ∩ Ā}
  3. Bound Pᵐ{A}: Using Hoeffding's inequality (Proposition 1), requiring m ≥ (1/(2μ²))ln(2/δ)
  4. Bound Pᵐ{E ∩ Ā}:
    • Utilize minimum disagreement property: ∑|fᵢ(xᵢ) - hₘ(xᵢ)| ≤ ∑|fᵢ(xᵢ) - fₘ₊₁(xᵢ)|
    • Apply triangle inequality: êrₘ(fₘ₊₁, hₘ) ≤ 2·(1/m)∑|fᵢ(xᵢ) - fₘ₊₁(xᵢ)|
    • Apply Theorem 1 (VC theory result), requiring the second sample bound

Constructive MILP Method

Problem Setup

Assume targets and hypotheses are indicator functions of convex polyhedra:

fᵢ(x) = 1_{Bᵢ}(x), hₘ(x) = 1_{Bhₘ}(x)

where Bₕₘ is parameterized by Ax + b ≤ 0 with at most nf faces.

MILP Construction Steps

Step 1: Handle samples with label 1 (i ∈ I₁)

If hₘ(xᵢ) = fᵢ(xᵢ) = 1, then xᵢ ∈ Bₕₘ, i.e.:

aⱼxᵢ + bⱼ ≤ 0, ∀j = 1,...,nf

Introduce slack variables sᵢⱼ ≥ 0 to allow disagreement:

aⱼxᵢ + bⱼ ≤ sᵢⱼ, ∀j = 1,...,nf

Step 2: Handle samples with label 0 (i ∈ I₀)

If hₘ(xᵢ) = fᵢ(xᵢ) = 0, then xᵢ ∉ Bₕₘ. Use binary variables zᵢⱼ ∈ {0,1} and big-M method:

aⱼxᵢ + bⱼ ≤ Mⱼ(1 - zᵢⱼ), ∀j
aⱼxᵢ + bⱼ ≥ ϱ + (mⱼ - ϱ)zᵢⱼ - sᵢⱼ, ∀j
∑ⱼzᵢⱼ ≤ nf - 1

Step 3: Minimize Disagreement

Introduce binary variables vᵢ ∈ {0,1} indicating disagreement:

vᵢ = 1 ⟺ ∑ⱼsᵢⱼ > 0

Implement via constraints:

∑ⱼsᵢⱼ - vᵢ∑ⱼMⱼ ≤ 0 (if i ∈ I₁)
∑ⱼsᵢⱼ + vᵢ∑ⱼmⱼ ≤ 0 (if i ∈ I₀)

Step 4: Complete MILP

minimize ∑ᵢ₌₁ᵐ vᵢ
subject to:
  ∀i ∈ I₁: constraints (35)
  ∀i ∈ I₀: constraints (36)

Technical Innovations

  1. Double-Sided Probabilistic Guarantees: Compared to expectation-based evaluation in 20, provides stronger PAC-type guarantees
  2. Target Variation Lower Bound: Introduces μ to correct mathematical errors in 20, making bounds more precise
  3. Constructive Algorithm: First constructive method for tracking problems (all prior work only provides existence proofs)
  4. Sample Elimination Strategy: In AEB application, eliminates 95% of redundant samples through geometric analysis, significantly improving computational efficiency
  5. Theoretical Unification: Constant targets as special case (Theorem 2 reduces to Theorem 1 when μ = μ̄ = 0)

Experimental Setup

Application Scenario: Automatic Emergency Braking (AEB) System

Problem Description:

  • Vehicle receives measurements of obstacle distance l and own velocity v
  • Safety condition: braking distance ≤ available distance, i.e., (1/2)v²(m/F) ≤ l
  • Braking force F and vehicle mass m change over time (brake wear, fuel/passenger variation)

Labeling Function:

fᵢ(x) = 1 if (1/2)v²(mᵢ/Fᵢ) ≤ l, else 0

where x = (l, v²)

Data Generation

Distribution Setup:

  • Distance l: uniform distribution on 40m, 120m
  • Velocity squared v²: normal distribution with mean (70km/h)², standard deviation (20km/h)²

Target Variation:

  • Brake force degradation: Fᵢ₊₁ = ωF·Fᵢ, ωF ~ N(1-3×10⁻⁷, 10⁻⁶)
  • Mass variation: mᵢ₊₁ = ωₘ·mᵢ, ωₘ ~ N(1, 10⁻³)
  • Initial mass: m = 900kg

Parameter Settings

Theoretical Parameters:

  • Confidence level: δ = 10⁻⁶
  • Accuracy: ε = 1%
  • Target variation bounds: μ = 0.78%, μ̄ = 2%
  • VC dimension: d = 1 (single half-plane determines label)

Theoretical Sample Requirement: According to Theorem 2, m₀(ε, δ) = 119,237

Implementation Details

  1. Polyhedron Parameterization:
    • Fix rotation angle θ = tan⁻¹(m/(2F)) to simplify nonlinearity
    • Consider only relevant faces (third face determines label)
  2. Sample Elimination:
    • Eliminate cyan region samples (left of all I₁ samples)
    • Eliminate magenta region samples (right of all I₀ samples)
    • Elimination rate: 95%
  3. MILP Solving:
    • Use commercial solver
    • Solving time: 561 seconds
    • Objective: minimize disagreement count + volume as tie-break

Experimental Results

Main Results

MILP Solution:

  • Total violations (disagreement count): v = 1,335
  • Solving time: 561 seconds
  • Sample utilization: 5% of 119,237 samples retained for MILP

Theory vs. Practice:

  • Theoretical guarantee: er(fₘ₊₁, hₘ) ≤ 4μ̄ + ε = 9% (confidence 1-δ)
  • Actual average empirical disagreement: ≈ 2.4% (via 500 Monte Carlo runs)
  • Conclusion: Actual performance significantly exceeds theoretical guarantee

Monte Carlo Verification

Verification Method:

  • 500 independent runs
  • Each run: generate new fₘ₊₁ (random brake degradation and mass variation)
  • Each run: sample 5000 test samples to compute êrₘ(fₘ₊₁, hₘ)

Result Distribution (Figure 7):

  • Empirical disagreement concentrated in 2-3% range
  • Far below theoretical upper bound of 9%
  • Validates effectiveness and conservatism of theoretical guarantees

Visualization Analysis

Figure 3: Brake performance evolution over time

  • Red half-plane: true safety boundary (changes with iteration)
  • Transparent half-planes: historical safety boundaries
  • Green circles: label 0 (safe)
  • Blue triangles: label 1 (unsafe)

Figure 5: Sample elimination strategy

  • Cyan/magenta regions: redundant samples (eliminated)
  • Red samples: retained for MILP
  • Elimination rate: 95%

Figure 6: Generated hypothesis

  • Red half-plane: constructed hypothesis hₘ
  • Red samples: violation points (1,335 total)
  • Hypothesis effectively tracks moving safety boundary

Sample Complexity Analysis (Figure 2)

Trend Observations:

  1. High ε Region: First term sample bound dominates (constant part), depends on μ
  2. Low ε Region: Second term sample bound dominates (non-constant part), depends on μ̄
  3. μ Impact: Smaller μ requires more samples (difficult to learn true change rate)
  4. μ̄ Impact: Larger μ̄ requires more samples (fast-moving targets harder to track)

Statistical Learning Theory Foundations

  1. VC Theory 29:
    • Provides PAC learning bounds for finite VC dimension classes
    • This paper extends to moving target scenarios
  2. Scenario Approach 5-7,9,12:
    • Randomization method for uncertain convex optimization
    • Provides a priori feasibility guarantees
    • This paper applies these ideas to non-convex and moving targets
  3. Compressed Learning 11,24:
    • Connection between scenario approach and statistical learning
    • Generalization bounds based on compression concepts

Moving Target Learning

  1. Concept Drift Learning 2,3,15,22,23:
    • 2,22: Learning by exploiting change structure
    • 3: Complexity of learning from drifting distributions
    • 23: Simultaneous distribution and target variation
    • Distinction: Most provide expectation-based evaluation; this paper provides PAC guarantees
  2. Tracking Drifting Concepts 20:
    • Tracking through disagreement minimization
    • This paper's improvement: Corrects mathematical errors, provides PAC rather than expectation-based guarantees
  3. Adaptive Change Rates 19:
    • Adapts to variable target change rates
    • This paper assumes change rate bounds are known

Control Applications

  1. Control Synthesis 13,14,16,28:
    • Randomization methods in control design
    • Sample complexity bounds
  2. Game Theory 17:
    • PAC Nash equilibrium learning
  3. Reinforcement Learning 14:
    • Safe policy training and deployment

Conclusions and Discussion

Main Conclusions

  1. Theoretical Contribution: Moving targets under structured change assumptions remain PAC learnable with accuracy 4μ̄ + ε
  2. Sample Complexity: Provides explicit sample bounds depending on:
    • Accuracy ε (polynomial dependence on 1/ε)
    • Confidence δ (logarithmic dependence)
    • Target variation bounds μ, μ̄
    • VC dimension d
  3. Constructive Method: First MILP-based constructive method for generating minimum disagreement hypotheses
  4. Practical Utility: Validated on AEB system with actual performance exceeding theoretical guarantees

Limitations

  1. Variation Bound Assumption: Requires prior knowledge of μ and μ̄
    • May be difficult to estimate accurately in practice
    • Conservative estimates increase sample requirements
  2. Accuracy Degradation: Compared to constant targets, accuracy degrades by 4μ̄
    • Only meaningful when μ̄ is relatively small
    • May not apply to rapidly changing targets
  3. MILP Computational Complexity:
    • High computational cost for large sample sizes
    • While elimination strategy is effective, depends on problem geometry
  4. Convex Polyhedron Restriction: Constructive method only applies to convex polyhedra classes
    • Other hypothesis classes require alternative methods
  5. Fixed Distribution Assumption: Sample distribution P remains fixed
    • 23 considers distribution variation; this paper does not

Future Directions

  1. Distribution Drift: Consider sample distribution P also changing over time (as in 23)
  2. Adaptive Methods:
    • Online estimation of μ and μ̄
    • Dynamically adjust sample requirements
  3. More General Hypothesis Classes:
    • Extend MILP method to other structures
    • Non-convex hypotheses like neural networks
  4. Computational Optimization:
    • More efficient MILP solving
    • Approximation algorithms balancing accuracy and efficiency
  5. Theoretical Improvements:
    • Tighter sample complexity bounds
    • Reduce dependence on μ̄

In-Depth Evaluation

Strengths

1. Theoretical Rigor

  • Complete mathematical derivations, correcting errors in reference 20
  • Provides double-sided probabilistic guarantees (PAC-type), stronger than expectation-based evaluation
  • Constant targets as special case, unified theory

2. Methodological Innovation

  • First constructive algorithm for moving target tracking
  • Elegant MILP formulation with clever constraint design (big-M method, slack variables)
  • Practical sample elimination strategy (95% elimination rate)

3. Experimental Sufficiency

  • Selects safety-critical AEB application with clear motivation
  • Comprehensive Monte Carlo verification (500 runs)
  • Clear comparison between theory and practice

4. Writing Clarity

  • Clear structure, progressing from problem definition through theory to construction to application
  • Rich visualizations (Figure 1 conceptual diagram, Figures 3-7 result plots)
  • Standard mathematical notation

Weaknesses

1. Practical Assumption Utility

  • Prior Knowledge of μ and μ̄: Difficult to obtain accurately in practice
    • Paper lacks discussion of estimation methods
    • Impact of incorrect estimation not analyzed
  • μ < 1/4 Constraint: Strong restriction, unsuitable for rapidly changing systems

2. Experimental Limitations

  • Single Application: Only AEB case study, lacks diversity
  • Simplified Assumptions: Fixed rotation angle θ avoids nonlinearity but loses generality
  • Missing Comparisons: No direct experimental comparison with method 20 and others

3. Computational Efficiency

  • Large Sample Size: 119,237 samples may be impractical in some applications
  • MILP Solving Time: 561 seconds still lengthy, limiting real-time applications
  • Scalability: Extensibility to high dimensions and complex polyhedra not thoroughly explored

4. Theoretical Bound Tightness

  • Actual error 2.4% vs. theory 9%: significant gap
  • Potential for improvement, but paper lacks deep analysis

5. Missing Distribution Drift

  • Fixed P assumption invalid in many practical scenarios
  • While proposed as future work, limits current applicability

Impact

1. Academic Contribution

  • Theoretical Advance: Extends PAC learning to moving targets, filling theoretical gap
  • Methodological Contribution: Constructive MILP method can inspire other tracking problems
  • Interdisciplinary: Connects statistical learning, optimization, and control theory

2. Practical Value

  • Safety-Critical Systems: Theoretical foundation for AEB and similar applications
  • Industrial Relevance: Brake degradation and similar problems exist in practice
  • Extensibility: Framework applicable to other time-varying systems

3. Reproducibility

4. Future Research Directions

  • Inspires research on simultaneous distribution and target drift
  • Motivates online learning and adaptive methods
  • Encourages constructive methods for other hypothesis classes

Applicable Scenarios

Suitable Scenarios:

  1. Slowly Changing Systems: Small μ̄ (<5%), e.g., gradual brake degradation
  2. Convex Structure Problems: Targets representable as convex polyhedra
  3. Offline Learning: Sufficient time for sample collection and MILP solving
  4. Safety-Critical Applications: Requiring strict probabilistic guarantees

Unsuitable Scenarios:

  1. Rapid Changes: μ̄ near 1/4 or larger
  2. Real-Time Requirements: Cannot afford large samples and long solving times
  3. Complex Non-Convex Targets: Beyond convex polyhedra class
  4. Distribution Drift: Sample distribution P also changes significantly
  5. Unknown Change Rates: Cannot estimate μ and μ̄

Potential Improvements

  1. Adaptive Estimation: Online estimation of μ and μ̄, dynamically adjust sample needs
  2. Distributed MILP: Parallel solving to accelerate computation
  3. Neural Network Approximation: Use NN to quickly approximate MILP solutions
  4. Active Learning: Intelligently select sample locations to reduce sample requirements
  5. Robustness Analysis: Sensitivity analysis of μ and μ̄ estimation errors

Selected References

1 Alamo et al., 2009. "Randomized strategies for probabilistic solutions" - Foundation of randomization methods

5-7,9,12 Calafiore & Campi series. "The scenario approach" - Core scenario approach literature

20 Helmbold & Long, 1994. "Tracking drifting concepts by minimizing disagreements" - Primary extension target of this paper

29 Vidyasagar, 2003. "Learning and Generalisation" - Classical textbook on PAC learning and VC theory

28 Tempo et al., 2005. "Randomized algorithms for analysis and control" - Randomization methods in control


Overall Assessment: This is an excellent paper combining theoretical rigor with methodological innovation. Main contributions extend PAC learning to moving targets and provide the first constructive algorithm. Theory is complete, correcting literature errors, and experiments are comprehensive. Primary limitations include need for prior variation bounds, high computational complexity, and fixed distribution assumption. The paper is well-suited for slowly changing safety-critical systems and makes important contributions to the intersection of control theory and statistical learning. Recommended future work should focus on adaptive estimation, distribution drift, and computational efficiency optimization.