2025-11-13T22:16:11.184529

Fast and Interpretable Mixed-Integer Linear Program Solving by Learning Model Reduction

Li, Chen, Li et al.
By exploiting the correlation between the structure and the solution of Mixed-Integer Linear Programming (MILP), Machine Learning (ML) has become a promising method for solving large-scale MILP problems. Existing ML-based MILP solvers mainly focus on end-to-end solution learning, which suffers from the scalability issue due to the high dimensionality of the solution space. Instead of directly learning the optimal solution, this paper aims to learn a reduced and equivalent model of the original MILP as an intermediate step. The reduced model often corresponds to interpretable operations and is much simpler, enabling us to solve large-scale MILP problems much faster than existing commercial solvers. However, current approaches rely only on the optimal reduced model, overlooking the significant preference information of all reduced models. To address this issue, this paper proposes a preference-based model reduction learning method, which considers the relative performance (i.e., objective cost and constraint feasibility) of all reduced models on each MILP instance as preferences. We also introduce an attention mechanism to capture and represent preference information, which helps improve the performance of model reduction learning tasks. Moreover, we propose a SetCover based pruning method to control the number of reduced models (i.e., labels), thereby simplifying the learning process. Evaluation on real-world MILP problems shows that 1) compared to the state-of-the-art model reduction ML methods, our method obtains nearly 20% improvement on solution accuracy, and 2) compared to the commercial solver Gurobi, two to four orders of magnitude speedups are achieved.
academic

Fast and Interpretable Mixed-Integer Linear Program Solving by Learning Model Reduction

Basic Information

  • Paper ID: 2501.00307
  • Title: Fast and Interpretable Mixed-Integer Linear Program Solving by Learning Model Reduction
  • Authors: Yixuan Li, Can Chen, Jiajun Li, Jiahui Duan, Xiongwei Han, Tao Zhong, Vincent Chau, Weiwei Wu, Wanyuan Wang
  • Classification: cs.LG cs.AI
  • Publication Date: December 31, 2024 (arXiv preprint)
  • Institutions: School of Computer Science and Engineering, Southeast University; Huawei Noah's Ark Laboratory
  • Paper Link: https://arxiv.org/abs/2501.00307

Abstract

This paper proposes a machine learning-based approach for solving large-scale mixed-integer linear programs (MILPs) by leveraging the correlation between MILP structure and solutions. Unlike existing end-to-end solution learning methods, this work learns a simplified equivalent model of the original MILP as an intermediate step. The paper introduces a preference-based model reduction learning method that considers the relative performance of all reduced models on each MILP instance as preference information and employs attention mechanisms to capture this preference information. Experimental results demonstrate approximately 20% improvement in solution accuracy compared to state-of-the-art model reduction ML methods, and achieve 2-4 orders of magnitude speedup compared to the commercial solver Gurobi.

Research Background and Motivation

Problem Definition

Mixed-integer linear programming (MILP) has widespread applications in critical domains including supply chain logistics, service scheduling, energy management, and transportation planning. In practical industrial applications, MILP instances typically involve hundreds of thousands of decision variables and constraints. Existing commercial solvers based on exact methods incur high computational costs and cannot meet real-time requirements.

Research Motivation

  1. Scalability Issues: Existing ML methods directly learn high-dimensional solution space mappings, suffering from scalability problems
  2. Lack of Interpretability: End-to-end approaches cannot provide interpretable solutions or intuitive understanding
  3. Insufficient Utilization of Preference Information: Current model reduction methods focus only on optimal reduced models, neglecting valuable preference information from all reduced models

Limitations of Existing Methods

  • End-to-end Solution Prediction: Low prediction accuracy due to high dimensionality of solution space
  • Learning to Optimize: Limited efficiency due to decision horizon constraints
  • Existing Model Reduction Methods: Treat feasible reduced models as equally ideal labels, failing to fully exploit comparative information

Core Contributions

  1. Proposes Preference-Based Model Reduction Learning: Converts reduced model performance into preference information, fully exploiting comparative information in the instance-strategy space
  2. Designs Attention Mechanism Architecture: Captures correlations between instances and preference-reduced models, improving learning accuracy
  3. Introduces SETCOVER Pruning Technique: Controls the number of reduced models (labels), generating minimal label sets feasible for all instances
  4. Achieves Significant Performance Improvements: Validates on real MILP problems with 20% accuracy improvement over existing methods and 2-4 orders of magnitude speedup compared to Gurobi

Methodology Details

Task Definition

Given a parameterized MILP problem:

min f(c, x)
s.t. g(A, x) ≤ b
     xI ∈ Z^d, x_{-I} ∈ R^{n-d}

where θ = ⟨A, c, b⟩ represents MILP parameters.

Optimal Strategy Definition: s*(θ) = (T(θ), x*_I(θ)), including the set of tight constraints T(θ) and optimal values of integer variables.

Objective: Learn the mapping from parameters θ to optimal strategy s*(θ).

Model Architecture

1. Strategy Generation and Pruning Phase

  • Strategy Generation: Uses Good-Turing estimator to determine instance quantity until N₁/N falls below threshold
  • Strategy Pruning: Constructs instance-strategy bipartite graph G(V_θ, V_s, E), transforms pruning problem into SETCOVER problem
  • Greedy Algorithm: Iteratively selects strategy nodes covering the most uncovered instance nodes

2. Preference-Based Strategy Learning

Preference Computation:

r(θᵢ, sⱼ) = -log(p(θᵢ, sⱼ) + d(θᵢ, sⱼ))

where p(θᵢ, sⱼ) represents infeasibility and d(θᵢ, sⱼ) represents suboptimality.

Preference Definition:

(θᵢ, sⱼ) ≻ (θᵢ, sₖ) ⇔ r(θᵢ, sⱼ) > r(θᵢ, sₖ)

Attention Mechanism Encoding:

A([θᵢ, S^P]) = softmax(QK^T/√d)V

Treats each instance-strategy pair ⟨θᵢ, sⱼ⟩ as a token, extracting features through multi-layer attention mechanisms.

Technical Innovations

1. Preference Sampling Optimization

Leverages transitivity of preferences, ordering strategies by preference, requiring only M_P ordered preference samples instead of (M_P choose 2) pairwise comparisons, reducing sample complexity from quadratic to linear.

2. Dual Loss Function

  • Preference Loss:
L_p(φ) = -1/N ∑∑[μᵢ,ⱼ log(pᵢ,ⱼ) + (1-μᵢ,ⱼ) log(1-pᵢ,ⱼ)]
  • Reward Difference Loss:
L_d(φ) = 1/N ∑∑(r̂ᵢ,σ(j) - r̂ᵢ,σ(j+1) - δᵢ,ⱼ)²
  • Total Loss: L_total(φ) = λ₁L_p(φ) + λ₂L_d(φ)

3. Top-k Strategy Inference

Selects top-k output strategies as candidates during inference, evaluates reduced models and selects the strategy with lowest infeasibility.

Experimental Setup

Datasets

  1. MIPLIB: Real MILP problems from 6 scenarios with varying scales and solving difficulty
  2. Fuel Cell Energy Management Problem: Adjusts problem complexity by increasing time steps T
  3. Inventory Management Problem: 5 large-scale real industrial problems, averaging approximately 100,000 constraints

Evaluation Metrics

Accuracy Metrics:

accuracy = 1/N |{θᵢ | p(θᵢ, ŝⱼ) ≤ ε₁ ∧ d(θᵢ, ŝⱼ) ≤ ε₂}|

where both infeasibility and suboptimality tolerances are set to 1×10⁻⁴.

Comparison Methods

  1. Gurobi: Advanced commercial solver (with warm start enabled)
  2. Gurobi Heuristic: Gurobi heuristic mode (1-second time limit)
  3. MLOPT: Most applicable model reduction-based learning method
  4. Predict and Search: Solution prediction-based method

Implementation Details

  • Optimizer: AdamW, learning rate 0.0001-0.001
  • Learning rate decay: Linear decay by 0.9 every 10 epochs, 100 epochs total
  • Batch size: 128
  • Coordination parameter λ₁: 0.8-0.9

Experimental Results

Main Results

MIPLIB Dataset

  • In most scenarios, the method achieves performance comparable to Gurobi in suboptimality and feasibility
  • Shows slight advantages over MLOPT in feasibility, with significant improvements in suboptimality
  • Heuristic methods perform poorly on multiple metrics due to time constraints

Fuel Cell Energy Management

  • Achieves nearly 39% accuracy improvement on the most complex problem
  • Strategy pruning shows significant effects, particularly as problem scale increases
  • Top-k performance: Model performance improves with increasing k, outperforming MLOPT at equivalent k values

Inventory Management Problems

  • Maintains feasibility across all instances with stable accuracy around 90%
  • Achieves approximately 30% improvement over MLOPT on the largest problem P5 (exceeding 270,000 constraints)

Computational Time Analysis

  • Achieves 3 orders of magnitude time improvement compared to Gurobi
  • Maintains relatively stable time growth as problem scale increases
  • Minimal differences compared to MLOPT (at equivalent k values)
  • Top-k computation can be parallelized for further speedup

Ablation Studies

Preference Learning vs. Reward Fitting

On the fuel cell dataset, the preference method achieves approximately 9% average accuracy improvement over direct reward fitting, with superior performance on infeasibility and suboptimality metrics.

Ordered Sampling vs. Traditional Sampling

The ordered method achieves approximately 8% average accuracy improvement over traditional preference pair sampling across various problem scales, while significantly reducing training costs.

Classification of ML-based MILP Solving Methods

  1. End-to-end Solution Prediction: Directly learns mapping from MILP instances to solutions
  2. Learning to Optimize: Learns to accelerate traditional exact/heuristic methods
  3. Learning to Simplify MILP: Learns preprocessing or problem size reduction

Relationship to Existing Work

  • vs. End-to-end Methods: Avoids high-dimensional solution space learning issues
  • vs. Learning to Optimize: Not constrained by decision horizon limitations
  • vs. Existing Model Reduction: Fully exploits preference information rather than considering only optimal strategies

Conclusions and Discussion

Main Conclusions

  1. Preference-based model reduction learning significantly improves MILP solving performance
  2. SETCOVER pruning technique effectively controls strategy quantity
  3. Attention mechanisms successfully capture instance-strategy correlations
  4. Achieves significant speedup and accuracy improvements on real industrial problems

Limitations

  1. Method applies to homogeneous MILPs with repetitive structure
  2. Requires sufficient training data to learn effective preference models
  3. Strategy generation phase still requires commercial solvers to obtain optimal solutions
  4. Strategy space may remain large in large-scale problems

Future Directions

  1. Extend to broader optimization problem types
  2. Reduce dependence on training data
  3. Further improve strategy generation efficiency
  4. Explore unsupervised or semi-supervised learning approaches

In-Depth Evaluation

Strengths

  1. Strong Innovation: First systematic introduction of preference learning to MILP model reduction
  2. Solid Theoretical Foundation: Based on operations research theory of model reduction concepts
  3. Comprehensive Experiments: Validation on multiple real datasets including industrial-scale problems
  4. Significant Performance: Achieves substantial improvements over existing methods
  5. Good Interpretability: Reduced models correspond to interpretable operational modes

Weaknesses

  1. Limited Applicability: Primarily applies to parameterized MILP problems
  2. Training Cost: Still requires substantial annotated data (optimal solutions)
  3. Insufficient Theoretical Analysis: Lacks convergence and generalization ability guarantees
  4. Limited Comparison Baselines: Primarily compares with one learning method (MLOPT)

Impact

  1. Academic Contribution: Provides new research direction for ML4CO field
  2. Practical Value: Direct application potential in industrial MILP solving
  3. Reproducibility: Detailed method description and clear experimental setup
  4. Inspirational Value: Preference learning ideas extensible to other optimization problems

Applicable Scenarios

  1. Online Stochastic Planning: Requires fast solving of similar MILP instances
  2. Supply Chain Optimization: Problems with fixed structure but varying parameters
  3. Real-time Scheduling: Applications with strict solving time requirements
  4. Large-scale Industrial Optimization: Problems difficult for commercial solvers

References

The paper cites important works in the field, including:

  • Bengio, Lodi, and Prouvost (2021): ML for combinatorial optimization survey
  • Bertsimas and Stellato (2022): Online mixed-integer optimization
  • Misra, Roald, and Ng (2022): Learning for constrained optimization
  • Numerous related works on end-to-end learning and learning to optimize methods

Overall Assessment: This is a high-quality research paper that proposes innovative preference learning methods in the ML4CO field. Despite certain limitations, its theoretical contributions and practical value are prominent, providing a new effective pathway for large-scale MILP solving.