A Branch-and-Bound Approach for Maximum Low-Diameter Dense Subgraph Problems
Zhou, Luo, Wang et al.
A graph with $n$ vertices is an $f(\cdot)$-dense graph if it has at least $f(n)$ edges, $f(\cdot)$ being a well-defined function. The notion $f(\cdot)$-dense graph encompasses various clique models like $γ$-quasi cliques, $k$-defective cliques, and dense cliques, arising in cohesive subgraph extraction applications. However, the $f(\cdot)$-dense graph may be disconnected or weakly connected. To conquer this, we study the problem of finding the largest $f(\cdot)$-dense subgraph with a diameter of at most two in the paper. Specifically, we present a decomposition-based branch-and-bound algorithm to optimally solve this problem. The key feature of the algorithm is a decomposition framework that breaks the graph into $n$ smaller subgraphs, allowing independent searches in each subgraph. We also introduce decomposition strategies including degeneracy and two-hop degeneracy orderings, alongside a branch-and-bound algorithm with a novel sorting-based upper bound to solve each subproblem. Worst-case complexity for each component is provided. Empirical results on 139 real-world graphs under two $f(\cdot)$ functions show our algorithm outperforms the MIP solver and pure branch-and-bound, solving nearly twice as many instances optimally within one hour.
academic
A Branch-and-Bound Approach for Maximum Low-Diameter Dense Subgraph Problems
Title: A Branch-and-Bound Approach for Maximum Low-Diameter Dense Subgraph Problems
Authors: Yi Zhou (University of Electronic Science and Technology of China), Chunyu Luo (University of Electronic Science and Technology of China), Zhengren Wang (Peking University), Zhang-Hua Fu (Shenzhen Institute of Artificial Intelligence and Robotics for Society, CUHK Shenzhen)
Classification: cs.DS (Data Structures and Algorithms)
This paper investigates the maximum low-diameter f(·)-dense subgraph problem (MfDS). An f(·)-dense graph is defined as a graph with n vertices containing at least f(n) edges, where f(·) is a well-defined function. This concept encompasses various clique relaxation models including γ-quasi-cliques, k-defective cliques, and dense cliques. However, f(·)-dense graphs may be disconnected or weakly connected. To address this issue, the paper studies the problem of finding maximum f(·)-dense subgraphs with diameter at most 2. Specifically, a decomposition-based branch-and-bound algorithm is proposed to optimally solve this problem. The algorithm's key characteristic is decomposing the graph into n smaller subgraphs, allowing independent search within each subgraph. The paper introduces decomposition strategies such as degeneracy ordering and two-hop degeneracy ordering, along with a branch-and-bound algorithm featuring novel ordering-based bounds. Experimental results on 139 real-world graphs demonstrate that the algorithm outperforms MIP solvers and pure branch-and-bound methods, solving nearly twice as many instances within one hour.
Identifying tightly connected subgraphs (cohesive subgraphs) is a core task in network analysis, with applications in community detection, protein complex identification, financial network analysis, and other domains. The classical clique model requires edges between all pairs of vertices, but this requirement is too restrictive, especially in practical applications with noise and errors. Consequently, various clique relaxation models have emerged, such as:
γ-quasi-cliques: Induced subgraphs of n vertices with at least γ(n choose 2) edges
s-defective cliques: With at least (n choose 2) - s edges
Average s-plex: With at least n(n-s)/2 edges
These models can be unified as f(·)-dense graphs: graphs with n vertices containing at least f(n) edges.
Existing f(·)-dense graph models suffer from a critical flaw: they may be disconnected or have weak connectivity. The paper demonstrates this through two examples in Figure 1:
G1: A 200-vertex clique plus a 10-vertex path, requiring 10 hops from v1 to v210
G2: A 200-vertex clique plus 10 isolated vertices, with no path between v1 and v210
Both graphs satisfy the density requirement f(n)=0.9(n choose 2), but are clearly not tightly connected communities.
MIP approaches: While mixed integer programming models exist for specific f(·) functions (e.g., GF3), they are inefficient for large-scale graphs and lack MIP models for diameter constraints
Branch-and-bound methods: Existing algorithms target specific problems (e.g., maximum s-defective clique, maximum γ-quasi-clique) but lack a unified framework for general f(·)-dense graphs
Connectivity constraints: Santos et al. (2024) proposed spanning tree-based connectivity constraints, but diameter constraints better ensure tight connectivity
This paper introduces the concept of low-diameter f(·)-dense graphs: requiring the graph to be connected, have at least f(n) edges, and have diameter at most 2. The diameter-2 constraint (called 2-club) ensures that the shortest path between any two vertices is at most 2, guaranteeing strong connectivity. The paper aims to design efficient exact algorithms to solve this NP-hard problem.
First systematically defines f(·)-dense graphs and their hereditary properties
Formalizes the maximum low-diameter f(·)-dense subgraph problem (MfDS)
Provides the MIP-fD mixed integer linear programming model
Algorithm Framework:
Proposes a preprocessing framework based on graph decomposition, decomposing the input graph into n smaller subgraphs
Introduces two decomposition strategies: degeneracy ordering and two-hop degeneracy ordering
Designs a branch-and-bound algorithm with novel ordering-based bounds
Provides worst-case complexity analysis for each component and the overall algorithm
Experimental Verification:
Tests two f(·) functions (γ-quasi-cliques and s-defective cliques) on 139 real-world graphs
Demonstrates that the decomposition framework significantly improves performance, solving twice as many instances as pure branch-and-bound within one hour
Shows that ordering-based bounds dramatically reduce search tree size
Key Lemma (Lemma 3): Given any vertex ordering π=⟨v1,...,vn⟩ of graph G, for each vertex vi, define:
Vi = {vi} ∪ {N^(2)_G(vi) ∩ {vi+1,...,vn}}
Gi = GVi
If S_i is the maximum low-diameter f(·)-dense subgraph containing vi in Gi, then:
**ωf(G) = max(|S_1|,...,|S*_n|)**
Algorithm Flow:
1. Obtain initial solution S using heuristic (lower bound)
2. Determine vertex order π using ordering algorithm
3. For each vi in π:
- Construct subgraph Gi = G[Vi]
- Solve using ExactSearch(Gi, f(·), {vi}, N^(2)_G(vi), lb)
4. Return maximum solution from all subproblems
Time Complexity: O(Theur(G) + Torder(G) + poly(αG)2^αG)
BranchBound(G, S, C, lb):
Input: Graph G, current solution S, candidate set C, lower bound lb
1. If G[S] is feasible and |S|>lb: update lb and optimal solution
2. If f(·) is hereditary and |E(S)|<f(|S|): prune and return
3. UB ← SortBound(G, f(·), S, C) // Compute upper bound
4. If UB ≤ lb: prune and return
5. If f(·) is hereditary: apply reduction rules (Lemma 5)
6. Randomly select u∈C
7. If conditions in Lemma 4 are satisfied: recurse only on S∪{u} branch
Otherwise: recurse on both S∪{u} and S branches
Simple bound doesn't account for diameter-2 constraint
Algorithm Flow:
1. Partition C into χ independent sets Π1,...,Πχ using greedy coloring
(minimize χ ≈ graph coloring problem, O(|C|³) greedy)
2. For each Πi:
a. Sort Πi by wS(v)
b. Further partition Πi into multiple Rσ such that any two vertices
in Rσ have distance > 2
c. From each Rσ select vertex with minimum wS(·) and add to C'
d. Compute w'S(uj) = wS(uj) + j - 1
3. Sort C' by w'S(·), find maximum k such that |E(S)| + w'S(Sk) ≤ gf(|S|+k)
4. Return |S|+k
Time Complexity: O(|C|³ + |S||C|)
Correctness (Theorem 3):
Each independent set Πi can contribute at most s'i vertices
Each Rσ can contribute at most 1 vertex due to diameter-2 constraint
Matula & Beck (1983): "Smallest-last ordering and clustering and graph coloring algorithms" - Classical work on degeneracy ordering
Asahiro et al. (2002): "Complexity of finding dense subgraphs" - Complexity results for f(·)-dense graphs
Downey & Fellows (2012): "Parameterized complexity" - W1-hardness theory
Overall Assessment: This is an excellent paper with rigorous theory, innovative algorithms, and comprehensive experiments. The decomposition framework and ordering-based bound methods have high generality and practical value. The main contribution lies in unifying multiple clique relaxation models and providing efficient solution algorithms. Recommended for acceptance with significant contributions to graph algorithms and network analysis.