2025-11-12T13:34:14.831387

Efficient & Correct Predictive Equivalence for Decision Trees

Marques-Silva, Ignatiev
The Rashomon set of decision trees (DTs) finds importance uses. Recent work showed that DTs computing the same classification function, i.e. predictive equivalent DTs, can represent a significant fraction of the Rashomon set. Such redundancy is undesirable. For example, feature importance based on the Rashomon set becomes inaccurate due the existence of predictive equivalent DTs, i.e. DTs with the same prediction for every possible input. In recent work, McTavish et al. proposed solutions for several computational problems related with DTs, including that of deciding predictive equivalent DTs. The approach of McTavish et al. consists of applying the well-known method of Quine-McCluskey (QM) for obtaining minimum-size DNF (disjunctive normal form) representations of DTs, which are then used for comparing DTs for predictive equivalence. Furthermore, the minimum-size DNF representation was also applied to computing explanations for the predictions made by DTs, and to finding predictions in the presence of missing data. However, the problem of formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential running time and space. This paper first demonstrates that there exist decision trees that trigger the worst-case exponential running time and space of the QM method. Second, the paper shows that the QM method may incorrectly decide predictive equivalence, if two key constraints are not respected, and one may be difficult to formally guarantee. Third, the paper shows that any of the problems to which the smallest DNF representation has been applied to can be solved in polynomial time, in the size of the DT. The experiments confirm that, for DTs for which the worst-case of the QM method is triggered, the algorithms proposed in this paper are orders of magnitude faster than the ones proposed by McTavish et al.
academic

Efficient & Correct Predictive Equivalence for Decision Trees

Basic Information

  • Paper ID: 2509.17774
  • Title: Efficient & Correct Predictive Equivalence for Decision Trees
  • Authors: João Marques-Silva (ICREA & University of Lleida), Alexey Ignatiev (Monash University)
  • Classification: cs.AI cs.LG cs.LO
  • Publication Date/Venue: Journal of Machine Learning Research 23 (2025) 1-35
  • Paper Link: https://arxiv.org/abs/2509.17774

Abstract

The Rashomon set of decision trees has significant practical applications. Recent research demonstrates that decision trees computing the same classification function (i.e., predictively equivalent decision trees) may constitute a substantial portion of the Rashomon set. This redundancy is undesirable; for instance, feature importance measures based on Rashomon sets become inaccurate due to the presence of predictively equivalent decision trees. McTavish et al. recently proposed solutions to decision tree-related computational problems, including determining predictive equivalence. Their approach uses the well-known Quine-McCluskey (QM) method to obtain minimal DNF representations of decision trees for equivalence comparison. However, formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential runtime and space complexity. This paper first proves that decision trees exist that trigger the worst-case exponential complexity of the QM method, second demonstrates that the QM method may incorrectly determine predictive equivalence when two critical constraints are not satisfied, and finally proves that all problems employing minimal DNF representations can be solved in polynomial time relative to decision tree size.

Research Background and Motivation

Problem Definition

The core problem addressed in this paper is the efficiency and correctness of determining predictive equivalence of decision trees. Predictively equivalent decision trees are distinct decision trees that produce identical predictions for any given input.

Problem Significance

  1. Rashomon Set Optimization: In machine learning, Rashomon sets contain multiple models with comparable performance. Predictively equivalent decision trees create redundancy in this set, affecting the accuracy of feature importance assessment.
  2. Interpretability Requirements: Decision trees are widely recognized as interpretable models, yet even optimal decision trees require formal explanations, particularly in high-stakes applications.
  3. Computational Efficiency: Existing methods face severe computational bottlenecks when processing large-scale decision trees.

Limitations of Existing Methods

The method proposed by McTavish et al., based on the Quine-McCluskey (QM) algorithm, has the following issues:

  1. Computational Complexity: The QM method solves Σₚ²-hard problems, requiring exponential time and space in the worst case
  2. Correctness Issues: May produce incorrect results when specific constraints are not satisfied
  3. Practical Feasibility: Known to scale poorly for problems with dozens of variables

Core Contributions

  1. Theoretical Analysis: Proves the existence of decision trees that trigger worst-case exponential complexity of the QM method
  2. Correctness Analysis: Reveals potential incorrectness issues in the QM method for determining predictive equivalence
  3. Efficient Algorithm: Proposes a polynomial-time algorithm for solving completeness, conciseness, and predictive equivalence determination problems
  4. Experimental Validation: On decision trees triggering QM worst-case scenarios, the new algorithm is several orders of magnitude faster than existing methods
  5. Theoretical Connections: Establishes theoretical links between predictive equivalence and logical explanations, importance measures

Methodology Details

Task Definition

Given two decision trees T₁ and T₂, determine whether they are predictively equivalent, i.e.:

∀(x ∈ F). (κₜ₁(x) = κₜ₂(x))

where F is the feature space and κ is the classification function.

Core Technical Framework

1. Weak Abductive Explanation (WAXp) Method

The paper proposes a polynomial-time algorithm based on WAXp:

Algorithm 1: Path Consistency Check

def ConsistentPath(A, P, T):
    # Check consistency between partial assignment A and tree path P
    for each feature i:
        combine literals from A and P for feature i
        if inconsistent: return False
    return True

Algorithm 2: WAXp Determination

def IsWAXp(A, c, T):
    # Determine if partial assignment A is a WAXp for class c
    for each path P in T:
        if Class(P) != c and ConsistentPath(A, P, T):
            return False  # A is consistent with other class paths
    return True

2. Predictive Equivalence Determination Algorithm

Algorithm 5: Predictive Equivalence

def PredictivelyEquivalent(T1, T2):
    for P1 in Paths(T1):
        c1 = Class(P1)
        A1 = Literals(P1)  # Create partial assignment
        for P2 in Paths(T2):
            c2 = Class(P2)
            if c1 != c2 and ConsistentPath(A1, P2, T2):
                return False  # Found evidence of non-equivalence
    return True  # Cannot prove non-equivalence, therefore equivalent

Technical Innovations

  1. Avoiding Exponential Complexity: Operates directly on decision tree structure, avoiding generation of potentially exponentially-sized BCF representations
  2. Polynomial Time Guarantee: All algorithms have time complexity polynomial in decision tree size
  3. Formal Correctness: Provides rigorous mathematical proofs ensuring algorithm correctness
  4. Parallelizability: Predictive equivalence algorithm can be parallelized for further efficiency gains

Experimental Setup

Constructed Test Cases

The paper uses special decision tree constructions based on Theorem 1:

  • Parameter r: Controls tree complexity
  • Node Count: 6r + 3 nodes
  • Feature Count: 2r + 1 features
  • BCF Size: For class 1, lower bound of 2^r prime implicants

Evaluation Metrics

  1. Runtime: Algorithm execution time (seconds)
  2. BCF Size: Number of prime implicants in Blake Canonical Form
  3. Scalability: Capability to handle decision trees of varying sizes

Comparison Methods

  • SymPy's QM Implementation: Baseline method used by McTavish et al.
  • Independent BCF Generation: Authors' implementation of standard QM prime implicant generation steps

Implementation Details

  • Platform: Macbook M3 Pro processor
  • Programming Language: Python
  • Timeout Setting: QM method set to 150000 seconds timeout limit

Experimental Results

Main Results

Verification of QM Method's Exponential Complexity

rSymPy Time(s)|BCF₀(T)||BCF₁(T)|BCF Time(s)
30.134220.01
40.575460.07
539.606940.84
62789.45719011.28
7>150000.008382161.25

Scalability of New Algorithm

rDT NodesFeatures|BCF₁(T)|One AXpisWAXp?PE?
20012034012²⁰⁰1.71s0.005s3.7s
500300310012⁵⁰⁰26.98s0.032s57.1s
1000600320012¹⁰⁰⁰224.62s0.126s469.0s

Key Findings

  1. Exponential Growth Confirmed: Size of BCF₁(T) grows exponentially with r, validating theoretical analysis
  2. Massive Performance Gap: For r=200, the new algorithm processes a 1203-node decision tree in seconds, while the QM method times out on 57 nodes
  3. Practical Validation: New algorithm can handle large-scale decision trees likely to appear in real applications

Rashomon Set Research

  • Foundational Concepts: Breiman (2001) first introduced the Rashomon set concept
  • Recent Developments: Work by Fisher et al., Semenova et al. on feature importance
  • Predictive Equivalence: McTavish et al. first systematically studied predictive equivalence of decision trees

Logical Foundations of Explainable AI

  • Formal Explanations: Work by Marques-Silva et al. on AXp and CXp
  • Computational Complexity: Multiple studies establishing complexity of explanation computation
  • Interpretability Measures: Application of Shapley values and Banzhaf values in machine learning

Boolean Formula Minimization

  • Classical Methods: Historical development of Quine-McCluskey algorithm
  • Complexity Theory: Establishment of Σₚ²-hard complexity
  • Practical Limitations: Known that QM method efficiency drops sharply for more than 8 variables

Conclusions and Discussion

Main Conclusions

  1. Theoretical Contribution: Proves that the QM method indeed encounters exponential complexity on decision trees
  2. Algorithmic Contribution: Provides polynomial-time alternative algorithms
  3. Practical Value: New algorithm demonstrates significant advantages in practical applications
  4. Theoretical Connections: Establishes links between predictive equivalence and multiple XAI concepts

Limitations

  1. Python Implementation: Experiments using Python may affect absolute performance evaluation values
  2. Special Constructions: Experiments primarily focus on specially constructed decision trees
  3. Parallelization: Parallelization potential of predictive equivalence algorithm not validated on large-scale clusters
  4. Generality: Requires validation on more real-world datasets

Future Directions

  1. Asymptotically Optimal Algorithms: Search for theoretically optimal algorithms
  2. Other Model Types: Extension of methods to other interpretable models
  3. Practical Applications: Application in real Rashomon set optimization
  4. Parallel Implementation: Development of large-scale parallel implementations

In-Depth Evaluation

Strengths

  1. Theoretical Rigor: Provides complete mathematical proofs and complexity analysis
  2. High Practical Value: Solves fundamental performance issues of existing methods
  3. Strong Innovation: First systematic analysis of QM method issues on decision trees
  4. Comprehensive Experiments: Includes both theoretical construction validation and practical-scale testing
  5. Clear Writing: Well-structured paper with clear technical exposition

Weaknesses

  1. Limited Experimental Scope: Validation primarily on constructed test cases, lacking results on real datasets
  2. Implementation Language: Python may not be optimal choice, affecting persuasiveness of performance comparisons
  3. Application Validation: Lacks validation in actual Rashomon set optimization tasks
  4. QM Constraint Analysis: Insufficient analysis of practical reachability of QM correctness constraints

Impact

  1. Academic Value: Provides new theoretical tools for decision tree research
  2. Practical Significance: May change practical methods for Rashomon set analysis
  3. Reproducibility: Clear algorithm descriptions facilitate reproduction
  4. Extensibility: Methods may apply to other interpretable models

Applicable Scenarios

  1. High-Stakes Applications: Medical, financial, and other domains requiring explainable AI
  2. Model Selection: Scenarios requiring selection from multiple equivalent models
  3. Feature Importance Analysis: Applications requiring accurate feature importance assessment
  4. Large-Scale Decision Trees: Industrial applications handling complex decision trees

References

This paper cites extensive related work, primarily including:

  1. Rashomon Sets: Breiman (2001), Xin et al. (2022), Fisher et al. (2019)
  2. Logical Explainable AI: Marques-Silva (2022), Darwiche (2023), Ignatiev et al. (2019)
  3. Boolean Function Minimization: Quine (1952, 1955), McCluskey (1956), Umans (1998)
  4. Decision Tree Optimization: Bertsimas & Dunn (2017), Hu et al. (2019), Demirovic et al. (2022)

Overall Assessment: This is a high-quality paper combining theory and practice, not only revealing fundamental defects in existing methods but also providing practical solutions. The theoretical analysis is rigorous, experimental validation is comprehensive, and the work makes important contributions to decision tree research and explainable AI.