Efficient & Correct Predictive Equivalence for Decision Trees
Marques-Silva, Ignatiev
The Rashomon set of decision trees (DTs) finds importance uses. Recent work showed that DTs computing the same classification function, i.e. predictive equivalent DTs, can represent a significant fraction of the Rashomon set. Such redundancy is undesirable. For example, feature importance based on the Rashomon set becomes inaccurate due the existence of predictive equivalent DTs, i.e. DTs with the same prediction for every possible input. In recent work, McTavish et al. proposed solutions for several computational problems related with DTs, including that of deciding predictive equivalent DTs. The approach of McTavish et al. consists of applying the well-known method of Quine-McCluskey (QM) for obtaining minimum-size DNF (disjunctive normal form) representations of DTs, which are then used for comparing DTs for predictive equivalence. Furthermore, the minimum-size DNF representation was also applied to computing explanations for the predictions made by DTs, and to finding predictions in the presence of missing data. However, the problem of formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential running time and space. This paper first demonstrates that there exist decision trees that trigger the worst-case exponential running time and space of the QM method. Second, the paper shows that the QM method may incorrectly decide predictive equivalence, if two key constraints are not respected, and one may be difficult to formally guarantee. Third, the paper shows that any of the problems to which the smallest DNF representation has been applied to can be solved in polynomial time, in the size of the DT. The experiments confirm that, for DTs for which the worst-case of the QM method is triggered, the algorithms proposed in this paper are orders of magnitude faster than the ones proposed by McTavish et al.
academic
Efficient & Correct Predictive Equivalence for Decision Trees
The Rashomon set of decision trees has significant practical applications. Recent research demonstrates that decision trees computing the same classification function (i.e., predictively equivalent decision trees) may constitute a substantial portion of the Rashomon set. This redundancy is undesirable; for instance, feature importance measures based on Rashomon sets become inaccurate due to the presence of predictively equivalent decision trees. McTavish et al. recently proposed solutions to decision tree-related computational problems, including determining predictive equivalence. Their approach uses the well-known Quine-McCluskey (QM) method to obtain minimal DNF representations of decision trees for equivalence comparison. However, formula minimization is hard for the second level of the polynomial hierarchy, and the QM method may exhibit worst-case exponential runtime and space complexity. This paper first proves that decision trees exist that trigger the worst-case exponential complexity of the QM method, second demonstrates that the QM method may incorrectly determine predictive equivalence when two critical constraints are not satisfied, and finally proves that all problems employing minimal DNF representations can be solved in polynomial time relative to decision tree size.
The core problem addressed in this paper is the efficiency and correctness of determining predictive equivalence of decision trees. Predictively equivalent decision trees are distinct decision trees that produce identical predictions for any given input.
Rashomon Set Optimization: In machine learning, Rashomon sets contain multiple models with comparable performance. Predictively equivalent decision trees create redundancy in this set, affecting the accuracy of feature importance assessment.
Interpretability Requirements: Decision trees are widely recognized as interpretable models, yet even optimal decision trees require formal explanations, particularly in high-stakes applications.
Computational Efficiency: Existing methods face severe computational bottlenecks when processing large-scale decision trees.
Theoretical Analysis: Proves the existence of decision trees that trigger worst-case exponential complexity of the QM method
Correctness Analysis: Reveals potential incorrectness issues in the QM method for determining predictive equivalence
Efficient Algorithm: Proposes a polynomial-time algorithm for solving completeness, conciseness, and predictive equivalence determination problems
Experimental Validation: On decision trees triggering QM worst-case scenarios, the new algorithm is several orders of magnitude faster than existing methods
Theoretical Connections: Establishes theoretical links between predictive equivalence and logical explanations, importance measures
The paper proposes a polynomial-time algorithm based on WAXp:
Algorithm 1: Path Consistency Check
def ConsistentPath(A, P, T):
# Check consistency between partial assignment A and tree path P
for each feature i:
combine literals from A and P for feature i
if inconsistent: return False
return True
Algorithm 2: WAXp Determination
def IsWAXp(A, c, T):
# Determine if partial assignment A is a WAXp for class c
for each path P in T:
if Class(P) != c and ConsistentPath(A, P, T):
return False # A is consistent with other class paths
return True
Boolean Function Minimization: Quine (1952, 1955), McCluskey (1956), Umans (1998)
Decision Tree Optimization: Bertsimas & Dunn (2017), Hu et al. (2019), Demirovic et al. (2022)
Overall Assessment: This is a high-quality paper combining theory and practice, not only revealing fundamental defects in existing methods but also providing practical solutions. The theoretical analysis is rigorous, experimental validation is comprehensive, and the work makes important contributions to decision tree research and explainable AI.