Robustness and Regularization in Hierarchical Re-Basin
Franke, Heinrich, Lange et al.
This paper takes a closer look at Git Re-Basin, an interesting new approach to merge trained models. We propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. With our new algorithm, we find that Re-Basin induces adversarial and perturbation robustness into the merged models, with the effect becoming stronger the more models participate in the hierarchical merging scheme. However, in our experiments Re-Basin induces a much bigger performance drop than reported by the original authors.
This paper investigates Git Re-Basin, a recent model merging method. The authors propose a hierarchical merging scheme that significantly outperforms the standard MergeMany algorithm. Using the new algorithm, they find that Re-Basin induces adversarial and perturbation robustness in the merged models, and that the effect grows with the number of models participating in the hierarchical merge. However, the performance drop caused by Re-Basin in their experiments is substantially larger than the one reported by the original Git Re-Basin authors.
Core Problem: How to effectively merge multiple trained neural network models while maintaining or improving model performance
Limitations of Existing Methods:
Simple model interpolation leads to severe accuracy degradation, as the mean of two models in parameter space may fall outside the loss basin
The original Git Re-Basin MergeMany algorithm has a theoretical flaw: in each round it aligns one model to the mean of the other n-1 models, and that mean is not guaranteed to lie within the loss basin
Core Contributions:
Proposes a Hierarchical Re-Basin Merging Scheme: Designs a hierarchical model merging algorithm that significantly outperforms the original MergeMany algorithm
Discovers Robustness Enhancement Effect: Demonstrates that Re-Basin induces adversarial robustness and perturbation robustness, with effects strengthening as the number of merged models increases
Reveals Regularization Properties: Uses weight norm and Lipschitz constant analysis to provide evidence that Re-Basin has a regularizing effect
Empirical Results Comparison: Finds that Re-Basin causes a larger performance drop than the original Git Re-Basin authors reported, which supplements the published empirical picture
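The weight norm and Lipschitz quantities mentioned in the contributions can be made concrete with a small sketch. It assumes a plain feed-forward ReLU network stored as a list of weight matrices (nested Python lists, biases ignored) and uses the product of layer spectral norms, a standard but often loose upper bound on the L2 Lipschitz constant. This is not the analytical bound of Avant & Morgansen and not necessarily the exact measure used in the paper; the function names are hypothetical.

```python
import math
import random

def spectral_norm(W, iters=100, seed=0):
    """Largest singular value of W (list of rows) via power iteration on W^T W."""
    rng = random.Random(seed)
    rows, cols = len(W), len(W[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]  # random start vector
    for _ in range(iters):
        u = [sum(w * x for w, x in zip(row, v)) for row in W]            # u = W v
        v = [sum(W[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # v = W^T u
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    u = [sum(w * x for w, x in zip(row, v)) for row in W]
    return math.sqrt(sum(x * x for x in u))

def lipschitz_upper_bound(layers):
    """Product of spectral norms: an upper bound for a ReLU MLP (ReLU is 1-Lipschitz)."""
    bound = 1.0
    for W in layers:
        bound *= spectral_norm(W)
    return bound

def weight_norm(layers):
    """L2 norm of all weights, treated as one flat vector."""
    return math.sqrt(sum(w * w for W in layers for row in W for w in row))

# Toy two-layer network (assumed values, for illustration only).
layers = [[[3.0, 0.0], [0.0, 1.0]], [[2.0, 0.0]]]
bound = lipschitz_upper_bound(layers)   # spectral norms are 3 and 2
total_norm = weight_norm(layers)        # sqrt(9 + 1 + 4)
```

Comparing these two numbers before and after merging is one simple way to check whether a merged model has smaller weights or a smaller Lipschitz bound than its parents.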
Given n trained neural network models Θ₁, Θ₂, ..., Θₙ with identical architectures, the objective is to merge them into a single model with better performance or at least without significant degradation.
Permutation Invariance: Exploits neural network permutation symmetry by reordering one model's neurons to "transport" it into another model's loss basin
Linear Interpolation: After ensuring both models lie in the same loss basin, performs linear interpolation for merging
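The two ideas above can be illustrated on a tiny two-layer ReLU network. This is a minimal sketch under stated assumptions: the weights are toy values chosen by hand, the permutation is known in advance, and all function names are hypothetical. In Git Re-Basin the permutation has to be estimated (for example by matching weights or activations), which is not shown here.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(params, x):
    """Two-layer ReLU MLP: W2 @ relu(W1 @ x + b1) + b2."""
    W1, b1, W2, b2 = params
    h = [max(0.0, a + b) for a, b in zip(matvec(W1, x), b1)]
    return [a + b for a, b in zip(matvec(W2, h), b2)]

def permute_hidden(params, perm):
    """Reorder hidden units: rows of W1 and b1, and the matching columns of W2."""
    W1, b1, W2, b2 = params
    return ([W1[i] for i in perm],
            [b1[i] for i in perm],
            [[row[i] for i in perm] for row in W2],
            list(b2))

def interpolate(pa, pb, lam=0.5):
    """Element-wise (1 - lam) * pa + lam * pb over nested lists."""
    def mix(a, b):
        if isinstance(a, list):
            return [mix(x, y) for x, y in zip(a, b)]
        return (1 - lam) * a + lam * b
    return tuple(mix(a, b) for a, b in zip(pa, pb))

model = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [[1.0, -1.0]], [0.0])
perm = [1, 0]
permuted = permute_hidden(model, perm)        # same function, different weights

inverse = [0] * len(perm)
for new_pos, old_pos in enumerate(perm):
    inverse[old_pos] = new_pos
realigned = permute_hidden(permuted, inverse)  # transported back next to `model`

x = [3.0, 1.0]
out_model = forward(model, x)                             # [2.0]
out_permuted = forward(permuted, x)                       # [2.0]
out_naive = forward(interpolate(model, permuted), x)      # [0.0]
out_aligned = forward(interpolate(model, realigned), x)   # [2.0]
```

Permuting the hidden units leaves the output unchanged, averaging the two unaligned copies collapses the output to zero, and averaging after re-alignment preserves it.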
Ainsworth et al. (2023): The original Git Re-Basin paper, which proposes the foundational model merging method
Entezari et al. (2022): The role of permutation invariance in the linear mode connectivity of neural networks
Frankle et al. (2020): The relationship between linear mode connectivity and the lottery ticket hypothesis
Moosavi-Dezfooli et al. (2016): DeepFool adversarial attack method
Avant & Morgansen (2023): Analytical bounds for Lipschitz constants of ReLU networks
Summary: This paper proposes important improvements to Git Re-Basin. It addresses a theoretical flaw in the original MergeMany algorithm and finds that merging with Re-Basin improves robustness. Despite certain limitations, its rigorous experimental design and honest reporting of results, including a larger performance drop than originally published, make it a valuable contribution to the field.