Robustness and Regularization in Hierarchical Re-Basin

Franke, Heinrich, Lange et al.

This paper takes a closer look at Git Re-Basin, an interesting new approach to merge trained models. We propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. With our new algorithm, we find that Re-Basin induces adversarial and perturbation robustness into the merged models, with the effect becoming stronger the more models participate in the hierarchical merging scheme. However, in our experiments Re-Basin induces a much bigger performance drop than reported by the original authors.

Basic Information

Paper ID: 2510.09174
Title: Robustness and Regularization in Hierarchical Re-Basin
Authors: Benedikt Franke, Florian Heinrich, Markus Lange, Arne Raulf (German Aerospace Center - Institute for AI Safety and Security)
Classification: cs.LG (Machine Learning)
Publication Date: arXiv preprint, October 2025
Paper Link: https://arxiv.org/abs/2510.09174v2

Abstract

This paper investigates Git Re-Basin, a recent model merging method, in depth. The authors propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. Using the new algorithm, they find that Re-Basin induces adversarial and perturbation robustness in the merged models, and that the effect grows with the number of models participating in the hierarchical merge. However, the performance degradation caused by Re-Basin in the authors' experiments is substantially larger than that reported in the original Git Re-Basin paper.

Research Background and Motivation

Problem Definition

Core Problem: How to effectively merge multiple trained neural network models while maintaining or improving model performance
Limitations of Existing Methods:
- Simple model interpolation leads to severe accuracy degradation, as the mean of two models in parameter space may fall outside the loss basin
- The original Git Re-Basin's MergeMany algorithm has a theoretical flaw: in each round, one model is aligned to the mean of the other n-1 models, and that mean is not guaranteed to lie within a loss basin
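
For reference, the MergeMany procedure criticised above can be sketched roughly as follows. This is a minimal sketch under loud assumptions, not the original authors' code: models are toy flat lists of floats whose entries may be freely reordered, `align_to` is a hypothetical stand-in for Re-Basin's permutation search, and a fixed iteration count `n_iters` replaces the original run-until-convergence loop.

```python
import random

def mean_model(models):
    # Element-wise mean of equally sized flat parameter lists
    return [sum(values) / len(models) for values in zip(*models)]

def align_to(reference, model):
    # Toy stand-in for Re-Basin (assumption): reorder the entries of `model`
    # so that its k-th smallest value sits where the k-th smallest value of
    # `reference` is. For scalar entries this minimises the squared distance.
    order = sorted(range(len(reference)), key=lambda k: reference[k])
    aligned = [0.0] * len(model)
    for position, value in zip(order, sorted(model)):
        aligned[position] = value
    return aligned

def merge_many(models, n_iters=5, seed=0):
    rng = random.Random(seed)
    models = [list(m) for m in models]
    for _ in range(n_iters):
        # Visit the models in a random order each round
        for i in rng.sample(range(len(models)), len(models)):
            # The reference is the mean of the other n-1 models; nothing
            # ensures that this mean itself lies in a low-loss region.
            others = models[:i] + models[i + 1:]
            models[i] = align_to(mean_model(others), models[i])
    return mean_model(models), models

inputs = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [2.0, 3.0, 1.0]]
merged, aligned_models = merge_many(inputs)
```

The comment inside the inner loop marks the step the paper objects to: the alignment target is an average of models that have not necessarily been brought into a common basin.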

Research Significance

Permutation Symmetry: Exploiting the permutation invariance of artificial neural networks allows reordering neurons without affecting accuracy
Linear Mode Connectivity (LMC): Closely related to permutation invariance, providing theoretical foundations for model fusion
Practical Applications: Important value in federated learning, multi-task learning, and other scenarios
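
The permutation symmetry mentioned above can be checked directly on a toy network. The sketch below is illustrative only: the two-layer MLP, its sizes, and the helper names (`mlp`, `permute_hidden`) are assumptions made for the example, not the paper's setup.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, b, x):
    # W is a list of rows; computes W @ x + b
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def mlp(params, x):
    W1, b1, W2, b2 = params
    return linear(W2, b2, relu(linear(W1, b1, x)))

def permute_hidden(params, perm):
    # Reorder hidden units: rows of W1, entries of b1, and columns of W2
    W1, b1, W2, b2 = params
    return ([W1[j] for j in perm], [b1[j] for j in perm],
            [[row[j] for j in perm] for row in W2], b2)

rng = random.Random(0)
d_in, d_hidden, d_out = 3, 5, 2
params = (
    [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)],
    [rng.gauss(0, 1) for _ in range(d_hidden)],
    [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)],
    [rng.gauss(0, 1) for _ in range(d_out)],
)
perm = list(range(d_hidden))
rng.shuffle(perm)

x = [rng.gauss(0, 1) for _ in range(d_in)]
y_original = mlp(params, x)
y_permuted = mlp(permute_hidden(params, perm), x)

# Outputs agree up to floating-point summation order
max_diff = max(abs(a - b) for a, b in zip(y_original, y_permuted))
```

The two parameter sets are different points in weight space that compute the same function, which is why averaging unaligned models can land outside any loss basin.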

Core Contributions

Proposes Hierarchical Re-Basin Merging Scheme: Designs a novel hierarchical model merging algorithm that significantly outperforms the original MergeMany algorithm
Discovers Robustness Enhancement Effect: Demonstrates that Re-Basin induces adversarial robustness and perturbation robustness, with effects strengthening as the number of merged models increases
Reveals Regularization Properties: Through weight norm and Lipschitz constant analysis, shows empirically that Re-Basin exhibits regularization effects
Empirical Results Comparison: Finds that Re-Basin causes greater performance degradation compared to original authors' reports, providing important empirical supplements to the field

Methodology Details

Task Definition

Given n trained neural network models Θ₁, Θ₂, ..., Θₙ with identical architectures, the objective is to merge them into a single model with better performance or at least without significant degradation.

Model Architecture

Git Re-Basin Fundamental Principles

Permutation Invariance: Exploits neural network permutation symmetry by reordering one model's neurons to "transport" it into another model's loss basin
Linear Interpolation: After ensuring both models lie in the same loss basin, performs linear interpolation for merging
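
The two steps above (align, then interpolate) can be sketched on a toy one-hidden-layer MLP. This is a simplified illustration under stated assumptions: the permutation is found by exhaustive search over a tiny hidden layer, whereas Git Re-Basin's weight matching solves a linear assignment problem per layer; all helper names are hypothetical.

```python
import itertools
import random

def permute_hidden(params, perm):
    # Hidden unit j of the result is hidden unit perm[j] of the input
    W1, b1, W2, b2 = params
    return ([W1[j] for j in perm], [b1[j] for j in perm],
            [[row[j] for j in perm] for row in W2], b2)

def unit_vectors(params):
    # One vector per hidden unit: incoming weights, bias, outgoing weights
    W1, b1, W2, _ = params
    return [W1[j] + [b1[j]] + [row[j] for row in W2] for j in range(len(b1))]

def match_hidden_units(params_a, params_b):
    # Exhaustive search (toy sizes only) for the ordering of B's hidden
    # units that minimises the squared distance to A's hidden units
    ua, ub = unit_vectors(params_a), unit_vectors(params_b)
    def cost(perm):
        return sum((p - q) ** 2
                   for j, k in enumerate(perm)
                   for p, q in zip(ua[j], ub[k]))
    return min(itertools.permutations(range(len(ua))), key=cost)

def interpolate(a, b, lam=0.5):
    # Recursively computes (1 - lam) * a + lam * b over nested lists
    if isinstance(a, (list, tuple)):
        return [interpolate(x, y, lam) for x, y in zip(a, b)]
    return (1.0 - lam) * a + lam * b

rng = random.Random(1)
d_in, d_hidden, d_out = 3, 4, 2
model_a = (
    [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)],
    [rng.gauss(0, 1) for _ in range(d_hidden)],
    [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)],
    [rng.gauss(0, 1) for _ in range(d_out)],
)
shuffle = [2, 0, 3, 1]
model_b = permute_hidden(model_a, shuffle)  # same function, other neuron order

perm = match_hidden_units(model_a, model_b)
aligned_b = permute_hidden(model_b, perm)
merged = interpolate(model_a, aligned_b, lam=0.5)  # after alignment
naive = interpolate(model_a, model_b, lam=0.5)     # without alignment
```

Because `model_b` is an exact permutation of `model_a` in this constructed case, the aligned merge recovers `model_a`, while the naive average mixes unrelated neurons. Independently trained models only match approximately, so a residual barrier can remain.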

Hierarchical Merging Scheme

Stage 0: Original trained models (2^n models, where n here denotes the number of merging stages)
Stage 1: Pairwise merging → 2^(n-1) merged models
Stage 2: Continue pairwise merging → 2^(n-2) merged models
...
Stage n: Final merged model (1 model)

Algorithm Flow:

Perform n stages of pairwise merging on 2^n input models
In each stage, use merged models from the previous stage as input
Merging process: Apply Re-Basin algorithm to permute the second model into the first model's loss basin, then perform linear interpolation (λ=0.5)
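
The stage structure described above amounts to a tournament-style reduction. The following is a minimal sketch, assuming models are flat lists of floats and that neighbouring models within a stage are paired (the pairing rule is an assumption); `merge_pair` is a hypothetical callback where the Re-Basin alignment plus λ=0.5 interpolation would go, and plain averaging is used here only as a placeholder.

```python
def average(model_a, model_b, lam=0.5):
    # Placeholder merge: plain interpolation, with the alignment step omitted
    return [(1.0 - lam) * a + lam * b for a, b in zip(model_a, model_b)]

def hierarchical_merge(models, merge_pair):
    # Expects 2**n input models and returns the models of every stage
    count = len(models)
    if count == 0 or count & (count - 1):
        raise ValueError("number of models must be a power of two")
    stages = [list(models)]
    while len(stages[-1]) > 1:
        previous = stages[-1]
        # Merge neighbours pairwise; each stage halves the model count
        stages.append([merge_pair(previous[i], previous[i + 1])
                       for i in range(0, len(previous), 2)])
    return stages

models = [[float(i), float(-i)] for i in range(8)]
stages = hierarchical_merge(models, average)
stage_sizes = [len(stage) for stage in stages]
final_model = stages[-1][0]
```

With 8 inputs this gives stages of 8, 4, 2 and 1 models, so every merge involves exactly two models and never the mean of n-1 unaligned ones.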

Technical Innovations

Theoretical Advantages: Avoids the problem in MergeMany algorithm where the mean of n-1 models may not lie within the loss basin
Computational Complexity Trade-off: Although computationally more expensive, guarantees that each merge occurs within a valid loss basin
Progressive Merging: Through hierarchical structure, gradually reduces merging complexity, avoiding difficulties of handling multiple models simultaneously

Experimental Setup

Datasets

CIFAR-10: Standard image classification dataset
Model Count: Trained 1600 multilayer perceptrons (MLPs) as input models

Model Architecture

Network Structure: 4-layer MLP
Hidden Layer Dimension: 512
Latent Layer Dimension: 256
Activation Function: ReLU (except final layer)
Training Strategy: Each model trained with different random seeds

Evaluation Metrics

Accuracy: Test set classification accuracy
Robust Accuracy: Accuracy under adversarial attacks
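Weight Norm: ∑ᵢ₌₀ᴺ (‖Wᵢ‖_F + ‖bᵢ‖₂), i.e. the Frobenius norms of the weight matrices plus the Euclidean norms of the bias vectors, summed over all layers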
Lipschitz Upper Bound: Measures model sensitivity to input perturbations
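
The two norm-based metrics can be illustrated on a toy network. The weight norm follows the formula above. The Lipschitz bound shown is a deliberately simple stand-in (the product of per-layer Frobenius norms, valid for 1-Lipschitz activations such as ReLU); it is an assumption for illustration and not necessarily the bound used in the paper.

```python
import math

def frobenius(W):
    return math.sqrt(sum(w * w for row in W for w in row))

def l2(b):
    return math.sqrt(sum(x * x for x in b))

def weight_norm(layers):
    # Sum over layers of ||W_i||_F + ||b_i||_2
    return sum(frobenius(W) + l2(b) for W, b in layers)

def lipschitz_upper_bound(layers):
    # Product of Frobenius norms. Since the spectral norm of W is at most
    # its Frobenius norm, this is a valid but loose upper bound.
    bound = 1.0
    for W, _ in layers:
        bound *= frobenius(W)
    return bound

layers = [
    ([[3.0, 0.0], [0.0, 4.0]], [0.0, 0.0]),  # 2 -> 2
    ([[6.0, 8.0]], [3.0]),                   # 2 -> 1
]
total_norm = weight_norm(layers)     # 5 + 0 + 10 + 3
lip = lipschitz_upper_bound(layers)  # 5 * 10
```

Averaging aligned weights tends to shrink both quantities whenever the averaged weights partially cancel, which is consistent with the regularization effect the summary reports.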

Comparison Methods

MergeMany Algorithm: Original Git Re-Basin's multi-model merging method
L1/L2 Regularized Models: Robustness comparison baseline
Unmerged Models: Performance baseline

Implementation Details

PyTorch-based Re-Basin open-source implementation
Adversarial Attacks: DeepFool and FGSM
ε Parameter Range: 0.000-0.020
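
As a reminder of what the ε parameter controls, FGSM perturbs each input coordinate by ε in the direction of the sign of the loss gradient. The sketch below uses a toy linear softmax classifier so that the input gradient can be written in closed form; the weights, input, and function names are made up for the example, and clipping to a valid pixel range is omitted.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def cross_entropy(W, b, x, label):
    return -math.log(softmax(logits(W, b, x))[label])

def input_gradient(W, b, x, label):
    # d(cross-entropy)/dx = W^T (softmax(Wx + b) - onehot(label))
    p = softmax(logits(W, b, x))
    p[label] -= 1.0
    return [sum(W[k][i] * p[k] for k in range(len(W))) for i in range(len(x))]

def fgsm(W, b, x, label, eps):
    # x_adv = x + eps * sign(gradient of the loss with respect to x)
    grad = input_gradient(W, b, x, label)
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

W = [[1.0, -2.0], [-1.0, 0.5], [0.3, 0.8]]
b = [0.0, 0.1, -0.1]
x, label = [0.5, -0.5], 0

x_adv = fgsm(W, b, x, label, eps=0.02)
loss_clean = cross_entropy(W, b, x, label)
loss_adv = cross_entropy(W, b, x_adv, label)
```

Robust accuracy at a given ε is then the fraction of test inputs still classified correctly after such a perturbation.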

Experimental Results

Main Results

Merging Performance Comparison

4-Model Merging: Hierarchical scheme significantly outperforms MergeMany algorithm
8-Model Merging: Advantages become more pronounced, MergeMany algorithm accuracy severely degraded
Variance Analysis: Hierarchical scheme shows smaller result variance and more stable performance

Robustness Analysis

Adversarial Robustness:
- Around ε≈0.01, all Re-Basin stages match unmerged models
- Lower stages (fewer Re-Basin operations) perform better under weak attacks
- Higher stages (more Re-Basin operations) more robust to strong attacks
- L2 regularization performs best across most ε ranges
Weight Regularization Effect:
- Cumulative weight norm decreases linearly with Re-Basin stages
- Variance also decreases with stages
- Indicates Re-Basin exhibits weight regularization-like effects
Lipschitz Constant Analysis:
- Lipschitz upper bound decreases with Re-Basin stages
- Indicates enhanced perturbation resistance
- Variance similarly decreases, model behavior more consistent

Ablation Studies

Permutation Selection: Preliminary experiments show that the choice of which of the two models is permuted has no statistically significant impact
Interpolation Parameter: Uses λ=0.5 for linear interpolation

Experimental Findings

Regularization Mechanism: Re-Basin produces noise-like regularization effects through weight interpolation
Increasing Robustness: Merging more models brings stronger robustness but accompanies accuracy degradation
Theory-Practice Discrepancy: Unable to reproduce the zero-barrier linear mode connectivity (no accuracy drop along the interpolation path) reported in the original paper
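
For orientation, the barrier between two models is commonly defined as the largest gap between the loss on the interpolation path and the interpolated endpoint losses. The formula below is this standard definition from the linear mode connectivity literature, stated as an assumption; the summary does not say whether the paper measures the barrier in loss or in accuracy.

```latex
B(\Theta_A, \Theta_B) = \max_{\lambda \in [0,1]}
  \Big[ \mathcal{L}\big((1-\lambda)\,\Theta_A + \lambda\,\Theta_B\big)
      - \big((1-\lambda)\,\mathcal{L}(\Theta_A) + \lambda\,\mathcal{L}(\Theta_B)\big) \Big]
```

A zero barrier means the merged model at any λ, including λ=0.5, is no worse than the interpolated endpoint losses; the discrepancy noted above is that the merged MLPs did lose accuracy.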

Related Work

Linear Mode Connectivity (LMC)

Origins: Initially studied in the context of the lottery ticket hypothesis, asking whether SGD solutions are linearly connected
Extended Applications: Multi-task learning, federated learning, and other domains
Theoretical Development: Extended from network-level connectivity to layer-wise linear feature connectivity

Model Permutation

Theoretical Foundation: Relationship between permutation invariance and LMC
Practical Applications: Weight matching averaging in federated learning
Security Research: Permutation invariance in the context of adversarial attacks

Model Fusion

Mathematical Framework: Model fusion based on Wasserstein barycenter
Language Models: Mode connectivity research in pretrained language models

Conclusions and Discussion

Main Conclusions

Hierarchical Scheme Superiority: Proposed hierarchical Re-Basin significantly outperforms MergeMany algorithm
Robustness Induction: Re-Basin induces adversarial and perturbation robustness, strengthening with increased merged models
Regularization Properties: Re-Basin exhibits weight regularization effects, reducing model complexity
Empirical Discrepancy: Found performance degradation larger than originally reported

Limitations

Computational Overhead: Hierarchical scheme has higher computational cost than MergeMany algorithm
Accuracy Degradation: Despite improvements over MergeMany, still exhibits accuracy loss
Reproducibility Issues: Unable to reproduce the zero-barrier linear mode connectivity reported in the original paper
Experimental Scope: Validation only on CIFAR-10 and MLPs, lacking broader experiments

Future Directions

Theoretical Analysis: Deeper understanding of mechanisms behind Re-Basin-induced robustness
Algorithm Optimization: Seeking more computationally efficient merging strategies
Application Extension: Verification on more datasets and architectures
Reproducibility: Further investigation of discrepancies with original results

In-Depth Evaluation

Strengths

Deep Theoretical Insights: Accurately identifies theoretical flaws in MergeMany algorithm
Rigorous Experimental Design: Trains 1600 input models, which gives the statistical analysis a large sample and makes the results more credible
Multi-dimensional Analysis: Evaluates methods from accuracy, robustness, regularization perspectives
Honest Reporting: Objectively reports experimental results inconsistent with original authors
Method Innovation: Hierarchical merging scheme design is reasonable with clear theoretical motivation

Weaknesses

Limited Experimental Scope: Validation only on single dataset (CIFAR-10) and simple architecture (MLP)
Insufficient Theoretical Explanation: Lacks deep theoretical analysis of robustness induction mechanisms
Reproducibility Issues: Fails to explain fundamental causes of discrepancies with original work
Computational Efficiency: Insufficient analysis of hierarchical scheme computational overhead
Hyperparameter Sensitivity: Lacks sensitivity analysis of key hyperparameters (e.g., λ value)

Impact

Academic Value: Provides important empirical supplements and theoretical improvements to Git Re-Basin research
Practical Value: Hierarchical merging scheme directly applicable to real model fusion tasks
Safety Significance: Discovered robustness properties important for AI safety research
Methodological Contribution: Provides more comprehensive analysis framework for model merging evaluation

Applicable Scenarios

Federated Learning: Multi-client model aggregation
Model Ensemble: Improving single model performance and robustness
Knowledge Distillation: Preprocessing step for multi-teacher model fusion
Safety-Critical Applications: Systems requiring adversarial robustness

References

Key References

Ainsworth et al. (2023): Original Git Re-Basin paper proposing the foundational model merging method
Entezari et al. (2022): Role of permutation invariance in neural network linear mode connectivity
Frankle et al. (2020): Relationship between linear mode connectivity and lottery ticket hypothesis
Moosavi-Dezfooli et al. (2016): DeepFool adversarial attack method
Avant & Morgansen (2023): Analytical bounds for Lipschitz constants of ReLU networks

Summary: This paper proposes important improvements upon Git Re-Basin, not only addressing theoretical flaws of the original algorithm but also discovering robustness enhancement effects in model merging. Despite certain limitations, its rigorous experimental design and honest result reporting provide valuable contributions to the field's development.