
Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization

Gao, Liu, Liu et al.

Exploring effective and transferable adversarial examples is vital for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated from surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbation inputs or applying uniform gradient regularization within surrogate models, yet they have not fully leveraged the shared and unique features of surrogate models trained on the same task, leading to suboptimal transfer performance. Therefore, enhancing perturbations of common information shared by surrogate models and suppressing those tied to individual characteristics offers an effective way to improve transferability. Accordingly, we propose a commonality-oriented gradient optimization strategy (COGO) consisting of two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs the mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to evaluate the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rates of adversarial attacks, outperforming current state-of-the-art methods.


Basic Information

Paper ID: 2506.06992
Title: Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization
Authors: Yanting Gao, Yepeng Liu, Junming Liu, Qi Zhang, Hongyun Zhang, Duoqian Miao, Cairong Zhao
Affiliated Institutions: Tongji University, University of Florida
Classification: cs.CV (Computer Vision)
Publication Date: October 12, 2025 (arXiv preprint v2)
Paper Link: https://arxiv.org/abs/2506.06992

Abstract

Exploring effective and transferable adversarial examples is crucial for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated by surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbed inputs or applying uniform gradient regularization within surrogate models, but fail to adequately exploit shared and unique features of surrogate models trained on the same task, resulting in suboptimal transfer performance. Therefore, enhancing perturbations that capture shared information among surrogate models while suppressing perturbations related to individual characteristics provides an effective pathway to improve transferability. Accordingly, we propose a Commonality-Oriented Gradient Optimization strategy (COGO), comprising two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to assess the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rate of adversarial attacks, outperforming current state-of-the-art methods.

Research Background and Motivation

1. Research Problem

This paper addresses the transferability problem in adversarial attacks against Vision Transformers (ViTs). Adversarial examples crafted on a surrogate model often fail to fool an unknown target model, so the black-box attack fails.

2. Problem Significance

Safety-Critical Applications: The reliability of ViTs in safety-critical applications is severely threatened by adversarial attacks
Black-Box Attack Realism: In practical scenarios, attackers typically cannot access the internal structure of target models, making transferability critical
Model Robustness Evaluation: Understanding the transferability of adversarial examples helps evaluate and improve model robustness

3. Limitations of Existing Methods

Overfitting: Adversarial examples generated by existing methods contain excessive surrogate model-specific information, resulting in poor generalization
Uniform Treatment: Methods such as TGR and GNS-HFA only adjust gradients based on statistical properties, without considering the correlation between gradients and model-specific features
Improper Frequency Domain Utilization: Methods like HFA focus only on high-frequency components, overlooking the fact that ViTs rely more on mid-to-low frequency information

4. Research Motivation

The authors observe that different ViTs trained on the same dataset, despite architectural differences, exhibit commonality in decision patterns, particularly in their reliance on mid-to-low frequency information. Therefore, by enhancing common features and suppressing individual characteristics, more transferable adversarial examples can be generated.

Core Contributions

Proposes Commonality-Oriented Optimization Strategy: First to consider the relationship between gradients and model features, transcending traditional uniform gradient adjustment methods
Designs COGO Framework: Combines Commonality Enhancement (CE) and Individuality Suppression (IS) components, utilizing frequency domain energy enhancement and adaptive threshold mechanisms
Significant Performance Improvement: Substantially outperforms existing state-of-the-art methods across multiple benchmarks, including GNS-HFA and ATT
Comprehensive Experimental Validation: Achieves excellent performance in both ViT-to-ViT transfer and cross-architecture ViT-to-CNN transfer

Methodology Details

Task Definition

Given a clean input image $X_{clean} \in \mathbb{R}^N$, the objective is to generate an adversarial perturbation $\delta$ such that $X_{adv} = X_{clean} + \delta$ fools the surrogate model and also transfers well to unknown black-box target models.
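
For reference, this goal is commonly written as a constrained maximization. The formulation below is the standard untargeted transfer-attack objective, not a formula quoted from the paper; the surrogate classifier $f$, the loss $J$ (typically cross-entropy), the true label $y$, and the $\ell_\infty$ budget $\epsilon$ are assumed notation:

$$
\max_{\delta} \; J\big(f(X_{clean} + \delta),\, y\big) \quad \text{s.t.} \quad \|\delta\|_\infty \le \epsilon
$$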

Model Architecture

The COGO strategy comprises two core components:

1. Commonality Enhancement (CE)

The CE module enhances mid-to-low frequency components during forward propagation:

Step 1: Add current perturbation and Gaussian noise

X = X_clean + δ
X_DCT = DCT(X + ε), where ε ~ N(0, I_N)

Step 2: Compute energy distribution and enhance

E(X_DCT) = Normalize(|X_DCT|)
X'_DCT = X_DCT · (1 + γ · E(X_DCT))

Step 3: Apply the mask M and transform back to the spatial domain

X_IDCT = IDCT(X'_DCT · M)

where γ controls enhancement strength and M is a spatial mask inherited from HFA.
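
The three steps can be sketched in a few lines of Python. This is a minimal 1-D illustration under stated assumptions, not the authors' implementation: the paper operates on images, Normalize is assumed here to divide by the sum of absolute coefficients, the mask M defaults to all ones, and the names commonality_enhance and noise_std are hypothetical.

```python
import math
import random


def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def commonality_enhance(x_clean, delta, gamma=1.0, mask=None, noise_std=1.0, rng=None):
    """1-D sketch of the CE forward transform (Steps 1 to 3)."""
    rng = rng or random.Random(0)
    n = len(x_clean)
    mask = mask or [1.0] * n  # assumption: all-ones mask unless one is supplied

    # Step 1: add the current perturbation and Gaussian noise, then take the DCT.
    x = [a + b for a, b in zip(x_clean, delta)]
    x_dct = dct([v + rng.gauss(0.0, noise_std) for v in x])

    # Step 2: normalized energy (assumed: |coefficient| / sum of |coefficients|),
    # then scale each coefficient by (1 + gamma * energy).
    total = sum(abs(v) for v in x_dct) or 1.0
    energy = [abs(v) / total for v in x_dct]
    x_dct_enhanced = [v * (1.0 + gamma * e) for v, e in zip(x_dct, energy)]

    # Step 3: apply the mask element-wise and transform back.
    return idct([v * m for v, m in zip(x_dct_enhanced, mask)])
```

With gamma = 0 and no noise the function returns X_clean + δ unchanged, which is a convenient sanity check. Because the weighting follows coefficient energy, the coefficients that already carry the most energy are amplified the most.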

2. Individuality Suppression (IS)

The IS module suppresses surrogate model-specific gradients during backpropagation:

Suppression of Redundant Features:

Quantify inter-channel redundancy using Mutual Information (MI) and Pearson Correlation Coefficient (PC)
Adaptive threshold: $\tau_{MI} = \beta_{MI} \cdot \text{mean}(MI(G_i^{(l)}, G_j^{(l)}))$
Weight computation: $w_i = \max(0.1, 1 - \alpha \sum_{(i,j) \in P} (t_{i,j}^{MI} + t_{i,j}^{corr}))$
Gradient adjustment: $\tilde{G}_i^{(l)} = G_i^{(l)} \cdot w_i$
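
A rough sketch of the weight computation follows. Several details are assumptions made for illustration because the summary does not define them: the indicators $t_{i,j}^{MI}$ and $t_{i,j}^{corr}$ are taken to be 1 when a pair's statistic exceeds its threshold and 0 otherwise, the correlation threshold is taken as $\beta_{corr}$ times the mean absolute correlation, both channels of a flagged pair are penalized, and mutual information is passed in precomputed. The names pearson and channel_weights are hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # convention chosen here for constant channels
    return cov / math.sqrt(var_u * var_v)


def channel_weights(num_channels, mi, corr, alpha=0.1,
                    beta_mi=0.5, beta_corr=0.7, floor=0.1):
    """Per-channel weights w_i = max(floor, 1 - alpha * number of flags).

    mi and corr map channel pairs (i, j) to precomputed statistics.
    """
    # Adaptive thresholds: a fraction of the mean statistic over all pairs.
    tau_mi = beta_mi * sum(mi.values()) / len(mi)
    tau_corr = beta_corr * sum(abs(c) for c in corr.values()) / len(corr)

    flags = [0] * num_channels
    for (i, j), value in mi.items():
        if value > tau_mi:  # assumed indicator t_ij^MI
            flags[i] += 1
            flags[j] += 1
    for (i, j), value in corr.items():
        if abs(value) > tau_corr:  # assumed indicator t_ij^corr
            flags[i] += 1
            flags[j] += 1

    return [max(floor, 1.0 - alpha * f) for f in flags]


# Usage: channels 0 and 1 are strongly redundant, channel 2 is not.
mi_stats = {(0, 1): 1.0, (0, 2): 0.1, (1, 2): 0.1}
corr_stats = {(0, 1): 0.9, (0, 2): 0.1, (1, 2): -0.2}
weights = channel_weights(3, mi_stats, corr_stats)
```

Each channel's gradient would then be multiplied by its weight, so redundant channels are damped while the floor of 0.1 keeps any channel from being silenced entirely.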

Suppression of Auxiliary Knowledge:

For auxiliary tokens such as distillation tokens in data-efficient ViTs
Scaling factor: $c = \sigma(\frac{\|G_{additional}^{(l)}\|_2}{\|G_{primary}^{(l)}\|_2})$
Gradient adjustment: $\tilde{G}_{additional}^{(l)} = c \cdot G_{additional}^{(l)}$
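
The scaling rule is small enough to show directly. The sketch assumes that $\sigma$ denotes the logistic sigmoid, which the summary does not state, and treats gradients as flat lists of numbers; the function names are hypothetical.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat sequence of numbers."""
    return math.sqrt(sum(v * v for v in values))


def scale_auxiliary_gradient(g_additional, g_primary):
    """Return (c, scaled gradient) with c = sigmoid(||g_additional|| / ||g_primary||)."""
    primary_norm = l2_norm(g_primary)
    if primary_norm == 0:
        raise ValueError("primary gradient has zero norm")
    ratio = l2_norm(g_additional) / primary_norm
    c = 1.0 / (1.0 + math.exp(-ratio))  # assumption: sigma is the logistic sigmoid
    return c, [c * g for g in g_additional]


# Usage: both gradients have norm 5, so the ratio is 1.
c, scaled = scale_auxiliary_gradient([3.0, 4.0], [0.0, 5.0])
```

Under the sigmoid assumption the ratio is non-negative, so c always lies in [0.5, 1): the auxiliary-token gradient is damped but never removed.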

Technical Innovations

Frequency Domain Commonality Exploitation: Unlike HFA which focuses only on high frequencies, CE specifically enhances mid-to-low frequency components that ViTs rely on
Adaptive Gradient Suppression: IS uses adaptive thresholds rather than fixed thresholds, better identifying and suppressing model-specific gradients
Dual Optimization Strategy: CE and IS cooperatively optimize from forward and backward directions, forming complementary effects

Experimental Setup

Datasets

ILSVRC 2012 Validation Set: Randomly sampled 1000 images, the standard setting for transfer attack research
Follows experimental protocols from prior work such as TGR

Evaluation Metrics

Attack Success Rate (ASR): $\text{ASR} = \frac{\text{Number of Successful Attacks}}{\text{Total Number of Attacks}} \times 100\%$
Measures the proportion of adversarial examples that cause target model misclassification
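
As a concrete reading of the metric, the sketch below counts an untargeted attack as successful when the target model's prediction on the adversarial example differs from the true label. The function name and the list-based inputs are illustrative choices, not part of the paper.

```python
def attack_success_rate(true_labels, adv_predictions):
    """ASR in percent: share of adversarial examples that are misclassified."""
    if len(true_labels) != len(adv_predictions):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("at least one attack is required")
    successes = sum(1 for y, p in zip(true_labels, adv_predictions) if p != y)
    return 100.0 * successes / len(true_labels)


# Usage: two of four adversarial examples change the prediction.
asr = attack_success_rate([0, 1, 2, 3], [0, 2, 2, 1])
```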

Comparison Methods

Primary Baselines: TGR (specifically designed for ViTs)
Recent Methods: GNS-HFA, ATT
Classical Methods: MIM, SINI-FGSM, PNA, SSA

Experimental Models

Surrogate Models: Visformer-S, DeiT-B, CaiT-S/24, ViT-B/16
ViT Target Models: TNT-S, ConViT-B, etc.
CNN Target Models: Inception-v3, Inception-v4, Inception-ResNet-v2, ResNet-101
Defense Models: Adversarially trained ensemble models

Implementation Details

Attack iterations: 10
Maximum $\ell_\infty$ perturbation: $\epsilon = 8$ (0-255 scale)
Key hyperparameters: $\gamma = 1$, $\alpha = 0.1$, $\beta_{MI} = 0.5$, $\beta_{corr} = 0.7$

Experimental Results

Main Results

ViT-to-ViT Transfer Performance:

Average improvement of 7.2% over GNS-HFA
Average improvement of 10.1% over ATT
Achieves best performance across all tested ViT architectures

Cross-Architecture Transfer Performance (ViT → CNN):

Average improvement of 2.3% over GNS-HFA
Average improvement of 10.5% over ATT
Maintains good attack effectiveness against defense models

Specific Numerical Examples (attack success rates with Visformer-S as surrogate model):

Method	ViT-B/16	DeiT-B	TNT-S	Inc-v3	Inc-v4
GNS-HFA	49.1%	54.1%	81.3%	71.6%	71.3%
COGO	55.2%	64.9%	85.5%	71.8%	72.4%

Ablation Study

Contribution of CE and IS Components:

CE	IS	ViTs	CNNs	CNNs-adv
-	-	46.64%	30.45%	9.80%
✓	-	72.56% (+25.92%)	56.18% (+25.73%)	32.15% (+22.35%)
-	✓	62.38% (+15.74%)	45.85% (+15.40%)	22.77% (+12.97%)
✓	✓	77.97% (+31.33%)	63.73% (+33.28%)	36.75% (+26.95%)
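
The figures in parentheses are differences in percentage points from the first row (neither component enabled). A short check of that arithmetic, using only the values from the table; the row labels are chosen here for readability:

```python
# Average ASR (%) per target group: (ViTs, CNNs, CNNs-adv), copied from the table.
ablation = {
    "baseline": (46.64, 30.45, 9.80),
    "CE only": (72.56, 56.18, 32.15),
    "IS only": (62.38, 45.85, 22.77),
    "CE + IS": (77.97, 63.73, 36.75),
}


def gains_over_baseline(rows, base="baseline"):
    """Percentage-point gain of every non-baseline row over the baseline row."""
    reference = rows[base]
    return {
        name: tuple(round(value - ref, 2) for value, ref in zip(values, reference))
        for name, values in rows.items()
        if name != base
    }


gains = gains_over_baseline(ablation)
for name, values in gains.items():
    print(name, values)
```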

Key Findings:

CE contributes more than IS when used alone (+25.92 vs. +15.74 points on ViTs), demonstrating the importance of frequency domain enhancement
IS complements CE: the combination is best in every column, although its gain is smaller than the sum of the two individual gains
Significant improvements across all model types

Hyperparameter Sensitivity:

Enhancement coefficient γ = 1 yields best results
Iteration count N = 10 achieves performance balance
Channel pair quantity has minimal impact on results, demonstrating method robustness

Gradient Analysis

Analysis using gradient dispersion metrics shows:

COGO produces more uniform and diverse gradient distributions
Reduces dependence on surrogate model-specific features
Complementarity of CE and IS is evident across different layers

Related Work

ViT Adversarial Attack Research

Early Methods: Primarily designed for CNNs, such as BIM, PGD, MIM
Input Transformation Methods: DIM, TIM improve transferability through input transformation
Frequency Domain Methods: SSA explores frequency domain vulnerabilities but lacks ViT-specific optimization

ViT-Specific Methods

TGR: Reduces variance by suppressing extreme gradients
GNS-HFA: Regularizes gradients to Gaussian distribution and enhances high frequencies
This Work: First to consider gradient-feature relationships, proposing commonality-oriented optimization

ViT Architecture Analysis

The authors categorize ViT variants into two types:

Computational Efficiency Type: Visformer, PiT, etc., simplifying attention operations
Data Efficiency Type: DeiT, CaiT, etc., enhancing representation through knowledge distillation

Conclusions and Discussion

Main Conclusions

Commonality-Oriented Optimization is Effective: Significantly improves adversarial example transferability by enhancing inter-model commonality and suppressing individuality
Frequency Domain Strategy is Important: Mid-to-low frequency enhancement tailored to ViT characteristics is more effective than traditional high-frequency methods
Adaptive Suppression is Superior: Adaptive suppression based on gradient-feature correlation outperforms uniform adjustment
Cross-Architecture Generalization: Method demonstrates excellent performance in both ViT-to-ViT and ViT-to-CNN transfer

Limitations

Computational Overhead: Frequency domain transformation and gradient analysis increase computational cost
Hyperparameter Sensitivity: Although relatively robust, appropriate parameter tuning is still required
Theoretical Analysis: Lacks in-depth theoretical analysis of why mid-to-low frequency enhancement is more effective
Defense Robustness: Insufficient exploration of robustness against targeted defense methods

Future Directions

Theoretical Refinement: Deepen theoretical analysis of frequency domain commonality foundations
Efficiency Optimization: Reduce computational overhead and improve practicality
Defense Research: Explore defense mechanisms against COGO
Extended Applications: Extend method to other Vision Transformer variants

In-Depth Evaluation

Strengths

Strong Novelty: First to analyze adversarial example transferability from a commonality-individuality perspective
Systematic Methodology: CE and IS components are well-designed, forming a complete optimization framework
Comprehensive Experiments: Covers multiple model architectures and attack scenarios with convincing results
Significant Performance: Shows clear improvements over existing methods, achieving new SOTA level
In-Depth Analysis: Provides insightful gradient dispersion analysis

Weaknesses

Theoretical Foundation: Theoretical explanation of mid-to-low frequency commonality is insufficient
Computational Efficiency: Frequency domain transformation and gradient analysis increase computational complexity
Applicability Scope: Primarily targets ViTs; applicability to other architectures is limited
Defense Considerations: Insufficient consideration of adaptive defense impacts

Impact

Academic Value: Provides new optimization perspective for adversarial attack research
Practical Value: Applicable to ViT robustness assessment
Reproducibility: Provides detailed implementation details and hyperparameter settings
Inspirational Significance: Commonality-individuality analysis framework may inspire related research

Applicable Scenarios

Model Robustness Evaluation: Assessing ViT security under adversarial attacks
Adversarial Training: Generating more challenging training samples
Security Research: Understanding and improving deep learning model security
Cross-Model Attacks: Black-box scenarios where target model information is unavailable

References

The paper cites important works in related fields, including:

Vision Transformer foundational work (Dosovitskiy et al., 2020)
Classical adversarial attack methods (Goodfellow et al., 2014; Madry et al., 2017)
ViT-specific attack methods (Zhang et al., 2023; Zhu et al., 2024)
Frequency domain attack research (Long et al., 2022)

Overall Assessment: This is a high-quality adversarial attack research paper with excellent performance in methodological innovation, experimental design, and result analysis. The COGO method provides an effective solution for improving adversarial example transferability through dual strategies of commonality enhancement and individuality suppression, holding significant value for ViT security research.