Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization

Gao, Liu, Liu et al.
Exploring effective and transferable adversarial examples is vital for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated from surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbation inputs or applying uniform gradient regularization within surrogate models, yet they have not fully leveraged the shared and unique features of surrogate models trained on the same task, leading to suboptimal transfer performance. Therefore, enhancing perturbations of common information shared by surrogate models and suppressing those tied to individual characteristics offers an effective way to improve transferability. Accordingly, we propose a commonality-oriented gradient optimization strategy (COGO) consisting of two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs the mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to evaluate the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rates of adversarial attacks, outperforming current state-of-the-art methods.

Basic Information

  • Paper ID: 2506.06992
  • Title: Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization
  • Authors: Yanting Gao, Yepeng Liu, Junming Liu, Qi Zhang, Hongyun Zhang, Duoqian Miao, Cairong Zhao
  • Affiliated Institutions: Tongji University, University of Florida
  • Classification: cs.CV (Computer Vision)
  • Publication Date: October 12, 2025 (arXiv preprint v2)
  • Paper Link: https://arxiv.org/abs/2506.06992

Abstract

Exploring effective and transferable adversarial examples is crucial for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated by surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbed inputs or applying uniform gradient regularization within surrogate models, but fail to adequately exploit shared and unique features of surrogate models trained on the same task, resulting in suboptimal transfer performance. Therefore, enhancing perturbations that capture shared information among surrogate models while suppressing perturbations related to individual characteristics provides an effective pathway to improve transferability. Accordingly, we propose a Commonality-Oriented Gradient Optimization strategy (COGO), comprising two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to assess the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rate of adversarial attacks, outperforming current state-of-the-art methods.

Research Background and Motivation

1. Research Problem

This paper addresses the weak transferability of adversarial attacks against Vision Transformers (ViTs). Adversarial examples generated on a surrogate model often fail to transfer to an unknown target model, so the attack fails.

2. Problem Significance

  • Safety-Critical Applications: The reliability of ViTs in safety-critical applications is severely threatened by adversarial attacks
  • Black-Box Attack Realism: In practical scenarios, attackers typically cannot access the internal structure of target models, making transferability critical
  • Model Robustness Evaluation: Understanding the transferability of adversarial examples helps evaluate and improve model robustness

3. Limitations of Existing Methods

  • Overfitting: Adversarial examples generated by existing methods contain excessive surrogate model-specific information, resulting in poor generalization
  • Uniform Treatment: Methods such as TGR and GNS-HFA only adjust gradients based on statistical properties, without considering the correlation between gradients and model-specific features
  • Improper Frequency Domain Utilization: Methods like HFA focus only on high-frequency components, overlooking the fact that ViTs rely more on mid-to-low frequency information

4. Research Motivation

The authors observe that different ViTs trained on the same dataset, despite architectural differences, exhibit commonality in decision patterns, particularly in their reliance on mid-to-low frequency information. Therefore, by enhancing common features and suppressing individual characteristics, more transferable adversarial examples can be generated.

Core Contributions

  1. Proposes Commonality-Oriented Optimization Strategy: First to consider the relationship between gradients and model features, transcending traditional uniform gradient adjustment methods
  2. Designs COGO Framework: Combines Commonality Enhancement (CE) and Individuality Suppression (IS) components, utilizing frequency domain energy enhancement and adaptive threshold mechanisms
  3. Significant Performance Improvement: Substantially outperforms existing state-of-the-art methods across multiple benchmarks, including GNS-HFA and ATT
  4. Comprehensive Experimental Validation: Achieves excellent performance in both ViT-to-ViT transfer and cross-architecture ViT-to-CNN transfer

Methodology Details

Task Definition

Given a clean input image $X_{clean} \in \mathbb{R}^N$, the objective is to generate an adversarial perturbation $\delta$ such that $X_{adv} = X_{clean} + \delta$ successfully attacks the surrogate model and exhibits good black-box transferability to unknown target models.
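
For reference, transfer attacks of this kind are commonly written as a constrained maximization. The following is a sketch in standard notation, not a formula quoted from the paper: the loss $J$, the surrogate model $f$, and the true label $y$ are symbols assumed here for illustration, and $\epsilon$ is the $\ell_\infty$ budget listed under Implementation Details.

```latex
\max_{\delta} \; J\big(f(X_{clean} + \delta),\, y\big)
\quad \text{s.t.} \quad \|\delta\|_\infty \le \epsilon
```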

Model Architecture

The COGO strategy comprises two core components:

1. Commonality Enhancement (CE)

The CE module enhances mid-to-low frequency components during forward propagation:

Step 1: Add current perturbation and Gaussian noise

X = X_clean + δ
X_DCT = DCT(X + ε), where ε ~ N(0, I_N)

Step 2: Compute energy distribution and enhance

E(X_DCT) = Normalize(|X_DCT|)
X'_DCT = X_DCT · (1 + γ · E(X_DCT))

Step 3: Transform back to spatial domain and apply spatial mask

X_IDCT = IDCT(X'_DCT · M)

where γ controls enhancement strength and M is a spatial mask inherited from HFA.
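
The three CE steps can be sketched in dependency-free Python on a small single-channel image. This is an illustrative sketch only, not the authors' implementation. Several choices are assumptions not stated in the summary: `Normalize` is taken to mean division by the largest coefficient magnitude, the mask defaults to all ones (the HFA mask is not reproduced), and the function names `dct_1d`, `idct_1d`, `transform_2d`, and `commonality_enhancement` are hypothetical.

```python
import math
import random


def dct_1d(v):
    """Orthonormal DCT-II of a 1-D list."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def transform_2d(x, f):
    """Apply the 1-D transform f along rows, then along columns."""
    rows = [f(list(r)) for r in x]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def commonality_enhancement(x_clean, delta, gamma=1.0, mask=None,
                            noise_std=1.0, seed=0):
    rng = random.Random(seed)
    h, w = len(x_clean), len(x_clean[0])
    # Step 1: X = X_clean + delta, add Gaussian noise, move to the DCT domain.
    x = [[x_clean[i][j] + delta[i][j] + rng.gauss(0.0, noise_std)
          for j in range(w)] for i in range(h)]
    x_dct = transform_2d(x, dct_1d)
    # Step 2: E = Normalize(|X_DCT|); assumed here to be division by the peak.
    peak = max(abs(v) for row in x_dct for v in row) or 1.0
    # Step 3: scale by (1 + gamma * E), apply the mask M, transform back.
    if mask is None:
        mask = [[1.0] * w for _ in range(h)]  # assumption: no masking
    x_mod = [[x_dct[i][j] * (1.0 + gamma * abs(x_dct[i][j]) / peak) * mask[i][j]
              for j in range(w)] for i in range(h)]
    return transform_2d(x_mod, idct_1d)
```

Because natural images concentrate most of their DCT energy at low frequencies, weighting each coefficient by its normalized magnitude amplifies those components most. A real implementation would use a batched tensor DCT rather than nested lists.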

2. Individuality Suppression (IS)

The IS module suppresses surrogate model-specific gradients during backpropagation:

Suppression of Redundant Features:

  • Quantify inter-channel redundancy using Mutual Information (MI) and Pearson Correlation Coefficient (PC)
  • Adaptive threshold: $\tau_{MI} = \beta_{MI} \cdot \text{mean}(MI(G_i^{(l)}, G_j^{(l)}))$
  • Weight computation: $w_i = \max(0.1, 1 - \alpha \sum_{(i,j) \in P} (t_{i,j}^{MI} + t_{i,j}^{corr}))$
  • Gradient adjustment: $\tilde{G}_i^{(l)} = G_i^{(l)} \cdot w_i$
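
The weighting rule above can be sketched as follows. This is a minimal illustration under several assumptions the summary does not confirm: the indicators t are taken to be 1 when a pair's score exceeds its adaptive threshold, the correlation threshold is built by analogy with the MI threshold (beta_corr times the mean absolute Pearson coefficient), the sum for channel i runs over all pairs containing i, and MI is approximated with a crude histogram estimator. All function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def mutual_information(a, b, bins=4):
    """Histogram estimate of MI in nats (a stand-in for the paper's estimator)."""
    def digitize(v):
        lo, hi = min(v), max(v)
        width = (hi - lo) / bins or 1.0
        return [min(int((x - lo) / width), bins - 1) for x in v]

    da, db = digitize(a), digitize(b)
    n = len(a)
    joint, pa, pb = {}, {}, {}
    for x, y in zip(da, db):
        joint[(x, y)] = joint.get((x, y), 0) + 1
        pa[x] = pa.get(x, 0) + 1
        pb[y] = pb.get(y, 0) + 1
    return sum(c / n * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in joint.items())


def suppression_weights(grads, alpha=0.1, beta_mi=0.5, beta_corr=0.7):
    """grads: one flattened gradient list per channel; returns one weight per channel."""
    pairs = list(combinations(range(len(grads)), 2))
    mi = {p: mutual_information(grads[p[0]], grads[p[1]]) for p in pairs}
    pc = {p: abs(pearson(grads[p[0]], grads[p[1]])) for p in pairs}
    # Adaptive thresholds: a fraction of the mean score over all channel pairs.
    tau_mi = beta_mi * sum(mi.values()) / len(pairs)
    tau_corr = beta_corr * sum(pc.values()) / len(pairs)
    penalty = [0] * len(grads)
    for p in pairs:
        t = int(mi[p] > tau_mi) + int(pc[p] > tau_corr)  # t^MI + t^corr
        for i in p:
            penalty[i] += t
    # w_i = max(0.1, 1 - alpha * sum of indicators); the floor keeps some signal.
    return [max(0.1, 1.0 - alpha * s) for s in penalty]


def adjust_gradients(grads, weights):
    """Scale each channel's gradient by its weight."""
    return [[w * g for g in ch] for ch, w in zip(grads, weights)]
```

With two identical channels and one uncorrelated channel, the redundant pair is down-weighted while the third channel keeps weight 1.0. The floor of 0.1 means a channel's gradient is never removed entirely.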

Suppression of Auxiliary Knowledge:

  • For auxiliary tokens such as distillation tokens in data-efficient ViTs
  • Scaling factor: $c = \sigma\left(\frac{\|G_{additional}^{(l)}\|_2}{\|G_{primary}^{(l)}\|_2}\right)$
  • Gradient adjustment: $\tilde{G}_{additional}^{(l)} = c \cdot G_{additional}^{(l)}$
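
A small sketch of the auxiliary-token scaling follows. It assumes that σ denotes the logistic sigmoid (the summary does not define it) and adds a small `eps` term, not part of the formula, to avoid division by zero. The function names are hypothetical.

```python
import math


def l2_norm(v):
    """Euclidean norm of a flat list."""
    return math.sqrt(sum(x * x for x in v))


def scale_auxiliary_gradient(g_additional, g_primary, eps=1e-12):
    """Return (c, scaled gradient) for an auxiliary token such as a distillation token."""
    ratio = l2_norm(g_additional) / (l2_norm(g_primary) + eps)
    c = 1.0 / (1.0 + math.exp(-ratio))  # assumption: sigma is the logistic sigmoid
    return c, [c * x for x in g_additional]
```

Under this reading the norm ratio is non-negative, so c lies in [0.5, 1): the auxiliary gradient is always shrunk and never amplified.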

Technical Innovations

  1. Frequency Domain Commonality Exploitation: Unlike HFA which focuses only on high frequencies, CE specifically enhances mid-to-low frequency components that ViTs rely on
  2. Adaptive Gradient Suppression: IS uses adaptive thresholds rather than fixed thresholds, better identifying and suppressing model-specific gradients
  3. Dual Optimization Strategy: CE and IS cooperatively optimize from forward and backward directions, forming complementary effects

Experimental Setup

Datasets

  • ILSVRC 2012 Validation Set: 1,000 randomly sampled images, a standard setting in transfer attack research
  • Follows experimental protocols from prior work such as TGR

Evaluation Metrics

  • Attack Success Rate (ASR): $\text{ASR} = \frac{\text{Number of Successful Attacks}}{\text{Total Number of Attacks}} \times 100\%$
  • Measures the proportion of adversarial examples that cause target model misclassification
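
The metric can be computed as in the short sketch below, which assumes an untargeted attack where success means the target model's prediction differs from the true label. The function name is hypothetical.

```python
def attack_success_rate(predictions, true_labels):
    """Percentage of adversarial examples that the target model misclassifies."""
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and true_labels must have the same length")
    if not predictions:
        raise ValueError("at least one adversarial example is required")
    fooled = sum(p != y for p, y in zip(predictions, true_labels))
    return 100.0 * fooled / len(predictions)
```

For example, if two of four adversarial examples are misclassified, the function returns 50.0.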

Comparison Methods

  • Primary Baselines: TGR (specifically designed for ViTs)
  • Recent Methods: GNS-HFA, ATT
  • Classical Methods: MIM, SINI-FGSM, PNA, SSA

Experimental Models

  • Surrogate Models: Visformer-S, DeiT-B, CaiT-S/24, ViT-B/16
  • ViT Target Models: TNT-S, ConViT-B, etc.
  • CNN Target Models: Inception-v3, Inception-v4, Inception-ResNet-v2, ResNet-101
  • Defense Models: Adversarially trained ensemble models

Implementation Details

  • Attack iterations: 10
  • Maximum $\ell_\infty$ perturbation: $\epsilon = 8$ (0-255 scale)
  • Key hyperparameters: $\gamma = 1$, $\alpha = 0.1$, $\beta_{MI} = 0.5$, $\beta_{corr} = 0.7$
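
To show how the iteration count and the perturbation budget interact, here is a basic iterative sign-gradient loop (I-FGSM style). It is not COGO itself. The step size of ε divided by the number of steps and the [0, 1] pixel range are assumptions, and `grad_fn` is a hypothetical callable that returns the loss gradient with respect to the input.

```python
def linf_attack(x_clean, grad_fn, eps=8 / 255, steps=10):
    """Iterative sign-gradient attack on a flat list of pixels in [0, 1]."""
    step = eps / steps  # assumption: budget split evenly across iterations
    delta = [0.0] * len(x_clean)
    for _ in range(steps):
        x_adv = [x + d for x, d in zip(x_clean, delta)]
        grad = grad_fn(x_adv)
        # Move each pixel by one step in the direction of the gradient sign.
        delta = [d + step * ((g > 0) - (g < 0)) for d, g in zip(delta, grad)]
        # Project back into the L-infinity ball of radius eps.
        delta = [max(-eps, min(eps, d)) for d in delta]
        # Keep the adversarial image inside the valid pixel range.
        delta = [min(1.0, max(0.0, x + d)) - x for x, d in zip(x_clean, delta)]
    return [x + d for x, d in zip(x_clean, delta)]
```

With ε = 8 on the 0-255 scale (8/255 after normalization) and 10 iterations, each step moves a pixel by at most 0.8 intensity levels, and the projection guarantees that the final perturbation never exceeds the budget.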

Experimental Results

Main Results

ViT-to-ViT Transfer Performance:

  • Average improvement of 7.2% over GNS-HFA
  • Average improvement of 10.1% over ATT
  • Achieves best performance across all tested ViT architectures

Cross-Architecture Transfer Performance (ViT → CNN):

  • Average improvement of 2.3% over GNS-HFA
  • Average improvement of 10.5% over ATT
  • Maintains good attack effectiveness against defense models

Specific Numerical Examples (with Visformer-S as surrogate model):

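| Method  | ViT-B/16 | DeiT-B | TNT-S | Inc-v3 | Inc-v4 |
|---------|----------|--------|-------|--------|--------|
| GNS-HFA | 49.1%    | 54.1%  | 81.3% | 71.6%  | 71.3%  |
| COGO    | 55.2%    | 64.9%  | 85.5% | 71.8%  | 72.4%  |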

Ablation Study

Contribution of CE and IS Components:

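| CE | IS | ViTs             | CNNs             | CNNs-adv         |
|----|----|------------------|------------------|------------------|
| -  | -  | 46.64%           | 30.45%           | 9.80%            |
| ✓  | -  | 72.56% (+25.92%) | 56.18% (+25.73%) | 32.15% (+22.35%) |
| -  | ✓  | 62.38% (+15.74%) | 45.85% (+15.40%) | 22.77% (+12.97%) |
| ✓  | ✓  | 77.97% (+31.33%) | 63.73% (+33.28%) | 36.75% (+26.95%) |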
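
The gains in parentheses are absolute differences from the baseline row with neither component enabled. The sketch below recomputes them from the reported success rates. The row labels assume that the larger single-component gain belongs to CE, consistent with the Key Findings; the variable and function names are hypothetical.

```python
# Average attack success rates (%) from the ablation table.
baseline = {"ViTs": 46.64, "CNNs": 30.45, "CNNs-adv": 9.80}
variants = {
    "CE only": {"ViTs": 72.56, "CNNs": 56.18, "CNNs-adv": 32.15},
    "IS only": {"ViTs": 62.38, "CNNs": 45.85, "CNNs-adv": 22.77},
    "CE + IS": {"ViTs": 77.97, "CNNs": 63.73, "CNNs-adv": 36.75},
}


def gains_over_baseline(row, base):
    """Absolute gain in percentage points for each target group."""
    return {group: round(row[group] - base[group], 2) for group in base}


for name, row in variants.items():
    print(name, gains_over_baseline(row, baseline))
```

The combined gain on ViTs (31.33 points) is smaller than the sum of the two single-component gains (25.92 + 15.74 = 41.66), so the components overlap in part rather than adding up fully.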

Key Findings:

  • CE component contributes more significantly, demonstrating the importance of frequency domain enhancement
  • IS component provides effective supplementation; combined effects are optimal
  • Significant improvements across all model types

Hyperparameter Sensitivity:

  • Enhancement coefficient γ = 1 yields best results
  • Iteration count N = 10 achieves performance balance
  • Channel pair quantity has minimal impact on results, demonstrating method robustness

Gradient Analysis

Analysis of gradient dispersion metrics shows:

  • COGO produces more uniform and diverse gradient distributions
  • Reduces dependence on surrogate model-specific features
  • Complementarity of CE and IS is evident across different layers

Related Work

ViT Adversarial Attack Research

  • Early Methods: Primarily designed for CNNs, such as BIM, PGD, MIM
  • Input Transformation Methods: DIM, TIM improve transferability through input transformation
  • Frequency Domain Methods: SSA explores frequency domain vulnerabilities but lacks ViT-specific optimization

ViT-Specific Methods

  • TGR: Reduces variance by suppressing extreme gradients
  • GNS-HFA: Regularizes gradients to Gaussian distribution and enhances high frequencies
  • This Work: First to consider gradient-feature relationships, proposing commonality-oriented optimization

ViT Architecture Analysis

The authors categorize ViT variants into two types:

  1. Computational Efficiency Type: Visformer, PiT, etc., simplifying attention operations
  2. Data Efficiency Type: DeiT, CaiT, etc., enhancing representation through knowledge distillation

Conclusions and Discussion

Main Conclusions

  1. Commonality-Oriented Optimization is Effective: Significantly improves adversarial example transferability by enhancing inter-model commonality and suppressing individuality
  2. Frequency Domain Strategy is Important: Mid-to-low frequency enhancement tailored to ViT characteristics is more effective than traditional high-frequency methods
  3. Adaptive Suppression is Superior: Adaptive suppression based on gradient-feature correlation outperforms uniform adjustment
  4. Cross-Architecture Generalization: Method demonstrates excellent performance in both ViT-to-ViT and ViT-to-CNN transfer

Limitations

  1. Computational Overhead: Frequency domain transformation and gradient analysis increase computational cost
  2. Hyperparameter Sensitivity: Although relatively robust, appropriate parameter tuning is still required
  3. Theoretical Analysis: Lacks in-depth theoretical analysis of why mid-to-low frequency enhancement is more effective
  4. Defense Robustness: Insufficient exploration of robustness against targeted defense methods

Future Directions

  1. Theoretical Refinement: Deepen theoretical analysis of frequency domain commonality foundations
  2. Efficiency Optimization: Reduce computational overhead and improve practicality
  3. Defense Research: Explore defense mechanisms against COGO
  4. Extended Applications: Extend method to other Vision Transformer variants

In-Depth Evaluation

Strengths

  1. Strong Novelty: First to analyze adversarial example transferability from a commonality-individuality perspective
  2. Systematic Methodology: CE and IS components are well-designed, forming a complete optimization framework
  3. Comprehensive Experiments: Covers multiple model architectures and attack scenarios with convincing results
  4. Significant Performance: Shows clear improvements over existing methods, setting a new state of the art
  5. In-Depth Analysis: Provides insightful gradient dispersion analysis

Weaknesses

  1. Theoretical Foundation: Theoretical explanation of mid-to-low frequency commonality is insufficient
  2. Computational Efficiency: Frequency domain transformation and gradient analysis increase computational complexity
  3. Applicability Scope: Primarily targets ViTs; applicability to other architectures is limited
  4. Defense Considerations: Insufficient consideration of adaptive defense impacts

Impact

  1. Academic Value: Provides new optimization perspective for adversarial attack research
  2. Practical Value: Applicable to ViT robustness assessment
  3. Reproducibility: Provides detailed implementation details and hyperparameter settings
  4. Inspirational Significance: Commonality-individuality analysis framework may inspire related research

Applicable Scenarios

  1. Model Robustness Evaluation: Assessing ViT security under adversarial attacks
  2. Adversarial Training: Generating more challenging training samples
  3. Security Research: Understanding and improving deep learning model security
  4. Cross-Model Attacks: Black-box scenarios where target model information is unavailable

References

The paper cites important works in related fields, including:

  • Vision Transformer foundational work (Dosovitskiy et al., 2020)
  • Classical adversarial attack methods (Goodfellow, 2014; Madry et al., 2017)
  • ViT-specific attack methods (Zhang et al., 2023; Zhu et al., 2024)
  • Frequency domain attack research (Long et al., 2022)

Overall Assessment: This is a high-quality adversarial attack research paper with excellent performance in methodological innovation, experimental design, and result analysis. The COGO method provides an effective solution for improving adversarial example transferability through dual strategies of commonality enhancement and individuality suppression, holding significant value for ViT security research.