Boosting Adversarial Transferability via Commonality-Oriented Gradient Optimization
Gao, Liu, Liu et al.
Exploring effective and transferable adversarial examples is vital for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated from surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbation inputs or applying uniform gradient regularization within surrogate models, yet they have not fully leveraged the shared and unique features of surrogate models trained on the same task, leading to suboptimal transfer performance. Therefore, enhancing perturbations of common information shared by surrogate models and suppressing those tied to individual characteristics offers an effective way to improve transferability. Accordingly, we propose a commonality-oriented gradient optimization strategy (COGO) consisting of two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs the mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to evaluate the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rates of adversarial attacks, outperforming current state-of-the-art methods.
Exploring effective and transferable adversarial examples is crucial for understanding the characteristics and mechanisms of Vision Transformers (ViTs). However, adversarial examples generated by surrogate models often exhibit weak transferability in black-box settings due to overfitting. Existing methods improve transferability by diversifying perturbed inputs or applying uniform gradient regularization within surrogate models, but fail to adequately exploit shared and unique features of surrogate models trained on the same task, resulting in suboptimal transfer performance. Therefore, enhancing perturbations that capture shared information among surrogate models while suppressing perturbations related to individual characteristics provides an effective pathway to improve transferability. Accordingly, we propose a Commonality-Oriented Gradient Optimization strategy (COGO), comprising two components: Commonality Enhancement (CE) and Individuality Suppression (IS). CE perturbs mid-to-low frequency regions, leveraging the fact that ViTs trained on the same dataset tend to rely more on mid-to-low frequency information for classification. IS employs adaptive thresholds to assess the correlation between backpropagated gradients and model individuality, assigning weights to gradients accordingly. Extensive experiments demonstrate that COGO significantly improves the transfer success rate of adversarial attacks, outperforming current state-of-the-art methods.
This paper addresses the transferability problem in adversarial attacks against Vision Transformers (ViTs). Specifically, adversarial examples generated on a surrogate model often fail to transfer to an unknown target model, so the attack fails.
Safety-Critical Applications: The reliability of ViTs in safety-critical applications is severely threatened by adversarial attacks
Black-Box Attack Realism: In practical scenarios, attackers typically cannot access the internal structure of target models, making transferability critical
Model Robustness Evaluation: Understanding the transferability of adversarial examples helps evaluate and improve model robustness
Overfitting: Adversarial examples generated by existing methods contain excessive surrogate model-specific information, resulting in poor generalization
Uniform Treatment: Methods such as TGR and GNS-HFA only adjust gradients based on statistical properties, without considering the correlation between gradients and model-specific features
Improper Frequency Domain Utilization: Methods like HFA focus only on high-frequency components, overlooking the fact that ViTs rely more on mid-to-low frequency information
The authors observe that different ViTs trained on the same dataset, despite architectural differences, exhibit commonality in decision patterns, particularly in their reliance on mid-to-low frequency information. Therefore, by enhancing common features and suppressing individual characteristics, more transferable adversarial examples can be generated.
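One way to probe this observation is to low-pass filter an image and measure how much of its spectral energy lies below a cutoff. The sketch below is only illustrative: the function names and the `cutoff` parameter (a radius in cycles per pixel) are assumptions made here, not definitions from the paper.

```python
import numpy as np

def radial_mask(height, width, cutoff):
    """Boolean mask (unshifted FFT layout) of frequencies with radius <= cutoff."""
    fy = np.fft.fftfreq(height)[:, None]  # vertical frequency, cycles per pixel
    fx = np.fft.fftfreq(width)[None, :]   # horizontal frequency, cycles per pixel
    return np.sqrt(fx ** 2 + fy ** 2) <= cutoff

def band_energy_share(image, cutoff):
    """Fraction of the image's spectral energy inside the low/mid band."""
    energy = np.abs(np.fft.fft2(image)) ** 2
    return float(energy[radial_mask(*image.shape, cutoff)].sum() / energy.sum())

def low_pass(image, cutoff):
    """Keep only frequencies inside the band and transform back to pixels."""
    spectrum = np.fft.fft2(image) * radial_mask(*image.shape, cutoff)
    return np.real(np.fft.ifft2(spectrum))
```

Under these assumptions, feeding `low_pass(image, cutoff)` to several ViTs and tracking accuracy as `cutoff` shrinks would be one way to check how strongly they depend on mid-to-low frequency content.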
Proposes Commonality-Oriented Optimization Strategy: First to consider the relationship between gradients and model features, transcending traditional uniform gradient adjustment methods
Designs COGO Framework: Combines Commonality Enhancement (CE) and Individuality Suppression (IS) components, utilizing frequency domain energy enhancement and adaptive threshold mechanisms
Significant Performance Improvement: Substantially outperforms existing state-of-the-art methods, including GNS-HFA and ATT, across multiple benchmarks
Comprehensive Experimental Validation: Achieves excellent performance in both ViT-to-ViT transfer and cross-architecture ViT-to-CNN transfer
Given a clean input image $X_{\mathrm{clean}} \in \mathbb{R}^{N}$, the objective is to generate an adversarial perturbation $\delta$ such that $X_{\mathrm{adv}} = X_{\mathrm{clean}} + \delta$ successfully attacks the surrogate model and exhibits good black-box transferability to unknown target models.
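As a rough illustration of this formulation, the sketch below builds a perturbation with iterative sign-gradient steps on a toy surrogate. Several things here are assumptions rather than details from the paper: the L-infinity budget `eps`, the step schedule, and the linear logistic model standing in for a ViT surrogate.

```python
import numpy as np

def toy_loss_and_grad(x, w, y):
    """Logistic loss of a toy linear surrogate and its gradient w.r.t. x.

    y is a label in {-1, +1}; w is the (assumed) surrogate weight vector.
    """
    margin = y * np.dot(w, x)
    loss = np.log1p(np.exp(-margin))
    grad = -y * w / (1.0 + np.exp(margin))
    return loss, grad

def iterative_sign_attack(x_clean, w, y, eps=0.1, steps=10):
    """Return x_adv = x_clean + delta with max |delta| <= eps (assumed budget)."""
    step_size = eps / steps
    delta = np.zeros_like(x_clean)
    for _ in range(steps):
        _, grad = toy_loss_and_grad(x_clean + delta, w, y)
        delta = delta + step_size * np.sign(grad)  # ascend the surrogate loss
        delta = np.clip(delta, -eps, eps)          # project back into the budget
    return x_clean + delta
```

This loop only raises the surrogate's loss; whether the result transfers to an unseen target model is exactly the question the paper's method is aimed at.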
Frequency Domain Commonality Exploitation: Unlike HFA, which focuses only on high frequencies, CE specifically enhances the mid-to-low frequency components that ViTs rely on
Adaptive Gradient Suppression: IS uses adaptive thresholds rather than fixed thresholds, better identifying and suppressing model-specific gradients
Dual Optimization Strategy: CE and IS operate in the forward and backward directions, respectively, and complement each other
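The frequency-domain side of these points can be sketched as a band-wise gain applied in Fourier space. This is a minimal illustration, not the paper's CE implementation: the function name, the radial `cutoff`, and the scalar `gain` are all hypothetical choices made for the example.

```python
import numpy as np

def enhance_band(signal_2d, cutoff=0.25, gain=1.5):
    """Scale frequency components with radius <= cutoff by `gain`.

    `cutoff` (cycles per pixel) and `gain` are illustrative parameters,
    not values taken from the paper.
    """
    fy = np.fft.fftfreq(signal_2d.shape[0])[:, None]
    fx = np.fft.fftfreq(signal_2d.shape[1])[None, :]
    in_band = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    scale = np.where(in_band, gain, 1.0)  # boost the band, leave the rest as is
    return np.real(np.fft.ifft2(np.fft.fft2(signal_2d) * scale))
```

With `gain=1.0` the input is returned unchanged, and content that lies entirely above `cutoff` (for example a pixel-level checkerboard) is left untouched, so only the mid-to-low band receives extra energy.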
Commonality-Oriented Optimization is Effective: Significantly improves adversarial example transferability by enhancing inter-model commonality and suppressing individuality
Frequency Domain Strategy is Important: Mid-to-low frequency enhancement tailored to ViT characteristics is more effective than traditional high-frequency methods
Adaptive Suppression is Superior: Adaptive suppression based on gradient-feature correlation outperforms uniform adjustment
Cross-Architecture Generalization: The method performs well in both ViT-to-ViT and ViT-to-CNN transfer
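To make the contrast between adaptive and uniform gradient adjustment concrete, here is a small sketch of threshold-based weighting. The summary does not specify how IS scores individuality or sets its thresholds, so everything here is an assumption: `score` is a stand-in individuality measure supplied by the caller, the threshold is the mean plus `k` standard deviations of that score, and `low_weight` is an arbitrary down-weighting factor.

```python
import numpy as np

def suppress_individual_gradients(grad, score, k=1.0, low_weight=0.25):
    """Down-weight gradient entries whose score exceeds an adaptive threshold.

    The threshold is recomputed from each tensor's own statistics, so it
    adapts per call instead of being a fixed constant (an assumed design).
    """
    threshold = score.mean() + k * score.std()
    weights = np.where(score > threshold, low_weight, 1.0)
    return grad * weights, threshold
```

For instance, using `np.abs(grad)` as the stand-in score shrinks only outlier entries, whereas a uniform adjustment such as `grad * low_weight` would scale every entry equally regardless of how model-specific it is.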
The paper cites important works in related fields, including:
Vision Transformer foundational work (Dosovitskiy et al., 2020)
Classical adversarial attack methods (Goodfellow et al., 2014; Madry et al., 2017)
ViT-specific attack methods (Zhang et al., 2023; Zhu et al., 2024)
Frequency domain attack research (Long et al., 2022)
Overall Assessment: This is a high-quality adversarial attack research paper that is strong in methodological innovation, experimental design, and result analysis. Through its dual strategy of commonality enhancement and individuality suppression, COGO provides an effective way to improve adversarial example transferability and is of significant value for ViT security research.