Personal Attribute Leakage in Federated Speech Models
Al-Ali, Ghavamipour, Caselli et al.
Federated learning is a common method for privacy-preserving training of machine learning models. In this paper, we analyze the vulnerability of ASR models to attribute inference attacks in the federated setting. We test a non-parametric white-box attack method under a passive threat model on three ASR models: Wav2Vec2, HuBERT, and Whisper. The attack operates solely on weight differentials without access to raw speech from target speakers. We demonstrate attack feasibility on sensitive demographic and clinical attributes: gender, age, accent, emotion, and dysarthria. Our findings indicate that attributes that are underrepresented or absent in the pre-training data are more vulnerable to such inference attacks. In particular, information about accents can be reliably inferred from all models. Our findings expose previously undocumented vulnerabilities in federated ASR models and offer insights towards improved security.
Federated learning is a widely used approach to privacy-preserving training of machine learning models. This paper analyzes how vulnerable automatic speech recognition (ASR) models are to attribute inference attacks in the federated setting. Under a passive threat model, the authors test a non-parametric white-box attack against three ASR models: Wav2Vec2, HuBERT, and Whisper. The attack operates solely on weight differentials and needs no access to raw speech from the target speakers. The study demonstrates that the attack is feasible for sensitive demographic and clinical attributes: gender, age, accent, emotion, and dysarthria. Attributes that are underrepresented or absent in the pre-training data are more vulnerable to such inference. Notably, accent information can be reliably inferred from all three models.
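To make the idea of inferring an attribute from weight differentials concrete, the following is a minimal sketch, not the authors' implementation. The summary above only states that the attack is non-parametric and uses weight differences; the choice of a k-nearest-neighbor vote with cosine similarity, the availability of labeled "shadow" updates produced from the attacker's own data, and all function names (`weight_delta`, `cosine_similarity`, `infer_attribute`) are assumptions made for illustration.

```python
import math
from collections import Counter


def weight_delta(global_weights, client_weights):
    """Return the element-wise difference between client and global weights."""
    return [c - g for g, c in zip(global_weights, client_weights)]


def cosine_similarity(u, v):
    """Cosine similarity between two flat vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def infer_attribute(target_delta, shadow_deltas, shadow_labels, k=3):
    """Assumed attack: majority vote over the k shadow updates most similar
    to the target update. No model is trained, hence non-parametric."""
    ranked = sorted(
        zip(shadow_deltas, shadow_labels),
        key=lambda pair: cosine_similarity(target_delta, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy example with 3-dimensional "models"; real ASR updates have millions
# of parameters. All numbers are made up for illustration.
global_model = [0.0, 0.0, 0.0]
shadow_clients = [
    ([0.9, 0.1, 0.0], "accent_a"),
    ([1.0, 0.0, 0.1], "accent_a"),
    ([0.0, 1.0, 0.1], "accent_b"),
    ([0.1, 0.9, 0.0], "accent_b"),
]
shadow_deltas = [weight_delta(global_model, w) for w, _ in shadow_clients]
shadow_labels = [label for _, label in shadow_clients]

# Update observed from the target client; the attacker never sees its audio.
target = weight_delta(global_model, [0.8, 0.2, 0.05])
prediction = infer_attribute(target, shadow_deltas, shadow_labels, k=3)
```

In this toy setup the target update points in roughly the same direction as the two `accent_a` shadow updates, so the vote returns `accent_a`. The sketch only illustrates why a passive observer of weight updates may need nothing beyond the updates themselves and some labeled reference updates.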
Core Issue: Whether ASR models in federated learning environments leak sensitive personal attribute information through model weight updates
Privacy Threats: Speech data contains rich personal information, including demographic characteristics (gender, age, accent), clinical conditions (dysarthria), and emotional states
Legal Compliance: Attribute leakage may violate GDPR, HIPAA, and anti-discrimination laws in the US and EU
Privacy Protection: The ADA protects individuals with disabilities from discrimination; leakage of speech disorder information carries severe consequences
Practical Threats: Even without identity disclosure, leakage of attributes such as accent or emotional state constitutes serious privacy violations
Federated Learning Assumptions: While federated learning improves privacy by keeping raw audio on-device, model updates may still leak sensitive information
Research Gaps: Previous work primarily focused on speaker re-identification and membership inference attacks, but the scope of attribute leakage remains insufficiently explored
Threat Models: Lack of systematic research on attribute inference through weight updates alone
Significant Research Value: First systematic revelation of attribute leakage vulnerabilities in federated ASR models with important privacy protection implications
Reasonable Methodology Design: Attack method is simple and effective; threat model is realistic and credible
Baevski et al. "wav2vec 2.0: A framework for self-supervised learning of speech representations." NeurIPS 2020.
Hsu et al. "HuBERT: Self-supervised speech representation learning by masked prediction of hidden units." IEEE/ACM TASLP 2021.
Radford et al. "Robust speech recognition via large-scale weak supervision." ICML 2023.
Shokri et al. "Membership inference attacks against machine learning models." IEEE S&P 2017.
Melis et al. "Exploiting unintended feature leakage in collaborative learning." IEEE S&P 2019.
This paper shows that federated training of speech models carries real privacy risks: weight updates alone can leak personal attributes even though raw audio stays on-device. Its findings offer guidance for building safer speech AI systems and are relevant to both research and practical deployment.