FedGTEA: Federated Class-Incremental Learning with Gaussian Task Embedding and Alignment
Li, Bidkhori
We introduce a novel framework for Federated Class Incremental Learning, called Federated Gaussian Task Embedding and Alignment (FedGTEA). FedGTEA is designed to capture task-specific knowledge and model uncertainty in a scalable and communication-efficient manner. At the client side, the Cardinality-Agnostic Task Encoder (CATE) produces Gaussian-distributed task embeddings that encode task knowledge, address statistical heterogeneity, and quantify data uncertainty. Importantly, CATE maintains a fixed parameter size regardless of the number of tasks, which ensures scalability across long task sequences. On the server side, FedGTEA utilizes the 2-Wasserstein distance to measure inter-task gaps between Gaussian embeddings. We formulate the Wasserstein loss to enforce inter-task separation. This probabilistic formulation not only enhances representation learning but also preserves task-level privacy by avoiding the direct transmission of latent embeddings, aligning with the privacy constraints in federated learning. Extensive empirical evaluations on popular datasets demonstrate that FedGTEA achieves superior classification performance and significantly mitigates forgetting, consistently outperforming strong existing baselines.
This paper proposes FedGTEA (Federated Gaussian Task Embedding and Alignment), a framework for federated class-incremental learning. The framework captures task-specific knowledge and model uncertainty in a scalable and communication-efficient manner. On the client side, a Cardinality-Agnostic Task Encoder (CATE) generates Gaussian-distributed task embeddings that encode task knowledge, address statistical heterogeneity, and quantify data uncertainty. A key property of CATE is that its parameter count stays fixed regardless of the number of tasks, which keeps it scalable over long task sequences. On the server side, FedGTEA uses the 2-Wasserstein distance to measure inter-task gaps between Gaussian embeddings and enforces task separation through a Wasserstein loss. This probabilistic formulation not only enhances representation learning but also protects task-level privacy by avoiding direct transmission of latent embeddings.
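To make the server-side distance concrete, the following is a minimal sketch of the squared 2-Wasserstein distance between two Gaussians. It assumes diagonal covariances, in which case the closed form reduces to the squared distance between the means plus the squared distance between the standard deviations. The summary does not state which covariance structure FedGTEA uses, so the diagonal assumption, the function name `w2_squared_diag`, and the example vectors are illustrative only.

```python
import numpy as np

def w2_squared_diag(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between N(mu1, diag(sigma1^2))
    and N(mu2, diag(sigma2^2)). Diagonal covariance is an assumption."""
    mu1, sigma1, mu2, sigma2 = (
        np.asarray(a, dtype=float) for a in (mu1, sigma1, mu2, sigma2)
    )
    mean_term = np.sum((mu1 - mu2) ** 2)       # gap between the means
    std_term = np.sum((sigma1 - sigma2) ** 2)  # gap between the std deviations
    return float(mean_term + std_term)

# Two hypothetical 3-dimensional task embeddings.
d_same = w2_squared_diag([0, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 1])
d_far = w2_squared_diag([0, 0, 0], [1, 1, 1], [3, 4, 0], [1, 2, 1])
```

Identical embeddings give a distance of zero, and the distance grows as either the means or the spreads diverge, which is the quantity a separation loss would push upward for distinct tasks.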
Federated Class-Incremental Learning (FCIL) is a hybrid of federated learning (FL) and class-incremental learning (CIL), requiring simultaneous solutions to three core challenges:
Catastrophic Forgetting: Occurs both during local client updates and global aggregation
Statistical Heterogeneity: Data distributions across clients are typically not independent and identically distributed (non-IID)
Task Context Ambiguity: Lack of task identity at test time leads to semantic drift and performance degradation
Existing FCIL methods primarily exploit data-level features and neglect task-level context. As shown in Figure 1, the same input may call for contradictory answers under different tasks (e.g., "What is this object?" vs. "What is the background color?"), so the correct output depends on task-level contextual information. How to use task context effectively in FCIL remains relatively underexplored.
Proposes FedGTEA Algorithm: Effectively captures task-level knowledge in FCIL in a scalable and robust manner, introducing a Cardinality-Agnostic Task Encoder (CATE) on the client side to generate task embeddings modeled as Gaussian random variables, and leveraging 2-Wasserstein distance on the server side to promote task separation.
Designs CATE Module: Capable of inferring task embeddings from data batches of arbitrary size with cardinality-agnostic properties. By modeling embeddings as Gaussian random variables, the server can quantify inter-task distances using the 2-Wasserstein metric.
Server-side Optimization Framework: The server first aggregates client models following FedAvg, then formulates an optimization problem with three loss components: knowledge distillation loss, Wasserstein loss, and anchor loss.
Experimental Validation: Achieves superior accuracy and forgetting performance compared to strong baselines (AC-GAN + FedAvg/FedProx, GLFC, FedCIL, FLwF-2T) on multiple benchmark datasets.
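The cardinality-agnostic property described for CATE can be illustrated with a small sketch. It assumes a mean-pooling encoder in the style of set networks: per-sample features are averaged over the batch before being projected to a Gaussian mean and standard deviation. The actual CATE architecture is not given in this summary, so the pooling choice, the layer sizes, and all names (`encode_task`, `W_in`, `W_mu`, `W_logvar`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, HID_DIM, EMB_DIM = 8, 16, 4  # hypothetical sizes

# Fixed-size parameters: shapes do not depend on batch size or task count.
W_in = rng.normal(scale=0.1, size=(FEAT_DIM, HID_DIM))
W_mu = rng.normal(scale=0.1, size=(HID_DIM, EMB_DIM))
W_logvar = rng.normal(scale=0.1, size=(HID_DIM, EMB_DIM))

def encode_task(batch):
    """Map a batch of shape (n, FEAT_DIM), any n >= 1, to Gaussian
    parameters (mu, sigma) of a task embedding."""
    h = np.tanh(batch @ W_in)                  # per-sample features, (n, HID_DIM)
    pooled = h.mean(axis=0)                    # order- and size-invariant pooling
    mu = pooled @ W_mu                         # embedding mean, (EMB_DIM,)
    sigma = np.exp(0.5 * (pooled @ W_logvar))  # positive std deviations
    return mu, sigma

small = rng.normal(size=(5, FEAT_DIM))
large = rng.normal(size=(64, FEAT_DIM))
mu_s, sigma_s = encode_task(small)
mu_l, sigma_l = encode_task(large)
mu_perm, _ = encode_task(small[::-1])          # same samples, reversed order
```

Batches of 5 and 64 samples both yield an embedding of the same dimension, and reordering the samples leaves the embedding unchanged, which is what lets one encoder with a fixed parameter count serve every task.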
The FCIL system consists of N clients and a central server, processing a global task sequence T = {T¹, T², ..., Tᵀ}. Each client Cₖ collects a local dataset Dᵗₖ ⊂ Tᵗ during task Tᵗ. The objective is to find global parameters θᵗₘ that minimize the loss across all observed tasks and all clients.
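One way to write this objective explicitly is sketched below. The per-sample loss ℓ, the model f_θ, and the uniform weighting over clients and tasks are assumptions made for illustration; the summary does not specify them, and the global parameters after task t are written simply as θ^t here.

```latex
\theta^{t} \in \arg\min_{\theta}
  \sum_{s=1}^{t} \sum_{k=1}^{N}
  \frac{1}{\lvert D_k^{s} \rvert}
  \sum_{(x,\,y) \in D_k^{s}} \ell\bigl(f_{\theta}(x),\, y\bigr)
```

The outer sum over s = 1, ..., t is what makes the problem incremental: the model is evaluated on every task seen so far, even though each client typically has access only to data from the current task during training.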
Distillation Loss: Removing it significantly increases forgetting rate (from 8.6 to 12.2 on CIFAR-100 superclass), demonstrating its importance for retaining prior knowledge
Anchor Loss: Removing it substantially decreases accuracy (nearly 7% drop on CIFAR-10), indicating its necessity for stabilizing discriminative feature representation
CATE and Wasserstein Loss: Removing them significantly degrades performance, validating the effectiveness of the task encoder and task separation mechanism
The main aggregation strategies are FedAvg, which averages client models weighted by local data size, and FedProx, which adds a proximal regularization term to each client's local objective to limit drift under statistical heterogeneity.
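The two strategies can be sketched in a few lines. This is a minimal illustration on flat parameter vectors, not the implementation used in the paper: `fedavg` weights each client by its local sample count, and `fedprox_penalty` computes the proximal term (mu / 2) * ||w - w_global||^2 that FedProx adds to each client's local loss. The function names, the example vectors, and the value of `mu` are hypothetical.

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighted by local sample counts."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return np.average(np.stack(client_params), axis=0, weights=weights)

def fedprox_penalty(local_params, global_params, mu=0.1):
    """Proximal term (mu / 2) * ||w - w_global||^2 added to the local loss."""
    diff = np.asarray(local_params, dtype=float) - np.asarray(global_params, dtype=float)
    return 0.5 * mu * float(diff @ diff)

# Two hypothetical clients holding 100 and 300 samples.
clients = [np.array([1.0, 2.0]), np.array([3.0, 6.0])]
global_w = fedavg(clients, client_sizes=[100, 300])
penalty = fedprox_penalty(clients[0], global_w, mu=0.1)
```

The client with more data pulls the average toward its parameters, while the proximal penalty grows with a client's distance from the global model and so discourages large local deviations.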
FedGTEA achieves effective modeling of task-level knowledge in FCIL by introducing a cardinality-agnostic task encoder and Wasserstein distance regularization, outperforming existing methods in both accuracy and forgetting performance.
This work introduces a new perspective of task-level modeling to the FCIL field, potentially inspiring more research focusing on task context. The cardinality-agnostic design and privacy protection features make it promising for practical applications.
The paper cites important works in FCIL, CIL, and FL domains, including classical methods such as FedAvg, iCaRL, and AC-GAN, as well as recent FCIL research including FedCIL and GLFC, providing a solid theoretical foundation for this research.