Multi-View Graph Feature Propagation for Privacy Preservation and Feature Sparsity
Harari, Unger
Graph Neural Networks (GNNs) have demonstrated remarkable success in node classification tasks over relational data, yet their effectiveness often depends on the availability of complete node features. In many real-world scenarios, however, feature matrices are highly sparse or contain sensitive information, leading to degraded performance and increased privacy risks. Furthermore, direct exposure of information can result in unintended data leakage, enabling adversaries to infer sensitive information. To address these challenges, we propose a novel Multi-view Feature Propagation (MFP) framework that enhances node classification under feature sparsity while promoting privacy preservation. MFP extends traditional Feature Propagation (FP) by dividing the available features into multiple Gaussian-noised views, each propagating information independently through the graph topology. The aggregated representations yield expressive and robust node embeddings. This framework is novel in two respects: it introduces a mechanism that improves robustness under extreme sparsity, and it provides a principled way to balance utility with privacy. Extensive experiments conducted on graph datasets demonstrate that MFP outperforms state-of-the-art baselines in node classification while substantially reducing privacy leakage. Moreover, our analysis demonstrates that propagated outputs serve as alternative imputations rather than reconstructions of the original features, preserving utility without compromising privacy. A comprehensive sensitivity analysis further confirms the stability and practical applicability of MFP across diverse scenarios. Overall, MFP provides an effective and privacy-aware framework for graph learning in domains characterized by missing or sensitive features.
Graph Neural Networks (GNNs) have achieved remarkable success in node classification tasks on relational data, yet their effectiveness often depends on the availability of complete node features. In many real-world scenarios, however, feature matrices are highly sparse or contain sensitive information, which degrades performance and increases privacy risk. To address these challenges, the paper proposes a Multi-view Feature Propagation (MFP) framework that improves node classification under feature sparsity while promoting privacy preservation. MFP extends traditional Feature Propagation (FP) by partitioning the available features into multiple Gaussian-noised views, each of which propagates information independently through the graph topology. The aggregated representations yield expressive and robust node embeddings.
This research addresses two core challenges in graph neural networks:
Feature Sparsity Problem: In practical applications, node feature matrices in graph data are often highly sparse or incomplete, causing severe performance degradation in GNNs
Privacy Protection Problem: Node features commonly contain sensitive personal information (e.g., demographic data, behavioral patterns), and direct usage may lead to privacy breaches
Traditional Feature Propagation (FP): Alleviates feature sparsity, but its performance remains significantly lower than that of models trained on complete features, and it may reconstruct sensitive information
Differential Privacy Methods: Protect privacy by adding noise but often sacrifice model performance
Graph Anonymization: May excessively damage graph structure, affecting learning effectiveness
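For reference, the FP baseline named in the first item above is usually stated as an iterative diffusion with a reset step. The form below follows the common presentation of FP (Rossi et al., 2022) and is only a sketch: the symbols \tilde{A} (symmetrically normalized adjacency), M (binary mask of observed entries), and K (number of iterations) are notation assumed here, not taken from this summary.

```latex
\tilde{A} = D^{-1/2} A D^{-1/2}, \qquad
X^{(0)} = M \odot X, \qquad
X^{(k+1)} = M \odot X + (\mathbf{1} - M) \odot \tilde{A} X^{(k)},
\quad k = 0, \dots, K-1
```

Observed entries are reset to their original values after every step, so only the missing entries are filled in from neighbors.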
Proposes the MFP Framework: The first graph learning framework to address feature sparsity and privacy protection simultaneously
Multi-view Propagation Mechanism: Enhances representation learning capability through independent propagation and aggregation of multiple partially-noised views
Privacy Protection Verification: Demonstrates that propagation outputs are alternative imputations rather than reconstructions of the original features, which limits privacy leakage
Comprehensive Experimental Evaluation: Validates MFP's effectiveness and robustness on multiple benchmark datasets
Sensitivity Analysis: Systematically analyzes the impact of key factors including graph homophily, propagation depth, and number of views
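The multi-view propagation mechanism listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not say how views are sampled or aggregated, so the random subsampling of observed entries (`keep_prob`), the noise scale (`sigma`), averaging as the aggregation step, and the function names `propagate` and `multi_view_fp` are all hypothetical choices.

```python
import math
import random


def propagate(adj, x, known, num_iters=40):
    """Feature Propagation for a single feature channel.

    adj:   dict mapping node index -> list of neighbor indices (undirected)
    x:     list of input values; entries where known[i] is False are ignored
    known: list of bools, True where the value is observed and must be kept
    """
    n = len(x)
    deg = [max(len(adj[i]), 1) for i in range(n)]  # guard isolated nodes
    out = [x[i] if known[i] else 0.0 for i in range(n)]
    for _ in range(num_iters):
        # One diffusion step with symmetric normalization 1/sqrt(d_i * d_j).
        nxt = [
            sum(out[j] / math.sqrt(deg[i] * deg[j]) for j in adj[i])
            for i in range(n)
        ]
        # Reset observed entries to their (possibly noised) input values.
        out = [x[i] if known[i] else nxt[i] for i in range(n)]
    return out


def multi_view_fp(adj, features, mask, num_views=3, keep_prob=0.5,
                  sigma=0.1, seed=0):
    """Average several independently propagated, Gaussian-noised views.

    ASSUMPTION: each view keeps a random subset of the observed entries and
    perturbs them with N(0, sigma^2) noise; views are aggregated by averaging.
    """
    rng = random.Random(seed)
    n, d = len(features), len(features[0])
    agg = [[0.0] * d for _ in range(n)]
    for _ in range(num_views):
        for c in range(d):
            # Entries visible to this view for feature channel c.
            known = [mask[i][c] and rng.random() < keep_prob for i in range(n)]
            noisy = [
                features[i][c] + rng.gauss(0.0, sigma) if known[i] else 0.0
                for i in range(n)
            ]
            col = propagate(adj, noisy, known)
            for i in range(n):
                agg[i][c] += col[i] / num_views
    return agg


if __name__ == "__main__":
    # Toy 4-node path graph with one feature, observed on nodes 0 and 3 only.
    toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    toy_x = [[1.0], [0.0], [0.0], [-1.0]]
    toy_mask = [[True], [False], [False], [True]]
    print(multi_view_fp(toy_adj, toy_x, toy_mask, num_views=4))
```

Setting `keep_prob=1.0` and `sigma=0.0` reduces every view to plain FP, which makes it easy to compare the two on the same graph.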
Input: Attributed graph G = {X, E} over a node set V, where E is the edge set and X ∈ R^{|V|×d} is the node feature matrix, which may contain sensitive attributes
Output: Node classification predictions Ŷ ∈ R^{|V|}
Objective: Achieve high-performance node classification while protecting sensitive feature privacy
Feature Distance Analysis: The RMSE distributions of MFP and FP closely resemble those of random noise, indicating that neither method reconstructs the original features
Correlation Analysis: MFP's Pearson correlation coefficient (PCC) values are concentrated mainly in the [-0.1, 0.1] interval, significantly lower than those of FP, indicating better privacy protection
Cross-representation Generalization: Model performance drops significantly when a model is evaluated on a different representation than the one it was trained on (e.g., from 0.87 to 0.56 on Cora), indicating that propagation outputs are substitute representations rather than reconstructions
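The distance and correlation checks above can be reproduced with two short helpers. The sketch below assumes the metrics are computed per node, comparing each row of the original feature matrix with the matching row of the propagated output; the summary does not state the granularity, and the names `rmse`, `pearson`, and `leakage_report` are hypothetical.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pearson(a, b):
    """Pearson correlation coefficient (PCC) between two vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # PCC is undefined for a constant vector; returning 0.0 is a
        # convention chosen for this sketch (assumption).
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def leakage_report(original, propagated):
    """Per-node RMSE and PCC between original and propagated feature rows."""
    rmses = [rmse(o, p) for o, p in zip(original, propagated)]
    pccs = [pearson(o, p) for o, p in zip(original, propagated)]
    return rmses, pccs
```

Under this reading, PCC values near zero and RMSE values comparable to those of a random-noise baseline would both suggest that the propagated features do not recover the originals.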
Rossi et al. (2022): Feature Propagation effectiveness
Yang et al. (2016): Planetoid benchmark datasets
Zhu et al. (2020): Homophily in graph neural networks
Overall Assessment: This paper addresses the dual challenges of feature sparsity and privacy protection in graph neural networks by proposing a multi-view feature propagation framework. The method design is sound, the experimental validation is comprehensive, and the work advances privacy-preserving graph learning while maintaining practical utility. Although the theoretical analysis and the privacy guarantees leave room for improvement, this is overall a high-quality research contribution.