Over-Threshold Multiparty Private Set Intersection for Collaborative Network Intrusion Detection
Arpaci, Boutaba, Kerschbaum
An important function of collaborative network intrusion detection is to analyze the network logs of the collaborators for joint IP addresses. However, sharing IP addresses in plaintext is sensitive and may even be subject to privacy legislation, as they are personally identifiable information. In this paper, we present the privacy-preserving collection of IP addresses. We propose a single-collector, over-threshold private set intersection protocol. In this protocol, $N$ participants identify the IP addresses that appear in at least $t$ participants' sets without revealing any information about other IP addresses. Using a novel hashing scheme, we reduce the computational complexity of the previous state-of-the-art solution from $O(M(N \log{M}/t)^{2t})$ to $O(t^2M\binom{N}{t})$, where $M$ denotes the dataset size. This reduction makes it practically feasible to apply our protocol to real network logs. We test our protocol using joint network logs of multiple institutions. Additionally, we present two deployment options: a collusion-safe deployment, which provides stronger security guarantees at the cost of increased communication overhead, and a non-interactive deployment, which assumes a non-colluding collector but offers significantly lower communication costs and is applicable to many use cases of collaborative network intrusion detection similar to ours.
A critical function in collaborative network intrusion detection is analyzing network logs from collaborators to identify common IP addresses. However, sharing IP addresses in plaintext is sensitive and may be constrained by privacy legislation, as it constitutes personally identifiable information. This paper proposes a privacy-preserving collection method for IP addresses through a single-aggregator, over-threshold private set intersection protocol. In this protocol, $N$ participants identify IP addresses appearing in at least $t$ participants' sets without revealing any information about other IP addresses. Through a novel hashing scheme, the computational complexity of the previous state-of-the-art solution is reduced from $O(M(N \log{M}/t)^{2t})$ to $O(t^2M\binom{N}{t})$, where $M$ denotes the dataset size. This reduction makes applying the protocol to real network logs practically feasible.
The core challenge in collaborative network intrusion detection is identifying multi-institutional attacks while preserving privacy. Research demonstrates that 75% of institutional attacks spread to a second institution within one day, with over 40% spreading within one hour. Attackers typically exploit a small number of external IP addresses to simultaneously attack multiple institutions. If an external IP connects to at least t institutions within a specific time window, it can be classified as malicious with 95% recall.
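To make the detection rule concrete, the following is a minimal plaintext sketch of the over-threshold functionality: it returns the IP addresses that appear in at least t of the participants' sets. It is only a reference for what the output should be, not the privacy-preserving protocol itself, since it reads every set in the clear. The function name `over_threshold_intersection` and the sample addresses are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable


def over_threshold_intersection(sets: Iterable[Iterable[Hashable]], t: int) -> set:
    """Return the elements that appear in at least t of the given sets.

    Plaintext reference only: a private protocol would compute the same
    output without exposing elements that stay below the threshold.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    counts = Counter()
    for participant_set in sets:
        # Deduplicate so each participant contributes at most one count
        # per element, however often it occurs in that participant's log.
        counts.update(set(participant_set))
    return {element for element, count in counts.items() if count >= t}


# Hypothetical logs from three institutions (documentation-range addresses).
logs = [
    {"192.0.2.1", "198.51.100.7"},
    {"192.0.2.1", "203.0.113.9"},
    {"192.0.2.1", "198.51.100.7", "203.0.113.20"},
]

# Only addresses seen by all three institutions are flagged when t = 3.
flagged = over_threshold_intersection(logs, t=3)
print(flagged)
```

Lowering t to 2 in this example would also flag 198.51.100.7, which two of the three institutions observed.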
Novel Hashing Scheme: Proposes an innovative hashing algorithm reducing computational complexity from $O(M(N \log{M}/t)^{2t})$ to $O(t^2M\binom{N}{t})$, achieving linear complexity in $M$
Practical Enhancement: Enables the protocol to handle real-scale network logs, completing detection within 170 seconds for 33 participating institutions with up to 144,045 IPs
Dual Deployment Options:
Collusion-resistant deployment: Provides stronger security guarantees at the cost of higher communication overhead
Non-interactive deployment: Assumes a non-colluding aggregator, significantly reducing communication costs
Security Proof: Proves protocol security under the semi-honest multiparty computation model
Practical Validation: Evaluation using real network logs from the CANARIE IDS project
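As a rough illustration of the complexity reduction listed above, the sketch below evaluates the two growth terms with all hidden constants ignored, so the ratio indicates scaling only and does not predict running time. It takes N = 33 from the figures quoted above and treats the 144,045 IPs as the dataset size M, which is an assumption. The threshold t = 3 and the base-2 logarithm are also assumptions, since the summary specifies neither, and the function names are hypothetical.

```python
import math


def previous_cost(M: int, N: int, t: int) -> float:
    """Growth term M * (N * log M / t)^(2t); constants ignored, log base 2 assumed."""
    return M * (N * math.log2(M) / t) ** (2 * t)


def proposed_cost(M: int, N: int, t: int) -> int:
    """Growth term t^2 * M * C(N, t); constants ignored."""
    return t * t * M * math.comb(N, t)


N, M = 33, 144_045  # participants and IP count quoted in the summary
t = 3               # hypothetical threshold, not stated in the summary

old = previous_cost(M, N, t)
new = proposed_cost(M, N, t)
print(f"previous growth term: {old:.3e}")
print(f"proposed growth term: {new:.3e}")
print(f"ratio (previous / proposed): {old / new:.3e}")

# Linearity in M: doubling the dataset size doubles the proposed term.
print(proposed_cost(2 * M, N, t) == 2 * new)
```

The proposed term still grows with the binomial coefficient $\binom{N}{t}$, so the benefit in this comparison comes from the linear dependence on $M$ rather than from any improvement in how the cost scales with $N$ and $t$.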
This paper cites 53 references spanning cryptography, network security, and multiparty computation, providing a solid theoretical foundation and comprehensive technical background.
Overall Assessment: This is a high-quality applied cryptography paper that achieves an excellent balance between theoretical innovation and practical application. The proposed hashing scheme is a significant theoretical advance with substantial practical value. The paper provides comprehensive experimental validation and rigorous security analysis, making important technical contributions to collaborative network security.