Optimized Layerwise Approximation for Efficient Private Inference on Fully Homomorphic Encryption
Lee, Lee, Kim et al.
Recent studies have explored the deployment of privacy-preserving deep neural networks utilizing homomorphic encryption (HE), especially for private inference (PI). Many works have attempted the approximation-aware training (AAT) approach in PI, changing the activation functions of a model to low-degree polynomials that are easier to compute on HE by allowing model retraining. However, due to constraints in the training environment, it is often necessary to consider post-training approximation (PTA), using the pre-trained parameters of the existing plaintext model without retraining. Existing PTA studies have uniformly approximated the activation function in all layers to a high degree to mitigate accuracy loss from approximation, leading to significant time consumption. This study proposes an optimized layerwise approximation (OLA), a systematic framework that optimizes both accuracy loss and time consumption by using different approximation polynomials for each layer in the PTA scenario. For efficient approximation, we reflect the layerwise impact on the classification accuracy by considering the actual input distribution of each activation function while constructing the optimization problem. Additionally, we provide a dynamic programming technique to solve the optimization problem and achieve the optimized layerwise degrees in polynomial time. As a result, the OLA method reduces inference times for the ResNet-20 model and the ResNet-32 model by 3.02 times and 2.82 times, respectively, compared to prior state-of-the-art implementations employing uniform degree polynomials. Furthermore, we successfully classified CIFAR-10 by replacing the GELU function in the ConvNeXt model with only 3-degree polynomials using the proposed method, without modifying the backbone model.
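This paper proposes optimized layerwise approximation (OLA), a method for efficient private inference under fully homomorphic encryption (FHE). The method uses a different approximation polynomial for each layer to optimize accuracy loss and time consumption together, and it substantially improves inference efficiency in the post-training approximation (PTA) scenario. OLA reduced the inference time of the ResNet-20 and ResNet-32 models by factors of 3.02 and 2.82, respectively, and successfully replaced the GELU function in the ConvNeXt model with polynomials of degree only 3.
In privacy-preserving machine learning (PPML), fully homomorphic encryption (FHE) makes it possible to compute directly on encrypted data. However, FHE schemes support only basic arithmetic operations (addition and multiplication) and cannot directly evaluate non-arithmetic activation functions such as ReLU, GELU, and sigmoid.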
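The abstract mentions a dynamic programming step that selects a polynomial degree for each layer. The sketch below illustrates the general idea only, as a knapsack-style DP, and is not the authors' formulation: the function name `choose_degrees`, the integer time costs, the per-layer error values, and the additive error objective are all assumptions made for illustration.

```python
from math import inf

def choose_degrees(candidates, budget):
    """Pick one polynomial degree per layer (hypothetical helper).

    candidates[l] is a list of (degree, time_cost, error) options for layer l,
    where time_cost is a non-negative integer and error is an assumed
    per-layer accuracy-loss proxy. Returns (total_error, degrees) minimizing
    the summed error with total time <= budget, or (inf, None) if infeasible.
    """
    # best[t] = (minimum error, chosen degrees) with total time exactly t
    best = [(inf, None)] * (budget + 1)
    best[0] = (0.0, [])
    for layer in candidates:
        nxt = [(inf, None)] * (budget + 1)
        for t, (err, degs) in enumerate(best):
            if degs is None:
                continue  # time t is not reachable with the layers so far
            for degree, cost, e in layer:
                nt = t + cost
                if nt <= budget and err + e < nxt[nt][0]:
                    nxt[nt] = (err + e, degs + [degree])
        best = nxt
    return min(best, key=lambda entry: entry[0])

# Made-up numbers: the second layer tolerates a low degree, the first does not.
example = [
    [(3, 1, 0.50), (7, 2, 0.10), (15, 3, 0.01)],
    [(3, 1, 0.05), (7, 2, 0.02), (15, 3, 0.01)],
    [(3, 1, 0.30), (7, 2, 0.05), (15, 3, 0.01)],
]
total_error, degrees = choose_degrees(example, budget=6)
print(degrees)
```

With these made-up numbers the sketch selects degrees `[15, 3, 7]`: it spends the time budget on the layer whose error is most sensitive to the degree, which is the intuition behind non-uniform, layerwise degrees. The running time is polynomial in the number of layers, the number of candidate degrees, and the budget.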
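To make the restriction concrete, the following sketch replaces GELU with a polynomial that is evaluated using only additions and multiplications on plaintext floats. It is a minimal illustration under stated assumptions, not the paper's method: it uses plain Chebyshev interpolation on a fixed interval `[-bound, bound]` rather than an approximation weighted by each layer's input distribution, and the names `chebyshev_coeffs`, `eval_arithmetic_only`, and the choice `bound = 4.0` are hypothetical.

```python
import math

def gelu(x):
    # Exact GELU; needs erf, which FHE cannot evaluate directly.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def chebyshev_coeffs(f, degree, bound):
    """Chebyshev interpolation coefficients of f on [-bound, bound]."""
    n = degree + 1
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    values = [f(bound * t) for t in nodes]
    coeffs = []
    for j in range(n):
        s = sum(v * math.cos(math.pi * j * (k + 0.5) / n)
                for k, v in enumerate(values))
        coeffs.append(s * (1.0 if j == 0 else 2.0) / n)
    return coeffs

def eval_arithmetic_only(coeffs, x, bound):
    """Evaluate sum_j c_j T_j(x / bound) using only + , - and *."""
    t = x * (1.0 / bound)          # multiplication by a plaintext constant
    prev, cur = 1.0, t             # T_0(t) and T_1(t)
    total = coeffs[0]
    if len(coeffs) > 1:
        total = total + coeffs[1] * cur
    for c in coeffs[2:]:
        prev, cur = cur, 2.0 * t * cur - prev   # T_{j+1} = 2 t T_j - T_{j-1}
        total = total + c * cur
    return total

def max_error(degree, bound=4.0, points=801):
    coeffs = chebyshev_coeffs(gelu, degree, bound)
    grid = [-bound + 2.0 * bound * i / (points - 1) for i in range(points)]
    return max(abs(gelu(x) - eval_arithmetic_only(coeffs, x, bound)) for x in grid)

for degree in (3, 7, 15):
    print(degree, max_error(degree))
```

The loop prints the maximum error on the interval for a few degrees. A higher degree fits more closely but costs more multiplications and multiplicative depth under FHE, which is the trade-off a per-layer degree choice tries to balance.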
Lee et al. "Low-complexity deep convolutional neural networks on fully homomorphic encryption using multiplexed parallel convolutions." ICML 2022.
Kim et al. "Optimized privacy-preserving CNN inference with fully homomorphic encryption." IEEE TIFS 2023.
Gilad-Bachrach et al. "CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy." ICML 2016.
Cheon et al. "A full RNS variant of approximate homomorphic encryption." SAC 2018.
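Summary: The OLA method proposed in this paper is significant for FHE-based private inference. Its layerwise optimization substantially improves inference efficiency and lays important groundwork for practical applications of privacy-preserving AI. Despite some limitations, its novelty and practical value make it an important contribution to the field.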