Understanding how deep neural networks make decisions is crucial for analyzing their behavior and diagnosing failure cases. In computer vision, a common approach to improve interpretability is to assign importance to individual pixels using post-hoc methods. Although they are widely used to explain black-box models, their fidelity to the model's actual reasoning is uncertain due to the lack of reliable evaluation metrics. This limitation motivates an alternative approach, which is to design models whose decision processes are inherently interpretable. To this end, we propose a face similarity metric that breaks down global similarity into contributions from restricted receptive fields. Our method defines the similarity between two face images as the sum of patch-level similarity scores, providing a locally additive explanation without relying on post-hoc analysis. We show that the proposed approach achieves competitive verification performance even with patches as small as 28x28 within 112x112 face images, and surpasses state-of-the-art methods when using 56x56 patches.
This paper proposes a face verification method based on restricted receptive fields, aimed at making the decisions of deep neural networks interpretable. Traditional methods represent an entire face image with a single global feature vector, whereas this work decomposes global similarity into local contributions from restricted receptive fields. The method defines the similarity between two face images as the sum of patch-level similarity scores, which yields a locally additive explanation without relying on post-hoc analysis. Experiments show that the method achieves competitive verification performance even with patches as small as 28×28 in 112×112 face images, and surpasses state-of-the-art methods when using 56×56 patches.
Deep neural networks achieve excellent performance in face recognition tasks, but their decision-making processes lack interpretability, which is a serious concern in high-risk application scenarios.
This paper proposes an alternative: instead of explaining a black-box model with post-hoc analysis, it designs a model whose decision process is interpretable by construction.
Input: Two 112×112 face images A and B
Output: Binary verification decision (same/different identity)
Constraint: Decision process must be interpretable as a combination of local region contributions
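The problem setup above can be illustrated with a minimal Python sketch of a locally additive score. It is not the paper's implementation: the non-overlapping patch grid, the raw-pixel cosine similarity standing in for a learned patch embedding, the function names, and the threshold value are all assumptions made for illustration. What it does show is the structural property the constraint asks for: the global score is exactly the sum of the per-patch scores, so each patch's contribution to the decision can be read off directly.

```python
import math


def extract_patches(image, patch):
    """Split a square image (list of rows) into flattened patch x patch tiles.

    Assumption: tiles do not overlap; the paper's patch layout may differ.
    """
    size = len(image)
    tiles = []
    for top in range(0, size - patch + 1, patch):
        for left in range(0, size - patch + 1, patch):
            tiles.append([image[r][c]
                          for r in range(top, top + patch)
                          for c in range(left, left + patch)])
    return tiles


def cosine(u, v):
    """Cosine similarity of two flat vectors (placeholder for a learned patch model)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def patch_similarities(img_a, img_b, patch=28):
    """One similarity score per pair of corresponding patches."""
    tiles_a = extract_patches(img_a, patch)
    tiles_b = extract_patches(img_b, patch)
    return [cosine(u, v) for u, v in zip(tiles_a, tiles_b)]


def verify(img_a, img_b, patch=28, threshold=8.0):
    """Return (same_identity, total_score, per_patch_scores).

    The threshold is a hypothetical value; in practice it would be tuned
    on validation data.
    """
    scores = patch_similarities(img_a, img_b, patch)
    total = sum(scores)  # global similarity = sum of local contributions
    return total >= threshold, total, scores
```

With 112×112 inputs, this grid gives 16 scores for 28×28 patches and 4 scores for 56×56 patches. The per-patch list returned by `verify` is the explanation itself: no separate attribution step is needed.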
The paper cites 68 related references, primarily covering:
Explainable AI methods (Rudin 2019, Chen et al. 2019)
Face recognition techniques (Deng et al. 2019, Kim et al. 2022)
Deep learning architectures (He et al. 2016)
Evaluation benchmark datasets (Huang et al. 2007, Wu et al. 2024)
Summary: This paper proposes a face verification method based on restricted receptive fields that is interpretable by construction while remaining competitive in verification performance. The approach is most relevant to high-risk applications that require transparent decisions, and it offers a useful direction for explainable AI. Its limitations include computational overhead and limited theoretical analysis, but its novelty and practical value make it an important contribution to the field.