"Show Me You Comply... Without Showing Me Anything": Zero-Knowledge Software Auditing for AI-Enabled Systems

Scaramuzza, Ferreira, Suller et al.

The increasing exploitation of Artificial Intelligence (AI) enabled systems in critical domains has made trustworthiness concerns a paramount showstopper, requiring verifiable accountability, often by regulation (e.g., the EU AI Act). Classical software verification and validation techniques, such as procedural audits, formal methods, or model documentation, are the mechanisms used to achieve this. However, these methods are either expensive or heavily manual and ill-suited for the opaque, "black box" nature of most AI models. An intractable conflict emerges: high auditability and verifiability are required by law, but such transparency conflicts with the need to protect assets being audited (e.g., confidential data and proprietary models), leading to weakened accountability. To address this challenge, this paper introduces ZKMLOps, a novel MLOps verification framework that operationalizes Zero-Knowledge Proofs (ZKPs), cryptographic protocols allowing a prover to convince a verifier that a statement is true without revealing additional information, within Machine-Learning Operations lifecycles. By integrating ZKPs with established software engineering patterns, ZKMLOps provides a modular and repeatable process for generating verifiable cryptographic proof of compliance. We evaluate the framework's practicality through a study of regulatory compliance in financial risk auditing and assess feasibility through an empirical evaluation of top ZKP protocols, analyzing performance trade-offs for ML models of increasing complexity.

Basic Information

Paper ID: 2510.26576
Title: "Show Me You Comply... Without Showing Me Anything": Zero-Knowledge Software Auditing for AI-Enabled Systems
Authors: Filippo Scaramuzza, Renato Cordeiro Ferreira, Tomaz Maia Suller, Giovanni Quattrocchi, Damian Andrew Tamburri, Willem-Jan van den Heuvel
Classification: cs.SE (Software Engineering)
Submission Date: October 30, 2025 (arXiv)
Paper Link: https://arxiv.org/abs/2510.26576

Abstract

With the widespread application of artificial intelligence systems in critical domains, trustworthiness has become a significant barrier, and regulatory requirements (such as the EU AI Act) demand verifiable accountability. Traditional software verification and validation techniques (such as program auditing, formal methods, or model documentation) suffer from high costs, extensive manual operations, and unsuitability for the "black-box" nature of AI models. This paper proposes the ZKMLOps framework, which addresses the contradiction between audit transparency and asset protection by integrating zero-knowledge proofs (ZKPs) into the machine learning operations lifecycle, providing modular and repeatable compliance verification processes.

Research Background and Motivation

Core Problem

The research addresses a fundamental conflict in AI system auditing: legal requirements demand high auditability and verifiability, yet this transparency conflicts with the need to protect audited assets (such as confidential data and proprietary models).

Problem Significance

Increasing Regulatory Pressure: Regulations such as the EU AI Act classify many industrial AI deployments as high-risk, requiring compliance evidence
Growing Critical Applications: AI systems are increasingly deployed in safety-critical domains such as finance, healthcare, and transportation
Inadequacy of Traditional Auditing: Existing software verification techniques have limited effectiveness for opaque AI models with millions of parameters

Limitations of Existing Approaches

Program Auditing: Costly and heavily dependent on manual operations
Formal Methods: Only effective when implementation logic is clear and deterministically modelable
Model Documentation: Cannot handle the "black-box" nature of AI models
Transparency Conflicts: Disclosing artifacts required for auditing may leak intellectual property or personal data

Research Motivation

Inspired by events such as the Volkswagen emissions scandal, the authors recognized the need for a method that provides verifiable compliance proof without disclosing sensitive information. Zero-knowledge proof technology offers a potential solution to this problem.

Core Contributions

Proposes ZKMLOps Framework: A novel architecture, presented as the first to systematically integrate zero-knowledge proofs into the MLOps lifecycle
Practical Validation: Demonstrates the framework's practical value through a regulatory compliance use case in financial risk auditing
Feasibility Assessment: Conducts empirical evaluation of multiple ZKP protocols, analyzing performance trade-offs for ML models of varying complexity
Engineering Implementation: Transforms complex cryptographic procedures into modular, repeatable, and maintainable engineering processes

Methodology Details

Task Definition

Objective: Implement systematic AI system auditing within the MLOps lifecycle, enabling organizations to provide verifiable cryptographic proofs demonstrating system compliance with specific requirements and regulations, while protecting proprietary information and sensitive data.

Inputs: AI models, datasets, audit requirements
Outputs: Zero-knowledge proofs and verification results
Constraints: Protection of intellectual property and data privacy

Model Architecture

Overall Architecture Design

The ZKMLOps framework adopts a Hexagonal Architecture, divided into three main layers:

Methodological Layer: ML system verification lifecycle guiding principles (Components 1-4)
Implementation Layer: Trusted service architecture (Components 5-8)
Stakeholder Layer: Trust stakeholder interfaces (Component 9)

Core Component Functions

1. ML System Verification Lifecycle (Components 1-4)

MLOps Verification Lifecycle Selection: Choose one of four stages based on audit objectives
- Data and preprocessing verification
- Training and offline metrics verification
- Inference verification
- Online metrics verification
Model Selection: Select verification techniques based on technical requirements of deployed models
Protocol Selection: Choose the ZKP protocol most suitable for the application architecture
ZKP Traceability Specification: Generate documentation containing audit objectives, decision trajectories, and selected protocols
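
As a rough illustration of what a ZKP Traceability Specification record might hold, the sketch below serializes the items named above (audit objective, lifecycle stage, decision trajectory, selected protocol) to JSON. The class name, field names, and example values are hypothetical and are not taken from the paper.

```python
import json
from dataclasses import asdict, dataclass, field

# The four lifecycle stages listed in the summary above.
STAGES = (
    "data_and_preprocessing",
    "training_and_offline_metrics",
    "inference",
    "online_metrics",
)


@dataclass
class TraceabilitySpec:
    """Hypothetical record; all field names are illustrative assumptions."""

    audit_objective: str
    lifecycle_stage: str
    selected_protocol: str
    decisions: list = field(default_factory=list)  # the decision trajectory

    def __post_init__(self) -> None:
        if self.lifecycle_stage not in STAGES:
            raise ValueError(f"unknown lifecycle stage: {self.lifecycle_stage}")

    def record_decision(self, step: str, choice: str, rationale: str) -> None:
        # Append one entry to the decision trajectory.
        self.decisions.append(
            {"step": step, "choice": choice, "rationale": rationale}
        )

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


spec = TraceabilitySpec(
    audit_objective="credit risk scores come from the declared approved model",
    lifecycle_stage="inference",
    selected_protocol="ezkl",
)
spec.record_decision("protocol selection", "ezkl", "illustrative rationale only")
print(spec.to_json())
```

Keeping the record as plain data makes it easy to store next to the proof artifacts and to hand to an auditor without exposing the model itself.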

2. Trusted Service Architecture (Components 5-8)

Hexagonal Architecture Core: Implements business logic of audit workflows
Artifact Storage: Manages input and output artifacts during the audit process
ZKP Scripts: Executes specific implementations of different ZKP protocols
Internal State Machine: Coordinates execution of four ZKP steps (setup, key exchange, proof, verification)

Technical Implementation Details

State Machine Design: Employs Orchestration Saga Pattern and State Pattern, decomposing each audit workflow into four fundamental steps:

Setup → Key Exchange → Proof → Verification
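
The four-step sequence can be sketched as a small state machine. This is a minimal illustration, not the paper's implementation: the class and handler names are assumptions, each handler is a placeholder for whatever a real step would do, and the compensating actions that a full Orchestration Saga would run on failure are omitted.

```python
from enum import Enum
from typing import Callable, Dict, List


class Step(Enum):
    SETUP = "setup"
    KEY_EXCHANGE = "key_exchange"
    PROOF = "proof"
    VERIFICATION = "verification"
    DONE = "done"
    FAILED = "failed"


# Fixed order of the four workflow steps.
ORDER = [Step.SETUP, Step.KEY_EXCHANGE, Step.PROOF, Step.VERIFICATION]
Handler = Callable[[dict], None]


class AuditWorkflow:
    """Hypothetical orchestrator: runs one handler per step, in order."""

    def __init__(self, handlers: Dict[Step, Handler]) -> None:
        self.handlers = handlers
        self.state = Step.SETUP
        self.context: dict = {}          # shared data passed between steps
        self.history: List[Step] = []    # steps that completed successfully

    def advance(self) -> Step:
        if self.state in (Step.DONE, Step.FAILED):
            raise RuntimeError("workflow already finished")
        current = self.state
        try:
            self.handlers[current](self.context)
        except Exception as exc:
            # A real saga would trigger compensating actions here.
            self.context["error"] = f"{current.value}: {exc}"
            self.state = Step.FAILED
            return self.state
        self.history.append(current)
        nxt = ORDER.index(current) + 1
        self.state = ORDER[nxt] if nxt < len(ORDER) else Step.DONE
        return self.state

    def run(self) -> Step:
        while self.state not in (Step.DONE, Step.FAILED):
            self.advance()
        return self.state


def make_stub(step: Step) -> Handler:
    # Placeholder handler that only marks the step as done.
    def handler(context: dict) -> None:
        context[step.value] = "ok"
    return handler


workflow = AuditWorkflow({step: make_stub(step) for step in ORDER})
print(workflow.run().value, [s.value for s in workflow.history])
```

Because each call to `advance` performs exactly one step, a caller could persist the state between calls, which suits long-running proof generation.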

Dependency Injection Pattern: Injects required adapters at runtime through configuration files, supporting flexible switching between multiple ZKP protocols.

Anti-Corruption Layer: Implements abstraction of external dependencies using ports and adapters pattern, including:

Routers (inbound ports): REST API interfaces
Interpreters, configuration, storage (outbound ports): Script execution and data management
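
A minimal sketch of how ports, adapters, and configuration-driven dependency injection could fit together is shown below. The port interface, the adapter registry, and the `protocol` configuration key are all assumptions made for illustration; the stub adapter does not call ezkl, Circom, or any other real tool.

```python
from typing import Callable, Dict, Protocol


class ProverPort(Protocol):
    """Hypothetical outbound port the core logic depends on."""

    def prove(self, model_path: str, input_path: str) -> bytes: ...

    def verify(self, proof: bytes) -> bool: ...


class StubAdapter:
    """Placeholder adapter: it does NOT invoke any real ZKP tool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def prove(self, model_path: str, input_path: str) -> bytes:
        return f"{self.name}|{model_path}|{input_path}".encode()

    def verify(self, proof: bytes) -> bool:
        return proof.startswith(self.name.encode() + b"|")


# Registry of adapter factories, keyed by the name used in configuration.
ADAPTERS: Dict[str, Callable[[], ProverPort]] = {
    "ezkl": lambda: StubAdapter("ezkl"),
    "circom": lambda: StubAdapter("circom"),
}


class AuditService:
    """Core logic: knows only the port, never a concrete protocol."""

    def __init__(self, prover: ProverPort) -> None:
        self.prover = prover

    def audit_inference(self, model_path: str, input_path: str) -> bool:
        proof = self.prover.prove(model_path, input_path)
        return self.prover.verify(proof)


def build_service(config: dict) -> AuditService:
    # Inject the adapter selected by configuration at runtime.
    name = config["protocol"]
    if name not in ADAPTERS:
        raise ValueError(f"no adapter registered for protocol: {name}")
    return AuditService(ADAPTERS[name]())


service = build_service({"protocol": "ezkl"})
print(service.audit_inference("model.onnx", "input.json"))
```

Switching protocols then means changing one configuration value, while `AuditService` stays untouched.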

Technical Innovations

Fusion of Cryptography and Software Engineering: First systematic integration of ZKP technology into software engineering lifecycle
Modular Design: Decouples core audit logic from specific ZKP implementations through architectural patterns
Protocol Selection Decision Tree: Provides systematic protocol selection method based on audit objectives, MLOps stages, and model types
Asynchronous Workflow Support: Accommodates computationally intensive proof generation in audit scenarios

Experimental Setup

Evaluation Data

ZKP Protocol Comparison:

ezkl: Supports ONNX format, GPU acceleration
SNARK: Implemented through Circom
STARK: Implemented through Cairo
GKR: Specifically optimized for neural networks

Test Models:

Feedforward Neural Networks (FNN)
Small Convolutional Neural Networks (Small CNN)
MNIST CNN
LeNet5
VGG11 (GKR only)

Evaluation Metrics

Proof Generation Time: Time required to generate zero-knowledge proofs
Verification Time: Time required to verify proofs
Proof Size: Storage space of generated proofs

Experimental Environment

Hardware: 8-core Intel Xeon E5-2698 v4 processor, 32GB RAM
Operating System: Ubuntu 22.04.4 LTS
Statistical Method: Each experimental condition run 10 times with random initialization, computing averages
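
The measurement procedure described above (three metrics, averaged over 10 runs) could be sketched with a harness like the following. The `prove` and `verify` callables are hypothetical stand-ins for a real protocol's prover and verifier, and the stub used here only returns fixed bytes.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(
    prove: Callable[[], bytes],
    verify: Callable[[bytes], bool],
    runs: int = 10,
) -> Dict[str, float]:
    """Average proof time, verification time, and proof size over `runs`."""
    proof_ms, verify_ms, sizes = [], [], []
    for _ in range(runs):
        start = time.perf_counter()
        proof = prove()
        proof_ms.append((time.perf_counter() - start) * 1000.0)

        start = time.perf_counter()
        accepted = verify(proof)
        verify_ms.append((time.perf_counter() - start) * 1000.0)

        if not accepted:
            raise RuntimeError("verifier rejected a proof")
        sizes.append(len(proof))

    return {
        "runs": runs,
        "proof_time_ms": statistics.mean(proof_ms),
        "verification_time_ms": statistics.mean(verify_ms),
        "proof_size_bytes": statistics.mean(sizes),
    }


# Stub prover and verifier, for illustration only.
result = benchmark(lambda: bytes(64), lambda proof: len(proof) == 64)
print(result)
```

Averaging over runs would also be consistent with the fractional proof sizes reported in the results tables, although the summary does not state this explicitly.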

Use Case Validation

Financial Risk Model Compliance Auditing:

Scenario: Financial institution proves to auditing firm that credit risk scores are generated by declared approved models
Requirements: Verify inference correctness without exposing proprietary model parameters
Protocol Selection: ezkl (non-interactive, transparent setup, standard representation, succinctness, quantum-safe)

Experimental Results

Main Results

Feedforward Neural Network (FNN) Performance Comparison:

Protocol	Proof Time (ms)	Verification Time (ms)	Proof Size (bytes)
SNARK	752	555	805.4
STARK	314,998.1	12.11	280,000
ezkl	492.79	9.80	23,958.9

LeNet5 Performance Comparison:

Protocol	Proof Time (ms)	Verification Time (ms)	Proof Size (bytes)
SNARK	18,788.5	611	804.4
GKR	331.99	91.31	45,718.75
ezkl	65,678.21	100.80	767,120.3
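
To read the two tables side by side, the snippet below picks the protocol with the lowest value for each metric. The numbers are copied from the tables above; the dictionary layout and function name are only an illustrative choice.

```python
# (proof time ms, verification time ms, proof size bytes), from the tables above.
RESULTS = {
    "FNN": {
        "SNARK": (752.0, 555.0, 805.4),
        "STARK": (314998.1, 12.11, 280000.0),
        "ezkl": (492.79, 9.80, 23958.9),
    },
    "LeNet5": {
        "SNARK": (18788.5, 611.0, 804.4),
        "GKR": (331.99, 91.31, 45718.75),
        "ezkl": (65678.21, 100.80, 767120.3),
    },
}

METRICS = ("proof_time_ms", "verification_time_ms", "proof_size_bytes")


def best_per_metric(table: dict) -> dict:
    """Return the protocol with the lowest (best) value for each metric."""
    return {
        metric: min(table, key=lambda protocol: table[protocol][i])
        for i, metric in enumerate(METRICS)
    }


for model, table in RESULTS.items():
    print(model, best_per_metric(table))
```

On these two tables, no single protocol wins every metric: the smallest proofs and the fastest proving or verification come from different protocols for each model.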

Key Findings

Model-Dependent Protocol Selection: Optimal ZKP protocols are highly dependent on specific ML models and performance metrics
Significant Performance Trade-offs:
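- ezkl has the fastest proof generation and verification on the simple FNN
- SNARK has the smallest proof size on both FNN and LeNet5, and generates proofs faster than ezkl on the more complex LeNet5
- GKR has the fastest proof generation and verification on LeNet5, a model for which it is specially optimized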
Asynchronous Audit Applicability: ezkl's verification time advantage makes it particularly suitable for asynchronous audit workflows

Practical Validation

The financial use case successfully demonstrates the framework's application in real regulatory environments:

Auditing firms only need to verify keys and proofs
Financial institutions need not disclose any confidential information
The entire process is verifiable and protects intellectual property

Zero-Knowledge Machine Learning (ZKML) Research

Inference Verification: ZEN, vCNN, zkCNN and others focus on zero-knowledge proofs for neural network inference
Training Verification: Recent work extends to training processes and online metrics verification
Trusted AI Applications: ZKAudit, FaaS and others target specific trusted AI scenarios

Advantages of This Work

Systematic Engineering Approach: First to provide a complete MLOps integration framework rather than isolated technical demonstrations
Practical Orientation: Demonstrates feasibility through real use cases and performance evaluation
Modular Design: Supports flexible integration and extension of multiple ZKP protocols

Conclusions and Discussion

Main Conclusions

Technical Feasibility: ZKP technology can be effectively integrated into the MLOps lifecycle, resolving the conflict between audit transparency and privacy protection
Engineering Value: Through application of software engineering patterns, complex cryptographic processes can be transformed into maintainable engineering practices
Practical Validation: The financial audit use case demonstrates the framework's applicability in real regulatory environments

Limitations

External Validity: The framework's applicability in other regulatory domains (such as healthcare, autonomous driving) requires further verification
Evaluation Scope: Primarily focuses on inference verification phase; evaluation of other MLOps stages is relatively limited
Model Scale: Experiments use relatively small models; performance characteristics of large models may differ
Protocol Maturity: Observed performance may reflect the maturity of underlying cryptographic libraries rather than theoretical efficiency

Future Directions

Real-World Validation: Verify framework performance and scalability through industrial case studies
Functional Extension: Implement audit workflows for other trusted AI properties, such as dataset fairness and model robustness
Large-Scale Model Support: Optimize framework to support complex AI systems such as large language models

In-Depth Evaluation

Strengths

Clear Problem Definition: Accurately identifies the fundamental conflict between transparency and privacy protection in AI auditing
Strong Methodological Innovation: First systematic engineering application of ZKP technology to MLOps
Excellent Architecture Design: Appropriate application of software engineering patterns such as hexagonal architecture and state pattern
Comprehensive Experimental Design: Combines theoretical analysis with practical use case validation, performance evaluation with feasibility arguments
High Practical Value: Addresses real regulatory needs with direct application value

Weaknesses

Evaluation Limitations: Primarily focuses on inference verification; support for training, data preprocessing and other stages is insufficient
Scalability Questions: Applicability to large-scale industrial AI systems requires further verification
Missing Cost Analysis: Lacks detailed analysis of computational costs and economic benefits
Insufficient Security Considerations: Discussion of security assumptions of ZKP protocols and potential attack vectors is inadequate

Impact

Academic Contribution: Introduces new research directions to MLOps field, promoting cross-disciplinary fusion of cryptography and software engineering
Practical Value: Provides actionable compliance verification solutions for regulatory bodies and enterprises
Technology Advancement: May promote adoption of ZKP technology in more practical application scenarios

Applicable Scenarios

Regulatory Compliance: AI system auditing in heavily regulated industries such as finance and healthcare
Intellectual Property Protection: Scenarios requiring verification of model performance without disclosing model details
Multi-Party Collaboration: Collaborative scenarios such as federated learning requiring verification of contributions while protecting data privacy
Supply Chain Auditing: AI service providers proving service quality to customers without exposing implementation details

References

The paper cites 72 related references, primarily including:

Foundational zero-knowledge proof theory (Goldreich, Blum, etc.)
ZKML application research (ZEN, zkCNN, ZKAudit, etc.)
Software engineering patterns (Clean Architecture, Design Patterns, etc.)
Trusted AI and MLOps related work (Liu et al., Kreuzberger et al., etc.)

Overall Assessment: This is a high-quality software engineering research paper that successfully combines cutting-edge cryptographic technology with practical engineering needs, providing an innovative solution for AI system auditing. The paper makes significant contributions in technical innovation, practicality, and engineering implementation, with important implications for advancing trustworthy AI development.