"Show Me You Comply... Without Showing Me Anything": Zero-Knowledge Software Auditing for AI-Enabled Systems

Scaramuzza, Ferreira, Suller et al.
The increasing exploitation of Artificial Intelligence (AI) enabled systems in critical domains has made trustworthiness concerns a paramount showstopper, requiring verifiable accountability, often by regulation (e.g., the EU AI Act). Classical software verification and validation techniques, such as procedural audits, formal methods, or model documentation, are the mechanisms used to achieve this. However, these methods are either expensive or heavily manual and ill-suited for the opaque, "black box" nature of most AI models. An intractable conflict emerges: high auditability and verifiability are required by law, but such transparency conflicts with the need to protect assets being audited (e.g., confidential data and proprietary models), leading to weakened accountability. To address this challenge, this paper introduces ZKMLOps, a novel MLOps verification framework that operationalizes Zero-Knowledge Proofs (ZKPs), cryptographic protocols allowing a prover to convince a verifier that a statement is true without revealing additional information, within Machine-Learning Operations lifecycles. By integrating ZKPs with established software engineering patterns, ZKMLOps provides a modular and repeatable process for generating verifiable cryptographic proof of compliance. We evaluate the framework's practicality through a study of regulatory compliance in financial risk auditing and assess feasibility through an empirical evaluation of top ZKP protocols, analyzing performance trade-offs for ML models of increasing complexity.

Basic Information

  • Paper ID: 2510.26576
  • Title: "Show Me You Comply... Without Showing Me Anything": Zero-Knowledge Software Auditing for AI-Enabled Systems
  • Authors: Filippo Scaramuzza, Renato Cordeiro Ferreira, Tomaz Maia Suller, Giovanni Quattrocchi, Damian Andrew Tamburri, Willem-Jan van den Heuvel
  • Classification: cs.SE (Software Engineering)
  • Submission Date: October 30, 2025 (arXiv)
  • Paper Link: https://arxiv.org/abs/2510.26576

Abstract

As artificial intelligence systems spread into critical domains, trustworthiness has become a significant barrier to adoption, and regulations such as the EU AI Act demand verifiable accountability. Traditional software verification and validation techniques (procedural audits, formal methods, and model documentation) are costly, heavily manual, and poorly suited to the "black-box" nature of AI models. This paper proposes the ZKMLOps framework, which addresses the tension between audit transparency and asset protection by integrating zero-knowledge proofs (ZKPs) into the machine learning operations lifecycle, providing a modular and repeatable compliance verification process.

Research Background and Motivation

Core Problem

The research addresses a fundamental conflict in AI system auditing: legal requirements demand high auditability and verifiability, yet this transparency conflicts with the need to protect audited assets (such as confidential data and proprietary models).

Problem Significance

  1. Increasing Regulatory Pressure: Regulations such as the EU AI Act classify many industrial AI deployments as high-risk, requiring compliance evidence
  2. Growing Critical Applications: AI systems are increasingly deployed in safety-critical domains such as finance, healthcare, and transportation
  3. Inadequacy of Traditional Auditing: Existing software verification techniques have limited effectiveness for opaque AI models with millions of parameters

Limitations of Existing Approaches

  1. Procedural Audits: Costly and heavily dependent on manual operations
  2. Formal Methods: Only effective when implementation logic is clear and deterministically modelable
  3. Model Documentation: Cannot handle the "black-box" nature of AI models
  4. Transparency Conflicts: Disclosing artifacts required for auditing may leak intellectual property or personal data

Research Motivation

Inspired by events such as the Volkswagen emissions scandal, the authors recognized the need for a method that provides verifiable compliance proof without disclosing sensitive information. Zero-knowledge proof technology offers a potential solution to this problem.

Core Contributions

  1. Proposes ZKMLOps Framework: A novel architecture, presented as the first to systematically integrate zero-knowledge proofs into the MLOps lifecycle
  2. Practical Validation: Demonstrates the framework's practical value through a regulatory compliance use case in financial risk auditing
  3. Feasibility Assessment: Conducts empirical evaluation of multiple ZKP protocols, analyzing performance trade-offs for ML models of varying complexity
  4. Engineering Implementation: Transforms complex cryptographic procedures into modular, repeatable, and maintainable engineering processes

Methodology Details

Task Definition

Objective: Implement systematic AI system auditing within the MLOps lifecycle, enabling organizations to provide verifiable cryptographic proofs demonstrating system compliance with specific requirements and regulations, while protecting proprietary information and sensitive data.

Inputs: AI models, datasets, audit requirements

Outputs: Zero-knowledge proofs and verification results

Constraints: Protection of intellectual property and data privacy

Model Architecture

Overall Architecture Design

The ZKMLOps framework is organized into three main layers, with a Hexagonal Architecture at the core of its implementation layer:

  1. Methodological Layer: ML system verification lifecycle guiding principles (Components 1-4)
  2. Implementation Layer: Trusted service architecture (Components 5-8)
  3. Stakeholder Layer: Trust stakeholder interfaces (Component 9)

Core Component Functions

1. ML System Verification Lifecycle (Components 1-4)

  • MLOps Verification Lifecycle Selection: Choose one of four stages based on audit objectives
    • Data and preprocessing verification
    • Training and offline metrics verification
    • Inference verification
    • Online metrics verification
  • Model Selection: Select verification techniques based on technical requirements of deployed models
  • Protocol Selection: Choose the ZKP protocol most suitable for the application architecture
  • ZKP Traceability Specification: Generate documentation containing audit objectives, decision trajectories, and selected protocols
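
The four selection steps above end in a traceability document. As a rough illustration only, such a record could be modeled as a small serializable structure. The class name, field names, and stage identifiers below are assumptions made for this sketch and are not taken from the paper.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

# Hypothetical identifiers for the four lifecycle stages listed above.
STAGES = (
    "data_preprocessing",
    "training_offline_metrics",
    "inference",
    "online_metrics",
)


@dataclass
class TraceabilitySpec:
    """Records what is being audited and why each choice was made."""
    audit_objective: str
    stage: str
    model_type: str
    protocol: str
    decisions: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject stages outside the four-stage lifecycle.
        if self.stage not in STAGES:
            raise ValueError(f"unknown stage: {self.stage}")

    def to_json(self) -> str:
        # Sorted keys keep the document stable across runs.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


spec = TraceabilitySpec(
    audit_objective="scores come from the declared model",
    stage="inference",
    model_type="fnn",
    protocol="ezkl",
    decisions=["stage: inference", "protocol: ezkl"],
)
document = spec.to_json()
```

Keeping the decision trail as plain strings is a simplification; a production record would likely need richer, versioned entries.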

2. Trusted Service Architecture (Components 5-8)

  • Hexagonal Architecture Core: Implements business logic of audit workflows
  • Artifact Storage: Manages input and output artifacts during the audit process
  • ZKP Scripts: Executes specific implementations of different ZKP protocols
  • Internal State Machine: Coordinates execution of four ZKP steps (setup, key exchange, proof, verification)

Technical Implementation Details

State Machine Design: Employs Orchestration Saga Pattern and State Pattern, decomposing each audit workflow into four fundamental steps:

Setup → Key Exchange → Proof → Verification

Dependency Injection Pattern: Injects required adapters at runtime through configuration files, supporting flexible switching between multiple ZKP protocols.
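
A minimal sketch of configuration-driven injection follows, assuming a JSON configuration with a hypothetical `zkp_protocol` key. The adapter functions are stand-ins that return strings; they do not call any real ezkl or Circom API.

```python
import json
from typing import Callable, Dict

ProverAdapter = Callable[[str], str]


# Hypothetical adapters: each stands in for a wrapper around one protocol.
def ezkl_adapter(model_path: str) -> str:
    return f"ezkl proof for {model_path}"


def circom_adapter(model_path: str) -> str:
    return f"circom proof for {model_path}"


# Registry mapping configuration values to adapter implementations.
ADAPTERS: Dict[str, ProverAdapter] = {
    "ezkl": ezkl_adapter,
    "circom": circom_adapter,
}


class AuditService:
    """Core logic depends only on the injected callable, not on a protocol."""

    def __init__(self, prover: ProverAdapter) -> None:
        self._prover = prover

    def audit(self, model_path: str) -> str:
        return self._prover(model_path)


def build_service(config_text: str) -> AuditService:
    """Select the adapter at runtime from the configuration text."""
    config = json.loads(config_text)
    name = config["zkp_protocol"]
    if name not in ADAPTERS:
        raise ValueError(f"unsupported protocol: {name}")
    return AuditService(ADAPTERS[name])


service = build_service('{"zkp_protocol": "ezkl"}')
outcome = service.audit("model.onnx")
```

Switching protocols then means changing one configuration value rather than editing `AuditService`.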

Anti-Corruption Layer: Implements abstraction of external dependencies using ports and adapters pattern, including:

  • Routers (inbound ports): REST API interfaces
  • Interpreters, configuration, storage (outbound ports): Script execution and data management
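
The outbound ports can be pictured as small interfaces that the core calls without knowing what sits behind them. The sketch below uses Python's `typing.Protocol`; the port names, method signatures, and in-memory adapters are assumptions chosen for illustration.

```python
from typing import Dict, Protocol


class StoragePort(Protocol):
    """Outbound port for artifact storage."""

    def load(self, key: str) -> bytes: ...

    def save(self, key: str, data: bytes) -> None: ...


class InterpreterPort(Protocol):
    """Outbound port for running a ZKP script on a payload."""

    def run(self, script: str, payload: bytes) -> bytes: ...


class InMemoryStorage:
    """Test adapter; a real one might wrap a file system or object store."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def load(self, key: str) -> bytes:
        return self._items[key]

    def save(self, key: str, data: bytes) -> None:
        self._items[key] = data


class FakeInterpreter:
    """Test adapter that only tags the payload instead of proving anything."""

    def run(self, script: str, payload: bytes) -> bytes:
        return script.encode() + b":" + payload


def generate_proof(storage: StoragePort, interpreter: InterpreterPort,
                   model_key: str) -> str:
    """Core logic: load the model, run the proof script, store the result."""
    model = storage.load(model_key)
    proof = interpreter.run("prove", model)
    proof_key = f"{model_key}.proof"
    storage.save(proof_key, proof)
    return proof_key


storage = InMemoryStorage()
storage.save("credit-model", b"weights")
stored_key = generate_proof(storage, FakeInterpreter(), "credit-model")
```

Because `generate_proof` only sees the two ports, the same core function works with any storage backend or script runner that matches the interface.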

Technical Innovations

  1. Fusion of Cryptography and Software Engineering: First systematic integration of ZKP technology into software engineering lifecycle
  2. Modular Design: Decouples core audit logic from specific ZKP implementations through architectural patterns
  3. Protocol Selection Decision Tree: Provides systematic protocol selection method based on audit objectives, MLOps stages, and model types
  4. Asynchronous Workflow Support: Accommodates computationally intensive proof generation in audit scenarios
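
To make the idea of a protocol selection decision tree concrete, here is a toy version. The rules are illustrative assumptions loosely based on properties mentioned in this summary (ezkl accepting ONNX models, SNARK proofs being small, GKR targeting neural networks); they are not the paper's actual decision tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditProfile:
    stage: str                  # lifecycle stage, e.g. "inference"
    model_format: str           # e.g. "onnx"
    is_neural_network: bool
    needs_smallest_proof: bool


def select_protocol(profile: AuditProfile) -> str:
    """Walk a fixed sequence of yes/no questions; the ordering is assumed."""
    if profile.stage != "inference":
        raise NotImplementedError("this sketch only covers inference")
    if profile.needs_smallest_proof:
        return "snark"
    if profile.model_format == "onnx":
        return "ezkl"
    if profile.is_neural_network:
        return "gkr"
    return "stark"


choice = select_protocol(
    AuditProfile(stage="inference", model_format="onnx",
                 is_neural_network=True, needs_smallest_proof=False)
)
```

A real selection procedure would also weigh factors such as setup assumptions and interactivity, which this sketch leaves out.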

Experimental Setup

Evaluation Data

ZKP Protocol Comparison:

  • ezkl: Supports ONNX format, GPU acceleration
  • SNARK: Implemented through Circom
  • STARK: Implemented through Cairo
  • GKR: Specifically optimized for neural networks

Test Models:

  • Feedforward Neural Networks (FNN)
  • Small Convolutional Neural Networks (Small CNN)
  • MNIST CNN
  • LeNet5
  • VGG11 (GKR only)

Evaluation Metrics

  1. Proof Generation Time: Time required to generate zero-knowledge proofs
  2. Verification Time: Time required to verify proofs
  3. Proof Size: Storage space of generated proofs

Experimental Environment

  • Hardware: 8-core Intel Xeon E5-2698 v4 processor, 32GB RAM
  • Operating System: Ubuntu 22.04.4 LTS
  • Statistical Method: Each experimental condition was run 10 times with random initialization, and the results were averaged
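
The three metrics and the repeated-run averaging can be sketched as a small harness. This is an assumed measurement loop, not the authors' benchmark code: `prove` and `verify` are hypothetical callables that would wrap a real protocol, and proof size is taken as the byte length of the returned proof.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(prove: Callable[[], bytes],
              verify: Callable[[bytes], bool],
              runs: int = 10) -> Dict[str, float]:
    """Average proof time, verification time, and proof size over `runs`."""
    prove_ms: List[float] = []
    verify_ms: List[float] = []
    sizes: List[int] = []
    for _ in range(runs):
        start = time.perf_counter()
        proof = prove()
        prove_ms.append((time.perf_counter() - start) * 1000)

        start = time.perf_counter()
        accepted = verify(proof)
        verify_ms.append((time.perf_counter() - start) * 1000)

        if not accepted:
            raise RuntimeError("verification failed during benchmark")
        sizes.append(len(proof))
    return {
        "proof_time_ms": statistics.mean(prove_ms),
        "verification_time_ms": statistics.mean(verify_ms),
        "proof_size_bytes": statistics.mean(sizes),
    }


# Stand-in prover and verifier so the harness can run without ZKP tooling.
summary = benchmark(lambda: b"\x00" * 128,
                    lambda proof: len(proof) == 128,
                    runs=3)
```

`time.perf_counter` measures wall-clock time, so results from a harness like this depend on machine load as well as on the protocol.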

Use Case Validation

Financial Risk Model Compliance Auditing:

  • Scenario: A financial institution proves to an auditing firm that its credit risk scores are generated by the declared, approved model
  • Requirements: Verify inference correctness without exposing proprietary model parameters
  • Protocol Selection: ezkl (non-interactive, transparent setup, standard representation, succinctness, quantum-safe)
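
One way to picture the "declared approved model" requirement is a binding check on the auditor's side. The sketch below is an assumption-laden illustration: it uses a salted SHA-256 digest as a stand-in commitment and a boolean in place of real proof verification. It is not zero-knowledge by itself; in a real deployment the proof system would have to enforce this binding inside the proof. The function names and the `model_commitment` key are hypothetical.

```python
import hashlib
import hmac


def model_commitment(model_bytes: bytes, salt: bytes) -> str:
    """Salted digest the institution registers for the approved model."""
    return hashlib.sha256(salt + model_bytes).hexdigest()


def auditor_accepts(approved_commitment: str,
                    proof_public_inputs: dict,
                    proof_valid: bool) -> bool:
    """Accept only a valid proof that references the approved commitment."""
    claimed = proof_public_inputs.get("model_commitment", "")
    # Constant-time comparison of the two hex strings.
    same_model = hmac.compare_digest(claimed, approved_commitment)
    return proof_valid and same_model


approved = model_commitment(b"approved-weights", b"random-salt")
accepted = auditor_accepts(approved, {"model_commitment": approved}, True)
```

The auditor sees only the digest and the proof outcome, never the weights, which mirrors the requirement of verifying inference without exposing model parameters.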

Experimental Results

Main Results

Feedforward Neural Network (FNN) Performance Comparison:

Protocol   Proof Time (ms)   Verification Time (ms)   Proof Size (bytes)
SNARK      752               555                      805.4
STARK      314,998.1         12.11                    280,000
ezkl       492.79            9.80                     23,958.9

LeNet5 Performance Comparison:

Protocol   Proof Time (ms)   Verification Time (ms)   Proof Size (bytes)
SNARK      18,788.5          611                      804.4
GKR        331.99            91.31                    45,718.75
ezkl       65,678.21         100.80                   767,120.3

Key Findings

  1. Model-Dependent Protocol Selection: Optimal ZKP protocols are highly dependent on specific ML models and performance metrics
  2. Significant Performance Trade-offs:
    • ezkl performs best on simple models
    • SNARK achieves fastest proof generation and smallest proof size on complex models
    • GKR excels on specially optimized models (LeNet5)
  3. Asynchronous Audit Applicability: ezkl's verification time advantage makes it particularly suitable for asynchronous audit workflows

Practical Validation

The financial use case successfully demonstrates the framework's application in real regulatory environments:

  • Auditing firms only need to verify keys and proofs
  • Financial institutions need not disclose any confidential information
  • The entire process is verifiable and protects intellectual property

Zero-Knowledge Machine Learning (ZKML) Research

Inference Verification: ZEN, vCNN, zkCNN and others focus on zero-knowledge proofs for neural network inference

Training Verification: Recent work extends to training processes and online metrics verification

Trusted AI Applications: ZKAudit, FaaS and others target specific trusted AI scenarios

Advantages of This Work

  1. Systematic Engineering Approach: First to provide a complete MLOps integration framework rather than isolated technical demonstrations
  2. Practical Orientation: Demonstrates feasibility through real use cases and performance evaluation
  3. Modular Design: Supports flexible integration and extension of multiple ZKP protocols

Conclusions and Discussion

Main Conclusions

  1. Technical Feasibility: ZKP technology can be effectively integrated into the MLOps lifecycle, resolving the conflict between audit transparency and privacy protection
  2. Engineering Value: Through application of software engineering patterns, complex cryptographic processes can be transformed into maintainable engineering practices
  3. Practical Validation: The financial audit use case demonstrates the framework's applicability in real regulatory environments

Limitations

  1. External Validity: The framework's applicability in other regulatory domains (such as healthcare, autonomous driving) requires further verification
  2. Evaluation Scope: Primarily focuses on inference verification phase; evaluation of other MLOps stages is relatively limited
  3. Model Scale: Experiments use relatively small models; performance characteristics of large models may differ
  4. Protocol Maturity: Observed performance may reflect the maturity of underlying cryptographic libraries rather than theoretical efficiency

Future Directions

  1. Real-World Validation: Verify framework performance and scalability through industrial case studies
  2. Functional Extension: Implement audit workflows for other trusted AI properties, such as dataset fairness and model robustness
  3. Large-Scale Model Support: Optimize framework to support complex AI systems such as large language models

In-Depth Evaluation

Strengths

  1. Clear Problem Definition: Accurately identifies the fundamental conflict between transparency and privacy protection in AI auditing
  2. Strong Methodological Innovation: First systematic engineering application of ZKP technology to MLOps
  3. Excellent Architecture Design: Appropriate application of software engineering patterns such as hexagonal architecture and state pattern
  4. Comprehensive Experimental Design: Combines theoretical analysis with practical use case validation, performance evaluation with feasibility arguments
  5. High Practical Value: Addresses real regulatory needs with direct application value

Weaknesses

  1. Evaluation Limitations: Primarily focuses on inference verification; support for training, data preprocessing and other stages is insufficient
  2. Scalability Questions: Applicability to large-scale industrial AI systems requires further verification
  3. Missing Cost Analysis: Lacks detailed analysis of computational costs and economic benefits
  4. Insufficient Security Considerations: Discussion of security assumptions of ZKP protocols and potential attack vectors is inadequate

Impact

  1. Academic Contribution: Introduces new research directions to MLOps field, promoting cross-disciplinary fusion of cryptography and software engineering
  2. Practical Value: Provides actionable compliance verification solutions for regulatory bodies and enterprises
  3. Technology Advancement: May promote adoption of ZKP technology in more practical application scenarios

Applicable Scenarios

  1. Regulatory Compliance: AI system auditing in heavily regulated industries such as finance and healthcare
  2. Intellectual Property Protection: Scenarios requiring verification of model performance without disclosing model details
  3. Multi-Party Collaboration: Collaborative scenarios such as federated learning requiring verification of contributions while protecting data privacy
  4. Supply Chain Auditing: AI service providers proving service quality to customers without exposing implementation details

References

The paper cites 72 related references, primarily including:

  • Foundational zero-knowledge proof theory (Goldreich, Blum, etc.)
  • ZKML application research (ZEN, zkCNN, ZKAudit, etc.)
  • Software engineering patterns (Clean Architecture, Design Patterns, etc.)
  • Trusted AI and MLOps related work (Liu et al., Kreuzberger et al., etc.)

Overall Assessment: This is a high-quality software engineering research paper that successfully combines cutting-edge cryptographic technology with practical engineering needs, providing an innovative solution for AI system auditing. The paper makes significant contributions in technical innovation, practicality, and engineering implementation, with important implications for advancing trustworthy AI development.