2025-11-15T18:46:12.059559

A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography

Rattanavipanon, Suaboot, Werapun
Quantum computing poses a significant global threat to today's security mechanisms. As a result, security experts and public sectors have issued guidelines to help organizations migrate their software to post-quantum cryptography (PQC). Despite these efforts, there is a lack of (semi-)automatic tools to support this transition especially when software is used and deployed as binary executables. To address this gap, in this work, we first propose a set of requirements necessary for a tool to detect quantum-vulnerable software executables. Following these requirements, we introduce QED: a toolchain for Quantum-vulnerable Executable Detection. QED uses a three-phase approach to identify quantum-vulnerable dependencies in a given set of executables, from file-level to API-level, and finally, precise identification of a static trace that triggers a quantum-vulnerable API. We evaluate QED on both a synthetic dataset with four cryptography libraries and a real-world dataset with over 200 software executables. The results demonstrate that: (1) QED discerns quantum-vulnerable from quantum-safe executables with 100% accuracy in the synthetic dataset; (2) QED is practical and scalable, completing analyses on average in less than 4 seconds per real-world executable; and (3) QED reduces the manual workload required by analysts to identify quantum-vulnerable executables in the real-world dataset by more than 90%. We hope that QED can become a crucial tool to facilitate the transition to PQC, particularly for small and medium-sized businesses with limited resources.
academic

A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography

Basic Information

  • Paper ID: 2409.07852
  • Title: A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography
  • Authors: Norrathep Rattanavipanon, Jakapan Suaboot, Warodom Werapun (Prince of Songkla University)
  • Classification: cs.CR (Cryptography and Security)
  • Publication Status: Submitted to IEEE ACCESS Journal
  • Paper Link: https://arxiv.org/abs/2409.07852

Abstract

Quantum computing poses a significant global threat to current security mechanisms. Although security experts and public sectors have released guidelines to assist organizations in migrating software to post-quantum cryptography (PQC), there is a lack of (semi)automated tools supporting such migration, particularly when software is deployed as binary executables. To address this issue, this paper first proposes necessary requirements for tools detecting quantum-vulnerable software executables. Based on these requirements, QED (Quantum-vulnerable Executable Detection) toolchain is introduced. QED employs a three-stage approach to identify quantum-vulnerable dependencies within a given set of executables, progressing from file-level to API-level, ultimately identifying static traces that trigger quantum-vulnerable APIs with precision. Evaluation results demonstrate that: (1) QED achieves 100% accuracy in distinguishing quantum-vulnerable and quantum-safe executables on synthetic datasets; (2) QED is practical and scalable, completing analysis of real-world executables in less than 4 seconds on average; (3) QED reduces manual effort required by analysts to identify quantum-vulnerable executables by over 90%.

Research Background and Motivation

Problem Definition

With rapid development of quantum computing technology, advancing from 2 qubits in 1998 to over 1000 qubits today, experts predict that large-scale functional quantum computers will be commercialized within the next two decades. Quantum computers can break currently widespread public-key cryptographic systems, such as RSA (requiring 4098 logical qubits) and elliptic curve cryptography (requiring 2330 logical qubits).

Significance

Global awareness of quantum attack threats continues to increase. Institutions such as NIST recommend that organizations establish quantum-ready teams to prepare for migrating software systems to post-quantum cryptography. This includes:

  1. Creating cryptographic inventories to assess cryptographic usage within organizations
  2. Conducting risk assessments based on these inventories

Limitations of Existing Approaches

  1. Lack of Specialized Tools: Currently, no (semi)automated tools are specifically designed to assist PQC migration tasks
  2. Manual Analysis Burden: Analysts must rely on various scattered tools and manual analysis to identify quantum-vulnerable software systems
  3. Binary Analysis Challenges: Analysts typically lack access to source code and must conduct PQC migration based on program binaries
  4. Cost Issues: Requires advanced binary analysis knowledge, increasing budget, time, and human resource costs

Research Motivation

Addressing these challenges, particularly the resource constraints faced by small and medium enterprises (SMEs) in conducting PQC migration, this paper aims to develop an automated tool to alleviate analyst workload.

Core Contributions

  1. Requirements Specification: First systematically establishes requirements specifications for tools assisting PQC migration of software executables
  2. QED Toolchain: Designs and implements the QED toolchain meeting the proposed requirements, with open-source code publicly released
  3. Empirical Validation: Validates QED's accuracy and efficiency on synthetic and real-world datasets, achieving 100% true positive rate and reducing manual effort by over 90%
  4. Practical Value: Provides critical PQC migration assistance tool for resource-limited SMEs

Methodology Details

Task Definition

Given a set of software executables, QED's objective is to identify quantum-vulnerable (QV) executables. A software executable is defined as QV if and only if there exists at least one possible execution path from its entry point (main function) to a cryptographic library API implementing QV algorithms (such as RSA, Diffie-Hellman, elliptic curve digital signature).

Tool Requirements (R1-R5)

  • R1 Dynamic Linking: Must identify executables using QV APIs through dynamic linking
  • R2 Binary-Level Analysis: Does not depend on source code availability
  • R3 Static Features: Uses only static features without requiring runtime execution traces
  • R4 Scalability: Supports analysis of large numbers of software executables within reasonable timeframes
  • R5 Validity: Produces no false negatives, tolerating minimal false positives

Model Architecture

QED employs a three-stage progressive analysis architecture:

Stage One: File-Level Dependency Analysis (P1)

Objective: Identify executables with dependencies on QV cryptographic libraries

Method:

  1. Construct software dependency graph G₁ = (V₁, E₁), where V₁ is the file set and E₁ represents direct dependencies
  2. Discover all dependencies through depth-first search
  3. Locate QV cryptographic libraries in V₁
  4. Prune nodes without dependencies to cryptographic libraries

Output: File-level dependency paths EV₁

Stage Two: API-Level Dependency Analysis (P2)

Objective: Reduce false positives from P1 by analyzing API-level dependencies

Method:

  1. Construct API dependency graph G₂ = (V₂, E₂), where E₂ contains triples (n₁, n₂, apis)
  2. Check whether predecessor nodes contain function calls to QV APIs
  3. Remove edges not containing QV API calls
  4. Embed API-level dependency information for each edge

Output: Dependency paths EV₂ with QV API information

Stage Three: Static Trace Analysis (P3)

Objective: Precisely identify executables conforming to QV definition

Method:

  1. Construct static call graphs for reachability analysis
  2. Verify execution paths from executable entry points to QV APIs
  3. Support normal and conservative modes
    • Normal mode: Missing execution traces directly indicate non-QV
    • Conservative mode: Treat missing traces as potential false negatives

Output: Static execution traces EV₃

Technical Innovations

  1. Progressive Analysis Strategy: Three-stage analysis from coarse-grained to fine-grained, balancing speed and accuracy
  2. API Name Information Utilization: Detects cryptographic usage based on API name information, avoiding false negatives from compiler optimizations
  3. Dynamic Linking Support: Specifically handles scenarios where cryptographic libraries are used through dynamic linking
  4. Flexible Analysis Modes: Provides both normal and conservative modes, allowing analysts to choose based on requirements

Experimental Setup

Datasets

Synthetic Dataset

  • Cryptographic Libraries: OpenSSL v1.1.1, OpenSSL v3.3.1, MbedTLS v2.28.8, wolfSSL v5.7.2
  • Cryptographic Primitives: SHA-512, AES-256, Diffie-Hellman, RSA, ECDSA (latter three are QV)
  • Direct Dependencies Set: 20 executables (12 QV, 8 non-QV)
  • Indirect Dependencies Set: 20 executables (12 QV, 8 non-QV)
  • Total: 40 executables (24 QV, 16 non-QV)

Real-World Dataset

  • Coreutils: 109 non-cryptographic software (non-QV)
  • UnixBench: 18 performance benchmark tools (non-QV)
  • Network: 13 network utility programs (7 QV, 6 non-QV)
  • tpm2-tools: 86 TPM functionality implementation tools
  • Total: 226 executables, average size 248KB

Evaluation Metrics

  • True Positive Rate (TPR): Proportion of correctly identified QV executables
  • True Negative Rate (TNR): Proportion of correctly identified non-QV executables
  • Runtime: Time required for each stage of analysis
  • Memory Usage: Peak RAM consumption
  • Manual Effort Reduction: Number of files requiring further manual review

Implementation Details

  • Programming Language: Python3 (approximately 800 lines of code)
  • Dependencies: pyelftools (ELF file processing), NetworkX (graph operations), angr (static call graph construction)
  • Experimental Environment: Ubuntu 20.04, Intel i5-8520U @ 1.6GHz, 24GB RAM

Experimental Results

Main Results

Synthetic Dataset Accuracy

StageDirect DependenciesIndirect DependenciesOverall
P1TPR: 100%, TNR: 0%TPR: 100%, TNR: 0%TPR: 100%, TNR: 0%
P1+P2TPR: 100%, TNR: 100%TPR: 100%, TNR: 0%TPR: 100%, TNR: 50%
P1+P2+P3TPR: 100%, TNR: 100%TPR: 100%, TNR: 100%TPR: 100%, TNR: 100%

Real-World Dataset Performance

  • Average Processing Time: Approximately 4 seconds per executable
  • Total Processing Time: Approximately 15 minutes for 226 executables
  • Memory Usage: P1 and P2 approximately 180MB, P3 approximately 3-5GB
  • Manual Effort Reduction: Reduced from 226 to 20 (91.15% reduction)

Ablation Study

  • P1 Stage: Rapid preliminary screening but high false positive rate
  • P2 Stage: Significantly reduces false positives, particularly for direct dependency scenarios
  • P3 Stage: Further improves precision but with higher computational overhead

Case Analysis

  • False Negative Case: curl program fails static call graph analysis due to indirect calls (function pointers)
  • False Positive Elimination: sftp and scp programs, though linked to OpenSSL, only use non-QV APIs

Experimental Findings

  1. Progressive Analysis Effectiveness: Three-stage design successfully balances speed and accuracy
  2. Static Analysis Limitations: Indirect calls remain a challenge for static analysis
  3. Practical Validation: Tool performs well in real environments, significantly reducing manual effort

Cryptographic Discovery Tools

Existing tools fall into two categories: static and dynamic:

  • Static Methods: Based on static features such as initialization vectors, lookup tables, assembly instruction sequences
  • Dynamic Methods: Based on runtime information such as loop structures, input-output relationships

PQC Migration Research

  • Migration Process: Four-step process of diagnosis → planning → execution → maintenance
  • Cryptographic Agility: System's ability to adapt to different cryptographic algorithms
  • Application Scenarios: Compiled binaries, external system assets, network communication layers

Advantages of This Work

  • Specifically targets quantum vulnerability detection
  • Supports dynamic linking scenarios
  • Does not require runtime execution or heavyweight symbolic analysis
  • Provides end-to-end practical tool

Conclusions and Discussion

Main Conclusions

  1. QED successfully satisfies all five design requirements (R1-R5)
  2. Achieves 100% accuracy on synthetic datasets
  3. Significantly reduces manual effort on real-world datasets
  4. Tool demonstrates good scalability and practicality

Limitations

  1. Indirect Call Detection: Static analysis cannot detect QV API usage through function pointers
  2. Linking Method Constraints: Assumes executables use cryptographic libraries through dynamic linking
  3. Dead Code Issues: May flag QV API calls that are never executed as positive

Future Directions

  1. Lightweight Dynamic Analysis: Combine dynamic analysis to identify indirect calls
  2. Static Linking Support: Extend detection to directly implemented or statically linked cryptographic functionality
  3. Automated Patching: Extend from identification to (semi)automatic patching of quantum-vulnerable usage

In-Depth Evaluation

Strengths

  1. Problem Importance: Addresses practical pain points in PQC migration
  2. Systematic Approach: Complete workflow from requirements analysis to tool implementation
  3. Technical Innovation: Well-designed three-stage progressive analysis strategy
  4. Practical Value: Open-source tool has significant value for SMEs
  5. Comprehensive Experiments: Thorough validation on both synthetic and real-world datasets

Weaknesses

  1. Platform Limitations: Currently supports only Linux ELF format, limited extensibility
  2. Language Constraints: Primarily targets C/C++ programs
  3. Static Analysis Limitations: Insufficient analysis of indirect calls and dead code
  4. Evaluation Scope: Some programs in real datasets lack ground truth

Impact

  1. Academic Contribution: Fills research gap in PQC migration tools
  2. Practical Value: Provides practical tool for organizational PQC migration
  3. Reproducibility: Open-source code and datasets support result reproduction
  4. Promotion Potential: Methodology extensible to other platforms and languages

Applicable Scenarios

  • Enterprise PQC migration risk assessment
  • Software supply chain security audits
  • Cryptographic dependency analysis
  • Security compliance checking

References

The paper cites 42 relevant references covering quantum computing development, PQC migration guidelines, cryptographic detection tools, binary analysis, and other important works across multiple domains, providing a solid theoretical foundation for the research.


Overall Assessment: This paper addresses the important problem of post-quantum cryptography migration by proposing a systematic solution. The QED toolchain is well-designed with comprehensive experimental validation, possessing significant academic value and practical significance. Despite some technical limitations, it makes important contributions to the PQC migration field, particularly providing feasible solutions for resource-limited SMEs.