2025-11-15T18:46:12.059559

A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography

Rattanavipanon, Suaboot, Werapun

Quantum computing poses a significant global threat to today's security mechanisms. As a result, security experts and public sectors have issued guidelines to help organizations migrate their software to post-quantum cryptography (PQC). Despite these efforts, there is a lack of (semi-)automatic tools to support this transition especially when software is used and deployed as binary executables. To address this gap, in this work, we first propose a set of requirements necessary for a tool to detect quantum-vulnerable software executables. Following these requirements, we introduce QED: a toolchain for Quantum-vulnerable Executable Detection. QED uses a three-phase approach to identify quantum-vulnerable dependencies in a given set of executables, from file-level to API-level, and finally, precise identification of a static trace that triggers a quantum-vulnerable API. We evaluate QED on both a synthetic dataset with four cryptography libraries and a real-world dataset with over 200 software executables. The results demonstrate that: (1) QED discerns quantum-vulnerable from quantum-safe executables with 100% accuracy in the synthetic dataset; (2) QED is practical and scalable, completing analyses on average in less than 4 seconds per real-world executable; and (3) QED reduces the manual workload required by analysts to identify quantum-vulnerable executables in the real-world dataset by more than 90%. We hope that QED can become a crucial tool to facilitate the transition to PQC, particularly for small and medium-sized businesses with limited resources.

academic

A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography

Basic Information

Paper ID: 2409.07852
Title: A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography
Authors: Norrathep Rattanavipanon, Jakapan Suaboot, Warodom Werapun (Prince of Songkla University)
Classification: cs.CR (Cryptography and Security)
Publication Status: Submitted to IEEE ACCESS Journal
Paper Link: https://arxiv.org/abs/2409.07852

Abstract

Quantum computing poses a significant global threat to current security mechanisms. Although security experts and public sectors have released guidelines to assist organizations in migrating software to post-quantum cryptography (PQC), there is a lack of (semi)automated tools supporting such migration, particularly when software is deployed as binary executables. To address this issue, this paper first proposes necessary requirements for tools detecting quantum-vulnerable software executables. Based on these requirements, QED (Quantum-vulnerable Executable Detection) toolchain is introduced. QED employs a three-stage approach to identify quantum-vulnerable dependencies within a given set of executables, progressing from file-level to API-level, ultimately identifying static traces that trigger quantum-vulnerable APIs with precision. Evaluation results demonstrate that: (1) QED achieves 100% accuracy in distinguishing quantum-vulnerable and quantum-safe executables on synthetic datasets; (2) QED is practical and scalable, completing analysis of real-world executables in less than 4 seconds on average; (3) QED reduces manual effort required by analysts to identify quantum-vulnerable executables by over 90%.

Research Background and Motivation

Problem Definition

With rapid development of quantum computing technology, advancing from 2 qubits in 1998 to over 1000 qubits today, experts predict that large-scale functional quantum computers will be commercialized within the next two decades. Quantum computers can break currently widespread public-key cryptographic systems, such as RSA (requiring 4098 logical qubits) and elliptic curve cryptography (requiring 2330 logical qubits).

Significance

Global awareness of quantum attack threats continues to increase. Institutions such as NIST recommend that organizations establish quantum-ready teams to prepare for migrating software systems to post-quantum cryptography. This includes:

Creating cryptographic inventories to assess cryptographic usage within organizations
Conducting risk assessments based on these inventories

Limitations of Existing Approaches

Lack of Specialized Tools: Currently, no (semi)automated tools are specifically designed to assist PQC migration tasks
Manual Analysis Burden: Analysts must rely on various scattered tools and manual analysis to identify quantum-vulnerable software systems
Binary Analysis Challenges: Analysts typically lack access to source code and must conduct PQC migration based on program binaries
Cost Issues: Requires advanced binary analysis knowledge, increasing budget, time, and human resource costs

Research Motivation

Addressing these challenges, particularly the resource constraints faced by small and medium enterprises (SMEs) in conducting PQC migration, this paper aims to develop an automated tool to alleviate analyst workload.

Core Contributions

Requirements Specification: First systematically establishes requirements specifications for tools assisting PQC migration of software executables
QED Toolchain: Designs and implements the QED toolchain meeting the proposed requirements, with open-source code publicly released
Empirical Validation: Validates QED's accuracy and efficiency on synthetic and real-world datasets, achieving 100% true positive rate and reducing manual effort by over 90%
Practical Value: Provides critical PQC migration assistance tool for resource-limited SMEs

Methodology Details

Task Definition

Given a set of software executables, QED's objective is to identify quantum-vulnerable (QV) executables. A software executable is defined as QV if and only if there exists at least one possible execution path from its entry point (main function) to a cryptographic library API implementing QV algorithms (such as RSA, Diffie-Hellman, elliptic curve digital signature).

Tool Requirements (R1-R5)

R1 Dynamic Linking: Must identify executables using QV APIs through dynamic linking
R2 Binary-Level Analysis: Does not depend on source code availability
R3 Static Features: Uses only static features without requiring runtime execution traces
R4 Scalability: Supports analysis of large numbers of software executables within reasonable timeframes
R5 Validity: Produces no false negatives, tolerating minimal false positives

Model Architecture

QED employs a three-stage progressive analysis architecture:

Stage One: File-Level Dependency Analysis (P1)

Objective: Identify executables with dependencies on QV cryptographic libraries

Method:

Construct software dependency graph G₁ = (V₁, E₁), where V₁ is the file set and E₁ represents direct dependencies
Discover all dependencies through depth-first search
Locate QV cryptographic libraries in V₁
Prune nodes without dependencies to cryptographic libraries

Output: File-level dependency paths EV₁

Stage Two: API-Level Dependency Analysis (P2)

Objective: Reduce false positives from P1 by analyzing API-level dependencies

Method:

Construct API dependency graph G₂ = (V₂, E₂), where E₂ contains triples (n₁, n₂, apis)
Check whether predecessor nodes contain function calls to QV APIs
Remove edges not containing QV API calls
Embed API-level dependency information for each edge

Output: Dependency paths EV₂ with QV API information

Stage Three: Static Trace Analysis (P3)

Objective: Precisely identify executables conforming to QV definition

Method:

Construct static call graphs for reachability analysis
Verify execution paths from executable entry points to QV APIs
Support normal and conservative modes
- Normal mode: Missing execution traces directly indicate non-QV
- Conservative mode: Treat missing traces as potential false negatives

Output: Static execution traces EV₃

Technical Innovations

Progressive Analysis Strategy: Three-stage analysis from coarse-grained to fine-grained, balancing speed and accuracy
API Name Information Utilization: Detects cryptographic usage based on API name information, avoiding false negatives from compiler optimizations
Dynamic Linking Support: Specifically handles scenarios where cryptographic libraries are used through dynamic linking
Flexible Analysis Modes: Provides both normal and conservative modes, allowing analysts to choose based on requirements

Experimental Setup

Datasets

Synthetic Dataset

Cryptographic Libraries: OpenSSL v1.1.1, OpenSSL v3.3.1, MbedTLS v2.28.8, wolfSSL v5.7.2
Cryptographic Primitives: SHA-512, AES-256, Diffie-Hellman, RSA, ECDSA (latter three are QV)
Direct Dependencies Set: 20 executables (12 QV, 8 non-QV)
Indirect Dependencies Set: 20 executables (12 QV, 8 non-QV)
Total: 40 executables (24 QV, 16 non-QV)

Real-World Dataset

Coreutils: 109 non-cryptographic software (non-QV)
UnixBench: 18 performance benchmark tools (non-QV)
Network: 13 network utility programs (7 QV, 6 non-QV)
tpm2-tools: 86 TPM functionality implementation tools
Total: 226 executables, average size 248KB

Evaluation Metrics

True Positive Rate (TPR): Proportion of correctly identified QV executables
True Negative Rate (TNR): Proportion of correctly identified non-QV executables
Runtime: Time required for each stage of analysis
Memory Usage: Peak RAM consumption
Manual Effort Reduction: Number of files requiring further manual review

Implementation Details

Programming Language: Python3 (approximately 800 lines of code)
Dependencies: pyelftools (ELF file processing), NetworkX (graph operations), angr (static call graph construction)
Experimental Environment: Ubuntu 20.04, Intel i5-8520U @ 1.6GHz, 24GB RAM

Experimental Results

Main Results

Synthetic Dataset Accuracy

Stage	Direct Dependencies	Indirect Dependencies	Overall
P1	TPR: 100%, TNR: 0%	TPR: 100%, TNR: 0%	TPR: 100%, TNR: 0%
P1+P2	TPR: 100%, TNR: 100%	TPR: 100%, TNR: 0%	TPR: 100%, TNR: 50%
P1+P2+P3	TPR: 100%, TNR: 100%	TPR: 100%, TNR: 100%	TPR: 100%, TNR: 100%

Real-World Dataset Performance

Average Processing Time: Approximately 4 seconds per executable
Total Processing Time: Approximately 15 minutes for 226 executables
Memory Usage: P1 and P2 approximately 180MB, P3 approximately 3-5GB
Manual Effort Reduction: Reduced from 226 to 20 (91.15% reduction)

Ablation Study

P1 Stage: Rapid preliminary screening but high false positive rate
P2 Stage: Significantly reduces false positives, particularly for direct dependency scenarios
P3 Stage: Further improves precision but with higher computational overhead

Case Analysis

False Negative Case: curl program fails static call graph analysis due to indirect calls (function pointers)
False Positive Elimination: sftp and scp programs, though linked to OpenSSL, only use non-QV APIs

Experimental Findings

Progressive Analysis Effectiveness: Three-stage design successfully balances speed and accuracy
Static Analysis Limitations: Indirect calls remain a challenge for static analysis
Practical Validation: Tool performs well in real environments, significantly reducing manual effort

Cryptographic Discovery Tools

Existing tools fall into two categories: static and dynamic:

Static Methods: Based on static features such as initialization vectors, lookup tables, assembly instruction sequences
Dynamic Methods: Based on runtime information such as loop structures, input-output relationships

PQC Migration Research

Migration Process: Four-step process of diagnosis → planning → execution → maintenance
Cryptographic Agility: System's ability to adapt to different cryptographic algorithms
Application Scenarios: Compiled binaries, external system assets, network communication layers

Advantages of This Work

Specifically targets quantum vulnerability detection
Supports dynamic linking scenarios
Does not require runtime execution or heavyweight symbolic analysis
Provides end-to-end practical tool

Conclusions and Discussion

Main Conclusions

QED successfully satisfies all five design requirements (R1-R5)
Achieves 100% accuracy on synthetic datasets
Significantly reduces manual effort on real-world datasets
Tool demonstrates good scalability and practicality

Limitations

Indirect Call Detection: Static analysis cannot detect QV API usage through function pointers
Linking Method Constraints: Assumes executables use cryptographic libraries through dynamic linking
Dead Code Issues: May flag QV API calls that are never executed as positive

Future Directions

Lightweight Dynamic Analysis: Combine dynamic analysis to identify indirect calls
Static Linking Support: Extend detection to directly implemented or statically linked cryptographic functionality
Automated Patching: Extend from identification to (semi)automatic patching of quantum-vulnerable usage

In-Depth Evaluation

Strengths

Problem Importance: Addresses practical pain points in PQC migration
Systematic Approach: Complete workflow from requirements analysis to tool implementation
Technical Innovation: Well-designed three-stage progressive analysis strategy
Practical Value: Open-source tool has significant value for SMEs
Comprehensive Experiments: Thorough validation on both synthetic and real-world datasets

Weaknesses

Platform Limitations: Currently supports only Linux ELF format, limited extensibility
Language Constraints: Primarily targets C/C++ programs
Static Analysis Limitations: Insufficient analysis of indirect calls and dead code
Evaluation Scope: Some programs in real datasets lack ground truth

Impact

Academic Contribution: Fills research gap in PQC migration tools
Practical Value: Provides practical tool for organizational PQC migration
Reproducibility: Open-source code and datasets support result reproduction
Promotion Potential: Methodology extensible to other platforms and languages

Applicable Scenarios

Enterprise PQC migration risk assessment
Software supply chain security audits
Cryptographic dependency analysis
Security compliance checking

References

The paper cites 42 relevant references covering quantum computing development, PQC migration guidelines, cryptographic detection tools, binary analysis, and other important works across multiple domains, providing a solid theoretical foundation for the research.

Overall Assessment: This paper addresses the important problem of post-quantum cryptography migration by proposing a systematic solution. The QED toolchain is well-designed with comprehensive experimental validation, possessing significant academic value and practical significance. Despite some technical limitations, it makes important contributions to the PQC migration field, particularly providing feasible solutions for resource-limited SMEs.