Quantum computing poses a significant global threat to today's security mechanisms. As a result, security experts and public sectors have issued guidelines to help organizations migrate their software to post-quantum cryptography (PQC). Despite these efforts, there is a lack of (semi-)automatic tools to support this transition especially when software is used and deployed as binary executables. To address this gap, in this work, we first propose a set of requirements necessary for a tool to detect quantum-vulnerable software executables. Following these requirements, we introduce QED: a toolchain for Quantum-vulnerable Executable Detection. QED uses a three-phase approach to identify quantum-vulnerable dependencies in a given set of executables, from file-level to API-level, and finally, precise identification of a static trace that triggers a quantum-vulnerable API. We evaluate QED on both a synthetic dataset with four cryptography libraries and a real-world dataset with over 200 software executables. The results demonstrate that: (1) QED discerns quantum-vulnerable from quantum-safe executables with 100% accuracy in the synthetic dataset; (2) QED is practical and scalable, completing analyses on average in less than 4 seconds per real-world executable; and (3) QED reduces the manual workload required by analysts to identify quantum-vulnerable executables in the real-world dataset by more than 90%. We hope that QED can become a crucial tool to facilitate the transition to PQC, particularly for small and medium-sized businesses with limited resources.
- Paper ID: 2409.07852
- Title: A Toolchain for Assisting Migration of Software Executables Towards Post-Quantum Cryptography
- Authors: Norrathep Rattanavipanon, Jakapan Suaboot, Warodom Werapun (Prince of Songkla University)
- Classification: cs.CR (Cryptography and Security)
- Publication Status: Submitted to IEEE ACCESS Journal
- Paper Link: https://arxiv.org/abs/2409.07852
Quantum computing poses a significant global threat to current security mechanisms. Although security experts and public sectors have released guidelines to assist organizations in migrating software to post-quantum cryptography (PQC), there is a lack of (semi)automated tools supporting such migration, particularly when software is deployed as binary executables. To address this issue, this paper first proposes necessary requirements for tools detecting quantum-vulnerable software executables. Based on these requirements, QED (Quantum-vulnerable Executable Detection) toolchain is introduced. QED employs a three-stage approach to identify quantum-vulnerable dependencies within a given set of executables, progressing from file-level to API-level, ultimately identifying static traces that trigger quantum-vulnerable APIs with precision. Evaluation results demonstrate that: (1) QED achieves 100% accuracy in distinguishing quantum-vulnerable and quantum-safe executables on synthetic datasets; (2) QED is practical and scalable, completing analysis of real-world executables in less than 4 seconds on average; (3) QED reduces manual effort required by analysts to identify quantum-vulnerable executables by over 90%.
With rapid development of quantum computing technology, advancing from 2 qubits in 1998 to over 1000 qubits today, experts predict that large-scale functional quantum computers will be commercialized within the next two decades. Quantum computers can break currently widespread public-key cryptographic systems, such as RSA (requiring 4098 logical qubits) and elliptic curve cryptography (requiring 2330 logical qubits).
Global awareness of quantum attack threats continues to increase. Institutions such as NIST recommend that organizations establish quantum-ready teams to prepare for migrating software systems to post-quantum cryptography. This includes:
- Creating cryptographic inventories to assess cryptographic usage within organizations
- Conducting risk assessments based on these inventories
- Lack of Specialized Tools: Currently, no (semi)automated tools are specifically designed to assist PQC migration tasks
- Manual Analysis Burden: Analysts must rely on various scattered tools and manual analysis to identify quantum-vulnerable software systems
- Binary Analysis Challenges: Analysts typically lack access to source code and must conduct PQC migration based on program binaries
- Cost Issues: Requires advanced binary analysis knowledge, increasing budget, time, and human resource costs
Addressing these challenges, particularly the resource constraints faced by small and medium enterprises (SMEs) in conducting PQC migration, this paper aims to develop an automated tool to alleviate analyst workload.
- Requirements Specification: First systematically establishes requirements specifications for tools assisting PQC migration of software executables
- QED Toolchain: Designs and implements the QED toolchain meeting the proposed requirements, with open-source code publicly released
- Empirical Validation: Validates QED's accuracy and efficiency on synthetic and real-world datasets, achieving 100% true positive rate and reducing manual effort by over 90%
- Practical Value: Provides critical PQC migration assistance tool for resource-limited SMEs
Given a set of software executables, QED's objective is to identify quantum-vulnerable (QV) executables. A software executable is defined as QV if and only if there exists at least one possible execution path from its entry point (main function) to a cryptographic library API implementing QV algorithms (such as RSA, Diffie-Hellman, elliptic curve digital signature).
- R1 Dynamic Linking: Must identify executables using QV APIs through dynamic linking
- R2 Binary-Level Analysis: Does not depend on source code availability
- R3 Static Features: Uses only static features without requiring runtime execution traces
- R4 Scalability: Supports analysis of large numbers of software executables within reasonable timeframes
- R5 Validity: Produces no false negatives, tolerating minimal false positives
QED employs a three-stage progressive analysis architecture:
Objective: Identify executables with dependencies on QV cryptographic libraries
Method:
- Construct software dependency graph G₁ = (V₁, E₁), where V₁ is the file set and E₁ represents direct dependencies
- Discover all dependencies through depth-first search
- Locate QV cryptographic libraries in V₁
- Prune nodes without dependencies to cryptographic libraries
Output: File-level dependency paths EV₁
Objective: Reduce false positives from P1 by analyzing API-level dependencies
Method:
- Construct API dependency graph G₂ = (V₂, E₂), where E₂ contains triples (n₁, n₂, apis)
- Check whether predecessor nodes contain function calls to QV APIs
- Remove edges not containing QV API calls
- Embed API-level dependency information for each edge
Output: Dependency paths EV₂ with QV API information
Objective: Precisely identify executables conforming to QV definition
Method:
- Construct static call graphs for reachability analysis
- Verify execution paths from executable entry points to QV APIs
- Support normal and conservative modes
- Normal mode: Missing execution traces directly indicate non-QV
- Conservative mode: Treat missing traces as potential false negatives
Output: Static execution traces EV₃
- Progressive Analysis Strategy: Three-stage analysis from coarse-grained to fine-grained, balancing speed and accuracy
- API Name Information Utilization: Detects cryptographic usage based on API name information, avoiding false negatives from compiler optimizations
- Dynamic Linking Support: Specifically handles scenarios where cryptographic libraries are used through dynamic linking
- Flexible Analysis Modes: Provides both normal and conservative modes, allowing analysts to choose based on requirements
- Cryptographic Libraries: OpenSSL v1.1.1, OpenSSL v3.3.1, MbedTLS v2.28.8, wolfSSL v5.7.2
- Cryptographic Primitives: SHA-512, AES-256, Diffie-Hellman, RSA, ECDSA (latter three are QV)
- Direct Dependencies Set: 20 executables (12 QV, 8 non-QV)
- Indirect Dependencies Set: 20 executables (12 QV, 8 non-QV)
- Total: 40 executables (24 QV, 16 non-QV)
- Coreutils: 109 non-cryptographic software (non-QV)
- UnixBench: 18 performance benchmark tools (non-QV)
- Network: 13 network utility programs (7 QV, 6 non-QV)
- tpm2-tools: 86 TPM functionality implementation tools
- Total: 226 executables, average size 248KB
- True Positive Rate (TPR): Proportion of correctly identified QV executables
- True Negative Rate (TNR): Proportion of correctly identified non-QV executables
- Runtime: Time required for each stage of analysis
- Memory Usage: Peak RAM consumption
- Manual Effort Reduction: Number of files requiring further manual review
- Programming Language: Python3 (approximately 800 lines of code)
- Dependencies: pyelftools (ELF file processing), NetworkX (graph operations), angr (static call graph construction)
- Experimental Environment: Ubuntu 20.04, Intel i5-8520U @ 1.6GHz, 24GB RAM
| Stage | Direct Dependencies | Indirect Dependencies | Overall |
|---|
| P1 | TPR: 100%, TNR: 0% | TPR: 100%, TNR: 0% | TPR: 100%, TNR: 0% |
| P1+P2 | TPR: 100%, TNR: 100% | TPR: 100%, TNR: 0% | TPR: 100%, TNR: 50% |
| P1+P2+P3 | TPR: 100%, TNR: 100% | TPR: 100%, TNR: 100% | TPR: 100%, TNR: 100% |
- Average Processing Time: Approximately 4 seconds per executable
- Total Processing Time: Approximately 15 minutes for 226 executables
- Memory Usage: P1 and P2 approximately 180MB, P3 approximately 3-5GB
- Manual Effort Reduction: Reduced from 226 to 20 (91.15% reduction)
- P1 Stage: Rapid preliminary screening but high false positive rate
- P2 Stage: Significantly reduces false positives, particularly for direct dependency scenarios
- P3 Stage: Further improves precision but with higher computational overhead
- False Negative Case: curl program fails static call graph analysis due to indirect calls (function pointers)
- False Positive Elimination: sftp and scp programs, though linked to OpenSSL, only use non-QV APIs
- Progressive Analysis Effectiveness: Three-stage design successfully balances speed and accuracy
- Static Analysis Limitations: Indirect calls remain a challenge for static analysis
- Practical Validation: Tool performs well in real environments, significantly reducing manual effort
Existing tools fall into two categories: static and dynamic:
- Static Methods: Based on static features such as initialization vectors, lookup tables, assembly instruction sequences
- Dynamic Methods: Based on runtime information such as loop structures, input-output relationships
- Migration Process: Four-step process of diagnosis → planning → execution → maintenance
- Cryptographic Agility: System's ability to adapt to different cryptographic algorithms
- Application Scenarios: Compiled binaries, external system assets, network communication layers
- Specifically targets quantum vulnerability detection
- Supports dynamic linking scenarios
- Does not require runtime execution or heavyweight symbolic analysis
- Provides end-to-end practical tool
- QED successfully satisfies all five design requirements (R1-R5)
- Achieves 100% accuracy on synthetic datasets
- Significantly reduces manual effort on real-world datasets
- Tool demonstrates good scalability and practicality
- Indirect Call Detection: Static analysis cannot detect QV API usage through function pointers
- Linking Method Constraints: Assumes executables use cryptographic libraries through dynamic linking
- Dead Code Issues: May flag QV API calls that are never executed as positive
- Lightweight Dynamic Analysis: Combine dynamic analysis to identify indirect calls
- Static Linking Support: Extend detection to directly implemented or statically linked cryptographic functionality
- Automated Patching: Extend from identification to (semi)automatic patching of quantum-vulnerable usage
- Problem Importance: Addresses practical pain points in PQC migration
- Systematic Approach: Complete workflow from requirements analysis to tool implementation
- Technical Innovation: Well-designed three-stage progressive analysis strategy
- Practical Value: Open-source tool has significant value for SMEs
- Comprehensive Experiments: Thorough validation on both synthetic and real-world datasets
- Platform Limitations: Currently supports only Linux ELF format, limited extensibility
- Language Constraints: Primarily targets C/C++ programs
- Static Analysis Limitations: Insufficient analysis of indirect calls and dead code
- Evaluation Scope: Some programs in real datasets lack ground truth
- Academic Contribution: Fills research gap in PQC migration tools
- Practical Value: Provides practical tool for organizational PQC migration
- Reproducibility: Open-source code and datasets support result reproduction
- Promotion Potential: Methodology extensible to other platforms and languages
- Enterprise PQC migration risk assessment
- Software supply chain security audits
- Cryptographic dependency analysis
- Security compliance checking
The paper cites 42 relevant references covering quantum computing development, PQC migration guidelines, cryptographic detection tools, binary analysis, and other important works across multiple domains, providing a solid theoretical foundation for the research.
Overall Assessment: This paper addresses the important problem of post-quantum cryptography migration by proposing a systematic solution. The QED toolchain is well-designed with comprehensive experimental validation, possessing significant academic value and practical significance. Despite some technical limitations, it makes important contributions to the PQC migration field, particularly providing feasible solutions for resource-limited SMEs.