LLMs are All You Need? Improving Fuzz Testing for MOJO with Large Language Models
Huang, Zhao, Chen
The rapid development of large language models (LLMs) has revolutionized software testing, particularly fuzz testing, by automating the generation of diverse and effective test inputs. This advancement holds great promise for improving software reliability. Meanwhile, the introduction of MOJO, a high-performance AI programming language blending Python's usability with the efficiency of C and C++, presents new opportunities to enhance AI model scalability and programmability. However, as a new language, MOJO lacks comprehensive testing frameworks and a sufficient corpus for LLM-based testing, which exacerbates model hallucination. In such cases, LLMs generate syntactically valid but semantically incorrect code, significantly reducing the effectiveness of fuzz testing. To address this challenge, we propose MOJOFuzzer, the first adaptive LLM-based fuzzing framework designed for zero-shot learning environments of emerging programming languages. MOJOFuzzer integrates a multi-phase framework that systematically eliminates low-quality generated inputs before execution, significantly improving test case validity. Furthermore, MOJOFuzzer dynamically adapts LLM prompts based on runtime feedback for test case mutation, enabling an iterative learning process that continuously enhances fuzzing efficiency and bug detection performance. Our experimental results demonstrate that MOJOFuzzer significantly enhances test validity, API coverage, and bug detection performance, outperforming traditional fuzz testing and state-of-the-art LLM-based fuzzing approaches. Using MOJOFuzzer, we have conducted the first large-scale fuzz testing evaluation of MOJO, uncovering 13 previously unknown bugs. This study not only advances the field of LLM-driven software testing but also establishes a foundational methodology for leveraging LLMs in the testing of emerging programming languages.
The rapid advancement of large language models (LLMs) has revolutionized software testing, particularly fuzzing, through the automatic generation of diverse and effective test inputs. Concurrently, the introduction of MOJO, a high-performance AI programming language that combines Python's ease of use with C/C++ efficiency, presents new opportunities for enhancing AI model scalability and programmability. However, as an emerging language, MOJO lacks comprehensive testing frameworks and sufficient LLM training corpora, which exacerbates model hallucination. To address this challenge, the paper proposes MOJOFuzzer, the first adaptive LLM-based fuzzing framework designed for zero-shot learning environments in emerging programming languages. Experimental results show that MOJOFuzzer significantly outperforms traditional fuzzing and state-of-the-art LLM-based fuzzing methods in test validity, API coverage, and bug detection performance, and that it uncovered 13 previously unknown bugs in MOJO.
The core problem this research addresses is how to fuzz emerging programming languages effectively, in particular in zero-shot learning environments where little language-specific training data is available.
AI Development Requirements: With AI's widespread application in critical domains such as autonomous driving, medical diagnosis, and financial services, efficient programming languages are essential
MOJO Language Potential: MOJO is reported to run up to 68,000 times faster than Python, making it a crucial tool for AI development
Missing Testing Frameworks: As an emerging language, MOJO lacks mature testing frameworks, so software bugs and security vulnerabilities can go undiscovered
Develop the first LLM fuzzing framework specifically designed for MOJO, leveraging innovative prompt engineering and fine-tuning techniques to achieve effective error detection in zero-shot learning environments.
First Zero-Shot LLM Fuzzing Framework: MOJOFuzzer is the first LLM-driven fuzzing framework designed for zero-shot learning environments, effectively mitigating LLM hallucination issues
Multi-Stage Quality Control Mechanism: Integrates systematic low-quality input filtering mechanisms, significantly improving test case validity
Adaptive Mutation Strategy: Dynamically adjusts LLM prompts based on runtime feedback, enabling iterative learning processes
Practical Bug Discovery: Uncovered 13 previously unknown bugs in MOJO, 9 of which were confirmed and fixed by the MOJO team
Significant Performance Improvements: Substantially outperforms existing methods in test validity (98%), API coverage (77.3%), and bug detection capability
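The filtering and adaptive-mutation contributions listed above can be illustrated with a minimal Python sketch. This is not the paper's implementation: this summary does not describe the actual filters, prompts, or LLM calls, so every name here (Feedback, FuzzState, passes_filters, adapt_prompt, fuzz) is hypothetical, the LLM is replaced by a canned stub, and the test-case strings are placeholders rather than verified MOJO programs. The sketch only shows the general shape of the idea: reject low-quality inputs before execution, then fold runtime feedback into the next prompt.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Feedback:
    """Outcome of running one test case (hypothetical structure)."""
    crashed: bool
    message: str = ""


@dataclass
class FuzzState:
    """Current prompt plus the test cases that triggered a crash."""
    prompt: str
    findings: List[str] = field(default_factory=list)


def passes_filters(case: str, filters: List[Callable[[str], bool]]) -> bool:
    """Apply cheap checks in order; all() stops at the first failure."""
    return all(check(case) for check in filters)


def adapt_prompt(prompt: str, fb: Feedback) -> str:
    """Append a hint derived from runtime feedback (assumed strategy)."""
    if fb.crashed:
        return prompt + f"\nA previous case crashed with: {fb.message}."
    return prompt + "\nThe previous case ran cleanly; try another API."


def fuzz(generate: Callable[[str], str],
         execute: Callable[[str], Feedback],
         filters: List[Callable[[str], bool]],
         seed_prompt: str,
         rounds: int) -> FuzzState:
    """Generate, filter, execute, and adapt the prompt for `rounds` steps."""
    state = FuzzState(prompt=seed_prompt)
    for _ in range(rounds):
        case = generate(state.prompt)
        if not passes_filters(case, filters):
            continue  # discard the low-quality input before execution
        fb = execute(case)
        if fb.crashed:
            state.findings.append(case)
        state.prompt = adapt_prompt(state.prompt, fb)
    return state


# --- Demo with stand-ins for the LLM and the MOJO toolchain ---
_canned = itertools.cycle(["", "fn main(): pass", "fn main(): overflow()"])


def fake_generate(prompt: str) -> str:
    """Stub LLM: ignores the prompt and cycles through canned outputs."""
    return next(_canned)


def fake_execute(case: str) -> Feedback:
    """Stub runner: pretends any case mentioning 'overflow' crashes."""
    if "overflow" in case:
        return Feedback(crashed=True, message="integer overflow")
    return Feedback(crashed=False)


filters = [lambda c: bool(c.strip()), lambda c: c.startswith("fn ")]
result = fuzz(fake_generate, fake_execute, filters,
              "Write a MOJO test.", rounds=6)
print(len(result.findings), "crashing cases found")
```

In this sketch the filters are plain predicates, so a real pipeline could slot in a syntax check or a compile check without changing the loop. Only cases that survive filtering are executed, which is what keeps invalid inputs from consuming execution time.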
Input: MOJO programming language environment with limited syntax rules and historical bug reports
Output: Valid test cases capable of triggering MOJO bugs
Constraints: Zero-shot learning environment without extensive MOJO-specific training data
The paper cites 58 relevant references covering important works in LLMs, fuzzing, software engineering, and related fields, providing a solid theoretical foundation for the research.
Overall Assessment: This is a high-quality software engineering research paper that proposes innovative solutions to practical problems with rigorous experimental design and convincing results. Beyond technical breakthroughs, this work provides viable methodologies for testing emerging technologies, possessing significant academic and practical value.