Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models

Shim, Ju, Park et al.

Recent advancements in large language models (LLMs) have shown strong performance in natural language understanding and generation tasks. However, LLMs continue to encounter challenges with hallucinations, where models generate plausible but incorrect information. While several factors contribute to hallucinations, the impact of ill-formed prompts (prompts with ambiguous wording, incorrect grammar, or incomplete information) remains relatively underexplored. To address this, we introduce Multi-stage Prompt Refinement (MPR), a framework designed to systematically improve these ill-formed prompts across multiple stages. Each stage addresses specific errors such as punctuation, typographical mistakes, and misuse of key terms, using small language models (SLMs) fine-tuned for these tasks. MPR iteratively enhances the clarity of prompts with additional context and employs a self-reflection mechanism with ranking to prioritize the most relevant input. Experimental results on hallucination benchmarks show that prompts refined by MPR achieve a win rate of over 85% compared to their original forms, demonstrating its effectiveness in reducing hallucinations and improving LLM output accuracy. Interestingly, we reveal that MPR can be combined with existing post-hoc hallucination mitigation frameworks, further enhancing its versatility. MPR provides a lightweight and adaptable solution for enhancing LLM reliability across various domains.

Basic Information

  • Paper ID: 2510.12032
  • Title: Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models
  • Authors: Jung-Woo Shim, Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee
  • Institution: Korea University, Department of Artificial Intelligence
  • Classification: cs.CL cs.AI cs.LG
  • Publication Date: October 14, 2025 (arXiv)
  • Paper Link: https://arxiv.org/abs/2510.12032

Abstract

Large language models demonstrate exceptional performance in natural language understanding and generation tasks, yet still face the hallucination problem: generating information that appears plausible but is factually incorrect. While multiple factors contribute to hallucinations, the impact of poorly formatted prompts (containing ambiguous phrasing, grammatical errors, or incomplete information) remains relatively underexplored. This paper proposes a Multi-stage Prompt Refinement framework (MPR) that systematically improves such poorly formatted prompts through multiple stages. Each stage employs a small language model (SLM) fine-tuned for a specific task to address concrete issues such as punctuation, spelling errors, and keyword misuse. MPR iteratively enhances prompt clarity with additional context and uses a self-reflection mechanism with ranking to prioritize the most relevant inputs. Experimental results demonstrate that prompts optimized by MPR achieve a win rate of over 85% compared to their original form, reducing hallucinations and improving LLM output accuracy.

Research Background and Motivation

Problem Definition

Although large language models excel in multiple NLP tasks, they face a critical challenge: the hallucination problem, wherein models generate information that appears reasonable but is factually incorrect. This is particularly dangerous in critical domains such as healthcare and education, where accurate information transmission is paramount.

Limitations of Existing Methods

Current approaches to mitigating hallucinations primarily focus on:

  1. Model Architecture Modification: Altering LLM internal mechanisms, but at high computational cost
  2. Post-processing Techniques: Verifying content after generation, adding system complexity and latency
  3. Reinforcement Learning Fine-tuning: Requiring substantial computational resources, difficult for real-time applications

These methods typically overlook an important factor: the quality of user prompts. Poorly formatted prompts directly lead to inaccurate outputs, yet existing solutions often rely on large models or computationally intensive techniques.

Research Motivation

This paper posits that systematically optimizing input prompt quality can reduce hallucination problems at their source. Compared to modifying model architectures or post-processing outputs, prompt optimization represents a more lightweight and scalable solution.

Core Contributions

  1. Proposes MPR Framework: The first systematic multi-stage optimization framework addressing hallucinations caused by poorly formatted prompts
  2. Lightweight Design: Employs small language models (SLMs) rather than large models, significantly reducing computational costs
  3. Model Agnosticism: Seamlessly integrates with any LLM architecture, demonstrating high adaptability
  4. Comprehensive Evaluation: Validates effectiveness across multiple datasets with win rates exceeding 85%
  5. Compatibility Verification: Demonstrates compatibility with existing post-processing hallucination mitigation methods for further performance enhancement

Methodology Details

Task Definition

Input: Poorly formatted user prompts (containing punctuation errors, spelling mistakes, grammatical issues, terminology misuse, etc.)

Output: High-quality prompts optimized through multi-stage refinement

Objective: Reduce hallucinations in LLM-generated content and improve output accuracy and relevance

Model Architecture

The MPR framework comprises three main stages:

Stage 1: Error Detection and Classification

Employs a specialized fine-tuned SLM to identify error types in prompts, classifying them as:

  • Stage 1 Errors: Basic punctuation and capitalization errors
  • Stage 2 Errors: Spelling and grammatical errors
  • Stage 3 Errors: Semantic ambiguity and terminology misuse
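
The three-level labeling above can be illustrated with a small rule-based stand-in. The paper uses a fine-tuned SLM for this step; the sketch below is not that model. The function name `classify_error_level`, the tiny vocabulary `KNOWN_WORDS`, and the `AMBIGUOUS_TERMS` set are all hypothetical and exist only to make the example self-contained.

```python
import re

# Hypothetical resources; a real system would use the fine-tuned SLM instead.
KNOWN_WORDS = {
    "what", "is", "the", "capital", "of", "france", "see", "from", "spain",
    "morocco", "can", "you", "tell", "me", "about", "transformers",
}
AMBIGUOUS_TERMS = {"transformers"}


def classify_error_level(prompt: str) -> int:
    """Return the highest error level found (0 = clean, 1-3 as listed above)."""
    words = re.findall(r"[A-Za-z]+", prompt)
    lowered = [w.lower() for w in words]

    # Level 3: an ambiguous key term is present.
    if any(w in AMBIGUOUS_TERMS for w in lowered):
        return 3
    # Level 2: a token is missing from the (toy) vocabulary.
    if any(w not in KNOWN_WORDS for w in lowered):
        return 2
    # Level 1: casing or terminal punctuation problems.
    stripped = prompt.strip()
    bad_start = not stripped[:1].isupper()
    bad_end = not stripped.endswith((".", "?", "!"))
    bad_inner = any(w[1:] != w[1:].lower() for w in words)
    return 1 if (bad_start or bad_end or bad_inner) else 0


print(classify_error_level("what is the caPital of fRAnce?"))
print(classify_error_level("See from spaiin moroco?"))
```

Checking the highest level first means a prompt with several kinds of error is routed to the most demanding correction stage; whether the paper's classifier behaves this way is an assumption.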

Stage 2: Multi-stage Prompt Cleaning

Applies corresponding specialized SLMs for correction based on error types:

Stage 1: Punctuation and Capitalization Correction

Input: "what is the caPital of fRAnce?"
Output: "What is the capital of France?"

Stage 2: Spelling and Grammar Correction

Input: "See from spaiin moroco?"
Output: "Can you see Spain from Morocco?"

Stage 3: Semantic Alignment and Rewriting

Input: "Tell me about transformers"
Output: "Can you explain how Transformer-based neural networks work?"
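
The cleaning stages can be sketched as an ordered chain of cleaners that is applied up to the detected error level. This is a minimal sketch under stated assumptions: the cleaners are lookup-table stand-ins for the fine-tuned SLMs, the names `fix_case`, `fix_spelling`, `CLEANERS`, and `run_cleaning` are hypothetical, and running the stages cumulatively (levels 1 through the detected level) is an assumption rather than something the summary states.

```python
from typing import Callable, Dict

Cleaner = Callable[[str], str]

# Hypothetical lookup tables standing in for what the fine-tuned SLMs learn.
PROPER_NOUNS = {"france", "spain", "morocco"}
SPELLING_FIXES = {"spaiin": "spain", "moroco": "morocco"}
TRAILING = "?.!,"


def fix_case(prompt: str) -> str:
    """Level 1 stand-in: normalise casing, capitalising known proper nouns."""
    words = prompt.strip().lower().split()
    words = [w.capitalize() if w.rstrip(TRAILING) in PROPER_NOUNS else w
             for w in words]
    text = " ".join(words)
    return text[:1].upper() + text[1:]


def fix_spelling(prompt: str) -> str:
    """Level 2 stand-in: replace known misspellings, keep trailing punctuation."""
    fixed = []
    for word in prompt.split():
        core = word.rstrip(TRAILING)
        tail = word[len(core):]
        fixed.append(SPELLING_FIXES.get(core.lower(), core) + tail)
    return " ".join(fixed)


CLEANERS: Dict[int, Cleaner] = {1: fix_case, 2: fix_spelling}


def run_cleaning(prompt: str, detected_level: int,
                 cleaners: Dict[int, Cleaner] = CLEANERS) -> str:
    """Apply the cleaners for levels 1..detected_level in ascending order."""
    for level in sorted(cleaners):
        if level <= detected_level:
            prompt = cleaners[level](prompt)
    return prompt


print(run_cleaning("what is the caPital of fRAnce?", 1))
```

The stubs reproduce the Stage 1 example but not the grammatical rewrite in the Stage 2 example, which needs a language model; a Stage 3 rewriter would be registered under key 3 in the same dictionary.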

Stage 3: Iterative Description Generation

  • Description Generation: Adds contextual information for ambiguous terms
  • Self-reflection Verification: Evaluates description adequacy and conciseness
  • Perplexity-based Ranking: Selects the most coherent and relevant descriptions
  • Intelligent Integration: Adds descriptions only when necessary, improving efficiency
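
Perplexity-based ranking can be sketched with a toy language model. Perplexity is the exponential of the average negative log-probability per token, so a lower value means the text is more predictable under the model. The sketch below assumes an add-one-smoothed unigram model built from a small reference corpus; the model MPR actually scores with is not given in this summary, and the names `build_unigram`, `perplexity`, and `rank_descriptions` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def build_unigram(corpus: List[str]) -> Counter:
    """Count whitespace tokens in a reference corpus (toy language model)."""
    return Counter(tok for line in corpus for tok in line.lower().split())


def perplexity(text: str, counts: Counter) -> float:
    """exp of the mean negative log-probability, with add-one smoothing."""
    tokens = text.lower().split()
    if not tokens:
        return float("inf")
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen tokens
    log_prob = sum(math.log((counts[t] + 1) / (total + vocab)) for t in tokens)
    return math.exp(-log_prob / len(tokens))


def rank_descriptions(candidates: List[str], counts: Counter) -> List[str]:
    """Order candidate descriptions from lowest to highest perplexity."""
    return sorted(candidates, key=lambda c: perplexity(c, counts))


counts = build_unigram(
    ["a transformer is a neural network architecture based on attention"]
)
ranked = rank_descriptions(
    ["zxq blorp transformer qux", "a transformer is a neural network"], counts
)
print(ranked[0])
```

In a real pipeline the unigram model would be replaced by token log-probabilities from a neural language model; the ranking logic stays the same.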

Technical Innovations

  1. Staged Processing Strategy: Different error types require different handling methods; staged processing is more precise and effective
  2. SLM Specialization: Each SLM is fine-tuned for specific tasks, ensuring quality while maintaining efficiency
  3. QLoRA Fine-tuning Technique: Employs 4-bit quantized low-rank adaptation, reducing memory requirements while preserving performance
  4. Adaptive Description Generation: Dynamically generates descriptions as needed, avoiding unnecessary computational overhead
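
The memory argument behind QLoRA can be made concrete with back-of-the-envelope arithmetic. Storing weights in 4 bits instead of 16 cuts weight memory by a factor of four, and a rank-r adapter on a d_out x d_in matrix trains only r * (d_in + d_out) parameters. The model size, matrix shape, and rank below are hypothetical, not values from the paper, and the estimate ignores quantization constants, activations, and optimizer state.

```python
def weight_memory_gib(n_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """A rank-r adapter adds B (d_out x r) and A (r x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical 1B-parameter SLM.
n_params = 10**9
print(f"fp16 weights : {weight_memory_gib(n_params, 16):.2f} GiB")
print(f"4-bit weights: {weight_memory_gib(n_params, 4):.2f} GiB")

# Hypothetical 4096 x 4096 projection with a rank-16 adapter.
full = 4096 * 4096
adapter = lora_trainable_params(4096, 4096, 16)
print(f"trainable share: {adapter / full:.2%}")
```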

Experimental Setup

Datasets

Training Data Construction:

  • OLM Wikipedia Dataset: 10,000 grammatically perfect entries for punctuation and grammar optimization
  • CoEdIT Dataset: Focuses on fluency, coherence, and style-preserving non-semantic edits
  • MQR Dataset: 2,114 question rewriting pairs for semantic equivalence transformation training
  • Magpie Dataset: 300,000 keyword-description pairs for terminology explanation generation

Evaluation Datasets:

  • Well-formed Query Dataset: 8,000 user queries with format quality scores below 0.5
  • GSM8K: Mathematical problem dataset
  • SQuAD: Reading comprehension dataset
  • Natural Questions: Natural question dataset

Corruption Strategy: To thoroughly test the framework, three levels of errors are artificially introduced:

  • Stage 1: Basic punctuation errors
  • Stage 2: Spelling and grammatical errors
  • Stage 3: Technical terminology and abbreviation errors
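
A corruption routine along these lines could look like the sketch below. It is only an illustration: the exact corruption rules used in the paper are not given in this summary, so the lowercase-and-strip rule, the adjacent-character swap, and the `TERM_SWAPS` table are assumptions, as is treating the three levels as independent rather than cumulative. The function name `corrupt` is hypothetical.

```python
import random
import string

# Hypothetical term-to-abbreviation table for level 3 corruption.
TERM_SWAPS = {"neural network": "NN", "machine learning": "ML"}


def corrupt(prompt: str, level: int, seed: int = 0) -> str:
    """Inject one kind of error into a clean prompt, chosen by level."""
    rng = random.Random(seed)  # seeded so corruptions are reproducible
    if level == 1:
        # Drop capitalization and punctuation.
        table = str.maketrans("", "", string.punctuation)
        return prompt.lower().translate(table)
    if level == 2:
        # Swap two adjacent, different characters inside one word.
        words = prompt.split()
        spots = [(i, j) for i, w in enumerate(words)
                 for j in range(len(w) - 1) if w[j] != w[j + 1]]
        if not spots:
            return prompt
        i, j = rng.choice(spots)
        w = words[i]
        words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
        return " ".join(words)
    if level == 3:
        # Replace full terms with terse abbreviations.
        for term, abbreviation in TERM_SWAPS.items():
            prompt = prompt.replace(term, abbreviation)
        return prompt
    return prompt


print(corrupt("What is the capital of France?", 1))
print(corrupt("How does a neural network learn?", 3))
```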

Evaluation Metrics

  • Hallucination Index (HI): Quantifies factual accuracy of generated content (0-1, lower is better)
  • Content Quality Score (CQS): Measures relevance, coherence, and overall quality (0-1, higher is better)
  • Win Rate (WR): Percentage of head-to-head comparisons in which the output for the MPR-optimized prompt is preferred over the output for the original prompt (higher is better)
  • Processing Time (T): Framework efficiency assessment
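
Win rate can be computed from pairwise judgments as in the sketch below. The label names and the choice to count ties as non-wins are assumptions; the summary does not say how the evaluator handles ties.

```python
from typing import Sequence


def win_rate(judgments: Sequence[str]) -> float:
    """Percentage of pairs where the refined prompt's output was preferred.

    Each judgment is 'refined', 'original', or 'tie' (hypothetical labels);
    ties count as non-wins here.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")
    wins = sum(1 for j in judgments if j == "refined")
    return 100.0 * wins / len(judgments)


# Made-up judgments for illustration only, not data from the paper.
sample = ["refined"] * 17 + ["original"] * 2 + ["tie"]
print(win_rate(sample))
```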

Baseline Methods

  • SelfCheckGPT: Zero-resource black-box hallucination detection method
  • CoVE: Chain of verification method
  • DRESS: Natural language feedback-based alignment method
  • MixAlign: Knowledge alignment method

Implementation Details

  • Hardware: NVIDIA RTX A6000 GPU for training, NVIDIA TITAN V GPU for inference
  • Fine-tuning Method: QLoRA (4-bit quantized low-rank adaptation)
  • Evaluator: GPT-3.5-turbo API as primary evaluation standard

Experimental Results

Main Results

Performance on the Well-formed Query dataset:

Model               | Corruption Level | HI ↓         | CQS ↑        | WR ↑
Baseline            | -                | 0.81         | 0.52         | -
LLaMA-2 (7B)        | Stage 1          | 0.26 (-0.55) | 0.80 (+0.28) | 91%
LLaMA-2 (7B)        | Stage 3          | 0.48 (-0.33) | 0.60 (+0.08) | 86%
Average Performance | -                | 0.37 (-0.44) | 0.68 (+0.16) | 86%

Key Findings

  1. Consistent Improvement: MPR demonstrates significant improvements across all tested models and datasets
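  2. Corruption Level Correlation: Gains vary with corruption level; in the LLaMA-2 (7B) rows reported above, the HI reduction is larger at Stage 1 (-0.55) than at Stage 3 (-0.33), although MPR still wins 86% of comparisons at Stage 3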
  3. Model Scale Effect: Larger models (e.g., LLaMA-3.2) benefit more from MPR's description generation step
  4. Cross-domain Effectiveness: Effective across diverse tasks including mathematics (GSM8K), reading comprehension (SQuAD), and question answering (NQ)

Ablation Study

Configuration                  | HI ↓ | CQS ↑ | WR ↑
Complete MPR                   | 0.14 | 0.83  | 93%
Without Description Generation | 0.20 | 0.78  | 89%
Without Multi-stage Cleaning   | 0.24 | 0.74  | 86%
Without Iterative Ranking      | 0.21 | 0.75  | 87%

Results demonstrate that each component contributes significantly to overall performance, with multi-stage cleaning being the most critical component.
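
The claim about the most critical component can be checked directly from the ablation figures above. The short script below computes how much each metric degrades when a component is removed; the numbers are copied from the ablation table, and the helper name `degradation` is hypothetical.

```python
# (HI, CQS, WR %) per configuration, copied from the ablation table.
ABLATION = {
    "Complete MPR": (0.14, 0.83, 93),
    "Without Description Generation": (0.20, 0.78, 89),
    "Without Multi-stage Cleaning": (0.24, 0.74, 86),
    "Without Iterative Ranking": (0.21, 0.75, 87),
}


def degradation(ablation, full_key="Complete MPR"):
    """Change in (HI, CQS, WR) relative to the complete framework."""
    full_hi, full_cqs, full_wr = ablation[full_key]
    return {
        name: (round(hi - full_hi, 2), round(cqs - full_cqs, 2), wr - full_wr)
        for name, (hi, cqs, wr) in ablation.items()
        if name != full_key
    }


drops = degradation(ABLATION)
most_critical = max(drops, key=lambda name: drops[name][0])
for name, delta in drops.items():
    print(name, delta)
print("Largest HI increase:", most_critical)
```

Removing multi-stage cleaning raises HI by 0.10 and lowers WR by 7 points, the largest change on every metric, which matches the statement above.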

Comparison with Existing Methods

Framework          | HI ↓ | CQS ↑ | WR ↑ | Processing Time (ms)
MPR                | 0.18 | 0.81  | 91%  | 1215
SelfCheckGPT       | 0.22 | 0.76  | 85%  | 1541
SelfCheckGPT + MPR | 0.14 | 0.85  | 94%  | 1478

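On its own, MPR outperforms SelfCheckGPT on all three quality metrics (HI 0.18 vs. 0.22, CQS 0.81 vs. 0.76, WR 91% vs. 85%) with a lower processing time (1215 ms vs. 1541 ms). Combining the two gives the best scores overall (HI 0.14, CQS 0.85, WR 94%).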

Related Work

Hallucination Mitigation Methods

Existing approaches fall into three main categories:

  1. Architecture Modification: Adjusting model internal mechanisms, high computational cost
  2. Post-processing Verification: Verifying content after generation, adding latency
  3. Reinforcement Learning: Rewarding factual responses, requiring substantial computational resources

Small Language Model Applications

SLMs can achieve excellent performance on specific tasks through fine-tuning, particularly suitable for:

  • Resource-constrained environments
  • Real-time applications
  • Domain-specific tasks

Prompt Optimization Techniques

Traditional methods include:

  • LLM-based prompt rewriting (high computational cost)
  • Reinforcement learning-based iterative improvement
  • Manual intervention optimization

MPR achieves lightweight prompt optimization through the use of small models.

Conclusions and Discussion

Main Conclusions

  1. Effectiveness Validation: MPR demonstrates excellent performance in reducing hallucinations and improving output quality
  2. Lightweight Design: Significantly reduces computational costs compared to existing methods
  3. Broad Applicability: Compatible with multiple LLM architectures and existing mitigation methods
  4. Practical Value: Provides a scalable solution for real-world applications

Limitations

  1. Domain Specificity: May underperform in specialized domains such as law and medicine
  2. Evaluation Metric Limitations: Existing metrics do not fully capture user satisfaction and fluency
  3. Automation Level: Although the pipeline is fully automated, it may benefit from human-in-the-loop review

Future Directions

  1. Domain Specialization: Develop fine-tuning strategies tailored to specific domains
  2. Multimodal Extension: Extend the framework to multimodal environments such as image-text
  3. Human-Machine Collaboration: Integrate human feedback mechanisms
  4. Evaluation Framework: Develop more comprehensive user-centric evaluation methods

In-depth Evaluation

Strengths

  1. Strong Innovation: First systematic approach to addressing hallucinations from the perspective of prompt quality
  2. Reasonable Design: Multi-stage processing strategy precisely targets different error types
  3. High Practicality: Lightweight design makes it feasible in resource-constrained environments
  4. Comprehensive Experiments: Thorough evaluation across multiple datasets and models
  5. Good Compatibility: Combines with existing methods for further performance enhancement

Weaknesses

  1. Domain Limitations: Performance in specialized domains requires further validation
  2. Language Constraints: Primarily targets English; multilingual support is unclear
  3. Complexity Assessment: While claimed to be lightweight, multi-stage processing still involves certain complexity
  4. Long-term Effects: Performance in extended dialogues or complex tasks remains unevaluated

Impact

  1. Academic Value: Provides new research direction for hallucination mitigation
  2. Practical Value: Offers viable optimization solution for real-world LLM deployment
  3. Reproducibility: Detailed method description facilitates reproduction and improvement
  4. Extensibility: Framework design demonstrates good extension potential

Applicable Scenarios

  • Resource-constrained Environments: Edge devices, mobile applications
  • Real-time Systems: Interactive systems requiring rapid response
  • Quality-sensitive Applications: Education, customer service, and other scenarios with high accuracy requirements
  • Existing System Upgrades: Integration as plugin into existing LLM systems

References

This paper cites 27 important references covering recent research in large language models, hallucination detection, prompt engineering, small model applications, and related fields, providing a solid theoretical foundation for the research.


Overall Assessment: This is a high-quality research paper proposing an innovative solution to address LLM hallucination problems. The MPR framework is elegantly designed with comprehensive experiments and convincing results. Despite certain limitations, its lightweight and modular design provides high practical value and extension potential.