Introduction: Healthcare AI models often inherit biases from their training data. While efforts have primarily targeted bias in structured data, mental health heavily depends on unstructured data. This study aims to detect and mitigate linguistic differences related to non-biological differences in the training data of AI models designed to assist in pediatric mental health screening. Our objectives are: (1) to assess the presence of bias by evaluating outcome parity across sex subgroups, (2) to identify bias sources through textual distribution analysis, and (3) to develop a de-biasing method for mental health text data.

Methods: We examined classification parity across demographic groups and assessed how gendered language influences model predictions. A data-centric de-biasing method was applied, focusing on neutralizing biased terms while retaining salient clinical information. This methodology was tested on a model for automatic anxiety detection in pediatric patients.

Results: Our findings revealed a systematic under-diagnosis of female adolescent patients, with a 4% lower accuracy and a 9% higher False Negative Rate (FNR) compared to male patients, likely due to disparities in information density and linguistic differences in patient notes. Notes for male patients were on average 500 words longer, and linguistic similarity metrics indicated distinct word distributions between genders. Implementing our de-biasing approach reduced diagnostic bias by up to 27%, demonstrating its effectiveness in enhancing equity across demographic groups.

Discussion: We developed a data-centric de-biasing framework to address gender-based content disparities within clinical text. By neutralizing biased language and enhancing focus on clinically essential information, our approach demonstrates an effective strategy for mitigating bias in AI healthcare models trained on text.
A Data-Centric Approach to Detecting and Mitigating Demographic Bias in Pediatric Mental Health Text: A Case Study in Anxiety Detection
- Paper ID: 2501.00129
- Title: A Data-Centric Approach to Detecting and Mitigating Demographic Bias in Pediatric Mental Health Text: A Case Study in Anxiety Detection
- Authors: Julia Ive, Paulina Bondaronek, Vishal Yadav, Daniel Santel, Tracy Glauser, Tina Cheng, Jeffrey R. Strawn, Greeshma Agasthya, Jordan Tschida, Sanghyun Choo, Mayanka Chandrashekar, Anuj J. Kapadia, John Pestian
- Classification: cs.CL cs.AI
- Institutions: University College London, Queen Mary University of London, Cincinnati Children's Hospital Medical Center, Oak Ridge National Laboratory, et al.
- Paper Type: Research Paper
This study addresses demographic bias in AI models for pediatric mental health by proposing a data-centric debiasing approach. It identifies systematic underdiagnosis of female adolescent patients, with 4% lower accuracy and a 9% higher false negative rate compared to male patients. Using two debiasing methods, information density filtering and gender-neutral word substitution, the study reduces diagnostic bias by up to 27%, offering a practical route to fairer medical AI.
- Prevalence of AI Bias: Medical AI models frequently inherit biases from training data, potentially exacerbating healthcare disparities, particularly affecting minority populations
- Specificity of Mental Health: Mental health diagnosis heavily relies on unstructured text data (clinical notes), while existing debiasing research primarily focuses on structured data
- Pediatric Mental Health Crisis: Post-COVID-19, the prevalence of anxiety symptoms in children has doubled, particularly among female adolescents
- Complexity and challenges in pediatric mental health screening
- Enormous potential of AI in expanding mental health diagnosis
- Urgent need to ensure AI tools are fair and effective across diverse populations
- Traditional debiasing techniques (e.g., word embedding debiasing, adversarial training) are not applicable to medical domains
- Heterogeneity of medical data (from different healthcare institutions) has not been adequately addressed
- Lack of specialized debiasing frameworks for medical text
- Systematic Bias Identification: First to identify and quantify gender bias in pediatric anxiety detection, with significantly higher false negative rates for female patients
- Data-Centric Debiasing Framework: Proposes debiasing methods specifically tailored for medical text, including information density filtering and gender word neutralization
- Effectiveness Validation: Validates the method's effectiveness on real clinical data, reducing diagnostic bias by up to 27%
- Interpretability Analysis: Uses LIME to analyze the vocabulary that model decisions depend on, revealing sources of bias
Input: Sequence of clinical note text from pediatric patients
Output: Binary classification prediction (anxiety/non-anxiety)
Objective: Reduce performance disparities across gender groups while maintaining prediction accuracy
Multiple metrics are used to evaluate model bias:
- Balanced Error Rate (BER):
  $$\mathrm{BER} = \frac{1}{2}\left(\frac{FP}{FP + TN} + \frac{FN}{FN + TP}\right)$$
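- False Negative Rate (FNR): Share of true anxiety cases the model misses, i.e. the underdiagnosis rate
- False Positive Rate (FPR): Share of non-anxiety patients the model flags as anxious, i.e. the overdiagnosis rate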
- BER Ratio: Ratio of underprivileged to privileged group BER; >1.25 indicates significant bias
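The metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the `Confusion` class, the `ber_ratio` helper, and the counts are hypothetical, and only the formulas and the 1.25 threshold come from the summary.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion-matrix counts for one demographic group (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int

    @property
    def fnr(self) -> float:
        # Missed anxiety cases divided by all true anxiety cases.
        return self.fn / (self.fn + self.tp)

    @property
    def fpr(self) -> float:
        # False alarms divided by all true non-anxiety cases.
        return self.fp / (self.fp + self.tn)

    @property
    def ber(self) -> float:
        # Balanced Error Rate: mean of FPR and FNR.
        return 0.5 * (self.fpr + self.fnr)


def ber_ratio(underprivileged: Confusion, privileged: Confusion) -> float:
    """BER of the underprivileged group divided by BER of the privileged group."""
    return underprivileged.ber / privileged.ber


# Made-up counts for illustration only; these are not the paper's data.
female = Confusion(tp=60, fp=20, tn=80, fn=40)
male = Confusion(tp=80, fp=20, tn=80, fn=20)

ratio = ber_ratio(female, male)
flagged_as_biased = ratio > 1.25  # threshold taken from the list above
print(f"BER ratio = {ratio:.2f}, biased = {flagged_as_biased}")
```

With these illustrative counts the female group has the higher FNR, so the ratio exceeds the threshold and the model would be flagged.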
Analyzes differences in text characteristics across demographic groups:
- Average note length
- Medical terminology percentage
- Gender-biased vocabulary percentage
- Jaccard distance and familiarity scores
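Two of these characteristics, average note length and vocabulary overlap, are simple to compute. The sketch below assumes a plain lowercase regex tokenizer and invented example notes; the paper's actual tokenization, its medical-term lexicon, and its familiarity score are not described in this summary and are not reproduced here.

```python
import re
from statistics import mean


def tokenize(text: str) -> list[str]:
    # Assumption: a simple lowercase word tokenizer stands in for the real one.
    return re.findall(r"[a-z']+", text.lower())


def vocabulary(notes: list[str]) -> set[str]:
    """Set of distinct tokens across all notes of one group."""
    return {tok for note in notes for tok in tokenize(note)}


def jaccard_index(a: set[str], b: set[str]) -> float:
    """Size of the intersection over size of the union (1.0 for two empty sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def average_length(notes: list[str]) -> float:
    """Mean number of tokens per note."""
    return mean(len(tokenize(note)) for note in notes)


# Invented notes for illustration only.
male_notes = ["He reports worry and poor sleep", "Mother says he avoids school"]
female_notes = ["She reports worry", "She avoids school"]

similarity = jaccard_index(vocabulary(male_notes), vocabulary(female_notes))
distance = 1.0 - similarity  # Jaccard distance is the complement of the index
print(f"Jaccard index = {similarity:.2f}, distance = {distance:.2f}")
print(f"Average length: male = {average_length(male_notes)}, "
      f"female = {average_length(female_notes)}")
```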
- Calculates sentence importance using TF-IDF scores
- Removes 20% of sentences with lowest information content
- Balances information density across different groups
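The filtering steps above might look like the following sketch. The summary does not say how sentence scores are derived from TF-IDF, so this version makes an assumption: each sentence is treated as a document and scored by the mean smoothed IDF-weighted term frequency of its tokens. The function names and the example note are hypothetical.

```python
import math
import re
from collections import Counter


def split_sentences(note: str) -> list[str]:
    # Assumption: naive split on sentence-final punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def tfidf_scores(sentences: list[str]) -> list[float]:
    """Score each sentence by the summed TF-IDF of its tokens (TF is length-normalised)."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences each token appears.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        scores.append(sum(
            (count / len(toks)) * math.log((1 + n) / (1 + df[tok]))
            for tok, count in tf.items()
        ))
    return scores


def filter_low_information(note: str, drop_fraction: float = 0.2) -> str:
    """Drop the lowest-scoring fraction of sentences, keeping the original order."""
    sentences = split_sentences(note)
    n_drop = int(len(sentences) * drop_fraction)
    if n_drop == 0:
        return note
    scores = tfidf_scores(sentences)
    lowest = sorted(range(len(sentences)), key=lambda i: scores[i])[:n_drop]
    dropped = set(lowest)
    return " ".join(s for i, s in enumerate(sentences) if i not in dropped)


# Invented note for illustration only.
note = (
    "Patient seen today. Patient reports panic attacks at school. "
    "Patient seen today for follow up. "
    "Mother describes restless sleep and worry. Patient seen."
)
filtered = filter_low_information(note)
print(filtered)
```

In this example the short, repetitive sentence "Patient seen." carries the least distinctive vocabulary and is the one removed.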
- Automatically detects gender-biased vocabulary such as names and pronouns
- Extracts proper nouns using Stanza tool
- Replaces gender-specific vocabulary with neutral alternatives
- Names → "person1", "person2", etc.
- Pronouns: "he" / "she" → "they"
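A simplified sketch of the substitution step follows. It does not use Stanza: the list of names is passed in by hand as a stand-in for the proper nouns a tagger would extract. The pronoun map beyond "he"/"she" is an assumption added for illustration, and the `neutralize` function name is hypothetical.

```python
import re

# Only "he"/"she" -> "they" is stated in the summary; the other entries are
# illustrative assumptions. "her" is ambiguous (object vs. possessive) and is
# mapped to "their" here as a simplification.
PRONOUN_MAP = {
    "he": "they",
    "she": "they",
    "him": "them",
    "his": "their",
    "her": "their",
}


def neutralize(text: str, names: list[str]) -> str:
    """Replace listed names with person1, person2, ... and gendered pronouns."""
    unique_names = dict.fromkeys(name.lower() for name in names)
    placeholders = {name: f"person{i}" for i, name in enumerate(unique_names, start=1)}

    def replace(match: re.Match) -> str:
        word = match.group(0)
        key = word.lower()
        if key in placeholders:
            return placeholders[key]
        if key in PRONOUN_MAP:
            neutral = PRONOUN_MAP[key]
            # Preserve sentence-initial capitalisation.
            return neutral.capitalize() if word[0].isupper() else neutral
        return word

    return re.sub(r"[A-Za-z]+", replace, text)


# Invented sentence for illustration only.
result = neutralize("Anna said she feels nervous. He told Ben about it.", ["Anna", "Ben"])
print(result)
```

This sketch does not repair verb agreement after substitution (for example "they feels"), which a fuller implementation would need to consider.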
Combines information density filtering and gender word substitution to leverage synergistic effects
- Transformer model based on Clinical-BigBird
- Pre-trained specifically on clinical text
- Supports long sequence input (up to 4,096 tokens)
- Fine-tuning parameters: 2 epochs, learning rate 1e-5, batch size 8
- Source: Cincinnati Children's Hospital Medical Center
- Scale: 1.3 million patients, 63 million clinical notes
- Time Span: January 2009 - March 2022
- Anxiety Cases: 84,426 cases meeting screening criteria
- Final Data: 73,288 patients, 7.81 million notes
- Divided into 5 age groups: 5, 8, 10, 12, 15 years
- 3,700-5,064 training samples per group
- 852-1,278 test samples per group
- 1:1 case-control matching (by age and gender)
- Deduplication: notes with cosine similarity ≥0.8
- Selection of most recent 25 notes
- Input length limited to 1,000 tokens
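The deduplication and note-selection steps can be illustrated as below. Several details are assumptions, since the summary does not specify them: notes are represented as bag-of-words count vectors, they are processed newest first, and when two notes are near-duplicates the more recent one is kept. The `select_notes` name and the example notes are hypothetical.

```python
import math
from collections import Counter
from datetime import date


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_notes(
    notes: list[tuple[date, str]],
    threshold: float = 0.8,
    max_notes: int = 25,
) -> list[str]:
    """Keep up to max_notes of the most recent notes, skipping near-duplicates."""
    kept: list[tuple[str, Counter]] = []
    for _, text in sorted(notes, key=lambda item: item[0], reverse=True):
        vector = Counter(text.lower().split())
        # Skip a note that is too similar to one already kept.
        if any(cosine(vector, other) >= threshold for _, other in kept):
            continue
        kept.append((text, vector))
        if len(kept) == max_notes:
            break
    return [text for text, _ in kept]


# Invented notes for illustration only.
notes = [
    (date(2021, 1, 5), "patient reports worry about school"),
    (date(2021, 2, 1), "patient reports worry about school today"),
    (date(2021, 3, 1), "sleep improved after therapy"),
]
selected = select_notes(notes)
print(selected)
```

Here the January note is dropped because its similarity to the February note is about 0.91, above the 0.8 threshold.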
- Accuracy
- False Negative Rate (FNR) - primary metric of focus
- False Positive Rate (FPR)
- Balanced Error Rate (BER)
- Percentage of uncertain predictions (predicted probability between 0.4 and 0.6)
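The uncertain-prediction metric in the list above is a simple share of probabilities falling in a middle band. In the sketch below, treating the 0.4 and 0.6 bounds as inclusive is an assumption, and the function name and example probabilities are hypothetical.

```python
def uncertain_share(probabilities: list[float], low: float = 0.4, high: float = 0.6) -> float:
    """Fraction of predicted probabilities that fall inside [low, high]."""
    if not probabilities:
        return 0.0
    uncertain = sum(1 for p in probabilities if low <= p <= high)
    return uncertain / len(probabilities)


# Invented model outputs for illustration only.
predicted = [0.05, 0.45, 0.55, 0.90, 0.97]
share = uncertain_share(predicted)
print(f"{share:.0%} of predictions are uncertain")
```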
| Metric | Male | Female | Difference |
|---|---|---|---|
| Accuracy | - | -4% | Lower for females |
| FNR | - | +9% | Higher for females |
| Uncertain Predictions | - | +5% | Higher for females |
| Note Length | Baseline | -500 words | Shorter for females |
- Vocabulary Similarity: Jaccard index 0.54 (male-female)
- Terminology Distribution: Jaccard index 0.34 (significant difference)
- Lowest Similarity: Age 5 and 15 groups (Jaccard 0.43)
Best Method (tf-idf_filt):
- FNR gap reduction of 0.024 (27% improvement)
- Bin 5: FNR gap reduced from 0.13 to 0.02
- Bin 15: FNR gap reduced from 0.13 to 0.07
- BER ratio decreased from 1.33 to 0.98 (Bin 10)
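As a rough consistency check on these figures, the 27% appears to correspond to the 0.024 reduction taken relative to the roughly 9-point baseline FNR gap reported in the abstract. The summary does not state the denominator explicitly, so this reading is an assumption:

$$\text{relative reduction} \approx \frac{0.024}{0.09} \approx 0.27$$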
| Method | Change in FNR gap (negative = smaller gap) | Performance Maintained | Change in uncertain predictions |
|---|---|---|---|
| rnd_filt | No consistent effect | ✓ | - |
| tf-idf_filt | -0.024 | ✓ | -4% |
| gen_sub | +0.008 | ✓ | -3% |
| Combined Method | -0.022 | ✓ | -12% |
LIME was used to analyze the vocabulary that model decisions depend on:
- Original Model: 10% of cases rely on biased vocabulary for prediction
- tf-idf_filt: Reduced to 3%
- Combined Method: 50% reduction in biased vocabulary frequency
- Other racial groups show average FNR 0.05 higher
- Combined method reduces FNR gap by 0.034
- Demonstrates method's generalizability
- Preprocessing techniques: resampling, data augmentation
- Algorithm modification: adversarial debiasing, objective function modification
- Post-processing techniques: calibration, embedding transformation
- Attribute swapping: exchanging sensitive attribute vocabulary
- Embedding debiasing: removing gender components from word embeddings
- Adversarial training: penalizing predictions influenced by protected attributes
- Racial bias in commercial prediction algorithms
- Group disparities in suicide risk prediction
- Demographic bias in pathology models
- Bias is Pervasive: Pediatric anxiety detection models exhibit systematic underdiagnosis for female patients
- Text Differences are Root Cause: Significant differences exist in information density and language distribution between notes of different genders
- Data-Centric Approach is Effective: Significant bias reduction can be achieved through information density balancing and language neutralization
- Clinical Significance: 27% bias reduction is clinically meaningful for improving female patient diagnosis
- Data Quality Dependency: Method effectiveness is limited by EHR text quality and consistency
- Single Bias Type: Focuses only on gender bias, does not address other demographic characteristics
- Generalization Capacity: Generalizability across different clinical settings requires further validation
- Biological Differences: Difficult to completely distinguish biological differences from sociocultural differences
- Extend to other mental health conditions and populations
- Develop more refined bias detection and mitigation techniques
- Incorporate multimodal data (text + structured data)
- Establish standardized fairness assessment frameworks for medical AI
- Problem Importance: Focuses on pediatric mental health, a critical domain with significant social value
- Methodological Innovation: Proposes a data-centric debiasing framework specifically tailored for medical text
- Experimental Rigor: Validated on large-scale real clinical data with multidimensional bias analysis
- Practical Value: Simple and effective methods easily deployable in clinical environments
- Interpretability: Provides interpretable model decision analysis using techniques like LIME
- Theoretical Depth: Lacks deep theoretical analysis of bias generation mechanisms
- Method Limitations: Debiasing methods are relatively simple, potentially oversimplifying the problem
- Single Evaluation Dimension: Primarily focuses on classification fairness, lacks other fairness dimensions such as calibration
- Long-term Impact: Does not evaluate the impact of debiasing on long-term model performance and generalization
- Academic Contribution: Provides important case studies and methodological references for medical NLP bias research
- Practical Value: Offers concrete solutions for improving fairness in clinical AI systems
- Policy Significance: Provides technical support for medical AI regulation and standard-setting
- Reproducibility: Detailed method descriptions ensure good reproducibility
- Clinical Decision Support: Mental health screening and diagnostic assistance systems
- Healthcare Quality Improvement: Identifying and mitigating bias in existing medical AI systems
- Regulatory Compliance: Meeting fairness and ethical requirements for medical AI
- Research Tool: Providing methodological foundation for other medical AI bias research
This paper cites important literature from fairness in machine learning, NLP debiasing, and medical AI, including:
- Feldman et al. (2015) - Fairness metrics
- Bolukbasi et al. (2016) - Word embedding debiasing
- Obermeyer et al. (2019) - Racial bias in medical algorithms
- Ribeiro et al. (2016) - LIME interpretability method
Overall Assessment: This is an important research paper in the field of medical AI fairness that not only identifies gender bias issues in pediatric mental health AI but also proposes practical solutions. While there is room for improvement in theoretical depth and methodological complexity, its practical value and social significance make it an important contribution to the field.