
Validation of an Artificial Intelligence Tool for the Detection of Sperm DNA Fragmentation Using the TUNEL In Situ Hybridization Assay

Jacobs, Morris, Shaik et al.
Sperm DNA fragmentation (SDF) is a critical parameter in male fertility assessment that conventional semen analysis fails to evaluate. This study presents the validation of a novel artificial intelligence (AI) tool designed to detect SDF through digital analysis of phase contrast microscopy images, using the terminal deoxynucleotidyl transferase dUTP nick end labeling (TUNEL) assay as the gold standard reference. Utilising the established link between sperm morphology and DNA integrity, the present work proposes a morphology assisted ensemble AI model that combines image processing techniques with state-of-the-art transformer based machine learning models (GC-ViT) for the prediction of DNA fragmentation in sperm from phase contrast images. The ensemble model is benchmarked against a pure transformer 'vision' model as well as a 'morphology-only' model. Promising results show the proposed framework is able to achieve sensitivity of 60% and specificity of 75%. This non-destructive methodology represents a significant advancement in reproductive medicine by enabling real-time sperm selection based on DNA integrity for clinical diagnostic and therapeutic applications.

Basic Information

  • Paper ID: 2510.11142
  • Title: Validation of an Artificial Intelligence Tool for the Detection of Sperm DNA Fragmentation Using the TUNEL In Situ Hybridization Assay
  • Authors: B. A. Jacobs, A. Morris, I. Shaik, F. Lin
  • Classification: cs.CV (Computer Vision)
  • Publication Date: October 13, 2025 (arXiv preprint)
  • Paper Link: https://arxiv.org/abs/2510.11142v1

Abstract

Sperm DNA fragmentation (SDF) is a critical parameter in male fertility assessment; however, conventional semen analysis cannot evaluate this indicator. This study proposes and validates a novel artificial intelligence tool for detecting SDF through digital analysis of phase-contrast microscopy images, using terminal deoxynucleotidyl transferase dUTP nick end labeling (TUNEL) assay as the gold standard reference. Leveraging the established relationship between sperm morphology and DNA integrity, this study presents a morphology-assisted integrated AI model that combines image processing techniques with state-of-the-art Transformer-based machine learning models (GC-ViT) to predict DNA fragmentation in sperm from phase-contrast images. The integrated model was benchmarked against pure Transformer vision models and morphology-only models. Results demonstrate that the proposed framework achieves 60% sensitivity and 75% specificity. This non-invasive approach represents a significant advancement in clinical diagnostics and therapeutic applications in reproductive medicine by enabling real-time sperm selection based on DNA integrity.

Research Background and Motivation

Problem Definition

  1. Core Issue: Conventional semen analysis cannot assess sperm DNA fragmentation (SDF), a critical fertility parameter directly associated with reduced fertilization rates, impaired embryonic development, and increased miscarriage rates.
  2. Clinical Significance:
    • Infertility affects approximately 15% of reproductive-age couples globally, with male factors accounting for 30-50% of cases
    • SDF directly impacts the success rates of assisted reproductive technology (ART)
    • Existing detection methods compromise sperm viability, precluding their use in subsequent therapeutic procedures
  3. Limitations of Current Methods:
    • TUNEL Assay: Requires specialized equipment and trained personnel; time-consuming and costly; fixation and staining processes inactivate sperm
    • Inconsistent Detection Methods: Multiple detection approaches (AOT, CMA3, SCSA, COMET, SCD) yield inconsistent results
    • High Subjectivity: Manual interpretation exhibits both intra-observer and inter-observer variability
  4. Research Motivation: Develop an AI-based, non-invasive, rapid, and objective SDF detection tool capable of preserving sperm viability for subsequent ART procedures.

Core Contributions

  1. Proposed a Morphology-Assisted Integrated AI Model: Combines image processing techniques with GC-ViT Transformer models, leveraging the association between sperm morphology and DNA integrity for prediction
  2. Developed a Non-Invasive Detection Method: Performs SDF detection using only phase-contrast microscopy images while maintaining sperm viability for subsequent treatment
  3. Constructed an Annotated Dataset: Comprises 1,825 sperm image triplets (bright-field, phase-contrast, fluorescence) from 35 patients
  4. Quantified Intra-Observer Variability: Through digital analysis, revealed the subjectivity inherent in traditional manual assessment (intra-observer concordance of only 81%)
  5. Established Performance Benchmarks: Validated the feasibility of the AI-assisted tool at sensitivity of 60% and specificity of 75%

Methodology Details

Task Definition

  • Input: Phase-contrast microscopy images of sperm
  • Output: Binary classification result (DNA fragmented/non-fragmented)
  • Constraints: Non-invasive, real-time processing, applicable to both live and dead sperm

Model Architecture

1. Ensemble Model

Input: Phase-contrast image + Morphological features
     ↓
GC-ViT Transformer → Visual features
     ↓
Morphological feature extraction → Morphological features
     ↓
Feature fusion module → Classification head (1024→256 nodes)
     ↓
Output: DNA fragmentation probability

2. Comparative Models

  • Pure Vision Model: Processes phase-contrast images using GC-ViT only
  • Morphology-Only Model: Uses only morphological parameters extracted from phase-contrast images

3. Key Technical Components

  • GC-ViT Transformer: Global context vision Transformer as backbone network
  • Morphological Features: Head length, width, vacuole presence, acrosomal region parameters
  • Feature Fusion: Adaptive module selecting visual features, morphological features, or both
  • Classification Head: Two-layer fully connected network (1024→256 nodes) with LeakyReLU activation and Dropout regularization
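
The fusion and classification steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: fusion is modeled as simple concatenation, "1024→256 nodes" is read as two hidden layers followed by a single output logit, the LeakyReLU slope of 0.01 is assumed, and the function names (`build_head`, `predict_fragmentation`) are hypothetical. Dropout is omitted because it acts as the identity at inference time.

```python
import math
import random


def leaky_relu(x, slope=0.01):
    """LeakyReLU activation; the slope value is an assumption."""
    return x if x > 0 else slope * x


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random weights scaled by 1/sqrt(n_in); biases start at zero."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_head(n_in, hidden=(1024, 256), seed=0):
    """Untrained head: n_in -> hidden[0] -> hidden[1] -> 1 output logit."""
    rng = random.Random(seed)
    sizes = [n_in, *hidden, 1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict_fragmentation(visual, morphology, head):
    """Concatenate both feature vectors and return a probability in (0, 1)."""
    x = list(visual) + list(morphology)  # fusion by concatenation (assumed)
    for weights, biases in head[:-1]:
        x = [leaky_relu(v) for v in dense(x, weights, biases)]
    logit = dense(x, *head[-1])[0]
    return 1.0 / (1.0 + math.exp(-logit))


# Small illustrative sizes so the sketch runs quickly; weights are random.
demo_head = build_head(n_in=12, hidden=(8, 4))
demo_visual = [0.1] * 8                 # stand-in for GC-ViT features
demo_morphology = [4.5, 2.8, 0.0, 0.4]  # hypothetical length, width, vacuole, acrosome
demo_probability = predict_fragmentation(demo_visual, demo_morphology, demo_head)
```

Because the weights are untrained, the returned probability carries no meaning; the sketch only shows how the two feature streams could be combined before classification.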

Technical Innovations

  1. Multimodal Fusion: First application combining Transformer vision models with sperm morphological features for SDF detection
  2. Non-Invasive Detection: Overcomes limitations of traditional chemical detection, enabling viability-preserving analysis
  3. Transfer Learning Strategy: Hierarchical learning rate decay and early stopping strategies tailored for small datasets
  4. Objective Quantification: Provides reproducible quantitative analysis, reducing subjective bias

Experimental Setup

Dataset

  • Sample Source: Semen samples from 35 consenting patients
  • Image Count: 1,825 image triplets (bright-field, phase-contrast, fluorescence)
  • Annotation Distribution:
    • Fragmented: 512 images
    • Non-fragmented: 715 images
    • Indeterminate: 591 images (excluded)
  • Data Partition:
    • Training set: 1,017 images (28 patients)
    • Validation set: 210 images (7 patients)
    • Patient-level grouping to prevent data leakage
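
A patient-level partition of this kind could be sketched as follows. The record format (`(patient_id, image_id)` pairs), the function name `split_by_patient`, and the random seed are assumptions for illustration; only the 28/7 patient split comes from the description above.

```python
import random
from collections import defaultdict


def split_by_patient(records, n_val_patients, seed=0):
    """Split (patient_id, image_id) records so no patient spans both sets."""
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    # Shuffle patients, not images, so all images of a patient stay together.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    val_patients = patients[:n_val_patients]
    train_patients = patients[n_val_patients:]

    train = [(p, i) for p in train_patients for i in by_patient[p]]
    val = [(p, i) for p in val_patients for i in by_patient[p]]
    return train, val


# Hypothetical records: 35 patients with three images each.
demo_records = [(f"patient_{p:02d}", f"img_{p:02d}_{k}")
                for p in range(35) for k in range(3)]
demo_train, demo_val = split_by_patient(demo_records, n_val_patients=7)
```

Splitting at the image level instead would let images from the same patient appear in both sets, which tends to inflate validation scores.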

Evaluation Metrics

  • Sensitivity (Recall): Proportion of fragmented sperm correctly identified
  • Specificity: Proportion of non-fragmented sperm correctly identified
  • Precision: Proportion of true fragmented sperm among predicted fragmented
  • Accuracy: Overall classification correctness rate
  • F1 Score: Harmonic mean of precision and recall
  • ROC Curve: Receiver operating characteristic curve
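
The threshold-based metrics above all follow from the four confusion-matrix counts, as in this short sketch. The example counts are hypothetical: they were chosen so the ratios reproduce the reported ensemble figures, and they are not taken from the paper.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fn, fp, tn):
    """Compute metrics with 'fragmented' treated as the positive class."""
    sensitivity = safe_ratio(tp, tp + fn)   # recall on fragmented sperm
    specificity = safe_ratio(tn, tn + fp)   # recall on non-fragmented sperm
    precision = safe_ratio(tp, tp + fp)
    accuracy = safe_ratio(tp + tn, tp + fn + fp + tn)
    f1 = safe_ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts (not from the paper), for illustration only.
demo_metrics = classification_metrics(tp=48, fn=32, fp=32, tn=96)
```

With these counts the function gives sensitivity 0.60, specificity 0.75, precision 0.60, F1 0.60, and accuracy of about 0.69.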

Comparative Methods

  • GC-ViT pure vision model
  • Morphology-only model
  • Ensemble model

Implementation Details

  • Optimizer: Adam with initial learning rate 5×10⁻⁵
  • Learning Rate Schedule: Hierarchical decay (decay factor 0.12), warmup ratio 0.1
  • Loss Function: Binary cross-entropy
  • Regularization: Dropout (0.6, 0.3), early stopping (10 epochs)
  • Data Augmentation: Random rotation and flipping
  • Training Epochs: Maximum 50
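
The schedule settings above could be combined roughly as in the sketch below. Several points are assumptions rather than details confirmed by the source: "hierarchical decay" is read as a per-group learning rate that shrinks by the decay factor for each step away from the classification head, warmup is taken to be linear over the first 10% of steps, and "early stopping (10 epochs)" is read as a patience of 10 epochs on validation loss. All names are hypothetical.

```python
def group_learning_rates(base_lr=5e-5, decay=0.12, n_groups=3):
    """Assumed reading of hierarchical decay: group 0 is the head and each
    deeper (earlier) backbone group gets the previous rate times `decay`."""
    return [base_lr * decay ** depth for depth in range(n_groups)]


def warmup_scale(step, total_steps, warmup_ratio=0.1):
    """Linear warmup multiplier that reaches 1.0 after the warmup steps."""
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    return min(1.0, (step + 1) / warmup_steps)


class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, max_epochs=50, patience=10):
    """Count how many epochs run before early stopping or the epoch cap."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.should_stop(loss):
            return epoch
    return min(len(val_losses), max_epochs)
```

Under this reading, a base rate of 5×10⁻⁵ with three parameter groups would give roughly 5×10⁻⁵, 6×10⁻⁶, and 7.2×10⁻⁷, so the pretrained backbone layers would change far more slowly than the new classification head.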

Experimental Results

Primary Results

Model Type          Sensitivity  Specificity  Precision  Accuracy  F1 Score
Ensemble Model      0.60         0.75         0.60       0.69      0.60
Morphology Model    0.78         0.44         0.47       0.57      0.59
Pure Vision Model   0.78         0.46         0.48       0.59      0.60

Key Findings

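  1. Balanced Ensemble Performance: The ensemble model gave the best trade-off between sensitivity and specificity. It reached higher specificity (0.75 vs. 0.44-0.46) and accuracy (0.69 vs. 0.57-0.59) than the single-modality models, at the cost of lower sensitivity (0.60 vs. 0.78); F1 scores were nearly identical (0.59-0.60)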
  2. Intra-Observer Variability: The same expert's re-annotation after 10 months showed concordance of only 81%, with absolute mean difference in patient-level SDF percentage of 13.7%±19.5%
  3. Model Stability: Learning curves show no significant overfitting, and the ROC curves lie above the random-classification diagonal
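
The two intra-observer figures reported above (label concordance and the patient-level SDF difference) could be computed along these lines. This is a sketch under assumptions: labels are encoded as 1 for fragmented and 0 for non-fragmented, the spread is taken as the sample standard deviation, and the toy annotations are invented for illustration.

```python
from statistics import mean, stdev


def concordance(first_pass, second_pass):
    """Fraction of images that received the same label in both passes."""
    matches = sum(a == b for a, b in zip(first_pass, second_pass))
    return matches / len(first_pass)


def sdf_percent(labels):
    """Percentage of sperm labeled fragmented (1) for one patient."""
    return 100.0 * sum(labels) / len(labels)


def sdf_difference_stats(first_by_patient, second_by_patient):
    """Mean and sample SD of the absolute per-patient SDF% difference."""
    diffs = [abs(sdf_percent(first_by_patient[p]) - sdf_percent(second_by_patient[p]))
             for p in first_by_patient]
    return mean(diffs), stdev(diffs)


# Invented toy annotations for two patients, two annotation passes each.
demo_first = {"A": [1, 0, 1, 1], "B": [0, 0]}
demo_second = {"A": [1, 0, 0, 1], "B": [0, 1]}
demo_mean_diff, demo_sd_diff = sdf_difference_stats(demo_first, demo_second)
```

A standard deviation larger than the mean, as in the reported 13.7%±19.5%, suggests that a few patients account for much of the disagreement.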

Case Analysis

  • Correct Classification Cases: The ensemble model balances visual and morphological information, correctly classifying cases where single modalities fail
  • Misclassification Cases: Primarily attributable to multiple sperm tails in images or image blur causing morphological measurement errors

Related Work

Traditional SDF Detection Methods

  • TUNEL Assay: Gold standard but compromises sperm viability
  • Alternative Methods: AOT, CMA3, SCSA, COMET, SCD, etc., with inconsistent results

AI Applications in Sperm Analysis

  • Serrano Berenguer et al. (2022): Random forest and CNN for COMET result prediction
  • Wang et al. (2019): Linear and nonlinear regression models based on AOT data with 82.7% test accuracy
  • Advantages of This Study: Non-invasive, multimodal fusion, real-time processing capability

Conclusions and Discussion

Main Conclusions

  1. Successfully developed an AI-based non-invasive SDF detection tool
  2. The ensemble model achieved balanced performance with 60% sensitivity and 75% specificity
  3. Provides a novel solution for sperm selection in assisted reproductive technology

Limitations

  1. Dataset Scale: Relatively small dataset limits further performance improvement
  2. Single-Expert Annotation: Lacks multi-expert annotation for inter-observer variability assessment
  3. Sensitivity Improvement Needed: 60% sensitivity warrants further enhancement

Future Directions

  1. Expand training dataset scale
  2. Conduct multi-center clinical validation
  3. Integrate insights from multiple SDF detection methods
  4. Develop real-time clinical application systems

In-Depth Evaluation

Strengths

  1. Significant Clinical Value: Addresses genuine clinical needs in reproductive medicine
  2. Strong Technical Innovation: First application combining Transformer models with sperm morphological features for SDF detection
  3. Rigorous Methodology: Patient-level data grouping prevents leakage; quantifies intra-observer variability
  4. High Practical Value: Non-invasive detection preserves sperm viability, suitable for clinical application

Limitations

  1. Sample Size Constraints: 1,825 image triplets (1,227 labeled images after excluding indeterminate cases) is relatively small for deep learning models
  2. Single-Center Study: Lacks multi-center validation; generalizability requires verification
  3. Performance Enhancement Needed: 60% sensitivity may be suboptimal for clinical application
  4. Absent Cost-Benefit Analysis: No economic comparison with traditional methods provided

Impact

  1. Academic Contribution: Provides novel perspectives for AI applications in reproductive medicine
  2. Clinical Translation Potential: Promises to improve ART success rates, benefiting infertile patients
  3. Technology Dissemination Value: Extensible to other medical image analysis tasks

Applicable Scenarios

  1. IVF/ICSI Procedures: Pre-operative sperm quality assessment and selection
  2. Male Infertility Diagnosis: Objective SDF evaluation
  3. Reproductive Medicine Research: Standardized SDF detection tool
  4. Telemedicine: Automated analysis reduces dependence on specialized personnel

References

This study cites important literature from reproductive medicine, machine learning, and image processing, including WHO semen examination manuals, TUNEL assay standard protocols, and recent research on AI applications in medical image analysis.


Overall Assessment: This is an important interdisciplinary study applying advanced AI technology to address practical problems in reproductive medicine. While there remains room for improvement in dataset scale and performance, its innovative non-invasive detection concept and multimodal fusion technical approach provide clear direction for future development in this field.