Red blood cells (RBCs) are essential to human health, and their precise morphological analysis is important for diagnosing hematological disorders. Despite the promise of foundation models in medical diagnostics, comprehensive AI solutions for RBC analysis remain scarce. We present RedDino, a self-supervised foundation model designed for RBC image analysis. RedDino uses an RBC-specific adaptation of the DINOv2 self-supervised learning framework and is trained on a curated dataset of 1.25 million RBC images from diverse acquisition modalities and sources. Extensive evaluations show that RedDino outperforms existing state-of-the-art models on RBC shape classification. Through assessments including linear probing and nearest neighbor classification, we confirm its strong feature representations and generalization ability. Our main contributions are: (1) a foundation model tailored for RBC analysis, (2) ablation studies exploring DINOv2 configurations for RBC modeling, and (3) a detailed evaluation of generalization performance. RedDino addresses key challenges in computational hematology by capturing nuanced morphological features, advancing the development of reliable diagnostic tools. The source code and pretrained models for RedDino are available at https://github.com/Snarci/RedDino, and the pretrained models can be downloaded from our Hugging Face collection at https://huggingface.co/collections/Snarcy/reddino-689a13e29241d2e5690202fc
- Paper ID: 2508.08180
- Title: RedDino: A foundation model for red blood cell analysis
- Authors: Luca Zedda, Andrea Loddo, Cecilia Di Ruberto, Carsten Marr
- Categories: eess.IV cs.AI cs.CV
- Publication Date: August 22, 2025 (arXiv v2)
- Paper Link: https://arxiv.org/abs/2508.08180
Red blood cells (RBCs) are crucial to human health, and precise morphological analysis is essential for diagnosing hematological diseases. Although foundation models have demonstrated significant potential in medical diagnostics, comprehensive AI solutions specifically for RBC analysis remain scarce. This paper proposes RedDino, a self-supervised foundation model specifically designed for RBC image analysis. RedDino employs a DINOv2 self-supervised learning framework tailored for RBCs, trained on a carefully curated dataset containing 1.25 million RBC images from different acquisition modalities and sources. Extensive evaluation demonstrates that RedDino significantly outperforms existing state-of-the-art models on RBC shape classification tasks. The model's strong feature representation and generalization capabilities are validated through linear probing and nearest neighbor classification evaluation methods.
Red blood cell morphological analysis is fundamental to hematological diagnostics but faces several key challenges:
- Staining and imaging variability: Different staining protocols and imaging devices introduce bias, increasing analysis complexity
- Batch effects: Significant systematic differences exist in multi-source, multi-patient scenarios
- Professional training requirements: Traditional analysis requires extensive professional training
- Lack of specialized AI tools: Compared to white blood cell analysis, RBC analysis lacks mature foundation models
While foundation models have demonstrated significant advantages in white blood cell analysis, effectively predicting clinical outcomes and addressing batch effects, the RBC analysis field has not yet fully explored the potential of these advanced techniques. This research aims to fill this gap by developing a foundation model specifically tailored for RBC analysis.
- Specialized foundation model: Proposes RedDino, the first self-supervised foundation model family optimized specifically for RBC analysis
- In-depth configuration study: Conducts rigorous comparative analysis of DINOv2 configurations for RBC morphological modeling
- Comprehensive performance evaluation: Performs extensive benchmarking on multiple RBC datasets, demonstrating superiority over existing state-of-the-art models
- Strong generalization capability: Effectively mitigates batch effects, demonstrating excellent cross-domain generalization performance
RedDino aims to learn universal RBC feature representations that support downstream RBC shape classification, anomaly detection, and morphological analysis tasks. The input is an RBC microscopy image, and the output is a high-dimensional feature vector that can be reused across RBC analysis tasks.
RedDino is built upon the DINOv2 self-supervised learning framework, employing Vision Transformer (ViT) as the backbone network. The model family includes three versions:
- RedDino Small: Feature dimension 384, batch size 512, 22 million parameters
- RedDino Base: Feature dimension 768, batch size 384, 86 million parameters
- RedDino Large: Feature dimension 1024, batch size 256, 304 million parameters
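
The parameter counts listed above are in line with standard ViT sizes. As a rough check, a transformer block holds about 12·d² weights (4·d² for the attention projections and 8·d² for an MLP with 4× expansion). The depths used below (12, 12, and 24 blocks) are the usual ViT-S/B/L values and are an assumption here, not something stated in this summary; embeddings, biases, and normalization layers are ignored.

```python
# Rough ViT parameter estimate: each block has ~12 * d^2 weights
# (4 * d^2 for the Q, K, V, and output projections; 8 * d^2 for a 4x-wide MLP).
# Depths are the standard ViT-S/B/L values (assumed, not taken from the paper).
VARIANTS = {
    "small": {"dim": 384, "depth": 12},
    "base": {"dim": 768, "depth": 12},
    "large": {"dim": 1024, "depth": 24},
}


def approx_params_millions(dim: int, depth: int) -> float:
    """Approximate block parameters in millions (no embeddings, biases, norms)."""
    return 12 * depth * dim * dim / 1e6


estimates = {name: approx_params_millions(**cfg) for name, cfg in VARIANTS.items()}
for name, value in estimates.items():
    print(f"RedDino {name}: ~{value:.0f}M block parameters")
```

The estimates land slightly below the reported 22, 86, and 304 million, which is expected since the omitted embedding and normalization parameters account for the remainder.
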
- Removal of KoLeo regularizer: The original DINOv2 uses the KoLeo regularizer to spread features uniformly within each batch and so discourage feature collapse. However, because RBCs are naturally consistent in shape and color, this regularizer excessively suppresses the feature expression of pathological and abnormal RBCs
- Sinkhorn-Knopp centering: Replaces moving average centering, improving representation quality
- Customized data augmentation: Replaces DINOv2's original augmentation strategy with 32 pixel-level augmentations from the Albumentations library
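
To make the Sinkhorn-Knopp centering step more concrete, here is a minimal NumPy sketch in the style used by SwAV-like self-supervised methods. It is not the RedDino implementation: the function name, the temperature, and the iteration count are illustrative assumptions. The idea is to turn teacher logits into soft assignments in which every prototype receives roughly the same total mass across the batch, instead of subtracting a moving-average center.

```python
import numpy as np


def sinkhorn_knopp(logits: np.ndarray, temperature: float = 0.05,
                   n_iters: int = 3) -> np.ndarray:
    """Turn teacher logits (n_samples, n_prototypes) into balanced soft targets."""
    # Subtract the global max before exponentiating for numerical stability;
    # the constant factor cancels in the normalizations below.
    q = np.exp((logits - logits.max()) / temperature).T  # (n_prototypes, n_samples)
    n_prototypes, n_samples = q.shape
    q /= q.sum()
    for _ in range(n_iters):
        # Rows: every prototype receives the same total mass
        q /= q.sum(axis=1, keepdims=True)
        q /= n_prototypes
        # Columns: every sample carries the same total mass
        q /= q.sum(axis=0, keepdims=True)
        q /= n_samples
    # Rescale so each sample's assignment is a probability distribution
    return (q * n_samples).T


rng = np.random.default_rng(0)
targets = sinkhorn_knopp(rng.normal(size=(8, 4)))  # (8 samples, 4 prototypes)
```

Each row of `targets` sums to 1, and a few iterations are usually enough to make the per-prototype totals approximately uniform.
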
- Data scale: 56,712 raw images from 18 datasets, covering over 420 individuals
- Data extraction: Two methods employed:
  - Cell segmentation using improved CellPose, producing 3,076,269 segmented cells
  - Extraction of 224×224 pixel non-overlapping image patches, generating 1,250,781 image patches
- Data balancing: White blood cell image datasets are incorporated to mitigate natural imbalance between red and white blood cells
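
The non-overlapping patch extraction described above can be sketched as follows. This is an illustrative helper, not the authors' code; in particular, dropping border pixels that do not fill a whole 224×224 patch is an assumption, since the summary does not say how image borders are handled.

```python
import numpy as np


def extract_patches(image: np.ndarray, size: int = 224) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping size x size patches.

    Border pixels that do not fill a whole patch are dropped (an assumption).
    """
    h, w, c = image.shape
    rows, cols = h // size, w // size
    cropped = image[: rows * size, : cols * size]
    # (rows, size, cols, size, C) -> (rows, cols, size, size, C) -> (N, size, size, C)
    patches = cropped.reshape(rows, size, cols, size, c).swapaxes(1, 2)
    return patches.reshape(rows * cols, size, size, c)


image = np.zeros((1000, 700, 3), dtype=np.uint8)
patches = extract_patches(image)
print(patches.shape)  # (12, 224, 224, 3)
```
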
Systematic experiments reveal:
- Training with image patches outperforms single-cell training
- Removing local crops significantly improves performance
- Customized augmentation pipeline further enhances feature quality
Training data: 18 public RBC datasets, including different imaging modalities, resolutions, and staining techniques
Test data:
- Elsafty dataset: 240,000 images, 9 classes, from 4 different sources
- Chula dataset: 20,875 images, 12 RBC classes
- DSE dataset: 5,659 images, 8 classes
- Accuracy (Acc)
- Balanced Accuracy (bAcc)
- Weighted F1 Score (wF1)
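
For reference, the three metrics can be computed from true and predicted labels as in the sketch below. It uses the standard definitions (balanced accuracy as the unweighted mean of per-class recall, weighted F1 as the support-weighted mean of per-class F1); the function name and the toy labels are illustrative only.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, balanced accuracy, weighted F1) for two label sequences."""
    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    acc = sum(hits.values()) / len(y_true)
    # Balanced accuracy: unweighted mean of per-class recall
    bacc = sum(hits[c] / support[c] for c in support) / len(support)

    wf1 = 0.0
    for c, n in support.items():
        precision = hits[c] / predicted[c] if predicted[c] else 0.0
        recall = hits[c] / n
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        wf1 += f1 * n / len(y_true)  # weight each class F1 by its support
    return acc, bacc, wf1


y_true = ["normal", "normal", "normal", "normal", "sickle", "sickle"]
y_pred = ["normal", "normal", "normal", "sickle", "sickle", "normal"]
acc, bacc, wf1 = classification_metrics(y_true, y_pred)
```

On this toy example the accuracy is 4/6 while the balanced accuracy is 0.625, showing how the latter penalizes errors on the minority class more strongly.
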
- ResNet50
- DINOv2 (Small/Base/Large)
- DinoBloom (Small/Base/Large) - current state-of-the-art feature extractor for hematological data
- Linear probing: Evaluates feature adaptation capability for downstream tasks
- K-nearest neighbor classification (1-NN, 20-NN): Evaluates feature robustness under batch effects
- Cross-source evaluation: Uses leave-one-source-out validation strategy
- Five-fold cross-validation: For imbalanced datasets
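
A minimal sketch of k-nearest-neighbor evaluation with a leave-one-source-out split is shown below. It assumes features have already been extracted by a frozen encoder; the cosine-similarity metric, the function names, and the synthetic data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def knn_predict(train_x, train_y, test_x, k=1):
    """Majority vote among the k nearest training features (cosine similarity)."""
    train_n = train_x / np.linalg.norm(train_x, axis=1, keepdims=True)
    test_n = test_x / np.linalg.norm(test_x, axis=1, keepdims=True)
    nearest = np.argsort(-(test_n @ train_n.T), axis=1)[:, :k]
    votes = train_y[nearest]  # (n_test, k) integer labels
    return np.array([np.bincount(row).argmax() for row in votes])


def leave_one_source_out(features, labels, sources, k=1):
    """Hold out each source in turn and return the per-source k-NN accuracy."""
    scores = {}
    for src in np.unique(sources):
        held_out = sources == src
        pred = knn_predict(features[~held_out], labels[~held_out],
                           features[held_out], k)
        scores[str(src)] = float((pred == labels[held_out]).mean())
    return scores


# Synthetic stand-in for frozen-encoder features: 2 classes, 3 sources
rng = np.random.default_rng(0)
labels = np.tile([0, 1], 30)
sources = np.repeat(["a", "b", "c"], 20)
centers = np.array([[5.0, 0.0], [0.0, 5.0]])
features = centers[labels] + rng.normal(scale=0.5, size=(60, 2))
scores = leave_one_source_out(features, labels, sources, k=5)
```

Because no classifier is trained, k-NN accuracy on a held-out source reflects how well the raw feature space transfers across acquisition conditions.
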
In the most challenging cross-source evaluation, RedDino achieves significant advantages:
| Model | Linear Probe wF1 | 1-NN wF1 | 20-NN wF1 |
|---|---|---|---|
| ResNet50 | 77.6±8.1 | 64.3±4.8 | 66.2±4.9 |
| DinoBloom-L | 85.4±5.2 | 74.1±5.0 | 77.0±4.5 |
| DINOv2 large | 86.0±5.6 | 73.7±6.2 | 76.4±7.0 |
| RedDino base | 88.1±4.9 | 78.8±3.6 | 82.6±2.8 |
| RedDino large | 88.5±5.5 | 78.5±4.6 | 81.6±4.7 |
Key Findings:
- RedDino improves on the best baseline methods by more than 2.1% (linear probing) and 3.0% (nearest neighbor classification)
- Average improvement margins range from 4.0% to 6.5%, indicating a consistent performance advantage
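
As a quick arithmetic check, the gap between the best RedDino variant and the best baseline can be computed per evaluation setting from the rows of the cross-source table. The table lists only a selection of the evaluated models, so these differences need not match the quoted figures exactly.

```python
# wF1 values copied from the cross-source table (linear probe, 1-NN, 20-NN)
baselines = {
    "ResNet50": (77.6, 64.3, 66.2),
    "DinoBloom-L": (85.4, 74.1, 77.0),
    "DINOv2 large": (86.0, 73.7, 76.4),
}
reddino = {
    "RedDino base": (88.1, 78.8, 82.6),
    "RedDino large": (88.5, 78.5, 81.6),
}

settings = ("linear probe", "1-NN", "20-NN")
margins = {}
for i, setting in enumerate(settings):
    best_baseline = max(row[i] for row in baselines.values())
    best_reddino = max(row[i] for row in reddino.values())
    # Difference in wF1 percentage points, rounded to the table's precision
    margins[setting] = round(best_reddino - best_baseline, 1)
    print(f"{setting}: {best_reddino} vs {best_baseline} (+{margins[setting]} points)")
```
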
On the Chula and DSE datasets with five-fold cross-validation, RedDino similarly demonstrates excellent performance, surpassing baseline methods on nearly all metrics.
Impact of key configuration improvements:
- Removal of KoLeo regularizer: Significantly improves performance, preventing pathological RBC features from being excessively suppressed
- Sinkhorn-Knopp centering: Further performance improvement when replacing moving average centering
- Image patches vs. single-cell training: Image patch training strategy outperforms single-cell training
- Customized augmentation pipeline: Shows clear improvements compared to original DINOv2 augmentation strategy
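
To illustrate what removing the KoLeo regularizer changes, the sketch below implements the regularizer as defined for DINOv2: the negative mean log distance from each L2-normalized feature to its nearest neighbor in the batch. The function name and the synthetic batches are illustrative assumptions. The loss grows when embeddings sit close together, so minimizing it pushes features apart, which is the behavior the ablation suggests is unhelpful for visually homogeneous RBC images.

```python
import numpy as np


def koleo_loss(features: np.ndarray, eps: float = 1e-8) -> float:
    """KoLeo regularizer: -mean(log(distance to the nearest neighbor in the batch))."""
    x = features / np.linalg.norm(features, axis=1, keepdims=True)  # L2-normalize
    dists = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    return float(-np.log(dists.min(axis=1) + eps).mean())


rng = np.random.default_rng(0)
spread = rng.normal(size=(64, 16))
# Near-identical embeddings, as might arise from visually uniform cells
clustered = 1.0 + 0.01 * rng.normal(size=(64, 16))

loss_spread = koleo_loss(spread)
loss_clustered = koleo_loss(clustered)
print(f"spread: {loss_spread:.2f}, clustered: {loss_clustered:.2f}")
```

The clustered batch yields the larger loss, so with the regularizer active the optimizer is rewarded for separating samples even when they are genuinely similar.
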
Three-component PCA visualization validates RedDino feature effectiveness:
- Capable of distinguishing background, cells, membrane structures, and parasites
- Demonstrates excellent discrimination ability for abnormal morphologies such as malaria-infected RBCs and acanthocytes
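
A three-component PCA visualization of this kind is typically produced by projecting per-patch token features onto their first three principal components and displaying them as RGB channels. The sketch below shows that procedure on random stand-in features; the 16×16 token grid (a 224×224 input with 14-pixel patches) and the 384-dimensional features are assumed values for illustration, and the function name is hypothetical.

```python
import numpy as np


def pca_to_rgb(patch_features: np.ndarray, grid_hw: tuple) -> np.ndarray:
    """Project (n_patches, dim) features onto 3 principal components as an RGB map."""
    centered = patch_features - patch_features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    comps = centered @ vt[:3].T  # (n_patches, 3)
    # Min-max scale each component to [0, 1] so it can be shown as a color channel
    lo, hi = comps.min(axis=0), comps.max(axis=0)
    rgb = (comps - lo) / (hi - lo + 1e-12)
    return rgb.reshape(*grid_hw, 3)


# Random stand-in for the patch tokens of one image (assumed sizes)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16 * 16, 384))
rgb_map = pca_to_rgb(tokens, (16, 16))
```

With real encoder features, regions that share a color in `rgb_map` correspond to patches the model represents similarly, such as background versus cell interior.
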
UMAP projection using the Elsafty dataset shows:
- Different classes form clear clusters with no apparent batch effects
- Classes that are clinically difficult to distinguish (such as spherocytes and elliptocytes) also overlap in feature space
- Cell aggregates form unique clusters, proving the model can distinguish single cells from aggregates
- White blood cell analysis: Mature foundation models such as DinoBloom exist, demonstrating excellent performance in clinical outcome prediction
- Red blood cell analysis: Comparatively underdeveloped, lacking specialized foundation models
- Computer-aided diagnosis: Gradually becoming an important tool for addressing critical diagnostic challenges in hematology
Self-supervised methods such as DINOv2 have achieved tremendous success on natural images, but their application in medical imaging, particularly RBC analysis, remains to be fully explored.
- Performance breakthrough: RedDino achieves new state-of-the-art performance on RBC classification tasks
- Strong generalization capability: Effectively mitigates batch effects, demonstrating excellent performance in cross-source scenarios
- High practical value: Provides reliable foundational tools for automated hematological diagnostics
- Training data constraints: Despite the large dataset scale, certain rare RBC morphologies may be underrepresented
- Computational resource requirements: Large model versions require substantial computational resources
- Annotated data dependency: Downstream tasks still require certain amounts of annotated data for fine-tuning
- Extended application scenarios: Explore applications in other hematological tasks
- Model compression: Develop lighter-weight versions for resource-constrained environments
- Multimodal fusion: Incorporate other types of medical data to improve diagnostic accuracy
- Strong problem specificity: Specifically addresses RBC analysis, an important yet overlooked field
- Reasonable methodology design: Makes targeted improvements to DINOv2 based on RBC characteristics
- Rigorous experimental design: Employs strict evaluation methods such as cross-source validation, ensuring result reliability
- Large dataset contribution: Constructs the largest RBC image training collection to date
- Open-source friendly: Provides complete code and pre-trained models
- Limited theoretical analysis: The theoretical explanation for why removing the KoLeo regularizer is effective lacks depth
- Insufficient computational cost analysis: Lacks detailed analysis of computational efficiency trade-offs between different model versions
- Lack of clinical validation: Absence of validation results in real clinical environments
- Academic value: Provides important foundational tools and benchmarks for the RBC analysis field
- Practical value: Has potential to significantly enhance automation of hematological diagnostics
- Reproducibility: Provides complete open-source implementation, facilitating use and improvement by the research community
- Blood pathology diagnostic assistance
- Large-scale blood screening
- RBC morphological research
- Hematological education and training tool development
RedDino's core innovation lies in successfully adapting a general self-supervised learning framework to specialized medical domains. Through removing inappropriate regularization constraints and optimizing training strategies, it achieves significant performance improvements. This provides valuable reference for foundation model development in other medical imaging analysis tasks.
Environmental Impact Statement: The paper reports experimental carbon emissions of 4.15 kg CO2eq, reflecting attention to environmental responsibility.