The increasing number of Health Care facilities in Nepal has added up the challenges on managing health care waste (HCW). Improper segregation and disposal of HCW leads to contamination, spreading of infectious diseases and risk for waste handlers. This study benchmarks the state of the art waste classification models: ResNeXt-50, EfficientNet-B0, MobileNetV3-S, YOLOv8-n and YOLOv5-s using stratified 5-fold cross-validation technique on combined HCW data. YOLOv5-s achieved the highest accuracy (95.06%) but fell short with the YOLOv8-n model in inference speed with few milliseconds. The EfficientNet-B0 showed promising results of 93.22% accuracy but took the highest inference time. Following a repetitive ANOVA test to confirm the statistical significance, the best performing model (YOLOv5-s) was deployed to the web with bin color mapped using Nepal's HCW management standards. Further work is suggested to address data limitation and ensure localized context.
- Paper ID: 2508.07450
- Title: Health Care Waste Classification Using Deep Learning Aligned with Nepal's Bin Color Guidelines
- Authors: Suman Kunwar (DWaste, USA), Prabesh Rai (Lambton College, Canada)
- Category: cs.CV (Computer Vision)
- Publication Date: October 15, 2025 (arXiv)
- Paper Link: https://arxiv.org/abs/2508.07450
With the increasing number of healthcare facilities in Nepal, healthcare waste (HCW) management faces significant challenges. Improper segregation and disposal lead to environmental contamination, disease transmission, and occupational hazards for waste handlers. This study benchmarks five state-of-the-art waste classification models on a combined HCW dataset using stratified 5-fold cross-validation: ResNeXt-50, EfficientNet-B0, MobileNetV3-S, YOLOv8-n, and YOLOv5-s. YOLOv5-s achieves the highest accuracy (95.06%), though its inference is a few milliseconds slower than that of YOLOv8-n. EfficientNet-B0 reaches 93.22% accuracy but has the longest inference time. After confirming statistical significance through repeated ANOVA testing, the best-performing model (YOLOv5-s) is deployed to the web, with bin colors mapped according to Nepal's HCW management standards.
- Problem Statement: Nepal has 16,611 healthcare facilities facing severe challenges in medical waste management. Traditional manual segregation methods are labor-intensive, error-prone, and pose health risks to waste handlers.
- Problem Significance: Improper medical waste segregation and disposal results in:
- Environmental pollution
- Disease transmission
- Health risks to waste handlers
- Potential hazards to residents near hospitals
- Limitations of Existing Approaches:
- Small dataset scales
- Poor image quality
- Controlled environment testing
- Scalability and infrastructure feasibility concerns
- Difficulty integrating with existing waste management systems
- Research Motivation: Nepal follows national medical waste management standards and operational procedures, classifying waste into general and hazardous medical waste using a color-coding system. This research aims to develop an AI-driven automated waste classification solution compliant with Nepal's standards.
- Multi-Model Benchmarking: First systematic comparison of five state-of-the-art deep learning models on medical waste classification tasks
- Localization Application: Aligns classification results with Nepal's medical waste management color-coding standards
- Comprehensive Dataset: Integrates two datasets covering 23 waste categories
- Practical Deployment: Deploys the best model to Hugging Face platform for public use
- Statistical Validation: Uses repeated ANOVA testing to confirm statistical significance of model performance
Input: RGB images of medical waste (1920×1080 resolution)
Output: Waste classification into 23 categories mapped to corresponding color-coded bins
Constraints: Must comply with Nepal's national medical waste management color-coding system
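
The input/output contract above amounts to a lookup from a predicted class to a bin color. A minimal sketch follows, assuming the mapping is supplied as a plain dictionary; `assign_bin`, `UNMAPPED`, and the entries in `example_mapping` are hypothetical names and placeholder values, not the color codes from Nepal's guidelines.

```python
from typing import Mapping

UNMAPPED = "unmapped"  # hypothetical sentinel for classes without a bin entry


def assign_bin(predicted_class: str, class_to_bin: Mapping[str, str]) -> str:
    """Return the bin color for a predicted class, or UNMAPPED if none is configured."""
    # Normalize so that "Gauze " and "gauze" resolve to the same entry.
    key = predicted_class.strip().lower()
    return class_to_bin.get(key, UNMAPPED)


# Placeholder table for illustration only: the real class-to-color entries
# must be taken from Nepal's HCW management guidelines.
example_mapping = {
    "gauze": "color_a",
    "syringe needle": "color_b",
}

print(assign_bin("Gauze", example_mapping))         # color_a
print(assign_bin("unknown item", example_mapping))  # unmapped
```

Returning an explicit sentinel for unknown classes, rather than a default bin, keeps unmapped predictions visible instead of silently routing them to the wrong container.
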
The study evaluates five different types of deep learning models:
- ResNeXt-50: Residual network variant using grouped convolutions
- EfficientNet-B0: Efficient CNN architecture balancing accuracy and computational efficiency
- MobileNetV3-S: Lightweight network suitable for mobile devices
- YOLOv8-n: Nano variant of YOLOv8, a more recent YOLO release than YOLOv5
- YOLOv5-s: Small variant of the older, mature YOLOv5
Training Strategy:
- Traditional CNN models (ResNeXt-50, EfficientNet-B0, MobileNetV3-S): Use ImageNet pre-trained weights, freeze base layers, add custom classification heads
- YOLO models: Train from scratch
- Stratified K-Fold Cross-Validation: Employs 5-fold stratified cross-validation ensuring consistent label proportions across folds
- Data Balancing:
- Downsamples over-represented classes to the median class count
- Applies data augmentation (flipping, brightness-contrast adjustment) to under-represented classes
- Localization Mapping: Directly maps classification results to Nepal's standard color-coded bins
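
The stratified 5-fold scheme described above can be sketched with the standard library alone. This is an illustration of the general technique, not the authors' code; `stratified_folds`, `train_val_splits`, and the toy labels are assumptions introduced here.

```python
import random
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def stratified_folds(labels: Sequence[str], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split sample indices into k folds with near-equal label proportions."""
    rng = random.Random(seed)
    by_label: Dict[str, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal each class round-robin; carrying the offset keeps fold sizes even.
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)
    return folds


def train_val_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, val) index lists; each fold serves as validation once."""
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


toy_labels = ["glove"] * 10 + ["gauze"] * 5  # toy data, not the HCW dataset
toy_folds = stratified_folds(toy_labels, k=5)
for train, val in train_val_splits(toy_folds):
    print(len(train), len(val))  # 12 3 on every iteration (an 80/20 split)
```

With five folds, each validation fold holds one fifth of the data, which is where the 80% training and 20% validation proportions come from.
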
Combined Dataset comprises:
- Medical Waste Dataset 4.0:
- Source: Tuscany region, Italy, collected using OAK 4.0 camera equipment
- Categories: Gauze, glove pairs, single gloves, medical caps, medical glasses, shoe covers, etc.
- Pharmaceutical and Biomedical Waste Dataset:
- Source: Collected by Engineering UBU
- Categories: Body tissues, organic waste, equipment packaging, syringe needles, etc.
Data Preprocessing:
- Removes duplicate glove categories to reduce bias
- Handles class imbalance by balancing classes toward the median class count
- Applies data augmentation techniques
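
One way to read the median-based balancing step is sketched below: classes above the median count are randomly downsampled to it, and classes below it are assigned a number of augmented copies to generate. The function name `plan_balancing`, the truncation of a fractional median, and the toy labels are assumptions; the paper's exact procedure is not given in this summary.

```python
import random
from collections import Counter
from statistics import median
from typing import Dict, List, Sequence, Tuple


def plan_balancing(labels: Sequence[str], seed: int = 0) -> Tuple[List[int], Dict[str, int]]:
    """Return (indices to keep, augmented copies needed per class)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    # Assumption: a fractional median (even number of classes) is truncated.
    target = int(median(counts.values()))

    keep: List[int] = []
    augment: Dict[str, int] = {}
    for cls, n in counts.items():
        indices = [i for i, lab in enumerate(labels) if lab == cls]
        if n > target:
            # Over-represented class: keep a random subset of size `target`.
            indices = rng.sample(indices, target)
        elif n < target:
            # Under-represented class: record how many augmented images to add.
            augment[cls] = target - n
        keep.extend(indices)
    return sorted(keep), augment


toy_labels = ["glove"] * 9 + ["gauze"] * 5 + ["cap"] * 2  # toy data
keep, augment = plan_balancing(toy_labels)
print(len(keep), augment)  # 12 {'cap': 3}
```

The augmented copies themselves would come from the flips and brightness-contrast adjustments listed above, applied with an image library of choice.
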
- Accuracy
- Precision
- Recall
- F1-Score
- Inference Time
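
The four classification metrics can be computed as follows. The sketch assumes support-weighted averaging across classes, which this summary does not state explicitly; the assumption is consistent with the results table, where recall equals accuracy in every row, a property that weighted-average recall always has. `weighted_metrics` and the toy labels are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def weighted_metrics(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predictions per class
    tp = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(tp.values()) / n
    precision = recall = f1 = 0.0
    for cls, n_cls in support.items():
        p = tp[cls] / predicted[cls] if predicted[cls] else 0.0
        r = tp[cls] / n_cls
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = n_cls / n         # class share of the true labels
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


toy_true = ["a", "a", "a", "b", "b", "c"]
toy_pred = ["a", "a", "b", "b", "b", "c"]
metrics = weighted_metrics(toy_true, toy_pred)
print(metrics)
```

On the toy labels, accuracy and weighted recall both come out to 5/6, illustrating why the two columns coincide under this averaging.
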
The five models are compared against one another: ResNeXt-50, EfficientNet-B0, MobileNetV3-S, YOLOv8-n, and YOLOv5-s.
- Hardware: Two NVIDIA Tesla T4 GPUs
- Training Epochs: 30 epochs
- Cross-Validation: 5-fold stratified cross-validation with 80% training and 20% validation
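
How the inference times were measured is not described in this summary. A generic way to obtain a per-image latency in milliseconds is sketched below, with `mean_latency_ms` and `dummy_predict` as hypothetical stand-ins for a real model call.

```python
import time
from statistics import mean
from typing import Callable, Sequence


def mean_latency_ms(predict: Callable, inputs: Sequence, warmup: int = 3) -> float:
    """Average wall-clock time per call to `predict`, in milliseconds."""
    # Warm-up calls are excluded so one-off setup costs do not skew the mean.
    for x in inputs[:warmup]:
        predict(x)
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def dummy_predict(x):
    """Stand-in for a model forward pass (assumption for illustration)."""
    return sum(x)


latency = mean_latency_ms(dummy_predict, [[1, 2, 3]] * 20)
print(f"{latency:.4f} ms per input")
```

On a GPU, execution is typically asynchronous, so a real measurement would need to synchronize the device before reading the clock.
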
| Model | Accuracy | Precision | Recall | F1-Score | Inference Time (ms) |
|---|---|---|---|---|---|
| YOLOv5-s | 95.06% | 96.65% | 95.06% | 94.87% | 10.97 |
| YOLOv8-n | 94.68% | 96.44% | 94.68% | 94.57% | 9.29 |
| EfficientNet-B0 | 93.22% | 94.81% | 93.22% | 93.04% | 444.67 |
| MobileNetV3-S | 91.05% | 92.90% | 91.05% | 90.95% | 369.24 |
| ResNeXt-50 | 74.51% | 76.53% | 74.51% | 74.48% | 395.74 |
- YOLOv5-s Achieves Best Performance: Obtains highest scores across accuracy, precision, recall, and F1-score
- Inference Speed Advantage: The YOLO models (v5-s and v8-n) run in roughly 9-11 ms per image, compared with about 369-445 ms for the other three models
- Practical Trade-offs: YOLOv8-n is slightly faster than YOLOv5-s (9.29 ms vs. 10.97 ms) but marginally less accurate (94.68% vs. 95.06%)
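
The accuracy-versus-latency trade-off can be made explicit by checking which models are Pareto-optimal, that is, not beaten on both accuracy and inference time by another model. The sketch below uses the figures from the results table; `pareto_front` is a hypothetical helper name.

```python
from typing import Dict, List, Tuple

# (accuracy in %, inference time in ms), copied from the results table.
results: Dict[str, Tuple[float, float]] = {
    "YOLOv5-s": (95.06, 10.97),
    "YOLOv8-n": (94.68, 9.29),
    "EfficientNet-B0": (93.22, 444.67),
    "MobileNetV3-S": (91.05, 369.24),
    "ResNeXt-50": (74.51, 395.74),
}


def pareto_front(table: Dict[str, Tuple[float, float]]) -> List[str]:
    """Models for which no other model is at least as good on both axes and better on one."""
    front = []
    for name, (acc, ms) in table.items():
        dominated = any(
            o_acc >= acc and o_ms <= ms and (o_acc > acc or o_ms < ms)
            for other, (o_acc, o_ms) in table.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


print(pareto_front(results))  # ['YOLOv5-s', 'YOLOv8-n']

# Ratio of EfficientNet-B0 latency to YOLOv5-s latency.
speedup = results["EfficientNet-B0"][1] / results["YOLOv5-s"][1]
print(f"{speedup:.1f}x")
```

Only the two YOLO models survive the check; the choice between them depends on whether the extra 0.38 percentage points of accuracy or the 1.68 ms of latency matters more for the deployment.
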
Repeated ANOVA test results demonstrate:
- Models have highly significant effects on performance metrics
- Significant differences exist between evaluation metrics
- Highly significant interaction effects between models and metrics
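
For intuition, the F statistic of a one-way repeated-measures ANOVA can be computed as below, treating cross-validation folds as subjects and models as conditions. This is a simplification under two assumptions: that "repeated ANOVA" refers to a repeated-measures design, and that one factor suffices for illustration (the analysis summarized above also includes metrics and their interaction). The scores are toy numbers, not the paper's per-fold results.

```python
from typing import List, Sequence, Tuple


def repeated_measures_f(scores: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """scores[i][j] is the score of condition j (model) on subject i (fold).

    Returns (F statistic, df for conditions, df for error).
    """
    n = len(scores)        # subjects (folds)
    k = len(scores[0])     # conditions (models)
    grand = sum(map(sum, scores)) / (n * k)
    cond_means: List[float] = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means: List[float] = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Removing between-subject variance is what distinguishes the
    # repeated-measures design from an ordinary one-way ANOVA.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


toy_scores = [[1, 2, 3], [2, 2, 5], [3, 5, 4]]  # toy numbers for illustration
print(repeated_measures_f(toy_scores))  # (3.0, 2, 4)
```

The p-value would then be read from the F distribution with the returned degrees of freedom, for which a statistics library is normally used.
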
| Study | Categories | Best Model | Accuracy |
|---|---|---|---|
| Bruno et al. | 7 classes | EfficientNet-B0 | 99.45% |
| This Study | 23 classes | YOLOv5-s | 95.06% |
While Bruno et al. achieved 99.45% accuracy on a 7-class task, this study achieves 95.06% accuracy on the more challenging 23-class task. Because the class sets differ, the two figures are not directly comparable.
- Deep Learning Applications in Medical Waste Classification: Application of models such as ResNeXt-50 and EfficientNet
- IoT and AI Integration for Automated Sorting: Integration of YOLO models with IoT devices
- Real-Time Deployment and Edge Computing: Practical applications in healthcare environments
- More Comprehensive Category Coverage: 23 categories versus 6-8 in previous studies
- Localization Standards Alignment: Complies with Nepal's national standards
- Practical Deployment: Provides accessible web application
- YOLOv5-s is the Optimal Choice: Demonstrates superior accuracy and comprehensive performance
- YOLO Models Suitable for Real-Time Applications: Fast inference speed enables practical deployment
- Deep Learning Effectively Addresses Medical Waste Classification: Provides a viable AI solution for Nepal's medical waste management
- Dataset Constraints:
- Missing certain categories: cytotoxic, radioactive, pathological, chemical, and liquid waste
- Data biased toward common items (gloves, gauze)
- Data collected from non-Nepali environments
- Practical Application Challenges:
- Real-world waste may be occluded, mixed, or haphazardly packaged
- Model performance may degrade in complex real-world scenarios
- Dataset Expansion: Collect more representative local data
- Missing Category Inclusion: Add all waste categories specified in Nepal's standards
- Real-World Environment Testing: Validate model performance in actual healthcare settings
- System Integration: Integrate with existing waste management systems
- High Practical Value: Addresses actual medical waste management challenges in Nepal
- Rigorous Methodology: Employs stratified cross-validation and statistical significance testing
- Comprehensive Model Comparison: Covers diverse state-of-the-art model types
- Practical Deployment: Provides usable web application, enhancing research utility
- Localization Considerations: Aligns with local standards, offering practical applicability
- Insufficient Dataset Representativeness: Lacks localized data, potentially affecting real-world performance
- Incomplete Category Coverage: Does not include all waste categories in Nepal's standards
- Lack of Real-World Environment Validation: Primarily tested in controlled environments
- Limited Technical Innovation: Mainly applies and compares existing models without methodological novelty
- Field Contribution: Provides an AI solution model for medical waste management in developing countries
- Practical Value: Directly applicable to Nepali healthcare institutions
- Reproducibility: Public dataset and code facilitate reproduction and extension
- Healthcare Institutions: Waste segregation in hospitals and clinics
- Waste Processing Centers: Large-scale medical waste processing
- Regulatory Agencies: Waste management compliance inspection
- Other Developing Countries: Similar medical waste management challenges
The paper cites 16 relevant references covering deep learning applications in medical waste classification, IoT applications, and Nepal's medical waste management status, providing solid theoretical foundation and practical reference for this research.
Overall Assessment: This is an applied research paper with strong practical value. While relatively limited in technical innovation, its focus on real-world problems, rigorous experimental design, and deployment efforts provide significant social value and application prospects.