MIP-Based Tumor Segmentation: A Radiologist-Inspired Approach

Zarik, Kiryati, Green et al.
PET/CT imaging is the gold standard for tumor detection, offering high accuracy in identifying local and metastatic lesions. Radiologists often begin assessment with rotational Multi-Angle Maximum Intensity Projections (MIPs) from PET, confirming findings with volumetric slices. This workflow is time-consuming, especially in metastatic cases. Despite their clinical utility, MIPs are underutilized in automated tumor segmentation, where 3D volumetric data remains the norm. We propose an alternative approach that trains segmentation models directly on MIPs, bypassing the need to segment 3D volumes and then project. This better aligns the model with its target domain and yields substantial gains in computational efficiency and training time. We also introduce a novel occlusion correction method that restores MIP annotations occluded by high-intensity structures, improving segmentation. Using the autoPET 2022 Grand Challenge dataset, we evaluate our method against standard 3D pipelines in terms of performance and training/computation efficiency for segmentation and classification, and analyze how MIP count affects segmentation. Our MIP-based approach achieves segmentation performance on par with 3D (<=1% Dice difference, 26.7% better Hausdorff Distance), while reducing training time (convergence time) by 55.8-75.8%, energy per epoch by 71.7-76%, and TFLOPs by two orders of magnitude, highlighting its scalability for clinical use. For classification, using 16 MIPs only as input, we surpass 3D performance while reducing training time by over 10x and energy consumption per epoch by 93.35%. Our analysis of the impact of MIP count on segmentation identified 48 views as optimal, offering the best trade-off between performance and efficiency.

Basic Information

  • Paper ID: 2510.09326
  • Title: MIP-Based Tumor Segmentation: A Radiologist-Inspired Approach
  • Authors: Romario Zarik, Nahum Kiryati, Michael Green, Liran Domachevsky, Arnaldo Mayer
  • Classification: eess.IV (Electrical Engineering and Systems Science - Image and Video Processing)
  • Publication Date: October 10, 2025
  • Paper Link: https://arxiv.org/abs/2510.09326v1

Abstract

This paper proposes a tumor segmentation method based on multi-angle Maximum Intensity Projections (MIPs), training segmentation models directly on MIPs rather than employing the conventional approach of 3D volume segmentation followed by projection. Using the autoPET 2022 dataset, the method maintains comparable performance to 3D approaches (Dice difference ≤1%, Hausdorff distance improvement of 26.7%) while achieving significant computational efficiency gains: training time reduction of 55.8-75.8%, per-epoch energy consumption reduction of 71.7-76%, and computational complexity reduction by two orders of magnitude. For classification tasks, using only 16 MIPs surpasses 3D performance with over 10-fold reduction in training time.

Research Background and Motivation

Problem Definition

PET/CT imaging is the gold standard for tumor detection. In clinical practice, radiologists typically begin by examining rotational multi-angle Maximum Intensity Projections (MIPs) of the PET data, then confirm findings on volumetric slices. This workflow is particularly time-consuming in metastatic cases.

Research Motivation

  1. Mismatch between clinical practice and algorithms: Despite widespread MIP usage in clinical settings, automated tumor segmentation still primarily relies on 3D volumetric data
  2. Computational efficiency requirements: Traditional 3D segmentation methods have high computational complexity and lengthy training times, hindering clinical deployment
  3. Resource constraints: Processing large-scale 3D data on standard hardware is challenging
  4. Domain alignment: Direct training on MIPs better aligns with radiologists' diagnostic thinking

Limitations of Existing Methods

  • High computational overhead in conventional 3D segmentation followed by MIP projection
  • Existing MIP applications focus mainly on detection and classification, with few addressing segmentation
  • Lack of effective solutions for MIP occlusion problems
  • Insufficient exploitation of MIP computational efficiency advantages

Core Contributions

  1. Direct MIP segmentation method: Proposes training segmentation models directly on MIPs, avoiding the complex 3D segmentation-then-projection workflow
  2. Occlusion correction technique: Introduces a novel MIP annotation occlusion correction method to address high-intensity structure occlusion
  3. Significant efficiency improvements: Achieves substantial reductions in training time, energy consumption, and computational load while maintaining comparable performance
  4. Optimal MIP quantity analysis: Systematically analyzes the impact of MIP quantity on segmentation performance, identifying 48 viewpoints as the optimal configuration

Methodology Details

Task Definition

  • Input: 3D PET scan data
  • Output: Tumor segmentation results
  • Objective: Perform semantic segmentation directly on multi-angle MIPs, avoiding 3D volume processing

MIP Generation Method

MIP images are generated using the following formula:

$$F_k(i,j) = \max_d f_k(i,j,d)$$

Where:

  • $F_k(i,j)$: Value of the k-th MIP image at pixel $(i,j)$
  • $f_k(i,j,d)$: 3D data rotated by angle $k\Delta\Theta$ around the vertical axis, with $d$ indexing depth along the projection direction
  • Angular step size: $\Delta\Theta(N) = \frac{180^\circ}{N}$, where $N$ is the number of MIPs
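
The projection formula can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the axis order `[z, y, x]` (with `z` vertical and `y` serving as the depth index d), the nearest-neighbour rotation, and the function names `rotate_about_vertical` and `generate_mips` are all assumptions made for this example.

```python
import numpy as np

def rotate_about_vertical(vol, angle_deg):
    """Nearest-neighbour rotation of vol[z, y, x] in the (y, x) plane."""
    _, ny, nx = vol.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (ny - 1) / 2.0, (nx - 1) / 2.0
    yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    # Inverse mapping: for every output pixel, locate its source coordinates.
    ys = np.cos(theta) * (yy - cy) + np.sin(theta) * (xx - cx) + cy
    xs = -np.sin(theta) * (yy - cy) + np.cos(theta) * (xx - cx) + cx
    yi, xi = np.rint(ys).astype(int), np.rint(xs).astype(int)
    valid = (yi >= 0) & (yi < ny) & (xi >= 0) & (xi < nx)
    out = np.zeros_like(vol)
    out[:, valid] = vol[:, yi[valid], xi[valid]]  # pixels mapped from outside stay 0
    return out

def generate_mips(vol, n_views):
    """Return an array [n_views, z, x] with F_k = max over depth of the rotated volume."""
    step = 180.0 / n_views  # angular step size in degrees
    return np.stack(
        [rotate_about_vertical(vol, k * step).max(axis=1) for k in range(n_views)]
    )
```

Calling `generate_mips(volume, 48)` on a volume of shape `[z, y, x]` would return 48 projections spaced 3.75° apart. A production pipeline would more likely use an interpolating rotation from an imaging library; nearest-neighbour sampling is used here only to keep the sketch dependency-free.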

Occlusion Correction Algorithm

To address the problem of high-intensity organs (such as brain, heart, kidneys) occluding tumor annotations, a three-step processing pipeline is designed:

  1. Occlusion detection: For each annotated tumor region in a MIP, check whether at least 75% of its pixels actually originate from tumor voxels in the volumetric PET data
  2. Annotation segmentation: For annotated regions where this tumor-pixel ratio falls below 75%, retain only the pixels confirmed to originate from tumor voxels
  3. Low-contrast filtering: Remove remaining tumor fragments whose contrast is too low to be perceptible to the human eye
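
The three steps can be sketched roughly as below. This is one possible reading of the description, not the paper's code. Assumptions: the PET volume and tumor mask are already rotated to the viewing angle and indexed `[z, depth, x]`; annotated regions are 4-connected components of the projected mask; and the contrast criterion (remnant mean intensity relative to the mean of the whole MIP, compared against a hypothetical `min_contrast` parameter) is invented for illustration, since the summary does not state the actual criterion.

```python
import numpy as np

def label_components(mask):
    """4-connected labelling by flood fill (stand-in for a library routine)."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        stack = [seed]
        while stack:
            r, c = stack.pop()
            for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= rr < mask.shape[0] and 0 <= cc < mask.shape[1]
                if inside and mask[rr, cc] and not labels[rr, cc]:
                    labels[rr, cc] = count
                    stack.append((rr, cc))
    return labels, count

def correct_occlusions(pet, tumor, min_ratio=0.75, min_contrast=0.0):
    """Return a corrected 2D annotation for one view of pet/tumor [z, depth, x]."""
    depth = pet.argmax(axis=1)[:, None, :]       # voxel that produced each MIP pixel
    mip = np.take_along_axis(pet, depth, axis=1)[:, 0, :]
    visible = np.take_along_axis(tumor, depth, axis=1)[:, 0, :].astype(bool)
    projected = tumor.astype(bool).any(axis=1)   # uncorrected projected annotation
    labels, count = label_components(projected)
    corrected = np.zeros_like(projected)
    for lab in range(1, count + 1):
        region = labels == lab
        # Step 1: share of the region's pixels that truly come from tumor voxels.
        if visible[region].mean() >= min_ratio:
            corrected |= region
            continue
        # Step 2: keep only the tumor-confirmed pixels of an occluded region.
        remnant = region & visible
        # Step 3 (hypothetical criterion): drop remnants that are too faint.
        if remnant.any() and mip[remnant].mean() / (mip.mean() + 1e-8) >= min_contrast:
            corrected |= remnant
    return corrected
```

With the default `min_contrast=0.0` the third step is effectively disabled, so the function only performs the 75% check and the pixel-level trimming.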

Model Architecture

  • Segmentation model: Attention U-Net, showing the best performance among various CNN architectures
  • 3D baseline: Swin-UNETR architecture, based on the 5th-place solution from the autoPET 2022 challenge
  • Classification model: CNN encoder + attention pooling + fully connected head
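
To make the "attention pooling" step of the classification model concrete, here is a small NumPy forward pass. It assumes a common tanh-based attention formulation over per-view feature vectors; the summary does not specify the exact form used in the paper, and the feature dimension, attention parameters `w` and `v`, and the number of output classes are hypothetical placeholders.

```python
import numpy as np

def attention_pool(view_features, w, v):
    """Pool per-view features [n_views, dim] into one vector [dim]."""
    scores = np.tanh(view_features @ w) @ v          # one scalar score per view
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over views
    return weights @ view_features, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(16, 32))  # stand-in for CNN encoder outputs of 16 MIPs
w = rng.normal(size=(32, 8))          # hypothetical attention parameters
v = rng.normal(size=8)
n_classes = 2                         # hypothetical number of output classes

pooled, weights = attention_pool(features, w, v)
logits = pooled @ rng.normal(size=(32, n_classes))  # stand-in for the fully connected head
```

The attention weights sum to one across views, so the pooled vector is a weighted average in which more informative projections can contribute more than others.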

Technical Innovations

  1. Domain-aligned design: Direct training on MIP views commonly used by radiologists, enhancing clinical relevance
  2. Computational efficiency optimization: 16 MIPs represent only approximately 4% of volumetric information, substantially reducing memory and computational requirements
  3. Occlusion problem resolution: First systematic solution to occlusion problems in MIP annotations
  4. End-to-end optimization: Eliminates the two-stage 3D segmentation-then-projection workflow
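
The 4% figure is consistent with a simple pixel count: N projections of size H×W set against a volume of size H×W×D give a ratio of N/D. The depth of 400 voxels used below is an assumption chosen for illustration, not a value stated in this summary.

```python
def mip_fraction(n_views, depth):
    """Ratio of pixels in n_views projections to voxels in an H x W x depth volume."""
    return n_views / depth

print(f"{mip_fraction(16, 400):.1%}")  # prints 4.0%
```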

Experimental Setup

Dataset

  • Data source: autoPET 2022 open-source dataset
  • Scale: 1,014 PET/CT scans from 900 patients
  • Disease types: Lung cancer, lymphoma, melanoma, healthy controls
  • Data distribution: Healthy (513), lymphoma (145), melanoma (188), lung cancer (168)

Data Partitioning

  • Independent test set: 15%
  • 5-fold cross-validation: 85%
  • Consistent class distribution maintained
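
A stratified partition of this kind might look like the sketch below. It is an assumption-laden illustration: it splits at the scan level using the class counts listed above, ignores any patient-level grouping the authors may have applied, and the function name, seed, and round-robin fold assignment are hypothetical.

```python
import random

CLASS_COUNTS = {"healthy": 513, "lymphoma": 145, "melanoma": 188, "lung_cancer": 168}

def stratified_partition(class_counts, test_frac=0.15, n_folds=5, seed=0):
    """Return (test, folds) as lists of (class_name, index) pairs."""
    rng = random.Random(seed)
    test, folds = [], [[] for _ in range(n_folds)]
    for name, count in class_counts.items():
        items = [(name, i) for i in range(count)]
        rng.shuffle(items)
        n_test = round(test_frac * count)       # per-class share of the test set
        test.extend(items[:n_test])
        for j, item in enumerate(items[n_test:]):
            folds[j % n_folds].append(item)     # round-robin keeps class ratios similar
    return test, folds

test_set, cv_folds = stratified_partition(CLASS_COUNTS)
```

Splitting each class separately is what keeps the class distribution roughly constant across the test set and every fold.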

Evaluation Metrics

Segmentation tasks:

  • Dice Score: Overlap measure
  • IoU: Intersection over Union
  • Hausdorff Distance: Boundary accuracy
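
For reference, the three segmentation metrics can be computed on binary masks as follows. This is a brute-force sketch that assumes both masks are non-empty and measures the Hausdorff distance in pixels over all foreground pixels; the paper's evaluation code may differ (for example by using surface points or a library implementation).

```python
import numpy as np

def dice_score(a, b):
    """Dice = 2|A ∩ B| / (|A| + |B|)."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * (a & b).sum() / (a.sum() + b.sum())

def iou_score(a, b):
    """IoU = |A ∩ B| / |A ∪ B|."""
    a, b = a.astype(bool), b.astype(bool)
    return (a & b).sum() / (a | b).sum()

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between the foreground pixel sets."""
    pa, pb = np.argwhere(a), np.argwhere(b)
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)  # all pairwise distances
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```

Dice and IoU reward overlap, whereas the Hausdorff distance is driven by the single worst boundary mismatch, which is why the two kinds of metric can move independently.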

Classification tasks:

  • Accuracy, Precision, Recall, F1-score
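
These four metrics follow directly from confusion-matrix counts. The sketch below assumes a binary task with label 1 as the positive class; the summary does not state whether the paper's classification task is binary or multi-class, so treat this as illustrative only.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1
```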

Efficiency metrics:

  • Convergence time (CT): Time to reach peak validation performance
  • Per-epoch training time (TPE) and energy consumption (EPE)
  • Computational complexity (TFLOPs)
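
As a rough sketch of how the first two efficiency metrics could be derived from training logs: convergence time is the cumulative time up to the best-validation epoch, and per-epoch energy is mean power multiplied by epoch duration. How the authors actually logged time and power is not described here, so the function names and input formats are assumptions.

```python
def convergence_time(val_scores, epoch_seconds):
    """Cumulative training time up to and including the best-validation epoch."""
    best_epoch = max(range(len(val_scores)), key=val_scores.__getitem__)
    return sum(epoch_seconds[: best_epoch + 1])

def energy_per_epoch_wh(power_samples_w, epoch_seconds):
    """Mean sampled power (W) times epoch duration, converted to watt-hours."""
    mean_power = sum(power_samples_w) / len(power_samples_w)
    return mean_power * epoch_seconds / 3600.0
```

For example, with made-up validation scores `[0.30, 0.50, 0.55, 0.54]` and epoch durations `[10, 10, 12, 11]` seconds, the peak is at the third epoch and the convergence time is 32 seconds.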

Comparison Methods

  • 3D Swin-UNETR segmentation followed by MIP projection
  • 3D classification models using the same CNN architecture

Experimental Results

Main Results

Segmentation Performance Comparison

| Method | Dice Score | IoU | Hausdorff Distance |
|---|---|---|---|
| 3D Projection | 0.597±0.05 | 0.471±0.04 | 139.614±8.42 |
| OR-MIPs | 0.578±0.01 | 0.452±0.01 | 102.813±9.61 |
| OC-MIPs | 0.591±0.01 | 0.466±0.01 | 102.26±9.53 |

Efficiency Improvements

| Metric | 3D Method | OC-MIPs | Improvement Factor |
|---|---|---|---|
| Training time (hours) | 54.64±19.22 | 13.18±4.1 | 4.1× |
| Per-epoch energy (Wh) | 142.2±79.1 | 34.194±4.7 | 4.2× |
| TFLOPs | 317.42±144.05 | 0.97±0.29 | 327× |
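
The improvement factors in the efficiency table are simply the ratio of the mean values, and the same means give the corresponding percentage reductions. The short check below uses only the means from the table; the dictionary keys and helper name are illustrative.

```python
# Mean values copied from the efficiency table: (3D method, OC-MIPs).
table = {
    "training time (h)": (54.64, 13.18),
    "energy per epoch (Wh)": (142.2, 34.194),
    "TFLOPs": (317.42, 0.97),
}

def improvement(baseline, proposed):
    """Return (improvement factor, percentage reduction)."""
    return baseline / proposed, 100.0 * (1.0 - proposed / baseline)

for name, (baseline, proposed) in table.items():
    factor, reduction = improvement(baseline, proposed)
    print(f"{name}: {factor:.1f}x, {reduction:.1f}% lower")
```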

Classification Results

| Metric | 3D Data | 16 MIPs | Improvement |
|---|---|---|---|
| Accuracy (%) | 72.8±3.2 | 80.5±1.7 | +7.7 pp |
| F1 Score (%) | 82.3±1.2 | 86.4±0.8 | +4.1 pp |
| Training time (hours) | 44.7±1.5 | 4.2±0.2 | 10.6× |

MIP Quantity Impact Analysis

Systematic analysis of the impact of 16, 32, 48, 64, and 80 MIPs:

  • Optimal configuration: 48 MIPs provide the highest and most stable Dice scores
  • Statistical significance: 16 and 32 MIPs show statistically significant differences in training sets
  • Efficiency balance: 48 MIPs achieve optimal balance between performance and computational efficiency
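
The trade-off being tuned here can be read off the angular step formula ΔΘ(N) = 180°/N: more views mean finer angular sampling but proportionally more input pixels. The snippet below tabulates both for the view counts studied; the "relative input size" column is a simple pixel-count ratio against 16 views and is not a measured cost.

```python
def angular_step(n_views):
    """Angular spacing in degrees between consecutive MIPs."""
    return 180.0 / n_views

for n_views in (16, 32, 48, 64, 80):
    relative_size = n_views / 16  # pixel count relative to the 16-view setting
    print(f"{n_views} MIPs: step {angular_step(n_views):.4g} deg, "
          f"{relative_size:.2g}x the input of 16 MIPs")
```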

Key Findings

  1. Performance equivalence: Wilcoxon signed-rank test shows no statistically significant difference between MIP and 3D methods (p=0.22)
  2. Boundary accuracy: MIP method demonstrates superior performance in Hausdorff distance, with 26.7% improvement
  3. Occlusion correction effectiveness: Only 0.57% of tumors are completely excluded, maintaining annotation integrity
  4. Scalability: Computational complexity reduced by two orders of magnitude, significantly improving clinical applicability

Related Work

MIP Applications in Medical Imaging

  • Detection tasks: Kawakami et al. used YOLOv2 to detect physiological uptake on multi-directional MIPs
  • Classification applications: Takahashi et al. employed Xception models to improve breast cancer classification
  • Feature extraction: Toosi et al. extracted features from 72 MIPs for survival prediction

2D Projection Method Development

  • Enhanced 3D segmentation: Constantino et al. demonstrated that MIPs enhance 3D PET/CT segmentation
  • Volume reconstruction: Toosi et al. reconstructed volumetric segmentation from 2D MIPs
  • 2.75D methods: Wang et al. combined multiple 2D views to enrich 3D learning

Advantages of This Work over Prior Approaches

  • First systematic direct MIP segmentation method
  • Innovative techniques for addressing MIP occlusion problems
  • Comprehensive efficiency and performance evaluation
  • Clinical workflow-aligned design

Conclusions and Discussion

Main Conclusions

  1. Performance equivalence: Direct MIP segmentation maintains comparable performance to 3D methods while achieving significant computational efficiency gains
  2. Optimal configuration: 48 MIP viewpoints represent the optimal balance between performance and efficiency
  3. Clinical applicability: Substantially reduced computational requirements make the method more suitable for resource-constrained clinical environments
  4. Method generalizability: Demonstrates advantages in both segmentation and classification tasks

Limitations

  1. Single dataset: Validation only on autoPET 2022 dataset; broader validation needed
  2. PET-specific: Current method primarily targets PET data; CT integration remains unexplored
  3. 3D information loss: Projection process inevitably loses some 3D spatial information
  4. Occlusion handling: While improved, complex occlusion scenarios may still impact performance

Future Directions

  1. Multi-modal integration: Map CT information to MIPs for joint PET/CT analysis
  2. 3D reconstruction: Explore methods for reconstructing 3D annotations from MIP segmentation results
  3. Extended validation: Validate the method on more datasets and disease types
  4. Real-time applications: Develop real-time MIP segmentation systems to support clinical decision-making

In-Depth Evaluation

Strengths

  1. Strong innovation: First systematic direct MIP segmentation method, highly aligned with clinical practice
  2. High practical value: Significant efficiency improvements provide strong clinical application potential
  3. Comprehensive technical approach: Complete technical solutions from occlusion correction to optimal parameter analysis
  4. Thorough validation: Comprehensive evaluation on both segmentation and classification tasks
  5. Good reproducibility: Code and tools are publicly available

Weaknesses

  1. Insufficient theoretical analysis: Lacks in-depth theoretical analysis of why MIP methods achieve comparable performance
  2. Dataset limitations: Single dataset may limit the generalizability of conclusions
  3. Missing clinical validation: Lacks validation studies in actual clinical environments
  4. Limited comparison methods: Primarily compares with basic 3D methods; lacks comparison with latest SOTA methods

Impact

  1. Academic contribution: Provides a new efficient paradigm for medical image segmentation
  2. Clinical value: Likely to significantly improve the efficiency of automated PET scan analysis
  3. Technology promotion: Method is extensible to other medical image projection analysis tasks
  4. Resource optimization: Provides feasible solutions for resource-constrained environments

Applicable Scenarios

  1. Clinical screening: Rapid preliminary analysis in large-scale tumor screening
  2. Resource-constrained environments: Medical institutions with limited computational resources
  3. Real-time applications: Clinical decision support systems requiring rapid response
  4. Mobile healthcare: Medical image analysis on portable devices

References

This paper cites 34 relevant references, primarily including:

  • Medical image processing frameworks (MONAI, PyTorch)
  • PET/CT imaging technology fundamentals
  • Deep learning segmentation and classification methods
  • MIP applications in medical imaging
  • Related evaluation metrics and datasets

Overall Assessment: This is a high-quality medical image processing paper that proposes an innovative and practical direct MIP segmentation method. While maintaining academic rigor, the paper emphasizes clinical practicality and offers a new, efficient solution for medical imaging AI applications. Despite some limitations, its significant efficiency improvements and strong performance give it considerable academic and applied value.