
MIP-Based Tumor Segmentation: A Radiologist-Inspired Approach

Zarik, Kiryati, Green et al.

PET/CT imaging is the gold standard for tumor detection, offering high accuracy in identifying local and metastatic lesions. Radiologists often begin assessment with rotational Multi-Angle Maximum Intensity Projections (MIPs) from PET, confirming findings with volumetric slices. This workflow is time-consuming, especially in metastatic cases. Despite their clinical utility, MIPs are underutilized in automated tumor segmentation, where 3D volumetric data remains the norm. We propose an alternative approach that trains segmentation models directly on MIPs, bypassing the need to segment 3D volumes and then project. This better aligns the model with its target domain and yields substantial gains in computational efficiency and training time. We also introduce a novel occlusion correction method that restores MIP annotations occluded by high-intensity structures, improving segmentation. Using the autoPET 2022 Grand Challenge dataset, we evaluate our method against standard 3D pipelines in terms of performance and training/computation efficiency for segmentation and classification, and analyze how MIP count affects segmentation. Our MIP-based approach achieves segmentation performance on par with 3D (<=1% Dice difference, 26.7% better Hausdorff Distance), while reducing training time (convergence time) by 55.8-75.8%, energy per epoch by 71.7-76%, and TFLOPs by two orders of magnitude, highlighting its scalability for clinical use. For classification, using 16 MIPs only as input, we surpass 3D performance while reducing training time by over 10x and energy consumption per epoch by 93.35%. Our analysis of the impact of MIP count on segmentation identified 48 views as optimal, offering the best trade-off between performance and efficiency.

Basic Information

Paper ID: 2510.09326
Title: MIP-Based Tumor Segmentation: A Radiologist-Inspired Approach
Authors: Romario Zarik, Nahum Kiryati, Michael Green, Liran Domachevsky, Arnaldo Mayer
Classification: eess.IV (Electrical Engineering and Systems Science - Image and Video Processing)
Publication Date: October 10, 2025
Paper Link: https://arxiv.org/abs/2510.09326v1

Abstract

This paper proposes a tumor segmentation method based on multi-angle Maximum Intensity Projections (MIPs). Segmentation models are trained directly on MIPs rather than following the conventional approach of 3D volume segmentation followed by projection. On the autoPET 2022 dataset, the method performs comparably to 3D approaches (Dice difference ≤1%, Hausdorff distance improved by 26.7%) while being far cheaper to train: training time falls by 55.8-75.8%, per-epoch energy consumption by 71.7-76%, and computational cost (TFLOPs) by two orders of magnitude. For classification, using only 16 MIPs surpasses 3D performance with an over 10-fold reduction in training time.

Research Background and Motivation

Problem Definition

PET/CT imaging is the gold standard for tumor detection. In clinical practice, radiologists typically first examine rotated multi-angle Maximum Intensity Projections (MIPs) to assess cases, then confirm findings through volumetric slices. This workflow is particularly time-consuming in metastatic cases.

Research Motivation

Mismatch between clinical practice and algorithms: Despite widespread MIP usage in clinical settings, automated tumor segmentation still primarily relies on 3D volumetric data
Computational efficiency requirements: Traditional 3D segmentation methods have high computational complexity and lengthy training times, hindering clinical deployment
Resource constraints: Processing large-scale 3D data on standard hardware is challenging
Domain alignment: Direct training on MIPs better aligns with radiologists' diagnostic thinking

Limitations of Existing Methods

High computational overhead in conventional 3D segmentation followed by MIP projection
Existing MIP applications primarily limited to detection and classification, with limited segmentation applications
Lack of effective solutions for MIP occlusion problems
Insufficient exploitation of MIP computational efficiency advantages

Core Contributions

Direct MIP segmentation method: Proposes training segmentation models directly on MIPs, avoiding the complex 3D segmentation-then-projection workflow
Occlusion correction technique: Introduces a novel MIP annotation occlusion correction method to address high-intensity structure occlusion
Significant efficiency improvements: Achieves substantial reductions in training time, energy consumption, and computational load while maintaining comparable performance
Optimal MIP quantity analysis: Systematically analyzes the impact of MIP quantity on segmentation performance, identifying 48 viewpoints as the optimal configuration

Methodology Details

Task Definition

Input: 3D PET scan data
Output: Tumor segmentation results
Objective: Perform semantic segmentation directly on multi-angle MIPs, avoiding 3D volume processing

MIP Generation Method

MIP images are generated using the following formula:

$F_k(i,j) = \max_d f_k(i,j,d)$

Where:

$F_k(i,j)$ : Value of the k-th MIP image at pixel (i,j)
$f_k(i,j,d)$ : 3D data rotated by angle $k\Delta\Theta$ around the vertical axis
Angular step size: $\Delta\Theta(N) = \frac{180^\circ}{N}$, where $N$ is the number of MIPs
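
The formula above can be sketched in a few lines of pure Python. This is a minimal illustration, not the authors' implementation: the function names (`angular_step_deg`, `mip_at_angle`, `generate_mips`), the `volume[z][y][x]` layout, and the nearest-neighbour sampling are all assumptions made here. A practical pipeline would more likely rely on an array library with proper interpolation.

```python
import math

def angular_step_deg(n_mips):
    """Angular step between consecutive MIPs: 180 degrees / N."""
    return 180.0 / n_mips

def mip_at_angle(volume, angle_deg):
    """Rotate a volume about its vertical axis and take the max along depth.

    volume is indexed as volume[z][y][x], with z the vertical (rotation) axis.
    Returns a 2D image indexed as image[z][i]. Nearest-neighbour sampling is
    an assumption of this sketch. Intensities are assumed non-negative, so
    rays that miss the grid contribute 0.
    """
    n_z, n_y, n_x = len(volume), len(volume[0]), len(volume[0][0])
    c_y, c_x = (n_y - 1) / 2.0, (n_x - 1) / 2.0
    cos_t = math.cos(math.radians(angle_deg))
    sin_t = math.sin(math.radians(angle_deg))
    image = [[0.0] * n_x for _ in range(n_z)]
    for z in range(n_z):
        for i in range(n_x):        # in-plane pixel coordinate of the MIP
            best = 0.0
            for d in range(n_y):    # depth index along the projection ray
                # Map the rotated-frame sample (i, d) back to the original grid.
                u, v = i - c_x, d - c_y
                x = int(round(cos_t * u - sin_t * v + c_x))
                y = int(round(sin_t * u + cos_t * v + c_y))
                if 0 <= x < n_x and 0 <= y < n_y:
                    best = max(best, volume[z][y][x])
            image[z][i] = best
    return image

def generate_mips(volume, n_mips):
    """Return the N MIPs F_0 .. F_{N-1}, taken at angles k * (180 / N) degrees."""
    step = angular_step_deg(n_mips)
    return [mip_at_angle(volume, k * step) for k in range(n_mips)]
```

With this definition, 16 MIPs correspond to a step of 11.25° and 48 MIPs to a step of 3.75°.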

Occlusion Correction Algorithm

To address the problem of high-intensity organs (such as brain, heart, kidneys) occluding tumor annotations, a three-step processing pipeline is designed:

Occlusion detection: Verify that at least 75% of pixels in each marked tumor actually originate from tumor regions in the volumetric PET data
Annotation segmentation: For marked regions with tumor pixel ratios <75%, retain only pixels confirmed to originate from tumors
Low-contrast filtering: Remove tumor remnants with extremely low contrast that are imperceptible to the human eye
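
The three steps above can be sketched as follows for a single MIP. This is a hedged illustration only: the data layout (pixel sets and dictionaries), the function name `correct_occlusions`, the contrast definition, and the `min_contrast` threshold are assumptions made here, and only the 75% ratio comes from the description above. The sketch also assumes that the low-contrast filter applies to the trimmed remnants from step 2, that regions are non-empty, and that `background` is positive.

```python
def correct_occlusions(regions, source_is_tumor, image, background,
                       min_tumor_ratio=0.75, min_contrast=0.1):
    """Return corrected 2D annotation regions for one MIP.

    regions: list of sets of (row, col) pixels, one set per projected tumor.
    source_is_tumor: dict mapping pixel -> True if the voxel that produced the
        MIP maximum along that ray lies inside the 3D tumor annotation.
    image: dict mapping pixel -> MIP intensity.
    background: reference intensity for the contrast check (assumed, > 0).
    min_contrast: hypothetical threshold, not taken from the source.
    """
    corrected = []
    for region in regions:
        # Step 1: fraction of marked pixels that truly originate from tumor.
        confirmed = {p for p in region if source_is_tumor.get(p, False)}
        ratio = len(confirmed) / len(region)
        if ratio >= min_tumor_ratio:
            corrected.append(set(region))   # annotation kept unchanged
            continue
        # Step 2: keep only the pixels confirmed to come from tumor voxels.
        if not confirmed:
            continue                        # tumor fully occluded in this view
        # Step 3: drop remnants whose contrast against background is too low.
        mean_intensity = sum(image[p] for p in confirmed) / len(confirmed)
        contrast = (mean_intensity - background) / background
        if contrast >= min_contrast:
            corrected.append(confirmed)
    return corrected
```

Whether a pixel "originates from tumor" would in practice be decided by checking the 3D annotation at the depth where the ray attains its maximum; that lookup is abstracted here into `source_is_tumor`.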

Model Architecture

Segmentation model: Attention U-Net, which performed best among the CNN architectures evaluated
3D baseline: Swin-UNETR architecture, based on the 5th-place solution from the autoPET 2022 challenge
Classification model: CNN encoder + attention pooling + fully connected head
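
The summary names the classifier's components but not their exact form. As a rough illustration of what attention pooling over views can look like, the sketch below computes a softmax-weighted average of per-MIP feature vectors. The linear scoring function, the name `attention_pool`, and the parameter `score_weights` are assumptions made here, not details from the paper.

```python
import math

def attention_pool(view_features, score_weights):
    """Pool per-view feature vectors into one vector with softmax attention.

    view_features: list of equal-length feature vectors, one per MIP
        (in the described model these would come from the CNN encoder).
    score_weights: vector used to score each view (hypothetical; in a real
        model this would be a learned parameter).
    Returns (pooled_vector, attention_weights).
    """
    scores = [sum(w * f for w, f in zip(score_weights, feat))
              for feat in view_features]
    top = max(scores)                        # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(view_features[0])
    pooled = [sum(a * feat[j] for a, feat in zip(alphas, view_features))
              for j in range(dim)]
    return pooled, alphas
```

The pooled vector would then be passed to the fully connected head. With all-zero `score_weights` the operation reduces to plain averaging over views.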

Technical Innovations

Domain-aligned design: Direct training on MIP views commonly used by radiologists, enhancing clinical relevance
Computational efficiency optimization: 16 MIPs represent only approximately 4% of volumetric information, substantially reducing memory and computational requirements
Occlusion problem resolution: First systematic solution to occlusion problems in MIP annotations
End-to-end optimization: Eliminates the two-stage 3D segmentation-then-projection workflow
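
The "approximately 4%" figure follows from simple counting. Each MIP collapses the depth axis, so N projections hold N/D of the voxels of a volume that is D voxels deep along the projection direction. The sketch below makes that explicit. The depth of 400 voxels is an assumption chosen here because it reproduces the 4% figure; it is not stated in this summary.

```python
def mip_data_fraction(n_mips, depth):
    """Fraction of the volume's voxel count held by n_mips projections.

    A volume of H x W x depth voxels projects to H x W pixels per MIP,
    so the ratio is (n_mips * H * W) / (H * W * depth) = n_mips / depth.
    """
    return n_mips / depth

# Assumed depth of 400 voxels along the projection direction (hypothetical).
fraction_16 = mip_data_fraction(16, 400)   # 0.04, i.e. 4%
fraction_48 = mip_data_fraction(48, 400)   # 0.12, i.e. 12%
```

Under the same assumption, even the 48-view configuration uses only about an eighth of the data volume of the full 3D scan.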

Experimental Setup

Dataset

Data source: autoPET 2022 open-source dataset
Scale: 1,014 PET/CT scans from 900 patients
Disease types: Lung cancer, lymphoma, melanoma, healthy controls
Data distribution: Healthy (513), lymphoma (145), melanoma (188), lung cancer (168)

Data Partitioning

Independent test set: 15%
5-fold cross-validation: 85%
Consistent class distribution maintained
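
A partition of this kind can be sketched as a stratified split. The code below is an illustration under assumptions, not the authors' procedure: the function name `stratified_split`, the seed, and the round-robin fold assignment are choices made here. It splits by scan, and the summary does not say whether scans from the same patient were kept together, so a patient-level grouping step may be needed in practice.

```python
import random

def stratified_split(labels, test_fraction=0.15, n_folds=5, seed=0):
    """Split sample indices into a held-out test set and n_folds CV folds.

    Each class contributes about test_fraction of its samples to the test
    set, and the remainder is dealt round-robin into the folds, so every
    subset keeps roughly the original class distribution.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    test, folds = [], [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        for pos, idx in enumerate(members[n_test:]):
            folds[pos % n_folds].append(idx)
    return test, folds

# Class counts taken from the dataset description above.
labels = (["healthy"] * 513 + ["lymphoma"] * 145
          + ["melanoma"] * 188 + ["lung cancer"] * 168)
test_idx, cv_folds = stratified_split(labels)
```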

Evaluation Metrics

Segmentation tasks:

Dice Score: Overlap measure
IoU: Intersection over Union
Hausdorff Distance: Boundary accuracy
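
For reference, the three segmentation metrics can be written out on binary masks represented as sets of pixel coordinates. This is a minimal sketch with assumed conventions: the summary does not say how empty masks are handled or whether a percentile variant of the Hausdorff distance is used, so the plain maximum form is shown and empty masks are rejected for it.

```python
import math

def dice(pred, truth):
    """Dice score: 2 |A ∩ B| / (|A| + |B|); defined as 1.0 when both are empty."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

def iou(pred, truth):
    """Intersection over Union: |A ∩ B| / |A ∪ B|; 1.0 when both are empty."""
    union = pred | truth
    if not union:
        return 1.0
    return len(pred & truth) / len(union)

def hausdorff(pred, truth):
    """Symmetric Hausdorff distance between two non-empty pixel sets."""
    if not pred or not truth:
        raise ValueError("Hausdorff distance is undefined for empty masks")

    def directed(a, b):
        # Largest distance from a point in a to its nearest point in b.
        return max(min(math.dist(p, q) for q in b) for p in a)

    return max(directed(pred, truth), directed(truth, pred))
```

The brute-force Hausdorff computation is quadratic in the number of pixels, which is fine for illustration but slow for large masks.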

Classification tasks:

Accuracy, Precision, Recall, F1-score

Efficiency metrics:

Convergence time (CT): Time to reach peak validation performance
Per-epoch training time (TPE) and energy consumption (EPE)
Computational complexity (TFLOPs)

Comparison Methods

3D Swin-UNETR segmentation followed by MIP projection
3D classification models using the same CNN architecture

Experimental Results

Main Results

Segmentation Performance Comparison

Method	Dice Score	IoU	Hausdorff Distance
3D Projection	0.597±0.05	0.471±0.04	139.614±8.42
OR-MIPs	0.578±0.01	0.452±0.01	102.813±9.61
OC-MIPs	0.591±0.01	0.466±0.01	102.26±9.53

Efficiency Improvements

Metric	3D Method	OC-MIPs	Improvement Factor
Training time (hours)	54.64±19.22	13.18±4.1	4.1×
Per-epoch energy (Wh)	142.2±79.1	34.194±4.7	4.2×
TFLOPs	317.42±144.05	0.97±0.29	327×
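
The improvement factors in the table are ratios of the mean values, and each can also be expressed as a percentage reduction. The short sketch below shows both conversions using the table's means; the helper names are assumptions made here.

```python
def improvement_factor(baseline, proposed):
    """How many times smaller the proposed cost is than the baseline."""
    return baseline / proposed

def percent_reduction(baseline, proposed):
    """Relative saving of the proposed method, in percent."""
    return 100.0 * (1.0 - proposed / baseline)

# Mean values from the efficiency table above (3D method vs. OC-MIPs).
time_factor = improvement_factor(54.64, 13.18)      # training time, hours
energy_factor = improvement_factor(142.2, 34.194)   # per-epoch energy, Wh
flops_factor = improvement_factor(317.42, 0.97)     # TFLOPs

time_saving = percent_reduction(54.64, 13.18)
energy_saving = percent_reduction(142.2, 34.194)
```

Both savings come out at roughly 76%, consistent with the upper end of the ranges quoted in the abstract, and the TFLOPs ratio of over 300 corresponds to the stated two orders of magnitude.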

Classification Results

Metric	3D Data	16 MIPs	Improvement
Accuracy (%)	72.8±3.2	80.5±1.7	+7.7 points
F1 Score (%)	82.3±1.2	86.4±0.8	+4.1 points
Training time	44.7±1.5 hours	4.2±0.2 hours	10.6×

MIP Quantity Impact Analysis

Systematic analysis of the impact of 16, 32, 48, 64, and 80 MIPs:

Optimal configuration: 48 MIPs provide the highest and most stable Dice scores
Statistical significance: 16 and 32 MIPs show statistically significant differences in training sets
Efficiency balance: 48 MIPs achieve optimal balance between performance and computational efficiency

Key Findings

Performance equivalence: Wilcoxon signed-rank test shows no statistically significant difference between MIP and 3D methods (p=0.22)
Boundary accuracy: MIP method demonstrates superior performance in Hausdorff distance, with 26.7% improvement
Occlusion correction effectiveness: Only 0.57% of tumors are completely excluded, maintaining annotation integrity
Scalability: Computational complexity reduced by two orders of magnitude, significantly improving clinical applicability

Related Work

MIP Applications in Medical Imaging

Detection tasks: Kawakami et al. used YOLOv2 to detect physiological uptake on multi-directional MIPs
Classification applications: Takahashi et al. employed Xception models to improve breast cancer classification
Feature extraction: Toosi et al. extracted features from 72 MIPs for survival prediction

2D Projection Method Development

Enhanced 3D segmentation: Constantino et al. demonstrated that MIPs enhance 3D PET/CT segmentation
Volume reconstruction: Toosi et al. reconstructed volumetric segmentation from 2D MIPs
2.75D methods: Wang et al. combined multiple 2D views to enrich 3D learning

Advantages of This Work over Prior Work

First systematic direct MIP segmentation method
Innovative techniques for addressing MIP occlusion problems
Comprehensive efficiency and performance evaluation
Clinical workflow-aligned design

Conclusions and Discussion

Main Conclusions

Performance equivalence: Direct MIP segmentation maintains comparable performance to 3D methods while achieving significant computational efficiency gains
Optimal configuration: 48 MIP viewpoints represent the optimal balance between performance and efficiency
Clinical applicability: Substantially reduced computational requirements make the method more suitable for resource-constrained clinical environments
Method generalizability: Demonstrates advantages in both segmentation and classification tasks

Limitations

Single dataset: Validation only on autoPET 2022 dataset; broader validation needed
PET-specific: Current method primarily targets PET data; CT integration remains unexplored
3D information loss: Projection process inevitably loses some 3D spatial information
Occlusion handling: While improved, complex occlusion scenarios may still impact performance

Future Directions

Multi-modal integration: Map CT information to MIPs for joint PET/CT analysis
3D reconstruction: Explore methods for reconstructing 3D annotations from MIP segmentation results
Extended validation: Validate the method on more datasets and disease types
Real-time applications: Develop real-time MIP segmentation systems to support clinical decision-making

In-Depth Evaluation

Strengths

Strong innovation: First systematic direct MIP segmentation method, highly aligned with clinical practice
High practical value: Significant efficiency improvements provide strong clinical application potential
Comprehensive technical approach: Complete technical solutions from occlusion correction to optimal parameter analysis
Thorough validation: Comprehensive evaluation on both segmentation and classification tasks
Good reproducibility: Code and tools are publicly available

Weaknesses

Insufficient theoretical analysis: Lacks in-depth theoretical analysis of why MIP methods achieve comparable performance
Dataset limitations: Single dataset may limit the generalizability of conclusions
Missing clinical validation: Lacks validation studies in actual clinical environments
Limited comparison methods: Primarily compares with basic 3D methods; lacks comparison with latest SOTA methods

Impact

Academic contribution: Provides a new efficient paradigm for medical image segmentation
Clinical value: Likely to significantly improve the efficiency of automated PET scan analysis
Technology promotion: Method is extensible to other medical image projection analysis tasks
Resource optimization: Provides feasible solutions for resource-constrained environments

Applicable Scenarios

Clinical screening: Rapid preliminary analysis in large-scale tumor screening
Resource-constrained environments: Medical institutions with limited computational resources
Real-time applications: Clinical decision support systems requiring rapid response
Mobile healthcare: Medical image analysis on portable devices

References

The paper cites 34 references, covering mainly:

Medical image processing frameworks (MONAI, PyTorch)
PET/CT imaging technology fundamentals
Deep learning segmentation and classification methods
MIP applications in medical imaging
Related evaluation metrics and datasets

Overall Assessment: This is a high-quality medical image processing paper that proposes an innovative and practical method for segmenting tumors directly on MIPs. The paper combines academic rigor with an emphasis on clinical practicality and offers an efficient alternative to 3D pipelines for medical imaging AI. Despite some limitations, its large efficiency gains and comparable segmentation performance give it considerable academic and applied value.