Generalized Task-Driven Medical Image Quality Enhancement with Gradient Promotion
Zhang, Cheng
Thanks to the recent achievements in task-driven image quality enhancement (IQE) models like ESTR, the image enhancement model and the visual recognition model can mutually enhance each other's quantitation while producing high-quality processed images that are perceivable by our human vision systems. However, existing task-driven IQE models tend to overlook an underlying fact -- different levels of vision tasks have varying and sometimes conflicting requirements of image features. To address this problem, this paper proposes a generalized gradient promotion (GradProm) training strategy for task-driven IQE of medical images. Specifically, we partition a task-driven IQE system into two sub-models, i.e., a mainstream model for image enhancement and an auxiliary model for visual recognition. During training, GradProm updates only parameters of the image enhancement model using gradients of the visual recognition model and the image enhancement model, but only when gradients of these two sub-models are aligned in the same direction, which is measured by their cosine similarity. In case gradients of these two sub-models are not in the same direction, GradProm only uses the gradient of the image enhancement model to update its parameters. Theoretically, we have proved that the optimization direction of the image enhancement model will not be biased by the auxiliary visual recognition model under the implementation of GradProm. Empirically, extensive experimental results on four public yet challenging medical image datasets demonstrated the superior performance of GradProm over existing state-of-the-art methods.
academic
This paper proposes a generalized gradient promotion (GradProm) training strategy for task-driven medical image quality enhancement (IQE). While existing task-driven IQE models (such as ESTR) achieve mutual promotion between image enhancement and visual recognition models, they overlook an important fact: different levels of visual tasks have different and sometimes conflicting feature requirements. To address this issue, the paper divides the task-driven IQE system into two sub-models: a primary image enhancement model and an auxiliary visual recognition model. GradProm updates the image enhancement model parameters using gradients from both sub-models only when their gradient directions are consistent; otherwise, it uses only the image enhancement model's own gradients. The method is theoretically proven to ensure that the optimization direction of the image enhancement model is not biased by the auxiliary visual recognition model. Experimental results on four public medical image datasets validate its superiority.
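The gradient-selection rule described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The function names `gradprom_direction` and `sgd_step`, the flattened-list representation of gradients, the `threshold` parameter, and the choice to simply sum the two gradients when they are aligned are all assumptions made for this example.

```python
import math

def gradprom_direction(g_ip, g_vr, threshold=0.0, eps=1e-12):
    """Pick the update direction for the enhancement model's parameters.

    g_ip: gradient of the enhancement loss w.r.t. the enhancement parameters
    g_vr: gradient of the recognition loss w.r.t. the same parameters
    Both are flat lists of floats (hypothetical representation).
    Returns (direction, cosine_similarity).
    """
    dot = sum(a * b for a, b in zip(g_ip, g_vr))
    norm_ip = math.sqrt(sum(a * a for a in g_ip))
    norm_vr = math.sqrt(sum(b * b for b in g_vr))
    denom = norm_ip * norm_vr
    if denom <= eps:
        # Degenerate case (a zero gradient): fall back to the primary gradient.
        return list(g_ip), 0.0
    cos = dot / denom
    if cos >= threshold:
        # Aligned: assumed here to mean "use the sum of both gradients".
        return [a + b for a, b in zip(g_ip, g_vr)], cos
    # Conflicting: use only the enhancement model's own gradient.
    return list(g_ip), cos

def sgd_step(params, direction, lr=0.1):
    """Plain gradient-descent step along the selected direction."""
    return [p - lr * d for p, d in zip(params, direction)]

# Aligned gradients: both contribute to the update.
aligned, cos_aligned = gradprom_direction([1.0, 0.0], [1.0, 1.0])
# Conflicting gradients: the auxiliary gradient is discarded.
conflict, cos_conflict = gradprom_direction([1.0, 0.0], [-1.0, 1.0])
new_params = sgd_step([0.5, 0.5], conflict)
```

In a real training loop the two gradients would come from separate backward passes through the enhancement loss and the recognition loss. The threshold of 0 matches the setting mentioned in the limitations, and exposing it as a parameter is only a convenience of this sketch.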
Medical image analysis plays an increasingly important role in modern medical systems, helping physicians visualize internal anatomical structures and assess disease progression. Image quality is critical for medical image analysis, as higher quality images typically yield more accurate recognition performance.
Issues with Perception-Oriented Approaches: Traditional perception-oriented medical image processing methods primarily pursue high-quality performance aligned with human visual perception. However, enhanced visual perception quality does not necessarily translate to beneficial information for downstream visual recognition models.
Deficiencies in Task-Driven Methods: While existing task-driven IQE methods jointly train image enhancement and visual recognition models, they overlook an important fact—different levels of computer vision tasks have different and sometimes conflicting feature requirements.
As illustrated in Figure 2, given the same input image, denoising tasks focus on all image regions, semantic segmentation tasks focus on foreground object regions, while diagnostic tasks focus on discriminative local regions of foreground objects. This inconsistency in feature requirements creates potential conflicts between upstream image enhancement models and downstream visual recognition models, affecting performance.
Proposes a new paradigm for task-driven medical IQE: Explicitly divides the system into primary image enhancement and auxiliary visual recognition sub-models
Designs the GradProm training strategy: A simple yet effective generalized training strategy that dynamically trains both sub-models and achieves continuous performance improvement without requiring additional data or network architecture modifications
Provides theoretical proof: Demonstrates that GradProm converges to local optima without bias from the auxiliary visual recognition model
Comprehensive experimental validation: Conducts extensive experiments on four public medical image datasets, demonstrating that GradProm achieves state-of-the-art performance on IQE tasks
Task-driven medical IQE is essentially an image enhancement task: the input is a low-quality image X, and the corresponding high-quality image Y serves as the label. Training aims to make the enhanced image, i.e., X processed by the image enhancement model IP, as close as possible to Y, with the visual recognition model VR acting on the enhanced image as an auxiliary task.
Proof Highlights: By proving that the update direction has a non-negative inner product with the primary model's gradient, the correctness of the optimization direction is ensured, preventing bias introduction from the auxiliary model.
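The inner-product argument can be written out briefly. The notation here is assumed for illustration and may differ from the paper's: g_IP and g_VR denote the gradients of the enhancement and recognition losses with respect to the enhancement model's parameters, d is the update direction, and the aligned case is assumed to use the plain sum of the two gradients.

```latex
d =
\begin{cases}
g_{\mathrm{IP}} + g_{\mathrm{VR}}, & \text{if } \cos(g_{\mathrm{IP}}, g_{\mathrm{VR}}) \ge 0, \\
g_{\mathrm{IP}}, & \text{otherwise.}
\end{cases}

% Aligned case: \cos \ge 0 is equivalent to \langle g_{\mathrm{IP}}, g_{\mathrm{VR}} \rangle \ge 0.
\langle d, g_{\mathrm{IP}} \rangle
  = \lVert g_{\mathrm{IP}} \rVert^{2} + \langle g_{\mathrm{IP}}, g_{\mathrm{VR}} \rangle
  \ge \lVert g_{\mathrm{IP}} \rVert^{2} \ge 0

% Conflicting case: the auxiliary gradient is dropped.
\langle d, g_{\mathrm{IP}} \rangle = \lVert g_{\mathrm{IP}} \rVert^{2} \ge 0
```

In both cases the update direction never points against the primary model's gradient, which is the property the proof relies on.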
Performance comparison at different noise levels (Tables 1 and 2):

| Method (noise σ=0.1) | PSNR↑ | SSIM↑ |
|---|---|---|
| Frozen-params | 32.152 | 0.906 |
| GradProm | 33.383 | 0.915 |
GradProm outperforms baseline methods at various noise levels, achieving 1.231 PSNR and 0.009 SSIM improvements over the Frozen-params method at σ=0.1.
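To put the PSNR gain in context, the sketch below computes PSNR from its standard definition and converts the reported gain into a mean-squared-error ratio. The helper name `psnr`, the flat-list image representation, and the peak value of 1.0 are assumptions for this example. The conversion also assumes both reported PSNR values use the same peak value.

```python
import math

def psnr(reference, test, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Reported values at sigma = 0.1 (from the table above).
psnr_frozen, psnr_gradprom = 32.152, 33.383
ssim_frozen, ssim_gradprom = 0.906, 0.915

psnr_gain = round(psnr_gradprom - psnr_frozen, 3)  # 1.231
ssim_gain = round(ssim_gradprom - ssim_frozen, 3)  # 0.009

# A PSNR gain of g dB corresponds to the MSE shrinking by a factor 10^(-g/10).
mse_ratio = 10.0 ** (-psnr_gain / 10.0)

# Toy check of the helper: a constant error of 0.1 on a unit-range image.
toy_psnr = psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1])
```

Under these assumptions, a gain of 1.231 dB corresponds to a mean squared error roughly a quarter lower than that of the Frozen-params baseline.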
Simultaneously using diagnosis and segmentation as auxiliary tasks did not improve performance; instead, performance decreased, confirming the hypothesis of inconsistent feature requirements across different visual tasks.
In cross-domain experiments trained on ISIC 2018 and tested on Lizard, GradProm achieves PSNR/SSIM improvements of 13.273/0.325 and 13.825/0.458 over ESTR in unsupervised and supervised settings, respectively.
Multi-task Learning: Leveraging useful knowledge from related tasks to improve overall performance of all involved tasks
Auxiliary Learning: When multiple tasks have different importance levels, tasks are divided into primary and auxiliary tasks
This paper frames task-driven medical image quality enhancement as an auxiliary learning paradigm, where image processing is the primary task and image recognition is the auxiliary task.
Gradient Computation Overhead: Requires additional gradient similarity computation, increasing training time
Simplistic Threshold Setting: Using only 0 as the threshold may be too coarse; finer-grained strategies could yield better results
Limited Cross-Domain Validation: While generalization across different medical imaging modalities is validated, cross-domain validation is insufficient
Limited Baseline Selection: Some comparison methods may not be the most recent SOTA approaches
The paper cites abundant related work, primarily including:
ESTR [1] - Representative work in task-driven image quality enhancement
ResNet [6] - Classical deep learning architecture
UNet [39] - Classical method for medical image segmentation
Multiple papers on medical image datasets [40-43]
Overall Assessment: This is a high-quality computer vision paper that proposes an innovative solution to a key problem in task-driven medical image quality enhancement. The method is simple and effective, with solid theoretical foundations and comprehensive experimental validation, demonstrating significant academic and practical value.