AquaCluster: Using Satellite Images And Self-supervised Machine Learning Networks To Detect Water Hidden Under Vegetation
Iakovidis, Kalantari, Payberah et al.
In recent years, the wide availability of high-resolution radar satellite images has enabled the remote monitoring of wetland surface areas. Machine learning models have achieved state-of-the-art results in segmenting wetlands from satellite images. However, these models require large amounts of manually annotated satellite images, which are slow and expensive to produce. The need for annotated training data makes it difficult to adapt these models to changes such as different climates or sensors. To address this issue, we employed self-supervised training methods to develop a model, AquaCluster, which segments radar satellite images into water and land areas without manual annotations. Our final model outperformed other radar-based water detection techniques that do not require annotated data in our test dataset, having achieved a 0.08 improvement in the Intersection over Union metric. Our results demonstrate that it is possible to train machine learning models to detect vegetated water from radar images without the use of annotated data, which can make the retraining of these models to account for changes much easier.
The recent widespread availability of high-resolution radar satellite imagery has enabled remote monitoring of wetland surface area. Machine learning models have achieved state-of-the-art results in wetland segmentation on satellite images. However, these models require large quantities of manually annotated satellite images, which are costly and time-consuming to produce. The demand for annotated training data makes these models difficult to adapt to variations in climate, sensors, and other factors. To address this issue, this research uses self-supervised training methods to develop the AquaCluster model, which segments radar satellite images into water and land regions without manual annotation. On the test dataset, the model outperforms other annotation-free radar water detection techniques, achieving a 0.08 improvement in the Intersection over Union (IoU) metric. The results demonstrate that machine learning models can be trained to detect vegetation-covered water bodies from radar images without annotated data, making it easier to retrain models to adapt to changing conditions.
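For readers unfamiliar with the metric, IoU for a binary water mask is the number of pixels labeled water in both the prediction and the reference, divided by the number labeled water in either. The following is a minimal sketch of that definition, not the paper's evaluation code; the function name `iou`, the flat 0/1 list representation, and the convention for two empty masks are assumptions made for illustration.

```python
def iou(pred, truth):
    """Intersection over Union for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled water (1) in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Pixels labeled water (1) in at least one mask.
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as perfect agreement.
    return intersection / union if union else 1.0


# Hypothetical six-pixel example: 2 pixels overlap, 4 pixels are water in either mask.
truth = [1, 1, 1, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0]
score = iou(pred, truth)
print(score)
```

On this scale, the reported 0.08 improvement is an absolute gain in the overlap ratio, which ranges from 0 (no overlap) to 1 (identical masks).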
Importance of Wetland Monitoring: Although wetlands occupy only a small fraction of Earth's surface, they play a critical role in environmental protection and climate impact mitigation, including water purification, flood risk reduction, and carbon storage. However, wetlands are disappearing at an alarming rate due to climate change and human activities.
Challenges in Detecting Vegetation-Covered Water Bodies: Traditional optical satellite images perform well in detecting open water bodies but struggle with partially or completely vegetation-covered wetland water bodies, as optical sensors cannot penetrate vegetation. While radar sensors can penetrate vegetation to detect water beneath, radar images contain noise (such as speckle noise), making it difficult to distinguish water from land.
Limitations of Existing Methods:
Deep learning models such as CNNs perform well in wetland segmentation tasks but require large quantities of annotated data
Producing annotated data is costly and time-consuming, particularly in remote sensing where specialized knowledge is required
Models struggle to adapt to variations in climate conditions or sensors
Existing approaches depend on global or national-level datasets that are updated too infrequently to support seasonal water body monitoring
The core motivation of this research is to develop a fully self-supervised machine learning framework that can achieve wetland water-land segmentation using only radar satellite images, addressing the dependency on annotated data and improving model scalability and adaptability.
Proposed the AquaCluster Framework: A fully self-supervised machine learning framework for wetland semantic segmentation using only radar satellite images, addressing the challenge of detecting water bodies beneath vegetation without annotated data.
Introduced Ensemble Model Version: To improve accuracy and stability, an ensemble version combining predictions from multiple independently trained networks is proposed.
Validated Effectiveness of Annotation-Free Training: Demonstrated that the ensemble AquaCluster model outperforms both the Otsu statistical baseline and the optical-based Dynamic World model on the same dataset.
Provided Open-Source Implementation: All source code, test datasets, and pre-trained models are released on GitHub, supporting reproducibility and adoption.
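The ensemble idea in the contributions above can be illustrated with a small sketch. This summary does not state how the paper combines the predictions of the independently trained networks, so the per-pixel majority vote below is an assumption chosen for illustration, and `majority_vote` and `model_masks` are hypothetical names.

```python
def majority_vote(masks):
    """Combine binary masks (one flat 0/1 list per model) by per-pixel majority vote."""
    if not masks:
        raise ValueError("need at least one mask")
    n_pixels = len(masks[0])
    if any(len(m) != n_pixels for m in masks):
        raise ValueError("all masks must have the same number of pixels")
    half = len(masks) / 2
    # A pixel is labeled water (1) only when strictly more than half of the models agree.
    return [1 if sum(votes) > half else 0 for votes in zip(*masks)]


# Hypothetical predictions from three independently trained models on four pixels.
model_masks = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
ensemble_mask = majority_vote(model_masks)
print(ensemble_mask)
```

Voting of this kind lets the models that agree outvote a single poorly converged network, which is one plausible way an ensemble could reduce the run-to-run variability of individual models.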
Ensemble Model Performs Best: The AquaCluster ensemble version demonstrates superior performance across all metrics
Significant Recall Improvement: Compared to the Otsu method, AquaCluster shows substantial improvements in recall and IoU
Outperforms Optical Methods: Dynamic World performs worst across all metrics, demonstrating the advantage of radar data in detecting vegetation-covered water bodies
Model Stability: Individual AquaCluster models show high performance variability (IoU ranging from 0.7 to 0.9), with ensemble methods effectively improving stability
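For context on the Otsu baseline mentioned in these findings, Otsu's method picks the single intensity threshold that maximizes the between-class variance of a histogram, splitting pixels into a darker and a brighter class. The sketch below is a generic pure-Python version of that standard algorithm, not the paper's baseline code; the function name, the bin count, and the sample backscatter values in dB are hypothetical, and which class corresponds to water depends on the scene.

```python
def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo  # constant image: no meaningful split exists
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin.
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h          # pixel count in the darker class (bins 0..i)
        sum0 += i * h    # weighted bin sum of the darker class
        w1 = total - w0  # pixel count in the brighter class
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_bin = between_var, i
    # Convert the upper edge of the best bin back to the original value scale.
    return lo + (best_bin + 1) * width


# Hypothetical backscatter values (dB) forming two well-separated groups.
backscatter = [-22, -21, -20, -19, -9, -8, -7, -6]
threshold = otsu_threshold(backscatter)
labels = [1 if v > threshold else 0 for v in backscatter]
print(threshold, labels)
```

Because the threshold is a single global value computed from pixel intensities alone, it uses no spatial context, which may help explain why a learned segmentation model can improve on it in noisy radar scenes.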
This work's advantages over existing research include: specialized design for radar images, no optical data requirement, and fully self-supervised training.
The paper cites 60 relevant references covering multiple domains including wetland ecology, remote sensing technology, deep learning, and self-supervised learning, providing a solid theoretical foundation for the research.
Overall Assessment: This is a high-quality, application-oriented research paper that proposes an innovative solution to a practical problem, with a modest technical contribution and high practical value. Although it has some limitations in theoretical analysis and dataset scale, its open-source release and practical value make it an important work in the field.