π Land Type Identification via Satellite Imagery
- Project Type: Group Research Project
- With: Gabriel Mariadass and Vincent Gefflaut, supervised by Konstantin Kuznetsov & Pavel Litvinov (GRASP Earth)
- Ressources: Read the report: Report
- When: 2024
- Associated to: ENPC
π Overview
This project explored unsupervised land type classification based on satellite-derived climatological data. Leveraging a global dataset provided by GRASP Earth, we applied dimensionality reduction and clustering techniques to segment the Earthβs surface based on optical soil properties.
π°οΈ Dataset
We used the POLDER climatological dataset, containing 131 optical variables over 64,800 geolocated points at 1.1Β° resolution. The focus was on land-only regions, with variables such as:
- NDVI (vegetation index)
- DHR (Bidirectional Reflectance Factors at multiple wavelengths)
π Methodology
1. Data Preprocessing
- Variable selection based on physical relevance (NDVI, DHRs)
- Normalization and correlation analysis to reduce redundancy
2. Clustering on Raw Features
- Implemented Hierarchical Agglomerative Clustering (HAC) and K-Means
- Visualization over satellite maps
- Quantitative comparison using Silhouette Score
3. Dimensionality Reduction
- Applied PCA to capture >95% variance with only 3 components
- Clustering in PCA space preserved spatial structure with minimal information loss
4. Variational Autoencoder (VAE)
- Built a custom VAE for non-linear dimensionality reduction
- Trained on physical land parameters to learn a structured latent space
- Achieved superior clustering performance (Silhouette score: 0.40, vs. PCA: 0.27)
- Explored 1D VAE for interpretable, segment-based clustering
π§ Highlights
- Demonstrated effective land segmentation using unsupervised learning
- Built pipelines for data ingestion, preprocessing, clustering, and visualization
- Integrated both classical ML and deep learning (VAE) techniques
- Visualized clusters over satellite imagery for interpretability
- Latter, we developped a web application to let the users choose the variables and parameters, and explore the results, linking the segmentation with other classifications.
π Tools & Technologies
- Python (Pandas, Scikit-learn, PyTorch, Seaborn, Plotly)
- Satellite data processing (xarray, netCDF4)
- Custom interactive map generation (Mapbox)
- Web Application, via Plotly
