Landslide Hazard Engine

End-to-end pipeline predicting landslide susceptibility from terrain derivatives, rainfall signals, and land cover.

Python · XGBoost · Rasterio · FastAPI · Streamlit

Context

Road infrastructure in mountainous terrain faces persistent landslide risk. Traditional hazard mapping relies on expert-drawn polygons — expensive, inconsistent, and not reproducible across regions or time.

Problem

There was no systematic, data-driven method for assessing landslide susceptibility along road corridors at scale, and manually drawn maps could not be updated as terrain conditions, land cover, or rainfall patterns changed.

Approach

  • Engineered 14 terrain features from a 30 m DEM, including slope, curvature, topographic wetness index (TWI), roughness, topographic position index (TPI), and aspect derivatives
  • Integrated rainfall proxies, land cover classification, and geological unit boundaries
  • Built spatial cross-validation to prevent geographic data leakage between train and test sets
  • Trained an XGBoost classifier on a labeled landslide inventory, with probability calibration
  • Deployed a risk tile API via FastAPI, with an interactive Streamlit dashboard
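
As an illustration of the terrain-feature step, here is a minimal sketch of two of the derivatives named above, slope and TPI, computed with NumPy on an in-memory elevation grid. The function names, the 3x3 neighbourhood for TPI, and the edge handling are assumptions made for this example, not the project's actual implementation. In practice the grid would be read from the DEM raster, for example with Rasterio.

```python
import numpy as np


def slope_degrees(dem: np.ndarray, cell_size: float = 30.0) -> np.ndarray:
    """Slope in degrees from finite-difference elevation gradients."""
    # np.gradient returns one array per axis: change per metre along rows, then columns.
    dz_drow, dz_dcol = np.gradient(dem.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_drow, dz_dcol)))


def tpi(dem: np.ndarray) -> np.ndarray:
    """Topographic position index: elevation minus the mean of the 8 neighbours."""
    dem = dem.astype(float)
    rows, cols = dem.shape
    # Repeat edge values so border cells still have eight neighbours (an assumption).
    padded = np.pad(dem, 1, mode="edge")
    neighbour_sum = np.zeros_like(dem)
    for dr in range(3):
        for dc in range(3):
            if (dr, dc) != (1, 1):  # skip the centre cell itself
                neighbour_sum += padded[dr:dr + rows, dc:dc + cols]
    return dem - neighbour_sum / 8.0


# Synthetic check: a plane rising 30 m per 30 m cell should slope at 45 degrees
# and have a TPI of zero away from the edges.
plane = np.tile(np.arange(5) * 30.0, (5, 1))
print(slope_degrees(plane)[2, 2])
print(tpi(plane)[2, 2])
```

Positive TPI marks cells that sit above their surroundings (ridges), negative TPI marks hollows and valley floors.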

Technical Decisions

  • XGBoost over deep learning — interpretability matters when infrastructure teams need to justify spending decisions
  • Spatial CV over random CV: standard k-fold puts spatially autocorrelated neighbors on both sides of the train/test split, which leaks information and inflates metrics
  • Tile-based output over pixel-based — aligns with how road maintenance teams actually plan interventions
  • FastAPI serving layer — risk tiles accessible via API, not locked in a notebook
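
The spatial CV decision can be sketched as block-based fold assignment: samples are grouped into square blocks by coordinate, and whole blocks, never individual samples, are assigned to folds. This is one common way to do spatial cross-validation and is an assumption here; the function name, block size, and round-robin assignment are hypothetical, and the project may use a different blocking scheme.

```python
from collections import defaultdict
from typing import Iterator, List, Sequence, Tuple


def spatial_block_splits(
    coords: Sequence[Tuple[float, float]],
    block_size: float,
    n_folds: int,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) with whole spatial blocks held out."""
    if n_folds < 2:
        raise ValueError("n_folds must be at least 2")

    # Group sample indices by the square block their coordinates fall into.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(idx)
    if len(blocks) < n_folds:
        raise ValueError("need at least as many occupied blocks as folds")

    # Assign blocks to folds round-robin in a deterministic order.
    folds: List[List[int]] = [[] for _ in range(n_folds)]
    for i, key in enumerate(sorted(blocks)):
        folds[i % n_folds].extend(blocks[key])

    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, test


# Hypothetical sample coordinates in metres; samples 0 and 1 share a 1 km block.
coords = [(0, 0), (10, 10), (1500, 0), (1510, 5), (0, 1500), (3000, 3000)]
for train, test in spatial_block_splits(coords, block_size=1000, n_folds=3):
    print("train", train, "test", test)
```

Samples in the same block always land on the same side of a split. Adjacent blocks can still end up in different folds, so a stricter variant would drop a buffer of samples around each test block.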

Trade-offs

  • 30 m DEM limits spatial resolution, which is acceptable for corridor-scale assessment but insufficient for site-specific geotechnical engineering
  • Binary classification with probability output vs. multi-class severity — chose calibrated probabilities with flexible thresholds over rigid categories
  • Excluded dynamic triggers (real-time rainfall) to keep the system static and reproducible for long-term planning use
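
To show what flexible thresholds over calibrated probabilities can look like, here is a small sketch that maps a probability to a category using cut points the caller supplies. The cut points and labels below are placeholders chosen for illustration, not values from the project.

```python
from bisect import bisect_right
from typing import Sequence


def risk_category(
    prob: float,
    thresholds: Sequence[float] = (0.2, 0.5, 0.8),  # hypothetical cut points
    labels: Sequence[str] = ("low", "moderate", "high", "very high"),
) -> str:
    """Map a calibrated probability to a label using adjustable cut points."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be sorted in ascending order")
    if len(labels) != len(thresholds) + 1:
        raise ValueError("need exactly one more label than thresholds")
    # bisect_right counts the thresholds that are <= prob, which is the label index.
    return labels[bisect_right(thresholds, prob)]


# The same probability can be reclassified under a different policy
# without retraining the model.
print(risk_category(0.6))
print(risk_category(0.6, thresholds=(0.7,), labels=("monitor", "inspect")))
```

Because the model output stays a probability, each team can choose cut points that match its own budget and risk tolerance.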

Outcome

The pipeline produces ranked risk tiles for municipal infrastructure prioritization. It is repeatable: new data in, updated risk map out. It was designed for ongoing adoption rather than as a one-off analysis.
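
A minimal sketch of how pixel-level probabilities could be rolled up into ranked tiles, assuming each tile's score is the mean probability of the pixels it covers. The aggregation rule, the function name, and the tile size are assumptions for illustration; a maximum or an upper percentile would be equally plausible choices.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Sequence, Tuple

Tile = Tuple[int, int]


def rank_tiles(
    prob_grid: Sequence[Sequence[float]],
    tile_size: int,
) -> List[Tuple[Tile, float]]:
    """Return ((tile_row, tile_col), mean_probability) pairs, highest risk first."""
    if tile_size < 1:
        raise ValueError("tile_size must be a positive number of pixels")

    # Collect the pixel probabilities that fall inside each tile.
    pixels: Dict[Tile, List[float]] = defaultdict(list)
    for r, row in enumerate(prob_grid):
        for c, p in enumerate(row):
            pixels[(r // tile_size, c // tile_size)].append(p)

    # Score each tile by its mean probability (an assumed aggregation rule).
    scores = {tile: fmean(values) for tile, values in pixels.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical 4x4 probability grid split into 2x2-pixel tiles.
grid = [
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
    [0.5, 0.5, 0.3, 0.3],
    [0.5, 0.5, 0.3, 0.3],
]
for tile, score in rank_tiles(grid, tile_size=2):
    print(tile, round(score, 2))
```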