Machine Learning for Improved Post-Fire Debris Flow Likelihood Prediction
Timely prediction of debris flow probabilities in areas impacted by wildfires is crucial for mitigating public exposure to this hazard during post-fire rainstorms. This paper presents a machine learning approach that augments an existing dataset of post-fire debris flow events with additional features reflecting existing vegetation type and geology, and trains traditional machine learning and deep learning models on a randomly selected subset of the data. The developed models achieve AUC (area under the receiver operating characteristic curve) values of 0.93 (random forest) and 0.92 (neural network) on the test set, a substantial improvement over the logistic regression model currently in use (AUC 0.79). The paper also gives an overview of a distributed, Kubernetes-based big data processing pipeline that efficiently retrieves features in areas impacted by new fires and deploys the models for real-time prediction of debris flow hazards.
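
For readers unfamiliar with the evaluation metric, AUC can be read as the probability that a randomly chosen debris flow event receives a higher predicted likelihood than a randomly chosen non-event. The following is a minimal sketch of that pairwise definition in plain Python. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not the paper's code or data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a debris flow event, 0 for no event.
    scores: predicted likelihoods from any model.
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of samples and is meant only to make the definition concrete. A library routine based on sorting is the usual choice for a full dataset.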