Abstract
Wildfire followed by large precipitation events is known to trigger both flooding and debris flows in mountainous regions. The ability to predict and mitigate these hazards is crucial for protecting public safety and infrastructure. A re-evaluation of existing prediction models from the literature highlighted the need for more advanced modeling techniques. Data from 15 individual burn basins in the intermountain western United States, comprising 388 instances and 26 variables, were obtained from the United States Geological Survey (USGS). A randomly selected subset of the data was set aside as a validation set, and machine learning-based predictive models were developed using the remaining training data. Tenfold cross-validation was applied to the training data to obtain nearly unbiased error estimates and to avoid model over-fitting. Linear, nonlinear, and rule-based predictive models, including naïve Bayes, mixture discriminant analysis, classification trees, and logistic regression models, were developed and tested on the validation dataset. The new nonlinear approaches were nearly twice as successful as the linear models previously published in the debris flow prediction literature. The new prediction models advance the state of the art in debris flow prediction and improve the ability to accurately predict debris flow events in the wildfire-prone intermountain western United States.
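The data-splitting scheme described in the abstract (a random hold-out validation set, then tenfold cross-validation on the remaining training data) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the function names `holdout_split` and `k_fold_partitions`, the 25% hold-out share, and the random seed are assumptions, since the abstract does not state them. Only the count of 388 instances comes from the text.

```python
import random


def holdout_split(n_instances, holdout_fraction, seed):
    """Randomly split indices 0..n-1 into (training, validation) lists."""
    rng = random.Random(seed)
    indices = list(range(n_instances))
    rng.shuffle(indices)
    n_holdout = round(n_instances * holdout_fraction)
    return indices[n_holdout:], indices[:n_holdout]


def k_fold_partitions(train_indices, k=10):
    """Yield (fit, assess) index lists; each training index is assessed once."""
    # Striding by k gives k disjoint folds whose sizes differ by at most one.
    folds = [train_indices[i::k] for i in range(k)]
    for i, assess in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, assess


# 388 instances is taken from the abstract.
# The 25% hold-out share and the seed are assumptions for illustration.
train, validation = holdout_split(388, holdout_fraction=0.25, seed=0)
splits = list(k_fold_partitions(train, k=10))
```

In such a scheme, a candidate model would be fitted on each `fit` list and scored on the matching `assess` list, and the ten scores averaged to estimate error. The `validation` indices stay untouched until the final comparison of models.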
Acknowledgements
This project was funded by the US Department of Transportation (USDOT) through the Office of the Assistant Secretary for Research and Technology. The authors would also like to thank the following individuals for their contributions to the work described: Caesar Singh, USDOT program manager, and Susan Cannon for providing data and necessary guidance.
Additional information
Disclaimer: The views, opinions, findings, and conclusions reflected in this paper are the responsibility of the authors only and do not represent the official policy or position of the USDOT/OST-R or any State or other entity.
About this article
Cite this article
Kern, A.N., Addison, P., Oommen, T. et al. Machine Learning Based Predictive Modeling of Debris Flow Probability Following Wildfire in the Intermountain Western United States. Math Geosci 49, 717–735 (2017). https://doi.org/10.1007/s11004-017-9681-2
DOI: https://doi.org/10.1007/s11004-017-9681-2