Landslide susceptibility assessment in the Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve Bayes, multilayer perceptron neural networks, and functional trees methods

Pham, Binh Thai; Tien Bui, Dieu; Pourghasemi, Hamid Reza; Indra, Prakash; Dholakia, M. B.

doi:10.1007/s00704-015-1702-9


Original Paper
Published: 23 December 2015

Volume 128, pages 255–273, (2017)

Theoretical and Applied Climatology


Binh Thai Pham¹,²,
Dieu Tien Bui³,
Hamid Reza Pourghasemi⁴,
Prakash Indra⁵ &
M. B. Dholakia⁶


Abstract

The objective of this study is to compare the prediction performance of three techniques, Functional Trees (FT), Multilayer Perceptron Neural Networks (MLP Neural Nets), and Naïve Bayes (NB), for landslide susceptibility assessment in the Uttarakhand area (India). First, a landslide inventory map with 430 landslide locations in the study area was constructed from various sources. The landslide locations were then randomly split into two parts: (i) 70 % of the locations were used for training the models, and (ii) the remaining 30 % were used for validation. Second, a total of eleven landslide conditioning factors (slope angle, slope aspect, elevation, curvature, lithology, soil, land cover, distance to roads, distance to lineaments, distance to rivers, and rainfall) were used in the analysis to elucidate the spatial relationship between these factors and landslide occurrences. Feature selection based on the Linear Support Vector Machine (LSVM) algorithm was employed to assess the prediction capability of these conditioning factors in the landslide models. Subsequently, the NB, MLP Neural Nets, and FT models were constructed using the training dataset. Finally, success rate and predictive rate curves were employed to validate and compare the predictive capability of the three models. Overall, all three models performed very well for landslide susceptibility assessment. Among them, the MLP Neural Nets and FT models had almost the same predictive capability, with the MLP Neural Nets model (AUC = 0.850) slightly better than the FT model (AUC = 0.849). The NB model (AUC = 0.838) had the lowest predictive capability of the three. Landslide susceptibility maps were finally developed using these three models. These maps would be helpful to planners and engineers for development activities and land-use planning.


1 Introduction

Landslides are considered one of the most typical natural hazards affecting human lives and property in India (Sarkar et al. 2011). Guha-Sapir et al. (2014) reported that, after China, India is one of the countries in Asia most affected by landslides. According to the Geological Survey of India, the landslide-affected area is about 0.49 million km². Approximately 15 % of the land area of India is vulnerable to landslide hazard (Onagh et al. 2012). Landslides occur mainly in the Himalayan mountain range, including Uttarakhand state in northern India. Many areas of Uttarakhand state have been subjected to severe erosion of rock and soil and to undercutting of slopes by rivers and anthropogenic activities. With ongoing development activities, landslides are becoming a serious hazard that requires urgent attention to mitigate the damage. A landslide susceptibility map is considered a standard tool in land management decision-making for mitigating damage arising from landslides. In addition, these maps help planners select suitable areas for development schemes (Akgun 2012; Ozdemir and Altural 2013).

Over the decades, landslide susceptibility mapping has attracted growing attention from researchers throughout the world (Dou et al. 2015c; Hong et al. 2015b; Kavzoglu et al. 2015; Pareek et al. 2013; Shahabi et al. 2014; Tien Bui et al. 2014b). A Geographical Information System (GIS) is a useful tool for producing landslide susceptibility maps. Nowadays, the data used in GIS are easier and quicker to obtain owing to the development of Global Positioning Systems (GPS) and Remote Sensing (RS) techniques. Therefore, GIS has become increasingly popular in the spatial analysis of landslides (Althuwaynee et al. 2012).

Over the last decades, many GIS-based landslide susceptibility mapping studies have been carried out in regions all over the world, using methods such as weights of evidence (Kayastha et al. 2012; Mohammady et al. 2012; Ozdemir and Altural 2013; Pourghasemi et al. 2013b; Regmi et al. 2014), frequency ratio (Choi et al. 2012; Shahabi et al. 2014; Yalcin et al. 2011), logistic regression (Bai et al. 2011; Conoscenti et al. 2014; Das et al. 2010; Garcia-Rodriguez et al. 2008; Ohlmacher and Davis 2003), evidential belief function (Althuwaynee et al. 2012; Jebur et al. 2015; Pradhan et al. 2014), fuzzy logic (Aksoy and Ercanoglu 2012; Pourghasemi et al. 2012; Pradhan 2011; Saboya Jr et al. 2006; Tien Bui et al. 2012d), neuro-fuzzy (Oh and Pradhan 2011; Pradhan et al. 2010b; Sezer et al. 2013; Tien Bui et al. 2012c; Vahidnia et al. 2010), and support vector machine (Hong et al. 2015a; Marjanovic et al. 2011; Peng et al. 2014; Pourghasemi et al. 2013a; Tien Bui et al. 2012b; Xu et al. 2012).

Overall, these methods can be grouped into three approaches: (i) the analytic approach, (ii) the statistical approach, and (iii) the soft computing approach (Pradhan 2013). The analytic approach (safety-factor-based approach) is only feasible in small areas where landslide types are simple and geologic properties are fairly homogeneous (Dou et al. 2014; Scolobig and Pelling 2015). It is almost impossible to apply in large areas (Pradhan 2013). Therefore, statistical and soft computing approaches are usually chosen for landslide susceptibility assessment in large areas. Moreover, these approaches can be implemented comparatively easily using GIS (Akgun et al. 2012; Pradhan 2013). Recently, some relatively new soft-computing-based models, such as Naïve Bayes (NB), Multilayer Perceptron Neural Networks (MLP Neural Nets), and Functional Trees (FT), have been developed and applied successfully to classification tasks, and they can also be applied to the spatial prediction of landslide occurrence. In general, there is no agreement on which method is best for all areas; therefore, the performance of these methods should be compared (Aksoy and Ercanoglu 2012; Alimohammadlou et al. 2014; Kavzoglu et al. 2014; Regmi et al. 2014; Shahabi et al. 2014; Zare et al. 2013).

NB is a new technique in classification and was selected for the present study to construct a landslide susceptibility map. This method has already been used widely and successfully in many studies (Aguiar-Pulido et al. 2012; Lepora et al. 2010; Lu et al. 2010; Rosen et al. 2011; Zhang and Gao 2011). However, its application to landslide problems is still rare (Tien Bui et al. 2012a). Tien Bui et al. (2012a) also stated that NB gives relatively good prediction capability in landslide susceptibility assessment in Hoa Binh province, Viet Nam.

MLP Neural Nets is an artificial neural network that has been employed widely in many fields, including landslide susceptibility assessment (Conforti et al. 2014; Dou et al. 2015d; Lee et al. 2004; Li et al. 2012; Melchiorre et al. 2008; Yilmaz 2009). Zare et al. (2013) compared two artificial neural networks, MLP Neural Nets and Radial Basis Function Network (RBF Neural Nets), for landslide problems and concluded that MLP Neural Nets outperforms RBF Neural Nets. Therefore, MLP Neural Nets was selected in this study for comparison with the other models (NB and FT).

FT is an efficient method for classification in many fields such as forestry (Heinimäki and Elomaa 2013), patents (Cascini and Zini 2008), and oceanology (Nerini and Ghattas 2007). Gama (2004) stated that the FT model might be seen as a generalization of prior multivariate trees for decision problems. However, this technique has seldom been explored for landslide susceptibility assessment; therefore, it was selected in the present study to compare its prediction capability with that of the other models (NB and MLP Neural Nets).

The three methods above have advantages and disadvantages depending on the available datasets and site conditions. Comparisons of these methods in landslide susceptibility assessment are still limited, and more studies are required. The main objective of the present study is to compare the prediction capability of the NB, MLP Neural Nets, and FT techniques for spatial prediction of landslides. Weka 3.7.12 and ArcGIS 10.2 software were used for data analysis and for developing the models and landslide susceptibility maps.

2 Description of study area

A part of Uttarakhand state was selected as the study area (Fig. 1). It lies between latitudes 29°56′38″N and 30°09′37″N and longitudes 78°29′01″E and 78°37′06″E, covering an area of about 323.815 km².

The study area is located in the middle of the Tehri Garhwal and Pauri Garhwal districts; it was chosen because it has a large number of landslides and shares the meteorological conditions of these two districts. The daily mean temperature is 19.6 °C in the Tehri Garhwal district, with a maximum temperature of 36.5 °C in June and a minimum of 4.6 °C in January (Bagchi 2011). The relative humidity in the Tehri Garhwal district is generally very high, ranging from a maximum of 85 % in the monsoon season (June to September) to a minimum of 25 % in the summer season (April to June) (Bagchi 2011). The daily mean temperature is 25 °C in the Pauri Garhwal district, with a maximum temperature of 45 °C in June and a minimum of 1.3 °C in January. The relative humidity varies between 54 % and 63 % in the Pauri Garhwal district (http://pauri.nic.in/pages/display/55-the-land). The study area is situated in a subtropical monsoon region that frequently experiences heavy rainfall, which usually occurs in the monsoon season. The annual mean rainfall varies from 770 mm to 1684 mm.

Broadly, in this study area, there are four types of land cover including dense forest, non-forest, open-forest, and scrub land. Non-forest covers the biggest area with 39.02 %, followed by dense-forest (31.96 %), open-forest (22.36 %), and scrub land (6.67 %), respectively.

Topographically, the study area is mountainous, with very high mountain ranges. Elevation ranges from 380 m to 2180 m above sea level, with an average elevation of 1081 m. Most of the area lies above 900 m. Slope angles in this area are smaller than 70 degrees. Most slope angles fall in the classes 15–25 degrees (21.34 %), 25–35 degrees (38.08 %), and 35–45 degrees (26.03 %). Only 2.89 % of the area has very gentle slopes of less than 8 degrees.

Geologically, there are six main lithological groups: the Blaini and Krol group (boulder bed and limestone), Amri group (quartzite, phyllite), Bijni group (quartzite, phyllite), Jaunsar group (phyllite and quartzite), Tal group (sandstone, shale, quartzite, phyllite, and limestone), and Manikot shell limestone (limestone). The Blaini and Krol group and the Bijni group are dominant, covering 30.1 % and 28.1 % of the study area, respectively. Soil types are mainly fine-silt (26.27 %) and loamy (73.73 %). Loamy soils can be divided into four classes: coarse-loamy (20.1 %), fine-loamy (8.02 %), skeletal-loamy (42.02 %), and mixed-loamy (3.6 %).

3 Methodology

The methodology of this study was carried out in six steps (Fig. 2): (i) collecting and interpreting data, (ii) preparing the landslide conditioning factors and the landslide inventory map and creating the training and testing datasets as inputs for landslide susceptibility modeling, (iii) assessing the influence of the eleven conditioning factors on landslide occurrence using the linear support vector machine, (iv) applying the NB, MLP Neural Nets, and FT models to assess the susceptibility of landslides in the study area, (v) evaluating and comparing the performance of these models using the success rate and predictive rate curves, and (vi) generating landslide susceptibility maps and selecting the best model for the study area.

3.1 Data collection and interpretation

The dataset in the current study was collected mostly from published sources such as the Geological Survey of India, including topography, lithology, soil, and land cover maps of Uttarakhand state at a scale of 1:1000000 (http://www.ahec.org.in/wfw/maps.htm). Meteorological data were collected and compiled for 30 years (1984 to 2014) from the Climate Forecast System Reanalysis (CFSR) in the Global Weather Data for SWAT (NCEP 2014). LANDSAT-8 satellite images with 30 m spatial resolution were downloaded from the United States Geological Survey (USGS) database (http://earthexplorer.usgs.gov/). Google Earth satellite images with up to 10 m spatial resolution were used with the help of Google Earth Pro 7.0 software. All of the raw data were processed using ArcGIS 10.2 to create the input data for landslide susceptibility assessment.

3.1.1 Landslide inventory map

A landslide inventory map is a compilation of the locations of past landslides and their characteristics. It is indispensable in landslide susceptibility assessment, which rests on the statistical assumption that the conditions for future landslide occurrence are the same as those for landslides that happened in the past (Guzzetti et al. 2005). One of the most helpful methods for identifying landslides is the use of remote sensing techniques (Dou et al. 2015b; Guzzetti et al. 2012; Singhroy et al. 1998).

Information on historical landslides in the study area was first collected from reports and newspaper records, and this information was then used to identify landslides with the “zooming out” and “zooming in” tools in Google Earth Pro 7.0 software. Landslide locations were identified on Google Earth images from 2014. LANDSAT-8 satellite images were also used to help identify very large landslides in the study area. The interpretation of landslides was based on evidence such as bare soil, breaks in the forest canopy, and other typical geomorphic characteristics (Pradhan 2013). Landslide locations were then validated through extensive field investigation.

A total of 430 landslide locations were finally identified, of which 236 landslides with an area larger than 400 m² (corresponding to the area of one pixel of the 20 m DEM) were mapped as polygons, and the remaining 194 landslides with an area smaller than 400 m² were mapped as points (Fig. 1). The largest landslide polygon has an area of 199,574 m². Approximately 75.6 % of the landslides (325 locations) are of the translational type, whereas about 24.4 % (105 locations) are of the rotational type.

Most of these landslides occurred on cut slopes and embankments alongside roads and highways. Example photos of landslides in Uttarakhand state are shown in Fig. 3.

3.1.2 Landslide conditioning factors

Eleven landslide conditioning factors were used for landslide susceptibility assessment in the study area namely slope angle, slope aspect, elevation, curvature, lithology, soil type, land cover, distance to roads, distance to lineaments, distance to rivers, and rainfall.

Slope angle

Slope angle is one of the most important causative factors for landslide occurrence (Van Den Eeckhaut et al. 2006). The slope angle map was extracted from a digital elevation model (DEM) with a 20 m × 20 m grid size, which was generated from elevation contour lines and points extracted from the state topographical map at a scale of 1:1000000. Slope angles were classified into six classes: 0–8, 8–15, 15–25, 25–35, 35–45, and > 45 degrees (Fig. 4a). The statistical analysis shows that landslides occurred mainly at slope angles of 35–45 degrees (41.54 %) and 25–35 degrees (29.13 %). No landslides were reported on very gentle slopes of less than 8 degrees (Fig. 4b).
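The class statistics quoted above can be reproduced in principle by binning the slope angle of every landslide pixel and reporting the share of pixels per class. The following is a minimal sketch of that bookkeeping; the function names, the sample angles, and the rule that a value lying exactly on a class limit goes to the upper class are assumptions made for illustration and are not taken from the study.

```python
from bisect import bisect_right
from collections import Counter

# Class limits (degrees) taken from the text; the last class is open-ended.
SLOPE_BREAKS = [8, 15, 25, 35, 45]
SLOPE_LABELS = ["0-8", "8-15", "15-25", "25-35", "35-45", ">45"]


def slope_class(angle_deg):
    """Return the slope class label for an angle in degrees.

    Assumption: a value exactly on a class limit goes to the upper class.
    """
    return SLOPE_LABELS[bisect_right(SLOPE_BREAKS, angle_deg)]


def landslide_share_by_class(angles_deg):
    """Percentage of landslide pixels that fall in each slope class."""
    counts = Counter(slope_class(a) for a in angles_deg)
    total = len(angles_deg)
    return {label: 100.0 * counts[label] / total for label in SLOPE_LABELS}


# Hypothetical slope angles at landslide pixels (illustrative, not study data).
sample_angles = [12.0, 27.5, 31.0, 38.2, 41.7, 44.9, 52.3, 36.0]
shares = landslide_share_by_class(sample_angles)
print(shares)
```

The same pattern would apply to the other classified factors (elevation, rainfall, distance classes) by swapping the break values and labels.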

Slope aspect

Slope aspect, which controls topographic moisture through its effect on solar radiation and rainfall (Sadr et al. 2014), was considered a conditioning factor for landslide occurrence in this study. The slope aspect map was extracted from the DEM and classified into nine classes: flat (-1), north (0–22.5 and 337.5–360), northeast (22.5–67.5), east (67.5–112.5), southeast (112.5–157.5), south (157.5–202.5), southwest (202.5–247.5), west (247.5–292.5), and northwest (292.5–337.5) (Fig. 4c). Landslides in the study area are concentrated mainly on southwest-facing (23.65 %) and south-facing (14.5 %) slopes, and no landslides occur in flat areas (Fig. 4d).
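Because the north class wraps around 0/360 degrees, aspect needs slightly different binning from the other factors. The sketch below shows one way to map an aspect angle to the nine classes listed above; the function name and the treatment of values that fall exactly on a class limit (assigned to the next class clockwise) are assumptions for illustration.

```python
ASPECT_NAMES = ["north", "northeast", "east", "southeast",
                "south", "southwest", "west", "northwest"]


def aspect_class(aspect_deg):
    """Map an aspect angle (degrees clockwise from north) to one of nine classes.

    Negative values (-1 in the text) denote flat cells.
    """
    if aspect_deg < 0:
        return "flat"
    # Shift by half a sector (22.5 degrees) so that north, which wraps
    # around 0/360, becomes a single contiguous 45-degree bin.
    shifted = (aspect_deg + 22.5) % 360.0
    return ASPECT_NAMES[int(shifted // 45.0)]


for angle in (-1, 10.0, 225.0, 350.0):
    print(angle, aspect_class(angle))
```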

Elevation

Elevation, which reflects the difference between the maximum and minimum heights within the terrain, is considered a conditioning factor for landslide occurrence (Ayalew and Yamagishi 2005b). The elevation map was generated from the DEM and categorized into the classes 0–600, 600–750, 750–900, 900–1050, 1050–1200, 1200–1350, 1350–1500, 1500–1650, 1650–1800, and > 1800 m (Fig. 5a). The class below 600 m has the highest share of landslides (26.49 %). The shares in the 900–1050 m and 1050–1200 m classes are also high (18.28 % and 14.65 %, respectively). No landslides were observed at elevations greater than 1800 m (Fig. 5b).

Curvature

Curvature is considered one of the conditioning factors in this study because it controls the flow of water over the terrain surface, which affects landslide occurrence. Convex slopes can be regarded as more stable than concave slopes because runoff disperses uniformly and easily downslope (Stocking 1972). Therefore, the curvature map was derived from the DEM and divided into three classes: concave (< -0.05), flat (-0.05 to 0.05), and convex (> 0.05) (Fig. 5c). Landslides occurred in the concave (57.79 %) and convex (42.21 %) classes, and no landslides appear in the flat class (Fig. 5d).

Lithology

Geology, which includes lithological and structural variations that often lead to differences in the strength and permeability of rocks and soils, greatly influences the occurrence of landslides (Ayalew and Yamagishi 2005b). The lithological map of the study area was extracted from the geological map of the state and then constructed with six formations based on lithological similarities (Ayalew and Yamagishi 2005a): group 1 (Amri group with quartzite, phyllite), group 2 (Blaini and Krol group with boulder bed and limestone), group 3 (Bijni group with quartzite, phyllite), group 4 (Jaunsar group with phyllite and quartzite), group 5 (Manikot shell limestone with limestone), and group 6 (Tal group with sandstone, shale, quartzite, phyllite, and limestone) (Fig. 6a). The lithological formation with the highest share of landslides is group 6 (37.28 %). The share is also high in group 3 (22.11 %), group 4 (18.83 %), and group 2 (18.64 %). Very few landslides occurred in group 5 (3.14 %), and none occurred in group 1 (Fig. 6b).

Soil

Soil type reflects the texture and composition of soil materials, which affect landslide occurrence (Sassa and Canuti 2008). The soil map was constructed from the soil map of the state and classified into fine-silt, coarse-loamy, fine-loamy, mixed-loamy, and skeletal-loamy classes (Fig. 6c). Landslides occurred mainly in coarse-loamy (CL) (40.76 %) and skeletal-loamy (SL) (36.13 %) soils. Fewer landslides occurred in fine-silt (FS) (16.15 %), fine-loamy (FL) (3.14 %), and mixed-loamy (ML) (3.82 %) soils (Fig. 6d).

Land cover

The land cover map for the study area was extracted from the land cover map of the state. Land cover classes in the study area include dense forest, non-forest, open forest, and scrub land (Fig. 7a). Landslides took place mostly in non-forest areas (31.86 %), followed by dense forest (29.13 %), open forest (25.87 %), and scrub land (13.14 %) (Fig. 7b). In areas covered by vegetation, the roots act as anchors that keep soil and rock masses on the slope stable, thus preventing landslides (Prandini et al. 1977).

Rainfall

Rainfall is the most influential factor for landslide occurrence (Glade et al. 2000; Polemio and Petrucci 2000). The rainfall map was constructed from the meteorological data. Rainfall was classified into eight classes: 0–900, 900–1000, 1000–1100, 1100–1200, 1200–1300, 1300–1400, 1400–1500, and > 1500 mm (Fig. 7c). Landslides occurred mainly in areas with road cuts and rainfall of more than 900 mm, namely in the 900–1000 mm (21.38 %) and 1000–1100 mm (20.65 %) classes (Fig. 7d).

Distance to roads

The distance to roads map was built with six classes: 0–40, 40–80, 80–120, 120–160, 160–200, and > 200 m (Fig. 8a). Roads were extracted from Google Earth satellite images, and the road sections with slope angles larger than 15 degrees were then buffered within the study area to construct the distance to roads map. A large number of landslides were observed near the roads. Landslides occurred mostly at distances of 40–80 m (40.27 %), 80–120 m (20.74 %), and 0–40 m (15.97 %). Fewer landslides were observed at distances of 120–160 m (12.22 %), 160–200 m (7.23 %), and greater than 200 m (3.56 %) (Fig. 8b).

Distance to rivers

The distance to rivers map was also generated with six classes: 0–40, 40–80, 80–120, 120–160, 160–200, and > 200 m (Fig. 8c). Rivers were also extracted from Google Earth satellite images, and the river sections with slope angles larger than 15 degrees were then buffered within the study area to construct the distance to rivers map. Landslides occurred mostly at distances of 0–40 m (65.12 %), and very few landslides occurred at greater distances (Fig. 8d).

Distance to lineaments

The distance to lineaments map was constructed with the classes 0–50, 50–100, 100–150, 150–200, 200–250, 250–300, 300–350, 350–400, 400–450, 450–500, and > 500 m (Fig. 9a). Lineaments are linear features such as faults, fractures, geomorphologic ridges, and other topographic and tectonic structures that affect the continuity of rock and soil masses and thus cause slope instability (Ayalew and Yamagishi 2005b). Lineaments were extracted from LANDSAT-8 satellite images using Geomatica software (2015 version). Most landslides took place at distances of 0–50 m (68.9 %), and very few landslides occurred at greater distances (Fig. 9b).

3.2 Preparation of training and testing dataset

Chung and Fabbri (2003) stated that it is impossible to validate model performance without splitting the dataset. Therefore, it is important to split the dataset into two parts for landslide susceptibility assessment. The part used for model building is called the training dataset, and the part used for validating/testing the performance of the model is called the testing dataset. However, there is no rule of thumb for selecting the proportions of the training and testing datasets (Pradhan 2013). In this study, the dataset was randomly partitioned into training and testing datasets at a ratio of 70 % to 30 % (Pourghasemi et al. 2012; Pourghasemi et al. 2013b; Xu et al. 2012).

To construct the training dataset, 70 % of the landslides (301 landslide locations) were first extracted randomly from the landslide inventory map. These landslides were then converted into 6133 pixels of 20 m × 20 m size. The same number of non-landslide pixels (6133) was also extracted randomly from the study area. Subsequently, the landslide pixels and non-landslide pixels were combined into the sampling data. The training dataset was finally constructed by overlaying the sampling data onto the eleven landslide conditioning factors. The remaining 30 % of the landslides were used to build the testing dataset in the same way, with 1614 landslide pixels from 129 landslide locations and 1614 non-landslide pixels.

In the dataset used for classification, landslide pixels were assigned a value of 1 and non-landslide pixels a value of 0 (Tien Bui et al. 2015).
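The sampling scheme described in this section (a random 70/30 split of landslide locations, an equal number of randomly drawn non-landslide pixels, and 1/0 labels) can be sketched as follows. This is only an illustration of the procedure, not the GIS workflow used in the study; the function names, the fixed random seed, and the toy 10 × 10 grid are assumptions.

```python
import random


def split_landslides(location_ids, train_fraction=0.7, seed=42):
    """Randomly split landslide locations into training and testing parts."""
    rng = random.Random(seed)
    shuffled = list(location_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def sample_non_landslide(candidate_pixels, landslide_pixels, n, seed=42):
    """Draw n pixels at random from the cells that are not landslide pixels."""
    rng = random.Random(seed)
    landslide_set = set(landslide_pixels)
    pool = [p for p in candidate_pixels if p not in landslide_set]
    return rng.sample(pool, n)


def label_pixels(landslide_pixels, non_landslide_pixels):
    """Attach class labels: 1 for landslide pixels, 0 for non-landslide pixels."""
    return ([(p, 1) for p in landslide_pixels]
            + [(p, 0) for p in non_landslide_pixels])


# 430 landslide locations, identified here only by a running index.
train_ids, test_ids = split_landslides(range(430))
print(len(train_ids), len(test_ids))

# Toy grid standing in for the raster of the study area (hypothetical).
grid = [(row, col) for row in range(10) for col in range(10)]
slides = [(0, 0), (1, 1), (2, 2)]
non_slides = sample_non_landslide(grid, slides, len(slides))
samples = label_pixels(slides, non_slides)
```

Splitting at the level of landslide locations rather than individual pixels keeps all pixels of one landslide in the same subset, which is consistent with the counts reported above (301 and 129 locations).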

3.3 Landslide conditioning factors selection

The quality of landslide susceptibility assessment depends not only on the selected models but also on the quality of the input data (Pradhan 2013). Therefore, the predictive ability of the landslide conditioning factors has to be tested before the learning process of the landslide models is carried out. Factors found to have low or null predictive capability should be removed to improve the accuracy of the results (Doshi and Chaturvedi 2014).

Feature selection is an effective method for selecting the input data for a model (Dash and Liu 1997). It was used to select landslide conditioning factors in the current study. Several feature selection methods have been used in the literature, such as information gain ratio (Dai and Xu 2013), consistency (Dash and Liu 2003), gain ratio (Karegowda et al. 2010), and chi-square statistic (Chen et al. 2009). In this study, the linear support vector machine (LSVM) was selected to evaluate the prediction capability of the landslide conditioning factors. The LSVM method was proposed by Guyon and Elisseeff (2003) and improves classification accuracy by reducing irrelevant and unnecessary input variables (Lin et al. 2008).

Using the training dataset and the eleven landslide conditioning factors, the predictive capability of these factors was tested with the LSVM method (Eq. 1):

$$ g(a)=\mathrm{sgn}\left({w}^{\mathrm{T}}a+b\right) $$

(1)

where w^T is the transpose of the weight vector w = (w₁, w₂, …, w₁₁), whose element w_i is the weight assigned to the ith landslide conditioning factor; a = (a₁, a₂, …, a₁₁) is the input vector containing the eleven landslide conditioning factors; and b is the offset of the hyperplane from the origin. A landslide conditioning factor whose weight w_i is close to 0 has a smaller effect on the prediction than one with a larger absolute value of w_i (Mladenić et al. 2004).
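To make the role of the weights in Eq. 1 concrete, the sketch below evaluates the decision function for one input vector and ranks factors by the absolute value of their weights. The weights, offset, factor names, and input values are hypothetical and are not the values obtained in this study; fitting the weights themselves (the actual LSVM training) is not shown.

```python
def lsvm_decision(weights, inputs, offset):
    """Evaluate sgn(w^T a + b); a score of exactly zero is mapped to +1 here."""
    score = sum(w * a for w, a in zip(weights, inputs)) + offset
    return 1 if score >= 0 else -1


def rank_factors(names, weights):
    """Sort conditioning factors by the absolute value of their weights."""
    return sorted(zip(names, weights), key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical factors and weights (not the values obtained in the study).
names = ["slope_angle", "slope_aspect", "distance_to_roads"]
weights = [0.80, -0.05, -0.40]
offset = 0.1

print(lsvm_decision(weights, [0.9, 0.2, 0.3], offset))
for name, weight in rank_factors(names, weights):
    print(name, weight)
```

In this toy example the factor with the weight closest to zero ends up last in the ranking, which is the kind of factor that would be a candidate for removal.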

3.4 Landslide susceptibility models

3.4.1 Naïve Bayes (NB)

NB is a statistical classifier that assumes there is no dependency between attributes and determines the class by maximizing the posterior probability (Soni et al. 2011). The NB model uses Bayes’ theorem and can be constructed in the following steps: (i) collecting the examples, (ii) estimating the prior probability of each class, (iii) estimating the class means, (iv) constructing the covariance matrices and finding the inverse and determinant for each class, and (v) forming the discriminant function for each class (Bhargavi and Jyothi 2009). The advantage of the NB model is that it requires only a small amount of training data to estimate the parameters necessary for classification (Bhargavi and Jyothi 2009). In this study, the performance of the NB model was tested for the spatial prediction of landslide occurrence in the study area.

Let x = (x₁, x₂, …, x₁₁) be the vector of the eleven landslide conditioning factors and y = (y₁, y₂) be the vector of the class variables (landslide, non-landslide). The NB classifier is based on the following equation:

$$ {y}_{NBC}=\underset{y_j\in \left\{\mathrm{landslide},\ \mathrm{non}\text{-}\mathrm{landslide}\right\}}{\mathrm{argmax}}\ P\left({y}_j\right)\prod_{i=1}^{11}P\left({x}_i\mid {y}_j\right) $$

(2)

where P(y_j) is the prior probability of class y_j, which can be estimated as the proportion of observed cases with output class y_j in the training dataset, and P(x_i | y_j) is the conditional probability, which can be calculated according to Eq. 3:

$$ P\left({x}_i\mid {y}_j\right)=\frac{1}{\sqrt{2\pi}\,\alpha }\,{e}^{-\frac{{\left({x}_i-\eta \right)}^2}{2{\alpha}^2}} $$

(3)

where η is the mean and α is the standard deviation of x_i for class y_j.
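Eqs. 2 and 3 translate almost directly into code. The following is a minimal Gaussian naïve Bayes sketch in plain Python, intended only to illustrate the equations and not to reproduce the Weka implementation used in the study. The two-factor toy dataset, the function names, the use of the population standard deviation, and the small floor on the standard deviation are all assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate the prior, per-factor mean, and standard deviation per class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Population standard deviation, floored to avoid division by zero.
        stds = [max(math.sqrt(sum((v - m) ** 2 for v in col) / n), 1e-9)
                for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, stds)
    return model


def log_gaussian(x, mean, std):
    """Logarithm of the normal density in Eq. 3."""
    return -math.log(math.sqrt(2 * math.pi) * std) - (x - mean) ** 2 / (2 * std ** 2)


def predict(model, row):
    """Return the class maximizing the log of the product in Eq. 2."""
    def score(label):
        prior, means, stds = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, m, s) for x, m, s in zip(row, means, stds))
    return max(model, key=score)


# Toy data: (slope angle in degrees, distance to roads in m); 1 = landslide.
X = [[40, 30], [38, 50], [42, 20], [10, 400], [12, 350], [8, 450]]
y = [1, 1, 1, 0, 0, 0]
model = fit_gaussian_nb(X, y)
print(predict(model, [39, 40]), predict(model, [9, 380]))
```

Working with sums of logarithms instead of the raw product is a common design choice, since multiplying eleven small probabilities can underflow; it does not change which class attains the maximum.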

3.4.2 Multilayer Perceptron Neural Networks (MLP Neural Nets)

MLP Neural Nets is one type of artificial neural network (ANN), a branch of artificial intelligence (AI) (Gardner and Dorling 1998). It is the most widely used ANN technique (Zare et al. 2013). The structure of MLP Neural Nets consists of three kinds of layers: input layers, hidden layers, and output layers. The meaning of these layers varies in terminology across fields (Yee and De Silva 2002). The training process in MLP Neural Nets is carried out in two steps: (i) the inputs are propagated forward through the hidden layers to produce the output values, which are then compared with the expected values to estimate the difference, and (ii) the connection weights are adjusted to minimize this difference (Tien Bui et al. 2015). In this study, MLP Neural Nets was employed for landslide susceptibility assessment in the study area. For this purpose, the input layer represents the landslide conditioning factors, the output layer represents the classification result (landslide or non-landslide), and the hidden layers are the classifying layers that transform inputs into outputs (Fig. 10).

Suppose that t = (t_i), i = 1, 2, ..., 11, is the vector of the eleven landslide conditioning factors and Φ = (Φ_j), j = 1, 2, represents the landslide and non-landslide classes. The MLP Neural Nets classification function in this study is expressed as follows:

$$ \varPhi =f(t) $$

(4)

where f(t) is an unknown function that is optimized by the adjustable network weights during training process for given network architecture.
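As an illustration of what f(t) in Eq. 4 looks like once the weights are fixed, the sketch below implements the forward-propagation step for a network with one hidden layer. The layer sizes, the sigmoid and softmax activations, and the weight values are assumptions chosen for illustration; the weight-adjustment step of training is not shown, and this is not the Weka configuration used in the study.

```python
import math


def sigmoid(v):
    """Logistic activation used for the hidden units in this sketch."""
    return 1.0 / (1.0 + math.exp(-v))


def softmax(values):
    """Turn output scores into class probabilities that sum to one."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Weighted sum of the inputs plus a bias for every unit of a layer."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(t, w_hidden, b_hidden, w_out, b_out):
    """Propagate the factor vector t through one hidden layer to two outputs."""
    hidden = [sigmoid(v) for v in dense(t, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


# Eleven inputs, three hidden units, two outputs (landslide, non-landslide).
n_inputs, n_hidden = 11, 3
t = [0.5] * n_inputs                       # hypothetical normalized factors
w_hidden = [[0.0] * n_inputs for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[0.0] * n_hidden for _ in range(2)]

# With all-zero weights the network cannot prefer either class.
print(mlp_forward(t, w_hidden, b_hidden, w_out, [0.0, 0.0]))
# A positive bias on the first output shifts probability toward "landslide".
print(mlp_forward(t, w_hidden, b_hidden, w_out, [1.0, -1.0]))
```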

MLP Neural Nets has been shown to be an efficient approach to classification that performs better than traditional ones (Benediktsson et al. 1990). This approach has several advantages: (i) no prior assumptions are needed regarding the distribution of the training dataset, (ii) no decision needs to be made about the relative importance of the different input measurements, and (iii) the weights are adjusted during the training process to select the most informative input measurements (Gardner and Dorling 1998).

However, an important and difficult problem when using MLP Neural Nets for classification is determining the optimal number of hidden layers. Increasing the number of hidden layers reduces the output errors for the training examples but can increase the errors for novel examples, which creates the “over-fitting” problem (Murata et al. 1994).

3.4.3 Functional Trees (FT)

FT is a hierarchical model that provides a framework for constructing multivariate trees for regression and classification problems (Gama 2004). In prediction problems, FT can use functional inner nodes, functional leaves, or a combination of attributes at both functional inner nodes and leaves (Gama 2004). FT using functional leaves is a variance reduction process, whereas FT using functional inner nodes is a bias reduction method, and FT using both functional inner nodes and leaves performs very well on large datasets (Gama 2004). The main difference between FT and other traditional hierarchical models is that FT uses a logistic regression function for splitting at the functional inner nodes and for prediction at the functional leaves, instead of dividing the inputs at a tree node by comparing the value of some input attribute with a constant (Lan et al. 2011). The performance of the FT model depends on the minimum number of instances per leaf, the number of bootstrap iterations, and the selection of functional trees (Gama 2004).

In this study, the FT model was applied for the first time in landslide susceptibility assessment, and its performance was compared with that of the NB and MLP Neural Nets models.

Let z = z_i, i = 1, 2, ..., 11, denote the attributes of the eleven landslide causal factors, and let f = f_j, j ∈ {landslide, non-landslide}, denote the classes. The FT model for classification in this study was built in the following steps: (1) selecting the Linear Bayes discriminant function (Gama 2000) as the constructor to build the model f = f(z_i), which gives the probability distribution of the landslide and non-landslide classes; (2) extending z_i with new landslide conditioning factors to create a new constructed dataset, where each new factor is the probability that z_i belongs to the landslide or non-landslide class; and (3) selecting the landslide conditioning factors from the original dataset and the newly constructed dataset to build the classification tree.
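Step (2) of the procedure above, the extension of each attribute vector with class probabilities, can be illustrated with a minimal sketch. It is not the implementation used in this study: the function names, the weights, and the two example pixels are hypothetical, and a simple logistic score stands in for the Linear Bayes discriminant function of Gama (2000).

```python
import math

def toy_discriminant(z, weights, bias):
    """Stand-in for the Linear Bayes discriminant (an assumption).

    Returns (P(landslide | z), P(non-landslide | z)) from a linear score.
    """
    score = bias + sum(w * v for w, v in zip(weights, z))
    p = 1.0 / (1.0 + math.exp(-score))
    return p, 1.0 - p

def extend_attributes(samples, weights, bias):
    """Step (2): append the two class probabilities to each attribute vector."""
    return [list(z) + list(toy_discriminant(z, weights, bias)) for z in samples]

# Two made-up pixels with three (not eleven) attributes, for brevity.
pixels = [[0.8, 0.1, 0.5], [0.1, 0.9, 0.2]]
extended = extend_attributes(pixels, weights=[2.0, -1.0, 0.5], bias=-0.5)
for row in extended:
    print(row)
```

In step (3), the tree builder would then be free to split on either the original attributes or the two appended probability attributes.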

Finally, landslide susceptibility maps were constructed after training the above-mentioned models. To generate these maps, landslide susceptibility indices were first calculated for all pixels of the study area. Subsequently, these indices were reclassified into susceptibility categories. Ayalew and Yamagishi (2005b) stated that landslide susceptibility indices can be classified using methods such as equal interval, natural break, and standard deviation. Pradhan and Lee (2010) reported that the classification can also be based on the percentage of area of the study region. Schicker and Moon (2012) reported that the natural break method gives the best representation of landslide distribution. Therefore, the natural break method, which has been widely used in many studies (Constantin et al. 2011; Irigaray et al. 2007; Zare et al. 2013), was selected in this study. Using this method, the landslide susceptibility maps were classified into five categories: very low, low, moderate, high, and very high.
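The idea behind the natural break classification can be sketched in a few lines. The example below is a minimal pure-Python illustration of a Fisher-Jenks style search, which chooses class boundaries that minimize the within-class sum of squared deviations. The function name `natural_breaks` and the sample index values are hypothetical; in practice this classification is normally done with GIS software rather than hand-written code.

```python
def natural_breaks(values, n_classes):
    """Return the upper bound of each class (Fisher-Jenks style).

    Dynamic programming over the sorted values: cost[k][i] is the minimum
    total within-class sum of squared deviations when the first i sorted
    values are split into k contiguous classes.
    """
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(data, start=1):
        s1[i] = s1[i - 1] + v
        s2[i] = s2[i - 1] + v * v

    def ssd(start, stop):
        """Sum of squared deviations of data[start:stop] from its mean."""
        total = s1[stop] - s1[start]
        return (s2[stop] - s2[start]) - total * total / (stop - start)

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for i in range(k, n + 1):
            for j in range(k - 1, i):  # the last class is data[j:i]
                c = cost[k - 1][j] + ssd(j, i)
                if c < cost[k][i]:
                    cost[k][i] = c
                    split[k][i] = j

    # Walk back through the split table to recover the class upper bounds.
    bounds = []
    i = n
    for k in range(n_classes, 0, -1):
        bounds.append(data[i - 1])
        i = split[k][i]
    return bounds[::-1]

# Made-up susceptibility indices forming five visible groups.
indices = [0.02, 0.04, 0.05, 0.30, 0.31, 0.33, 0.55, 0.57, 0.72, 0.74, 0.90, 0.93]
breaks = natural_breaks(indices, 5)
print(breaks)
```

The cubic-time search is acceptable for a small sample like this one; for a full raster, the breaks would typically be estimated from a sample of pixels or computed by the GIS package.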

3.5 Models performance validation

The performance of the three landslide susceptibility models was validated using success rate and prediction rate curves. The success rate curve indicates how well the built model fits the landslide susceptibility assessment (Gaprindashvili et al. 2014). It is constructed by comparing the landslide susceptibility maps with the landslides in the training dataset (6133 landslide pixels) (Pradhan et al. 2010a).

The prediction rate curve allows the probability of future landslide occurrence to be estimated; it therefore indicates how good the model is and can be used to validate the prediction capability of the models (Brenning 2005). The prediction rate curve is generated using the same procedure as the success rate curve, except that it employs the testing dataset (1614 landslide pixels) instead of the training dataset (Fabbri et al. 2002).

The AUC value is the area under the curve and serves as the basis for assessing the accuracy of the landslide susceptibility models. AUC values range from 0.5 to 1.0. An AUC value of 1 indicates an ideal model, whereas an AUC value of 0.5 indicates an inaccurate model. If the AUC value is close to 1, the results of the model are very good. In contrast, if the AUC value is close to 0.5, the results of the model are not good (Pradhan 2013).
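One common way to build such a rate curve and its AUC can be sketched as follows. This is an assumption about the construction, not the exact procedure of this study: pixels are ranked from the most to the least susceptible, the cumulative fraction of landslide pixels is plotted against the cumulative fraction of area, and the area is obtained with the trapezoid rule. The function name and the four example pixels are hypothetical, and tied index values are not treated specially.

```python
def rate_curve_auc(indices, is_landslide):
    """Area under a success or prediction rate curve.

    indices      -- susceptibility index of every pixel
    is_landslide -- parallel list of booleans (True = landslide pixel)
    The curve plots the cumulative fraction of landslide pixels captured (y)
    against the cumulative fraction of the area examined (x), starting from
    the most susceptible pixel.
    """
    n = len(indices)
    total_slides = sum(1 for flag in is_landslide if flag)
    if n == 0 or total_slides == 0:
        raise ValueError("need at least one pixel and one landslide pixel")

    ranked = sorted(zip(indices, is_landslide), key=lambda p: p[0], reverse=True)

    auc = 0.0
    found = 0      # landslide pixels captured so far
    prev_y = 0.0
    for _, slide in ranked:
        if slide:
            found += 1
        y = found / total_slides
        # Trapezoid over one pixel-wide step of width 1/n on the x axis.
        auc += (prev_y + y) / 2.0 / n
        prev_y = y
    return auc

# Four made-up pixels: the two landslide pixels have the highest indices.
good = rate_curve_auc([0.9, 0.8, 0.2, 0.1], [True, True, False, False])
poor = rate_curve_auc([0.9, 0.8, 0.2, 0.1], [False, False, True, True])
print(good, poor)
```

Passing the training landslide pixels gives the success rate curve, and passing the testing landslide pixels gives the prediction rate curve.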

4 Results and discussion

4.1 Selection of landslide conditioning factors

The prediction capability of the landslide conditioning factors obtained using the LSVM method is shown in Fig. 11. Distance to roads has the highest predictive capability for the landslide model (AM = 10.6). This is because the landslides detected in this study mostly occur beside roads or highways or at slope cuttings (Fig. 1). This finding is in line with the research carried out by Tien Bui et al. (2014a).

Slope angle also makes a very high contribution to the landslide model in this study (AM = 10.4). This is comparable with other studies, including Ohlmacher and Davis (2003) and Van Den Eeckhaut et al. (2006), who stated that slope angle is one of the most important factors for landslide occurrence.

Other factors, namely elevation (AM = 9.0), rainfall (AM = 8), land cover (AM = 6.9), distance to lineaments (AM = 6.1), curvature (AM = 4.8), soil type (AM = 3.6), lithology (AM = 3.3), slope aspect (AM = 2.2), and distance to rivers (AM = 1.1), also contribute significantly to the landslide models. These factors have been widely used in many landslide susceptibility studies (Chen et al. 2015; Dou et al. 2015a; Feizizadeh et al. 2014; Fourniadis et al. 2007; Martha et al. 2013; Vijith and Madhu 2008).

Overall, all eleven landslide conditioning factors proposed in this study contribute to the landslide model, and they were all selected for building the landslide models. As a remark, it is necessary to test the prediction capability of the landslide conditioning factors before conducting the learning process of the landslide models.

4.2 Landslide susceptibility mapping using the MLP Neural Nets, FT, and NB models

Using the NB model, a landslide susceptibility map was generated and is shown in Fig. 12. The landslide susceptibility indices were reclassified into five classes with the following index intervals: very low (0 – 0.106), low (0.106 – 0.298), moderate (0.298 – 0.537), high (0.537 – 0.787), and very high (0.787 – 0.999). The very low class has the largest area (51.15 %), followed by low (17.62 %), very high (11.46 %), moderate (10.82 %), and high (8.95 %). The performance of the landslide susceptibility map produced by the NB model is also shown in Fig. 12. It indicates very good performance because most landslides fall in the very high (59.95 %) and high (16.52 %) classes. Comparatively few landslides occur in the moderate, low, and very low classes.
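The reclassification of an index into one of the five classes can be sketched with a simple lookup that uses the NB break values reported above. The function name is hypothetical, and the treatment of values that fall exactly on a break (placed in the higher class here) is an assumption, since the handling of boundary values is not stated.

```python
from bisect import bisect_right

# Upper bounds of the first four classes, as reported for the NB map.
NB_BREAKS = [0.106, 0.298, 0.537, 0.787]
CLASSES = ["very low", "low", "moderate", "high", "very high"]

def susceptibility_class(index, breaks=NB_BREAKS):
    """Map a susceptibility index to one of the five classes.

    A value equal to a break goes to the higher class (an assumption).
    """
    return CLASSES[bisect_right(breaks, index)]

# Made-up index values for illustration.
for value in (0.05, 0.2, 0.5, 0.6, 0.9):
    print(value, susceptibility_class(value))
```

The same function can be reused for the other two maps by passing their break values through the `breaks` argument.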

The landslide susceptibility map produced using the MLP Neural Nets model is shown in Fig. 13. The landslide susceptibility indices were reclassified into five classes with the following index intervals: very low (0.008 – 0.113), low (0.113 – 0.261), moderate (0.261 – 0.481), high (0.481 – 0.752), and very high (0.752 – 0.929). Fig. 13 shows that the low class has the largest area (46.68 %), followed by very low (20.12 %), very high (14.43 %), moderate (13.27 %), and high (5.5 %). The landslide susceptibility map produced using the MLP Neural Nets model also performs very well. The highest number of landslides is in the very high class (71.59 %), followed by the high (9.2 %), moderate (8.98 %), low (8.49 %), and very low (1.73 %) classes.

The landslide susceptibility map built using the FT model is shown in Fig. 14. The landslide susceptibility indices were reclassified into five classes with the following index intervals: very low (0 – 0.114), low (0.114 – 0.345), moderate (0.345 – 0.592), high (0.592 – 0.847), and very high (0.847 – 1). The very low class has the largest area (71.74 %), followed by very high (9.49 %), low (6.86 %), moderate (6.07 %), and high (5.84 %). The landslide susceptibility map produced using the FT model also performs very well (Fig. 14). The very high class has the highest number of landslides (68.14 %). The high class also contains a large number of landslides, and very few landslides occur in the remaining classes.

It can be concluded that all three landslide susceptibility maps produced by the three models show very good accuracy, with the map produced by the MLP Neural Nets model being slightly more accurate than the others.

4.3 Models performance and their comparison

For the FT model, an optimization test was first carried out with various parameters, and the best performance was obtained using functional leaves with 50 instances per leaf and 30 bootstrap iterations. For the MLP Neural Nets model, the momentum, learning rate, and training time were set to 0.2, 0.3, and 500, respectively, and the logistic sigmoid was employed as the activation function (Sasikala et al. 2014; Şenkal and Kuleli 2009). To determine the number of hidden layers in this study, a test was carried out with various numbers of hidden layers, based on predictive accuracy (ACC) and the area under the receiver operating characteristic curve (AUC) on the testing data. The optimal number of hidden layers was determined to be 2, as shown in Table 1.

Table 1 Performance of the MLP Neural Nets model with different numbers of hidden neurons
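The role of the three training settings reported above (learning rate 0.3, momentum 0.2, training time 500) can be illustrated with a minimal sketch of gradient descent with momentum. It rests on several assumptions: it trains a single logistic-sigmoid unit rather than a full multilayer network, it uses a cross-entropy loss with batch updates, and the function names and the four training samples are hypothetical. It is not the software configuration used in this study.

```python
import math

def sigmoid(t):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-t))

def train_unit(samples, learning_rate=0.3, momentum=0.2, epochs=500):
    """Train one sigmoid unit with batch gradient descent plus momentum.

    samples -- list of (x, y) pairs with scalar input x and label y in {0, 1}
    Returns the learned (weight, bias).
    """
    w = b = 0.0
    dw_prev = db_prev = 0.0
    n = len(samples)
    for _ in range(epochs):
        # Gradient of the mean cross-entropy loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in samples) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in samples) / n
        # Momentum: each step reuses a fraction of the previous step.
        dw = -learning_rate * grad_w + momentum * dw_prev
        db = -learning_rate * grad_b + momentum * db_prev
        w, b = w + dw, b + db
        dw_prev, db_prev = dw, db
    return w, b

# Made-up one-dimensional samples: low inputs are class 0, high inputs class 1.
toy = [(0.0, 0), (0.2, 0), (0.8, 1), (1.0, 1)]
w, b = train_unit(toy)
print(sigmoid(w * 0.2 + b), sigmoid(w * 0.8 + b))
```

In a test such as the one summarized in Table 1, a loop of this kind would be repeated for each candidate network size, and the candidates would then be compared by ACC and AUC on the testing data.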


The final results are shown in Fig. 15. Fig. 15a shows that all three landslide models are very suitable for landslide susceptibility assessment according to the success rate curves. The FT model has the highest degree of fit (AUC = 0.9), the MLP Neural Nets model is less suitable (AUC = 0.876), and the NB model (AUC = 0.838) is the least suitable of the three. However, Fig. 15b shows that the prediction capability of the MLP Neural Nets model is the highest, with an AUC value of 0.851, followed by the FT model (AUC = 0.849) and the NB model (AUC = 0.838).

The NB model is fast and efficient in dealing with discrete and continuous attributes, and it performs very well in solving real-life problems. However, it assumes that the attributes are independent, which is not always true in practice. In landslide studies in particular, the landslide conditioning factors are highly correlated (Tien Bui et al. 2015); therefore, the independence assumption of the NB model might result in poor performance (Zhang and Su 2004). Indeed, the performance of the NB model in this study is slightly lower than that of the MLP Neural Nets and FT models.

Meanwhile, the MLP Neural Nets method is based on the back-propagation algorithm, which is very flexible and adaptable in modeling (Basheer and Hajmeer 2000). It is considered a standard algorithm for any supervised-learning pattern recognition process; however, it acts as a “black box” for classification, which makes the results very difficult to explain in some cases (Soria et al. 2008). In this study, the MLP Neural Nets model outperforms the FT and NB models. A possible explanation is that the MLP Neural Nets model iteratively optimizes the weights assigned to the input data, which the FT and NB models do not do, and this can improve its prediction capability (Gardner and Dorling 1998). These results are comparable with other research, such as Kapur et al. (2005), which showed that the MLP model performs better than the NB model on motion capture data.

Moreover, one of the most attractive features of the FT model is that it uses bootstrap aggregating (bagging) (Breiman 1996), a simple variance reduction method, to reduce its error (Gama 2004). Furthermore, the FT model is considered an effective multivariate decision tree method for classification (Gama 2004). Chen et al. (2014) also stated that the FT model has the highest prediction capability compared with decision tree and logistic regression models in the biodegradability prediction of chemicals. In this study, the FT model was applied successfully to the landslide problem for the first time. Its prediction capability is better than that of the NB model.

Overall, all three landslide models are suitable for landslide susceptibility assessment in the study area. Therefore, land planners and engineers can select these models to construct landslide susceptibility maps, which can help in mitigating landslide damage.

5 Conclusion

Landslide susceptibility assessment has been one of the most active research topics in recent decades because it identifies landslide-susceptible areas, which supports land use planning, decision making, and landslide hazard warning. Even though NB, MLP Neural Nets, and FT have been applied individually in many studies to solve various problems, these models had not been compared in landslide susceptibility assessment. In particular, FT had not previously been employed in landslide research. The main goal of the current study is to compare the predictive abilities of these three models for landslide susceptibility mapping in the study area. Eleven landslide conditioning factors that affect landslide occurrence in the study area were used: slope angle, slope aspect, elevation, curvature, lithology, soil type, land cover, distance to roads, distance to lineaments, distance to rivers, and rainfall. These factors were used to analyze the spatial relationship between them and landslide occurrence. The prediction capability of the three models was validated and compared using the success rate and prediction rate curves. The results show that the three landslide models perform very reasonably for landslide susceptibility assessment. Of these, the MLP Neural Nets model performs slightly better than the FT and NB models, and the NB model has the lowest predictive capability. In conclusion, all three landslide models employed in this study are promising methods for landslide susceptibility assessment and can be a good choice for the same purpose in other landslide-prone areas. More importantly, road construction is the main cause of landslide occurrence in this study area. Therefore, it is essential to use protective measures to maintain slope stability when constructing roads or highways.

References

Aguiar-Pulido V, Munteanu CR, Seoane JA, Fernández-Blanco E, Pérez-Montoto LG, González-Díaz H, Dorado J (2012) Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer. Mol BioSyst 8:1716–1722
Akgun A (2012) A comparison of landslide susceptibility maps produced by logistic regression, multi-criteria decision, and likelihood ratio methods: a case study at İzmir, Turkey. Landslides 9:93–106
Akgun A, Sezer EA, Nefeslioglu HA, Gokceoglu C, Pradhan B (2012) An easy-to-use MATLAB program (MamLand) for the assessment of landslide susceptibility using a Mamdani fuzzy algorithm. Comput Geosci 38:23–34
Aksoy B, Ercanoglu M (2012) Landslide identification and classification by object-based image analysis and fuzzy logic: An example from the Azdavay region (Kastamonu, Turkey). Comput Geosci 38:87–98
Alimohammadlou Y, Najafi A, Gokceoglu C (2014) Estimation of rainfall-induced landslides using ANN and fuzzy clustering methods: A case study in Saeen Slope, Azerbaijan province, Iran. Catena 120:149–162
Althuwaynee OF, Pradhan B, Lee S (2012) Application of an evidential belief function model in landslide susceptibility mapping. Comput Geosci 44:120–135
Ayalew L, Yamagishi H (2005a) The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 65:15–31
Bagchi D (2011) Ground Water Brochure, District Tehri Garhwal, Uttarakhand. Central Ground Water Board. Retrieved April 2014.
Bai S, Lü G, Wang J, Zhou P, Ding L (2011) GIS-based rare events logistic regression for landslide-susceptibility mapping of Lianyungang, China. Environ Earth Sci 62:139–149
Basheer I, Hajmeer M (2000) Artificial neural networks: fundamentals, computing, design, and application. Journal of microbiological methods 43:3–31
Benediktsson J, Swain PH, Ersoy OK (1990) Neural network approaches versus statistical methods in classification of multisource remote sensing data. IEEE Trans Geosci Remote Sens 28:540–552
Bhargavi P, Jyothi S (2009) Applying naive Bayes data mining technique for classification of agricultural land soils. Int J Comput Sci Netw Secur 9:117–122
Breiman L (1996) Bagging predictors. Machine Learning 24:123–140
Brenning A (2005) Spatial prediction models for landslide hazards: review, comparison and evaluation. Nat Hazards Earth Syst Sci 5:853–862
Cascini G, Zini M (2008) Measuring patent similarity by comparing inventions functional trees. In: Computer-aided innovation (CAI). Springer, pp 31-42
Chen G, Li X, Chen J, Yn Z, Peijnenburg WJ (2014) Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression. Environmental Toxicology and Chemistry 33:2688–2693
Chen H, Liu H, Han J, Yin X, He J (2009) Exploring optimization of semantic relationship graph for multi-relational Bayesian classification. Decision Support Systems 48:112–121
Chen J, Zeng Z, Jiang P, Tang H (2015) Deformation prediction of landslide based on functional network. Neurocomputing 149(Part A):151–157
Choi J, Oh H-J, Lee H-J, Lee C, Lee S (2012) Combining landslide susceptibility maps obtained from frequency ratio, logistic regression, and artificial neural network models using ASTER images and GIS. Engineering Geology 124:12–23
Chung C-JF, Fabbri AG (2003) Validation of spatial prediction models for landslide hazard mapping. Natural Hazards 30:451-472
Conforti M, Pascale S, Robustelli G, Sdao F (2014) Evaluation of prediction capability of the artificial neural networks for mapping landslide susceptibility in the Turbolo River catchment (northern Calabria, Italy). CATENA 113:236–250
Conoscenti C, Angileri S, Cappadonia C, Rotigliano E, Agnesi V, Märker M (2014) Gully erosion susceptibility assessment by means of GIS-based logistic regression: a case of Sicily (Italy). Geomorphology 204:399–411
Constantin M, Bednarik M, Jurchescu MC, Vlaicu M (2011) Landslide susceptibility assessment using the bivariate statistical analysis and the index of entropy in the Sibiciu Basin (Romania). Environ Earth Sci 63:397–406
Dai J, Xu Q (2013) Attribute selection based on information gain ratio in fuzzy rough set theory with application to tumor classification. Applied Soft Computing 13:211–221
Das I, Sahoo S, van Westen C, Stein A, Hack R (2010) Landslide susceptibility assessment using logistic regression and its comparison with a rock mass classification system, along a road section in the northern Himalayas (India). Geomorphology 114:627–637
Dash M, Liu H (1997) Feature selection for classification. Intell data Anal 1:131–156
Dash M, Liu H (2003) Consistency-based search in feature selection. Artificial Intelligence 151:155–176
Doshi M, Chaturvedi SK (2014) Correlation based feature selection (CFS) technique to predict student performance. International Journal of Computer Networks & Communication (UCNC) 6
Dou J et al. (2015a) Optimization of causative factors for landslide susceptibility evaluation using remote sensing and GIS data in parts of Niigata, Japan. PLoS One 10:e0133262
Dou J, Chang K-T, Chen S, Yunus AP, Liu J-K, Xia H, Zhu Z (2015b) Automatic case-based reasoning approach for landslide detection: integration of object-oriented image analysis and a genetic algorithm. Remote Sens 7:4318–4342
Dou J, Oguchi T, Hayakawa YS, Uchiyama S, Saito H, Paudel U (2014) GIS-based landslide susceptibility mapping using a certainty factor model and its validation in the Chuetsu Area, Central Japan. In: Landslide science for a safer geoenvironment. Springer, pp 419-424
Dou J, Paudel U, Oguchi T, Uchiyama S, Hayakawa YS (2015c) Shallow and Deep-Seated Landslide Differentiation Using Support Vector Machines: A Case Study of the Chuetsu Area, Japan. Terrestrial, Atmospheric & Oceanic Sciences 26
Dou J, Yamagishi H, Pourghasemi HR, Yunus AP, Song X, Xu Y, Zhu Z (2015d) An integrated artificial neural network model for the landslide susceptibility assessment of Osado Island, Japan. Natural Hazards:1-28
Fabbri AG, Chung CF, Napolitano P, Remondo J, Zêzere JL (2002) Prediction rate functions of landslide susceptibility applied in the Iberian Peninsula
Feizizadeh B, Jankowski P, Blaschke T (2014) A GIS based spatially-explicit sensitivity and uncertainty analysis approach for multi-criteria decision analysis. Comput Geosci 64:81–95
Fourniadis I, Liu J, Mason P (2007) Regional assessment of landslide impact in the Three Gorges area, China, using ASTER data: Wushan-Zigui. Landslides 4:267–278
Gama J (2000) A linear-bayes classifier. In: Advances in artificial intelligence. Springer, pp 269-279
Gama J (2004) Functional trees. Machine Learning 55:219–250
Gaprindashvili G, Guo J, Daorueang P, Xin T, Rahimy P (2014) A new statistic approach towards landslide hazard risk assessment. Int J Geosci 5:38–49
Garcia-Rodriguez MJ, Malpica JA, Benito B, Diaz M (2008) Susceptibility assessment of earthquake-triggered landslides in El Salvador using logistic regression. Geomorphology 95:172–191
Gardner M, Dorling S (1998) Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmospheric environment 32:2627–2636
Glade T, Crozier M, Smith P (2000) Applying probability determination to refine landslide-triggering rainfall thresholds using an empirical “Antecedent Daily Rainfall Model”. Pure Appl Geophys 157:1059–1079
Guha-Sapir D, Hoyois P, Below R (2014) Annual Disaster Statistical Review 2013: The Numbers and Trends. CRED, Brussels
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learning Res 3:1157–1182
Guzzetti F, Mondini AC, Cardinali M, Fiorucci F, Santangelo M, Chang K-T (2012) Landslide inventory maps: new tools for an old problem. Earth Sci Rev 112:42–66
Guzzetti F, Reichenbach P, Cardinali M, Galli M, Ardizzone F (2005) Probabilistic landslide hazard assessment at the basin scale. Geomorphology 72:272-299
Heinimäki TJ, Elomaa T (2013) Facilitating technology forestry: software tool support for creating functional technology trees. In: Proceedings of the 3rd International Conference on Innovative Computing Technology (INTECH’13), pp 510-519
Hong H, Pradhan B, Xu C, Tien Bui D (2015a) Spatial prediction of landslide hazard at the Yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. CATENA 133:266–281
Hong H, Xu C, Revhaug I, Tien Bui D (2015b) Spatial prediction of landslide hazard at the Yihuang area (China): a comparative study on the predictive ability of backpropagation multi-layer perceptron neural networks and radial basic function neural networks. In: Robbi Sluter C, Madureira Cruz CB, Leal de Menezes PM (eds) Cartography—maps connecting the world. Lecture notes in geoinformation and cartography. Springer International Publishing, pp 175-188.
Irigaray C, Fernández T, El Hamdouni R, Chacón J (2007) Evaluation and validation of landslide-susceptibility maps obtained by a GIS matrix method: examples from the Betic Cordillera (southern Spain). Natural Hazards 41:61-79
Jebur MN, Pradhan B, Tehrany MS (2015) Manifestation of LiDAR-derived parameters in the spatial prediction of landslides using novel ensemble evidential belief functions and support vector machine models in GIS. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8:674–690
Kapur A, Kapur A, Virji-Babul N, Tzanetakis G, Driessen PF (2005) Gesture-based affective computing on motion capture data. In: Affective computing and intelligent interaction. Springer, pp 1-7
Karegowda AG, Manjunath A, Jayaram M (2010) Comparative study of attribute selection using gain ratio and correlation based feature selection. Int J Inf Technol Knowledge Manag 2:271–277
Kavzoglu T, Kutlug Sahin E, Colkesen I (2015) Selecting optimal conditioning factors in shallow translational landslide susceptibility mapping using genetic algorithm. Eng Geol 192:101–112
Kavzoglu T, Sahin EK, Colkesen I (2014) Landslide susceptibility mapping using GIS-based multi-criteria decision analysis, support vector machines, and logistic regression. Landslides 11:425–439
Kayastha P, Dhital MR, De Smedt F (2012) Landslide susceptibility mapping using the weight of evidence method in the Tinau watershed, Nepal. Nat Hazards 63:479–498
Lan H, Frank E, Hall M (2011) Data mining: Practical machine learning tools and techniques. Morgan Kaufmann, Boston
Lee S, Ryu J-H, Won J-S, Park H-J (2004) Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. Eng Geol 71:289–302
Lepora NF, Evans M, Fox CW, Diamond ME, Gurney K, Prescott TJ (2010) Naive Bayes texture classification applied to whisker data from a moving robot. In: Neural networks (IJCNN), The 2010 International Joint Conference on, IEEE, pp 1-8
Li Y, Chen G, Tang C, Zhou G, Zheng L (2012) Rainfall and earthquake-induced landslide susceptibility assessment using GIS and artificial neural network. Natural Hazards and Earth System Sciences
Lin S-W, Lee Z-J, Chen S-C, Tseng T-Y (2008) Parameter determination of support vector machine and feature selection using simulated annealing approach. Appl Soft Comput 8:1505–1512
Lu S-H, Chiang D-A, Keh H-C, Huang H-H (2010) Chinese text classification by the Naïve Bayes Classifier and the associative classifier with multiple confidence threshold values. Knowl-Based Syst 23:598–604
Marjanovic M, Kovacevic M, Bajat B, Vozenílek V (2011) Landslide susceptibility assessment using SVM machine learning algorithm. Eng Geol 123:225–234
Martha TR, van Westen CJ, Kerle N, Jetten V, Vinod Kumar K (2013) Landslide hazard and risk assessment using semi-automatically created landslide inventories. Geomorphology 184:139–150
Melchiorre C, Matteucci M, Azzoni A, Zanchi A (2008) Artificial neural networks and cluster analysis in landslide susceptibility zonation. Geomorphology 94:379–400
Mladenić D, Brank J, Grobelnik M, Milic-Frayling N (2004) Feature selection using linear classifier weights: interaction with classification models. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, pp 234-241
Mohammady M, Pourghasemi HR, Pradhan B (2012) Landslide susceptibility mapping at Golestan Province, Iran: A comparison between frequency ratio, Dempster–Shafer, and weights-of-evidence models. Journal of Asian Earth Sciences 61:221–236
Murata N, Yoshizawa S, Amari S-i (1994) Network information criterion-determining the number of hidden units for an artificial neural network model. IEEE Trans Neural Networks 5:865–872
NCEP (2014) Global Weather data for SWAT http://globalweather.tamu.edu/home/view13292
Nerini D, Ghattas B (2007) Classifying densities using functional regression trees: Applications in oceanology. Computational Statistics & Data Analysis 51:4984–4993
Oh H-J, Pradhan B (2011) Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area. Computers & Geosciences 37:1264–1276
Ohlmacher GC, Davis JC (2003) Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA. Eng Geol 69:331–343
Onagh M, Kumra V, Rai PK (2012) Landslide susceptibility mapping in a part of Uttarkashi district (India) by multiple linear regression method. International Journal of Geology, Earth and Environmental Sciences ISSN:2277-2081
Ozdemir A, Altural T (2013) A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan Mountains, SW Turkey. Journal of Asian Earth Sciences 64:180–197
Pareek N, Pal S, Sharma ML, Arora MK (2013) Study of effect of seismic displacements on landslide susceptibility zonation (LSZ) in Garhwal Himalayan region of India using GIS and remote sensing techniques. Comput Geosci 61:50–63
Peng L, Niu R, Huang B, Wu X, Zhao Y, Ye R (2014) Landslide susceptibility mapping based on rough set theory and support vector machines: A case of the Three Gorges area, China. Geomorphology 204:287–301
Polemio M, Petrucci O (2000) Rainfall as a landslide triggering factor an overview of recent international research. Landslides in research, theory and practice
Pourghasemi HR, Jirandeh AG, Pradhan B, Xu C, Gokceoglu C (2013a) Landslide susceptibility mapping using support vector machine and GIS at the Golestan Province, Iran. J Earth Syst Sci 2:349–369
Pourghasemi HR, Pradhan B, Gokceoglu C (2012) Application of fuzzy logic and analytical hierarchy process (AHP) to landslide susceptibility mapping at Haraz watershed, Iran. Nat Hazards 63:965–996
Pourghasemi HR, Pradhan B, Gokceoglu C, Mohammadi M, Moradi HR (2013b) Application of weights-of-evidence and certainty factor models and their comparison in landslide susceptibility mapping at Haraz watershed, Iran. Arabian J Geosci 6:2351–2365
Pradhan B (2011) Manifestation of an advanced fuzzy logic model coupled with Geo-information techniques to landslide susceptibility mapping and their comparison with logistic regression modelling. Environmental and Ecological Statistics 18:471–493
Pradhan B (2013) A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput Geosci 51:350–365
Pradhan B, Abokharima MH, Jebur MN, Tehrany MS (2014) Land subsidence susceptibility mapping at Kinta Valley (Malaysia) using the evidential belief function model in GIS. Nat Hazards 73:1019–1042
Pradhan B, Lee S (2010) Delineation of landslide hazard areas on Penang Island, Malaysia, by using frequency ratio, logistic regression, and artificial neural network models. Environ Earth Sci 60:1037–1054
Pradhan B, Lee S, Buchroithner MF (2010a) A GIS-based back-propagation neural network model and its cross-application and validation for landslide susceptibility analyses. Comput Environ Urban Syst 34:216–235
Pradhan B, Sezer EA, Gokceoglu C, Buchroithner FM (2010b) Landslide susceptibility mapping by neuro-fuzzy approach in a landslide-prone area (Cameron Highlands, Malaysia). Transactions on Geoscience and Remote Sensing
Prandini L, Guidiini G, Bottura J, Pançano W, Santos A (1977) Behavior of the vegetation in slope stability: a critical review. Bulletin of Engineering Geology and the Environment 16:51-55
Regmi AD, Devkota KC, Yoshida K, Pradhan B, Pourghasemi HR, Kumamoto T, Akgun A (2014) Application of frequency ratio, statistical index, and weights-of-evidence models and their comparison in landslide susceptibility mapping in Central Nepal Himalaya. Arabian Journal of Geosciences 7:725–742
Rosen GL, Reichenberger ER, Rosenfeld AM (2011) NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics 27:127–129
Saboya Jr F, da Glória AM, Dias Pinto W (2006) Assessment of failure susceptibility of soil slopes using fuzzy logic. Eng Geol 86:211–224
Sadr MP, Maghsoudi A, Saljoughi BS (2014) Landslide susceptibility mapping of Komroud sub-basin using fuzzy logic approach. Geodynamics Research International Bulletin 2
Sarkar S, Kanungo D, Chauhan P (2011) Varunavat landslide disaster in Uttarkashi, Garhwal Himalaya, India. Q J Eng Geol Hydrogeol 44:17–22
Sasikala S, alias Balamurugan SA, Geetha S (2014) Multi filtration feature selection (MFFS) to improve discriminatory ability in clinical data set. Applied Computing and Informatics
Sassa K, Canuti P (2008) Landslides—disaster risk reduction. Springer Science & Business Media
Schicker R, Moon V (2012) Comparison of bivariate and multivariate statistical approaches in landslide susceptibility mapping at a regional scale. Geomorphology 161–162:40–57
Scolobig A, Pelling M (2015) The co-production of risk from a natural hazards perspective science and policy interaction for landslide risk management in Italy
Şenkal O, Kuleli T (2009) Estimation of solar radiation over Turkey using artificial neural network and satellite data. Appl Energy 86:1222–1228
Sezer EA, Pradhan B, Gokceoglu C (2013) Erratum to: “Manifestation of an adaptive neuro-fuzzy model on landslide susceptibility mapping: Klang valley, Malaysia” [Expert Systems with Applications 38 (2011) 8208–8219]. Expert Syst Appl 40:2360
Shahabi H, Khezri S, Ahmad BB, Hashim M (2014) Landslide susceptibility mapping at central Zab basin, Iran: a comparison between analytical hierarchy process, frequency ratio and logistic regression models. CATENA 115:55–70
Singhroy V, Mattar K, Gray A (1998) Landslide characterisation in Canada using interferometric SAR and combined SAR and TM images. Adv Space Res 21:465–476
Soni J, Ansari U, Sharma D, Soni S (2011) Predictive data mining for medical diagnosis: an overview of heart disease prediction. Int J Comput Appl 17:43–48
Soria D, Garibaldi JM, Biganzoli E, Ellis IO (2008) A comparison of three different methods for classification of breast cancer data. In: Seventh International Conference on Machine Learning and Applications (ICMLA ’08). IEEE, pp 619–624
Stocking M (1972) Relief analysis and soil erosion in Rhodesia using multi-variate techniques. Zeitschrift für Geomorphologie NF 16:432–443
Tien Bui D, Chung Ho T, Revhaug I, Pradhan B, Ba Nguyen D (2014a) Landslide susceptibility mapping along the National Road 32 of Vietnam using GIS-based J48 decision tree classifier and its ensembles. Geoinformation and Cartography
Tien Bui D, Pradhan B, Lofman O, Revhaug I (2012a) Landslide susceptibility assessment in Vietnam using support vector machines, decision tree, and Naive Bayes models. Mathematical Problems in Engineering 2012
Tien Bui D, Pradhan B, Lofman O, Revhaug I, Dick OB (2012b) Application of support vector machines in landslide susceptibility assessment for the Hoa Binh province (Vietnam) with kernel functions analysis. International Environmental Modelling and Software Society (iEMSs)
Tien Bui D, Pradhan B, Lofman O, Revhaug I, Dick OB (2012c) Landslide susceptibility mapping at Hoa Binh province (Vietnam) using an adaptive neuro-fuzzy inference system and GIS. Comput Geosci 45:199–211
Tien Bui D, Pradhan B, Lofman O, Revhaug I, Dick OB (2012d) Spatial prediction of landslide hazards in Hoa Binh province (Vietnam): a comparative assessment of the efficacy of evidential belief functions and fuzzy logic models. Catena 96:28–40
Tien Bui D, Pradhan B, Revhaug I, Trung Tran C (2014b) A comparative assessment between the application of fuzzy unordered rules induction algorithm and J48 decision tree models in spatial prediction of shallow landslides at Lang Son City, Vietnam. Remote Sensing Applications in Environmental Research
Tien Bui D, Tuan TA, Klempe H, Pradhan B, Revhaug I (2015) Spatial prediction models for shallow landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides: 1–18
Vahidnia MH, Alesheikh AA, Alimohammadi A, Hosseinali F (2010) A GIS-based neuro-fuzzy procedure for integrating knowledge and data in landslide susceptibility mapping. Computers & Geosciences 36:1101–1114
Van Den Eeckhaut M, Vanwalleghem T, Poesen J, Govers G, Verstraeten G, Vandekerckhove L (2006) Prediction of landslide susceptibility using rare events logistic regression: A case-study in the Flemish Ardennes (Belgium). Geomorphology 76:392–410
Vijith H, Madhu G (2008) Estimating potential landslide sites of an upland sub-watershed in Western Ghat’s of Kerala (India) through frequency ratio and GIS. Environmental Geology 55:1397–1405
Xu C, Dai F, Xu X, Lee YH (2012) GIS-based support vector machine modeling of earthquake-triggered landslide susceptibility in the Jianjiang River watershed, China. Geomorphology 145–146:70–80
Yalcin A, Reis S, Aydinoglu A, Yomralioglu T (2011) A GIS-based comparative study of frequency ratio, analytical hierarchy process, bivariate statistics and logistics regression methods for landslide susceptibility mapping in Trabzon, NE Turkey. Catena 85:274–287
Yee LP, De Silva LC (2002) Application of multilayer perceptron network as a one-way hash function. In: Proceedings of the 2002 International Joint Conference on Neural Networks (IJCNN ’02). IEEE, pp 1459–1462
Yilmaz I (2009) Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: a case study from Kat landslides (Tokat—Turkey). Comput Geosci 35:1125–1138
Zare M, Pourghasemi HR, Vafakhah M, Pradhan B (2013) Landslide susceptibility mapping at Vaz Watershed (Iran) using an artificial neural network model: a comparison between multilayer perceptron (MLP) and radial basic function (RBF) algorithms. Arab J Geosci 6:2873–2888
Zhang H, Su J (2004) Naive Bayesian classifiers for ranking. In: Machine learning: ECML 2004. Lecture Notes in Computer Science 3201:501–512
Zhang W, Gao F (2011) An improvement to naive Bayes for text classification. Procedia Eng 15:2160–2164

Acknowledgments

This research was funded by the ICCR scholarship program of the Government of India, awarded to the first author for his Ph.D. studies at the Department of Civil Engineering, Gujarat Technological University, Gujarat, India. The authors are also thankful to the Director, Bhaskarcharya Institute for Space Applications and Geo-Informatics (BISAG), Department of Science & Technology, Government of Gujarat, Gandhinagar, Gujarat, India, for providing the facilities to carry out this research work.

Author information

Authors and Affiliations

Department of Civil Engineering, Gujarat Technological University, Nr. Visat Three Roads, Visat - Gandhinagar Highway, Chandkheda, Ahmedabad, Gujarat, 382424, India
Binh Thai Pham
Department of Geotechnical Engineering, University of Transport Technology, 54 Trieu Khuc, Thanh Xuan, Hanoi, Vietnam
Binh Thai Pham
Geographic Information System Group, Department of Business Administration and Computer Science, University College of Southeast Norway, Hallvard Eikas Plass 1, N-3800, Bø i Telemark, Norway
Dieu Tien Bui
Department of Natural Resources and Environmental Engineering, College of Agriculture, Shiraz University, Shiraz, Iran
Hamid Reza Pourghasemi
Department of Science & Technology, Bhaskarcharya Institute for Space Applications and Geo-Informatics (BISAG), Government of Gujarat, Gandhinagar, India
Prakash Indra
Department of Civil Engineering, LDCE, Gujarat Technological University, Ahmedabad, Gujarat, 380015, India
M. B. Dholakia

Corresponding author

Correspondence to Binh Thai Pham.

About this article

Cite this article

Pham, B.T., Tien Bui, D., Pourghasemi, H.R. et al. Landslide susceptibility assesssment in the Uttarakhand area (India) using GIS: a comparison study of prediction capability of naïve bayes, multilayer perceptron neural networks, and functional trees methods. Theor Appl Climatol 128, 255–273 (2017). https://doi.org/10.1007/s00704-015-1702-9

Received: 17 August 2015
Accepted: 02 December 2015
Published: 23 December 2015
Issue Date: April 2017
DOI: https://doi.org/10.1007/s00704-015-1702-9
