Comparative analysis of main catastrophic factors of landslides and debris flows based on the combination of multi-collinearity diagnosis and logistic regression
Collapses, landslides, and debris flows are the main mountain disasters in China. In recent years, these disasters have caused serious damage to residential buildings, highways, bridges, and hydropower stations. Different types of disasters have different disaster-causing mechanisms and catastrophic factors. Identifying the fundamental factors of each disaster type supports targeted prevention, management, and control. This research uses data mining analysis to rank the catastrophic factors by importance and to compare how they differ between landslides and debris flows. The main research work and results are as follows:
(1) Wenchuan and Maoxian are selected as the research areas. Data on historical landslides and debris flows are integrated to establish a spatial database of the research areas on a GIS platform. Based on the disaster-causing mechanisms and influencing factors of geological disasters, 20 hazard factors are selected as evaluation indicators: elevation, slope, aspect, plane curvature, section curvature, comprehensive curvature, roughness, slope position, micro-geomorphology, lithology, distance to fault, normalized difference vegetation index (NDVI), surface coverage, topographic wetness index (TWI), sediment transport index, stream power index (SPI), distance to river, annual average rainfall, distance to house, and distance to road.
(2) For the data mining analysis, multi-collinearity diagnosis and logistic regression, both implemented in the R language, are selected. Because as many as 20 primary evaluation factors are used, multi-collinearity may exist among them, which would affect the regression coefficients and distort the model. Therefore, a multi-collinearity diagnosis of the 20 selected hazard factors is performed first in R. The results show that multi-collinearity exists among the factors.
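The multi-collinearity diagnosis in (2) can be illustrated with the variance inflation factor (VIF), one common diagnostic. The abstract does not say which diagnostic or threshold was used, and the study itself worked in R, so the following is only a minimal Python sketch under stated assumptions: the data are synthetic, the column meanings are hypothetical, and the cutoff of 10 is a widely used rule of thumb rather than a value taken from the study.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = samples, columns = factors).

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing factor j
    on all other factors plus an intercept.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        # ss_tot / ss_res equals 1 / (1 - R^2)
        vifs[j] = np.inf if ss_res == 0 else ss_tot / ss_res
    return vifs

# Synthetic stand-in data (not from the study): the third column is built
# to be nearly a copy of the second, so both should show a high VIF.
rng = np.random.default_rng(0)
elevation = rng.normal(size=500)
slope = rng.normal(size=500)
roughness = slope + 0.05 * rng.normal(size=500)
X_demo = np.column_stack([elevation, slope, roughness])

vifs = variance_inflation_factors(X_demo)
VIF_CUTOFF = 10.0  # assumed rule-of-thumb threshold
flagged = [i for i, v in enumerate(vifs) if v > VIF_CUTOFF]
print(vifs, flagged)
```

Factors flagged this way would be candidates for removal or merging before the regression step.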
(3) Logistic regression analysis is then carried out on the factors that passed the multi-collinearity diagnosis, and stepwise regression is used to further screen out the factors that fail the significance test. The logistic regression models of landslides and debris flows are fitted to the historical data from Wenchuan County, and the reliability of the models is evaluated with a confusion matrix. The models are then applied to predict landslide and debris flow disasters in Maoxian County, and the predictions are compared with the historical disasters. The ROC curve and the AUC value are used to evaluate the prediction accuracy of the models built from the main catastrophic factors.
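The train-on-one-area, predict-on-another workflow in (3) can be sketched as follows. This is a simplified Python illustration, not the study's R implementation: the two data sets are synthetic stand-ins for the two counties, the stepwise screening is omitted, the fitting routine (Newton's method) and the 0.5 classification threshold are assumptions, and all function names are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit logistic regression by Newton's method; returns [intercept, coefs...]."""
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        w = p * (1.0 - p)
        grad = A.T @ (y - p)
        hess = (A * w[:, None]).T @ A + ridge * np.eye(A.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def predict_proba(X, beta):
    """Predicted probability of the positive class (disaster occurrence)."""
    A = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-A @ beta))

def confusion_matrix(y_true, y_pred):
    """2x2 count matrix: rows = actual class, columns = predicted class."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true.astype(int), y_pred.astype(int)):
        cm[t, p] += 1
    return cm

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def make_area(seed, n):
    """Synthetic samples: three factors, of which only the first two matter."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    logit = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 1]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
    return X, y

# "Training area" and "prediction area" are synthetic, not real county data.
X_train, y_train = make_area(seed=1, n=600)
X_test, y_test = make_area(seed=2, n=400)

beta = fit_logistic(X_train, y_train)
cm_train = confusion_matrix(y_train, predict_proba(X_train, beta) >= 0.5)
auc_test = roc_auc(y_test, predict_proba(X_test, beta))
print(cm_train)
print(round(auc_test, 3))
```

The confusion matrix summarizes fit on the training area, while the AUC measures how well the fitted model ranks disaster against non-disaster samples in the area it has not seen.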
(4) Finally, the catastrophic factors of each disaster type are ranked by importance using the regression coefficients of the logistic regression models, and the main catastrophic factors of the two disaster types are compared to explore the different developmental mechanisms of landslides and debris flows.
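Ranking factors by their coefficients, as described in (4), might look like the sketch below. It assumes the ranking uses the absolute value of coefficients fitted on standardized predictors, which the abstract does not confirm; raw coefficients are only comparable when the factors share a common scale. The factor names and values are made up for illustration and are not results from the study.

```python
def rank_by_coefficient(coefs):
    """Sort factor names by decreasing absolute coefficient."""
    return sorted(coefs, key=lambda name: abs(coefs[name]), reverse=True)

# Hypothetical standardized coefficients, for illustration only.
example_coefs = {"factor_a": -1.8, "factor_b": 0.4, "factor_c": 1.1}

ranking = rank_by_coefficient(example_coefs)
print(" > ".join(ranking))  # factor_a > factor_c > factor_b
```

The sign of a coefficient gives the direction of the effect, while its magnitude drives the position in the ranking.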
(5) The results show that the main factors of landslides are elevation, slope, lithology, NDVI, annual average rainfall, surface coverage, and distance to road. The order of importance is: elevation > surface coverage > annual average rainfall > slope > lithology > NDVI > distance to road. When the logistic regression model built from these dominant factors was applied to prediction in the Maoxian area, the AUC value of the ROC curve was 0.85. The main factors of debris flows are elevation, surface coverage, comprehensive curvature, distance to road, SPI, slope, TWI, micro-geomorphology, and aspect. The order of importance is: elevation > surface coverage > comprehensive curvature > distance to road > SPI > slope > TWI > micro-geomorphology > aspect. The AUC value of the ROC curve was 0.89. In general, an effective disaster prediction model can be established from the main catastrophic factors selected through multi-collinearity diagnosis and logistic regression analysis. At the same time, the main catastrophic factors of landslides and debris flows differ markedly, reflecting the different disaster-causing mechanisms of the two disaster types.
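The comparison in (5) can be made explicit by setting the two reported importance orders side by side. The short Python sketch below uses only the factor orders stated above; the variable names are illustrative.

```python
# Importance orders as reported in (5), most important first.
landslide_order = [
    "elevation", "surface coverage", "annual average rainfall",
    "slope", "lithology", "NDVI", "distance to road",
]
debris_flow_order = [
    "elevation", "surface coverage", "comprehensive curvature",
    "distance to road", "SPI", "slope", "TWI",
    "micro-geomorphology", "aspect",
]

# Factors that appear in both models, and those unique to each.
shared = [f for f in landslide_order if f in debris_flow_order]
landslide_only = [f for f in landslide_order if f not in debris_flow_order]
debris_flow_only = [f for f in debris_flow_order if f not in landslide_order]

# Rank (1 = most important) of each shared factor in the two models.
rank_pairs = {
    f: (landslide_order.index(f) + 1, debris_flow_order.index(f) + 1)
    for f in shared
}

print("shared:", rank_pairs)
print("landslide only:", landslide_only)
print("debris flow only:", debris_flow_only)
```

Elevation and surface coverage lead both rankings, while annual average rainfall, lithology, and NDVI appear only in the landslide model, and the curvature, hydrological (SPI, TWI), micro-geomorphology, and aspect factors appear only in the debris flow model.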