130 / 2019-11-05 01:35:47
Combining three statistical techniques to analyze the influencing factors for stroke: a population-based case-control study
logistic regression, classification tree, support vector machine (SVM), area under the receiver operating characteristic curve (AUC)
Abstract accepted
Yuting JI / Southwest Medical University
Yuansheng LI / Southwest Medical University
Min QI / Southwest Medical University
Shuting LI / Southwest Medical University
Rong CHEN / Southwest Medical University
Shili LUO / Southwest Medical University
Xi ZHANG / Southwest Medical University
Wei JIANG / Southwest Medical University
Wangdong XU / Southwest Medical University
Junhui ZHANG / Southwest Medical University
Purpose: Stroke is the leading cause of death in the Chinese population. This study aimed to explore the factors influencing stroke, so as to provide a theoretical basis for research on its etiology and for its prevention and control.
Methods: We conducted a population-based case-control study with frequency matching, selecting 1141 stroke patients and 1141 controls from the Luzhou population health information platform. An unconditional logistic regression model, a CHAID classification tree model, and a support vector machine (SVM) were used to explore the factors influencing stroke. We evaluated the discriminative accuracy of the three statistical techniques using the area under the receiver operating characteristic curve (AUC).
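As an illustration of the evaluation step, the AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The following is a minimal sketch of that calculation, not the authors' implementation; the function name `auc` and the toy labels and scores are hypothetical and are not data from this study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise case/control comparison.

    labels: 1 for a case (stroke), 0 for a control.
    scores: model output; higher means "more likely a case".
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (not from the study): three cases, three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

Applying the same function to the predicted scores of each of the three models on the same subjects would give directly comparable AUC values.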
Results: Multivariate unconditional logistic regression identified the following influencing factors for stroke: age, exercise, hypercholesterolemia, low levels of high-density lipoprotein cholesterol (HDL-C), hypertension, diabetes, and coronary heart disease (CHD). The classification tree model screened six influencing factors for stroke: hypertension, high-salt diet, insufficient exercise, low HDL-C, diabetes, and age, with hypertension as the major risk factor. High-risk groups were mainly people with hypertension, diabetes, age over 50, and lack of exercise. The SVM model identified nine factors affecting stroke; in descending order of influence, these were hypertension (42%), insufficient exercise (18%), age group (12%), alcohol consumption (4%), hypoglycemia (4%), gender (4%), hypercholesterolemia (3%), high-salt diet (3%), and diabetes mellitus (2%). The AUCs of the logistic regression, classification tree, and SVM models were 0.769, 0.753, and 0.880, respectively.
Conclusion: All three models were reliable and produced broadly similar, though not identical, results, so their findings can complement one another. Hypertension, diabetes, CHD, smoking, insufficient exercise, hypercholesterolemia, low HDL-C, a high-salt diet, and age of fifty years or more are important risk factors affecting the prevalence of stroke.
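Important dates
  • Conference dates: December 20, 2019 to December 22, 2019
  • November 15, 2019: draft acceptance notification date
  • December 22, 2019: draft submission deadline
  • December 22, 2019: registration deadline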

Organizer
Xiangya School of Public Health