Objective To construct a risk prediction model for Helicobacter pylori (Hp) infection based on its risk factors, and to provide a new method for the clinical prevention of Hp infection.
Methods Subjects who underwent a 13C or 14C urea breath test in the Department of Gastroenterology at The First Affiliated Hospital of Shihezi University were surveyed by questionnaire to assess the local Hp infection status. With Hp infection status as the outcome variable, the dataset was randomly divided into a training set and a test set at a ratio of 7:3. Univariate analysis and multivariate logistic regression were used to screen for characteristic variables with statistically significant differences. In the training set, six machine learning methods, namely support vector machine (SVM), k-nearest neighbors (KNN), logistic regression (LR), random forest (RF), extreme gradient boosting (XGB), and light gradient boosting machine (LightGBM), were used to construct Hp infection risk prediction models, which were then validated and evaluated in the test set. The optimal model was selected by comparing predictive performance across the models, and its interpretability was analyzed with the SHapley Additive exPlanations (SHAP) method.
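The random 7:3 division described in the Methods can be illustrated with a minimal sketch. It uses only the Python standard library and placeholder subject IDs rather than the study data. The function name `split_7_3`, the seed, and the rounding rule are assumptions made for illustration; the abstract does not state which software or rounding the authors used. Under these assumptions, 678 records split into 475 and 203, which matches the reported set sizes.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle the records and split them into training and test sets at 7:3.

    The seed and the use of round() for the training-set size are
    illustrative assumptions, not details taken from the study.
    """
    shuffled = list(records)               # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * 0.7)   # 70% of the records, rounded
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder IDs standing in for the 678 surveyed subjects.
subject_ids = list(range(678))
train_ids, test_ids = split_7_3(subject_ids)
print(len(train_ids), len(test_ids))  # 475 203
```

Fixing the seed keeps the split reproducible, so that every candidate model is trained and evaluated on the same partition.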
Results A total of 678 subjects were surveyed, with 475 in the training set and 203 in the test set. The XGB model achieved an accuracy of 0.784, a precision of 0.777, a recall of 0.783, an F1 score of 0.780, an area under the precision-recall curve (AUPRC) of 0.875, an area under the receiver operating characteristic curve (AUC) of 0.885, and a Brier score of 0.140, making it the best-performing model. Based on the XGB model, the characteristic variables ranked by importance were: Hp cognitive score, eating high-salt and high-fat food, sharing daily necessities such as toothbrush cups and drinking cups, eating pickled food, and eating raw garlic.
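For readers unfamiliar with the threshold-based metrics reported above, the sketch below shows their standard definitions in plain Python. The confusion-matrix counts and probabilities are toy values chosen for illustration, not study data, and the function names are hypothetical. The last lines check that the reported F1 value of 0.780 agrees with the harmonic mean of the reported values 0.777 (precision) and 0.783 (recall).

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy counts and probabilities (illustrative only, not the study's data).
toy_metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
toy_brier = brier_score([1, 0], [0.9, 0.2])

# Consistency check on the reported values: F1 from precision and recall.
f1_from_reported = 2 * 0.777 * 0.783 / (0.777 + 0.783)
print(f"{f1_from_reported:.3f}")  # 0.780
```

A lower Brier score indicates better-calibrated probabilities, whereas the other metrics are better when higher.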
Conclusion Among the six models compared, the XGB-based Hp infection risk prediction model performed best, and it may aid early clinical assessment and prevention of Hp infection.