Classification of patients with chronic disease by activation level using machine learning methods


Demiray O., Gunes E. D., Kulak E., Dogan E., KARAKETİR E. Ş., Cifcili S., ...

Health Care Management Science, vol.26, no.4, pp.626-650, 2023 (SSCI)

  • Publication Type: Article / Full Article
  • Volume: 26, Issue: 4
  • Publication Date: 2023
  • DOI: 10.1007/s10729-023-09653-4
  • Journal Name: Health Care Management Science
  • Journal Indexed In: Social Sciences Citation Index (SSCI), Scopus, ABI/INFORM, Business Source Elite, Business Source Premier, CINAHL, EconLit, EMBASE, MEDLINE, Public Affairs Index
  • Pages: pp.626-650
  • Keywords: Binary classification, Chronic care, Logistic regression, Machine learning, Patient activation, Patient activation measure, Prediction, Primary care
  • Affiliated with Istanbul University: No

Abstract

The Patient Activation Measure (PAM) measures the activation level of patients with chronic conditions and correlates well with patient adherence behavior, health outcomes, and healthcare costs. PAM is increasingly used in practice to identify patients who need more support from the care team. We define PAM levels 1 and 2 as low PAM and investigate the performance of eight machine learning methods (Logistic Regression, Lasso Regression, Ridge Regression, Random Forest, Gradient Boosted Trees, Support Vector Machines, Decision Trees, Neural Networks) in classifying patients. Primary data collected from adult patients (n=431) with Diabetes Mellitus (DM) or Hypertension (HT) attending Family Health Centers in Istanbul, Turkey, are used to test the methods. In this dataset, 44.5% of patients have a low PAM level. Classification performance with several feature sets was analyzed to understand the relative importance of different types of information and to provide insights. The most important features were found to be whether the patient performs self-monitoring, smoking and exercise habits, education, and socio-economic status. The best performance was achieved by the Logistic Regression algorithm, which reached an Area Under the Curve (AUC) of 0.72 with the best-performing feature set. Alternative feature sets with similar prediction performance are also presented. Prediction performance was inferior with an automated feature selection method, supporting the importance of using domain knowledge in machine learning.
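The classification task described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the data are synthetic, the two binary features (`self_monitoring`, `exercises`) and the rule that generates PAM levels are hypothetical, and the logistic regression is a plain gradient-descent fit written with the standard library only. The only element taken from the abstract is the target definition (PAM levels 1 and 2 count as low) and the use of AUC as the performance measure.

```python
import math
import random


def is_low_pam(level):
    """Binary target from the abstract: PAM levels 1 and 2 count as low (1)."""
    return 1 if level in (1, 2) else 0


def predict_proba(w, b, x):
    """Logistic model: sigmoid of a linear combination of the features."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [predict_proba(w, b, xi) - yi for xi, yi in zip(X, y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic patients (assumption: NOT the study data).
random.seed(0)
X, y = [], []
for _ in range(200):
    self_monitoring = random.randint(0, 1)  # hypothetical feature
    exercises = random.randint(0, 1)        # hypothetical feature
    # Hypothetical rule: each habit raises the chance of PAM level 3 or 4.
    p_high = 0.25 + 0.3 * self_monitoring + 0.2 * exercises
    if random.random() < p_high:
        level = random.choice([3, 4])
    else:
        level = random.choice([1, 2])
    X.append([self_monitoring, exercises])
    y.append(is_low_pam(level))

# Simple hold-out split: first 150 patients for fitting, last 50 for testing.
w, b = fit_logistic(X[:150], y[:150])
test_scores = [predict_proba(w, b, xi) for xi in X[150:]]
test_auc = auc(y[150:], test_scores)
print(f"Hold-out AUC on synthetic data: {test_auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of low-PAM from other patients, which gives a scale for reading the reported value of 0.72. The number printed by this sketch reflects only the synthetic generating rule and says nothing about the study's results.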