Applying Predictive Analytics in Identifying Key Risk Factors for Hypertension in Malawi: A Randomized Controlled Population Health Study


  •  Bongs Lainjo    
  •  Dorothy Eunice Lazaro    
  •  Maureen Leah Chirwa    
  •  Gomezga Chitsulo    

Abstract

This study investigates the primary risk factors for hypertension in Malawi using predictive analytics, with the CARROT-BUS (Capacity Building, Accountability, Resources, Results, Ownership, Transparency – Bottom-Up Strategy) model as a guiding framework. Five machine learning models (Logistic Regression, Random Forest, Support Vector Machine (SVM), Neural Network, and XGBoost) were trained on baseline data from a population-level control cohort study and compared on predictive performance. XGBoost achieved the highest accuracy (88%) and AUC-ROC (0.92), followed by Random Forest and Logistic Regression. Key predictors included age, body mass index (BMI), systolic blood pressure, physical inactivity, and high sodium intake.
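
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and AUC-ROC are computed from a classifier's predicted probabilities. The labels and scores are made up for illustration and are not study data; the function names and the 0.5 decision threshold are assumptions, not the authors' pipeline.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases classified correctly at the given threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = hypertensive, 0 = not hypertensive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical predicted probabilities from some fitted model.
y_score = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]

print(f"accuracy = {accuracy(y_true, y_score):.2f}")
print(f"AUC-ROC  = {auc_roc(y_true, y_score):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why the two are usually reported together when comparing models.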

In parallel, qualitative data from focus group discussions (FGDs) provided contextual insights into community knowledge, attitudes, and barriers regarding hypertension prevention and care. Participants revealed widespread misconceptions about hypertension symptoms and causes, reliance on traditional medicine, inadequate infrastructure, and medication shortages. The CARROT-BUS model served as a lens to assess systemic enablers and constraints, emphasizing the importance of community ownership, transparent resource allocation, and sustainable intervention planning.

This mixed-methods approach demonstrates the value of integrating machine learning with participatory community engagement to guide data-informed, culturally relevant public health strategies. Although the cross-sectional baseline data limit causal inference and some self-reported variables may reflect social desirability bias, the study offers actionable insights for improving hypertension control in low-resource settings.

Future phases, including midline and endline assessments, will further evaluate the effectiveness and sustainability of the interventions. These assessments are critical for enabling causal inference and for determining the interventions' longitudinal impact on hypertension control.



This work is licensed under a Creative Commons Attribution 4.0 License.