dc.contributor.author: Birri Makota, Rutendo Beauty
dc.contributor.author: Musenge, Eustasius
dc.date.accessioned: 2025-07-25T13:56:29Z
dc.date.available: 2025-07-25T13:56:29Z
dc.date.issued: 2023-06-07
dc.identifier.citation: Birri Makota RB, Musenge E (2023) Predicting HIV infection in the decade (2005–2015) pre-COVID-19 in Zimbabwe: A supervised classification-based machine learning approach. PLOS Digital Health 2(6): e0000260. https://doi.org/10.1371/journal.pdig.0000260 [en_ZW]
dc.identifier.uri: https://hdl.handle.net/10646/4790
dc.description.abstract: The burden of HIV and related diseases has been an area of great concern both before and after the emergence of COVID-19 in Zimbabwe. Machine learning models have been used to accurately predict the risk of diseases, including HIV. Therefore, this paper aimed to determine common risk factors of HIV positivity in Zimbabwe in the decade from 2005 to 2015. The data were from three two-stage, five-yearly population surveys conducted between 2005 and 2015. The outcome variable was HIV status. The prediction model was fit using 80% of the data for learning/training and 20% for testing/prediction. Resampling was done using repeated stratified 5-fold cross-validation. Feature selection was done using Lasso regression, and the best combination of selected features was determined using Sequential Forward Floating Selection. We compared six algorithms in both sexes based on the F1 score, which is the harmonic mean of precision and recall. The overall HIV prevalence for the combined dataset was 22.5% and 15.3% for females and males, respectively. The best-performing algorithm to identify individuals with a higher likelihood of HIV infection was XGBoost, with a high F1 score of 91.4% for males and 90.1% for females based on the combined surveys. The results from the prediction model identified six common features associated with HIV, with total number of lifetime sexual partners and cohabitation duration being the most influential variables for females and males, respectively. In addition to other risk reduction techniques, machine learning may aid in identifying those who might require pre-exposure prophylaxis, particularly women who experience intimate partner violence. Furthermore, compared to traditional statistical approaches, machine learning uncovered patterns in predicting HIV infection with comparatively reduced uncertainty and is therefore crucial for effective decision-making. [en_ZW]
dc.language.iso: en [en_ZW]
dc.publisher: PLOS Digital Health [en_ZW]
dc.subject: COVID-19 [en_ZW]
dc.subject: Machine learning models [en_ZW]
dc.subject: HIV infection [en_ZW]
dc.subject: HIV prevalence [en_ZW]
dc.subject: Pre-exposure prophylaxis [en_ZW]
dc.title: Predicting HIV infection in the decade (2005–2015) pre-COVID-19 in Zimbabwe: a supervised classification-based machine learning approach [en_ZW]
dc.type: Article [en_ZW]