Comparing MARS and Binary Logistic Regression to Modelling Hepatitis C Cases using the SMOTE Balancing Method

Nur Chamidah, Aulia Ramadhanti, Azzah Nazhifa Wina Ramadhani, Bimo Okta Syahputra, Jovansha Ariyawan, Ardi Kurniawan

Abstract


Hepatitis is an inflammatory liver disease caused by viral infection and remains a major global public health concern, responsible for approximately 1.4 million deaths annually. Egypt is among the countries with the highest prevalence of Hepatitis C. To address this issue and support Goal 3 of the Sustainable Development Goals (SDGs), this study takes a quantitative approach, using secondary data to analyze the factors influencing Hepatitis C infection in Egypt. Two statistical models, Binary Logistic Regression and Multivariate Adaptive Regression Splines (MARS), were compared, with the Synthetic Minority Over-sampling Technique (SMOTE) applied to correct class imbalance. The dataset consisted of 608 patient observations; the classes were initially imbalanced at a ratio of 86.5:13.5 and were balanced to 52.6:47.4 after SMOTE was applied. The MARS model showed better predictive performance than binary logistic regression. All independent variables except sex were statistically significant (p < 0.05). Additionally, all odds ratios were less than 1, indicating lower odds of Hepatitis C infection relative to non-infection. These findings highlight the relevance of statistical modeling and data-driven strategies in supporting preventive health measures.
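As an illustration of the balancing step, the basic SMOTE idea can be sketched as follows: each synthetic minority observation is placed at a random point on the line segment between a minority observation and one of its nearest minority neighbours. This is a minimal sketch only. The abstract does not state the software, the number of neighbours, or the SMOTE variant used, so the function name `smote`, the default `k=5`, and the toy data below are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from the minority class.

    Hypothetical helper: needs at least two minority points, each given
    as a tuple of numeric features.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (not from the study): 12 majority cases, 4 minority cases.
n_majority = 12
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]

new_points = smote(minority, n_new=8, k=3)
n_minority = len(minority) + len(new_points)
total = n_majority + n_minority
print(f"majority share: {100 * n_majority / total:.1f}%")
print(f"minority share: {100 * n_minority / total:.1f}%")
```

Because every synthetic point is an interpolation between two existing minority observations, oversampling changes the class ratio without creating values outside the range already observed in the minority class.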

 


Keywords


Egypt; Hepatitis C; Binary Logistic Regression; MARS; SMOTE.





DOI: https://doi.org/10.31764/jtam.v10i1.33196



Copyright (c) 2026 Nur Chamidah, Aulia Ramadhanti, Azzah Nazhifa Wina Ramadhani, Bimo Okta Syahputra, Jovansha Ariyawan, Ardi Kurniawan

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JTAM (Jurnal Teori dan Aplikasi Matematika) 
is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

______________________________________________


JTAM (Jurnal Teori dan Aplikasi Matematika) Editorial Office: