A Comparative Study of PCA-Based Dimensionality Reduction and Best Subset Selection in Disease Classification
DOI: https://doi.org/10.31764/jtam.v10i3.38265
Copyright (c) 2026 Andreas Rony Wijaya, Atika Ratna Dewi, Muhammah Bayu Nirwana, Respatiwulan, Sri Sulistijowati Handajani

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
JTAM (Jurnal Teori dan Aplikasi Matematika)