Hybrid Approach for Class Imbalance Handling using Adaptive Weighted Oversampling and Instance Hardness-Based Undersampling

Hartono Hartono, Erianto Ongko, Muhammad Khahfi Zuhanda

Abstract


Class imbalance remains a major challenge in multi-class classification. Existing hybrid resampling methods often combine oversampling and undersampling in a loosely coupled manner, without explicitly coordinating minority enrichment and majority reduction. In this experimental study, we propose a hybrid resampling method, Adaptive Weighted Oversampling and Instance Hardness-Based Undersampling (AWO-IHU), which differs from existing hybrid approaches by explicitly aligning boundary-aware minority oversampling with instance hardness-based majority undersampling. Rather than applying the two steps independently, the method coordinates them through a shared notion of classification difficulty to improve decision boundary quality. AWO-IHU first applies adaptive weighted oversampling to emphasize informative minority instances near class boundaries, then applies instance hardness-based undersampling, which selectively removes redundant majority samples using an ensemble-based difficulty estimate. The experimental evaluation uses multiple benchmark datasets with varying numbers of instances, attributes, and classes. Classification performance is measured with Accuracy, Precision, Recall, and Cohen’s Kappa, which together assess overall correctness, minority sensitivity, and agreement beyond chance under class imbalance. The results show that AWO-IHU consistently outperforms SMOTE, Random Undersampling, and conventional hybrid sampling methods. In particular, the proposed method achieves Recall values of up to 1.0 while keeping Precision above 0.89, and it attains the highest Cohen’s Kappa values, up to 0.86. These findings indicate that explicitly coordinating minority enrichment with difficulty-aware majority reduction yields more reliable decision boundary learning and improved generalization in imbalanced multi-class classification.
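
The two-stage idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the weighting formula, the ensemble, or the removal rule, so everything below is an assumption made for illustration. Minority weights are taken as the share of other-class points among the k nearest neighbours, synthetic points are interpolated between same-class pairs, hardness is the out-of-bag error rate of a small bagged k-NN ensemble, and classes above the target size lose their hardest instances first. The function names (awo_ihu_sketch, ensemble_hardness, and so on) and the keep parameter are hypothetical. A small Cohen's Kappa helper is included because the abstract reports that metric.

```python
import numpy as np

def knn_indices(X, Q, k, exclude_self=False):
    """Indices of the k nearest rows of X for each row of Q (Euclidean)."""
    d = np.linalg.norm(Q[:, None, :] - X[None, :, :], axis=2)
    if exclude_self:  # only meaningful when Q is X itself
        np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

def weighted_oversample(X, y, target, k=5, rng=None):
    """Grow each class smaller than `target` up to `target` by interpolation."""
    rng = np.random.default_rng(rng)
    nn_all = knn_indices(X, X, k, exclude_self=True)
    X_out, y_out = [X], [y]
    for c, n in zip(*np.unique(y, return_counts=True)):
        if n >= target:
            continue
        idx = np.flatnonzero(y == c)
        # Assumed boundary weight: share of other-class neighbours (plus a floor).
        w = (y[nn_all[idx]] != c).mean(axis=1) + 1e-3
        m = target - n
        seeds = rng.choice(idx, size=m, p=w / w.sum())  # boundary points favoured
        mates = rng.choice(idx, size=m)                 # random same-class partner
        gap = rng.random((m, 1))
        X_out.append(X[seeds] + gap * (X[mates] - X[seeds]))
        y_out.append(np.full(m, c))
    return np.vstack(X_out), np.concatenate(y_out)

def ensemble_hardness(X, y, n_models=15, k=3, rng=None):
    """Share of bagged k-NN models that misclassify each instance out-of-bag."""
    rng = np.random.default_rng(rng)
    classes, y_enc = np.unique(y, return_inverse=True)
    n = len(y_enc)
    wrong, votes = np.zeros(n), np.zeros(n)
    for _ in range(n_models):
        boot = np.unique(rng.integers(0, n, size=n))  # distinct rows of a bootstrap draw
        oob = np.setdiff1d(np.arange(n), boot)
        if len(oob) == 0:
            continue
        neighbour_labels = y_enc[boot][knn_indices(X[boot], X[oob], k)]
        pred = np.array([np.bincount(r, minlength=len(classes)).argmax()
                         for r in neighbour_labels])
        wrong[oob] += pred != y_enc[oob]
        votes[oob] += 1
    return np.divide(wrong, votes, out=np.zeros(n), where=votes > 0)

def hardness_undersample(X, y, target, rng=None):
    """Shrink each class larger than `target`, dropping the hardest instances first."""
    h = ensemble_hardness(X, y, rng=rng)
    keep = np.ones(len(y), dtype=bool)
    for c, n in zip(*np.unique(y, return_counts=True)):
        if n > target:
            idx = np.flatnonzero(y == c)
            order = idx[np.argsort(h[idx], kind="stable")]  # easiest first
            keep[order[target:]] = False
    return X[keep], y[keep]

def awo_ihu_sketch(X, y, keep=0.8, k=5, seed=0):
    """Oversample, then undersample, so every class ends at keep * largest size."""
    rng = np.random.default_rng(seed)
    target = int(round(keep * np.unique(y, return_counts=True)[1].max()))
    X_os, y_os = weighted_oversample(X, y, target, k=k, rng=rng)
    return hardness_undersample(X_os, y_os, target, rng=rng)

def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e); undefined when p_e equals 1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    p_o = np.mean(y_true == y_pred)
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c)
              for c in np.union1d(y_true, y_pred))
    return (p_o - p_e) / (1 - p_e)
```

Calling awo_ihu_sketch(X, y) on a feature matrix X and label vector y returns a resampled pair in which every class has the same size. Whether the hardest or the easiest majority instances should be removed is a design choice the abstract leaves open; the sketch drops the hardest ones, and reversing the sort order in hardness_undersample gives the alternative.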

 


Keywords


Class Imbalance; Multi-class Classification; Hybrid Resampling; Adaptive Weighted Oversampling; Instance Hardness-Based Undersampling.


DOI: https://doi.org/10.31764/jtam.v10i3.36972


Copyright (c) 2026 Hartono, Erianto Ongko, Muhammad Khahfi Zuhanda

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JTAM (Jurnal Teori dan Aplikasi Matematika) 
is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
