Chi-Square Feature Selection with Pseudo-Labelling in Natural Language Processing
DOI: https://doi.org/10.31764/jtam.v8i3.22751
Copyright (c) 2024 Sintia Afriyani, Sugiyarto Surono, Mahmud Iwan Solihin
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
JTAM (Jurnal Teori dan Aplikasi Matematika)