Evaluating the Quality of Mid-Semester Mathematics Summative Assessment in Secondary School: A Psychometric Analysis of Test Items
DOI: https://doi.org/10.31764/ijeca.v8i3.36276
Copyright (c) 2025 Yoga Tegar Santosa, Dini Wardani Maulida, Juli Ferdianto, Sri Sutarni, Yulia Maftuhah Hidayati

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.