Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory

Salima Aldazharova; Gulnara Issayeva; Samat Maxutov; Nuri Balta

doi:10.30935/cedtech/15592

Contemporary Educational Technology, Volume 16, Issue 4, Article No: ep538

Submitted: 02 August 2024, Published Online: 07 November 2024

Open access

Abstract

This study investigates the performance of GPT-4, an advanced AI model developed by OpenAI, on the force concept inventory (FCI), evaluating its accuracy, its reasoning patterns, and the occurrence of false positives and false negatives. GPT-4 was asked to answer the FCI questions across multiple sessions. It proved proficient on several FCI items, particularly those related to Newton’s third law, and achieved perfect scores on many items. However, it struggled significantly with questions that require interpreting figures and reasoning spatially, which produced a higher occurrence of false negatives: cases in which the reasoning was correct but the answer was incorrect. GPT-4 also displayed several conceptual errors, such as misunderstanding the effect of friction and retaining the outdated impetus theory of motion. These findings emphasize the importance of refining AI-driven tools before relying on them in educational settings. Addressing both AI limitations and common misconceptions in physics can lead to improved educational outcomes.

Keywords: AI-assisted learning, force concept inventory, GPT-4, physics education
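
The abstract's distinction between answer correctness and reasoning correctness can be illustrated with a small tally. The following is a minimal sketch, not the authors' procedure. It assumes each response has already been hand-coded with two judgments, and it takes "false positive" to mean a correct answer reached through incorrect reasoning, by symmetry with the abstract's definition of a false negative. The names `Response`, `classify`, and `tally`, and the sample data, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    """One coded response (hypothetical structure, not from the study)."""
    item: int                # FCI item number
    answer_correct: bool     # was the selected option correct?
    reasoning_correct: bool  # was the accompanying explanation correct?


def classify(r: Response) -> str:
    """Map the two judgments onto the four possible outcomes."""
    if r.answer_correct and r.reasoning_correct:
        return "true_positive"
    if r.answer_correct:
        # Assumed meaning: right answer, wrong reasoning.
        return "false_positive"
    if r.reasoning_correct:
        # As in the abstract: right reasoning, wrong answer.
        return "false_negative"
    return "true_negative"


def tally(responses):
    """Count how often each outcome occurs across all responses."""
    return Counter(classify(r) for r in responses)


# Made-up illustrative data, not results from the study.
sample = [
    Response(1, True, True),
    Response(2, True, False),
    Response(3, False, True),
    Response(3, False, False),
]
counts = tally(sample)
```

Keeping the two judgments as separate fields, rather than storing a single label, makes it possible to recompute the categories if the definition of a false positive is changed.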

Citation

Aldazharova, S., Issayeva, G., Maxutov, S., & Balta, N. (2024). Assessing AI’s problem solving in physics: Analyzing reasoning, false positives and negatives through the force concept inventory. Contemporary Educational Technology, 16(4), ep538. https://doi.org/10.30935/cedtech/15592
