Undergraduate Pacific Studies Exam Generation and Answering Using Retrieval Augmented Generation and Large Language Models
Document Type
Conference Proceeding
Publication Date
1-7-2025
Abstract
The capabilities of large language models have increased to the point where entire textbooks can be queried using retrieval-augmented generation (RAG). This study evaluates the ability of OpenAI’s ChatGPT-3.5-Turbo and ChatGPT-4-Turbo models to create and answer exam questions based on an undergraduate textbook. Fourteen exams with true-false, multiple-choice, and short-answer questions were created from a textbook available online. The accuracy of the models in answering these questions was assessed both with and without access to the source material. Performance was evaluated using text-similarity metrics including ROUGE-1, cosine similarity, and word embeddings. Analysis of 56 exam scores showed that RAG-assisted models outperformed those without access to the textbook, and that ChatGPT-4-Turbo was more accurate than ChatGPT-3.5-Turbo on nearly all exams. The findings demonstrate the potential of generative artificial intelligence tools in academic assessments and provide insights into the comparative performance of these models.
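The abstract names ROUGE-1 and cosine similarity as scoring metrics but does not say how they were computed. The following is a minimal sketch of the standard definitions only, not the authors' implementation. The function names, the regex tokenizer, and the use of raw word counts as vectors (rather than word embeddings) are all assumptions made for illustration, and the two sentences are toy inputs.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens, no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1(reference, candidate):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    ref = Counter(tokenize(reference))
    cand = Counter(tokenize(candidate))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two texts' word-count vectors."""
    va = Counter(tokenize(text_a))
    vb = Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy inputs (hypothetical, not taken from the study).
reference_answer = "the cat sat on the mat"
model_answer = "the cat sat"

precision, recall, f1 = rouge1(reference_answer, model_answer)
similarity = cosine_similarity(reference_answer, model_answer)
print(precision, recall, round(f1, 4), round(similarity, 4))
```

Here the shorter answer gets perfect precision but only half the recall, which illustrates why a short-answer response can score well on one metric and poorly on another. An embedding-based variant would replace the word-count vectors with dense vectors from an embedding model and apply the same cosine formula.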
DOI
HICSS Record: https://hdl.handle.net/10125/109033
Source Publication
58th Hawaii International Conference on System Sciences, HICSS 2025
Recommended Citation
Tyndall, E., Gayheart, C., Some, A., Genz, J., Langhals, B., & Wagner, T. (2025). Undergraduate Pacific Studies Exam Generation and Answering Using Retrieval Augmented Generation and Large Language Models. Proceedings of the Annual Hawaii International Conference on System Sciences, 1604–1613. https://hdl.handle.net/10125/109033
Comments
This is an Open Access conference paper distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited and is not altered, transformed, or built upon in any way.
Presented at HICSS 2025 as part of the Minitrack on Natural Language Processing and Large Language Models Supporting Data Analytics for System Sciences