A massive study comparing the performance of an artificial intelligence tool, ChatGPT, with that of accounting students has uncovered surprising results. The study collected 25,817 questions from 186 different institutions, and the co-authors found that the chatbot scored lower than the students. The results of the study, called “The ChatGPT Artificial Intelligence Chatbot: How Well Does It Answer Accounting Assessment Questions?”, have been published in Issues in Accounting Education, and offer insight into the capabilities and limitations of the AI.
University of Auckland accounting and finance academics Ruth Dimes and David Hay, co-authors of the study, entered assessment questions into ChatGPT-3 and evaluated the accuracy of its responses. Ruth Dimes, who directs the Business School’s Business Masters programme, examined two recent exams from the ‘analysing financial statements’ course by entering the exam questions into ChatGPT and recording the accuracy of its responses. Her results mirrored the overall findings of the study, and she admitted surprise at ChatGPT’s poorer performance compared to the students.
David Hay, Professor of Auditing, had similar findings when posing questions from his auditing course. He found that ChatGPT performed slightly better on auditing questions than on financial accounting questions, but still not as well as the students.
Across the different topic areas of the assessments, the chatbot did relatively better on topics such as information systems and auditing than on financial accounting, managerial accounting, or tax. Overall, students scored an average of 76.7 percent, while ChatGPT scored 47.4 percent based on fully correct answers. When partially correct answers were counted, ChatGPT’s average rose to 56.5 percent.
The study and its implications have been intriguing for Dimes and Hay. Dimes is interested in seeing how newer versions of ChatGPT and other AI tools would perform in a similar study, and highlights the need for universities to focus on assessing critical thinking rather than rote learning. The study’s co-authors, based around the world, have congratulated one another on the collaboration and on how quickly the results were collected.
Overall, the study carries implications for universities and the accounting field, and further research into chatbot performance and AI capabilities could be beneficial moving forward.
The study was led by Professor David Wood from Brigham Young University in Utah, and the co-authors of the study include 328 researchers from around the world.
Ruth Dimes is an academic at the University of Auckland specialising in the Business Masters programme and the analysis of financial statements. David Hay is a Professor of Auditing at the University of Auckland and also a co-author of the study. Both academics played an instrumental role in the study, and their work contributed to the large number of questions collected and evaluated.