OpenAI’s newest AI model, GPT-4, has been making headlines for its impressive performance on various exams, including passing the bar exam with a score in the 90th percentile, passing 13 of 15 Advanced Placement (AP) exams, and earning a nearly perfect score on the GRE Verbal test. However, when the chatbot was put to the test on accounting exams, it was no match for humans. A recent experiment conducted by researchers from 186 universities, including Brigham Young University (BYU), found that students scored an average of 76.7 percent, while ChatGPT scored 47.4 percent.
OpenAI is a company co-founded by Elon Musk, Greg Brockman, and Ilya Sutskever, and is backed by Microsoft. OpenAI’s stated mission is to ensure that artificial intelligence benefits humanity, so it strives to create powerful methods and technologies to address the world’s most pressing challenges. The company is well known for producing various AI models, such as the massively popular GPT-3, an AI-powered text-generating system that produces human-like text.
The purpose of the experiment was to determine how well OpenAI’s technology performs in accounting. The results showed that ChatGPT struggled with higher-order questions, with the mathematical processes required for tax, financial, and managerial assessments, and with short-answer questions. In addition, ChatGPT had difficulty recognizing when it was performing math, sometimes making nonsensical errors.
Lead study author and BYU professor of accounting David Wood commented on the results, saying, “We’re trying to focus on what we can do with this technology now that we couldn’t do before to improve the teaching process for faculty and the learning process for students. Testing it out was eye-opening.”
Although the results of the experiment were clear, the AI chatbot still did well in some areas, such as true/false and multiple-choice questions. It also performed better on accounting information systems (AIS) and auditing questions, which could point to potential use cases in business and finance.
Despite its impressive performance in some areas, the experiment also found that ChatGPT often provided explanations for incorrect answers and even made up facts. This is why researchers urge people not to rely solely on the AI chatbot for learning, as Jessica Wood, a freshman at BYU, commented: “Trying to learn solely by using ChatGPT is a fool’s errand.”
Although ChatGPT currently falls short on accounting exams, researchers hope that Microsoft-backed OpenAI’s GPT-4 will improve rapidly in the future. With OpenAI’s commitment to making AI technologies work for humanity, ChatGPT may soon surpass human performance and help reshape the education sector.