A chatbot designed to provide health-related advice has been found to score poorly on a practice exam commonly used by urologists in training. ChatGPT, which uses artificial intelligence technology, scored less than 30% on the American Urological Association’s Self Study Program for Urology, a 150-question practice exam. The AI tool’s answers to open-ended questions were frequently redundant and cyclical, and it scored just 26.7% on this style of question. Researchers also found that the chatbot made certain types of errors that posed a risk of spreading medical misinformation.
Despite previous successes at providing empathetic answers to patient questions and positive outcomes on the United States Medical Licensing Exam, ChatGPT performed poorly on urological questions, reflecting a fundamental lack of understanding of the clinical context in which it was being used. The study, conducted by clinicians at the University of Nebraska Medical Center, recommended caution when deploying such tools in clinical settings and concluded that further work is needed to refine them.