In a reminder of the limits of artificial intelligence (AI), OpenAI's ChatGPT has failed a practice test created by the American College of Gastroenterology (ACG). When tested on questions from the ACG's 2021 and 2022 multiple-choice self-assessment tests, both the GPT-3.5 and GPT-4 versions of the chatbot fell short of the 70% passing grade.
The tests were conducted by Arvind Trindade, MD, of Northwell Health's Feinstein Institutes for Medical Research in Manhasset, New York, and his colleagues. Questions from the assessments were copied and pasted directly into ChatGPT, which then generated a response and an explanation; from these, the authors selected the corresponding answer choice.
GPT-3.5 scored 65.1% (296 of 455 questions) and GPT-4 scored 62.4% (284 of 455), both below the 70% grade required to pass the exam. The scores were also lower than expected, prompting the study's authors to argue that the bar for clinical use should be set higher still.
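For readers who want to check the arithmetic, the minimal Python sketch below (not part of the study) reproduces the reported percentages from the raw counts and compares them against the 70% threshold:

```python
# Minimal sketch (not from the study): reproducing the reported pass/fail
# arithmetic, assuming a 455-question exam and a 70% passing grade.
PASSING_GRADE = 0.70
TOTAL_QUESTIONS = 455

correct_answers = {"GPT-3.5": 296, "GPT-4": 284}  # counts reported in the study

for model, correct in correct_answers.items():
    score = correct / TOTAL_QUESTIONS
    verdict = "pass" if score >= PASSING_GRADE else "fail"
    print(f"{model}: {score:.1%} -> {verdict}")

# Output:
# GPT-3.5: 65.1% -> fail
# GPT-4: 62.4% -> fail
```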
Recent papers have shown ChatGPT passing other medical assessments, but Dr. Trindade argued that this does not mean the technology is ready for the clinic. Medical professionals, he said, should think about how to optimize such tools rather than rely on them for clinical decisions, and he noted that the medical community should hold them to a much higher standard, such as an accuracy threshold of 95% or more.
Google researchers have developed their own medically tuned AI model, Med-PaLM, which achieved 67.6% accuracy on US Medical Licensing Exam-style questions, surpassing the commonly cited passing threshold. An updated version, Med-PaLM 2, reached 85% accuracy and performed at an "expert" physician level.
AI chatbots such as ChatGPT have also been found to beat physicians in answering patient-generated questions. During a blind evaluation, the AI chatbot’s responses were preferred over real-physician answers 75% of the time.
While research into AI performance on medical credentialing tests has shown rapid progress, it is also a reminder that AI is still far from consistently providing accurate, reliable advice. Medical professionals should weigh all available sources of information when making decisions and should continue to prioritize human expertise over artificial intelligence.