MLCommons Launches New Platform for Benchmarking AI Medical Models

The healthcare industry is increasingly embracing AI: according to a survey by Optum, 80% of healthcare organizations already have an AI strategy in place, and another 15% plan to launch one. In response to this growing demand, MLCommons has developed a new platform called MedPerf to benchmark and evaluate AI medical models.

With the proliferation of medical models in the market, it has become challenging to determine which models actually perform as advertised. Many medical models are trained with data from limited clinical settings, leading to biases and harmful impacts, particularly on minority patient populations.

MedPerf aims to establish a reliable and trusted way to benchmark and evaluate medical models. Designed to be used by healthcare organizations rather than vendors, MedPerf allows hospitals and clinics to assess AI models on demand. It utilizes federated evaluation to remotely deploy models and evaluate them on-premises while protecting patient privacy.
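To make the idea of federated evaluation concrete, here is a minimal sketch in Python. It is not MedPerf's actual API: the `Site` class, the `federated_evaluate` function, and the toy `threshold_model` are all hypothetical names invented for illustration. The point it shows is that the model travels to each site, the patient records stay where they are, and only aggregate metrics are returned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types for illustration; these are not MedPerf's actual API.
Model = Callable[[Sequence[float]], int]
Record = Tuple[Sequence[float], int]  # (features, ground-truth label)


@dataclass
class Site:
    """A hospital or clinic whose records never leave its premises."""
    name: str
    _records: List[Record]  # private, on-premises data

    def evaluate(self, model: Model) -> Dict[str, float]:
        """Run the model locally and return only aggregate metrics."""
        correct = sum(1 for x, y in self._records if model(x) == y)
        return {"n": float(len(self._records)), "accuracy": correct / len(self._records)}


def federated_evaluate(model: Model, sites: List[Site]) -> Dict[str, Dict[str, float]]:
    """Send the model to each site and collect per-site summaries."""
    return {site.name: site.evaluate(model) for site in sites}


def threshold_model(features: Sequence[float]) -> int:
    """Toy classifier: predicts 1 when the first feature exceeds 0.5."""
    return 1 if features[0] > 0.5 else 0


# Toy records, made up purely to exercise the code.
sites = [
    Site("site_a", [([0.9], 1), ([0.1], 0), ([0.7], 1), ([0.2], 0)]),
    Site("site_b", [([0.6], 0), ([0.4], 1), ([0.8], 1), ([0.3], 0)]),
]
report = federated_evaluate(threshold_model, sites)
```

In this sketch the caller only ever sees the `report` dictionary of per-site summaries, never the underlying records, which is the privacy property the federated approach is meant to preserve.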

MedPerf is the product of a two-year collaboration led by the MLCommons Medical Working Group, with input from over 20 companies and more than 20 academic institutions, including Google, Amazon, IBM, Intel, Brigham and Women’s Hospital, Stanford, and MIT.

In a recent test, MedPerf hosted the NIH-funded Federated Tumor Segmentation (FeTS) Challenge, which involved evaluating 41 different models across 32 healthcare sites on six continents. The results showed reduced performance of the models at sites with different patient demographics, revealing the biases within them.
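One simple way to surface this kind of site-to-site gap is to compare each site's score against the cross-site average. The sketch below is an assumption about how such a check could look, not the analysis used in the FeTS Challenge; the function name `flag_underperforming_sites`, the `tolerance` parameter, and the scores are all hypothetical.

```python
from statistics import mean
from typing import Dict, List


def flag_underperforming_sites(scores: Dict[str, float], tolerance: float = 0.1) -> List[str]:
    """Return sites whose score falls more than `tolerance` below the cross-site mean."""
    overall = mean(scores.values())
    return sorted(name for name, s in scores.items() if s < overall - tolerance)


# Made-up scores for illustration only; these are not FeTS results.
example_scores = {"site_a": 0.90, "site_b": 0.88, "site_c": 0.62, "site_d": 0.86}
flagged = flag_underperforming_sites(example_scores)
```

A flagged site is only a prompt for closer inspection: a low score may reflect a different patient population, different imaging equipment, or simply a small sample.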

MLCommons sees MedPerf as a foundational step towards accelerating medical AI through open and scientific approaches, although the platform currently focuses on models that analyze radiology scans. MLCommons encourages AI researchers to validate their own models using the platform and urges data owners to register their patient data to enhance the robustness of MedPerf’s testing.

While MedPerf addresses the issue of medical model bias, other obstacles to implementing AI in healthcare remain. Incorporating AI into the daily routines of doctors and nurses, and into complex care-delivery and technical systems, is still difficult. A report from Duke University highlights the resulting gap between how AI is marketed and how the technology is implemented in practice.

The concerns surrounding AI in healthcare are reflected in a poll by Yahoo Finance, which found that 55% of healthcare practitioners believe the technology is not yet ready for use, and only 26% believe it can be trusted.

While MedPerf offers a valuable tool for benchmarking and evaluating medical models, the safe deployment of these models requires continuous auditing by vendors, customers, and researchers. Benchmarks alone do not provide a complete picture, and thorough testing is essential to ensure responsible use of AI in healthcare.

Frequently Asked Questions (FAQs) Related to the Above News

What is MedPerf?

MedPerf is a new platform developed by MLCommons to benchmark and evaluate AI medical models used in healthcare.

Why was MedPerf created?

MedPerf was created in response to the increasing demand for AI in healthcare and the need for a reliable and trusted way to evaluate the performance of medical models.

How is MedPerf different from other benchmarking platforms?

MedPerf is designed specifically for healthcare organizations rather than vendors. It utilizes federated evaluation to remotely deploy models and evaluate them on-premises while protecting patient privacy.

Who collaborated on the development of MedPerf?

MedPerf received input from over 20 companies and more than 20 academic institutions, including Google, Amazon, IBM, Intel, Brigham and Women's Hospital, Stanford, and MIT.

What types of models does MedPerf evaluate?

Currently, MedPerf primarily focuses on evaluating models that analyze radiology scans, but MLCommons encourages AI researchers to validate their own models using the platform.

How does MedPerf address the issue of biases in medical models?

MedPerf's tests have revealed biases within medical models by evaluating their performance at different healthcare sites with varying patient demographics.

What are the challenges in implementing AI in healthcare?

Incorporating AI into the daily routines of healthcare professionals and dealing with complex care-delivery and technical systems are among the challenges highlighted in a report from Duke University.

Can AI in healthcare be trusted?

According to a poll by Yahoo Finance, only 26% of healthcare practitioners believe AI can be trusted, while 55% believe it is not yet ready for use.

Are benchmarks alone enough for the responsible use of AI in healthcare?

No, thorough testing and continuous auditing by vendors, customers, and researchers are essential to ensure the responsible deployment of AI models in healthcare.

Advait Gupta
Advait is our expert writer and manager for the Artificial Intelligence category. His passion for AI research and its advancements drives him to deliver in-depth articles that explore the frontiers of this rapidly evolving field. Advait's articles delve into the latest breakthroughs, trends, and ethical considerations, keeping readers at the forefront of AI knowledge.
