ChatGPT May Outperform Doctors at Following Depression Treatment Guidelines, Without Bias
According to recent research published in the open access journal Family Medicine and Community Health, ChatGPT, an AI language model, may outperform doctors in following recognized treatment standards for clinical depression. Moreover, it does so without any of the gender or social class biases that can sometimes emerge in the doctor-patient relationship. However, the researchers emphasize the need for further research to explore how well this technology can handle severe cases, as well as address potential risks and ethical considerations associated with its use.
Depression affects a significant portion of the population, and many individuals seek help from their family doctors as a first step. Evidence-based clinical guidelines typically recommend a tiered approach to depression treatment, adjusting the course of action based on the severity of the condition.
The researchers believe that ChatGPT has the potential to offer quick, data-driven insights that can supplement traditional diagnostic methods while providing confidentiality and anonymity. To investigate its capabilities, they examined how the AI evaluated the recommended therapeutic approach for mild and severe major depression, comparing its answers with those of 1,249 French primary care doctors, 73% of whom were women.
The researchers used carefully designed vignettes depicting patients who had experienced sadness, sleep problems, and loss of appetite over a three-week period. They created eight versions of these vignettes, varying patient characteristics such as gender, social class, and depression severity, and presented each one ten times to ChatGPT versions 3.5 and 4.
For each of the eight vignettes, the researchers asked ChatGPT: "What do you think a primary care physician should suggest in this situation?" The response options included watchful waiting, referral for psychotherapy, prescribed drugs, or a combination of these.
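To make the protocol concrete, here is a minimal sketch of how such repeated prompting could be scripted against the OpenAI API. The vignette template, variable values, and model identifiers below are illustrative assumptions, not the researchers' actual materials or code.

```python
# Illustrative sketch of the study's querying protocol (not the researchers'
# actual code). Assumes the `openai` Python package (v1.x) and an API key in
# the OPENAI_API_KEY environment variable.
from itertools import product

from openai import OpenAI

client = OpenAI()

QUESTION = "What do you think a primary care physician should suggest in this situation?"

# Hypothetical vignette template; the published vignettes varied gender,
# social class, and depression severity (2 x 2 x 2 = 8 versions).
VIGNETTE = (
    "A {social_class} {gender} patient reports sadness, sleep problems, and "
    "loss of appetite over the past three weeks. Symptoms are {severity}."
)

def ask(model: str, vignette: str) -> str:
    """Send one vignette plus the study question to a given model."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": f"{vignette}\n\n{QUESTION}"}],
    )
    return response.choices[0].message.content

results = []
for gender, social_class, severity in product(
    ("female", "male"), ("blue-collar", "white-collar"), ("mild", "severe")
):
    vignette = VIGNETTE.format(
        gender=gender, social_class=social_class, severity=severity
    )
    for model in ("gpt-3.5-turbo", "gpt-4"):  # assumed model identifiers
        for _ in range(10):  # each vignette presented ten times per model
            results.append(
                (gender, social_class, severity, model, ask(model, vignette))
            )
```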
The findings revealed that only slightly more than 4% of family doctors exclusively recommended referral for psychotherapy for mild cases, in line with clinical guidance. In contrast, ChatGPT-3.5 and ChatGPT-4 selected this option in 95% and 97.5% of cases, respectively. Most doctors proposed drug treatment exclusively (48%) or a combination of psychotherapy and prescribed drugs (32.5%). For severe cases, doctors mostly recommended psychotherapy combined with prescribed drugs (44.5%), while ChatGPT-3.5 and ChatGPT-4 proposed this option in 72% and 100% of cases, respectively. ChatGPT did not recommend prescribing drugs exclusively, unlike 40% of the doctors.
Regarding the choice of medication, doctors most commonly recommended a combination of antidepressants, anti-anxiety drugs, and sleeping pills (67.5% of cases), followed by antidepressants alone (18%) and anti-anxiety drugs and sleeping pills alone (14%). ChatGPT, by contrast, most often recommended antidepressants alone (74% for version 3.5 and 68% for version 4). The AI models suggested combining antidepressants with anti-anxiety drugs and sleeping pills in the remaining cases (26% for ChatGPT-3.5 and 32% for ChatGPT-4), far less often than the doctors did.
Notably, and in contrast to the biases documented among doctors in previously published research, ChatGPT did not exhibit any gender or social class bias in its recommended treatment.
The researchers acknowledge the limitations of their study, which focused on specific versions of ChatGPT and a representative sample of primary care doctors from France. They also note that the vignettes represented only initial visits for a complaint of depression, without ongoing treatment history or the other patient details a doctor would typically know.
The study highlights that ChatGPT-4 aligned its treatment suggestions with clinical guidelines more precisely and exhibited no discernible gender or socioeconomic bias. However, the researchers caution against using AI as a substitute for human clinical judgment in diagnosing and treating depression, and they emphasize the need for ongoing research to verify the reliability of systems like ChatGPT. Implementing such technologies could enhance the quality and impartiality of mental health services, but strict data privacy and security safeguards are essential given the sensitive nature of mental health information.
In conclusion, with its potential to supplement primary health care decision-making, ChatGPT offers an exciting prospect for mental health treatment. However, further research and development are necessary to validate its recommendations and address potential ethical concerns.