A new study published yesterday offered a sobering look at whether A.I. chatbots are actually any good at giving medical advice to the general public.
Researchers found that the chatbots were no better than Google at guiding users toward the correct diagnoses or helping them figure out what to do next. The technology also posed unique risks, sometimes presenting false information or dramatically changing their advice depending on slight changes in the wording of a question.
None of the models evaluated were “ready for deployment in direct patient care,” the researchers concluded in the first randomized study of its kind.
Since A.I. chatbots were made publicly available, health has become one of the most common topics users ask about. Major A.I. companies, including Amazon and OpenAI, have rolled out products specifically aimed at answering health questions.
Rushing out an A.I. product with potentially dangerous consequences is, apparently, one way to postpone the inevitable A.I. market crash. Granted, the models have passed medical licensing exams and outperformed doctors on challenging diagnostic problems. But there's a reason student doctors need clinical experience.
But A.I. is a classic case of GIGO: garbage in, garbage out. After all, it's just a machine that scrapes whatever data is available. Ask the wrong question, get the wrong answer.
A spokesperson for OpenAI said that today's versions of ChatGPT are significantly better at answering health questions than the version tested in the study, which has since been phased out. According to the company, internal data showed that the newer models were far less likely to make common types of mistakes, including hallucinations and errors in potentially urgent situations.
NOTE: The risks of using chatbots for mental health are well known by now. But that's another story.