Dr ChatGPT: The pros and cons of artificial intelligence in medical consultations
A study shows that the chatbot can provide more empathetic advice to patients, but experts warn that doctors must always have the final say
When a patient asks about the risk of dying after swallowing a toothpick, two answers are given. The first points out that, between two and six hours after ingestion, the toothpick has likely already passed into the intestines, and explains that many people swallow toothpicks without anything happening to them. But it also advises the patient to go to the emergency room if they are experiencing a “stomach ache.” The second answer is in a similar vein. It replies that, although it’s normal to worry, serious harm is unlikely to occur after swallowing a toothpick, as it is small and made of wood, which is not toxic or poisonous. However, if the patient has “abdominal pain, difficulty swallowing or vomiting,” they should see a doctor. “It’s understandable that you may be feeling paranoid, but try not to worry too much. It is highly unlikely that the toothpick will cause you any serious harm,” it adds.
The two answers say basically the same thing, but the way they do so is slightly different. The first one is more aseptic and concise, while the second is more empathetic and detailed. The first was written by a doctor; the second came from ChatGPT, the generative artificial intelligence (AI) tool that has revolutionized the planet. This experiment — part of a study published in the journal Jama Internal Medicine — was aimed at exploring the role AI assistants could play in medicine. It compared how real doctors and the chatbot responded to patient questions on an internet forum. The conclusions — based on an analysis by an external panel of health professionals who did not know who had answered what — found that ChatGPT’s responses were more empathetic and of higher quality than the real doctors’ in 79% of cases.
The explosion of new AI tools has opened a debate about their potential use in the field of health. ChatGPT, for example, is seeking to become a resource for health workers by helping them avoid bureaucratic tasks and develop medical procedures. On the street, it is already poised to replace the imprecise and often foolish Dr Google. Experts who spoke to EL PAÍS say that the technology has great potential, but that it is still in its infancy. Regulation on how it is applied in real medical practice still needs to be fine-tuned to address any ethical doubts, they say. The experts also point out that it is fallible and can make mistakes. For this reason, everything that comes out of the chatbot will require the final review of a health professional.
Paradoxically, the machine — not the human — is the most empathetic voice in the Jama Internal Medicine study. At least, in the written responses. Josep Munuera, head of the Diagnostic Imaging Service at Hospital Sant Pau in Barcelona, Spain, and an expert in digital technologies applied to health, warns that the concept of empathy is broader than what the study can analyze. Written communication is not the same as face-to-face communication, nor is raising a question on an online forum the same as doing so during a medical consultation. “When we talk about empathy, we are talking about many issues. At the moment, it is difficult to replace non-verbal language, which is very important when a doctor has to talk to a patient or their family,” he pointed out. But Munuera does admit that these generative tools have great potential when it comes to simplifying medical jargon. “In written communication, technical medical language can be complex and we may have difficulty translating it into understandable language. These algorithms probably find the equivalence between the technical word and another one and adapt it to the receiver.”
Joan Gibert, a bioinformatician and leading figure in the development of AI models at the Hospital del Mar in Barcelona, points out another variable when it comes to comparing the empathy of the doctor and the chatbot. “In the study, two concepts that enter into the equation are mixed: ChatGPT itself, which can be useful in certain scenarios and has the ability to concatenate words in a way that gives us the feeling that it is more empathetic, and burnout among doctors, the emotional exhaustion of caring for patients that leaves clinicians unable to be more empathetic,” he explained.
The danger of "hallucinations"
Nevertheless, as is the case with the famous Dr Google, it’s important to be careful with ChatGPT’s responses, regardless of how sensitive or kind they may seem. Experts highlight that the chatbot is not a doctor and can give incorrect answers. Unlike other algorithms, ChatGPT is generative. In other words, it creates information according to the databases that it has been trained on, but it can still invent some responses. “You always have to keep in mind that it is not an independent entity and cannot serve as a diagnostic tool without supervision,” Gibert insisted.
These chatbots can suffer from what experts call “hallucinations,” explained Gibert. “Depending on the situation, it could tell you something that is not true. The chatbot puts words together in a coherent way and, because it has a lot of information, it can be valuable. But it has to be reviewed since, if not, it can fuel fake news,” he said. Munuera also highlighted the importance of “knowing the database that has trained the algorithm, because if the databases are poor, the response will also be poor.”
Outside of the doctor’s office, the potential uses of ChatGPT in health are limited, since the information it provides can lead to errors. Jose Ibeas, a nephrologist at the Parc Taulí Hospital in Sabadell, Spain, and secretary of the Big Data and Artificial Intelligence Group of the Spanish Society of Nephrology, pointed out that it is “useful for the first layers of information, because it synthesizes information and helps, but when you enter a more specific area, in more complex pathologies, its usefulness is minimal or it’s wrong.”
“It is not an algorithm that helps resolve doubts,” added Munuera. “You have to understand that when you ask it to give you a differential diagnosis, it may invent a disease.” Similarly, the AI system can tell a patient that nothing is wrong when something is. This can lead to missed opportunities to see a doctor, because the patient follows the advice of the chatbot instead of speaking to a real professional.
Where experts see more potential for AI is as a support tool for health professionals. For example, it could help doctors answer patient messages, albeit under supervision. The Jama Internal Medicine study suggests that it would help “improve workflow” and patient outcomes: “If more patients’ questions are answered quickly, with empathy, and to a high standard, it might reduce unnecessary clinical visits, freeing up resources for those who need them,” the researchers said. “Moreover, messaging is a critical resource for fostering patient equity, where individuals who have mobility limitations, work irregular hours, or fear medical bills, are potentially more likely to turn to messaging.”
The scientific community is also studying the use of these tools for other repetitive tasks, such as filling out forms and reports. “Based on the premise that everything will always, always, always need to be reviewed by the doctor,” AI could help medical professionals complete repetitive but important bureaucratic tasks, said Gibert. This, in turn, would allow doctors to spend more time on other issues, such as patient care. An article published in The Lancet, for example, suggests that AI technology could help streamline discharge summaries. Researchers say automating this process could ease the work burden of doctors and even improve the quality of reports, but they are aware of the difficulties involved in training the algorithms, which requires large amounts of data, and of the risk of a “depersonalization of care,” which could lead to resistance to the technology.
Ibeas insists that, for any medical use, these tools must be “checked” and the division of responsibilities must be well established. “The systems will never decide. It must be the doctor who has the final sign-off,” he argued.
Ethical issues
Gibert also pointed out some ethical considerations that must be taken into account when including these tools in clinical practice: “You need this type of technology to be under a legal umbrella, for there to be integrated solutions within the hospital structure and to ensure that patient data is not used to retrain the model. And if someone wants to do the latter, they should do it within a project, with anonymized data, following all the controls and regulations. Sensitive patient information cannot be shared recklessly.”
The bioinformatician also argued that AI solutions, such as ChatGPT or models that help with diagnosis, introduce “biases” that can affect how doctors relate to patients. For example, these tools could condition a doctor’s decision, one way or another. “The fact that the professional has the result of an AI model changes the professional themselves. Their way of relating [to patients] may be very good, but it can introduce problems, especially for professionals who have less experience. That is why the process has to be done in parallel: until the professional gives the diagnosis, they cannot see what the AI says.”
A group of researchers from Stanford University also examined how AI tools can help to further humanize health care in an article in Jama Internal Medicine. “The practice of medicine is much more than just processing information and associating words with concepts; it is ascribing meaning to those concepts while connecting with patients as a trusted partner to build healthier lives,” they concluded. “We can hope that emerging AI systems may help tame laborious tasks that overwhelm modern medicine and empower physicians to return our focus to treating human patients.”
As we wait to see how this incipient technology grows and what repercussions it has for the public, Munuera argued: “You have to understand that [ChatGPT] is not a medical tool and there is no health professional who can confirm the veracity of the answer [the chatbot gives]. You have to be prudent and understand what the limits are.” In summary, Ibeas said: “The system is good, robust, positive and it is the future, but like any tool, you have to know how to use it so that it does not become a weapon.”