ChatGPT, an artificial intelligence chatbot developed by OpenAI, has been in the spotlight for quite some time now. The scientific community has shown a lot of interest in ChatGPT due to its ability to generate thought-provoking responses. However, some of its answers have also been a subject of debate. Now, researchers have unveiled another intriguing power of ChatGPT.


The latest version of ChatGPT can be used to obtain breast cancer screening advice, but it may sometimes provide inaccurate information, according to a study published April 4 in the journal Radiology.


The answers generated by ChatGPT were correct the vast majority of the time, but some of the information was fictitious, the study indicates.


How the study was conducted


As part of the study, researchers from the University of Maryland School of Medicine created a set of 25 questions seeking advice on breast cancer screening. Each question was submitted to ChatGPT three times, since the chatbot is known to vary its response each time a question is posed.


What kind of responses did ChatGPT give?


According to the study, three radiologists fellowship-trained in mammography evaluated the responses and found them appropriate for 22 of the 25 questions.


However, ChatGPT provided one answer based on outdated information. For two other questions, ChatGPT's responses were inconsistent, which means that the answers varied significantly each time the same question was posed.


In a statement released by the University of Maryland School of Medicine, Paul Yi, corresponding author on the paper, said the researchers found ChatGPT answered questions correctly about 88 per cent of the time, which is "pretty amazing". 


He also said that ChatGPT has the added benefit of summarising information into a form that consumers can easily digest and understand.


According to the study, the questions correctly answered by ChatGPT include those about the symptoms of breast cancer, who is at risk of developing the disease, and questions on the cost, age and frequency recommendations concerning mammograms.


What drawbacks of ChatGPT did the study find?


However, one drawback of ChatGPT is that its responses are not as comprehensive as what a person would typically find through a Google search. Hana Haver, the lead author on the paper, said ChatGPT provided only one set of recommendations on breast cancer screening, issued by the American Cancer Society, but did not mention the differing recommendations put out by the Centers for Disease Control and Prevention (CDC) or the US Preventive Services Task Force (USPSTF).


One of ChatGPT’s responses was outdated


The researchers deemed one of ChatGPT's responses inappropriate because the chatbot gave outdated advice about planning a mammogram around Covid-19 vaccination. In the United States, the advice to delay a mammogram for four to six weeks after getting a Covid-19 jab was changed in February 2022. Moreover, the CDC endorses the USPSTF guidelines, which do not recommend waiting.


ChatGPT gave inconsistent responses to questions concerning an individual's personal risk of getting breast cancer and on where someone could get a mammogram.


Yi said the researchers have seen in their experience that ChatGPT sometimes makes up fake journal articles or health consortiums to support its claims. Consumers should be aware that these are new, unproven technologies, he said, and should still rely on their doctor, rather than ChatGPT, for advice.


What’s next?


The researchers are now analysing how well ChatGPT works for lung cancer screening recommendations, and are identifying ways to make the chatbot's recommendations more accurate, more complete, and understandable to those without a high level of education.


Mark T Gladwin, Dean of the University of Maryland School of Medicine, said that with the rapid evolution of ChatGPT and other large language models, the medical community has a responsibility to evaluate these technologies and protect patients from potential harm that may come from incorrect screening recommendations or outdated preventive health strategies.