Artificial intelligence for medical evidence summarization: evaluating the use of large language models

In a recent study posted to the medRxiv preprint server, researchers systematically evaluated the capabilities and limitations of large language models (LLMs), specifically ChatGPT, for zero-shot medical evidence summarization.

Background

Text summarization research has relied on fine-tuned pre-trained models as the primary approach. However, these models often need large training datasets, which may not be accessible in certain domains, like medical literature.

Large language models (LLMs) have caused a shift in natural language processing (NLP) research due to their recent success in zero- and few-shot prompting.

Prompt-based models offer promise for medical evidence summarization because they can summarize by following human instructions, without any parameter updates. Yet, no research has been conducted on using such models to summarize medical evidence or on evaluating the resulting summaries.

About the study

In the present study, researchers evaluated the effectiveness of LLMs, such as ChatGPT and GPT-3.5, in summarizing medical evidence across six clinical domains, and systematically examined the capabilities and limitations of these models.

The study utilized Cochrane Reviews from the Cochrane Library and concentrated on six clinical areas: Alzheimer’s disease, esophageal cancer, kidney disease, skin disorders, neurological conditions, and heart failure. The team collected the ten most recent reviews published for these six domains.

Domain experts verified reviews to ensure they fulfilled important research objectives. The study focused on single-document summarization, specifically on the abstracts obtained from Cochrane Reviews.

The zero-shot performance concerning medical evidence summarization was evaluated using two models, GPT-3.5 and ChatGPT. Two experimental setups were designed to assess the models’ capabilities.

In the first setup, the model was given the complete abstract except for the Authors’ Conclusions section (ChatGPT-Abstract). In the second setup, two configurations, ChatGPT-MainResult and GPT3.5-MainResult, were given the Objectives and Main Results sections of the abstract as input.

The Main Results section was selected because it contains the key findings on benefits and harms. It also summarizes how the risk of bias affects trial design, conduct, and reporting.

The quality of the generated summaries was assessed against a reference summary using several automatic metrics, such as ROUGE-L, METEOR, and BLEU. Scores ranged from 0.0 to 1.0, where a score of 1.0 indicated that the generated summary matched the reference summary.
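To make the 0.0 to 1.0 scale concrete, here is a minimal sketch of a ROUGE-L style score, which rewards the longest common subsequence (LCS) of words shared by a generated summary and its reference. The whitespace tokenization, lowercasing, and balanced F1 weighting are simplifying assumptions made for illustration; the function names are hypothetical, and this is not the scoring implementation used in the study.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L style F1 between two strings (assumption: whitespace tokens)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens on the LCS
    recall = lcs / len(ref)      # share of reference tokens on the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the drug reduced mortality"
    print(rouge_l_f1("the drug reduced mortality", reference))
    print(rouge_l_f1("the drug reduced pain", reference))
```

An identical summary scores 1.0 and a summary sharing no words scores 0.0. A high score only shows word overlap with the reference; it says nothing about whether the summary is factually consistent.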

The model-generated summaries also underwent a thorough human evaluation to go beyond the limitations of automatic metrics. The evaluation defined summary quality along four dimensions: coherence, factual consistency, comprehensiveness, and harmfulness.

A 5-point Likert scale was used to evaluate each dimension. Participants were asked to explain in a text box corresponding to each dimension if the summary obtained a low score. Participants were asked to rate the quality of the summaries and share their most and least preferred ones, along with reasons for their decisions.

Results

All models performed similarly on the ROUGE-L, METEOR, and BLEU metrics. LLM-generated summaries contained fewer novel n-grams and tended to be more extractive than those written by humans.
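One common way to quantify how extractive a summary is, sketched below, is the share of its n-grams that never appear in the source text: the lower the share, the more the summary copies. The function names, the whitespace tokenization, and the default of bigrams are assumptions for illustration, not details taken from the study.

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(summary, source, n=2):
    """Fraction of summary n-grams that do not occur in the source."""
    summary_ngrams = ngrams(summary.lower().split(), n)
    if not summary_ngrams:
        return 0.0
    source_ngrams = set(ngrams(source.lower().split(), n))
    novel = sum(1 for gram in summary_ngrams if gram not in source_ngrams)
    return novel / len(summary_ngrams)


if __name__ == "__main__":
    source = "the drug reduced mortality in adults"
    print(novel_ngram_ratio("the drug reduced mortality", source))  # copied
    print(novel_ngram_ratio("the drug lowered mortality", source))  # reworded
```

A summary copied verbatim from the source has a ratio of 0.0, while one that rephrases everything approaches 1.0.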

ChatGPT-MainResult displayed greater abstraction than GPT3.5-MainResult and ChatGPT-Abstract; however, it still fell short of the human reference. Around 50% of the reviews were written in 2022 and 2023, a period that falls outside what GPT-3.5 and ChatGPT could have seen during training. No significant variations in quality metrics were noted between reviews published before and after 2022.

The ChatGPT-MainResult LLM configuration was the most preferred, producing the highest number of preferred summaries, surpassing the other two configurations by a significant margin.

ChatGPT-MainResult was the preferred option due to its ability to generate a thorough summary encompassing important details. The team noted that missing important information, fabricated content, and errors in interpretation were the main reasons certain summaries were deemed the least preferred option.

The study also showed that ChatGPT-MainResult was the most preferred option due to its minimal factual inconsistency errors and lack of harmful or misleading statements.

Conclusion

The study findings revealed that the three model settings, ChatGPT-Abstract, ChatGPT-MainResult, and GPT3.5-MainResult, produced comparable results when evaluated using automatic metrics. However, these metrics did not capture factual inconsistency, potential for medical harm, or human preference for LLM-generated summaries.

The researchers believe that human evaluation is crucial for assessing the accuracy and quality of medical evidence summaries produced by LLMs. However, there is a need for more efficient automatic evaluation methods in this area.

Does COVID-19 vaccination affect the menstrual cycle?

The onset of the coronavirus disease 2019 (COVID-19) pandemic led to two years of mounting waves of illness and death, affecting hundreds of millions of people around the world. Even after the outbreak’s severity subsided, the potential long-term sequelae of the infection or COVID-19 vaccination continue to be a matter of concern.

A new paper published in the journal Vaccine reports on the association of COVID-19 vaccination with menstrual cycle abnormalities.

Introduction

Thousands of social media posts and vaccine safety surveillance system reports have described disruption of the menstrual cycle following COVID-19 vaccination. Women have reported longer, heavier, or irregular periods, and some postmenopausal women have reported breakthrough bleeding.

This has led to many expressing concern about whether these vaccines compromise female reproductive health.

Biologically, it is quite plausible that the immune response evoked by a vaccine has a short-term effect on the hypothalamus and the linked pituitary-ovarian axis. This could explain how vaccination could theoretically affect the menstrual cycle.

Acute and temporary effects on menstruation have been reported with typhoid, hepatitis B, and human papillomavirus (HPV) vaccines in prior research.

The current study looks at major characteristics of the menstrual cycle in association with COVID-19 vaccination, including cycle length, regularity, duration of bleeding, intensity of bleeding, and period pain.

Earlier studies were prone to reporting bias, lacked a control group, did not adjust for confounding factors, failed to assess menstrual characteristics other than cycle length, or lacked sufficient follow-up.

The researchers used data from the Pregnancy Study Online (PRESTO) in the present study. This is a cohort of couples recruited to the survey online.

Participants, none of whom were on fertility treatment, were followed from before conception. The study period was from January 2021 to August 2022, and the cohort included couples from the USA or Canada.

The study contained approximately 1,100 couples between 21 and 45 years of age. Questionnaires assessed them at baseline, and every eight weeks after that, for up to 12 months. They were asked about COVID-19 vaccination as well as their menstrual cycle characteristics.

What did the study show?

Of the more than one thousand participants, about 14% sent in six follow-up questionnaires, while 65% conceived within the next year. Just over one in ten began fertility treatment, and 2% stopped attempts to conceive. The rest, about 9%, stopped follow-up.

None of the participants were COVID-19 vaccinated at the outset, but almost 40% took one or more doses during the study period. Most of them took the Moderna or Pfizer vaccines, at 32% and 61%, respectively.

Among the vaccinated, seven out of eight were vaccinated from February to May 2021. Compared to the unvaccinated group, vaccinated participants tended to be better educated, to have a higher income, and to be trying for their first baby.

After adjusting for sociodemographic, reproductive, and lifestyle factors, as well as any medical conditions, the researchers estimated differences in menstrual characteristics in relation to COVID-19 vaccination.

After adjustment, the first dose of the COVID-19 vaccine was associated with a lengthening of the next cycle by a mean of one day. The corresponding increase in the first cycle after the second dose was 1.3 days. Interestingly, the association was stronger from April 2021 to August 2022 than from January to March 2021.

By the second cycle following vaccination, these associations had weakened, indicating the effect to be temporary. Thus, long cycles became more prevalent after the first dose, from ~6% to 11%, but decreased in prevalence for the next cycle, at 7.3%.

There were no strong associations between the vaccination and menstrual cycle regularity, bleed intensity, duration of bleeding, or dysmenorrhea.

Irrespective of the vaccine brand, there was no significant change in the proportion of participants with irregular cycles (15%) after the first or second dose. There was no change even after adjusting for a history of infection with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

It must be remembered that these were couples trying to conceive, not on contraception, and many were successful. Thus, they could not be followed up for more than a few months after each vaccine dose.

In addition, older women were excluded from the study by design, and most participants were White college graduates.

What are the implications?

The current study shows no significant link between COVID-19 vaccination and menstrual function beyond a short delay of one day in the first cycle following each dose and an equally short-lived increase in the prevalence of long cycles. Both of these changes disappeared by the second cycle post-vaccination.

This temporary effect is probably due to immune system activation, mediated by cytokines that interfere with the hypothalamo-pituitary-ovarian (HPO) axis.

No association with fertility was observed, nor were any other menstrual cycle characteristics shown to undergo alteration in association with COVID-19 vaccination.

“Taken together, these results indicate that short-term changes in menstrual cycle characteristics likely do not translate into meaningful differences in fertility.”

Progesterone could be used in the fight against Parkinson’s disease

A study has shown that progesterone has a protective effect on the nerve cells of the intestine. These findings raise hopes that the hormone could be used in the fight against Parkinson’s disease.

The nerve cells of the gastrointestinal tract communicate with those of the brain and spinal cord. This suggests that the nervous system of the digestive tract could influence processes in the brain that lead to Parkinson’s. Paula Neufeld and Lennart Stegemann, medical doctoral students at the Department of Cytology at the Faculty of Medicine at Ruhr University Bochum, Germany, were the first to detect progesterone receptors in the nerve cells of the gastrointestinal tract and showed that progesterone protects the cells. Their findings open up perspectives for the development of novel neuroprotective therapeutic approaches to counteract diseases such as Parkinson’s and Alzheimer’s. The study was published in the journal Cells on April 21, 2023.

The second brain

The enteric nervous system (ENS) is a complex network that stretches along the entire gastrointestinal tract. It consists of about 100 million nerve cells, autonomously controls digestive processes and is often referred to as the second brain of humans. But its function is much more than digestion: recent research has shown that the ENS communicates closely with the central nervous system (CNS), i.e. the brain and spinal cord.

“The communication between the ENS and the CNS is currently associated with the pathogenesis of various neurological diseases such as Parkinson’s disease and Alzheimer’s disease, as well as depression.”

Professor Carsten Theiß, Head of the Department of Cytology at Ruhr University Bochum

The gut-brain axis is not a one-way street; both nervous systems influence each other.

A person’s diet has a direct impact on the intestinal microbiome, which in turn interacts with the ENS. Studies show that the composition of the microbiome can also affect the CNS via the gut-brain axis, especially via the vagus nerve, and promote diseases such as Parkinson’s disease. A balanced diet can therefore not only contribute to the preservation of nerve cells in the intestine, but may also delay Parkinson’s disease for many years or even prevent it entirely.

The protective effect of progesterone

Medical doctoral students Paula Neufeld and Lennart Stegemann have now successfully demonstrated a protective effect of the natural steroid hormone progesterone on the nerve cells of the ENS. In a series of experiments, the duo cultivated nerve cells from the ENS over several weeks and treated them with a cell toxin to simulate harmful conditions similar to Parkinson’s disease. They found that the nerve cells that were additionally treated with progesterone died significantly less frequently than the untreated cells.

Paula Neufeld points out the significance of their discovery: “Our research provides important insights to complete our basic knowledge about the role of progesterone receptors in the enteric nervous system. This opens up completely new avenues for studying the neuroprotective mechanisms of action of progesterone inside and outside the intestinal tract.” Lennart Stegemann adds that “this study could potentially pave the way for new steroid hormone-based therapeutic approaches. There is also hope that steroid-based therapeutic approaches could help to slow down or even stop neurodegenerative diseases.”

Cooperation partners

The paper is the result of collaboration and well-established translational research between the Department of Cytology headed by Professor Carsten Theiß at the Ruhr University Bochum Medical Campus and Professor Matthias Vorgerd, senior consultant at the Clinic for Neurology at the BG University Hospital Bergmannsheil in Bochum.

Journal reference:

Stegemann, L. N., et al. (2023) Progesterone: A Neuroprotective Steroid of the Intestine. Cells. doi.org/10.3390/cells12081206.
