Searching and Summarizing PubMed with LLMs

What if an AI could perform a literature review for you? Here I illustrate how Large Language Models (LLMs) and related technologies can be used for search and summarization. We use OpenAI's gpt-3.5-turbo (ChatGPT) and gpt-4 models, SentenceTransformers sentence embeddings, and the PubMed API to build a first-pass literature summarizer that 1) designs an effective search query given a user question, 2) retrieves PubMed abstracts and indexes them with embeddings, 3) selects the top-matching abstracts and summarizes each one in light of the original question, and 4) summarizes the summaries, including references.
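As a rough illustration of steps 2 and 3, here is a minimal sketch rather than the code from the full post. It builds a PubMed E-utilities search URL (without sending a request) and ranks abstracts by cosine similarity to the question. The function names, the toy 3-dimensional vectors, and the placeholder ids are assumptions made for this example; in the real pipeline the vectors would come from a SentenceTransformers model and the ids would be PubMed ids.

```python
import math
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(query, retmax=50):
    """Build a PubMed search URL; nothing is fetched here."""
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, abstract_vecs, k=5):
    """Return the ids of the k abstracts most similar to the query."""
    ranked = sorted(
        abstract_vecs,
        key=lambda doc_id: cosine(query_vec, abstract_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for real sentence embeddings (hypothetical values).
query_vec = [1.0, 0.0, 0.0]
abstract_vecs = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.5, 0.5, 0.0],
}

print(esearch_url("long covid children symptoms", retmax=20))
print(top_k(query_vec, abstract_vecs, k=2))  # ['A', 'C']
```

The selected abstracts would then be passed to the LLM one at a time, together with the original question, for the per-abstract summaries in step 3.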

TL;DR: here is the summary generated for the question "What are the most common symptoms of Long-COVID in children?":

Long COVID is a condition that affects some individuals who have recovered from COVID-19, and it can cause persistent symptoms for several months after the initial infection. Several studies have investigated the prevalence and symptoms of Long COVID in children and adolescents. According to a retrospective study in a pediatric cohort, 17.6% of children and adolescents with COVID-19 developed long COVID, with respiratory symptoms being more common in the first weeks and neuropsychiatric symptoms developing over time. Chronic conditions and obesity were identified as risk factors, and adolescents were found to be at a greater risk for long COVID (Baptista de Lima et al., 2023). Another observational study conducted in Argentina found that one-third of children and adolescents with previously confirmed COVID-19 experienced persistent symptoms for more than three months after the diagnosis. The most common symptoms reported were headache, dizziness, loss of taste, dyspnea, cough, fatigue, muscle pain, and loss of weight, with the loss of smell only reported in infected children. Older age, symptomatic COVID-19, and comorbidities were identified as independent predictors of long-term symptoms (Seery et al., 2023). A prospective study conducted from July 2020 to December 2021, which included 215 children aged 0-18 years who tested positive for SARS-CoV-2, found that 32.6% of the children had persistent symptoms at 2 months, 9.3% at 4 months, and 2.3% at 6 months, including dyspnea, dry cough, fatigue, and runny nose (Jamaica Balderas et al., 2023). A case-control study of 274 children found that prolonged non-neuropsychiatric symptoms were more frequent in the case group, with the most common long COVID symptom in children being abdominal pain (Ahn et al., 2023). A recent study published in JAMIA Open aimed to identify conditions and symptoms associated with pediatric Long-COVID (PASC) using a data mining approach. 
The study found significant enrichment among children with PASC in cardiac, respiratory, neurologic, psychological, endocrine, gastrointestinal, and musculoskeletal systems, with the most significant symptoms related to circulatory and respiratory systems such as dyspnea, difficulty breathing, and fatigue and malaise (Lorman et al., 2023). However, none of the studies provided a comprehensive list of the most common symptoms of Long-COVID in children.

Authors, years, titles, and URLs:

  • Ahn et al. 2023: Non-neuropsychiatric Long COVID Symptoms in Children Visiting a Pediatric Infectious Disease Clinic After an Omicron Surge. https://pubmed.ncbi.nlm.nih.gov/36795575/
  • Baptista de Lima et al. 2023: Long COVID in Children and Adolescents: A Retrospective Study in a Pediatric Cohort. https://pubmed.ncbi.nlm.nih.gov/36728643
  • Jamaica Balderas et al. 2023: Long COVID in children and adolescents: COVID-19 follow-up results in third-level pediatric hospital. https://pubmed.ncbi.nlm.nih.gov/36793333
  • Lorman et al. 2023: Understanding pediatric long COVID using a tree-based scan statistic approach: an EHR-based cohort study from the RECOVER Program. https://pubmed.ncbi.nlm.nih.gov/36926600
  • Seery et al. 2023: Persistent symptoms after COVID-19 in children and adolescents from Argentina. https://pubmed.ncbi.nlm.nih.gov/36736574

We also have some fun visualizing articles in 'semantic space':

Graph visualization of PubMed articles, connected by embedding cosine distance and colored by PageRank-weighted similarity to the query.
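One possible reading of that caption, sketched under stated assumptions: connect two articles when the cosine similarity of their embeddings exceeds a threshold, run PageRank on the resulting weighted graph, and color each node by its PageRank multiplied by its similarity to the query. The threshold, the damping factor, the scoring formula, and the toy 2-dimensional vectors are all assumptions for illustration and may differ from what the full post does.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(vecs, threshold=0.5):
    """Undirected weighted graph: edge i-j when cosine similarity >= threshold."""
    ids = list(vecs)
    adj = {i: {} for i in ids}
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            s = cosine(vecs[ids[a]], vecs[ids[b]])
            if s >= threshold:
                adj[ids[a]][ids[b]] = s
                adj[ids[b]][ids[a]] = s
    return adj


def pagerank(adj, damping=0.85, iters=100):
    """Weighted PageRank by power iteration; ranks sum to 1."""
    n = len(adj)
    rank = {i: 1.0 / n for i in adj}
    for _ in range(iters):
        new = {i: (1.0 - damping) / n for i in adj}
        for i, nbrs in adj.items():
            total = sum(nbrs.values())
            if total == 0:
                # Isolated node: spread its rank evenly over all nodes.
                for j in adj:
                    new[j] += damping * rank[i] / n
            else:
                for j, w in nbrs.items():
                    new[j] += damping * rank[i] * w / total
        rank = new
    return rank


# Toy vectors standing in for article embeddings (hypothetical values).
vecs = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.8, 0.2], "D": [0.0, 1.0]}
query_vec = [1.0, 0.0]

adj = similarity_graph(vecs, threshold=0.5)
ranks = pagerank(adj)
# Assumed node-color score: PageRank times similarity to the query.
scores = {i: ranks[i] * cosine(query_vec, vecs[i]) for i in vecs}
print(scores)
```

In this toy graph, article D is dissimilar to the rest, so it ends up isolated with the lowest PageRank, and its score is zero because it is orthogonal to the query.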

Read the full post here.
