Safety, Pharmacokinetics, Biomarker Response, and Efficacy of E6742, a Dual Antagonist of Toll-Like Receptors 7 and 8, in a First-in-Patient, Randomized, Double-Blind, Phase 1 …

medRxiv

Published On 2024

Objectives To evaluate the safety, tolerability, pharmacokinetics (PK), biomarker response, and efficacy of E6742 in a phase 1/2 study in patients with systemic lupus erythematosus (SLE). Methods Two sequential cohorts of SLE patients were enrolled and randomized to 12 weeks of twice-daily treatment with E6742 (100 or 200 mg; n = 8 or 9) or placebo (n = 9). Results The proportion of patients with any treatment-emergent adverse events (TEAEs) was 58.8% in the E6742 group (37.5% for 100 mg; 77.8% for 200 mg) and 66.7% in the placebo group. No Common Terminology Criteria for Adverse Events ≥ Grade 3 TEAEs occurred. PK parameter levels were similar between SLE patients and healthy adults in previous phase 1 studies. The interferon gene signature (IGS) and levels of proinflammatory cytokines (interleukin-1β, interleukin-6, tumor necrosis factor-α) after ex-vivo challenge with a Toll-like receptor 7/8 agonist were immediately decreased by E6742 treatment. Dose-dependent improvements in the British Isles Lupus Assessment Group-based Composite Lupus Assessment response were observed at Week 12 in the E6742 (37.5% for 100 mg; 57.1% for 200 mg) and placebo (33.3%) groups. E6742 also had therapeutic effects on other symptoms, including skin inflammation, arthritis, and levels of anti-double-stranded DNA antibodies and complements. Conclusions E6742 had a favorable safety profile and was well tolerated, with marked IGS responses and sufficient efficacy signals in patients with SLE. These results provide the first clinical evidence to support E6742 in the treatment of SLE, and support larger, longer-term clinical trials.

Journal

medRxiv

Published On

2024

Page

2024.04.26.24306410

Authors

Shizuo Akira

Osaka University

H-Index(all)

303

H-Index(since 2020)

143

I-10 Index(all)

0

I-10 Index(since 2020)

0

Citation(all)

0

Citation(since 2020)

0

Cited By

0

Research Interests

immunology

Other Articles from authors

Shizuo Akira

Osaka University

Nature Communications

The IL-33/ST2 axis is protective against acute inflammation during the course of periodontitis

Periodontitis, which is induced by repeated bacterial invasion and the ensuing immune reactions, is the leading cause of tooth loss. Periodontal tissue is comprised of four different components, each with a potential role in pathogenesis; however, most studies on immune responses focus on gingival tissue. Here, we present a modified ligature-induced periodontitis model in male mice to analyze the pathogenesis, which captures the complexity of periodontal tissue. We find that the inflammatory response in the peri-root tissues and the expression of IL-6 and RANKL by Thy-1.2− fibroblasts/stromal cells are prominent throughout the bone destruction phase, and present already at an early stage. The initiation phase is characterized by high levels of ST2 (encoded by Il1rl1) expression in the peri-root tissue, suggesting that the IL-33/ST2 axis is involved in the pathogenesis. Both Il1rl1- and Il33-deficient mice …

Shizuo Akira

Osaka University

Rheumatology

Platelet TLR7 is essential for the formation of platelet–neutrophil complexes and low-density neutrophils in lupus nephritis

Objectives Platelets and low-density neutrophils (LDNs) are major players in the immunopathogenesis of SLE. Despite evidence showing the importance of platelet–neutrophil complexes (PNCs) in inflammation, little is known about the relationship between LDNs and platelets in SLE. We sought to characterize the role of LDNs and Toll-like receptor 7 (TLR7) in clinical disease. Methods Flow cytometry was used to immunophenotype LDNs from SLE patients and controls. The association of LDNs with organ damage was investigated in a cohort of 290 SLE patients. TLR7 mRNA expression was assessed in LDNs and high-density neutrophils (HDNs) using publicly available mRNA sequencing datasets and our own cohort using RT-PCR. The role of TLR7 in platelet binding was evaluated in platelet–HDN mixing studies using TLR7-deficient mice and Klinefelter syndrome …

Shizuo Akira

Osaka University

Nutrients

IL-33 Reduces Saturated Fatty Acid Accumulation in Mouse Atherosclerotic Foci

The cellular and molecular mechanisms of atherosclerosis are still unclear. Type 2 innate lymphocytes (ILC2) exhibit anti-inflammatory properties and protect against atherosclerosis. This study aimed to elucidate the pathogenesis of atherosclerosis development using atherosclerosis model mice (ApoE KO mice) and mice deficient in IL-33 receptor ST2 (ApoEST2 DKO mice). Sixteen-week-old male ApoE KO and ApoEST2 DKO mice were subjected to an 8-week regimen of a high-fat, high-sucrose diet. Atherosclerotic foci were assessed histologically at the aortic valve ring. Chronic inflammation was assessed using flow cytometry and real-time polymerase chain reaction. In addition, saturated fatty acids (palmitic acid) and IL-33 were administered to human aortic endothelial cells (HAECs) to assess fatty acid metabolism. ApoEST2 DKO mice with attenuated ILC2 had significantly worse atherosclerosis than ApoE KO mice. The levels of saturated fatty acids, including palmitic acid, were significantly elevated in the arteries and serum of ApoEST2 DKO mice. Furthermore, on treating HAECs with saturated fatty acids with or without IL-33, the Oil Red O staining area significantly decreased in the IL-33-treated group compared to that in the non-treated group. IL-33 potentially prevented the accumulation of saturated fatty acids within atherosclerotic foci.

Shizuo Akira

Osaka University

International Immunology

TAK1-binding proteins (TAB) 2 and TAB3 are redundantly required for TLR-induced cytokine production in macrophages

Transforming growth factor-β-activated kinase 1 (TAK1) plays a pivotal role in innate and adaptive immunity. TAK1 is essential for the activation of mitogen-activated protein kinases (MAPKs) and nuclear factor (NF)-κB pathways downstream of diverse immune receptors, including Toll-like receptors (TLRs). Upon stimulation with TLR ligands, TAK1 is activated via recruitment to lysine 63-linked polyubiquitin chains through TAK1-binding proteins (TAB) 2 and TAB3. However, the physiological importance of TAB2 and TAB3 in macrophages is still controversial. A previous study has shown that mouse bone marrow-derived macrophages (BMDMs) isolated from mice double deficient for TAB2 and TAB3 produced tumor necrosis factor (TNF)-α and interleukin (IL)-6 at levels similar to those of control wild-type BMDMs in response to TLR ligands such as lipopolysaccharide (LPS) or Pam3CSK4, indicating that TAB2 and …

Shizuo Akira

Osaka University

Nature

CGRP sensory neurons promote tissue healing via neutrophils and macrophages

The immune system has a critical role in orchestrating tissue healing. As a result, regenerative strategies that control immune components have proved effective 1, 2. This is particularly relevant when immune dysregulation that results from conditions such as diabetes or advanced age impairs tissue healing following injury 2, 3. Nociceptive sensory neurons have a crucial role as immunoregulators and exert both protective and harmful effects depending on the context 4, 5, 6, 7, 8, 9, 10, 11, 12. However, how neuro–immune interactions affect tissue repair and regeneration following acute injury is unclear. Here we show that ablation of the NaV1.8 nociceptor impairs skin wound repair and muscle regeneration after acute tissue injury. Nociceptor endings grow into injured skin and muscle tissues and signal to immune cells through the neuropeptide calcitonin gene-related peptide (CGRP) during the healing process …

Shizuo Akira

Osaka University

Journal of Experimental & Clinical Cancer Research

Regnase-1 downregulation promotes pancreatic cancer through myeloid-derived suppressor cell-mediated evasion of anticancer immunity

Background Pancreatitis is known to be an important risk factor for pancreatic ductal adenocarcinoma (PDAC). However, the exact molecular mechanisms of how inflammation promotes PDAC are still not fully understood. Regnase-1, an endoribonuclease, regulates immune responses by degrading mRNAs of inflammation-related genes. Herein, we investigated the role of Regnase-1 in PDAC. Methods Clinical significance of intratumor Regnase-1 expression was evaluated by immunohistochemistry in 39 surgically-resected PDAC patients. The functional role of Regnase-1 was investigated by pancreas-specific Regnase-1 knockout mice and Kras-mutant Regnase-1 knockout mice. The mechanistic studies with gene silencing, RNA immunoprecipitation sequencing (RIP-seq) and immune cell reconstitution were performed in human/mouse PDAC cell lines and a syngeneic orthotopic tumor transplantation model of …

Shizuo Akira

Osaka University

Cardiovascular Research

The microsomal prostaglandin E synthase-1/prostaglandin E2 axis induces recovery from ischaemia via recruitment of regulatory T cells

Aims Microsomal prostaglandin E synthase-1 (mPGES-1)/prostaglandin E2 (PGE2) induces angiogenesis through the prostaglandin E2 receptor (EP1–4). Among immune cells, regulatory T cells (Tregs), which inhibit immune responses, have been implicated in angiogenesis, and PGE2 is known to modulate the function and differentiation of Tregs. We hypothesized that mPGES-1/PGE2-EP signalling could contribute to recovery from ischaemic conditions by promoting the accumulation of Tregs. Methods and results Wild-type (WT), mPGES-1-deficient (mPges-1−/−), and EP4 receptor-deficient (Ep4−/−) male mice, 6–8 weeks old, were used. Hindlimb ischaemia was induced by femoral artery ligation. Recovery from ischaemia was suppressed in mPges-1−/− mice compared with WT mice. The number of accumulated forkhead box protein P3 (FoxP3)+ cells in ischaemic …

Shizuo Akira

Osaka University

Immunity

Expression of the readthrough transcript CiDRE in alveolar macrophages boosts SARS-CoV-2 susceptibility and promotes COVID-19 severity

Lung infection during severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) via the angiotensin-I-converting enzyme 2 (ACE2) receptor induces a cytokine storm. However, the precise mechanisms involved in severe COVID-19 pneumonia are unknown. Here, we showed that interleukin-10 (IL-10) induced the expression of ACE2 in normal alveolar macrophages, causing them to become vectors for SARS-CoV-2. The inhibition of this system in hamster models attenuated SARS-CoV-2 pathogenicity. Genome-wide association and quantitative trait locus analyses identified an IFNAR2-IL10RB readthrough transcript, COVID-19 infectivity-enhancing dual receptor (CiDRE), which was highly expressed in patients harboring COVID-19 risk variants at the IFNAR2 locus. We showed that CiDRE exerted synergistic effects via the IL-10-ACE2 axis in alveolar macrophages and functioned as a decoy receptor for type I …

Shizuo Akira

Osaka University

Journal of Clinical Microbiology

MGIT-seq for the Identification of Nontuberculous Mycobacteria and Drug Resistance: a Prospective Study

Because nontuberculous mycobacterial pulmonary disease is a considerable health burden, a simple and clinically applicable analytical protocol enabling the identification of subspecies and drug-resistant disease is required to determine the treatment strategy. We aimed to develop a simplified workflow consisting only of direct sequencing of mycobacterial growth indicator tube cultures (MGIT-seq). In total, 138 patients were prospectively enrolled between April 2021 and May 2022, and culture-positive MGIT broths were subjected to sequencing using MinION, a portable next-generation sequencer. Sequence analysis was conducted to identify species using core genome multilocus sequence typing and to predict macrolide and amikacin (AMK) resistance based on previously reported mutations in rrl, rrs, and erm(41). The results were compared to clinical tests for species identification and drug susceptibility. A …

Shizuo Akira

Osaka University

Journal of Experimental Medicine

TLR7/8 stress response drives histiocytosis in SLC29A3 disorders

Loss-of-function mutations in the lysosomal nucleoside transporter SLC29A3 cause lysosomal nucleoside storage and histiocytosis: phagocyte accumulation in multiple organs. However, little is known about the mechanism by which lysosomal nucleoside storage drives histiocytosis. Herein, histiocytosis in Slc29a3−/− mice was shown to depend on Toll-like receptor 7 (TLR7), which senses a combination of nucleosides and oligoribonucleotides (ORNs). TLR7 increased phagocyte numbers by driving the proliferation of Ly6Chi immature monocytes and their maturation into Ly6Clow phagocytes in Slc29a3−/− mice. Downstream of TLR7, FcRγ and DAP10 were required for monocyte proliferation. Histiocytosis is accompanied by inflammation in SLC29A3 disorders. However, TLR7 in nucleoside-laden splenic monocytes failed to activate inflammatory responses. Enhanced production of proinflammatory cytokines was …

Shizuo Akira

Osaka University

International Immunopharmacology

TLR4 agonist activity of Alcaligenes lipid A utilizes MyD88 and TRIF signaling pathways for efficient antigen presentation and T cell differentiation by dendritic cells

Alcaligenes faecalis was previously identified as an intestinal lymphoid tissue-resident commensal bacterium, and our subsequent studies showed that its lipopolysaccharide and its core active element (i.e., lipid A) have a potent adjuvant activity that preferentially promotes antigen-specific Th17 response and antibody production. Here, we compared A. faecalis lipid A (ALA) with monophosphoryl lipid A, a licensed lipid A–based adjuvant, to elucidate the immunological mechanism underlying the adjuvant properties of ALA. Compared with monophosphoryl lipid A, ALA induced higher levels of MHC class II molecules and costimulatory CD40, CD80, and CD86 on dendritic cells (DCs), which in turn resulted in strong T cell activation. Moreover, ALA more effectively promoted the production of IL-6 and IL-23 from DCs than did monophosphoryl lipid A, thus leading to preferential induction of Th17 and Th1 cells. As underlying …

Shizuo Akira

Osaka University

The EMBO Journal

TDAG51 promotes transcription factor FoxO1 activity during LPS‐induced inflammatory responses

Tight regulation of Toll‐like receptor (TLR)‐mediated inflammatory responses is important for innate immunity. Here, we show that T‐cell death‐associated gene 51 (TDAG51/PHLDA1) is a novel regulator of the transcription factor FoxO1, regulating inflammatory mediator production in the lipopolysaccharide (LPS)‐induced inflammatory response. TDAG51 induction by LPS stimulation was mediated by the TLR2/4 signaling pathway in bone marrow‐derived macrophages (BMMs). LPS‐induced inflammatory mediator production was significantly decreased in TDAG51‐deficient BMMs. In TDAG51‐deficient mice, LPS‐ or pathogenic Escherichia coli infection‐induced lethal shock was reduced by decreasing serum proinflammatory cytokine levels. The recruitment of 14‐3‐3ζ to FoxO1 was competitively inhibited by the TDAG51‐FoxO1 interaction, leading to blockade of FoxO1 cytoplasmic translocation and thereby …

Shizuo Akira

Osaka University

Spatiotemporal analysis of periodontitis—the unique role of the IL-33/ST2 axis

Periodontitis, which is induced by repeated bacterial invasion and the ensuing immune reactions, is the leading cause of tooth loss. However, studies on immune responses are usually based on gingival tissue, although periodontal tissue is actually comprised of four different components. Here, we have developed a novel model to analyze the pathogenesis of periodontitis with different periodontal tissue components. We have found that the inflammatory response in the peri-root tissues and the expression of interleukin (IL)-6 and receptor activator of nuclear factor κ-B ligand (RANKL) by Thy-1.2− fibroblasts/stromal cells were prominent during the course of bone destruction. Furthermore, a comprehensive analysis of the initiation phase confirmed these findings while revealing a high level of expression of ST2 (also known as IL33R and encoded by Il1rl1) in the peri-root tissue. Il1rl1- and Il33-deficient mice exhibited exacerbated bone loss in the acute phase of periodontitis, demonstrating the protective role of the IL-33/ST2 axis in acute inflammation. Thus, the findings obtained with this novel model show the crucial role of the peri-root tissue and advance our understanding of the etiology of periodontitis.

Shizuo Akira

Osaka University

Genes to Cells

Breaking self‐regulation of Regnase‐1 promotes its own protein expression

The RNA‐binding protein (RBP) Regnase‐1 is an endonuclease that regulates immune responses by modulating target mRNA stability. Regnase‐1 degrades a group of inflammation‐associated mRNAs, which contributes to a balanced immune response and helps prevent autoimmune diseases. Regnase‐1 also cleaves its own mRNA by binding stem‐loop (SL) RNA structures in its 3′UTR. To understand how this autoregulation is important for immune responses, we generated mice with a 2‐bp genome deletion in the target SL of the Regnase‐1 3′‐untranslated region (3′UTR). Deletion of these nucleotides inhibited SL formation and limited Regnase‐1‐mediated mRNA degradation. Mutant mice had normal hematopoietic cell differentiation. Biochemically, mutation of the 3′UTR SL increased Regnase‐1 mRNA stability and enhanced both Regnase‐1 mRNA and protein levels in mouse embryonic …

Shizuo Akira

Osaka University

The EMBO Journal

Secretion of mitochondrial DNA via exosomes promotes inflammation in Behçet's syndrome

Mitochondrial DNA (mtDNA) leakage into the cytoplasm can occur when cells are exposed to noxious stimuli. Specific sensors recognize cytoplasmic mtDNA to promote cytokine production. Cytoplasmic mtDNA can also be secreted extracellularly, leading to sterile inflammation. However, the mode of secretion of mtDNA out of cells upon noxious stimuli and its relevance to human disease remain unclear. Here, we show that pyroptotic cells secrete mtDNA encapsulated within exosomes. Activation of caspase‐1 leads to mtDNA leakage from the mitochondria into the cytoplasm via gasdermin‐D. Caspase‐1 also induces intraluminal membrane vesicle formation, allowing for cellular mtDNA to be taken up and secreted as exosomes. Encapsulation of mtDNA within exosomes promotes a strong inflammatory response that is ameliorated upon exosome biosynthesis inhibition in vivo. We further show that monocytes …

2023/10/16

Shizuo Akira

Osaka University

Hepatology

Interleukin‐33 facilitates liver regeneration through serotonin‐involved gut‐liver axis

Conclusions: Our study identified that IL‐33 is pro‐regenerative in a noninjurious model of liver resection. The underlying mechanism involved IL‐33/ST2‐induced increase of serotonin release from enterochromaffin cells to portal blood and subsequent HTR2A/p70S6K activation in hepatocytes by serotonin. The findings implicate the potential of targeting the IL‐33/ST2/serotonin pathway to reduce the risk of post‐hepatectomy liver failure and small‐for‐size syndrome.

Shizuo Akira

Osaka University

Journal of Investigative Dermatology

436 Keratinocyte Regnase-1, a down-modulator of skin inflammation, contributes to protection from carcinogenesis through regulating COX-2

The relationship between chronic inflammation and carcinogenesis has been controversial, despite known clinical evidence such as the development of squamous cell carcinoma (SCC) from chronic skin ulcers. We previously demonstrated that the ribonuclease Regnase-1 (Reg1) in keratinocytes acts as a down-modulator of skin inflammation in a cross-regulatory fashion with proinflammatory signals. Here, we addressed whether Reg1 has a protective role in skin carcinogenesis. Under the two-stage carcinogenesis protocol, keratinocyte-specific Reg1 KO mice (K5.Cre+ Reg1 fl/fl: Reg1-cKO) developed 10 or more papillomas and several SCCs per head, while almost no tumors emerged in controls (Reg1 fl/fl). Upon repeated exposure to ultraviolet B irradiation, Reg1-cKO mice showed marked acanthosis and atypical epidermis resembling solar keratosis much earlier than controls. These data …

Shizuo Akira

Osaka University

CiDRE+ M2c macrophages hijacked by SARS-CoV-2 cause COVID-19 severity

Infection of the lungs with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) via the angiotensin I converting enzyme 2 (ACE2) receptor induces a type of systemic inflammation known as a cytokine storm. However, the precise mechanisms involved in severe coronavirus disease 2019 (COVID-19) pneumonia are unknown. Here, we show that interleukin-10 (IL-10) changed normal alveolar macrophages into ACE2-expressing M2c-type macrophages that functioned as spreading vectors for SARS-CoV-2 infection. The depletion of alveolar macrophages and blockade of IL-10 attenuated SARS-CoV-2 pathogenicity. Furthermore, genome-wide association and quantitative trait locus analyses identified novel mRNA transcripts in human patients, COVID-19 infectivity enhancing dual receptor (CiDRE), which has unique synergistic effects within the IL-10-ACE2 system in M2c-type macrophages. Our results demonstrate that alveolar macrophages stimulated by IL-10 are key players in severe COVID-19. Collectively, CiDRE expression levels are potential risk factors that predict COVID-19 severity, and CiDRE inhibitors might be useful as COVID-19 therapies.

Other articles from medRxiv journal

Etienne Vachon-Presseau

McGill University

medRxiv

A Biomarker-Based Framework for the Prediction of Future Chronic Pain

Chronic pain is a multifactorial condition presenting significant diagnostic and prognostic challenges. Biomarkers for the classification and the prediction of chronic pain are therefore critically needed. In this multi-dataset study of over 523,000 participants, we applied machine learning to multi-dimensional biological data from the UK Biobank to identify biomarkers for 35 medical conditions associated with pain (e.g., clinical diagnosis of rheumatoid arthritis, fibromyalgia, stroke, gout, etc.) or self-reported chronic pain (e.g., back pain, knee pain, etc.). Biomarkers derived from blood immunoassays, brain and bone imaging, and genetics were effective in predicting medical conditions associated with chronic pain (area under the curve (AUC) 0.62-0.87) but not self-reported pain (AUC 0.50-0.62). Among the biomarkers identified was a composite blood-based signature that predicted the onset of various medical conditions approximately nine years in advance (AUC 0.59-0.72). Notably, all biomarkers worked in synergy with psychosocial factors, accurately predicting both medical conditions (AUC 0.69-0.91) and self-reported pain (AUC 0.71-0.92). Over a period of 15 years, individuals scoring high on both biomarkers and psychosocial risk factors had twice the cumulative incidence of diagnoses for pain-associated medical conditions (Hazard Ratio (HR): 2.26) compared to individuals scoring high on biomarkers but low on psychosocial risk factors (HR: 1.06). In summary, we identified various biomarkers for chronic pain conditions and showed that their predictive efficacy heavily depended on psychological and social influences. These findings underscore …

Hasse Karlsson

Turun yliopisto

medRxiv

Associations Between Prenatal Exposure to Maternal Diabetes and Obesity and Newborn Subcortical Volumes

Importance Children prenatally exposed to maternal diabetes have a higher risk of developing obesity and metabolic disorders. Alterations in brain development are hypothesized as a potential mechanism underlying this relationship, but this has not been fully tested in humans. Objectives To examine the mediating role of child brain structure in the relationships between prenatal exposure to maternal diabetes and child adiposity. Design, setting and participants This was a cross-sectional study of children (ages 9-to-10-years-old) from the baseline assessment of the Adolescent Brain and Cognitive Development (ABCD) Study® (N=11,875). Exposures Prenatal exposure to maternal diabetes was determined via self-reported questionnaire. Main outcomes and measures Child adiposity markers included age- and sex-specific body mass index (BMI z-scores), waist circumference, and waist-to-height ratio (WHtR). T1-weighted magnetic resonance imaging (MRI) was used to assess brain structure. Linear mixed effects models examined associations of prenatal exposure to maternal diabetes with child adiposity markers and brain structure controlling for sociodemographic covariates. Mediation models were performed to investigate the mediating role of brain structure on the association between maternal diabetes exposure and child adiposity markers. Results The sample consisted of 8,521 children (agem: 9.92±0.63 years; sex: 51.4% males; 7% exposed to maternal diabetes). Children prenatally exposed vs. unexposed to maternal diabetes had greater BMI z-scores (β (95% CI) = 0.175 (0.093, 0.256; FDR corrected P<0.001), waist circumference (β (95% CI …

Gunn-Helen Moen

Universitetet i Oslo

medRxiv

Serum proteomic profiling of physical activity reveals CD300LG as a novel exerkine with a potential causal link to glucose homeostasis

Background Physical activity has been associated with preventing the development of type 2 diabetes and atherosclerotic cardiovascular disease. However, our understanding of the precise molecular mechanisms underlying these effects remains incomplete and good biomarkers to objectively assess physical activity are lacking. Methods We analyzed 3072 serum proteins in 26 men, normal weight or overweight, undergoing 12 weeks of a combined strength and endurance exercise intervention. We estimated insulin sensitivity with hyperinsulinemic euglycemic clamp, measured maximum oxygen uptake and muscle strength, and used MRI/MRS to evaluate body composition and organ fat depots. Muscle and subcutaneous adipose tissue biopsies were used for mRNA sequencing. Additional association analyses were performed in samples from up to 47,747 individuals in the UK Biobank, as well as using 2-sample Mendelian randomization and mouse models. Results Following 12 weeks of exercise intervention, we observed significant changes in 283 serum proteins. Notably, 66 of these proteins were elevated in overweight men and positively associated with liver fat before the exercise regimen, but were normalized after exercise. Furthermore, for 19.7% and 12.1% of the exercise-responsive proteins, corresponding changes in mRNA expression levels in muscle and fat, respectively, were shown. The protein CD300LG displayed consistent alterations in blood, muscle, and fat. Serum CD300LG exhibited positive associations with insulin sensitivity and with angiogenesis-related gene expression in both muscle and fat. Furthermore, serum CD300LG was …

Shelby Bachman

University of Southern California

medRxiv

Development of a Living Library of Digital Health Technologies for Alzheimer's Disease and Related Dementias: Initial Results from a Landscape Analysis and Community …

Digital health technologies offer valuable advantages to dementia researchers and clinicians as screening tools, diagnostic aids, and monitoring instruments. To support the use and advancement of these resources, a comprehensive overview of the current technological landscape is essential. A multi-stakeholder working group, convened by the Digital Medicine Society (DiMe), conducted a landscape review to identify digital health technologies for Alzheimer's disease and related dementia populations. We searched studies indexed in PubMed, Embase, and APA PsycInfo to identify manuscripts published between May 2003 and May 2023 reporting analytical validation, clinical validation, or usability/feasibility results for relevant digital health technologies. Additional technologies were identified through community outreach. We collated 172 peer-reviewed manuscripts, poster presentations, or regulatory documents for 106 different technologies for Alzheimer's disease and related dementia assessment covering diverse populations such as Lewy Body, vascular dementias, frontotemporal dementias, and all severities of Alzheimer's disease. Wearable sensors represent 32% of included technologies, non-wearables 61%, and technologies with components of both account for the remaining 7%. Neurocognition is the most prevalent concept of interest, followed by physical activity and sleep. Clinical validation is reported in 69% of evidence, analytical validation in 34%, and usability/feasibility in 20% (not mutually exclusive). These findings provide a landscape overview for clinicians and researchers to appraise the clinical utility and relative maturity of …

Eric A. Storch

Baylor College of Medicine

medRxiv

Genome-wide association study identifies 30 obsessive-compulsive disorder associated loci

Obsessive-compulsive disorder (OCD) affects ~1% of the population and exhibits a high SNP-heritability, yet previous genome-wide association studies (GWAS) have provided limited information on the genetic etiology and underlying biological mechanisms of the disorder. We conducted a GWAS meta-analysis combining 53,660 OCD cases and 2,044,417 controls from 28 European-ancestry cohorts revealing 30 independent genome-wide significant SNPs and a SNP-based heritability of 6.7%. Separate GWAS for clinical, biobank, comorbid, and self-report sub-groups found no evidence of sample ascertainment impacting our results. Functional and positional QTL gene-based approaches identified 249 significant candidate risk genes for OCD, of which 25 were identified as putatively causal, highlighting WDR6, DALRD3, CTNND1 and genes in the MHC region. Tissue and single-cell enrichment analyses highlighted hippocampal and cortical excitatory neurons, along with D1- and D2-type dopamine receptor-containing medium spiny neurons, as playing a role in OCD risk. OCD displayed significant genetic correlations with 65 out of 112 examined phenotypes. Notably, it showed positive genetic correlations with all included psychiatric phenotypes, in particular anxiety, depression, anorexia nervosa, and Tourette syndrome, and negative correlations with a subset of the included autoimmune disorders, educational attainment, and body mass index. This study marks a significant step toward unraveling the genetic landscape of OCD and advances understanding of OCD genetics, providing a foundation for future interventions to address this debilitating …

Dimitri Christakis

University of Washington

medRxiv

Real-world Effectiveness and Causal Mediation Study of BNT162b2 on Long COVID Risks in Children and Adolescents (preprint)

Background: The impact of pre-infection vaccination on the risk of long COVID remains unclear in the pediatric population. Further, it is unknown if such pre-infection vaccination can mitigate the risk of long COVID beyond its established protective benefits against SARS-CoV-2 infection. Objective: To assess the effectiveness of BNT162b2 on long COVID risks with various strains of the SARS-CoV-2 virus in children and adolescents, using comparative effectiveness methods. To disentangle the overall effectiveness of the vaccine on long COVID outcomes into its independent impact and indirect impact via prevention of SARS-CoV-2 infections, using causal mediation analysis. Design: Real-world vaccine effectiveness study and mediation analysis in three independent cohorts: adolescents (12 to 20 years) during the Delta phase, children (5 to 11 years) and adolescents (12 to 20 years) during the Omicron phase. Setting: Twenty health systems in the RECOVER PCORnet electronic health record (EHR) Program. Participants: 112,590 adolescents (88,811 vaccinated) in the Delta period, 188,894 children (101,277 vaccinated), and 84,735 adolescents (37,724 vaccinated) in the Omicron period. Exposures: First dose of the BNT162b2 vaccine vs. no receipt of COVID-19 vaccine. Measurements: Outcomes of interest include conclusive or probable diagnosis of long COVID following a documented SARS-CoV-2 infection, and body-system-specific condition clusters of post-acute sequelae of SARS-CoV-2 infection (PASC), such as cardiac, gastrointestinal, musculoskeletal, respiratory, and syndromic categories. The effectiveness was reported as (1 …

Mark R Walter

University of Alabama at Birmingham

medRxiv

Role of DOCK8 in Hyper-inflammatory Syndromes

Background Cytokine storm syndromes (CSS), including hemophagocytic lymphohistiocytosis (HLH), are increasingly recognized as hyper-inflammatory states leading to multi-organ failure and death. Familial HLH (FHL) in infancy results from homozygous genetic defects in perforin-mediated cytolysis by CD8 T-lymphocytes and natural killer (NK) cells. Later onset CSS are frequently associated with heterozygous defects in FHL genes, but genetic etiologies for most are unknown. We identified rare DOCK8 variants in CSS patients. Objective We explore the role of CSS patient derived DOCK8 mutations on cytolytic activity in NK cells. We further study effects of Dock8-/- in murine models of CSS. Methods DOCK8 cDNA from 2 unrelated CSS patients with different missense mutations were introduced into human NK-92 NK cells by foamy virus transduction. NK cell degranulation (CD107a), cytolytic activity against K562 target cells, and interferon-gamma (IFN-γ) production were explored by flow cytometry (FCM). A third CSS patient DOCK8 mRNA splice acceptor site variant was explored by exon trapping. Dock8-/- mice were assessed for features of CSS (weight loss, splenomegaly, hepatic inflammation, cytopenias, and IFN-γ levels) upon challenge with lymphocytic choriomeningitis virus (LCMV) and excess IL-18. Results Both patient DOCK8 missense mutations decreased cytolytic function in NK cells in a partial dominant-negative fashion in vitro. The patient DOCK8 splice variant disrupted mRNA splicing in vitro. Dock8-/- mice tolerated excess IL-18 but developed features of CSS upon LCMV infection …

Thomas Iosifidis

Curtin University

medRxiv

REAL TIME MONITORING OF RESPIRATORY VIRAL INFECTIONS IN COHORT STUDIES USING A SMARTPHONE APP.

Background and Objectives Cohort studies investigating respiratory disease pathogenesis aim to pair mechanistic investigations with longitudinal virus detection but are limited by the burden of methods tracking illness over time. In this study, we explored the utility of a smartphone app to robustly identify symptomatic respiratory illnesses, while reducing burden and facilitating real-time data collection and adherence monitoring. Methods The AERIAL TempTracker smartphone app was assessed in the AERIAL and COCOON birth cohort studies. Participants recorded daily temperatures and associated symptoms/medications in TempTracker for 6 months, with daily use adherence measured over this period. Regular participant feedback was collected at quarterly study visits. Symptomatic respiratory illnesses meeting study criteria prompted an automated app alert and collection of a nose/throat swab for testing of eight respiratory viruses. Results In total, 32,764 daily TempTracker entries from 348 AERIAL participants and 30,542 entries from 361 COCOON participants were recorded. This corresponded to an adherence median of 67.0% (range 1.9–100%) and 55.4% (range 1.1–100%) of each participant's study period, respectively. Feedback was positive, with 75.5% of responding families reporting no barriers to use. A total of 648 symptomatic respiratory illness events from 249/709 participants were identified with significant variability between individuals in the frequency (0–16 events per participant), duration (1–13 days), and virus detected (rhinovirus in 42.7%). Conclusions A smartphone app provides a reliable method to capture the …

Charles-François de Lannoy

McMaster University

medRxiv

Measuring the fitted filtration efficiency of cloth masks, medical masks and respirators

Importance Masks reduce transmission of SARS-CoV-2 and other respiratory pathogens. Comparative studies of the fitted filtration efficiency of different types of masks are few. Objective To describe the fitted filtration efficiency against small aerosols (0.02–1 µm) of medical and non-medical masks and respirators when worn, and how this is affected by user modifications (hacks) and by overmasking with a cloth mask. Design We tested a 2-layer woven-cotton cloth mask of a consensus design, ASTM-certified level 1 and level 3 masks, a non-certified mask, KF94s, KN95s, an N95 and a CaN99. Setting Closed rooms with ambient particles supplemented by salt particles. Participants 12 total participants; 21–55 years, 68% female, 77% white, NIOSH 1 to 10. Main Outcome and Measure Using standard methods and a PortaCount 8038, we counted 0.02–1 µm particles inside and outside masks and respirators, expressing results as the percentage filtered by each mask. We also studied level 1 and level 3 masks with earguards, scrub caps, the knot-and-tuck method, and the effects of braces or overmasking with a cloth mask. Results Filtration efficiency for the cloth mask was 47-55%, for level 1 masks 52-60%, for level 3 masks 60-77%. A non-certified KN95 look-alike, two KF94s, and three KN95s filtered 57-77%, and the N95 and CaN99 97-98% without fit testing. External braces and overmasking with a well-fitting cloth mask increased filtration, but earguards, scrub caps, and the knot-and-tuck method did not. Limitations Limited number of masks of each type sampled; no adjustment for multiple comparisons. Conclusions and Relevance …

Yu-Wei Wu

Taipei Medical University

medRxiv

Widely accessible prognostication using medical history for fetal growth restriction and small for gestational age in nationwide insured women

Objectives Prevention of fetal growth restriction/small for gestational age is adequate if screening is accurate. Ultrasound and biomarkers can achieve this goal; however, both are often inaccessible. This study aimed to develop, validate, and deploy a prognostic prediction model for screening fetal growth restriction/small for gestational age using only medical history. Methods From a nationwide health insurance database (n=1,697,452), we retrospectively selected visits of 12-to-55-year-old females to 22,024 healthcare providers of primary, secondary, and tertiary care. This study used machine learning (including deep learning) to develop prediction models using 54 medical-history predictors. After evaluating model calibration, clinical utility, and explainability, we selected the best by discrimination ability. We also externally validated and compared the models with those from previous studies, which were rigorously selected by a systematic review of PubMed, Scopus, and Web of Science. Results We selected 169,746 subjects with 507,319 visits for predictive modeling. The best prediction model was a deep-insight visible neural network. It had an area under the receiver operating characteristics curve of 0.742 (95% confidence interval 0.734 to 0.750) and a sensitivity of 49.09% (95% confidence interval 47.60% to 50.58% using a threshold with 95% specificity). The model was competitive against the previous models in a systematic review of 30 eligible studies of 381 records, including those using either ultrasound or biomarker measurements. We deployed a web application to apply the model. Conclusions Our model used only medical history …

Geraint Rees

University College London

medRxiv

Differential default-mode network effective connectivity in young-onset Alzheimer's disease variants

Young-onset Alzheimer's disease (AD) is a rare form of AD characterized by early symptom onset (< 65 years) and heterogeneous clinical phenotypes. Previous studies have consistently shown that patients with late-onset AD exhibit alterations in the default-mode network, a large-scale brain network associated with self-related processing and autobiographical memory. However, the functional organization of the default-mode network is far less clear in young-onset AD. Here, we assessed default-mode network effective connectivity in two common young-onset AD variants (i.e., typical amnestic variant and posterior cortical atrophy) and healthy participants to identify disease- and variant-specific differences in the default-mode network. This case-control study was conducted with thirty-nine young-onset AD patients, including typical amnestic (n = 26; 15 females, mean age = 61) and posterior cortical atrophy (n = 13; 8 females, mean age = 61.8), and 24 age-matched healthy participants (13 females, mean age = 60.1). All participants underwent resting-state functional MRI and extensive neuropsychological testing. Spectral dynamic causal modelling was performed to quantify resting-state effective connectivity between default-mode network regions. Parametric empirical Bayes analysis was then performed to characterise group differences in effective connectivity. Our results showed that patients with the typical AD variant showed increased connectivity from medial prefrontal cortex to posterior default-mode network nodes as well as reduced inhibitory connectivity from hippocampus to other default-mode network nodes, relative to healthy controls …

Harro Seelaar

Erasmus Universiteit Rotterdam

medRxiv

Generalizability of trial criteria on amyloid-lowering therapy against Alzheimer's disease to individuals with MCI or early AD in the general population

Background Treatment with monoclonal antibodies against amyloid-β slowed cognitive decline in recent randomized clinical trials in patients with mild cognitive impairment (MCI) and early dementia due to Alzheimer's disease (AD). However, stringent trial eligibility criteria may affect generalizability of these findings to clinical practice. Methods We extracted eligibility criteria for trials of aducanumab, lecanemab and donanemab from published reports, and applied these to participants with MCI or early clinical AD dementia from the population-based Rotterdam Study. Participants underwent questionnaires, genotyping, brain MRI, cognitive testing, and cardiovascular assessment. We had continuous linkage with medical records and pharmacy dispensary data. We determined amyloid status using an established and validated prediction model based on age and APOE genotype. We assessed progression to dementia within 5 years among participants with MCI, stratified for eligibility. Results Of 968 participants (mean age: 75 years, 56% women), 779 had MCI and 189 early clinical AD dementia. Across the three drug trials, around 40% of participants would be ineligible because of predicted amyloid negativity. At least one clinical exclusion criterion was present in 76.3% (95% CI: 73.3-79.3) of participants for aducanumab, 75.8% (73.0-78.7) for lecanemab, and 59.8% (56.4-63.3) for donanemab. Criteria that most often led to exclusion were a history of cardiovascular disease (35.2%), use of anticoagulant (31.2%), use of psychotropic or immunological medications (20.4%), history of anxiety or depression (15.9%), or lack of social support (15 …

Harro Seelaar

Erasmus Universiteit Rotterdam

medRxiv

Frontoparietal network integrity supports cognitive function despite atrophy and hypoperfusion in pre-symptomatic frontotemporal dementia: multimodal analysis of brain function …

INTRODUCTION Gene carriers of frontotemporal dementia can remain cognitively well despite neurodegeneration. A better understanding of brain structural, perfusion and functional patterns in pre-symptomatic stage could inform accurate staging and potential mechanisms. METHODS We included 207 pre-symptomatic carriers and 188 relatives without mutations. The grey matter volume, cerebral perfusion, and resting-state functional network maps were co-analyzed using linked independent component analysis (LICA). Multiple regression analysis was used to investigate the relationship of LICA components to genetic status and cognition. RESULTS Pre-symptomatic carriers showed an age-related decrease in the left frontoparietal network integrity while non-carriers did not. Executive functions of pre-symptomatic carriers dissociated from the level of atrophy and cerebrovascular dysfunction, but became dependent on the left frontoparietal network integrity in older age. DISCUSSION The frontoparietal network integrity of pre-symptomatic carriers showed a distinctive relationship to age and cognition compared to non-carriers, despite atrophy and hypoperfusion. Functional network integrity may contribute to brain resilience in pre-symptomatic frontotemporal dementia, mitigating the effects of atrophy and hypoperfusion.

Harro Seelaar

Erasmus Universiteit Rotterdam

medRxiv

Cerebrovascular reactivity impairment in genetic frontotemporal dementia

INTRODUCTION Cerebrovascular reactivity (CVR) is an indicator of cerebrovascular health and its signature in hereditary frontotemporal dementia (FTD) remains unknown. We investigated CVR in genetic FTD and its relationship to cognition. METHODS CVR differences were assessed between 284 pre-symptomatic and 124 symptomatic mutation carriers, and 265 non-carriers, using resting-state fluctuation amplitudes (RSFA) on component-based and voxel-level RSFA maps. Associations and interactions between RSFA, age, genetic status, and cognition were examined using generalised linear models. RESULTS Compared to non-carriers, mutation carriers exhibited greater RSFA reductions, predominantly in frontal cortex. These reductions increased with age. The RSFA in these regions correlated with cognitive function in symptomatic and, to a lesser extent, pre-symptomatic individuals, independent of disease stage. DISCUSSION CVR impairment in genetic FTD predominantly affects frontal cortical areas, and its preservation may yield cognitive benefits for at-risk individuals. Cerebrovascular health may be a potential target for biomarker identification and disease-modifying efforts.

Sophia Shalhout, PhD

Harvard University

medRxiv

ctDNA predicts recurrence and survival in stage I and II HPV-associated head and neck cancer patients treated with surgery

Human papillomavirus-associated oropharyngeal squamous cell carcinomas (HPV+OPSCC) release circulating tumor HPV DNA (ctHPVDNA) into the blood which we, and others, have shown is an accurate real-time biomarker of disease status. In a prior prospective observational trial of 34 patients with AJCC 8 stage I-II HPV+OPSCC treated with surgery, we reported that ctHPVDNA was rapidly cleared within hours of surgery in patients who underwent complete cancer extirpation, yet remained elevated in those with macroscopic residual disease. The primary outcomes of this study were to assess 2-year OS and RFS between patients with and without molecular residual disease (MRD) following completion of treatment in this prospective cohort. MRD was defined as persistent elevation of ctHPVDNA at two consecutive time points, without clinical evidence of disease. The secondary outcomes were 2-year OS and RFS between patients with and without detectable MRD after surgery. We observed that patients with MRD after treatment completion were more likely to recur compared to patients without MRD, while there was no difference in recurrence rates between patients with MRD and without MRD on postoperative day 1. OS did not significantly differ between patients with MRD after surgery or treatment completion compared to patients without MRD; however, time to death was significantly different between the groups in both settings, suggesting that with a larger sample size OS would differ significantly between the groups or that the impact of MRD detection on survival is time dependent.

Sophia Shalhout, PhD

Harvard University

medRxiv

Immunotherapy Time of Infusion Impacts Survival in Head and Neck Cancer: A Propensity Score Matched Analysis

The adaptive immune response is physiologically regulated by the circadian rhythm. Data in lung and melanoma malignancies suggest immunotherapy infusions earlier in the day may be associated with improved response; however, the optimal time of administration for patients with head and neck squamous cell carcinoma (HNSCC) is not known. We aimed to evaluate the association of immunotherapy infusion time with overall survival (OS) and progression free survival (PFS) in patients with HNSCC in an Institutional Review Board-approved, retrospective cohort study. 113 patients met study inclusion criteria and 98 patients were included in a propensity score-matched cohort. In the full unmatched cohort (N = 113), each additional 20% of infusions received after 1500 h conferred an OS hazard ratio (HR) of 1.35 (95% CI 1.2–1.6; p-value = 0.0003) and a PFS HR of 1.34 (95% CI 1.2–1.6; p-value < 0.0001). A …

Barbra Dickerman

Harvard University

medRxiv

Reduced effectiveness of repeat influenza vaccination: distinguishing among within-season waning, recent clinical infection, and subclinical infection

Studies have reported that prior-season influenza vaccination is associated with higher risk of clinical influenza infection among vaccinees. This effect might arise from incomplete consideration of within-season waning and recent infection. Using data from the US Flu Vaccine Effectiveness (VE) Network (2011–2012 to 2018–2019 seasons), we found that repeat vaccinees were vaccinated earlier in a season by one week. After accounting for waning VE, repeat vaccinees were still more likely to test positive for A(H3N2) (OR = 1.11, 95% CI: 1.02–1.21) but not for influenza B or A(H1N1). We found that clinical infection influences individuals’ decision to vaccinate in the following season while protecting against clinical infection of the same (sub)type. However, adjusting for recent clinical infections did not strongly influence the estimated effect of prior-season vaccination. In contrast, we found that adjusting for subclinical …

David A. Clifton

University of Oxford

medRxiv

Mitigating Machine Learning Bias Between High Income and Low-Middle Income Countries for Enhanced Model Fairness and Generalizability

Collaborative efforts in artificial intelligence (AI) are increasingly common between high-income countries (HICs) and low- to middle-income countries (LMICs). Given the resource limitations often encountered by LMICs, collaboration becomes crucial for pooling resources, expertise, and knowledge. Despite the apparent advantages, ensuring the fairness and equity of these collaborative models is essential, especially considering the distinct differences between LMIC and HIC hospitals. In this study, we show that collaborative AI approaches can lead to divergent performance outcomes across HIC and LMIC settings, particularly in the presence of data imbalances. Through a real-world COVID-19 screening case study, we demonstrate that implementing algorithmic-level bias mitigation methods significantly improves outcome fairness between HIC and LMIC sites while maintaining high diagnostic sensitivity. We compare our results against previous benchmarks, utilizing datasets from four independent United Kingdom hospitals and one Vietnamese hospital, representing HIC and LMIC settings, respectively.

David A. Clifton

University of Oxford

medRxiv

Deep Learning for Multi-Label Disease Classification of Retinal Images: Insights from Brazilian Data for AI Development in Lower-Middle Income Countries

Retinal fundus imaging is a powerful tool for disease screening and diagnosis in ophthalmology. With the advent of machine learning and artificial intelligence, in particular modern computer vision classification algorithms, there is broad scope for technology to improve accuracy, increase accessibility and reduce cost in these processes. In this paper we present the first deep learning model trained on the first Brazilian multi-label ophthalmological dataset. We train a multi-label classifier using over 16,000 clinically-labelled fundus images. Across a range of 13 retinal diseases, we obtain frequency-weighted AUC and F1 scores of 0.92 and 0.70 respectively. Our work establishes a baseline model on this new dataset and furthermore demonstrates the applicability and power of artificial intelligence approaches to retinal fundus disease diagnosis in under-represented populations.

David A. Clifton

University of Oxford

medRxiv

Large Language Models in Healthcare: A Comprehensive Benchmark

The adoption of large language models (LLMs) to assist clinicians has attracted remarkable attention. Existing works mainly adopt the close-ended question-answering task with answer options for evaluation. However, in real clinical settings, many clinical decisions, such as treatment recommendations, involve answering open-ended questions without pre-set options. Meanwhile, existing studies mainly use accuracy to assess model performance. In this paper, we comprehensively benchmark diverse LLMs in healthcare, to clearly understand their strengths and weaknesses. Our benchmark contains seven tasks and thirteen datasets across medical language generation, understanding, and reasoning. We conduct a detailed evaluation of sixteen existing LLMs in healthcare under both zero-shot and few-shot (i.e., 1-, 3-, and 5-shot) learning settings. We report the results on five metrics (i.e., matching, faithfulness, comprehensiveness, generalizability, and robustness) that are critical in achieving trust from clinical users. We further invite medical experts to conduct human evaluation.