A Study of Generative Large Language Model for Medical Research and Healthcare

arXiv (Cornell University), 2023

Cheng Peng, Xi Yang, Aokun Chen, Kaleb E Smith, Nima PourNejatian, Anthony Costa, Cheryl Martin, Mona G. Flores, Ying Zhang, Tanja Magoč, Gloria Lipori, Duane A. Mitchell, Naykky Singh Ospina, Mustafa M Ahmed, William R. Hogan, Elizabeth Shenkman, Yi Guo, Jiang Bian, Yonghui Wu

Abstract

There is enormous enthusiasm for, and concern about, using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text with a GPT-3 architecture of 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. Synthetic NLP models trained using GatorTronGPT-generated text outperform NLP models trained using real-world clinical text. A physicians' Turing test using a 1 (worst) to 9 (best) scale shows no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT compared with 6.93 for human-written text) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT compared with 6.97 for human-written text), and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
