Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.
Author Information
Saahil Jain (Stanford University)
Ashwin Agrawal (Stanford University)
Adriel Saporta (Apple)
Steven Truong (Toronto University)
Du Nguyen Duong
Tan Bui
Pierre Chambon (Stanford University)
Yuhao Zhang (Amazon AWS AI)
Matthew Lungren (Stanford)
Andrew Ng (Stanford University)
Andrew Ng, Chief Scientist at Baidu, Chairman & Co-Founder of Coursera, Adjunct Professor, Stanford
Dr. Andrew Ng joined Baidu in May 2014 as chief scientist. He is responsible for driving the company's global AI strategy and infrastructure. He leads Baidu Research in Beijing and Silicon Valley as well as technical teams in the areas of speech, big data and image search. In addition to his role at Baidu, Dr. Ng is an adjunct professor in the computer science department at Stanford University. In 2011 he led the development of Stanford's Massive Open Online Course (MOOC) platform and taught an online machine learning class that was offered to over 100,000 students. This led to the co-founding of Coursera, where he continues to serve as chairman. Previously, Dr. Ng was the founding lead of the Google Brain deep learning project. Dr. Ng has authored or co-authored over 100 research papers in machine learning, robotics and related fields. In 2013 he was named to the Time 100 list of the most influential persons in the world. He holds degrees from Carnegie Mellon University, MIT and the University of California, Berkeley.
Curtis Langlotz
Pranav Rajpurkar (Computer Science Department, Stanford University)
More from the Same Authors
- 2021: RadGraph: Extracting Clinical Entities and Relations from Radiology Reports
  Saahil Jain · Ashwin Agrawal · Adriel Saporta · Steven Truong · Du Nguyen Duong · Tan Bui · Pierre Chambon · Yuhao Zhang · Matthew Lungren · Andrew Ng · Curtis Langlotz · Pranav Rajpurkar
- 2021: Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management
  Cécile Logé · Emily Ross · David Dadey · Saahil Jain · Adriel Saporta · Andrew Ng · Pranav Rajpurkar
- 2022: Adapting Pretrained Vision-Language Foundational Models to Medical Imaging Domains
  Pierre Chambon · Christian Bluethgen · Curtis Langlotz · Akshay Chaudhari
- 2021: What are “meaningful” ML datasets and the opportunities and challenges in creating them?
  Judy Wawira · Matthew Lungren · Elaine Nsoesie · Ari Robicsek
- 2021 Workshop: Machine learning from ground truth: New medical imaging datasets for unsolved medical problems.
  Katy Haynes · Ziad Obermeyer · Emma Pierson · Marzyeh Ghassemi · Matthew Lungren · Sendhil Mullainathan · Matthew McDermott
- 2020: Andrew Ng: Practical limitations of today's deep learning in healthcare
  Andrew Ng
- 2019: Climate Change: A Grand Challenge for ML
  Yoshua Bengio · Carla Gomes · Andrew Ng · Jeff Dean · Lester Mackey
- 2019 Workshop: Tackling Climate Change with ML
  David Rolnick · Priya Donti · Lynn Kaack · Alexandre Lacoste · Tegan Maharaj · Andrew Ng · John Platt · Jennifer Chayes · Yoshua Bengio
- 2017: Panel: Limited Labeled Data in Medical Imaging
  Daniel Rubin · Matthew Lungren
- 2016 Tutorial: Nuts and Bolts of Building Applications using Deep Learning
  Andrew Ng