Skip to main content
NIH Intramural Research Program, Our Research Changes Lives

Navigation controls

  • Search
  • Menu

Social follow links

  • Podcast
  • Instagram
  • Twitter
  • YouTube
  • LinkedIn

Main navigation

  • About Us
    • What Is the IRP?
    • History
    • Honors
      • Nobel Prize
      • Lasker Award
      • Breakthrough Prize
      • Shaw Prize
      • Presidential Early Career Award for Scientists and Engineers (PECASE)
      • Presidential Medal of Freedom
      • National Medal of Science
      • Searle Scholars
      • The National Academy of Sciences
      • The National Academy of Medicine
      • The National Academy of Engineering
      • The American Academy of Arts and Sciences
      • National Medal of Technology & Innovation
      • Samuel J. Heyman Service to America Medals
      • Crafoord Prize
      • Fellows of the Royal Society
      • Canada Gairdner Awards
    • Organization & Leadership
    • Our Programs
      • NCI
      • NEI
      • NHGRI
      • NHLBI
      • NIA
      • NIAAA
      • NIAID
      • NIAMS
      • NIBIB
      • NICHD
      • NIDA
      • NIDCD
      • NIDCR
      • NIDDK
      • NIEHS
      • NIMH
      • NIMHD
      • NINDS
      • NINR
      • NLM
      • CC
      • NCATS
      • NCCIH
    • Research Campus Locations
    • Contact Information
  • Our Research
    • Scientific Focus Areas
      • Biomedical Engineering & Biophysics
      • Cancer Biology
      • Cell Biology
      • Chemical Biology
      • Chromosome Biology
      • Clinical Research
      • Computational Biology
      • Developmental Biology
      • Epidemiology
      • Genetics & Genomics
      • Health Disparities
      • Immunology
      • Microbiology & Infectious Diseases
      • Molecular Biology & Biochemistry
      • Molecular Pharmacology
      • Neuroscience
      • RNA Biology
      • Social & Behavioral Sciences
      • Stem Cell Biology
      • Structural Biology
      • Systems Biology
      • Virology
    • Principal Investigators
      • View by Investigator Name
      • View by Scientific Focus Area
    • Accomplishments
      • View All Accomplishments by Date
      • View All Health Topics
      • The Body
      • Health & Wellness
      • Conditions & Diseases
      • Procedures
    • Accelerating Science
      • Investing in Cutting-Edge Animal Models
      • Creating Cell-Based Therapies
      • Advancing Computational and Structural Biology
      • Combating Drug Resistance
      • Developing Novel Imaging Techniques
      • Charting the Pathways of Inflammation
      • Zooming in on the Microbiome
      • Uncovering New Opportunities for Natural Products
      • Stimulating Neuroscience Research
      • Pursuing Precision Medicine
      • Unlocking the Potential of RNA Biology and Therapeutics
      • Producing Novel Vaccines
    • Research in Action
      • View All Stories
      • Battling Blood-Sucking Bugs
      • Unexpected Leads to Curb Addiction
      • Shaping Therapies for Sickle Cell Disease
      • The Mind’s Map Maker
    • Trans-IRP Research Resources
      • Supercomputing
    • IRP Review Process
    • Commercializing Inventions
  • NIH Clinical Center
    • Clinical Center Facilities
    • Clinical Faculty
    • Advancing Translational Science
    • Clinical Trials
      • Get Involved with Clinical Research
      • Physician Resources
  • News & Events
    • In the News
    • I am Intramural Blog
    • Speaking of Science Podcast
    • SciBites Video Shorts
    • The NIH Catalyst Newsletter
    • Events
  • Careers
    • Faculty-Level Scientific Careers
    • Trans-NIH Scientific Recruitments
      • Stadtman Tenure-Track Investigators
        • Science, the Stadtman Way
      • Lasker Clinical Research Scholars
      • Independent Research Scholar
    • Scientific & Clinical Careers
    • Administrative Careers
  • Research Training
    • Program Information
    • Training Opportunities
    • NIH Work/Life Resources
I am Intramural Blog

Challenges to Training Artificial Intelligence with Medical Imaging Data

By Hoo-Chang Shin

Tuesday, November 1, 2016

If you were going to train an artificial intelligence (AI) system to understand and accurately diagnose medical images, what kind of information do you think would be most effective: lots of general image data, or small amounts of specific data?

In my previous blog post, I briefly described our team’s work using data in the NIH Clinical Center’s picture archiving and communication systems (PACS) to build an AI system that can learn from radiology reports and images stored in PACS to generate descriptive keywords of new images independently of human eyes. Such a system could be used to provide first-impression summarizations of medical images and index large amounts of images so that doctors or researchers can search for images with certain characteristics, such as “images with brain tumor" or "magnetic resonance images of muscle.” However, our research objectives are even grander.

One of our main goals is to build an AI system that can make automatic disease diagnoses from patient scans, something that our proposed system is not quite able to do. To address its limitations, we used Metathesaurus (a medical semantic database of the Unified Medical Language System [UMLS]) and the RadLex radiology language database to detect disease-specific words used in radiology from the reports. In addition, we employed a negation detection algorithm to determine whether each detected disease word is mentioned positively (disease is present) or negatively (disease is absent) in the image descriptions. Then, we trained deep convolutional neural networks to detect the presence or absence of a disease in given medical images.

Below are two example outputs of the trained AI system, which generated keywords and detected the presence or absence of a disease with confidence levels. The original text from each image’s human doctor-generated radiology report is also shown on the right.

Example outputs for automated image interpretation, where the AI’s most probable disease detection matches the radiologist’s assigned label.

As we found, the AI system can detect the presence or absence of a disease in a given image, which supplements AI generated keywords towards the purpose of diagnosis. Still, two big challenges remain:

Challenge 1: Reports written by different doctors can vary widely, and multiple terminologies may be used to describe the same type of disease.

Matching variable descriptions to a single disease type and training an AI to understand them all is challenging. To train for more specificity in disease detection, we could only use about 10% of the entire data available in the PACS. By aiming to generate approximate keywords describing each image we could ignore description variability, but then we also had to be more specific in the text mining, which filtered out about 90% of the available data.

Challenge 2: Doctors sometimes write in an indefinite way when describing a disease.

"Possibility of an infection abscess," "not obviously cysts," or "possibly due to cyst" (as with image “d” above) are descriptors commonly used by doctors to advise further investigation. Without manually examining such images, it is challenging to derive any concrete findings from uncertain text descriptions. Indeed, it was hard to find any cyst in the “d” example image.

Nonetheless, having more data is always desirable when building a machine learning system, so that the system can better generalize beyond the examples it has seen in the training set. Our experience raises interesting questions regarding the trade-offs in designing an artificial intelligence to understand medical data: Do we go big and less specific with the data used for training, or small and more specific?

Look forward to my next post, where I will describe our efforts to train an AI system to understand medical images, using unrelated images of everyday objects.

More details about our study can be found in the published journal article: Shin et al., Interleaved Text/Image Deep Mining on a Large-Scale Radiology Database for Automated Image Interpretation, Journal of Machine Learning Research.


Category: IRP Life
Tags: artificial intelligence, imaging, data, radiology, Clinical Center, PACS, MRI, big data, high-performance computing

Related Blog Posts

  • Robot Superheroes to Big Data
  • Doctor Data: How Computers Are Invading the Clinic
  • Top-of-the-Line Supercomputer Turbocharges NIH Research
  • Three Billion Base Pairs vs. One Powerful Computer
  • Leveraging AI To Combat Cervical Cancer

This page was last updated on Wednesday, July 5, 2023

Blog menu

  • Contributing Authors
    • Anindita Ray
    • Brandon Levy
    • Devon Valera
    • Melissa Glim
  • Categories
    • IRP Discoveries
    • Profiles
    • Events
    • NIH History
    • IRP Life

Blog links

  • Subscribe to RSS feed

Get IRP Updates

Subscribe

  • Email
  • Print
  • Share Twitter Facebook LinkedIn

Main navigation

  • About Us
    • What Is the IRP?
    • History
    • Honors
    • Organization & Leadership
    • Our Programs
    • Research Campus Locations
    • Contact Information
  • Our Research
    • Scientific Focus Areas
    • Principal Investigators
    • Accomplishments
    • Accelerating Science
    • Research in Action
    • Trans-IRP Research Resources
    • IRP Review Process
    • Commercializing Inventions
  • NIH Clinical Center
    • Clinical Center Facilities
    • Clinical Faculty
    • Advancing Translational Science
    • Clinical Trials
  • News & Events
    • In the News
    • I am Intramural Blog
    • Speaking of Science Podcast
    • SciBites Video Shorts
    • The NIH Catalyst Newsletter
    • Events
  • Careers
    • Faculty-Level Scientific Careers
    • Trans-NIH Scientific Recruitments
    • Scientific & Clinical Careers
    • Administrative Careers
  • Research Training
    • Program Information
    • Training Opportunities
    • NIH Work/Life Resources
  • Department of Health and Human Services
  • National Institutes of Health
  • USA.gov

Footer

  • Home
  • Contact Us
  • IRP Brand Materials
  • HHS Vulnerability Disclosure
  • Web Policies & Notices
  • Site Map
  • Search