MedGen

This project is a TensorFlow reimplementation of the paper On the Automatic Generation of Medical Imaging Reports by Jing et al., published in 2018.

Check out the paper here!

Paper Inspiration

Medical images such as X-rays, CTs, MRIs and other types of scans are used to diagnose many diseases. Specialized medical professionals read and interpret these images, and writing a report for each scan can be time-consuming. To address this, we looked into generating these reports automatically.

Sample Medical Report

A medical report has three main sections:

  • Impressions, which provide the diagnosis
  • Findings, which list all observations
  • Tags, which list the keywords that represent the critical information in the findings

Sample Report

Dataset

The dataset used in the paper was the Indiana University Chest X-Ray Collection (IU X-Ray) (Demner-Fushman et al., 2015), which is a set of chest x-ray images paired with their corresponding diagnostic reports. The images were obtained from here and the reports were obtained from here.

Because of limited compute, we used a subset of 1000 scans for training and 200 scans for testing; the details are in the /data directory.

Model Components

The architecture proposed by the paper is shown below.

Proposed Architecture

The three main proposals of the paper are:

  • A multi-task framework which jointly performs the prediction of tags and the generation of paragraphs for reports
  • A co-attention mechanism which takes visual features as well as semantic features into account
  • A hierarchical LSTM model
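To make the co-attention idea concrete, here is a minimal, simplified sketch in plain Python. It is not the paper's formulation or this repository's code: it scores features with a dot product against the decoder state instead of a learned scoring layer, it omits the learned projection applied to the joint context, and it assumes the visual features, semantic (tag) embeddings and hidden state all share one dimension. The names `attend` and `co_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, query):
    """Soft attention: weight each feature vector by its softmax-normalised
    dot-product score against the query (a simplification, see lead-in)."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * feat[d] for w, feat in zip(weights, features))
               for d in range(dim)]
    return context, weights


def co_attention(visual, semantic, hidden):
    """Attend over visual and semantic features separately with the same
    hidden state, then concatenate the two context vectors."""
    v_context, v_weights = attend(visual, hidden)
    s_context, s_weights = attend(semantic, hidden)
    return v_context + s_context, v_weights, s_weights


# Toy example: two image regions, one tag embedding, a zero hidden state.
joint, v_w, s_w = co_attention([[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0]], [0.0, 0.0])
print(joint, v_w, s_w)
```

The joint context is what a sentence-level decoder would consume at each step; in a trained model the scoring and the final projection would be learned layers.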

Implementation details

  • The zip files containing images and reports were mounted from Google Drive.
  • A dataset was prepared through a data cleaning process; it consists of two images per report, one frontal view and one lateral view.
  • Reports were extracted from .xml files, and the frontal and lateral views were combined to prepare the dataset described above, which was then used to generate features.
  • glove.840B.300d was used for obtaining vector representations and generating the embedding matrix. It is available here.
    • To run the model, download the GloVe file and add it to the MedGen folder.
  • Features were extracted using a DenseNet121 model loaded with CheXNet weights (available here). The paper used a VGG-19 network.
    • The features are available in the ./features directory.
  • The features were fed into a model with the following structure:

Model structure

  • To train the model, run encoder_decoder.ipynb in the root directory.
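Two of the preprocessing steps above, report extraction and embedding-matrix construction, can be sketched roughly as follows. This is an illustration using only the standard library, not the notebook's actual code. It assumes that report sections appear as `AbstractText` elements with a `Label` attribute and that images appear as `parentImage` elements with an `id` attribute; verify this against the real .xml files. The function names and the 1-based `word_index` mapping are hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_report(xml_text):
    """Pull findings, impression and image ids out of one report.

    Assumed layout (check against the real files): sections are
    <AbstractText Label="..."> and images are <parentImage id="...">.
    """
    root = ET.fromstring(xml_text)
    sections = {}
    for node in root.iter("AbstractText"):
        label = (node.get("Label") or "").lower()
        sections[label] = (node.text or "").strip()
    return {
        "findings": sections.get("findings", ""),
        "impression": sections.get("impression", ""),
        "image_ids": [img.get("id") for img in root.iter("parentImage")],
    }


def has_two_views(report):
    """Count check only: it does not verify which image is frontal or lateral."""
    return len(report["image_ids"]) == 2


def build_embedding_matrix(glove_lines, word_index, dim=300):
    """Return a (len(word_index) + 1) x dim list of lists.

    Row 0 and words missing from the GloVe file stay all-zero.
    word_index is a hypothetical word -> row mapping starting at 1.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for line in glove_lines:
        parts = line.rstrip().split(" ")
        if len(parts) <= dim:
            continue  # malformed or empty line
        # Take the vector from the right, so tokens containing spaces survive.
        word = " ".join(parts[:-dim])
        row = word_index.get(word)
        if row is not None:
            matrix[row] = [float(x) for x in parts[-dim:]]
    return matrix
```

In practice `glove_lines` would be an open file handle over the GloVe text file (for example opened with `encoding="utf-8"`), and the resulting matrix would be converted to an array before being passed to an embedding layer.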

Generated Report

  • The model was trained for 10 epochs. Because of limited compute, we were unable to train for more epochs, so the model did not converge.
  • The final BLEU score was 0.643.

Generated Report
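For readers unfamiliar with the metric, the sketch below shows how a BLEU score is computed for one generated report against one reference. The README does not state which BLEU variant (n-gram order, smoothing, sentence-level or corpus-level) produced the figure above, so this is an illustration only and is not expected to reproduce that number. It uses only the standard library; the function names are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, candidate, max_n=4):
    """Unsmoothed BLEU for one candidate against a single reference.

    Geometric mean of clipped n-gram precisions (n = 1..max_n),
    multiplied by the brevity penalty.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # no smoothing: any zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "no acute cardiopulmonary abnormality".split()
candidate = "no acute abnormality".split()
print(sentence_bleu(reference, candidate, max_n=1))
```

With `max_n=1` the example candidate has perfect unigram precision and is penalised only for being shorter than the reference. With the default `max_n=4` it scores 0, because it contains no 4-grams; this is why short generated sentences are usually evaluated with smoothing or lower n-gram orders.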

Dependencies

  • TensorFlow 2.4.1
  • Keras 2.4.3
  • Numpy
  • Pandas 1.1.5
  • Sklearn 0.23.2
  • PIL 8.0.1
  • Nltk 3.5
  • Matplotlib 3.3.2
  • Opencv 4.5.1
  • Tqdm 4.50.2
  • os (Python standard library)

To-do

  • Complete tag prediction using multi-label classification (MLC)
  • Integrate semantic features in co-attention model

Contributors

For any queries, please open an issue in the repository or email any of the contributors.