
Gla-AI4BioMed at RRG24: Visual Instruction-tuned Adaptation for Radiology Report Generation


Overview

We introduce a radiology-focused visual language model designed to generate radiology reports from chest X-rays. Building on previous findings that large language models (LLMs) can acquire multimodal capabilities when aligned with pretrained vision encoders, we demonstrate similar potential with chest X-ray images. Our model combines an image encoder with a fine-tuned LLM based on the Vicuna-7B architecture, enabling it to generate different sections of a radiology report with notable accuracy.

[Figure: model architecture overview]

Contents

  • Install
  • Model Weights
  • Quick Start
  • Data Preparation
  • Acknowledgments
  • Citation

Install

Please refer to the Libra repository for full code and environment details, as this project is compatible with its setup. A brief outline:

  • Create and activate a new conda environment (e.g., libra).
  • Install the required dependencies (e.g., pip install -e .).

git clone https://github.com/X-iZhang/Libra.git
cd Libra

conda create -n libra python=3.10 -y
conda activate libra
pip install --upgrade pip  # enable PEP 660 support
pip install -e .
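
After installation, a quick sanity check (assuming the package installs under the name libra, as the commands below suggest):

python -c "import libra; print('libra imported successfully')"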

Model Weights

Version                  Base LLM    Vision Encoder    Checkpoint
Libra-v0.5-impressions   Vicuna-7B   CLIP              libra-v0.5-impressions
Libra-v0.5-findings      Vicuna-7B   CLIP              libra-v0.5-findings
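
If you prefer to fetch a checkpoint ahead of time rather than letting the loader download it, here is a minimal sketch using the Hugging Face Hub client; it assumes the checkpoints are hosted under the repo IDs used in the inference commands below, which also accept the repo ID directly.

from huggingface_hub import snapshot_download

# Download the impression-section checkpoint and print its local path.
local_dir = snapshot_download(repo_id="X-iZhang/libra-v0.5-impressions")
print(local_dir)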

Quick Start

CLI Inference

We support running inference using the CLI. To use our model, run:

python -m libra.serve.cli \
    --model-path X-iZhang/libra-v0.5-impressions  \
    --conv-mode libra_v0 \
    --image-file "./path/to/chest_x_ray.jpg"
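
The same command works for the findings model; only the checkpoint changes:

python -m libra.serve.cli \
    --model-path X-iZhang/libra-v0.5-findings \
    --conv-mode libra_v0 \
    --image-file "./path/to/chest_x_ray.jpg"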

Script Inference

After installing this repository, you can use the libra_eval function in libra/eval/run_libra.py to run a model trained by us or by yourself, either on a local machine or in Google Colab.

from libra.eval import libra_eval

model_path = "X-iZhang/libra-v0.5-impressions"  # Or "X-iZhang/libra-v0.5-findings"

# Define the path to the chest X-ray image.
image_files = "./path/to/chest_x_ray.jpg"

# Define the prompt to guide the model's response.
prompt = "Provide a detailed description of the impression in the radiology image."
# Or: "Provide a detailed description of the findings in the radiology image."

# Specify the conversational mode, matching the PROMPT_VERSION used during training.
conv_mode = "libra_v0"

# Call the libra_eval function.
libra_eval(
    model_path=model_path,
    image_file=image_files,
    query=prompt,
    temperature=0.9,
    top_p=0.8,
    conv_mode=conv_mode,
    max_new_tokens=512
)
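
To process several studies in one go, you can loop over image paths. This is a minimal sketch that assumes libra_eval returns the generated report as a string; check libra/eval/run_libra.py to confirm the return value.

# Sketch: batch inference over multiple studies. Assumes libra_eval
# returns the generated report text.
for path in ["./path/to/study_1.jpg", "./path/to/study_2.jpg"]:
    report = libra_eval(
        model_path=model_path,
        image_file=path,
        query=prompt,
        temperature=0.9,
        top_p=0.8,
        conv_mode=conv_mode,
        max_new_tokens=512,
    )
    print(path, report)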

Data Preparation

We use the dataset officially provided by the shared task. For information on data structure, preprocessing, and additional script usage, please refer to the instructions in Libra. For the exact formats used in training and evaluation, see Custom_Data.md; an illustrative record sketch follows below.
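
For orientation only: Libra follows a LLaVA-style pipeline, where training records are typically JSON objects shaped like the Python sketch below. The field names and report text here are assumptions for illustration; Custom_Data.md is authoritative.

# Illustrative LLaVA-style training record (field names are assumptions;
# see Custom_Data.md for the authoritative format).
sample = {
    "id": "example_0001",
    "image": "./path/to/chest_x_ray.jpg",
    "conversations": [
        {"from": "human",
         "value": "<image>\nProvide a detailed description of the findings in the radiology image."},
        {"from": "gpt",
         "value": "The lungs are clear. There is no pleural effusion or pneumothorax."},
    ],
}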

Acknowledgments 🙏

We extend our gratitude to the BioNLP 2024 RRG24 Shared Task organisers for providing the baseline pipeline ViLMedic and curating these challenging and exciting tasks.

Also, we sincerely thank the following projects for their contributions:

  • LLaVA: A Large Language and Vision Assistant, laying the groundwork for multimodal understanding.
  • FastChat: An Open Platform for Training, Serving, and Evaluating Large Language Model based Chatbots.
  • LLaMA: Open and efficient foundation language models that inspired our core language processing capabilities.

Citation ✒️

If you find our paper useful in your research and applications, please cite using this BibTeX:

@inproceedings{Zhang_2024,
   title={Gla-AI4BioMed at RRG24: Visual Instruction-tuned Adaptation for Radiology Report Generation},
   url={http://dx.doi.org/10.18653/v1/2024.bionlp-1.54},
   DOI={10.18653/v1/2024.bionlp-1.54},
   booktitle={Proceedings of the 23rd Workshop on Biomedical Natural Language Processing},
   publisher={Association for Computational Linguistics},
   author={Zhang, Xi and Meng, Zaiqiao and Lever, Jake and Ho, Edmond S.L.},
   year={2024},
   pages={624--634}
}
