[ICASSP 2025] First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network


PSDNet

First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network


Qiang Hu¹, Mei Liu², Qiang Li¹†, Zhiwei Wang¹†

¹ WNLO, HUST; ² HUST Tongji Medical College
(†: corresponding author)

Overview

In this paper, we, for the first time, reduce the annotation cost to just a single frame per polyp video, regardless of the video's length. To this end, we introduce a new task, First-Frame Supervised Video Polyp Segmentation (FSVPS), and propose a novel Propagative and Semantic Dual-Teacher Network (PSDNet). Specifically, PSDNet adopts a teacher-student framework but employs two distinct types of teachers: the propagative teacher and the semantic teacher. The propagative teacher is a universal object tracker that propagates the first-frame annotation to subsequent frames as pseudo labels. However, tracking errors may accumulate over time, gradually degrading the pseudo labels and misguiding the student model. To address this, we introduce the semantic teacher, an exponential moving average of the student model, which produces more stable and time-invariant pseudo labels. PSDNet merges the pseudo labels from both teachers using a carefully designed back-propagation strategy. This strategy assesses the quality of the pseudo labels by tracking them backward to the first frame. High-quality pseudo labels are more likely to spatially align with the first-frame annotation after this backward tracking, ensuring more accurate teacher-to-student knowledge transfer and improved segmentation performance.
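The two key mechanisms above can be sketched in a few lines. This is an illustrative sketch only, not the authors' code: the `decay` value, the Dice-based quality score, and all function names here are assumptions. The semantic teacher is an exponential moving average (EMA) of the student weights, and a propagated pseudo label is scored by tracking it back to frame 0 and measuring its overlap with the first-frame annotation.

```python
# Hedged sketch of PSDNet's two mechanisms (all names/values are assumptions).
import numpy as np

def ema_update(teacher_params, student_params, decay=0.999):
    """Semantic teacher = exponential moving average of the student weights."""
    return {k: decay * teacher_params[k] + (1.0 - decay) * student_params[k]
            for k in teacher_params}

def dice(a, b, eps=1e-6):
    """Dice overlap between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def pseudo_label_quality(backtracked_mask, first_frame_gt):
    """Score a pseudo label by how well its backward-tracked version
    spatially aligns with the first-frame annotation."""
    return dice(backtracked_mask, first_frame_gt)

# Toy example: a well-aligned backtracked mask scores near 1,
# a drifted one scores near 0.
gt = np.zeros((8, 8), dtype=bool)
gt[2:6, 2:6] = True
good = gt.copy()
bad = np.zeros((8, 8), dtype=bool)
bad[0:2, 0:2] = True
print(pseudo_label_quality(good, gt), pseudo_label_quality(bad, gt))
```

In the paper's fusion strategy, such a score would decide how much the propagative teacher's pseudo label is trusted relative to the semantic teacher's for each frame.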


Models

SUN-SEG Video Polyp Segmentation (VPS)

| Model  | Backbone | Seen-Easy (Dice) | Seen-Hard (Dice) | Unseen-Easy (Dice) | Unseen-Hard (Dice) | Weights |
| ------ | -------- | ---------------- | ---------------- | ------------------ | ------------------ | ------- |
| PSDNet | PVT      | 0.900            | 0.860            | 0.798              | 0.806              | ckpts   |

Performance on SUN-SEG

1. Quantitative Comparisons


2. Qualitative Comparisons


Quick start

- Preliminaries

  • Python 3.8+
  • PyTorch 1.9+
  • TorchVision corresponding to the PyTorch version
  • NVIDIA GPU + CUDA

1. Install dependencies.

cd PSDNet
# Install other dependent packages
pip install -r requirements.txt

# Install cuda extensions for FA
cd lib/ops_align
python setup.py build develop
cd ../..

2. Prepare the SUN-SEG dataset.

Please refer to PNS+ to get access to the SUN-SEG dataset, and download it into ./datasets. The directory structure should be as follows:

  PSDNet
  ├── datasets
  │   ├── SUN-SEG
  │   │   ├── TestEasyDataset
  │   │   │   ├── Seen
  │   │   │   ├── Unseen
  │   │   ├── TestHardDataset
  │   │   │   ├── Seen
  │   │   │   ├── Unseen
  │   │   ├── TrainDataset
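Before testing, it can help to confirm the dataset was unpacked into the layout above. The small helper below is an assumption for convenience, not part of the repository; it only checks that the expected subdirectories exist under ./datasets.

```python
# Hypothetical helper (not in the repo): verify the SUN-SEG layout shown above.
from pathlib import Path

EXPECTED = [
    "SUN-SEG/TestEasyDataset/Seen",
    "SUN-SEG/TestEasyDataset/Unseen",
    "SUN-SEG/TestHardDataset/Seen",
    "SUN-SEG/TestHardDataset/Unseen",
    "SUN-SEG/TrainDataset",
]

def check_dataset(root="datasets"):
    """Return True if every expected SUN-SEG subdirectory exists under root."""
    missing = [p for p in EXPECTED if not (Path(root) / p).is_dir()]
    for p in missing:
        print(f"missing: {root}/{p}")
    return not missing

if __name__ == "__main__":
    check_dataset()
```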

- Testing

python test_video.py

Acknowledgments

Thanks to XMem for its efficient universal video object segmentation implementation, which is used as the propagative teacher model in this work.

Citation

If you find our paper and code useful in your research, please consider giving a star ⭐ and a citation 📝:

@article{hu2024first,
  title={First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network},
  author={Hu, Qiang and Liu, Mei and Li, Qiang and Wang, Zhiwei},
  journal={arXiv preprint arXiv:2412.16503},
  year={2024}
}
