This is a capstone project on generalized few-shot segmentation (GFSS). The repo provides a clean and scalable codebase for GFSS research; the code for our key contributions lives under `src/`.
- Test Env: Python 3.9.7 (Singularity)
- Path on NYU Greene: `/scratch/hl3797/overlay-25GB-500K.ext3`
- Packages:
  - torch (1.10.2+cu113), torchvision (0.11.3+cu113), timm (0.5.4)
  - numpy, scipy, pandas, tensorboardX
  - opencv-python (cv2), einops
```shell
git clone https://github.com/hmdliu/GFSS-Capstone && cd GFSS-Capstone
```
Note: make sure the data path in `scripts/prepare_pascal.sh` works for you.

```shell
# default data root: ../dataset/VOCdevkit/VOC2012
bash scripts/prepare_pascal.sh
```
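Before running the script, it can be handy to verify that the data root actually contains the dataset. The snippet below is a hypothetical check (not part of the repo), assuming the standard PASCAL VOC 2012 sub-folder names:

```python
from pathlib import Path

def check_voc_root(data_root: str) -> list:
    """Return the standard PASCAL VOC 2012 sub-folders missing under data_root."""
    root = Path(data_root)
    expected = ("JPEGImages", "SegmentationClass", "SegmentationObject")
    return [sub for sub in expected if not (root / sub).is_dir()]

# default data root used by scripts/prepare_pascal.sh
missing = check_voc_root("../dataset/VOCdevkit/VOC2012")
print("missing sub-folders:", missing)
```

If the list is non-empty, adjust the data root in `scripts/prepare_pascal.sh` accordingly.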
You may refer to PFENet for more details.
For ImageNet pre-trained weights, please download them here (credits: PFENet) and unzip them as `initmodel/`.
For base class pre-trained weights, you may find them here (credits: CWT) and rename them as follows: `pretrained/[dataset]/split[i]/pspnet_resnet[layers]/best.pth`. We'll release our weights shortly.
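As a sanity check, the naming scheme above can be turned into a small path-building helper (hypothetical, not part of the repo):

```python
from pathlib import Path

def base_weight_path(dataset: str, split: int, layers: int) -> Path:
    """Build the expected checkpoint location, following the
    pretrained/[dataset]/split[i]/pspnet_resnet[layers]/best.pth scheme."""
    return Path("pretrained") / dataset / f"split{split}" / f"pspnet_resnet{layers}" / "best.pth"

# e.g., PASCAL fold 0 with a ResNet-50 backbone
print(base_weight_path("pascal", 0, 50))
```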
- `initmodel/`: ImageNet pre-trained backbone weights (`.pth`).
- `pretrained/`: base class pre-trained backbone weights (`.pth`).
- `configs/`: base configurations for experiments (`.yaml`).
- `scripts/`: training and helper scripts (`.sh`, `.slurm`).
- `results/`: logs and checkpoints (`.log`, `.pth`, `.yaml`).
- `src/`: source code (`.py`).
The `exp_id` enables efficient config modifications for experiment purposes. It follows the format `[exp_group]_[meta_cfg]_[train_cfg]`; see `src/exp.py` for a sample usage (PASCAL 1-shot on fold 0).
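The mapping from an `exp_id` to its output directory can be sketched as follows; this is a hypothetical helper for illustration (the actual parsing lives in `src/exp.py`):

```python
def split_exp_id(exp_id: str):
    """Split an exp_id of the form [exp_group]_[meta_cfg]_[train_cfg]
    and derive its results directory, e.g. sample_t2_pm10 -> results/sample/t2_pm10."""
    exp_group, meta_cfg, train_cfg = exp_id.split("_", 2)
    output_dir = f"results/{exp_group}/{meta_cfg}_{train_cfg}"
    return exp_group, meta_cfg, train_cfg, output_dir

print(split_exp_id("sample_t2_pm10"))
```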
```shell
# debug mode (i.e., only log to shell)
python -m src.test --config configs/pascal_mib.yaml --exp_id sample_t2_pm10 --debug True

# submit to slurm
sbatch scripts/test_pascal.slurm configs/pascal_mib.yaml sample_t2_pm10

# check the log (output dir: results/sample/t2_pm10)
tail results/sample/t2_pm10/output.log
```
Qualitative Ablation on Contrastive Loss (left) and Distillation Loss (right):
The repo for drawing utilities can be found here.
Major References: PFENet, CWT, Segmenter, MiB, and ContrastiveSeg.
- Haoming Liu (hl3797@nyu.edu)
- Chengyu Zhang (cz1627@nyu.edu)
- Xiaochen Lu (xl3139@nyu.edu)
We thank Professor Li Guo for her consistent guidance throughout the project. We thank Professor Hongyi Wen for his suggestions on the project write-up. This work was supported through the NYU IT High Performance Computing resources, services, and staff expertise.