P2AT: Pyramid Pooling Axial Transformer for Real-time Semantic Segmentation [Arxiv]
The paper has been accepted at Expert Systems with Applications (ESWA).
Code will be released soon, stay tuned!
You need to download the Cityscapes dataset, rename the folder to cityscapes, and put it under the data folder.
You can download the Camvid dataset from Kaggle.
- You need to download the Cityscapes datasets, unzip them, and put the files in the data folder with the following structure.
$SEG_ROOT/data
├── Camvid
│   ├── images
│   └── labels
├── cityscapes
│   ├── gtFine
│   │   ├── test
│   │   ├── train
│   │   └── val
│   └── leftImg8bit
│       ├── test
│       ├── train
│       └── val
└── list
    ├── Camvid
    │   ├── test.lst
    │   ├── train.lst
    │   ├── trainval.lst
    │   └── val.lst
    └── cityscapes
        ├── test.lst
        ├── train.lst
        ├── trainval.lst
        └── val.lst
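Before training, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch, not part of this repository: the helper names `expected_entries` and `missing_entries` are hypothetical, and the folder names (including the `Camvid` capitalization) are copied from the tree above and may need adjusting to match your local copy.

```python
from pathlib import Path

# Split folders and list files as shown in the directory tree above.
SPLITS = ("test", "train", "val")
LIST_FILES = ("test.lst", "train.lst", "trainval.lst", "val.lst")


def expected_entries():
    """Return the relative paths that the data folder is expected to contain."""
    entries = ["Camvid/images", "Camvid/labels"]
    for sub in ("gtFine", "leftImg8bit"):
        entries += [f"cityscapes/{sub}/{split}" for split in SPLITS]
    for dataset in ("Camvid", "cityscapes"):
        entries += [f"list/{dataset}/{name}" for name in LIST_FILES]
    return entries


def missing_entries(data_root):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected_entries() if not (root / rel).exists()]
```

Calling `missing_entries("data")` from the repository root returns an empty list when every expected folder and list file is present, and otherwise returns the relative paths that still need to be created or downloaded.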
- For instance, to train P2AT-S on the Camvid dataset with a total batch size of 8 on 2 GPUs (4 per GPU):
python train.py --cfg configs/camvid/p2at_small_camvid.yaml GPUS "(0,1)" TRAIN.BATCH_SIZE_PER_GPU 4
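The trailing arguments of the command are key/value overrides, and the total batch size is the number of GPUs times the per-GPU batch size. The sketch below illustrates that arithmetic under the assumption that overrides are read as alternating key/value pairs and evaluated as Python literals. How `train.py` actually parses them is not shown here, and `parse_overrides` and `effective_batch_size` are hypothetical helpers.

```python
import ast


def parse_overrides(opts):
    """Turn ['KEY', 'value', ...] into a dict, evaluating values as Python literals."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    parsed = {}
    for key, raw in zip(opts[0::2], opts[1::2]):
        try:
            parsed[key] = ast.literal_eval(raw)  # "(0,1)" -> (0, 1), "4" -> 4
        except (ValueError, SyntaxError):
            parsed[key] = raw  # keep plain strings unchanged
    return parsed


def effective_batch_size(overrides):
    """Total batch size = number of GPUs * batch size per GPU."""
    return len(overrides["GPUS"]) * overrides["TRAIN.BATCH_SIZE_PER_GPU"]


overrides = parse_overrides(["GPUS", "(0,1)", "TRAIN.BATCH_SIZE_PER_GPU", "4"])
print(effective_batch_size(overrides))  # 2 GPUs * 4 per GPU = 8
```

To keep the same total batch size on a single GPU under this reading, the overrides would be `GPUS (0,)` with `TRAIN.BATCH_SIZE_PER_GPU 8`, memory permitting.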