Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation
Junyu Gao, Xuan Yao, Changsheng Xu
State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences.
- Install the Matterport3D simulator: follow the instructions here. We use the latest version, the same as used in DUET.
```bash
export PYTHONPATH=Matterport3DSimulator/build:$PYTHONPATH
```
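If you prefer to set the path from inside a script rather than the shell, the export above can be mirrored in Python. The checkout location below is an assumption; adjust it to wherever you built the simulator.

```python
import os
import sys

# Hypothetical location of your Matterport3DSimulator checkout; adjust as needed.
sim_build = os.path.join(os.getcwd(), "Matterport3DSimulator", "build")

# Equivalent of `export PYTHONPATH=Matterport3DSimulator/build:$PYTHONPATH`,
# done per-process instead of in the shell.
if sim_build not in sys.path:
    sys.path.insert(0, sim_build)
```

After this, `import MatterSim` should resolve against the simulator's build directory.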
- Install requirements:
```bash
conda create --name fsvln python=3.8.5
conda activate fsvln
```
Required packages are listed in `requirements.txt`. You can install them by running:
```bash
pip install -r requirements.txt
```
- Please download the data from Dropbox, including the processed annotations, features, and pretrained models for the REVERIE and R2R datasets. Before running the code, put the data in the `datasets` directory.
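As a quick sanity check before launching anything, you can verify that the expected dataset folders exist. The subdirectory names below are an assumption about the downloaded archive's layout; adjust them to match what you actually unpacked.

```python
import os

# Assumed subdirectories under `datasets`; edit to match your download.
EXPECTED = ["REVERIE", "R2R", "pretrained"]

def missing_dirs(root="datasets", expected=EXPECTED):
    """Return the expected dataset subdirectories that are not present under `root`."""
    return [name for name in expected if not os.path.isdir(os.path.join(root, name))]

# Example: report anything missing before running the code.
print(missing_dirs())
```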
- Please download the pretrained LXMERT model by running:
```bash
mkdir -p datasets/pretrained
wget https://nlp.cs.unc.edu/data/model_LXRT.pth -P datasets/pretrained
```
Combine behavior cloning and auxiliary proxy tasks in pretraining:
```bash
cd pretrain_src
bash run_reverie.sh
```
Use the pseudo-interactive demonstrator to fine-tune the model:
```bash
cd map_nav_src
bash scripts/run_reverie.sh
```
Use the pseudo-interactive demonstrator to equip the model with our FSTTA:
```bash
cd map_nav_src
bash scripts/run_reverie_tta.sh
```
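For intuition only, the fast-slow test-time adaptation loop can be caricatured as a fast phase that adapts per step and a slow phase that periodically consolidates the fast weights into a stable anchor. This is a minimal conceptual sketch with made-up hyperparameters, not the authors' actual FSTTA implementation (which operates on gradient decomposition inside the navigation model).

```python
import numpy as np

def fast_slow_tta_sketch(theta, grads, fast_lr=0.01, slow_momentum=0.9, period=4):
    """Conceptual fast-slow update loop over a stream of test-time gradients.

    Fast updates are applied at every step; every `period` steps the slow
    (anchor) parameters absorb the fast ones via a momentum average, damping
    noisy online updates. All names and values here are illustrative.
    """
    slow = theta.copy()
    fast = theta.copy()
    for t, g in enumerate(grads, start=1):
        fast -= fast_lr * g              # fast phase: per-step adaptation
        if t % period == 0:              # slow phase: periodic consolidation
            slow = slow_momentum * slow + (1 - slow_momentum) * fast
            fast = slow.copy()           # restart the fast phase from the anchor
    return slow
```

The point of the two timescales is that the slow parameters drift far less than the raw per-step updates, which keeps online adaptation from destabilizing the navigator.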
Our implementation is partially based on VLN-DUET, HM3DAutoVLN, and VLN-BEVBert. Thanks to the authors for sharing their code.
- Reverie: Remote embodied visual referring expression in real indoor environments
- Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments
If you find this project useful in your research, please consider citing:
```bibtex
@inproceedings{Gao2024Fast,
  title={Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation},
  author={Gao, Junyu and Yao, Xuan and Xu, Changsheng},
  booktitle={Proceedings of the 41st International Conference on Machine Learning},
  year={2024}
}
```