This repo contains the code for our NeurIPS 2020 paper "Transductive Information Maximization (TIM) for few-shot learning", available at https://arxiv.org/abs/2008.11297. Our method maximizes the mutual information between the query features and the predictions of a few-shot task, subject to supervision constraints from the support set. The results reported in the paper can be reproduced with this repo. The code was developed under Python 3.8.3 and PyTorch 1.4.0. It is parallelized over tasks, which makes the execution of the 10,000 evaluation tasks very efficient.
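To make the objective described above more concrete, here is a minimal, dependency-free sketch of a mutual-information-style loss on one toy task. It is an illustration and not the code in this repo: the function names, the weights `lam` and `alpha`, and their default values are assumptions chosen for readability. The loss combines a cross-entropy term on the support set with the entropy of the average query prediction (to be maximized) and the average entropy of each query prediction (to be minimized).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def mutual_information(query_logits, alpha=1.0):
    """Marginal entropy minus alpha times the mean conditional entropy."""
    query_probs = [softmax(l) for l in query_logits]
    n_classes = len(query_probs[0])
    # Average prediction over all query samples (label marginal).
    marginal = [
        sum(p[k] for p in query_probs) / len(query_probs)
        for k in range(n_classes)
    ]
    cond_ent = sum(entropy(p) for p in query_probs) / len(query_probs)
    return entropy(marginal) - alpha * cond_ent


def tim_style_loss(support_logits, support_labels, query_logits,
                   lam=0.1, alpha=0.1, eps=1e-12):
    """Loss to minimize: lam * CE(support) - weighted mutual information.

    lam and alpha are illustrative defaults, not the repo's settings.
    """
    ce = -sum(
        math.log(softmax(l)[y] + eps)
        for l, y in zip(support_logits, support_labels)
    ) / len(support_labels)
    return lam * ce - mutual_information(query_logits, alpha=alpha)


if __name__ == "__main__":
    support = [[4.0, -4.0], [-4.0, 4.0]]
    labels = [0, 1]
    queries = [[3.0, -3.0], [-3.0, 3.0], [0.5, -0.5]]
    print(tim_style_loss(support, labels, queries))
```

In this sketch, query predictions that are both confident and spread evenly across classes yield the highest mutual information, whereas uniform predictions or predictions collapsed onto a single class yield a value close to zero.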
Use the provided script:
python -m scripts.downloads.download_data
For Tiered-Imagenet, please download the zip file at https://drive.google.com/drive/folders/163HGKZTvfcxsY96uIF6ILK_6ZmlULf_j?usp=sharing, and unzip it into the data/ folder.
We also put the cross-entropy trained models at your disposal. To download them, execute:
python -m scripts.downloads.download_models
For higher reproducibility, we provide the environment that was used to obtain our results. To download it, execute:
python -m scripts.downloads.download_environment
If you face an issue with the previous scripts, everything required can be downloaded manually at https://drive.google.com/open?id=1KicPkBFOQQJptWmSh3NcE4tHl2ZKD87B. Each dataset.zip must be extracted inside the data/ folder. The checkpoints.zip and env.zip files must be extracted at the root of the directory.
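If you go the manual route, the extraction layout described above can be sketched as follows. This helper is hypothetical and not part of the repo: `extract_archives`, `download_dir`, and `repo_root` are illustrative names, and the sketch assumes that every archive other than checkpoints.zip and env.zip is a dataset archive. It deliberately skips env.zip, because Python's `zipfile` module does not restore Unix permission bits, so the environment is probably better extracted with a regular unzip tool.

```python
import zipfile
from pathlib import Path


def extract_archives(download_dir, repo_root):
    """Extract manually downloaded archives into the layout the README describes.

    Assumption: any archive that is not checkpoints.zip or env.zip is a dataset.
    """
    download_dir = Path(download_dir)
    repo_root = Path(repo_root)
    extracted = []
    for archive in sorted(download_dir.glob("*.zip")):
        if archive.name == "env.zip":
            # Skipped on purpose: zipfile drops executable permission bits.
            continue
        if archive.name == "checkpoints.zip":
            target = repo_root           # extracted at the root of the repo
        else:
            target = repo_root / "data"  # dataset archives go into data/
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append((archive.name, target))
    return extracted
```

For example, `extract_archives("~/Downloads/tim", ".")` would need the first path expanded with `Path.expanduser()` beforehand; both paths here are placeholders.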
Instead of using the pre-trained models, you may want to train the models from scratch. Before anything, don't forget to activate the downloaded environment:
source env/bin/activate
Then to visualize the results, turn on your local visdom server:
python -m visdom.server -port 8097
and open it in your browser at http://localhost:8097/. Then, for instance, if you want to train a Resnet-18 on mini-Imagenet, go to the root of the directory and execute:
bash scripts/train/resnet18.sh
Important: Whenever you have trained a new model yourself and want to test it, please add the option eval.fresh_start=True to your test command. Otherwise, the code may use cached information (used to speed up experiments) from previously used models that is no longer valid.
Before anything, don't forget to activate the downloaded environment:
source env/bin/activate
(1 shot/5 shot) | Arch | mini-Imagenet | Tiered-Imagenet |
---|---|---|---|
TIM-ADM | Resnet-18 | 73.6 / 85.0 | 80.0 / 88.5 |
TIM-GD | Resnet-18 | 73.9 / 85.0 | 79.9 / 88.5 |
TIM-ADM | WRN28-10 | 77.5 / 87.2 | 82.0 / 89.7 |
TIM-GD | WRN28-10 | 77.8 / 87.4 | 82.1 / 89.8 |
To reproduce the results from Table 1 in the paper, use the bash files in scripts/evaluate/. For instance, if you want to reproduce the methods on mini-Imagenet, go to the root of the directory and execute:
bash scripts/evaluate/<tim_adm or tim_gd>/mini.sh
This will reproduce the results for the three network architectures in the paper (Resnet-18/WideResNet28-10/DenseNet-121). Upon completion, exhaustive logs can be found in the logs/ folder.
(5 shot) | Arch | CUB -> CUB | mini-Imagenet -> CUB |
---|---|---|---|
TIM-ADM | Resnet18 | 90.7 | 70.3 |
TIM-GD | Resnet18 | 90.8 | 71.0 |
If you want to reproduce the methods on CUB -> CUB, go to the root of the directory and execute:
bash scripts/evaluate/<tim_adm or tim_gd>/cub.sh
If you want to reproduce the methods on mini -> CUB, go to the root of the directory and execute:
bash scripts/evaluate/<tim_adm or tim_gd>/mini2cub.sh
If you want to reproduce the methods with more ways (10 and 20 ways) on mini-Imagenet, go to the root of the directory and execute:
bash scripts/evaluate/<tim_adm or tim_gd>/mini_10_20_ways.sh
(1 shot/5 shot) | Arch | 10 ways | 20 ways |
---|---|---|---|
TIM-ADM | Resnet18 | 56.0 / 72.9 | 39.5 / 58.8 |
TIM-GD | Resnet18 | 56.1 / 72.8 | 39.3 / 59.5 |
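The evaluation commands above all follow the same path pattern, with `<tim_adm or tim_gd>` standing for the method folder. As a convenience, the small sketch below expands that placeholder into the full list of commands. The helper name `evaluation_commands` is hypothetical, and only the script names mentioned in this README are listed; other scripts may exist in the folder.

```python
# Method folders and script names exactly as they appear in this README.
METHODS = ("tim_adm", "tim_gd")
BENCHMARKS = ("mini", "cub", "mini2cub", "mini_10_20_ways")


def evaluation_commands(methods=METHODS, benchmarks=BENCHMARKS):
    """Expand scripts/evaluate/<method>/<benchmark>.sh into shell commands."""
    return [
        f"bash scripts/evaluate/{method}/{benchmark}.sh"
        for method in methods
        for benchmark in benchmarks
    ]


if __name__ == "__main__":
    for command in evaluation_commands():
        print(command)
```

Each printed command is meant to be run from the root of the directory, for instance by passing `command.split()` to `subprocess.run` with `check=True`.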
If you want to reproduce the 4 loss configurations on mini-Imagenet, Tiered-Imagenet and CUB, go to the root of the directory and execute:
bash scripts/ablation/<tim_adm or tim_gd or tim_gd_all>/weighting_effect.sh
for TIM-ADM, TIM-GD {W} and TIM-GD {phi, W}, respectively.
For further questions or details, reach out to Malik Boudiaf (malik.boudiaf.1@etsmtl.net).
We would like to thank the authors of SimpleShot (https://github.com/mileyan/simple_shot) and LaplacianShot (https://github.com/imtiazziko/LaplacianShot) for giving access to their pre-trained models and to their code, which inspired this repo.