The official repository for the NeurIPS'23 paper "TriRE: A Multi-Mechanism Learning Paradigm for Continual Knowledge Retention and Promotion". We extend the original DER++ repository with our method.
- Use `python main.py` to run experiments.
- For example, for the dataset Seq-CIFAR100, run:

  ```bash
  python main.py --dataset seq-cifar100 --model trire --buffer_size 200 --load_best_args --img_size 32 --tensorboard --reservoir_buffer --kwinner_sparsity 0.3 --pruning_technique CWI --sparsity 0.2 --lr_fl 0.002 --lr_sl 0.0001 --reset_act_counters --train_budget_1 0.6 --train_budget_2 0.2 --reparameterize --reinit_technique rewind --use_cl_mask --reg_weight 0.05 --stable_model_update_freq 0.1 --rewind_tuning_incl --use_het_drop
  ```
  where:

  - `kwinner_sparsity` and `sparsity` denote the fractions of most-activated neurons and of their most important weights, respectively, that are retained at the end of the Retain stage (a minimal sketch follows this list).
  - `pruning_technique`: {'CWI', 'Magnitude Pruning', 'Fisher Information'}
  - `lr_fl` is the learning rate for the Retain and Revise stages, and `lr_sl` is the slowed learning rate for the Revise stage.
  - `train_budget_1` and `train_budget_2` are the fractions of epochs dedicated to the Retain and Revise stages; the remaining epochs are used for the Rewind stage.
  - `use_cl_mask` indicates that the model uses a single-head classifier.
  - `reinit_technique`: {xavier, rewind}
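For intuition, here is a minimal sketch (hypothetical helpers, not part of this repo) of how `kwinner_sparsity` can select the most-activated units and how the two train budgets partition the epochs across the three stages:

```python
import torch

def split_epochs(n_epochs, train_budget_1=0.6, train_budget_2=0.2):
    """Partition epochs into Retain / Revise / Rewind (hypothetical helper)."""
    retain = int(n_epochs * train_budget_1)
    revise = int(n_epochs * train_budget_2)
    rewind = n_epochs - retain - revise  # leftover epochs go to Rewind
    return retain, revise, rewind

def kwinner_mask(activation_counts, kwinner_sparsity=0.3):
    """Binary mask keeping the top `kwinner_sparsity` fraction of the
    most-activated units (hypothetical helper)."""
    k = max(1, int(kwinner_sparsity * activation_counts.numel()))
    mask = torch.zeros_like(activation_counts)
    mask[torch.topk(activation_counts, k).indices] = 1.0
    return mask

print(split_epochs(50))                                   # (30, 10, 10)
print(kwinner_mask(torch.tensor([5., 1., 9., 3.]), 0.5))  # tensor([1., 0., 1., 0.])
```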
- Use the argument `--load_best_args` to use the best hyperparameters from the paper.
- New models can be added to the `models/` folder.
- New datasets can be added to the `datasets/` folder.
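As a rough sketch, a new model file in `models/` would typically subclass the framework's `ContinualModel` and implement `observe()`, following the DER++/mammoth convention; the attribute names (`self.opt`, `self.net`, `self.loss`) and the exact `observe` signature are assumptions and may differ in this codebase:

```python
# models/my_model.py -- hypothetical skeleton, assuming the mammoth-style API
from models.utils.continual_model import ContinualModel

class MyModel(ContinualModel):
    NAME = 'my_model'
    COMPATIBILITY = ['class-il', 'task-il']

    def observe(self, inputs, labels, not_aug_inputs):
        # One optimization step on the current batch; returns the scalar loss.
        self.opt.zero_grad()
        outputs = self.net(inputs)
        loss = self.loss(outputs, labels)
        loss.backward()
        self.opt.step()
        return loss.item()
```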
Class-IL / Task-IL settings
- Seq-CIFAR10
- Seq-CIFAR100
- Seq-TinyImageNet
If you find the code useful in your research, please consider citing our paper:
```bibtex
@article{vijayan2024trire,
  title={TriRE: A Multi-Mechanism Learning Paradigm for Continual Knowledge Retention and Promotion},
  author={Vijayan, Preetha and Bhat, Prashant and Zonooz, Bahram and Arani, Elahe},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}
```