Ensemble and personalized transformer models for subject identification and relapse detection in e-Prevention Challenge

Salvatore Calcagno, Raffaele Mineo, Daniela Giordano, Concetto Spampinato

Overview

Official PyTorch implementation of the paper: "Ensemble and personalized transformer models for subject identification and relapse detection in e-Prevention Challenge"

Abstract

We present the devised solutions for subject identification and relapse detection of the e-Prevention Challenge hosted at the ICASSP 2023 conference [1] [2] [3]. We specifically design an ensemble scheme of six models - five transformer-based ones and a CNN model - for the identification of subjects from wearable devices, while a personalized - one for each subject - scheme is used for relapse detection in psychotic disorder. Our final submitted solutions yield top performance on both tracks of the challenge: we ranked second on the subject identification task (with an accuracy of 93.85%) and first on the relapse detection task (with a ROC-AUC and PR-AUC of about 0.65).

Method

Track 1

The architectures employed in the ensemble model are shown below (a Time2Vec sketch follows the table).

| Model Type | Architecture Details | Training Setting |
|---|---|---|
| CNN (Transformer Ablation) | 5 convolutional layers (conv1D, ReLU, BatchNorm, Dropout); AdaptiveAvgPool1d; Time2Vec; Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 1e-4, factor 0.5, patience 10 epochs) |
| Transformer | Embedding (5 convolutional layers, AdaptiveAvgPool1d); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 128, nlayers 2, nhead 2, d_hid 512); Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| Transformer | Embedding (5 convolutional layers, AdaptiveAvgPool1d); Positional Embedding (Time2Vec); Transformer Encoder (model depth 32, nlayers 2, nhead 2, d_hid 128); Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| Transformer | Embedding (5 convolutional layers, AdaptiveAvgPool1d); Positional Embedding (Time2Vec); Transformer Encoder (model depth 32, nlayers 2, nhead 2, d_hid 128); Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| Transformer | Embedding (5 convolutional layers, AdaptiveAvgPool1d); Positional Embedding (Time2Vec); Transformer Encoder (model depth 32, nlayers 2, nhead 2, d_hid 768); Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| Transformer | Embedding (5 convolutional layers, AdaptiveAvgPool1d); Positional Embedding (Time2Vec); Transformer Encoder (model depth 128, nlayers 2, nhead 2, d_hid 768); Fully Connected Classification Head | batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
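Most of the ensemble members use Time2Vec [4] as positional embedding. Below is a minimal sketch of such a layer, intended only as an illustration of the idea; the layer name and tensor shapes are assumptions and do not necessarily match the implementation in this repository.

import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    """Illustrative Time2Vec layer: one linear (trend) component plus periodic components."""
    def __init__(self, out_dim: int):
        super().__init__()
        self.w0 = nn.Parameter(torch.randn(1, 1))
        self.b0 = nn.Parameter(torch.randn(1))
        self.w = nn.Parameter(torch.randn(1, out_dim - 1))
        self.b = nn.Parameter(torch.randn(out_dim - 1))

    def forward(self, t):
        # t: (batch, seq_len, 1) time indices
        linear = t @ self.w0 + self.b0             # (batch, seq_len, 1)
        periodic = torch.sin(t @ self.w + self.b)  # (batch, seq_len, out_dim - 1)
        return torch.cat([linear, periodic], dim=-1)

# Usage sketch: add the encoding to the convolutional embeddings before the encoder, e.g.
# t = torch.arange(seq_len, dtype=torch.float32).view(1, seq_len, 1)
# x = x + Time2Vec(d_model)(t)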

Track 2

Best configurations were found using grid search for each subject:

For CNN-based models we tested the following parameters:

"parameters": {
    "subject": {"values": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]},
    "data_type": {"values": ["aggregated", "raw"]},
    "learning_rate": {"values": [5e-3, 5e-4, 5e-5]},
    "enable_variational": {"values": [0, 1]},
    "model": {"values": ["cnn1d_autoencoder", "volund"]}
}

For Transformer-based models we tested the following parameters:

"parameters": {
    "subject": {"values": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]},
    "data_type": {"values": ["aggregated"]},
    "learning_rate": {"values": [5e-3, 5e-4, 5e-5]},
    "enable_variational": {"values": [0, 1]},
    "model": {"values": ["transformer_autoencoder"]},
    "d_model": {"values": [32, 64, 128]},
    "nhead": {"values": [4, 8, 16]},
    "nlayers": {"values": [2, 4]},
}
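The parameter blocks above follow the Weights & Biases sweep format (this is an inference from the required wandb account, not stated explicitly in the repository). A hedged sketch of launching the transformer grid as a W&B grid sweep is shown below; the project name and the body of train() are placeholders, not taken from this repository.

import wandb

sweep_config = {
    "method": "grid",
    "parameters": {
        "subject": {"values": list(range(10))},
        "data_type": {"values": ["aggregated"]},
        "learning_rate": {"values": [5e-3, 5e-4, 5e-5]},
        "enable_variational": {"values": [0, 1]},
        "model": {"values": ["transformer_autoencoder"]},
        "d_model": {"values": [32, 64, 128]},
        "nhead": {"values": [4, 8, 16]},
        "nlayers": {"values": [2, 4]},
    },
}

def train():
    run = wandb.init()  # each agent call receives one grid point in run.config
    cfg = run.config
    # ... build and train the per-subject model using cfg.subject, cfg.learning_rate, etc.

sweep_id = wandb.sweep(sweep_config, project="e-prevention-track2")  # placeholder project name
wandb.agent(sweep_id, function=train)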

The architectures employed for each subject are shown below (an illustrative relapse-scoring sketch follows the table).

| Subject | Model Type | Architecture Details | Training Setting |
|---|---|---|---|
| 0 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Transformer Decoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 1 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 128, nlayers 2, nhead 16, d_hid 2048); Transformer Decoder (model depth 128, nlayers 2, nhead 16, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| 2 | CNN | CNN Encoder: 5 convolutional layers (conv1D, ReLU, BatchNorm, Dropout); Bottleneck: conv1D, ReLU; CNN Decoder: 5 transposed convolutional layers (convTranspose1D, ReLU, BatchNorm, Dropout) | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 3 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 32, nlayers 2, nhead 4, d_hid 2048); Transformer Decoder (model depth 32, nlayers 2, nhead 4, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-4, factor 0.5, patience 10 epochs) |
| 4 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Transformer Decoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 5 | Volund | | data type raw; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 6 | CNN | CNN Encoder: 5 convolutional layers (conv1D, ReLU, BatchNorm, Dropout); Bottleneck: conv1D, ReLU; CNN Decoder: 5 transposed convolutional layers (convTranspose1D, ReLU, BatchNorm, Dropout); Linear Mapping | data type raw; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 7 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Transformer Decoder (model depth 32, nlayers 2, nhead 8, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 8 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 128, nlayers 2, nhead 8, d_hid 2048); Transformer Decoder (model depth 128, nlayers 2, nhead 8, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
| 9 | Transformer | Embedding (linear projection); Positional Embedding (sin and cos encoding); Transformer Encoder (model depth 128, nlayers 2, nhead 8, d_hid 2048); Transformer Decoder (model depth 128, nlayers 2, nhead 8, d_hid 2048); Linear Mapping | data type aggregated; batch size 64; Adam optimizer; ReduceLROnPlateau scheduler (initial learning rate 5e-3, factor 0.5, patience 10 epochs) |
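The README does not spell out how the per-subject autoencoders are turned into relapse scores. Purely as an illustration (a common approach for autoencoder-based anomaly detection, cf. [3]), each window can be scored by its reconstruction error; the function below is a hypothetical sketch, not the repository's actual scoring code.

import torch

@torch.no_grad()
def anomaly_scores(autoencoder, windows):
    # windows: (num_windows, channels, time) tensor for one subject
    autoencoder.eval()
    recon = autoencoder(windows)
    # mean squared reconstruction error per window; higher values = more anomalous
    return ((recon - windows) ** 2).mean(dim=(1, 2))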

How to run

Pre-requisites

  • NVIDIA GPU (tested on NVIDIA A6000 GPUs)
  • Weights & Biases (wandb) account (change the entity and project name in the scripts)
  • The datasets provided for track 1 and track 2 should be placed in ../datasets
  • Requirements

Track 1

Train Ensemble Models

To start training, simply run the following commands. Each command trains one model configuration, which is then used in the ensemble during validation and testing. Note that the first two commands are identical: that model is counted twice, i.e. it carries a weight of 2 in the voting scheme.

python train_track1.py --window_size 2160 --model transformer --d_model 32 --nhead 2 --d_hid 128 --nlayers 2 --learning_rate 5e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width3_stride3 --data_dir data/track1/width3_stride3
python train_track1.py --window_size 2160 --model transformer --d_model 32 --nhead 2 --d_hid 128 --nlayers 2 --learning_rate 5e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width3_stride3 --data_dir data/track1/width3_stride3
python train_track1.py --window_size 2160 --model transformer --d_model 32 --nhead 2 --d_hid 768 --nlayers 2 --learning_rate 5e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width3_stride3 --data_dir data/track1/width3_stride3
python train_track1.py --window_size 1080 --model transformer --d_model 128 --nhead 2 --d_hid 768 --nlayers 2 --learning_rate 5e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width1_5_stride1_5 --data_dir data/track1/width1_5_stride1_5
python train_track1.py --window_size 2160 --model transformer_ablation_time2vec --d_model 128 --nhead 2 --d_hid 512 --nlayers 2 --learning_rate 5e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width3_stride3 --data_dir data/track1/width3_stride3
python train_track1.py --window_size 2160 --model transformer_ablation --learning_rate 1e-4 --enable_scheduler 1 --batch_size 64 --split_path data/track1/width3_stride3 --data_dir data/track1/width3_stride3

Test Example

The code expects a text file ensemble.txt with the list of model names (the structure is shown below). The file should be placed in the root directory.

YYYY-MM-DD_hh-mm-ss_<model1>
YYYY-MM-DD_hh-mm-ss_<model2>
YYYY-MM-DD_hh-mm-ss_<model3>
YYYY-MM-DD_hh-mm-ss_<model4>
YYYY-MM-DD_hh-mm-ss_<model5>
YYYY-MM-DD_hh-mm-ss_<model6>

The model names YYYY-MM-DD_hh-mm-ss_<model*> correspond to the directory names created in the experiments folder after training.

Run the following to compute the accuracy of each individual model and of the ensemble on the provided validation set.

python test_track1.py --split val

Use --split test if you want to obtain predictions on the test samples. Predictions will be saved to a file named test_track1.csv; ground truth is not available for this split.

The default ensemble scheme is sum. Use the --scheme argument to change it; allowed schemes are min, max and sum.
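As a rough illustration, the three schemes combine the per-model outputs as sketched below. The sketch assumes each model produces a (num_samples, num_subjects) score matrix; listing a model twice in ensemble.txt effectively gives it a weight of 2 under the sum scheme. This is not the exact code of test_track1.py.

import numpy as np

def combine(scores_per_model, scheme="sum"):
    # scores_per_model: list of (num_samples, num_subjects) arrays, one per ensemble member
    stacked = np.stack(scores_per_model, axis=0)
    if scheme == "sum":
        combined = stacked.sum(axis=0)
    elif scheme == "min":
        combined = stacked.min(axis=0)
    elif scheme == "max":
        combined = stacked.max(axis=0)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return combined.argmax(axis=1)  # predicted subject for each sample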

Track 2

Train Best Configurations

To start training, simply run the following commands.

python train_track2.py --subject 0 --model transformer_autoencoder --d_model 32 --nhead 8 --nlayers 2 --data_type aggregated --learning_rate 5e-3
python train_track2.py --subject 1 --model transformer_autoencoder --d_model 128 --nhead 16 --nlayers 2 --data_type aggregated --learning_rate 5e-4
python train_track2.py --subject 2 --model cnn1d_autoencoder --data_type aggregated --learning_rate 5e-3
python train_track2.py --subject 3 --model transformer_autoencoder --d_model 32 --nhead 4 --nlayers 2 --data_type aggregated --learning_rate 5e-4
python train_track2.py --subject 4 --model transformer_autoencoder --d_model 32 --nhead 8 --nlayers 2 --data_type aggregated --learning_rate 5e-3
python train_track2.py --subject 5 --model volund --data_type raw --learning_rate 5e-3
python train_track2.py --subject 6 --model cnn1d_autoencoder --data_type raw --learning_rate 5e-3
python train_track2.py --subject 7 --model transformer_autoencoder --d_model 128 --nhead 8 --nlayers 2 --data_type aggregated --learning_rate 5e-3
python train_track2.py --subject 8 --model transformer_autoencoder --d_model 128 --nhead 8 --nlayers 2 --data_type aggregated --learning_rate 5e-3
python train_track2.py --subject 9 --model transformer_autoencoder --d_model 128 --nhead 8 --nlayers 2 --data_type aggregated --learning_rate 5e-3

Test Example

The code expects a text file best_models.txt with the list of model names, in the same format as for the first track. The file should be placed in the root directory.

Run the following to compute the performance (ROC-AUC, PR-AUC, and their harmonic mean) of the single per-subject models on the provided validation set.

python test_track2.py --split val

Use --split test if you want to obtain predictions on the test samples. Predictions will be saved to a file named test_track2.csv; ground truth is not available for this split.
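For reference, the reported metrics can be computed as in the sketch below, assuming per-sample relapse scores and binary labels are available for the validation split (the use of scikit-learn here is an assumption, not necessarily what test_track2.py does).

from sklearn.metrics import average_precision_score, roc_auc_score

def track2_metrics(labels, scores):
    roc_auc = roc_auc_score(labels, scores)
    pr_auc = average_precision_score(labels, scores)   # PR-AUC
    hmean = 2 * roc_auc * pr_auc / (roc_auc + pr_auc)  # harmonic mean of the two
    return roc_auc, pr_auc, hmean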

References

[1] A. Zlatintsi, P. P. Filntisis, C. Garoufis, N. Efthymiou, P. Maragos, A. Menychtas, I. Maglogiannis, et al., "e-Prevention: Advanced support system for monitoring and relapse prevention in patients with psychotic disorders analyzing long-term multimodal data from wearables and video captures," Sensors, vol. 22, no. 19, 2022.

[2] G. Retsinas, P. P. Filntisis, N. Efthymiou, E. Theodosis, A. Zlatintsi, and P. Maragos, "Person identification using deep convolutional neural networks on short-term signals from wearable sensors," in ICASSP. IEEE, 2020.

[3] M. Panagiotou, A. Zlatintsi, P. P. Filntisis, A. J. Roumeliotis, N. Efthymiou, and P. Maragos, "A comparative study of autoencoder architectures for mental health analysis using wearable sensors data," in EUSIPCO. IEEE, 2022.

[4] S. M. Kazemi, R. Goel, S. Eghbali, J. Ramanan, J. Sahota, S. Thakur, S. Wu, C. Smyth, P. Poupart, and M. Brubaker, "Time2Vec: Learning a vector representation of time," arXiv preprint arXiv:1907.05321, 2019.
