Par_bn and mam_adapter got "eval_matthews_correlation: 0.0" on GLUE-CoLA
# for par_bn
adapter_args.adapter_config = "par_bn"
setup_adapter_training(model, adapter_args, data_args.task_name or "glue")
# for mam_adapter
adapter_args.adapter_config = "mam"
setup_adapter_training(model, adapter_args, data_args.task_name or "glue")
Using this run_glue.py, I ran my script as previously listed.
Observed behavior
$ python run_script.py
/proj/ossdataset1/wenjingk/peft/adapters/run_adapters.sh: line 18: cd: adapters: No such file or directory
Using the WANDB_DISABLED environment variable is deprecated and will be removed in v5. Use the --report_to flag to control the integrations used for logging result (for instance --report_to none).
12/15/2023 17:06:33 - WARNING - main - Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: True, 16-bits training: False
/proj/ossdataset1/wenjingk/anaconda3/envs/llm/lib/python3.8/site-packages/datasets/load.py:2088: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=<use_auth_token>' instead.
warnings.warn(
[WARNING|modeling_utils.py:3952] 2023-12-15 17:06:41,336 >> Some weights of BertAdapterModel were not initialized from the model checkpoint at bert-large-uncased and are newly initialized: ['heads.default.3.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
12/15/2023 17:06:44 - WARNING - accelerate.utils.other - Detected kernel version 5.4.0, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.
10% | 536/5360 [05:10<34:51, 2.31it/s]
{'eval_loss': 0.6233669519424438, 'eval_matthews_correlation': 0.0, 'eval_runtime': 10.1056, 'eval_samples_per_second': 103.21, 'eval_steps_per_second': 12.963, 'epoch': 1.0}
15% | 804/5360 [07:50<32:58, 2.30it/s]
20% | 1072/5360 [10:32<30:59, 2.31it/s]
{'eval_loss': 0.618126392364502, 'eval_matthews_correlation': 0.0, 'eval_runtime': 10.0855, 'eval_samples_per_second': 103.415, 'eval_steps_per_second': 12.989, 'epoch': 3.0}
25% | 1340/5360 [13:11<29:06, 2.30it/s]
30% | 1608/5360 [15:52<27:10, 2.30it/s]
{'eval_loss': 0.6205015182495117, 'eval_matthews_correlation': 0.0, 'eval_runtime': 10.0833, 'eval_samples_per_second': 103.439, 'eval_steps_per_second': 12.992, 'epoch': 5.0}
35% | 1876/5360 [18:32<25:13, 2.30it/s]
40% | 2144/5360 [21:14<23:17, 2.30it/s]
{'eval_loss': 0.6273216009140015, 'eval_matthews_correlation': 0.0, 'eval_runtime': 10.0745, 'eval_samples_per_second': 103.529, 'eval_steps_per_second': 13.003, 'epoch': 7.0}
45% | 2412/5360 [23:53<21:16, 2.31it/s]
50% | 2680/5360 [26:35<19:23, 2.30it/s]
{'eval_loss': 0.6188081502914429, 'eval_matthews_correlation': 0.0, 'eval_runtime': 10.0585, 'eval_samples_per_second': 103.694, 'eval_steps_per_second': 13.024, 'epoch': 9.0}
From the script parameters you shared, it seems you're using a bert-large checkpoint. For training to succeed, you might need to tune the training hyperparameters for this model a bit, e.g. by lowering the learning rate or adding warmup steps to help the model converge. An eval_matthews_correlation of exactly 0.0 at every epoch usually means the model is predicting a single class for all examples, which is consistent with a learning rate that is too high for this setup.
I was able to get a Matthews correlation of ~0.546 after 1 epoch with your training setup (par_bn config) by lowering the learning rate to 5e-5 and setting --warmup_steps 200; a sketch of the adjusted arguments follows below.
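For concreteness, a minimal sketch of those adjustments expressed as transformers TrainingArguments, which run_glue.py builds from its CLI flags. Everything except learning_rate and warmup_steps is an assumption: the epoch count follows from the log above (5360 total steps at 536 steps per epoch), the batch size of 16 roughly matches 536 steps over CoLA's ~8.5k training examples, and the output directory is made up.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./cola_par_bn",       # hypothetical output path
    learning_rate=5e-5,               # lowered from the original setting
    warmup_steps=200,                 # linear LR warmup over the first 200 steps
    num_train_epochs=10,              # inferred: 5360 total steps / 536 steps per epoch
    per_device_train_batch_size=16,   # assumed from the step counts above
    evaluation_strategy="epoch",      # matches the per-epoch eval lines in the log
)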
Environment info
Python 3.8 (conda env), single GPU (cuda:0), Linux kernel 5.4.0 (as shown in the logs above)
Information
Model I am using (Bert, XLNet ...): Bert-large
Language I am using the model on (English, Chinese ...): English
Adapter setup I am using (if any): setup_adapter_training(model, adapter_args, data_args.task_name or "glue"), with adapter_args.adapter_config set to "par_bn" or "mam"
The problem arises when using: my own modified scripts (the run_glue.py adapter example, modified as above)
The task I am working on is: an official GLUE task, CoLA
To reproduce
Steps to reproduce the behavior: run the modified run_glue.py on GLUE CoLA with adapter_config "par_bn" or "mam", using the script parameters listed previously.
Expected behavior
eval_matthews_correlation should rise above 0.0 as training progresses, rather than staying at exactly 0.0 for every epoch.