Dev/sinhala mbart #1

Open
wants to merge 41 commits into base: main
Changes from 1 commit
41 commits
d975d92
Sinhala mbart kwargs
NomadXD Jul 15, 2021
e52b390
sinhala sample dataset added
NomadXD Jul 21, 2021
bb5ef21
valid, test modified for sinhala
NomadXD Jul 21, 2021
734942d
max tokens and batch size changed
NomadXD Jul 22, 2021
4a064b2
--memory-efficient-fp16 arg added
NomadXD Jul 22, 2021
c571220
max-tokens 928, batch size 16, warmup 500, total 20000,
NomadXD Jul 22, 2021
8ff3e82
--skip-invalid-size-inputs-valid-test
NomadXD Jul 22, 2021
565f7a2
removed --validate-interval-updates
NomadXD Jul 22, 2021
bb561aa
warmup 1000, total 40000
NomadXD Jul 22, 2021
a766e14
max_tokens=256
NomadXD Jul 22, 2021
3258de4
warmup 2500 , total 40000 , update-freq 8
NomadXD Jul 22, 2021
d0f86f1
update freq 2
NomadXD Jul 22, 2021
5a8f2ca
batch size 32, max_tokens 1024
NomadXD Jul 22, 2021
c7436a7
disable validation
NomadXD Jul 22, 2021
688c067
max_update 1000
NomadXD Jul 22, 2021
db5353f
max_sentences = 1024, save_interval = 1000
NomadXD Jul 22, 2021
bc338cb
save-interval=5, validate-interval=100
NomadXD Jul 22, 2021
b889318
evaluation dataset added
NomadXD Jul 23, 2021
84a5d47
simplify modified for si, examples.si added
NomadXD Jul 23, 2021
b4b236e
examples changed
NomadXD Jul 23, 2021
285b962
examples.si changed
NomadXD Jul 23, 2021
6795a32
examples.si changed
NomadXD Jul 23, 2021
20f0347
examples.si changed
NomadXD Jul 23, 2021
32ca980
ignore generated datasets
NomadXD Jul 24, 2021
ca6484a
newsela dataset added
NomadXD Aug 4, 2021
d431bc6
test, valid dataset changed to newsela 2000
NomadXD Aug 4, 2021
ab4b600
fine-tune dataset for sinhala
NomadXD Aug 6, 2021
db55bff
max_updates=100
NomadXD Aug 6, 2021
04239fe
max_tokens=32
NomadXD Aug 6, 2021
497b422
muss_si_newsela
NomadXD Aug 6, 2021
45de2a3
mbart_si_
NomadXD Aug 6, 2021
1b28bfa
examples.si 350
NomadXD Aug 6, 2021
f887860
test params
NomadXD Aug 6, 2021
470944c
newsela_1
NomadXD Aug 8, 2021
db6245f
test dataset added
NomadXD Aug 14, 2021
2b5e451
valid dataset added
NomadXD Aug 14, 2021
98f871a
write output to file
NomadXD Aug 31, 2021
ca7fc71
file open in w+
NomadXD Aug 31, 2021
f6402dc
--finetune-from-model mode enabled
NomadXD Sep 1, 2021
eff782c
--reset-optimizer --reset-meters --reset-dataloader --reset-lr-scheduler
NomadXD Sep 1, 2021
771a054
--memory-efficient-fp16
NomadXD Sep 1, 2021
max tokens and batch size changed
NomadXD committed Jul 22, 2021
commit 734942d9ed0c216e99d90ad49553ed115b7a0017
4 changes: 2 additions & 2 deletions muss/mining/training.py
@@ -242,12 +242,12 @@ def get_mbart_kwargs(dataset, language, use_access, use_short_name=False):
'train_kwargs': add_dicts(
{'ngpus': 8},
args_str_to_dict(
-f'''--restore-file {mbart_path} --arch mbart_large --task translation_from_pretrained_bart --source-lang {source_lang} --target-lang {target_lang} --encoder-normalize-before --decoder-normalize-before --criterion label_smoothed_cross_entropy --label-smoothing 0.2 --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 3e-05 --min-lr -1 --warmup-updates 2500 --total-num-update 40000 --dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 --max-tokens 1024 --update-freq 2 --log-format simple --log-interval 2 --reset-optimizer --reset-meters --reset-dataloader --reset-lr-scheduler --langs ar_AR,cs_CZ,de_DE,en_XX,es_XX,et_EE,fi_FI,fr_XX,gu_IN,hi_IN,it_IT,ja_XX,kk_KZ,ko_KR,lt_LT,lv_LV,my_MM,ne_NP,nl_XX,ro_RO,ru_RU,si_LK,tr_TR,vi_VN,zh_CN
+f'''--restore-file {mbart_path} --arch mbart_large --task translation_from_pretrained_bart --source-lang {source_lang} --target-lang {target_lang} --encoder-normalize-before --decoder-normalize-before --criterion label_smoothed_cross_entropy --label-smoothing 0.2 --dataset-impl mmap --optimizer adam --adam-eps 1e-06 --adam-betas '(0.9, 0.98)' --lr-scheduler polynomial_decay --lr 3e-05 --min-lr -1 --warmup-updates 2500 --total-num-update 40000 --dropout 0.3 --attention-dropout 0.1 --weight-decay 0.0 --max-tokens 256 --update-freq 32 --log-format simple --log-interval 2 --reset-optimizer --reset-meters --reset-dataloader --reset-lr-scheduler --langs ar_AR,cs_CZ,de_DE,en_XX,es_XX,et_EE,fi_FI,fr_XX,gu_IN,hi_IN,it_IT,ja_XX,kk_KZ,ko_KR,lt_LT,lv_LV,my_MM,ne_NP,nl_XX,ro_RO,ru_RU,si_LK,tr_TR,vi_VN,zh_CN
 --layernorm-embedding --ddp-backend no_c10d'''
 ),
 ), # noqa: E501
 'generate_kwargs': args_str_to_dict(
-f'''--task translation_from_pretrained_bart --source_lang {source_lang} --target-lang {target_lang} --batch-size 32 --langs ar_AR,cs_CZ,de_DE,en_XX,es_XX,et_EE,fi_FI,fr_XX,gu_IN,hi_IN,it_IT,ja_XX,kk_KZ,ko_KR,lt_LT,lv_LV,my_MM,ne_NP,nl_XX,ro_RO,ru_RU,si_LK,tr_TR,vi_VN,zh_CN''' # noqa: E501
+f'''--task translation_from_pretrained_bart --source_lang {source_lang} --target-lang {target_lang} --batch-size 16 --langs ar_AR,cs_CZ,de_DE,en_XX,es_XX,et_EE,fi_FI,fr_XX,gu_IN,hi_IN,it_IT,ja_XX,kk_KZ,ko_KR,lt_LT,lv_LV,my_MM,ne_NP,nl_XX,ro_RO,ru_RU,si_LK,tr_TR,vi_VN,zh_CN''' # noqa: E501
),
'evaluate_kwargs': get_evaluate_kwargs(language),
}
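
Note on the effect of this commit: lowering `--max-tokens` reduces the per-GPU batch (and so peak memory), while raising `--update-freq` accumulates gradients over more batches before each optimizer step. The sketch below works out the resulting upper bound on tokens per update for the two settings in the diff. It is a rough estimate only: `tokens_per_update` is a hypothetical helper, not part of MUSS or fairseq, and it assumes `ngpus` in `train_kwargs` is the number of GPUs actually used and that every batch is filled to `--max-tokens`.

```python
def tokens_per_update(max_tokens: int, update_freq: int, ngpus: int) -> int:
    """Upper bound on the tokens that contribute to one optimizer step.

    Hypothetical helper for illustration. Assumes each GPU processes
    `update_freq` batches of at most `max_tokens` tokens per step.
    """
    return max_tokens * update_freq * ngpus


# Values taken from the diff; ngpus=8 comes from train_kwargs.
before = tokens_per_update(max_tokens=1024, update_freq=2, ngpus=8)
after = tokens_per_update(max_tokens=256, update_freq=32, ngpus=8)

print(f"before: {before} tokens/update")  # 16384
print(f"after:  {after} tokens/update")   # 65536
print(f"ratio:  {after / before:.1f}x")   # 4.0x
```

Under these assumptions the change does not keep the effective batch constant: it grows by a factor of four, so fewer optimizer steps are taken per epoch with the same `--lr`, `--warmup-updates`, and `--total-num-update`. Keeping the product unchanged at `--max-tokens 256` would need `--update-freq 8`.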