This repo contains an experiment in fine-tuning ruGPT-3Large for Question Answering (QA) and runs the model on a SQuAD-like dataset, SberQuAD. It uses Hugging Face's PyTorch implementation of ruGPT-3 and adapts their BERT QA fine-tuning code.
SQuAD data can be downloaded from: https://github.com/rajpurkar/SQuAD-explorer/tree/master/dataset
SberQuAD data can be downloaded from: https://github.com/kniazevgeny/BERT-QA-fine-tuning
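Both datasets follow the SQuAD JSON schema. A minimal sketch of walking that structure (field names are the official SQuAD v2.0 ones; SberQuAD uses the same SQuAD-like layout):

```python
import json

# Walk the SQuAD-style JSON structure shared by SQuAD and SberQuAD.
with open("data/train-v2.0.json", encoding="utf-8") as f:
    squad = json.load(f)

for article in squad["data"]:
    for paragraph in article["paragraphs"]:
        context = paragraph["context"]  # the passage text
        for qa in paragraph["qas"]:
            question = qa["question"]
            # v2.0 marks unanswerable questions with is_impossible;
            # v1.x files simply lack the field.
            unanswerable = qa.get("is_impossible", False)
            answers = [a["text"] for a in qa["answers"]]
```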
To fine-tune on SQuAD and write predictions in one run:
python gpt2_squad.py --output_dir=output/ --train_file=data/train-v2.0.json --do_train --train_batch_size=8 --predict_file=data/dev-v2.0.json --do_predict --model_name=ruGPT3Small
You can also choose the model with the --model_name argument, e.g. --model_name=ruGPT3Large.
Only three models are available: ruGPT3Small, ruGPT3Medium, and ruGPT3Large.
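These names presumably resolve to the public Sber checkpoints on the Hugging Face hub; here is a hedged sketch of loading one directly with transformers (the hub IDs below are assumptions about the mapping, not taken from gpt2_squad.py):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Assumed mapping from --model_name values to public hub checkpoints.
CHECKPOINTS = {
    "ruGPT3Small": "sberbank-ai/rugpt3small_based_on_gpt2",
    "ruGPT3Medium": "sberbank-ai/rugpt3medium_based_on_gpt2",
    "ruGPT3Large": "sberbank-ai/rugpt3large_based_on_gpt2",
}

checkpoint = CHECKPOINTS["ruGPT3Large"]
tokenizer = GPT2Tokenizer.from_pretrained(checkpoint)  # GPT-2-style BPE tokenizer
model = GPT2LMHeadModel.from_pretrained(checkpoint)
```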
A TPU variant of the same script is also provided:
python gpt2_squad_tpu.py --output_dir=output/ --train_file=data/train-v2.0.json --do_train --train_batch_size=32 --predict_file=data/dev-v2.0.json --do_predict
Required:
--train_file
SQuAD-like json for training. E.g., train-v1.1.json
--predict_file
SQuAD-like json for predictions. E.g., dev-v1.1.json or test-v1.1.json
--output_dir
The output directory where the model checkpoints and predictions will be written.
You may want to change these:
--model_name
ruGPT3Small, ruGPT3Medium, or ruGPT3Large.
--with_negative
Whether the dataset is version 2, i.e. includes unanswerable questions.
--max_seq_length
The maximum total input sequence length after BPE tokenization. Sequences longer than this will be truncated, and sequences shorter than this will be padded.
--doc_stride
When splitting up a long document into chunks, how much stride to take between chunks (see the chunking sketch after the option lists).
--max_query_length
The maximum number of tokens for the question. Questions longer than this will be truncated to this length.
--do_train
Whether to run training.
--do_predict
Whether to run eval on the dev set.
--train_batch_size
Total batch size for training.
--predict_batch_size
Total batch size for predictions.
--learning_rate
The initial learning rate for Adam.
--num_train_epochs
Total number of training epochs to perform.
--warmup_proportion
Proportion of training to perform linear learning rate warmup for. E.g., 0.1 = 10% of training.
--n_best_size
The total number of n-best predictions to generate in the nbest_predictions.json output file.
And others:
--max_answer_length
--verbose_logging
--no_cuda
--seed
--gradient_accumulation_steps
--do_lower_case
--local_rank
--loss_scale
--null_score_diff_threshold
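For reference, here is a toy illustration of the sliding-window chunking that --doc_stride controls. It mirrors the approach in the original BERT SQuAD preprocessing and is a sketch, not code taken from gpt2_squad.py; the function name split_into_spans is made up for this example.

```python
def split_into_spans(doc_tokens, max_tokens, doc_stride):
    """Split a tokenized document into overlapping spans.

    Each span holds at most max_tokens tokens, and consecutive spans
    start doc_stride tokens apart, so neighbouring spans overlap by
    max_tokens - doc_stride tokens.
    """
    spans = []
    start = 0
    while start < len(doc_tokens):
        spans.append(doc_tokens[start:start + max_tokens])
        if start + max_tokens >= len(doc_tokens):
            break  # this span already reaches the end of the document
        start += doc_stride
    return spans

# A 500-token document with max_tokens=384 and doc_stride=128
# produces two spans, starting at tokens 0 and 128.
spans = split_into_spans(list(range(500)), max_tokens=384, doc_stride=128)
assert [s[0] for s in spans] == [0, 128]
```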
To score the predictions against the dev set, use the official SQuAD 2.0 evaluation script:
python evaluate-v2.0.py data/dev-v2.0.json output/predictions.json
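evaluate-v2.0.py expects predictions.json to map each question id to a single predicted answer string. A minimal example of that format (the id below is invented for illustration):

```python
import json

# predictions.json format: {question_id: predicted_answer_string}
predictions = {"5733be284776f41900661182": "Франция"}
with open("output/predictions.json", "w", encoding="utf-8") as f:
    json.dump(predictions, f, ensure_ascii=False)
```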