Repository for paper titled "Stacked DeBERT: All Attention in Incomplete Data for Text Classification".
Overview • Requirements • How to Use • How to Cite
-
Proposed model: Stacked Denoising BERT
-
Task: Text classification from noisy data
- Twitter Sentiment Classification
- Incomplete Intent Classification: text with STT error
-
Baseline models
- BERT
- NLU Platforms: Rasa (spacy and tf), Dialogflow, and SAP Conversational AI
- Semantic Hashing with Classifier
Python 3.6 (3.7.3 tested), PyTorch 1.0.1.post2, CUDA 9.0 or 10.1
pip install --default-timeout=1000 torch==1.0.1.post2
pip install -r requirements.txt
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
- Chatbot NLU Evaluation Benchmark dataset with missing/incorrect data (STT errors) and Twitter Sentiment dataset (Check Dataset README)
- Training done on:
- Twitter dataset: complete data, incomplete data, complete+incomplete data
- Chatbot dataset: complete data, 2 TTS-STT data (gtts-witai, macsay-witai)
- Twitter Sentiment Corpus
CUDA_VISIBLE_DEVICES=0,1 ./scripts/twitter_sentiment/run_bert_classifier_inc_with_corr.sh
Script for Inc+Corr dataset. Scripts corresponding to Inc and Corr are also available in the same folder.
- Chatbot Incomplete Intent Corpus: texts with STT Error
CUDA_VISIBLE_DEVICES=0,1 ./scripts/stterror_intent/run_bert_classifier_stterror.sh
Script for noisy data (stterror). Script for clean, non-noisy data, is also available (complete).
- Training on Twitter Corpus
CUDA_VISIBLE_DEVICES=0,1 ./scripts/twitter_sentiment/run_stacked_debert_dae_classifier_twitter_inc_with_corr.sh
Make sure the OUTPUT directory is the same as the fine-tuned BERT or copy the BERT model to your new output dir.
- Training on NLU Evaluation Corpora for TTS=gtts/macsay and STT=witai and autoencoder epochs 100-1000.
CUDA_VISIBLE_DEVICES=0,1 ./scripts/stterror_intent/run_stacked_debert_dae_classifier_stterror.sh
- Testing on NLU Evaluation Corpora
CUDA_VISIBLE_DEVICES=0 python run_stacked_debert_dae_classifier.py --seed 1 --task_name "sentiment140_sentiment" --save_best_model --do_eval --do_lower_case --data_dir ./data/twitter_sentiment_data/sentiment140/ --bert_model bert-base-uncased --max_seq_length 128 --train_batch_size 4 --eval_batch_size 1 --learning_rate 2e-5 --num_train_epochs_autoencoder 3 --num_train_epochs 3 --output_dir_first_layer "./results/test/results_stacked_debert_dae_earlyStopWithEvalLoss_twitter_10seeds/inc_with_corr_sentences_TestOnlyIncorrect/sentiment140_ep3_bs4_inc_with_corr_TestOnlyIncorrect_seed1_first_layer_epae1000/" --output_dir "./results/test/results_stacked_debert_dae_earlyStopWithEvalLoss_twitter_10seeds/inc_with_corr_sentences_TestOnlyIncorrect/sentiment140_ep3_bs4_inc_with_corr_TestOnlyIncorrect_seed1_second_layer_epae1000/"
In case you wish to use this code, please use the following citation:
@article{CUNHASERGIO202187,
title = "Stacked DeBERT: All attention in incomplete data for text classification",
author = "Gwenaelle {Cunha Sergio} and Minho Lee",
journal = "Neural Networks",
volume = "136",
pages = "87 - 96",
year = "2021",
issn = "0893-6080",
doi = "https://doi.org/10.1016/j.neunet.2020.12.018",
url = "http://www.sciencedirect.com/science/article/pii/S0893608020304433"
}
Email for further requests or questions:
gwena.cs@gmail.com
The authors would like to thank Snips.co and Kaggle for their public datasets (Snips NLU Benchmark and Sentiment140 Twitter Dataset), and HuggingFace's BERT PyTorch code.