We tried both statistical machine translation (SMT) and neural machine translation (NMT) models for this task. The dataset cannot be made public as it was part of a contained study, but these models can be used for any two languages.
We used a phrase-based SMT model, with GIZA++ for word alignment.
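GIZA++ estimates word alignments with the IBM alignment models. As a rough illustration of the idea (not the repo's actual pipeline, and far simpler than what GIZA++ does), here is a minimal IBM Model 1 EM loop in plain Python that learns word translation probabilities t(f|e) from toy parallel sentences; the function name and data are made up for this sketch:

```python
from collections import defaultdict

def ibm_model1(sentence_pairs, iterations=10):
    """Toy IBM Model 1 EM: estimate t(f|e), the probability that
    source word e translates to target word f, from parallel data."""
    # Source vocabulary plus a NULL token (None) so target words can
    # align to nothing.
    src_vocab = {w for src, _ in sentence_pairs for w in src} | {None}
    # Uniform initialisation of t(f|e).
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)   # expected co-occurrence counts
        total = defaultdict(float)   # normalisers per source word
        for src, tgt in sentence_pairs:
            src = [None] + src       # prepend NULL
            for f in tgt:
                z = sum(t[(f, e)] for e in src)
                for e in src:
                    c = t[(f, e)] / z          # E-step: soft counts
                    count[(f, e)] += c
                    total[e] += c
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]           # M-step: re-estimate
    return t
```

On the classic toy example ("das haus" / "the house", "das buch" / "the book", "ein buch" / "a book"), a few EM iterations are enough for the model to prefer aligning "house" with "haus" rather than "das".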
On the neural side, we had four models in place:
- A baseline sequence-to-sequence (seq2seq) model using LSTMs
- Neural Machine Translation By Jointly Learning To Align And Translate (paper link)
- Effective Approaches to Attention-based Neural Machine Translation (paper link)
- Modeling Coverage for Neural Machine Translation (paper link)
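The attention-based models above all score each encoder hidden state against the current decoder state and take a weighted average as the context vector. As a hedged sketch of the simplest variant, Luong-style dot-product attention, here is a NumPy version (the function name and shapes are illustrative, not taken from the repo's code):

```python
import numpy as np

def luong_dot_attention(decoder_state, encoder_states):
    """Dot-product attention (Luong et al. style, simplified).

    decoder_state:  (d,)   current decoder hidden state
    encoder_states: (T, d) one hidden state per source position
    Returns the softmax attention weights over the T source
    positions and the resulting context vector.
    """
    scores = encoder_states @ decoder_state      # (T,) alignment scores
    weights = np.exp(scores - scores.max())      # stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states           # (d,) weighted average
    return weights, context
```

The Bahdanau variant replaces the dot product with a small feed-forward scorer, and the coverage model additionally feeds the accumulated past attention weights into the scorer to discourage over- and under-translation.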
The presentation for this project can be found here; it explains the implementation details and reports the results.
Built and tested using Python3 on Linux.
Saurabh Chand Ramola, Sumukh S