Evaluation of the morphological quality of machine translation outputs

franckbrl/morpheval_v2

morpheval_v2

Evaluation of the morphological quality of machine translation outputs. The automatically generated English test suite should be translated into one of the supported target languages (French, Czech, German). The translation output is then analyzed to provide three types of information:

  • Adequacy: has the morphological information been correctly conveyed from the source?
  • Fluency: is local agreement respected?
  • Consistency: how confident is the system in its predictions?

Requirements

How To

Translate the source file morpheval.limsi.v2.en.sents and run the Moses tokenizer on the translation output (with the arguments -no-escape and -l {fr|cs|de}). Then run the evaluation for the target language:

French

python3 evaluate_fr.py -i output.tokenized -n morpheval.limsi.v2.en.info -d lefff.pkl

Czech

cat output.tokenized | sed 's/$/\n/' | tr ' ' '\n' | morphodita/src/run_morpho_analyze dictionary --input=vertical --output=vertical > output.analysis
python3 evaluate_cs.py -i output.analysis -n morpheval.limsi.v2.en.info
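
The sed and tr steps in the Czech pipeline turn the tokenized output into a "vertical" layout: one token per line, with an empty line after each sentence. As a rough illustration of that layout only, here is a minimal Python sketch. The function name to_vertical is hypothetical and not part of this repository, and unlike tr it collapses runs of several spaces instead of emitting an empty line for each extra space.

```python
def to_vertical(lines):
    """Return one token per line, with an empty line after each sentence.

    Hypothetical helper, not part of morpheval_v2: it approximates
    `sed 's/$/\\n/' | tr ' ' '\\n'` on already tokenized text.
    """
    chunks = []
    for line in lines:
        # split() drops the trailing newline and collapses whitespace runs
        tokens = line.split()
        # tokens of one sentence, then the blank line that separates sentences
        chunks.append("\n".join(tokens) + "\n\n")
    return "".join(chunks)
```

Passing an open file handle for output.tokenized as lines and writing the returned string to standard output should give text in the same layout as the shell pipeline produces before it reaches run_morpho_analyze.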

German

cat output.tokenized | tr ' ' '\n' | sort -u | ./smor > output.smored
python3 evaluate_de.py -i output.tokenized -n morpheval.limsi.v2.en.info -d output.smored
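
In the German pipeline, tr and sort -u reduce the tokenized output to a list of distinct word forms, so each form is analyzed by smor only once. The sketch below shows that deduplication step in Python under stated assumptions: unique_tokens is a hypothetical name, and Python sorts by code point, whereas the order produced by sort -u depends on the active locale.

```python
def unique_tokens(lines):
    """Collect the distinct whitespace-separated tokens of all lines.

    Hypothetical helper, not part of morpheval_v2: it approximates
    `tr ' ' '\\n' | sort -u` on already tokenized text.
    """
    vocab = set()
    for line in lines:
        # every token is recorded once, however often it occurs
        vocab.update(line.split())
    # code-point order; `sort -u` may order differently depending on locale
    return sorted(vocab)
```

Only the set of forms should matter for the analysis step, so the ordering difference is not expected to change which words get analyzed.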

Publication

Franck Burlot and François Yvon, Evaluating the morphological competence of machine translation systems. In Proceedings of the Second Conference on Machine Translation (WMT’17). Association for Computational Linguistics, Copenhagen, Denmark, 2017.
