Automatic text metrics: BLEU, ROUGE, and METEOR, plus extras like vocabulary and n-gram statistics.
# Compares each candidate (c) separately against all references (r).
python -m textmetrics.main c1.txt c2.txt --references r1.txt r2.txt r3.txt
Requires:
- Perl (for BLEU)
- Java 1.8 (for METEOR)
- Python 3.6+
pip install textmetrics
- BLEU
- ROUGE
- METEOR
BLEU and METEOR use the reference implementations (in Perl and Java, respectively). We originally used the reference Perl implementation for ROUGE as well, but it ran so slowly that we opted for a Python reimplementation instead. (ROUGE's original Perl implementation is also more difficult to set up, even with wrapper libraries.)
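For a sense of what the Python route looks like, here is a minimal sketch using the third-party `rouge_score` package; this package is an assumption for illustration and is not necessarily the reimplementation textmetrics ships with:

```python
# Minimal sketch of scoring with a Python ROUGE reimplementation.
# Assumes the third-party `rouge_score` package; textmetrics may use a
# different reimplementation internally.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(
    "the cat sat on the mat",   # reference
    "the cat lay on the mat",   # candidate
)
for name, score in scores.items():
    print(name, round(score.fmeasure, 3))
```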
TODO:

- pypi
- API support (possible to have an interface for passing strings?)
- ROUGE crashes if it decides there aren't sentences (e.g., run with README.md as input and reference)
- Add back in the original ROUGE for completeness (place behind a switch)
- The BLEU Perl script fails if the filename ends in `gz` because it tries to un-gzip it, which happens eventually when creating a lot of files. We should wrap the filename creation so this can't happen (see the first sketch after this list)
- ngrams has a divide-by-zero error. With two simple files (two lines each, same first line, differing second line), running with `2.txt --references 1.txt 1.txt` triggered it (see the second sketch after this list)
- Demo + guide for a better README (should cover file + API usage)
- Tests
- Early check in each module for whether the program is runnable, plus a nice error message (e.g., no Java or bad version, no Perl or bad version, etc.); see the last sketch after this list
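One way the `gz` filename fix could look, as a hypothetical helper rather than textmetrics' actual code: wrap temporary-file creation so a name can never end in `.gz`.

```python
import tempfile

# Hypothetical helper: create a temp file whose name never ends in ".gz",
# so the BLEU Perl script won't try to gunzip it. Illustrative only.
def make_bleu_safe_tempfile(suffix: str = ".txt") -> str:
    if suffix.endswith(".gz"):
        suffix += ".txt"  # tack on a harmless extension
    handle = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    handle.close()
    return handle.name
```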
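And this is the sort of guard that would avoid the divide-by-zero in ngrams (again a sketch, not the actual ngrams module): return 0.0 when the candidate contributes no n-grams of a given order instead of dividing blindly.

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision with a zero-denominator guard (sketch only)."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0  # no n-grams of this order: avoid dividing by zero
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total
```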
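Finally, the early runnable check could be as simple as probing for the external tools before any metric runs. The helper below is a sketch: the function name and messages are made up, and the version reporting is deliberately loose.

```python
import shutil
import subprocess

def check_tool(executable, version_args):
    """Fail early, with a readable message, if an external dependency is missing."""
    if shutil.which(executable) is None:
        raise RuntimeError(
            "'{}' was not found on your PATH; it is required to run this metric.".format(executable)
        )
    result = subprocess.run(
        [executable] + list(version_args),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # java prints its version banner to stderr, perl to stdout; report the first line of either.
    banner = (result.stdout or result.stderr).strip().splitlines()[0]
    print("{}: {}".format(executable, banner))

check_tool("java", ["-version"])    # METEOR needs Java 1.8
check_tool("perl", ["--version"])   # BLEU needs Perl
```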
Note to self: I followed this guide for packaging to pypi, and future uploads will probably look like:
# (1) ensure tests pass
# (2) bump version in setup.py
# (3) commit + push to github
# (4) generate distribution
python setup.py sdist bdist_wheel
# (5) Upload
twine upload dist/*