
v1.0.2

@lkrsnik lkrsnik released this 07 Sep 08:21
· 53 commits to master since this release
  • fixed an issue where the parser produced extension labels with underscores (e.g. cc_preconj), which are not CoNLL-U compliant, instead of colon-separated labels (e.g. cc:preconj)
  • during lemmatization, if a token consists of a character that is not present in the seq2seq vocabulary, the lemma is now identical to the token
  • added PUNCT control
  • fixed MISC column bug for NER
  • renamed punct in Bulgarian UPOS to Z
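
The first two fixes above can be illustrated with a minimal Python sketch. This is not the library's actual code: the function names `normalize_deprel` and `lemma_or_token`, the `vocab` set, and the `predict` callable are hypothetical, and the sketch assumes that a token falls back to itself as soon as any of its characters is missing from the vocabulary.

```python
from typing import Callable, Set


def normalize_deprel(label: str) -> str:
    """Rewrite an underscore-separated label (cc_preconj) as cc:preconj.

    Hypothetical helper: only the first underscore is treated as the
    separator between the relation and its subtype.
    """
    head, sep, subtype = label.partition("_")
    return f"{head}:{subtype}" if sep else label


def lemma_or_token(token: str, vocab: Set[str], predict: Callable[[str], str]) -> str:
    """Return the token unchanged when it has an out-of-vocabulary character.

    Assumption: `vocab` is the set of characters known to the seq2seq
    model and `predict` stands in for the model's lemma prediction.
    """
    if any(ch not in vocab for ch in token):
        return token  # fall back to the surface form
    return predict(token)


print(normalize_deprel("cc_preconj"))  # cc:preconj
print(normalize_deprel("nsubj"))       # nsubj
```

Labels without an underscore pass through untouched, so the helper can be applied to every dependency relation in a parse.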