Releases: clarinsi/classla
v2.1.1
v2.1
- Added new models for all languages
- Added new "web" processing type
- Fixed sentence splitting in the tokenizers
v2.0
- Added new models for standard Slovenian
- Added new inflectional lexicon for Slovenian
- Adapted tests to new model outputs
- Modified lexicon to store underscores instead of empty strings
- Other changes
v1.2.0
v1.1.1
v1.1.0
- Added tokenizer pretag option for both obeliks and reldi-tokeniser (via pos_lemma_pretag)
- Updated Slovene inflectional lexicon and moved it from the lemmatizer model to the morphosyntactic annotation model
- Added upos and ufeats control to Slovene inflectional lexicon
- Other smaller fixes
v1.0.2
- fixed issue where the parser produced non-CONLLU-compliant extension labels with underscores (e.g. cc_preconj) instead of colon-separated labels (e.g. cc:preconj)
- during lemmatization, if a token consists of a character that is not present in the seq2seq vocabulary, lemma will now be identical to the token
- added PUNCT control
- fixed MISC column bug for NER
- punct in Bulgarian UPOS was renamed to Z
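
The two behavioural fixes in v1.0.2 (colon-separated extension labels and the lemma fallback for out-of-vocabulary characters) can be illustrated with a small standalone sketch. This is not classla's actual code: the function names `normalize_deprel` and `fallback_lemma` are hypothetical, and the vocabulary check reflects only one reading of the note, namely that the fallback applies when none of the token's characters are in the vocabulary.

```python
from typing import Optional, Set


def normalize_deprel(label: str) -> str:
    """Rewrite an underscore-separated extension label (cc_preconj)
    into the colon-separated CoNLL-U form (cc:preconj).

    Hypothetical helper for illustration only.
    """
    # Split on the first underscore: relation on the left, subtype on the right.
    head, sep, subtype = label.partition("_")
    if sep and head and subtype:
        return f"{head}:{subtype}"
    # Labels without a subtype, or already colon-separated, pass through.
    return label


def fallback_lemma(token: str, vocab: Set[str]) -> Optional[str]:
    """Return the token itself as its lemma when none of its characters
    are in the character vocabulary; otherwise return None so the caller
    can run the regular lemmatizer.

    Assumption: `vocab` is a set of single characters known to the model.
    """
    if token and all(ch not in vocab for ch in token):
        return token
    return None


if __name__ == "__main__":
    print(normalize_deprel("cc_preconj"))
    print(fallback_lemma("€", {"a", "b", "c"}))
```

Splitting on the first underscore only keeps any further underscores inside the subtype untouched, which is a conservative choice for a sketch like this.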