Releases · clarinsi/classla · GitHub

10 Apr 11:10

5roop

v2.1.1 Latest


Added reldi-tokeniser 1.0.3 as a dependency; this version fixes a bug in abbreviation loading.


08 Aug 07:32

lukatercon

v2.1

Added new models for all languages
Added new "web" processing type
Fixed sentence splitting in the tokenizers


16 Feb 18:41

lukatercon

v2.0

Added new models for standard Slovenian
Added new inflectional lexicon for Slovenian
Adapted tests to new model outputs
Modified lexicon to store underscores instead of empty strings
Other changes


29 Jun 11:32

lkrsnik

v1.2.0

Added SRL (semantic role labeling) parsing for Slovenian
Fixed training for the lemmatizer and POS tagger
Added toy tests for all trainings
Other smaller fixes


06 May 09:21

lkrsnik

v1.1.1

Updated external package version requirements, mainly due to updates in the Slovenian obeliks tokenizer.


12 Jan 09:36

lkrsnik

v1.1.0

Added tokenizer pretag option for both obeliks and reldi-tokeniser (via pos_lemma_pretag)
Updated the Slovene inflectional lexicon and moved it from the lemmatizer model to the morphosyntactic annotation model
Added upos and ufeats control to Slovene inflectional lexicon
Other smaller fixes


07 Sep 08:21

lkrsnik

v1.0.2

Fixed an issue where the parser produced extension labels that are not CoNLL-U compliant, using underscores (e.g. cc_preconj) instead of colon-separated labels (e.g. cc:preconj)
During lemmatization, if a token consists of a character that is not present in the seq2seq vocabulary, the lemma is now identical to the token
Added PUNCT control
Fixed a MISC column bug for NER
Renamed punct to Z in Bulgarian UPOS
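
The label fix and the lemmatization fallback listed above can be illustrated with a minimal sketch. This is not the classla implementation: the function names `normalize_deprel` and `lemma_with_fallback`, the `vocab` set, and the `predict` callable are all hypothetical, and the sketch assumes the fallback triggers whenever any character of the token is missing from the vocabulary.

```python
from typing import Callable, Set


def normalize_deprel(label: str) -> str:
    """Rewrite an underscore-separated subtype (cc_preconj) as cc:preconj.

    Hypothetical helper; only the first underscore is treated as the separator.
    """
    head, sep, subtype = label.partition("_")
    # Labels without a subtype (e.g. "nsubj") are returned unchanged.
    return f"{head}:{subtype}" if sep and head and subtype else label


def lemma_with_fallback(token: str, vocab: Set[str], predict: Callable[[str], str]) -> str:
    """Return the token itself when it contains a character outside the vocabulary.

    `vocab` stands in for the seq2seq character vocabulary and `predict` for the
    model's lemma prediction; both are assumptions made for illustration.
    """
    if any(ch not in vocab for ch in token):
        return token
    return predict(token)
```

For example, `normalize_deprel("cc_preconj")` yields `cc:preconj`, and a token made of an unseen symbol is returned as its own lemma without calling `predict`.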
