diff --git a/Changelog b/Changelog
index 141945d..c6aa2ba 100644
--- a/Changelog
+++ b/Changelog
@@ -1,5 +1,9 @@
 Finnish language model for spaCy
 
+Version 0.7.1, 2021-08-21
+
+* Works on Python 3.7 again
+
 Version 0.7.0, 2021-07-12
 
 * Compatibility with spaCy v3.1
diff --git a/README.md b/README.md
index 676fb0a..8008c42 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,5 @@
+[![CI status](https://circleci.com/gh/aajanki/spacy-fi/tree/master.svg?style=shield)](https://circleci.com/gh/aajanki/spacy-fi/tree/master)
+
 # Experimental Finnish language model for spaCy
 
 Finnish language model for [spaCy](https://spacy.io/). The model does POS tagging, dependency parsing, word vectors, noun phrase extraction, token frequencies, morphological features and lemmatization. The morphological features and lemmatization are based on [Voikko](https://voikko.puimula.org/).
@@ -15,7 +17,7 @@ Compatibility with spaCy versions:
 
 | spacy-fi version | Compatible with spaCy version |
 | ---------------- | ----------------------------- |
-| 0.7.0 | 3.0.x, 3.1.x |
+| 0.7.x | 3.0.x, 3.1.x |
 | 0.6.0 | 3.0.x |
 | 0.5.0 | 3.0.x |
 | 0.4.x | 2.3.x |
@@ -84,9 +86,10 @@ for t in doc:
 
 ### What about named entity recognizer (NER)?
 
-An earlier version of this script optinally trained a NER model, but
-the current version does not. Mostly because the model was never very
-good and the training data potentially has licensing issues.
+The [feature branch
+feature/ner](https://github.com/aajanki/spacy-fi/tree/feature/ner) has
+training scripts for a NER model. It's not merged in the main branch
+because the accuracy is quite poor.
 
 ### Packaging and publishing
 
diff --git a/fi/meta.json b/fi/meta.json
index 275b8f8..f469cc0 100644
--- a/fi/meta.json
+++ b/fi/meta.json
@@ -1,7 +1,7 @@
 {
   "lang": "fi",
   "name": "experimental_web_md",
-  "version": "0.7.0",
+  "version": "0.7.1",
   "requirements": ["voikko>=0.5"],
   "description": "Finnish language model: POS tagger, dependency parser, lemmatizer, morphological features",
   "author": "Antti Ajanki",
diff --git a/packaging.md b/packaging.md
index 2e88fa0..8abb28c 100644
--- a/packaging.md
+++ b/packaging.md
@@ -6,6 +6,13 @@ Remember to change the version in [fi/meta.json](fi/meta.json)!
 tools/package_model.sh models/taggerparser/model-best
 ```
 
+To override the default spaCy compatibility specification, add a new
+spec as the second parameter:
+
+```sh
+tools/package_model.sh models/taggerparser/model-best ">=3.0.0,<3.2.0"
+```
+
 ## Publishing
 
 ```sh
diff --git a/tools/package_model.sh b/tools/package_model.sh
index 523daf9..86a4fea 100755
--- a/tools/package_model.sh
+++ b/tools/package_model.sh
@@ -2,7 +2,7 @@
 
 set -eu
 
-TRAINED_MODEL=$1
+TRAINED_MODEL="$1"
 
 mkdir -p packages
 rm -rf packages/*
@@ -12,6 +12,11 @@ PACKAGE_DIR=$(ls -d packages/*/fi_*)
 NEW_PACKAGE_DIR=$(echo "$PACKAGE_DIR" | sed -E 's#(.*?)/#\1/spacy_#')
 mv "$PACKAGE_DIR" "$NEW_PACKAGE_DIR"
 
+if [ "$#" -ge 2 ]; then
+    echo "Overriding spacy_version"
+    sed -i -E "s/\"spacy_version\": *\".+\"/\"spacy_version\":\"$2\"/g" packages/*/meta.json
+fi
+
 echo "Building the package"
 cp python_packaging/setup.py packages/*/