Release 0.15.1 · flairNLP/flair

This release fixes compatibility bugs with the newest PyTorch and SciPy versions, and adds a number of small improvements and new features.

Improvements and new features

SegtokTokenizer: Add option to customize SegtokTokenizer, by @alanakbik in #3592
RegexpTagger: Add option to define matching groups to RegexpTagger, by @alanakbik in #3598
RelationClassifier: Optimize RelationClassifier by adding the option to filter long sentences and truncate context, by @alanakbik in #3593
RelationClassifier: Modify printouts in RelationClassifier evaluation to remove clutter, by @alanakbik in #3591
Add sentence labeler, by @MattGPT-ai in #3570
Add a Deep Nearest Class Means Classifier model to Flair, by @sheldon-roberts in #3532
Add per-task metrics, by @ntravis22 in #3605
Add options to load full documents as Sentence objects, by @alanakbik in #3595

New Model: Deep Nearest Class Means Classifier (#3532)

This release adds a Nearest Class Mean (NCM) classification approach to Flair, which assigns each data point to the class whose data mean is closest to it. The approach can be used as an alternative to fitting a softmax classifier and is available for any class in Flair that implements DefaultClassifier. For instance, to train a TextClassifier with DeepNCM, you can use the following code:

from flair.data import Corpus
from flair.datasets import TREC_50
from flair.embeddings import TransformerDocumentEmbeddings
from flair.models import TextClassifier
from flair.nn import DeepNCMDecoder
from flair.trainers import ModelTrainer
from flair.trainers.plugins import DeepNCMPlugin

# load the TREC dataset
corpus: Corpus = TREC_50()

label_type = "class"

# make a transformer document embedding
document_embeddings = TransformerDocumentEmbeddings("distilbert-base-uncased", fine_tune=True)

# create the label_dictionary
label_dictionary = corpus.make_label_dictionary(label_type=label_type)

# create a text classifier with a special DeepNCM decoder
classifier = TextClassifier(
    document_embeddings,
    label_type=label_type,
    label_dictionary=label_dictionary,
    decoder=DeepNCMDecoder(
        mean_update_method="condensation",
        embeddings_size=document_embeddings.embedding_length,
        label_dictionary=label_dictionary,
    ),
)

# initialize the trainer
trainer = ModelTrainer(classifier, corpus)

# train the model using the DeepNCM plugin
trainer.fine_tune(
    "resources/taggers/deepncm_baseline",
    plugins=[DeepNCMPlugin()],
)
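
To make the underlying decision rule concrete, here is a minimal pure-Python sketch of plain nearest-class-mean classification. It is not Flair's DeepNCMDecoder, which works on learned embeddings and takes a mean_update_method; the helper names fit_class_means and predict, the toy 2-D points, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist


def fit_class_means(points, labels):
    """Return a dict mapping each label to the mean vector of its points."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        grouped[label].append(point)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(class_means, point):
    """Return the label whose class mean is closest (Euclidean) to the point."""
    return min(class_means, key=lambda label: dist(class_means[label], point))


# toy 2-D "embeddings" with two classes (made-up data, illustration only)
points = [(0.0, 0.0), (0.0, 2.0), (4.0, 4.0), (6.0, 4.0)]
labels = ["A", "A", "B", "B"]

class_means = fit_class_means(points, labels)
print(class_means)                        # per-class mean vectors
print(predict(class_means, (1.0, 1.5)))   # label of the nearest class mean
print(predict(class_means, (4.5, 3.0)))
```

In this sketch the means are computed once from fixed points. In the training example above, the decoder and DeepNCMPlugin take care of the class means while the embeddings are being fine-tuned.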

Contributed by @sheldon-roberts in #3532

Datasets

Add BarNER Dataset, by @stefan-it in #3604

Bug Fixes

Fix model loading for compatibility with PyTorch 2.6, by @helpmefindaname in #3608
Fix SciPy compatibility by replacing the sparse-matrix .A attribute with toarray(), by @sg-wbi in #3606
Fix: use the proper default main evaluation metric for the text regression model, by @MattGPT-ai in #3602
Fix: cast indices tensor to int, by @MattGPT-ai in #3601
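
For readers hitting the same SciPy incompatibility in their own code, the change in #3606 comes down to calling toarray() explicitly where the .A shorthand was used on a sparse matrix. The following is a minimal sketch with a made-up matrix; it is not taken from Flair's code base.

```python
import numpy as np
from scipy.sparse import csr_matrix

# small sparse matrix with made-up values (illustration only)
sparse = csr_matrix(np.array([[1, 0, 0], [0, 0, 2]]))

# toarray() returns a dense numpy.ndarray copy of the sparse matrix;
# use it where older code wrote `sparse.A`
dense = sparse.toarray()
print(dense)
```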

New Contributors

@sg-wbi made their first contribution in #3606
@ntravis22 made their first contribution in #3605

Full Changelog: v0.15.0...v0.15.1
