How to use different Transformers models in spaCy #10327
polm started this conversation in Help: Best practices
Using `spacy-transformers`, many Hugging Face models can be loaded and used in spaCy. However, it's important to understand how they're used, and their limitations, before you try to incorporate a model.

Models from the Hugging Face Hub can be specified by name and loaded into a `transformer` component to be used as a source of features, similar to a `tok2vec` component. This has a number of implications.

First, task-specific heads are not supported for training within spaCy. This means that if you load an NER model from Hugging Face, you can't use it directly for NER with `spacy-transformers`. This isn't supported because there's too much variation in how task-specific heads are implemented. That doesn't mean you can't use these models, though: you can use the wrappers in `spacy-huggingface-pipelines`, or write your own custom component to wrap them and get their predictions. The downside is that such components won't be trainable within spaCy, and serialization also won't be handled automatically.

Changing the base model requires retraining. Keep in mind that any components that use a Transformer for features, like NER, textcat, or other components, rely on mutually learned representations. If you change the base model in a `transformer` component, you therefore have to retrain your pipeline. If you don't, downstream components will receive embeddings completely different from what they expect, and you'll get nonsense results.
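To illustrate the custom-component route, here is a minimal sketch of wrapping an external NER predictor as a spaCy pipeline component. The `fake_hf_ner` function is a stand-in so the sketch runs without downloading anything; in practice you would replace it with a call to a Hugging Face `transformers` token-classification pipeline. The component name and all prediction details here are illustrative, not part of any library:

```python
import spacy
from spacy.language import Language

# Stand-in for a Hugging Face token-classification pipeline. In a real
# wrapper you'd call transformers.pipeline("token-classification", ...)
# and adapt its output to this (start, end, label) shape.
def fake_hf_ner(text):
    start = text.find("Berlin")
    if start == -1:
        return []
    return [{"start": start, "end": start + len("Berlin"), "entity_group": "LOC"}]

@Language.component("hf_ner_wrapper")
def hf_ner_wrapper(doc):
    spans = []
    for pred in fake_hf_ner(doc.text):
        # char_span returns None if the prediction doesn't align with
        # spaCy's token boundaries, so check before keeping it.
        span = doc.char_span(pred["start"], pred["end"], label=pred["entity_group"])
        if span is not None:
            spans.append(span)
    doc.ents = spans
    return doc

nlp = spacy.blank("en")
nlp.add_pipe("hf_ner_wrapper")
doc = nlp("I live in Berlin.")
print([(ent.text, ent.label_) for ent in doc.ents])  # → [('Berlin', 'LOC')]
```

As noted above, a component like this only copies predictions over: it isn't trainable within spaCy, and you'd have to handle loading and saving the wrapped model yourself.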
To be perfectly clear, this also means that you cannot take a trained pipeline like `en_core_web_trf`, replace the `transformer` component, and get meaningful results.

Not all models are supported. `spacy-transformers` works by handling common conventions for models on the Hugging Face Hub, but there's no fixed standard, so some models may simply not work. In rare cases we've seen models that don't give an error but also don't give meaningful results. If you're not sure whether your model is working, try replacing it with `roberta-base` and see if you're able to train a model that way; if so, it may be a model compatibility issue.

OK, with that out of the way, here's how you specify a different model in a config when training a model from scratch:
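A minimal config fragment looks like the following, using `bert-base-cased` as an illustrative model name; any model name from the Hugging Face Hub can go in `name`. Depending on your `spacy-transformers` version, the architecture version string may differ:

```ini
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v3"
name = "bert-base-cased"
```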
That's it! You can also see the docs for more details. Have fun with Transformers, and if you make something cool, remember to share it on the Show & Tell board.