Building entityrecognizer from json takes substantially shorter than loading model #12091
SjoerdBraaksma
started this conversation in
Help: Best practices
Thanks for the report and the example code. The slowdown occurs because adding new patterns to the entity ruler component requires running all the pipes that precede it to generate that internal … We previously considered exposing a …
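A minimal sketch of one way to limit that cost when patterns are added in code, assuming spaCy v3: temporarily disable the pipes that precede the entity ruler with nlp.select_pipes while calling add_patterns. This is not the code from the original post. The blank Dutch pipeline, the sentencizer standing in for the earlier components, and the two patterns are all assumptions made for illustration.

```python
import spacy

# Assumption: a blank Dutch pipeline stands in for the pretrained model,
# and a sentencizer stands in for the pipes that precede the ruler.
nlp = spacy.blank("nl")
nlp.add_pipe("sentencizer")
ruler = nlp.add_pipe("entity_ruler")

# Hypothetical patterns; the real project has roughly 500k of these.
patterns = [
    {"label": "PERSON", "pattern": "Jan Jansen"},
    {"label": "GPE", "pattern": "Amsterdam"},
]

# Disable the preceding pipe so the pattern texts are only tokenized,
# not run through earlier components, while they are being added.
with nlp.select_pipes(disable=["sentencizer"]):
    ruler.add_patterns(patterns)

# The disabled pipe is restored once the block exits.
doc = nlp("Jan Jansen woont in Amsterdam.")
ents = [(ent.text, ent.label_) for ent in doc.ents]
print(ents)
```

This only covers adding patterns in code. It does not change what spacy.load does when it restores a saved ruler from disk.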
Hello!
In my current project I'm working with two custom entity recognizers (one for names, one for places), each with a large number of patterns (500k).
I noticed that adding these pipes to the standard Dutch model like so:
This takes around 40 seconds.
However, if I save this model and load it from disk (nlp.to_disk and spacy.load), loading takes about 4.5 minutes. It makes little difference if I exclude everything except the two rulers and the ner component.
Am I doing something wrong with saving and loading, or am I missing something that could save time?
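For anyone who wants to reproduce the comparison, here is a rough timing sketch, assuming spaCy v3. It is not the original code: it uses a blank Dutch pipeline and 1,000 generated placeholder patterns instead of the pretrained model and the real 500k patterns, so the absolute numbers will differ.

```python
import tempfile
import time

import spacy

# Assumption: generated placeholder patterns instead of the real pattern files.
patterns = [{"label": "GPE", "pattern": f"Plaats {i}"} for i in range(1000)]

# Path 1: build the ruler in code and add the patterns directly.
nlp = spacy.blank("nl")
ruler = nlp.add_pipe("entity_ruler")
start = time.perf_counter()
ruler.add_patterns(patterns)
build_seconds = time.perf_counter() - start

# Path 2: save the pipeline, then time loading it back from disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    nlp.to_disk(tmp_dir)
    start = time.perf_counter()
    reloaded = spacy.load(tmp_dir)
    load_seconds = time.perf_counter() - start

reloaded_ruler = reloaded.get_pipe("entity_ruler")
print(f"add_patterns: {build_seconds:.2f}s, spacy.load: {load_seconds:.2f}s")
print(f"patterns after reload: {len(reloaded_ruler.patterns)}")
```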
Thanks in advance for the help!