Use bulk processing from spacy pipeline #2392

Merged
merged 7 commits into from
Feb 17, 2025

gunthercox
Owner

spaCy supports bulk processing, which improves performance when handling large amounts of text. Example from the spaCy documentation:

texts = ["This is a text", "These are lots of texts", "..."]
- docs = [nlp(text) for text in texts]
+ docs = list(nlp.pipe(texts))

https://spacy.io/usage/processing-pipelines#processing
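
As a rough sketch of how the change above might be used in practice, the helper below tokenizes a list of texts through `nlp.pipe` and compares the result with the per-text approach. The function names (`tokenize_bulk`, `tokenize_one_by_one`, `demo`), the `batch_size` value, and the sample texts are assumptions for illustration and are not taken from this pull request. `demo()` assumes spaCy is installed and uses a blank English pipeline, so no model download is needed.

```python
from typing import Iterable, List


def tokenize_bulk(nlp, texts: Iterable[str], batch_size: int = 64) -> List[List[str]]:
    """Tokenize many texts with nlp.pipe instead of calling nlp() once per text."""
    # nlp.pipe yields one Doc per input text, in input order,
    # while processing the texts internally in batches.
    return [
        [token.text for token in doc]
        for doc in nlp.pipe(texts, batch_size=batch_size)
    ]


def tokenize_one_by_one(nlp, texts: Iterable[str]) -> List[List[str]]:
    """Baseline for comparison: one nlp() call per text."""
    return [[token.text for token in nlp(text)] for text in texts]


def demo() -> List[List[str]]:
    # Imported lazily so the helpers above can be used with any
    # spaCy-like object even where spaCy itself is not installed.
    import spacy

    nlp = spacy.blank("en")  # tokenizer only, no trained components
    texts = ["This is a text", "These are lots of texts", "..."]

    bulk = tokenize_bulk(nlp, texts)
    # Both approaches should produce the same tokens; only the speed differs.
    assert bulk == tokenize_one_by_one(nlp, texts)
    return bulk
```

Calling `demo()` runs both code paths on the same inputs. Any speedup from batching is likely to be more visible with a trained pipeline and a larger number of texts than with this small example.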


It also looks like spaCy provides `enable` and `disable` options for including or excluding components of the text processing pipeline. These may be worth exploring for future performance improvements, but right now it isn't clear whether adjusting them has a significant impact.
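
For anyone who wants to experiment with this later, here is a minimal sketch of skipping components during bulk processing via the `disable` argument of `nlp.pipe`. The helper name `pipe_without`, the `skip` parameter, and the choice of `sentencizer` in `demo()` are hypothetical and not part of this pull request. `demo()` assumes spaCy v3 (string-based `add_pipe`). Whether this yields a meaningful speedup for ChatterBot has not been measured here.

```python
from typing import Iterable, List, Sequence


def pipe_without(nlp, texts: Iterable[str], skip: Sequence[str] = ()) -> List:
    """Run texts through nlp.pipe while skipping the named components."""
    # Only pass component names that are actually present in this pipeline.
    disable = [name for name in nlp.pipe_names if name in skip]
    return list(nlp.pipe(texts, disable=disable))


def demo() -> List:
    # Imported lazily so pipe_without stays usable without spaCy installed.
    import spacy

    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")  # spaCy v3 style; assumption for this sketch

    # The sentencizer is skipped for this call only; the pipeline itself
    # is left unchanged for later calls.
    return pipe_without(nlp, ["One sentence. Another one."], skip=["sentencizer"])
```

Timing `pipe_without` with and without a `skip` list on a realistic corpus would be one way to check whether excluding components makes a significant difference.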


Closes #2350

@gunthercox gunthercox merged commit 99ddfb9 into master Feb 17, 2025
1 check passed
@gunthercox gunthercox deleted the bulk branch February 17, 2025 18:28
Development

Successfully merging this pull request may close these issues.

chatterbot 1.0.5 version with spacy 2.1.9 is too slow