v0.6.0
Highlights
- Major speedup of roughly 2x to 5x (for MMR and MaxSum) when passing multiple documents at once compared to passing them one at a time
- Same results whether passing a single document or multiple documents
- MMR and MaxSum now work when passing a single document or multiple documents
- Improved documentation
- Added support for 🤗 Hugging Face Transformers pipelines as the embedding model:
from keybert import KeyBERT
from transformers.pipelines import pipeline
hf_model = pipeline("feature-extraction", model="distilbert-base-cased")
kw_model = KeyBERT(model=hf_model)
- Highlighting support for Chinese texts
  - Now uses the CountVectorizer for creating the tokens
  - This should also improve the highlighting for most applications and higher n-grams
NOTE: Although highlighting for Chinese texts is improved, since I am not familiar with the Chinese language there is a good chance it is not yet as optimized as for other languages. Any feedback with respect to this is highly appreciated!
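The multi-document behaviour from the highlights can be sketched roughly as follows. This is a minimal illustration, not part of the release itself: the helper names `extract_batch` and `demo` are hypothetical, and the single-document normalisation is an assumption about the return shape rather than documented behaviour. Only `KeyBERT.extract_keywords` and its `keyphrase_ngram_range`, `use_mmr`, and `diversity` parameters come from the library.

```python
from typing import List, Sequence, Tuple

Keywords = List[Tuple[str, float]]


def extract_batch(kw_model, docs: Sequence[str], diversity: float = 0.5) -> List[Keywords]:
    """Extract keywords for all documents in one call, with MMR enabled.

    `kw_model` is expected to be a keybert.KeyBERT instance (or any object
    exposing a compatible `extract_keywords` method).
    """
    result = kw_model.extract_keywords(
        list(docs),
        keyphrase_ngram_range=(1, 2),
        use_mmr=True,          # MMR now also works on multiple documents
        diversity=diversity,
    )
    # Assumption: with only one document the result may come back as a flat
    # list of (keyword, score) tuples; wrap it so callers always receive one
    # list per document.
    if result and isinstance(result[0], tuple):
        return [result]
    return result


def demo() -> None:
    """Hypothetical usage; needs keybert installed and downloads a model."""
    from keybert import KeyBERT

    docs = [
        "Supervised learning maps inputs to outputs using labelled examples.",
        "Keyword extraction finds the terms that best describe a document.",
    ]
    for doc, keywords in zip(docs, extract_batch(KeyBERT(), docs)):
        print(doc[:40], keywords)
```

Passing the whole list in a single call, rather than looping over documents, is what the speedup in the highlights refers to. Calling `demo()` requires `keybert` to be installed.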
Fixes
- Fix typo in ReadMe by @priyanshul-govil in #117
- Add missing optional dependencies (gensim, use, and spacy) by @yusuke1997 in #114