v0.6.0

MaartenGr released this 27 Jul 14:20
· 25 commits to master since this release
9dd7b59

Highlights

  • Major speedup of 2x to 5x (for MMR and MaxSum) when passing multiple documents at once instead of one document at a time
  • Same results whether passing a single document or multiple documents
  • MMR and MaxSum now work when passing a single document or multiple documents
  • Improved documentation
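  • Added support for 🤗 Hugging Face Transformers models, passed to KeyBERT as a feature-extraction pipeline:
from keybert import KeyBERT
from transformers.pipelines import pipeline

# Wrap the Transformers model in a feature-extraction pipeline and use it as the embedding backend
hf_model = pipeline("feature-extraction", model="distilbert-base-cased")
kw_model = KeyBERT(model=hf_model)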
  • Highlighting support for Chinese texts
    • Now uses the CountVectorizer for creating the tokens
    • This should also improve the highlighting for most applications and higher n-grams

NOTE: Although highlighting for Chinese texts is improved, I am not familiar with the Chinese language, so there is a good chance it is not yet as well optimized as for other languages. Any feedback on this is highly appreciated!
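
For readers unfamiliar with the two diversification strategies named in the highlights, here is a minimal pure-Python sketch of the ideas behind MMR (Maximal Marginal Relevance) and MaxSum (Max Sum Distance). It is an illustration only, not KeyBERT's actual implementation: the function names `cosine`, `mmr`, and `max_sum`, their parameters, and the toy 2-D vectors are all hypothetical, and real embeddings would come from a sentence or Transformers model.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr(doc_vec, cand_vecs, words, top_n=2, diversity=0.5):
    """Greedy MMR: trade relevance to the document against redundancy."""
    relevance = [cosine(doc_vec, v) for v in cand_vecs]
    # Start with the single most relevant candidate.
    selected = [max(range(len(words)), key=lambda i: relevance[i])]
    while len(selected) < min(top_n, len(words)):
        remaining = [i for i in range(len(words)) if i not in selected]

        def score(i):
            # Redundancy: similarity to the closest already-selected word.
            redundancy = max(cosine(cand_vecs[i], cand_vecs[j]) for j in selected)
            return (1 - diversity) * relevance[i] - diversity * redundancy

        selected.append(max(remaining, key=score))
    return [words[i] for i in selected]


def max_sum(doc_vec, cand_vecs, words, top_n=2, nr_candidates=3):
    """Max Sum Distance: from the most relevant words, pick the least similar subset."""
    relevance = [cosine(doc_vec, v) for v in cand_vecs]
    # Keep only the nr_candidates words closest to the document.
    pool = sorted(range(len(words)), key=lambda i: relevance[i], reverse=True)
    pool = pool[:nr_candidates]
    # Choose the top_n-subset with the smallest summed pairwise similarity.
    best = min(
        combinations(pool, top_n),
        key=lambda combo: sum(
            cosine(cand_vecs[i], cand_vecs[j]) for i, j in combinations(combo, 2)
        ),
    )
    return [words[i] for i in best]


# Toy data (made up): two near-duplicate candidates and one distinct candidate.
doc_vec = [1.0, 1.0]
words = ["learning", "learn", "supervised"]
cand_vecs = [[1.0, 0.9], [1.0, 0.8], [0.2, 1.0]]

diverse = mmr(doc_vec, cand_vecs, words, top_n=2, diversity=0.7)
plain = mmr(doc_vec, cand_vecs, words, top_n=2, diversity=0.0)
spread = max_sum(doc_vec, cand_vecs, words, top_n=2, nr_candidates=3)
print(diverse, plain, spread)
```

With `diversity=0.0` the MMR sketch reduces to plain relevance ranking, so the near-duplicate pair is returned. A higher `diversity` penalizes candidates that resemble words already chosen. The MaxSum sketch searches every subset of the candidate pool exhaustively, which is why the pool is kept small.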

Fixes