v0.1.25
This release replaces dask with a thin multiprocessing client and comes with several performance improvements, most notably a faster import of lilac.
We have also added a simple API for loading from HuggingFace:
import lilac as ll
from datasets import load_dataset
hf_ds = load_dataset('Open-Orca/SlimOrca-Dedup')
ds = ll.from_huggingface(hf_ds)
And a simple API for getting embeddings:
answer_emb = ds.get_embeddings('jina-v2-small', rowid, 'answer')[0]['vector']
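Once you have two such vectors, a common next step is to compare them. The sketch below is not part of Lilac: `cosine_similarity` is a hypothetical helper, and the two short lists are placeholder values standing in for the 'vector' entries returned by `ds.get_embeddings(...)`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError('vectors must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError('cosine similarity is undefined for a zero vector')
    return dot / (norm_a * norm_b)


# Placeholder vectors (assumption): real embeddings would come from
# ds.get_embeddings(...)[0]['vector'] and be much longer.
answer_emb = [0.1, 0.3, 0.5]
question_emb = [0.2, 0.1, 0.4]
similarity = cosine_similarity(answer_emb, question_emb)
print(f'similarity: {similarity:.3f}')
```

Values close to 1 mean the two texts point in nearly the same direction in embedding space.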
We've also added some color to the UI and organized components a little better.
Features
- Add Jina V2 embeddings by @dsmilkov in #966
- Add sugar for ll.from_huggingface() by @dsmilkov in #962
- Improve the row header to give us space for deleting. by @nsthorat in #965
Performance
- Reduce import times by @brilee in #961
- Using loky (thin wrapper around multiprocessing) instead of dask by @dsmilkov in #947
- fix iterable robustness by @brilee in #977
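If you want to check the import-time change on your own machine, one rough approach is to time a fresh interpreter. This is a sketch, not the method used in #961: `cold_import_seconds` is a hypothetical helper, and `json` is a stand-in module so the snippet runs anywhere; swap in `lilac` to measure it.

```python
import subprocess
import sys
import time


def cold_import_seconds(module_name: str) -> float:
    """Wall-clock seconds for a fresh interpreter to start and import a module."""
    start = time.perf_counter()
    subprocess.run([sys.executable, '-c', f'import {module_name}'], check=True)
    return time.perf_counter() - start


# 'sys' is already loaded at startup, so this approximates interpreter overhead.
baseline = cold_import_seconds('sys')
# Stand-in module (assumption): replace 'json' with 'lilac' to time Lilac itself.
elapsed = cold_import_seconds('json')
print(f'json: {elapsed:.3f}s (interpreter baseline {baseline:.3f}s)')
```

For a per-module breakdown, Python's built-in `python -X importtime -c "import lilac"` prints the cumulative time spent in each imported module.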
Bug fixes
- Fix memory leak caused by Iterable/Iterator mixups by @brilee in #974
- Fix broken doc links. by @nsthorat in #964
- Add color scales for semantic / concept search. Add openchat format. by @nsthorat in #975
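For readers unfamiliar with the Iterable/Iterator distinction mentioned in #974, here is a generic illustration. It does not reproduce the Lilac bug or its fix; `count_twice` is a hypothetical function that only shows how the two behave differently when consumed more than once.

```python
from typing import Iterable, Iterator, Tuple


def count_twice(items: Iterable[int]) -> Tuple[int, int]:
    """Count the elements of `items` in two separate passes."""
    first = sum(1 for _ in items)
    second = sum(1 for _ in items)
    return first, second


as_list = [1, 2, 3]                           # an Iterable: each pass starts over
as_iterator: Iterator[int] = iter([1, 2, 3])  # an Iterator: exhausted after one pass

list_counts = count_twice(as_list)          # (3, 3)
iterator_counts = count_twice(as_iterator)  # (3, 0)
print(list_counts, iterator_counts)
```

Code that accepts an Iterable but silently receives an Iterator can see empty data on a second pass, which is why the two types are worth keeping apart in signatures.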
Other Changes
Full Changelog: v0.1.24...v0.1.25