Releases: alexklibisz/elastiknn
0.1.0-PRE21
- Re-implemented LSH and sparse-indexed queries using an optimized custom Lucene query based on the TermInSetQuery.
This is 3-5x faster on LSH benchmarks.
- Updated the L1 and L2 similarities so that they're bounded in [0,1].
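The release note does not say how the L1 and L2 similarities are bounded. A minimal sketch, assuming the common transform 1 / (1 + distance), which maps a distance of 0 to a similarity of 1 and large distances toward 0. The function names here are hypothetical and not part of the plugin.

```python
import math

def l1_distance(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    # Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def bounded_similarity(distance):
    # Assumed transform (not confirmed by the release note):
    # any non-negative distance maps into (0, 1], which lies inside [0, 1].
    return 1.0 / (1.0 + distance)
```

With this transform, identical vectors score exactly 1, and scores from different queries fall on the same scale.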
0.1.0-PRE20
- Added an option for LSH queries to use the more-like-this heuristics to pick a subset of LSH hashes to retrieve candidate vectors.
This uses Lucene's MoreLikeThis class to pick a subset of hashes based on index statistics.
It's generally much faster than using all of the hashes and yields comparable recall, but is still disabled by default.
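The idea of picking hashes from index statistics can be sketched as follows. This is a simplified stand-in, not Lucene's MoreLikeThis scoring: it assumes each hash occurs once per vector, so ranking reduces to an inverse-document-frequency score, and it keeps only the rarest hashes. The names `pick_hashes`, `doc_freq`, and `max_terms` are hypothetical.

```python
import math

def pick_hashes(query_hashes, doc_freq, num_docs, max_terms):
    """Keep the max_terms query hashes that are rarest in the index."""
    def idf(h):
        # Smoothed inverse document frequency: rarer hashes score higher.
        return math.log((num_docs + 1) / (doc_freq.get(h, 0) + 1)) + 1.0

    # A hash that appears in no indexed vector cannot retrieve candidates.
    present = [h for h in query_hashes if doc_freq.get(h, 0) > 0]
    return sorted(present, key=idf, reverse=True)[:max_terms]
```

Rare hashes match fewer documents, so querying only those keeps the candidate set small while still favoring vectors that share distinctive hashes with the query.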
0.1.0-PRE19
- Omitted norms in LSH and sparse-indexed queries.
This shaves ~15% of runtime off of a sparse-indexed benchmark.
The results for LSH weren't as meaningful, unfortunately.
0.1.0-PRE18
- Removed the internal vector caching and switched to sun.misc.Unsafe to speed up vector serialization and deserialization.
Queries without caching are now faster than they previously were with caching.
This also made it possible to remove the protobuf dependency, which was previously used to serialize vectors.
- Upgraded Elasticsearch version from 7.4.0 to 7.6.2.
Attempted to use 7.7.1, but the Apache Lucene version used in 7.7.x introduces a performance regression (Details).
Switching from Java 13 to 14 also yields a slight speedup for intersections on sparse vectors.
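The serialization change above can be illustrated with a fixed-width raw-byte encoding. This is only an analogy in Python using the standard `struct` module; it does not use sun.misc.Unsafe, and the byte layout (a 4-byte length prefix followed by little-endian float32 values) is an assumption for the sketch, not the plugin's actual format.

```python
import struct

def serialize_vector(values):
    # Little-endian: 4-byte int length prefix, then each value as a 4-byte float32.
    return struct.pack(f"<i{len(values)}f", len(values), *values)

def deserialize_vector(data):
    # Read the length prefix, then unpack that many float32 values after it.
    (length,) = struct.unpack_from("<i", data, 0)
    return list(struct.unpack_from(f"<{length}f", data, 4))
```

A layout like this has no per-field tags or variable-length encoding to decode, which is one plausible reason a raw format can outperform a general-purpose serializer for dense float arrays.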
0.1.0-PRE17
- Internal change from custom Lucene queries to FunctionScoreQueries. This reduces quite a bit of boilerplate code and
surface area for bugs and performance regressions.
- Added an optional progress bar to the Python ElastiknnModel.
0.1.0-PRE16
- Updated client-elastic4s to use elastic4s version 7.6.0.
- Implemented a demo webapp using Play framework. Hosted at demo.elastiknn.klibisz.com.
0.1.0-PRE15
- Implemented LSH for Hamming, Angular, and L2 similarities.
- First pass at a documentation website.
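For readers unfamiliar with LSH, here is a minimal sketch of one textbook scheme for angular similarity, random-hyperplane hashing. It is not taken from the plugin's source; the function names and the seeded generator are assumptions made for illustration.

```python
import random

def make_hyperplanes(dims, num_hashes, seed=0):
    # One random Gaussian normal vector per hash bit.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(num_hashes)]

def angular_hash(vector, hyperplanes):
    # Each bit records which side of a hyperplane the vector falls on.
    bits = []
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)
```

Vectors separated by a small angle agree on most bits, so matching hash bits can be indexed as terms and used to retrieve candidates before computing exact similarities.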
0.1.0-PRE14
- Implemented LSH for Hamming, Angular, and L2 similarities.
- First pass at a documentation website.
0.1.0-PRE12
- Implemented LSH for Hamming, Angular, and L2 similarities.
- First pass at a documentation website.
0.1.0-PRE10
- Introduced a cache for exact similarity queries that keeps deserialized vectors in memory instead of repeatedly
reading and deserializing them. By default the cache entries expire after 90 seconds.
- Fixed a mapping issue that was causing warnings to be printed at runtime. Specifically, the term fields corresponding
to a vector should be given the same name as the field where the vector is stored. A bit confusing, but it works.
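The expiring cache from the first item in this release can be sketched as below. This assumes expire-after-write semantics, which the release note does not specify; the class name, the `load` callback, and the injectable `clock` are hypothetical and exist only to keep the sketch self-contained.

```python
import time

class ExpiringCache:
    def __init__(self, ttl_seconds=90.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key, load):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # Still fresh: skip the read and deserialization.
        value = load(key)  # Missing or expired: load and store again.
        self._entries[key] = (now + self._ttl, value)
        return value
```

Passing the clock in makes expiry easy to exercise without sleeping, for example `ExpiringCache(clock=lambda: fake_now)`.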