vearch/vearch

Overview

Vearch is a cloud-native distributed vector database for efficient similarity search of embedding vectors in your AI applications.

Key features

  • Hybrid search: Combines vector search with scalar filtering.

  • Performance: Fast vector retrieval, searching millions of objects in milliseconds.

  • Scalability & Reliability: Replication and elastic scale-out.
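
To make the hybrid search idea concrete, here is a minimal, self-contained sketch in plain Python. It is not the Vearch API: the function names, the document layout (`id`, `vector`, `fields`), the use of cosine similarity, and the filter-then-rank order are all assumptions chosen for illustration. A real engine would use an index rather than scanning every document.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_search(docs, query_vector, scalar_filter, limit):
    """Hypothetical helper: keep documents whose scalar fields pass the
    filter, then rank the survivors by similarity to the query vector."""
    candidates = [d for d in docs if scalar_filter(d["fields"])]
    scored = [(cosine_similarity(d["vector"], query_vector), d["id"]) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:limit]]


# Illustrative data only; the field name "price" is made up for this sketch.
docs = [
    {"id": "a", "vector": [1.0, 0.0], "fields": {"price": 10}},
    {"id": "b", "vector": [0.9, 0.1], "fields": {"price": 50}},
    {"id": "c", "vector": [0.0, 1.0], "fields": {"price": 5}},
]

# Document "b" is close to the query but is excluded by the scalar filter.
results = hybrid_search(docs, [1.0, 0.0], lambda f: f["price"] < 20, limit=2)
print(results)
```

The scalar filter narrows the candidate set and the vector comparison orders what remains, which is the combination the hybrid search feature refers to.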

Documentation

RESTful APIs

OpenAPIs

SDK

SDK         Description
Python SDK  Python client for Vearch
Go SDK      Go client for Vearch
Java SDK    Java client (under development)

Use Cases

Use Vearch as a Memory Backend

Vearch integrates with popular AI frameworks:

Framework    Integration
LangChain    Use Vearch as a vector store in LangChain
LlamaIndex   Integrate with LlamaIndex for knowledge bases
Langchaingo  Go implementation of LangChain with Vearch support
LangChain4j  Java implementation of LangChain with Vearch integration

Real-World Demos

  • VisualSearch: Vearch can be used to build a complete visual search system that indexes billions of images. This also requires the image retrieval plugin, which handles object detection and feature extraction.

Quick start

Kubernetes Deployment

# Via Helm Repository
$ helm repo add vearch https://vearch.github.io/vearch-helm
$ helm repo update && helm install my-release vearch/vearch

# Or from Local Charts
$ git clone https://github.com/vearch/vearch-helm.git && cd vearch-helm
$ helm install my-release ./charts -f ./charts/values.yaml

Docker Compose Deployment

# Standalone Mode
$ cd cloud && cp ../config/config.toml .
$ docker-compose --profile standalone up -d

# Cluster Mode
$ cd cloud && cp ../config/config_cluster.toml .
$ docker-compose --profile cluster up -d

Other Deployment Methods

Components

Vearch Architecture

Master: Responsible for schema management, cluster-level metadata, and resource coordination.

Router: Exposes the RESTful API (upsert, delete, search, and query) and handles request routing and result merging.

PartitionServer (PS): Hosts document partitions with Raft-based replication. Gamma, the core vector search engine, is built on Faiss and stores, indexes, and retrieves both vectors and scalars.
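
The division of labor between Router and PartitionServer can be illustrated with a small scatter-gather sketch in plain Python. This is a conceptual model under stated assumptions, not Vearch's implementation: `search_partition` and `route_search` are hypothetical names, partitions are modeled as in-memory dictionaries, and inner product is assumed as the score.

```python
import heapq
from itertools import islice


def search_partition(partition, query_vector, limit):
    """Stand-in for a partition server: score every local vector by inner
    product and return the local top hits as (score, doc_id), best first."""
    scored = [
        (sum(q * v for q, v in zip(query_vector, vector)), doc_id)
        for doc_id, vector in partition.items()
    ]
    return heapq.nlargest(limit, scored)


def route_search(partitions, query_vector, limit):
    """Stand-in for the router: fan the query out to every partition, then
    merge the already-sorted partial results into one global top list."""
    partial_results = [search_partition(p, query_vector, limit) for p in partitions]
    # heapq.merge with reverse=True expects each input sorted largest-first,
    # which is the order heapq.nlargest returns.
    merged = heapq.merge(*partial_results, reverse=True)
    return [doc_id for _, doc_id in islice(merged, limit)]


# Two illustrative partitions; document IDs and vectors are made up.
partitions = [
    {"doc1": [1.0, 0.0], "doc2": [0.2, 0.8]},
    {"doc3": [0.9, 0.1], "doc4": [0.0, 1.0]},
]

top_hits = route_search(partitions, [1.0, 0.0], limit=2)
print(top_hits)
```

Because each partition returns at most `limit` hits, the merge step only has to look at a small number of candidates regardless of how many documents each partition holds.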

Technical Reference

Academic Citation

When using Vearch in academic or research projects, please cite our paper:

@misc{li2019design,
      title={The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform},
      author={Jie Li and Haifeng Liu and Chuanghua Gui and Jianyu Chen and Zhenyun Ni and Ning Wang},
      year={2019},
      eprint={1908.07389},
      archivePrefix={arXiv},
      primaryClass={cs.IR}
}

Community Support

Connect With Us

Connect with the Vearch community through multiple channels:

  • GitHub Issues: Report bugs or request features on our issues page
  • Email Discussion: For public discussion or questions, contact us at vearch-maintainers@groups.io
  • Slack Channel: Join our community on Slack for real-time discussions

Contribution

We welcome contributions from the community! Check our contribution guidelines to get started.

License

Vearch is licensed under the Apache License, Version 2.0.

For complete licensing details, please see LICENSE and NOTICE in our repository.


© 2019 Vearch Contributors. All Rights Reserved.