VoyageAI vectorizer and reranker #152

Closed
wants to merge 14 commits into from
2 changes: 2 additions & 0 deletions .github/workflows/run_tests.yml
@@ -61,6 +61,7 @@ jobs:
GCP_LOCATION: ${{ secrets.GCP_LOCATION }}
GCP_PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
AZURE_OPENAI_API_KEY: ${{secrets.AZURE_OPENAI_API_KEY}}
AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
@@ -80,6 +81,7 @@ jobs:
GCP_LOCATION: ${{ secrets.GCP_LOCATION }}
GCP_PROJECT_ID: ${{ secrets.GCP_PROJECT_ID }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
AZURE_OPENAI_API_KEY: ${{secrets.AZURE_OPENAI_API_KEY}}
AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
1 change: 1 addition & 0 deletions README.md
@@ -216,6 +216,7 @@ Integrate with popular embedding models and providers to greatly simplify the pr
- [OpenAI](https://www.redisvl.com/api/vectorizer.html#openaitextvectorizer)
- [HuggingFace](https://www.redisvl.com/api/vectorizer.html#hftextvectorizer)
- [GCP VertexAI](https://www.redisvl.com/api/vectorizer.html#vertexaitextvectorizer)
- [VoyageAI](https://www.redisvl.com/api/vectorizer.html#voyageaitextvectorizer)

```python
from redisvl.utils.vectorize import CohereTextVectorizer
12 changes: 12 additions & 0 deletions docs/api/reranker.rst
@@ -12,3 +12,15 @@ CohereReranker
.. autoclass:: CohereReranker
:show-inheritance:
:members:


VoyageAIReranker
================

.. _voyageaireranker_api:

.. currentmodule:: redisvl.utils.rerank.voyageai

.. autoclass:: VoyageAIReranker
   :show-inheritance:
   :members:
11 changes: 11 additions & 0 deletions docs/api/vectorizer.rst
@@ -49,3 +49,14 @@ CohereTextVectorizer
:show-inheritance:
:members:


VoyageAITextVectorizer
======================

.. _voyageaitextvectorizer_api:

.. currentmodule:: redisvl.utils.vectorize.text.voyageai

.. autoclass:: VoyageAITextVectorizer
   :show-inheritance:
   :members:
2 changes: 1 addition & 1 deletion docs/user_guide/getting_started_01.ipynb
@@ -413,7 +413,7 @@
"source": [
"## Creating `VectorQuery` Objects\n",
"\n",
"Next we will create a vector query object for our newly populated index. This example will use a simple vector to demonstrate how vector similarity works. Vectors in production will likely be much larger than 3 floats and often require Machine Learning models (i.e. Huggingface sentence transformers) or an embeddings API (Cohere, OpenAI). `redisvl` provides a set of [Vectorizers](https://www.redisvl.com/user_guide/vectorizers_04.html#openai) to assist in vector creation."
"Next we will create a vector query object for our newly populated index. This example will use a simple vector to demonstrate how vector similarity works. Vectors in production will likely be much larger than 3 floats and often require Machine Learning models (e.g. HuggingFace sentence transformers) or an embeddings API (Cohere, OpenAI, VoyageAI). `redisvl` provides a set of [Vectorizers](https://www.redisvl.com/user_guide/vectorizers_04.html#openai) to assist in vector creation."
]
},
{
48 changes: 45 additions & 3 deletions docs/user_guide/rerankers_06.ipynb
@@ -9,7 +9,7 @@
"\n",
"In this notebook, we will show how to use RedisVL to rerank search results\n",
"(documents or chunks or records) based on the input query. Today RedisVL\n",
"supports reranking through the [Cohere /rerank API](https://docs.cohere.com/docs/rerank-2).\n",
"supports reranking through either the [Cohere /rerank API](https://docs.cohere.com/docs/rerank-2) or the [VoyageAI /rerank API](https://docs.voyageai.com/docs/reranker).\n",
"\n",
"Before running this notebook, be sure to:\n",
"1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
@@ -75,7 +75,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Init the Reranker\n",
"### Init the Cohere Reranker\n",
"\n",
"Initialize the reranker. Install the cohere library and provide the right Cohere API Key."
]
@@ -113,12 +113,54 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Init the VoyageAI Reranker\n",
"\n",
"Initialize the reranker. Install the `voyageai` library and provide a valid VoyageAI API key."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install voyageai"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# Set up the API key: read it from the environment, or prompt for it\n",
"api_key = os.environ.get(\"VOYAGE_API_KEY\") or getpass.getpass(\"Enter your VoyageAI API key: \")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from redisvl.utils.rerank import VoyageAIReranker\n",
"\n",
"# Please check the available models at https://docs.voyageai.com/docs/reranker\n",
"reranker = VoyageAIReranker(model=\"rerank-lite-1\", limit=3, api_config={\"api_key\": api_key})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Rerank documents\n",
"\n",
"Below we will use the `CohereReranker` to rerank and also truncate the list of\n",
"Below we will use the `CohereReranker` or the `VoyageAIReranker` to rerank and also truncate the list of\n",
"documents above based on relevance to the initial query."
]
},
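Review note on the "Rerank documents" cell: the following is a minimal, self-contained sketch of what "rerank and also truncate" means, not the `VoyageAIReranker` implementation. The `toy_relevance` scorer and the `rerank` helper are hypothetical stand-ins for the hosted model and for the reranker's `limit` setting; a real reranker computes relevance with a learned model rather than word overlap.

```python
from typing import Callable, List, Tuple


def toy_relevance(query: str, doc: str) -> float:
    """Hypothetical scorer: fraction of query words that also appear in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    docs: List[str],
    limit: int,
    score_fn: Callable[[str, str], float] = toy_relevance,
) -> List[Tuple[str, float]]:
    """Score every doc against the query, sort by score, keep the top `limit`."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


docs = [
    "Paris is the capital of France",
    "Redis is an in-memory data store",
    "The capital of the United States is Washington",
    "Bananas are rich in potassium",
]

top = rerank("capital of the United States", docs, limit=3)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

With `limit=3`, only the three best-scoring documents are returned, which mirrors the `limit=3` argument passed to the reranker in the notebook cell.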
71 changes: 71 additions & 0 deletions docs/user_guide/vectorizers_04.ipynb
@@ -12,6 +12,7 @@
"2. HuggingFace\n",
"3. Vertex AI\n",
"4. Cohere\n",
"5. VoyageAI\n",
"\n",
"Before running this notebook, be sure to\n",
"1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
@@ -501,6 +502,76 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### VoyageAI\n",
"\n",
"[VoyageAI](https://dash.voyageai.com/) provides embedding and reranking models for search and retrieval. The `VoyageAITextVectorizer` makes it simple to use RedisVL with VoyageAI's embedding models. To use it, install the `voyageai` package.\n",
"\n",
"```bash\n",
"pip install voyageai\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"# Set up the API key: read it from the environment, or prompt for it\n",
"api_key = os.environ.get(\"VOYAGE_API_KEY\") or getpass.getpass(\"Enter your VoyageAI API key: \")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"Pay special attention to the `input_type` parameter of each `embed` call: set `input_type='query'` when embedding \n",
"queries and `input_type='document'` when embedding documents. See the\n",
"[VoyageAI embeddings documentation](https://docs.voyageai.com/docs/embeddings) for more information."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Vector dimensions: 1024\n",
"[0.015814896672964096, 0.046988241374492645, -0.00518248463049531, -0.05383478105068207, -0.015586535446345806, -0.0837097093462944, 0.03744547441601753, -0.007797810714691877, 0.00717928446829319, 0.06857716292142868]\n",
"Vector dimensions: 1024\n",
"[0.006725038401782513, 0.01441393606364727, -0.030212024226784706, -0.06782275438308716, -0.021446991711854935, -0.07667966187000275, 0.01804908737540245, -0.015767497941851616, -0.02152789570391178, 0.049741245806217194]\n"
]
}
],
"source": [
"from redisvl.utils.vectorize import VoyageAITextVectorizer\n",
"\n",
"# create a vectorizer\n",
"vo = VoyageAITextVectorizer(\n",
" model=\"voyage-law-2\", # Please check the available models at https://docs.voyageai.com/docs/embeddings\n",
" api_config={\"api_key\": api_key},\n",
")\n",
"\n",
"# embed a search query\n",
"test = vo.embed(\"This is a test sentence.\", input_type='query')\n",
"print(\"Vector dimensions:\", len(test))\n",
"print(test[:10])\n",
"\n",
"# embed the same sentence as a document; the resulting vector differs from the query embedding\n",
"test = vo.embed(\"This is a test sentence.\", input_type='document')\n",
"print(\"Vector dimensions:\", len(test))\n",
"print(test[:10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
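Review note on the vectorizer notebook cell: its recorded output shows that the same sentence yields two different 1024-dimensional vectors depending on `input_type`. The sketch below only mimics that behavior with a toy, hash-based embedder so the effect can be seen without an API key. `toy_embed` and the `PREFIXES` table are hypothetical; how the VoyageAI service actually applies `input_type` is not shown in this PR.

```python
import hashlib
import math
from typing import List

# Hypothetical markers; they exist only so the two input types hash differently.
PREFIXES = {"query": "query: ", "document": "document: "}


def toy_embed(text: str, input_type: str, dims: int = 8) -> List[float]:
    """Toy embedder: hash the marked-up text into a unit-length vector."""
    if input_type not in PREFIXES:
        raise ValueError(f"input_type must be one of {sorted(PREFIXES)}")
    digest = hashlib.sha256((PREFIXES[input_type] + text).encode("utf-8")).digest()
    # Map the first `dims` bytes to floats in [-0.5, 0.5], then normalize.
    raw = [byte / 255.0 - 0.5 for byte in digest[:dims]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


sentence = "This is a test sentence."
query_vec = toy_embed(sentence, input_type="query")
doc_vec = toy_embed(sentence, input_type="document")

print("Vector dimensions:", len(query_vec))
print("Same text, same vector?", query_vec == doc_vec)
```

The practical consequence is that queries should be embedded with `input_type='query'` and stored records with `input_type='document'`, applied consistently, since the two settings produce different vectors for identical text.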