Add SingleStore vectorstore integration (#1409)
* add SingleStore vectorstore integration
* undo wrong index formatting
* update yarn.lock file
* fix merge error
* switched to mysql2/promise and addressed other review comments
* update docs
* sanitize sql queries
* sanitize sql queries

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
1 parent b495505, commit 08bfda4
Showing 13 changed files with 381 additions and 1 deletion.
30 changes: 30 additions & 0 deletions
docs/docs/modules/indexes/vector_stores/integrations/singlestore.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
--- | ||
sidebar_class_name: node-only | ||
--- | ||
|
||
import CodeBlock from "@theme/CodeBlock"; | ||
|
||
# SingleStore | ||
|
||
[SingleStoreDB](https://singlestore.com/) is a high-performance distributed database system. It has long supported vector functions such as [dot_product](https://docs.singlestore.com/managed-service/en/reference/sql-reference/vector-functions/dot_product.html), which makes it a good fit for AI applications that require text similarity matching. | ||
|
||
:::tip Compatibility | ||
Only available on Node.js. | ||
::: | ||
|
||
LangChain.js accepts a `Pool` from `mysql2/promise` as the connection pool for the SingleStore vector store. | ||
|
||
## Setup | ||
|
||
1. Set up a SingleStoreDB environment. You can choose between the [Cloud-based](https://docs.singlestore.com/managed-service/en/getting-started-with-singlestoredb-cloud.html) and [On-Premise](https://docs.singlestore.com/db/v8.1/en/developer-resources/get-started-using-singlestoredb-for-free.html) editions. | ||
2. Install the mysql2 JS client: | ||
|
||
```bash npm2yarn | ||
npm install -S mysql2 | ||
``` | ||
|
||
## Usage | ||
|
||
import UsageExample from "@examples/indexes/vector_stores/singlestore.ts"; | ||
|
||
<CodeBlock language="typescript">{UsageExample}</CodeBlock> |
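The introduction above names `dot_product` as the function behind text similarity matching. As a rough illustration of what ranking by dot product means, here is a minimal in-memory sketch. The `StoredRow` type and the `dotProduct` and `topKByDotProduct` helpers are hypothetical names made up for this example; they are not part of LangChain.js or SingleStoreDB, and the database performs the equivalent work in SQL.

```typescript
// Hypothetical in-memory stand-in for a stored row; not part of LangChain.js.
interface StoredRow {
  content: string;
  vector: number[];
}

// Plain dot product of two equal-length vectors.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i += 1) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Score every row against the query, sort by score descending, keep the top k.
function topKByDotProduct(
  rows: StoredRow[],
  query: number[],
  k: number
): [StoredRow, number][] {
  return rows
    .map((row): [StoredRow, number] => [row, dotProduct(row.vector, query)])
    .sort((left, right) => right[1] - left[1])
    .slice(0, k);
}

// Toy two-dimensional vectors; real embeddings have far more dimensions.
const rows: StoredRow[] = [
  { content: "a", vector: [1, 0] },
  { content: "b", vector: [0, 1] },
  { content: "c", vector: [0.6, 0.8] },
];
const top = topKByDotProduct(rows, [1, 0], 2);
console.log(top.map(([row, score]) => `${row.content}: ${score}`));
```

If the stored embeddings are unit-length, ordering by dot product gives the same ranking as cosine similarity. Whether that holds depends on the embedding model, so treat it as an assumption rather than something this commit guarantees.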
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
import { SingleStoreVectorStore } from "langchain/vectorstores/singlestore"; | ||
import { OpenAIEmbeddings } from "langchain/embeddings/openai"; | ||
import { createPool } from "mysql2/promise"; | ||
|
||
export const run = async () => { | ||
const pool = createPool({ | ||
host: process.env.SINGLESTORE_HOST, | ||
port: Number(process.env.SINGLESTORE_PORT), | ||
user: process.env.SINGLESTORE_USERNAME, | ||
password: process.env.SINGLESTORE_PASSWORD, | ||
database: process.env.SINGLESTORE_DATABASE, | ||
}); | ||
|
||
const vectorStore = await SingleStoreVectorStore.fromTexts( | ||
["Hello world", "Bye bye", "hello nice world"], | ||
[{ id: 2 }, { id: 1 }, { id: 3 }], | ||
new OpenAIEmbeddings(), | ||
{ | ||
connectionPool: pool, | ||
} | ||
); | ||
|
||
const resultOne = await vectorStore.similaritySearch("hello world", 1); | ||
console.log(resultOne); | ||
await pool.end(); | ||
}; |
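The example above passes the `SINGLESTORE_*` environment variables straight to `createPool`, so an unset `SINGLESTORE_PORT` becomes `Number(undefined)`, which is `NaN`. One possible way to fail earlier is sketched below. `readSingleStoreEnv` and `SingleStoreConnectionOptions` are hypothetical names introduced for illustration and are not part of this commit or of mysql2; the sketch also assumes that empty values should count as missing.

```typescript
// Hypothetical shape mirroring the options the example passes to createPool.
interface SingleStoreConnectionOptions {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

// Read the SINGLESTORE_* variables from an env-like object and fail early on gaps.
function readSingleStoreEnv(
  env: Record<string, string | undefined>
): SingleStoreConnectionOptions {
  // Assumption: an empty string is treated the same as a missing variable.
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`Missing environment variable ${name}`);
    }
    return value;
  };

  const port = Number(required("SINGLESTORE_PORT"));
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error("SINGLESTORE_PORT must be an integer between 1 and 65535");
  }

  return {
    host: required("SINGLESTORE_HOST"),
    port,
    user: required("SINGLESTORE_USERNAME"),
    password: required("SINGLESTORE_PASSWORD"),
    database: required("SINGLESTORE_DATABASE"),
  };
}

// Placeholder values for demonstration only.
const options = readSingleStoreEnv({
  SINGLESTORE_HOST: "localhost",
  SINGLESTORE_PORT: "3306",
  SINGLESTORE_USERNAME: "root",
  SINGLESTORE_PASSWORD: "secret",
  SINGLESTORE_DATABASE: "demo",
});
console.log(options.host, options.port);
```

In a Node.js program the result could be handed to the pool factory, for example `createPool(readSingleStoreEnv(process.env))`.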
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,142 @@ | ||
import type { | ||
Pool, | ||
RowDataPacket, | ||
OkPacket, | ||
ResultSetHeader, | ||
FieldPacket, | ||
} from "mysql2/promise"; | ||
import { format } from "mysql2"; | ||
import { VectorStore } from "./base.js"; | ||
import { Embeddings } from "../embeddings/base.js"; | ||
import { Document } from "../document.js"; | ||
|
||
export interface SingleStoreVectorStoreConfig { | ||
connectionPool: Pool; | ||
tableName?: string; | ||
contentColumnName?: string; | ||
vectorColumnName?: string; | ||
metadataColumnName?: string; | ||
} | ||
|
||
export class SingleStoreVectorStore extends VectorStore { | ||
connectionPool: Pool; | ||
|
||
tableName: string; | ||
|
||
contentColumnName: string; | ||
|
||
vectorColumnName: string; | ||
|
||
metadataColumnName: string; | ||
|
||
constructor(embeddings: Embeddings, config: SingleStoreVectorStoreConfig) { | ||
super(embeddings, config); | ||
this.connectionPool = config.connectionPool; | ||
this.tableName = config.tableName ?? "embeddings"; | ||
this.contentColumnName = config.contentColumnName ?? "content"; | ||
this.vectorColumnName = config.vectorColumnName ?? "vector"; | ||
this.metadataColumnName = config.metadataColumnName ?? "metadata"; | ||
} | ||
|
||
async createTableIfNotExists(): Promise<void> { | ||
await this.connectionPool | ||
.execute(`CREATE TABLE IF NOT EXISTS ${this.tableName} ( | ||
${this.contentColumnName} TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci, | ||
${this.vectorColumnName} BLOB, | ||
${this.metadataColumnName} JSON);`); | ||
} | ||
|
||
async addDocuments(documents: Document[]): Promise<void> { | ||
const texts = documents.map(({ pageContent }) => pageContent); | ||
const vectors = await this.embeddings.embedDocuments(texts); | ||
return this.addVectors(vectors, documents); | ||
} | ||
|
||
async addVectors(vectors: number[][], documents: Document[]): Promise<void> { | ||
await this.createTableIfNotExists(); | ||
const { tableName } = this; | ||
|
||
await Promise.all( | ||
vectors.map(async (vector, idx) => { | ||
try { | ||
await this.connectionPool.execute( | ||
format( | ||
`INSERT INTO ${tableName} VALUES (?, JSON_ARRAY_PACK('[?]'), ?);`, | ||
[ | ||
documents[idx].pageContent, | ||
vector, | ||
JSON.stringify(documents[idx].metadata), | ||
] | ||
) | ||
); | ||
} catch (error) { | ||
console.error(`Error adding vector at index ${idx}:`, error); | ||
} | ||
}) | ||
); | ||
} | ||
|
||
async similaritySearchVectorWithScore( | ||
query: number[], | ||
k: number, | ||
_filter?: undefined | ||
): Promise<[Document, number][]> { | ||
// use vector DOT_PRODUCT as the similarity score (higher means more similar) | ||
const [rows]: [ | ||
( | ||
| RowDataPacket[] | ||
| RowDataPacket[][] | ||
| OkPacket | ||
| OkPacket[] | ||
| ResultSetHeader | ||
), | ||
FieldPacket[] | ||
] = await this.connectionPool.query( | ||
format( | ||
`SELECT ${this.contentColumnName}, | ||
${this.metadataColumnName}, | ||
DOT_PRODUCT(${this.vectorColumnName}, JSON_ARRAY_PACK('[?]')) as __score FROM ${this.tableName} | ||
ORDER BY __score DESC LIMIT ?;`, | ||
[query, k] | ||
) | ||
); | ||
const result: [Document, number][] = []; | ||
for (const row of rows as RowDataPacket[]) { | ||
const rowData = row as unknown as Record<string, unknown>; | ||
result.push([ | ||
new Document({ | ||
pageContent: rowData[this.contentColumnName] as string, | ||
metadata: rowData[this.metadataColumnName] as Record<string, unknown>, | ||
}), | ||
Number(rowData.__score), | ||
]); | ||
} | ||
return result; | ||
} | ||
|
||
static async fromTexts( | ||
texts: string[], | ||
metadatas: object[], | ||
embeddings: Embeddings, | ||
dbConfig: SingleStoreVectorStoreConfig | ||
): Promise<SingleStoreVectorStore> { | ||
const docs = texts.map((text, idx) => { | ||
const metadata = Array.isArray(metadatas) ? metadatas[idx] : metadatas; | ||
return new Document({ | ||
pageContent: text, | ||
metadata, | ||
}); | ||
}); | ||
return SingleStoreVectorStore.fromDocuments(docs, embeddings, dbConfig); | ||
} | ||
|
||
static async fromDocuments( | ||
docs: Document[], | ||
embeddings: Embeddings, | ||
dbConfig: SingleStoreVectorStoreConfig | ||
): Promise<SingleStoreVectorStore> { | ||
const instance = new this(embeddings, dbConfig); | ||
await instance.addDocuments(docs); | ||
return instance; | ||
} | ||
} |
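In the file above, `addVectors` catches each insert error and only logs it, so a caller cannot tell that some rows were skipped. One alternative, sketched here under stated assumptions, is to collect the failures and report them together once every insert has finished. `insertAll`, `InsertFailure`, and the injected `insertOne` callback are hypothetical names; the callback stands in for the `connectionPool.execute` call and nothing here is part of the commit.

```typescript
// Hypothetical record of a failed insert: which item failed and why.
interface InsertFailure {
  idx: number;
  error: unknown;
}

// Run one insert per item in parallel, then throw once if any of them failed.
async function insertAll<T>(
  items: T[],
  insertOne: (item: T, idx: number) => Promise<void>
): Promise<void> {
  const outcomes = await Promise.all(
    items.map(async (item, idx): Promise<InsertFailure | null> => {
      try {
        await insertOne(item, idx);
        return null;
      } catch (error) {
        // Capture the failure instead of swallowing it.
        return { idx, error };
      }
    })
  );

  const failures = outcomes.filter(
    (outcome): outcome is InsertFailure => outcome !== null
  );
  if (failures.length > 0) {
    const indexes = failures.map((failure) => failure.idx).join(", ");
    throw new Error(
      `Failed to insert ${failures.length} of ${items.length} rows (indexes: ${indexes})`
    );
  }
}
```

Every insert is still attempted, as in the original loop, but the caller now receives a single rejection listing the indexes that failed. Whether partial inserts should be rolled back is a separate design decision that this sketch does not address.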
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,70 @@ | ||
/* eslint-disable no-process-env */ | ||
/* eslint-disable import/no-extraneous-dependencies */ | ||
import { test, expect } from "@jest/globals"; | ||
import { createPool } from "mysql2/promise"; | ||
import { OpenAIEmbeddings } from "../../embeddings/openai.js"; | ||
import { SingleStoreVectorStore } from "../singlestore.js"; | ||
import { Document } from "../../document.js"; | ||
|
||
test("SingleStoreVectorStore", async () => { | ||
expect(process.env.SINGLESTORE_HOST).toBeDefined(); | ||
expect(process.env.SINGLESTORE_PORT).toBeDefined(); | ||
expect(process.env.SINGLESTORE_USERNAME).toBeDefined(); | ||
expect(process.env.SINGLESTORE_PASSWORD).toBeDefined(); | ||
expect(process.env.SINGLESTORE_DATABASE).toBeDefined(); | ||
|
||
const pool = createPool({ | ||
host: process.env.SINGLESTORE_HOST, | ||
port: Number(process.env.SINGLESTORE_PORT), | ||
user: process.env.SINGLESTORE_USERNAME, | ||
password: process.env.SINGLESTORE_PASSWORD, | ||
database: process.env.SINGLESTORE_DATABASE, | ||
}); | ||
const vectorStore = await SingleStoreVectorStore.fromTexts( | ||
["Hello world", "Bye bye", "hello nice world"], | ||
[ | ||
{ id: 2, name: "2" }, | ||
{ id: 1, name: "1" }, | ||
{ id: 3, name: "3" }, | ||
], | ||
new OpenAIEmbeddings(), | ||
{ | ||
connectionPool: pool, | ||
contentColumnName: "cont", | ||
metadataColumnName: "met", | ||
vectorColumnName: "vec", | ||
} | ||
); | ||
expect(vectorStore).toBeDefined(); | ||
|
||
const results = await vectorStore.similaritySearch("hello world", 1); | ||
|
||
expect(results).toEqual([ | ||
new Document({ | ||
pageContent: "Hello world", | ||
metadata: { id: 2, name: "2" }, | ||
}), | ||
]); | ||
|
||
await vectorStore.addDocuments([ | ||
new Document({ | ||
pageContent: "Green forest", | ||
metadata: { id: 4, name: "4" }, | ||
}), | ||
new Document({ | ||
pageContent: "Green field", | ||
metadata: { id: 5, name: "5" }, | ||
}), | ||
]); | ||
|
||
const results2 = await vectorStore.similaritySearch("forest", 1); | ||
|
||
expect(results2).toEqual([ | ||
new Document({ | ||
pageContent: "Green forest", | ||
metadata: { id: 4, name: "4" }, | ||
}), | ||
]); | ||
|
||
await pool.end(); | ||
}); |
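The test above passes custom column names (`cont`, `met`, `vec`), and the store interpolates the table and column names into its SQL strings with template literals rather than placeholders. If those names could ever come from untrusted input, a guard along the following lines might be worth adding. `assertSafeIdentifier` and `resolveNames` are hypothetical helpers written for this illustration, and the allowed pattern is an assumption that is deliberately stricter than what the database accepts.

```typescript
// Hypothetical guard, not part of the commit: allow only plain SQL identifiers.
const IDENTIFIER_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;

function assertSafeIdentifier(name: string, label: string): string {
  if (!IDENTIFIER_PATTERN.test(name)) {
    throw new Error(`Invalid ${label}: ${JSON.stringify(name)}`);
  }
  return name;
}

// Optional names, matching the optional fields of the store's config.
interface NameConfig {
  tableName?: string;
  contentColumnName?: string;
  vectorColumnName?: string;
  metadataColumnName?: string;
}

// Apply the same defaults as the constructor, validating each name on the way.
function resolveNames(config: NameConfig): Required<NameConfig> {
  return {
    tableName: assertSafeIdentifier(
      config.tableName ?? "embeddings",
      "table name"
    ),
    contentColumnName: assertSafeIdentifier(
      config.contentColumnName ?? "content",
      "content column name"
    ),
    vectorColumnName: assertSafeIdentifier(
      config.vectorColumnName ?? "vector",
      "vector column name"
    ),
    metadataColumnName: assertSafeIdentifier(
      config.metadataColumnName ?? "metadata",
      "metadata column name"
    ),
  };
}

// The names used in the test pass the check; the table name falls back to its default.
const names = resolveNames({
  contentColumnName: "cont",
  metadataColumnName: "met",
  vectorColumnName: "vec",
});
console.log(names);
```

An allow-list is used here because identifiers cannot be bound as ordinary value placeholders. Names that legitimately need other characters would require quoting instead, which this sketch does not cover.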
Successfully deployed to the following URLs:
langchainjs-docs – ./
langchainjs-docs-langchain.vercel.app
langchainjs-docs-git-main-langchain.vercel.app
langchainjs-docs-ruddy.vercel.app
js.langchain.com