feat: Add Cloudflare Vectorize and Workers AI embeddings (#2740)
* Add Cloudflare Vectorize and Workers AI embeddings

* Add id options

* Make params mandatory in CloudflareWorkersAIEmbeddings constructor

* Add Vectorize/WorkersAI example to CF test exports

* Small interface tweaks, add docs

* Revert optional dep listing

* Fix typo

* Fix typo

* Update docs, fix CI

* Fix build

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
mhart and jacoblee93 authored Sep 29, 2023
1 parent bb6204a commit 03911cc
Showing 15 changed files with 744 additions and 29 deletions.
@@ -0,0 +1,43 @@
---
hide_table_of_contents: true
---

# Cloudflare Workers AI

If you're deploying your project in a Cloudflare worker, you can use Cloudflare's [built-in Workers AI embeddings](https://developers.cloudflare.com/workers-ai/) with LangChain.js.

## Setup

First, [follow the official docs](https://developers.cloudflare.com/workers-ai/get-started/workers-wrangler/) to set up your worker.

You'll also need to install the official Cloudflare AI SDK:

```bash npm2yarn
npm install @cloudflare/ai
```

## Usage

Below is an example worker that uses Workers AI embeddings with a [Cloudflare Vectorize](/docs/modules/data_connection/vectorstores/integrations/cloudflare_vectorize) vectorstore.

:::note
If running locally, be sure to run wrangler as `npx wrangler dev --remote`!
:::

```toml
name = "langchain-test"
main = "worker.js"
compatibility_date = "2023-09-22"

[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "langchain-test"

[ai]
binding = "AI"
```

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/indexes/vector_stores/cloudflare_vectorize/example.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
@@ -0,0 +1,57 @@
---
hide_table_of_contents: true
---

# Cloudflare Vectorize

If you're deploying your project in a Cloudflare worker, you can use [Cloudflare Vectorize](https://developers.cloudflare.com/vectorize/) with LangChain.js.
It's a powerful and convenient option that's built directly into Cloudflare.

## Setup

:::tip Compatibility
Cloudflare Vectorize is currently in open beta, and requires a Cloudflare account on a paid plan to use.
:::

After [setting up your project](https://developers.cloudflare.com/vectorize/get-started/intro/#prerequisites),
create an index by running the following Wrangler command:

```bash
$ npx wrangler vectorize create <index_name> --preset @cf/baai/bge-small-en-v1.5
```

You can see a full list of options for the `vectorize` command [in the official documentation](https://developers.cloudflare.com/workers/wrangler/commands/#vectorize).

You'll then need to update your `wrangler.toml` file to include an entry for `[[vectorize]]`:

```toml
[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "<index_name>"
```

## Usage

Below is an example worker that adds documents to a vectorstore, queries it, or clears it depending on the path used. It also uses [Cloudflare Workers AI Embeddings](/docs/modules/data_connection/text_embedding/integrations/cloudflare_ai).

:::note
If running locally, be sure to run wrangler as `npx wrangler dev --remote`!
:::

```toml
name = "langchain-test"
main = "worker.js"
compatibility_date = "2023-09-22"

[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "langchain-test"

[ai]
binding = "AI"
```

import CodeBlock from "@theme/CodeBlock";
import Example from "@examples/indexes/vector_stores/cloudflare_vectorize/example.ts";

<CodeBlock language="typescript">{Example}</CodeBlock>
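
The path-based dispatch described in the Usage section can be isolated as a pure function, which makes it checkable without a Workers runtime. This is only an illustrative sketch: `Action` and `actionForPath` are hypothetical names that do not appear in this commit.

```typescript
// Hypothetical helper for illustration; not part of LangChain.js or the Workers runtime.
type Action = "query" | "load" | "clear" | "not_found";

// Map a request pathname to the operation the example worker performs for it.
function actionForPath(pathname: string): Action {
  switch (pathname) {
    case "/":
      return "query"; // similarity search
    case "/load":
      return "load"; // add documents with fixed ids
    case "/clear":
      return "clear"; // delete those ids again
    default:
      return "not_found"; // the worker answers with a 404
  }
}

// Extract the pathname from a full URL, as the worker does with request.url.
const examplePathname = new URL("https://example.com/load").pathname;
console.log(actionForPath(examplePathname)); // "load"
```

Keeping the dispatch separate from the store calls means the routing can be tested locally, while the store calls still need `npx wrangler dev --remote`.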
@@ -1,6 +1,6 @@
# Vercel Postgres

LangChain.js supports using the [`@vercel/postgres`](https://www.npmjs.com/package/@vercel/postgres) package to use to generic Postgres databases
LangChain.js supports using the [`@vercel/postgres`](https://www.npmjs.com/package/@vercel/postgres) package to use generic Postgres databases
as vector stores, provided they support the [`pgvector`](https://github.com/pgvector/pgvector) Postgres extension.

This integration is particularly useful from web environments like Edge functions.
@@ -13,7 +13,7 @@ To work with Vercel Postgres, you need to install the `@vercel/postgres` package
npm install @vercel/postgres
```

This integration automatically connects using the connection string set under `process.env.POSTGRES_URL`.
This integration automatically connects using the connection string set under `process.env.POSTGRES_URL`.
You can also pass a connection string manually like this:

```typescript
@@ -29,7 +29,7 @@ const vectorstore = await VercelPostgres.initialize(

### Connecting to Vercel Postgres

A simple way to get started is to create a serverless [Vercel Postgres instance](https://vercel.com/docs/storage/vercel-postgres/quickstart).
A simple way to get started is to create a serverless [Vercel Postgres instance](https://vercel.com/docs/storage/vercel-postgres/quickstart).
If you're deploying to a Vercel project with an associated Vercel Postgres instance, the required `POSTGRES_URL` environment variable
will already be populated in hosted environments.

19 changes: 9 additions & 10 deletions environment_tests/test-exports-cf/src/index.ts
@@ -36,17 +36,16 @@ export default {
env: Env,
ctx: ExecutionContext
): Promise<Response> {

const constructorParameters
= env.AZURE_OPENAI_API_KEY ? {
azureOpenAIApiKey: env.AZURE_OPENAI_API_KEY,
azureOpenAIApiInstanceName: env.AZURE_OPENAI_API_INSTANCE_NAME,
azureOpenAIApiDeploymentName: env.AZURE_OPENAI_API_DEPLOYMENT_NAME,
azureOpenAIApiVersion: env.AZURE_OPENAI_API_VERSION,
}
const constructorParameters = env.AZURE_OPENAI_API_KEY
? {
azureOpenAIApiKey: env.AZURE_OPENAI_API_KEY,
azureOpenAIApiInstanceName: env.AZURE_OPENAI_API_INSTANCE_NAME,
azureOpenAIApiDeploymentName: env.AZURE_OPENAI_API_DEPLOYMENT_NAME,
azureOpenAIApiVersion: env.AZURE_OPENAI_API_VERSION,
}
: {
openAIApiKey: env.OPENAI_API_KEY,
}
openAIApiKey: env.OPENAI_API_KEY,
};

// Instantiate a few things to test the exports
new OpenAI(constructorParameters);
9 changes: 8 additions & 1 deletion environment_tests/test-exports-cf/wrangler.toml
@@ -1,3 +1,10 @@
name = "test-exports-cf"
main = "src/index.ts"
compatibility_date = "2023-04-05"
compatibility_date = "2023-09-22"

[[vectorize]]
binding = "VECTORIZE_INDEX"
index_name = "langchain-test"

[ai]
binding = "AI"
56 changes: 56 additions & 0 deletions examples/src/indexes/vector_stores/cloudflare_vectorize/example.ts
@@ -0,0 +1,56 @@
import type {
VectorizeIndex,
Fetcher,
Request,
} from "@cloudflare/workers-types";

import { CloudflareVectorizeStore } from "langchain/vectorstores/cloudflare_vectorize";
import { CloudflareWorkersAIEmbeddings } from "langchain/embeddings/cloudflare_workersai";

export interface Env {
VECTORIZE_INDEX: VectorizeIndex;
AI: Fetcher;
}

export default {
async fetch(request: Request, env: Env) {
const { pathname } = new URL(request.url);
const embeddings = new CloudflareWorkersAIEmbeddings({
binding: env.AI,
modelName: "@cf/baai/bge-small-en-v1.5",
});
const store = new CloudflareVectorizeStore(embeddings, {
index: env.VECTORIZE_INDEX,
});
if (pathname === "/") {
const results = await store.similaritySearch("hello", 5);
return Response.json(results);
} else if (pathname === "/load") {
// Upsertion by id is supported
await store.addDocuments(
[
{
pageContent: "hello",
metadata: {},
},
{
pageContent: "world",
metadata: {},
},
{
pageContent: "hi",
metadata: {},
},
],
{ ids: ["id1", "id2", "id3"] }
);

return Response.json({ success: true });
} else if (pathname === "/clear") {
await store.delete({ ids: ["id1", "id2", "id3"] });
return Response.json({ success: true });
}

return Response.json({ error: "Not Found" }, { status: 404 });
},
};
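
The `// Upsertion by id is supported` comment in the example suggests that calling `/load` repeatedly overwrites `id1` through `id3` rather than creating duplicates. Below is a minimal sketch of that behaviour using a hypothetical in-memory class. `InMemoryIdStore` is an assumption made for illustration; it is not the Vectorize API or the `CloudflareVectorizeStore` implementation, and it does no embedding or similarity search.

```typescript
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical stand-in that mirrors only the id handling of the example worker.
class InMemoryIdStore {
  private docsById: Record<string, Doc> = {};

  // Writing to an id that already exists replaces the stored document (upsert).
  addDocuments(docs: Doc[], options: { ids: string[] }): void {
    if (docs.length !== options.ids.length) {
      throw new Error("Expected exactly one id per document");
    }
    docs.forEach((doc, i) => {
      this.docsById[options.ids[i]] = doc;
    });
  }

  // Remove the given ids; ids that are not present are ignored.
  delete(params: { ids: string[] }): void {
    params.ids.forEach((id) => {
      delete this.docsById[id];
    });
  }

  count(): number {
    return Object.keys(this.docsById).length;
  }
}

const exampleIds = ["id1", "id2", "id3"];
const exampleDocs: Doc[] = ["hello", "world", "hi"].map((text) => ({
  pageContent: text,
  metadata: {},
}));

const exampleStore = new InMemoryIdStore();
exampleStore.addDocuments(exampleDocs, { ids: exampleIds });
exampleStore.addDocuments(exampleDocs, { ids: exampleIds }); // second "/load" overwrites
console.log(exampleStore.count()); // 3

exampleStore.delete({ ids: exampleIds }); // equivalent of "/clear"
console.log(exampleStore.count()); // 0
```

Passing explicit ids is what makes `/clear` possible in the example: the worker can delete exactly the entries it created without listing the index first.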
6 changes: 6 additions & 0 deletions langchain/.gitignore
@@ -64,6 +67,9 @@ embeddings/base.d.ts
embeddings/cache_backed.cjs
embeddings/cache_backed.js
embeddings/cache_backed.d.ts
embeddings/cloudflare_workersai.cjs
embeddings/cloudflare_workersai.js
embeddings/cloudflare_workersai.d.ts
embeddings/fake.cjs
embeddings/fake.js
embeddings/fake.d.ts
@@ -166,6 +169,9 @@ vectorstores/elasticsearch.d.ts
vectorstores/memory.cjs
vectorstores/memory.js
vectorstores/memory.d.ts
vectorstores/cloudflare_vectorize.cjs
vectorstores/cloudflare_vectorize.js
vectorstores/cloudflare_vectorize.d.ts
vectorstores/chroma.cjs
vectorstores/chroma.js
vectorstores/chroma.d.ts
23 changes: 20 additions & 3 deletions langchain/package.json
@@ -76,6 +79,9 @@
"embeddings/cache_backed.cjs",
"embeddings/cache_backed.js",
"embeddings/cache_backed.d.ts",
"embeddings/cloudflare_workersai.cjs",
"embeddings/cloudflare_workersai.js",
"embeddings/cloudflare_workersai.d.ts",
"embeddings/fake.cjs",
"embeddings/fake.js",
"embeddings/fake.d.ts",
@@ -178,6 +181,9 @@
"vectorstores/memory.cjs",
"vectorstores/memory.js",
"vectorstores/memory.d.ts",
"vectorstores/cloudflare_vectorize.cjs",
"vectorstores/cloudflare_vectorize.js",
"vectorstores/cloudflare_vectorize.d.ts",
"vectorstores/chroma.cjs",
"vectorstores/chroma.js",
"vectorstores/chroma.d.ts",
@@ -668,7 +674,8 @@
"@aws-sdk/types": "^3.357.0",
"@azure/storage-blob": "^12.15.0",
"@clickhouse/client": "^0.0.14",
"@cloudflare/workers-types": "^4.20230904.0",
"@cloudflare/ai": "^1.0.12",
"@cloudflare/workers-types": "^4.20230922.0",
"@elastic/elasticsearch": "^8.4.0",
"@faker-js/faker": "^7.6.0",
"@getmetal/metal-sdk": "^4.0.0",
@@ -789,7 +796,7 @@
"@aws-sdk/credential-provider-node": "^3.388.0",
"@azure/storage-blob": "^12.15.0",
"@clickhouse/client": "^0.0.14",
"@cloudflare/workers-types": "^4.20230904.0",
"@cloudflare/ai": "^1.0.12",
"@elastic/elasticsearch": "^8.4.0",
"@getmetal/metal-sdk": "*",
"@getzep/zep-js": "^0.7.0",
@@ -893,7 +900,7 @@
"@clickhouse/client": {
"optional": true
},
"@cloudflare/workers-types": {
"@cloudflare/ai": {
"optional": true
},
"@elastic/elasticsearch": {
@@ -1262,6 +1269,11 @@
"import": "./embeddings/cache_backed.js",
"require": "./embeddings/cache_backed.cjs"
},
"./embeddings/cloudflare_workersai": {
"types": "./embeddings/cloudflare_workersai.d.ts",
"import": "./embeddings/cloudflare_workersai.js",
"require": "./embeddings/cloudflare_workersai.cjs"
},
"./embeddings/fake": {
"types": "./embeddings/fake.d.ts",
"import": "./embeddings/fake.js",
@@ -1432,6 +1444,11 @@
"import": "./vectorstores/memory.js",
"require": "./vectorstores/memory.cjs"
},
"./vectorstores/cloudflare_vectorize": {
"types": "./vectorstores/cloudflare_vectorize.d.ts",
"import": "./vectorstores/cloudflare_vectorize.js",
"require": "./vectorstores/cloudflare_vectorize.cjs"
},
"./vectorstores/chroma": {
"types": "./vectorstores/chroma.d.ts",
"import": "./vectorstores/chroma.js",
4 changes: 4 additions & 0 deletions langchain/scripts/create-entrypoints.js
@@ -35,6 +36,7 @@ const entrypoints = {
// embeddings
"embeddings/base": "embeddings/base",
"embeddings/cache_backed": "embeddings/cache_backed",
"embeddings/cloudflare_workersai": "embeddings/cloudflare_workersai",
"embeddings/fake": "embeddings/fake",
"embeddings/ollama": "embeddings/ollama",
"embeddings/openai": "embeddings/openai",
@@ -72,6 +73,7 @@ const entrypoints = {
"vectorstores/base": "vectorstores/base",
"vectorstores/elasticsearch": "vectorstores/elasticsearch",
"vectorstores/memory": "vectorstores/memory",
"vectorstores/cloudflare_vectorize": "vectorstores/cloudflare_vectorize",
"vectorstores/chroma": "vectorstores/chroma",
"vectorstores/googlevertexai": "vectorstores/googlevertexai",
"vectorstores/hnswlib": "vectorstores/hnswlib",
@@ -280,6 +282,7 @@ const requiresOptionalDependency = [
"callbacks/handlers/llmonitor",
"chains/load",
"chains/sql_db",
"embeddings/cloudflare_workersai",
"embeddings/cohere",
"embeddings/googlevertexai",
"embeddings/googlepalm",
@@ -301,6 +304,7 @@ const requiresOptionalDependency = [
"prompts/load",
"vectorstores/analyticdb",
"vectorstores/chroma",
"vectorstores/cloudflare_vectorize",
"vectorstores/googlevertexai",
"vectorstores/elasticsearch",
"vectorstores/hnswlib",
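
Taken together, the `.gitignore`, `package.json`, and `create-entrypoints.js` hunks follow one pattern: each entrypoint name corresponds to three build artifacts and one `exports` entry. The sketch below reproduces that pattern as it appears in this diff. `buildFilesFor` and `exportsEntryFor` are hypothetical helpers, and the real script may well be implemented differently.

```typescript
// Shape of one "exports" entry as seen in the package.json hunks above.
interface ExportsEntry {
  types: string;
  import: string;
  require: string;
}

// Hypothetical helper: the three generated files listed in .gitignore and "files".
function buildFilesFor(entrypoint: string): string[] {
  return [`${entrypoint}.cjs`, `${entrypoint}.js`, `${entrypoint}.d.ts`];
}

// Hypothetical helper: the key and value of the matching "exports" entry.
function exportsEntryFor(entrypoint: string): [string, ExportsEntry] {
  return [
    `./${entrypoint}`,
    {
      types: `./${entrypoint}.d.ts`,
      import: `./${entrypoint}.js`,
      require: `./${entrypoint}.cjs`,
    },
  ];
}

// The two entrypoints added by this commit.
const addedEntrypoints = [
  "embeddings/cloudflare_workersai",
  "vectorstores/cloudflare_vectorize",
];

addedEntrypoints.forEach((entrypoint) => {
  console.log(buildFilesFor(entrypoint));
  console.log(JSON.stringify(exportsEntryFor(entrypoint), null, 2));
});
```

Both new entrypoints also appear in `requiresOptionalDependency`, consistent with `@cloudflare/ai` being declared as an optional peer dependency in `package.json`.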