
Commit

feat (provider/amazon-bedrock): add support for cache points (#4955)
shaper authored Feb 21, 2025
1 parent 49d614f commit d1475de
Showing 18 changed files with 1,132 additions and 57 deletions.
5 changes: 5 additions & 0 deletions .changeset/slimy-lions-visit.md
@@ -0,0 +1,5 @@
---
'@ai-sdk/amazon-bedrock': patch
---

feat (provider/amazon-bedrock): add support for cache points
97 changes: 97 additions & 0 deletions content/providers/01-ai-sdk-providers/08-amazon-bedrock.mdx
@@ -219,6 +219,103 @@ if (result.providerMetadata?.bedrock.trace) {

See the [Amazon Bedrock Guardrails documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) for more information.

### Cache Points

<Note>
Amazon Bedrock prompt caching is currently in preview release. To request
access, visit the [Amazon Bedrock prompt caching
page](https://aws.amazon.com/bedrock/prompt-caching/).
</Note>

You can set cache points on individual messages via the `providerOptions` property: set its `bedrock` property to `{ cachePoint: { type: 'default' } }` to create a cache point on that message.

Cache usage information is returned in the `providerMetadata` object. See the examples below.

<Note>
Cache points have model-specific token minimums and limits. For example,
Claude 3.5 Sonnet v2 requires at least 1,024 tokens for a cache point and
allows up to 4 cache points. See the [Amazon Bedrock prompt caching
documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html)
for details on supported models, regions, and limits.
</Note>
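Because content below the per-model minimum is not cached, it can be useful to estimate prompt size before adding a cache point. The sketch below uses a rough characters-per-token heuristic (an assumption for illustration, not Bedrock's actual tokenizer; the `likelyMeetsCacheMinimum` helper is hypothetical):

```typescript
// Minimum cacheable prompt size for Claude 3.5 Sonnet v2 (see the Note above).
const MIN_CACHE_POINT_TOKENS = 1024;

// Rough heuristic: ~4 characters per token. This is an approximation only;
// the real token count depends on the model's tokenizer.
function likelyMeetsCacheMinimum(text: string): boolean {
  const approxTokens = Math.ceil(text.length / 4);
  return approxTokens >= MIN_CACHE_POINT_TOKENS;
}

console.log(likelyMeetsCacheMinimum('short prompt')); // false: far below 1,024 tokens
```

A check like this can help decide whether attaching `cachePoint` to a given system message is worthwhile.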

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';

const cyberpunkAnalysis =
'... literary analysis of cyberpunk themes and concepts ...';

const result = await generateText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
messages: [
{
role: 'system',
content: `You are an expert on William Gibson's cyberpunk literature and themes. You have access to the following academic analysis: ${cyberpunkAnalysis}`,
providerOptions: {
bedrock: { cachePoint: { type: 'default' } },
},
},
{
role: 'user',
content:
'What are the key cyberpunk themes that Gibson explores in Neuromancer?',
},
],
});

console.log(result.text);
console.log(result.providerMetadata?.bedrock?.usage);
// Shows cache read/write token usage, e.g.:
// {
// cacheReadInputTokens: 1337,
// cacheWriteInputTokens: 42,
// }
```

Cache points also work with streaming responses:

```ts
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { streamText } from 'ai';

const cyberpunkAnalysis =
'... literary analysis of cyberpunk themes and concepts ...';

const result = streamText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
messages: [
{
role: 'assistant',
content: [
{ type: 'text', text: 'You are an expert on cyberpunk literature.' },
{ type: 'text', text: `Academic analysis: ${cyberpunkAnalysis}` },
],
providerOptions: { bedrock: { cachePoint: { type: 'default' } } },
},
{
role: 'user',
content:
'How does Gibson explore the relationship between humanity and technology?',
},
],
});

for await (const textPart of result.textStream) {
process.stdout.write(textPart);
}

console.log(
'Cache token usage:',
(await result.providerMetadata)?.bedrock?.usage,
);
// Shows cache read/write token usage, e.g.:
// {
// cacheReadInputTokens: 1337,
// cacheWriteInputTokens: 42,
// }
```
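Since cache reads and writes are reported separately from regular input tokens, you may want to summarize them for logging. A minimal sketch based on the metadata shape shown in the examples above (the `BedrockCacheUsage` type and `summarizeCacheUsage` helper are illustrative, not part of the SDK):

```typescript
// Shape of the cache fields seen in providerMetadata?.bedrock?.usage above.
interface BedrockCacheUsage {
  cacheReadInputTokens: number;
  cacheWriteInputTokens: number;
}

// Hypothetical helper: formats cache activity for a log line.
function summarizeCacheUsage(usage: BedrockCacheUsage): string {
  const { cacheReadInputTokens, cacheWriteInputTokens } = usage;
  return `read ${cacheReadInputTokens} cached tokens, wrote ${cacheWriteInputTokens} tokens to cache`;
}

console.log(
  summarizeCacheUsage({ cacheReadInputTokens: 1337, cacheWriteInputTokens: 42 }),
);
// read 1337 cached tokens, wrote 42 tokens to cache
```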

### Model Capabilities

| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
@@ -0,0 +1,46 @@
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';
import 'dotenv/config';
import fs from 'node:fs';

const errorMessage = fs.readFileSync('data/error-message.txt', 'utf8');

async function main() {
const result = await generateText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
messages: [
{
role: 'assistant',
content: [
{
type: 'text',
text: 'You are a JavaScript expert.',
},
{
type: 'text',
text: `Error message: ${errorMessage}`,
},
],
providerOptions: { bedrock: { cachePoint: { type: 'default' } } },
},
{
role: 'user',
content: [
{
type: 'text',
text: 'Explain the error message.',
},
],
},
],
});

console.log(result.text);
console.log();
console.log('Token usage:', result.usage);
console.log('Cache token usage:', result.providerMetadata?.bedrock?.usage);
console.log('Finish reason:', result.finishReason);
console.log('Response headers:', result.response.headers);
}

main().catch(console.error);
@@ -0,0 +1,35 @@
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';
import 'dotenv/config';
import fs from 'node:fs';

const errorMessage = fs.readFileSync('data/error-message.txt', 'utf8');

async function main() {
const result = await generateText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
maxTokens: 512,
messages: [
{
role: 'system',
content: `You are a helpful assistant. You may be asked about ${errorMessage}.`,
providerOptions: {
bedrock: { cachePoint: { type: 'default' } },
},
},
{
role: 'user',
content: `Explain the error message`,
},
],
});

console.log(result.text);
console.log();
console.log('Token usage:', result.usage);
console.log('Cache token usage:', result.providerMetadata?.bedrock?.usage);
console.log('Finish reason:', result.finishReason);
console.log('Response headers:', result.response.headers);
}

main().catch(console.error);
@@ -0,0 +1,168 @@
import { generateText, tool } from 'ai';
import 'dotenv/config';
import { z } from 'zod';
import { bedrock } from '@ai-sdk/amazon-bedrock';

const weatherTool = tool({
description: 'Get the weather in a location',
parameters: z.object({
location: z.string().describe('The location to get the weather for'),
}),
// location below is inferred to be a string:
execute: async ({ location }) => ({
location,
temperature: weatherData[location],
}),
});

const weatherData: Record<string, number> = {
'New York': 72.4,
'Los Angeles': 84.2,
Chicago: 68.9,
Houston: 89.7,
Phoenix: 95.6,
Philadelphia: 71.3,
'San Antonio': 88.4,
'San Diego': 76.8,
Dallas: 86.5,
'San Jose': 75.2,
Austin: 87.9,
Jacksonville: 83.6,
'Fort Worth': 85.7,
Columbus: 69.8,
'San Francisco': 68.4,
Charlotte: 77.3,
Indianapolis: 70.6,
Seattle: 65.9,
Denver: 71.8,
'Washington DC': 74.5,
Boston: 69.7,
'El Paso': 91.2,
Detroit: 67.8,
Nashville: 78.4,
Portland: 66.7,
Memphis: 81.3,
'Oklahoma City': 82.9,
'Las Vegas': 93.4,
Louisville: 75.6,
Baltimore: 73.8,
Milwaukee: 66.5,
Albuquerque: 84.7,
Tucson: 92.3,
Fresno: 87.2,
Sacramento: 82.5,
Mesa: 94.8,
'Kansas City': 77.9,
Atlanta: 80.6,
Miami: 88.3,
Raleigh: 76.4,
Omaha: 73.5,
'Colorado Springs': 70.2,
'Long Beach': 79.8,
'Virginia Beach': 78.1,
Oakland: 71.4,
Minneapolis: 65.8,
Tulsa: 81.7,
Arlington: 85.3,
Tampa: 86.9,
'New Orleans': 84.5,
Wichita: 79.4,
Cleveland: 68.7,
Bakersfield: 88.6,
Aurora: 72.3,
Anaheim: 81.5,
Honolulu: 84.9,
'Santa Ana': 80.7,
Riverside: 89.2,
'Corpus Christi': 87.6,
Lexington: 74.8,
Henderson: 92.7,
Stockton: 83.9,
'Saint Paul': 66.2,
Cincinnati: 72.9,
Pittsburgh: 70.4,
Greensboro: 75.9,
Anchorage: 52.3,
Plano: 84.8,
Lincoln: 74.2,
Orlando: 85.7,
Irvine: 78.9,
Newark: 71.6,
Toledo: 69.3,
Durham: 77.1,
'Chula Vista': 77.4,
'Fort Wayne': 71.2,
'Jersey City': 72.7,
'St. Petersburg': 85.4,
Laredo: 90.8,
Madison: 67.3,
Chandler: 93.6,
Buffalo: 66.8,
Lubbock: 83.2,
Scottsdale: 94.1,
Reno: 76.5,
Glendale: 92.8,
Gilbert: 93.9,
'Winston-Salem': 76.2,
Irving: 85.1,
Hialeah: 87.8,
Garland: 84.6,
Fremont: 73.9,
Boise: 75.3,
Richmond: 76.7,
'Baton Rouge': 83.7,
Spokane: 67.4,
'Des Moines': 72.1,
Tacoma: 66.3,
'San Bernardino': 88.1,
Modesto: 84.3,
Fontana: 87.4,
'Santa Clarita': 82.6,
Birmingham: 81.9,
};

async function main() {
const result = await generateText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
tools: {
weather: weatherTool,
},
prompt: 'What is the weather in San Francisco?',
// TODO: need a way to set cachePoint on `tools`.
providerOptions: {
bedrock: {
cachePoint: {
type: 'default',
},
},
},
});

// typed tool calls:
for (const toolCall of result.toolCalls) {
switch (toolCall.toolName) {
case 'weather': {
toolCall.args.location; // string
break;
}
}
}

// typed tool results for tools with execute method:
for (const toolResult of result.toolResults) {
switch (toolResult.toolName) {
case 'weather': {
toolResult.args.location; // string
toolResult.result.location; // string
toolResult.result.temperature; // number
break;
}
}
}

console.log(result.text);
console.log(JSON.stringify(result.toolCalls, null, 2));
console.log(JSON.stringify(result.providerMetadata, null, 2));
}

main().catch(console.error);
@@ -0,0 +1,36 @@
import { bedrock } from '@ai-sdk/amazon-bedrock';
import { generateText } from 'ai';
import 'dotenv/config';
import fs from 'node:fs';

async function main() {
const result = await generateText({
model: bedrock('anthropic.claude-3-5-sonnet-20241022-v2:0'),
messages: [
{
role: 'user',
content: [
{ type: 'image', image: fs.readFileSync('./data/comic-cat.png') },
{
type: 'text',
text: 'What is in this image?',
},
],
providerOptions: { bedrock: { cachePoint: { type: 'default' } } },
},
],
});

console.log(result.text);
console.log();
console.log('Token usage:', result.usage);
// TODO: no cache token usage for some reason
// https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
// the only delta is some of the lead-in to passing the message bytes, and
// perhaps the size of the image.
console.log('Cache token usage:', result.providerMetadata?.bedrock?.usage);
console.log('Finish reason:', result.finishReason);
console.log('Response headers:', result.response.headers);
}

main().catch(console.error);
