How to use Bedrock with LiteLLM #281

234 changes: 234 additions & 0 deletions poc-to-prod/bedrock_with_litellm.ipynb
@@ -0,0 +1,234 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to use Bedrock with LiteLLM\n",
"This notebook demonstrates how to use Bedrock with LiteLLM. Making calls to various Large Language Models in Amazon Bedrock using LiteLLM SDK is simplified in this notebook.\n",
"\n",
"Customers implementing GenAI workloads at scale in production look for Load-balancing, routing and budget controls. This is easily achieved using an sdk such as LiteLLM which offers a range of features including load-balancing, budget and routing.\n",
"\n",
"In this example, you will learn how to make calls to Bedrock API using LiteLLM.\n",
"1. Overview - The notebook demonstrates how to use Amazon Bedrock with LiteLLM, simplifying calls to various Large Language Models (LLMs) in Amazon Bedrock using the LiteLLM SDK.\n",
" 1. What are we demonstrating - We're showing how to make API calls to Amazon Bedrock using LiteLLM, including basic completions, function calling, using guardrails, and streaming responses.\n",
" 2. What use case - This is useful for customers implementing GenAI workloads at scale in production who need load-balancing, routing, and budget controls, which LiteLLM offers.\n",
" 3. What will you learn:\n",
" - How to set up and use LiteLLM with Amazon Bedrock\n",
" - Making basic completion calls to Bedrock models\n",
" - Using function calling with Bedrock and LiteLLM\n",
" - Implementing Bedrock Guardrails with LiteLLM\n",
" - Using streaming for responses\n",
"2. What is the architectural pattern and why we select this - The pattern uses LiteLLM as an abstraction layer over Amazon Bedrock. This is chosen because it simplifies API calls and provides additional features like load-balancing and budget controls.\n",
" [User Application] -> [LiteLLM SDK] -> [Amazon Bedrock API] -> [Various LLMs]\n",
"3. What are the libraries to install - litellm, boto3\n",
"4. What model did we choose and why this model - The notebook uses \"bedrock/anthropic.claude-3-haiku-20240307-v1:0\", which is Claude 3 Haiku model from Anthropic on Amazon Bedrock. This model is likely chosen for its capabilities and availability on Bedrock.\n",
"5. Every cell needs to have a markup - The notebook does include markdown cells explaining each section. The code is structured into sections for different functionalities (basic usage, function calling, guardrails, streaming)."
]
},
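{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before diving in, here is a minimal sketch of the load-balancing pattern mentioned in the overview, using LiteLLM's `Router`. The `claude-haiku` alias and the two regions below are illustrative assumptions, not part of this notebook's required setup."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal load-balancing sketch with LiteLLM's Router.\n",
"# The alias name and regions are assumptions for illustration.\n",
"from litellm import Router\n",
"\n",
"router = Router(\n",
"    model_list=[\n",
"        {\n",
"            \"model_name\": \"claude-haiku\",  # shared alias for both deployments\n",
"            \"litellm_params\": {\n",
"                \"model\": \"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
"                \"aws_region_name\": \"us-east-1\",\n",
"            },\n",
"        },\n",
"        {\n",
"            \"model_name\": \"claude-haiku\",\n",
"            \"litellm_params\": {\n",
"                \"model\": \"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
"                \"aws_region_name\": \"us-west-2\",\n",
"            },\n",
"        },\n",
"    ]\n",
")\n",
"\n",
"# Calls to the alias are load-balanced across the two regional deployments.\n",
"response = router.completion(\n",
"    model=\"claude-haiku\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
")\n",
"print(response.choices[0].message.content)"
]
},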
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's start by importing the libraries used in this example"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python3 -m pip install litellm --quiet\n",
"!python3 -m pip install boto3 --quiet # LiteLLM requires boto3 to be installed for Bedrock APi requests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup AWS Credentials\n",
"We'll now setup AWS Environment variables to use a region and a profile/iam_role as below"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# set env\n",
"# os.environ[\"AWS_ACCESS_KEY_ID\"] = \"\"\n",
"# os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"\"\n",
"# os.environ[\"AWS_REGION_NAME\"] = \"\"\n",
"\n",
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" max_tokens=1000,\n",
" temperature=0.7\n",
" # aws_profile_name=\"\"\n",
")\n",
"print (response.choices[0].message.content)"
]
},
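{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, you can point LiteLLM at a named AWS profile instead of environment variables, as hinted by the commented `aws_profile_name` parameter above. The `my-profile` name below is an illustrative assumption; use a profile from your own `~/.aws/config`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: authenticate with a named AWS profile (the profile name is an assumption).\n",
"from litellm import completion\n",
"\n",
"response = completion(\n",
"    model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n",
"    max_tokens=1000,\n",
"    temperature=0.7,\n",
"    aws_profile_name=\"my-profile\"\n",
")\n",
"print(response.choices[0].message.content)"
]
},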
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also specify a region in the call as follows:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
" messages=[{ \"content\": \"Hello, how are you?\",\"role\": \"user\"}],\n",
" max_tokens=1000,\n",
" temperature=0.7,\n",
" aws_region_name=\"us-west-2\"\n",
")\n",
"print (response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to use Function Calling with Bedrock + LiteLM"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"tools = [\n",
" {\n",
" \"type\": \"function\",\n",
" \"function\": {\n",
" \"name\": \"get_current_weather\",\n",
" \"description\": \"Get the current weather in a given location\",\n",
" \"parameters\": {\n",
" \"type\": \"object\",\n",
" \"properties\": {\n",
" \"location\": {\n",
" \"type\": \"string\",\n",
" \"description\": \"The city and state, e.g. San Francisco, CA\",\n",
" },\n",
" \"unit\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]},\n",
" },\n",
" \"required\": [\"location\"],\n",
" },\n",
" },\n",
" }\n",
"]\n",
"messages = [{\"role\": \"user\", \"content\": \"What's the weather like in Boston today?\"}]\n",
"\n",
"response = completion(\n",
" model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
" messages=messages,\n",
" tools=tools,\n",
" tool_choice=\"auto\",\n",
")\n",
"\n",
"# Add any assertions, here to check response args\n",
"print(response)\n",
"assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)\n",
"assert isinstance(\n",
" response.choices[0].message.tool_calls[0].function.arguments, str\n",
")"
]
},
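{
"cell_type": "markdown",
"metadata": {},
"source": [
"To complete the tool-call loop, you can execute the requested function locally and send its result back to the model. This is a minimal sketch; the `get_current_weather` implementation below is a hypothetical stand-in for a real weather API call."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of completing the tool-call loop; get_current_weather is a hypothetical stand-in.\n",
"import json\n",
"from litellm import completion\n",
"\n",
"def get_current_weather(location, unit=\"fahrenheit\"):\n",
"    # Hypothetical implementation; a real one would call a weather API.\n",
"    return json.dumps({\"location\": location, \"temperature\": \"72\", \"unit\": unit})\n",
"\n",
"tool_call = response.choices[0].message.tool_calls[0]\n",
"args = json.loads(tool_call.function.arguments)\n",
"result = get_current_weather(**args)\n",
"\n",
"# Append the assistant's tool call and the tool result, then ask the model for a final answer.\n",
"messages.append(response.choices[0].message)\n",
"messages.append({\"role\": \"tool\", \"tool_call_id\": tool_call.id, \"content\": result})\n",
"\n",
"final_response = completion(\n",
"    model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
"    messages=messages,\n",
")\n",
"print(final_response.choices[0].message.content)"
]
},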
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to use Bedrock Guardrails with LiteLM\n",
"1. We first setup a new GuardRail in AWS Console --> Bedrock --> GuardRails.\n",
"2. Once a Guardrail is setup, fetch the `Guardrail ID and Version` and replace the `$guardrailIdentifier` & `guardrailVersion`in the code below."
]
},
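{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you would rather not copy the ID from the console, a sketch like the following lists your guardrails programmatically with boto3 (assuming your credentials can call the Bedrock control-plane API):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sketch: list guardrail IDs and versions with boto3 instead of the console.\n",
"import boto3\n",
"\n",
"bedrock = boto3.client(\"bedrock\")  # control-plane client, not bedrock-runtime\n",
"for guardrail in bedrock.list_guardrails()[\"guardrails\"]:\n",
"    print(guardrail[\"id\"], guardrail[\"version\"], guardrail[\"name\"])"
]
},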
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
" messages=[{ \"content\": \"Who were the winners of Olympics 2021 Swimming?\",\"role\": \"user\"}],\n",
" max_tokens=100,\n",
" temperature=0.7,\n",
" guardrailConfig={\n",
" \"guardrailIdentifier\": \"REPLACE_ME\", # The identifier (ID) for the guardrail.\n",
" \"guardrailVersion\": \"REPLACE_ME\", # The version of the guardrail.\n",
" \"trace\": \"enabled\", # The trace behavior for the guardrail. Can either be \"disabled\" or \"enabled\"\n",
" },\n",
")\n",
"print (response.choices[0].message.content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use Streaming"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from litellm import completion\n",
"\n",
"response = completion(\n",
" model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
" messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n",
" max_tokens=100,\n",
" temperature=0.7,\n",
" stream=True\n",
")\n",
"for chunk in response:\n",
" print(chunk.choices[0].delta.content or \"\")"
]
},
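{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a follow-up, the streamed chunks can be reassembled into the full reply. This is a simple sketch using plain string accumulation over the same call as above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: accumulate streamed deltas into the complete response text.\n",
"from litellm import completion\n",
"\n",
"response = completion(\n",
"    model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n",
"    messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n",
"    max_tokens=100,\n",
"    temperature=0.7,\n",
"    stream=True\n",
")\n",
"full_text = \"\".join(chunk.choices[0].delta.content or \"\" for chunk in response)\n",
"print(full_text)"
]
}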
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}