From 529701b5c5e64d14d6922c6c352c8fb351282ea5 Mon Sep 17 00:00:00 2001 From: amit-lulla Date: Wed, 21 Aug 2024 12:33:45 +0100 Subject: [PATCH] Bedrock with LiteLLM --- poc-to-prod/bedrock_with_litellm.ipynb | 234 +++++++++++++++++++++++++ 1 file changed, 234 insertions(+) create mode 100644 poc-to-prod/bedrock_with_litellm.ipynb diff --git a/poc-to-prod/bedrock_with_litellm.ipynb b/poc-to-prod/bedrock_with_litellm.ipynb new file mode 100644 index 00000000..2671838d --- /dev/null +++ b/poc-to-prod/bedrock_with_litellm.ipynb @@ -0,0 +1,234 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# How to use Bedrock with LiteLLM\n", + "This notebook demonstrates how to call various Large Language Models (LLMs) in Amazon Bedrock using the LiteLLM SDK.\n", + "\n", + "Customers implementing GenAI workloads at scale in production need load-balancing, routing, and budget controls. An SDK such as LiteLLM provides these features out of the box.\n", + "\n", + "In this example, you will learn how to make calls to the Bedrock API using LiteLLM.\n", + "1. Overview - The notebook demonstrates how to use Amazon Bedrock with LiteLLM, simplifying calls to various LLMs in Amazon Bedrock through the LiteLLM SDK.\n", + " 1. What we are demonstrating - How to make API calls to Amazon Bedrock using LiteLLM, including basic completions, function calling, guardrails, and streaming responses.\n", + " 2. Use case - This is useful for customers implementing GenAI workloads at scale in production who need the load-balancing, routing, and budget controls that LiteLLM offers.\n", + " 3. What you will learn:\n", + " - How to set up and use LiteLLM with Amazon Bedrock\n", + " - Making basic completion calls to Bedrock models\n", + " - Using function calling with Bedrock and LiteLLM\n", + " - Implementing Bedrock Guardrails with LiteLLM\n", + " - Using streaming for responses\n", + "2. Architectural pattern and why we selected it - LiteLLM acts as an abstraction layer over Amazon Bedrock. This pattern simplifies API calls and adds features such as load-balancing and budget controls.\n", + " [User Application] -> [LiteLLM SDK] -> [Amazon Bedrock API] -> [Various LLMs]\n", + "3. Libraries to install - `litellm`, `boto3`\n", + "4. Model choice - The notebook uses \"bedrock/anthropic.claude-3-haiku-20240307-v1:0\", the Claude 3 Haiku model from Anthropic on Amazon Bedrock, chosen for its speed, low cost, and availability on Bedrock.\n", + "5. Notebook structure - Each section is introduced by a markdown cell, and the code is organized into sections for basic usage, function calling, guardrails, and streaming."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's start by installing the libraries used in this example" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!python3 -m pip install litellm --quiet\n", + "!python3 -m pip install boto3 --quiet # LiteLLM requires boto3 to be installed for Bedrock API requests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Set up AWS Credentials\n", + "We'll now set up AWS environment variables to select a region and a profile/IAM role, as below" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# set env: uncomment these lines to use static credentials\n", + "# import os\n", + "# os.environ[\"AWS_ACCESS_KEY_ID\"] = \"\"\n", + "# os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"\"\n", + "# os.environ[\"AWS_REGION_NAME\"] = \"\"\n", + "\n", + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n", + " max_tokens=1000,\n", + " temperature=0.7\n", + " # aws_profile_name=\"\"\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can also specify a region in the call as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n", + " max_tokens=1000,\n", + " temperature=0.7,\n", + " aws_region_name=\"us-west-2\"\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Function Calling with Bedrock + LiteLLM" + ] + }, + { 
+ "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "tools = [\n", + " {\n", + " \"type\": \"function\",\n", + " \"function\": {\n", + " \"name\": \"get_current_weather\",\n", + " \"description\": \"Get the current weather in a given location\",\n", + " \"parameters\": {\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"location\": {\n", + " \"type\": \"string\",\n", + " \"description\": \"The city and state, e.g. San Francisco, CA\",\n", + " },\n", + " \"unit\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]},\n", + " },\n", + " \"required\": [\"location\"],\n", + " },\n", + " },\n", + " }\n", + "]\n", + "messages = [{\"role\": \"user\", \"content\": \"What's the weather like in Boston today?\"}]\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=messages,\n", + " tools=tools,\n", + " tool_choice=\"auto\",\n", + ")\n", + "\n", + "# Assertions to check that the model returned a tool call with a name and arguments\n", + "print(response)\n", + "assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)\n", + "assert isinstance(\n", + " response.choices[0].message.tool_calls[0].function.arguments, str\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Bedrock Guardrails with LiteLLM\n", + "1. First, set up a new guardrail in the AWS Console --> Bedrock --> Guardrails.\n", + "2. Once the guardrail is set up, fetch its guardrail ID and version and use them to replace the `guardrailIdentifier` and `guardrailVersion` values in the code below."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n", + " max_tokens=100,\n", + " temperature=0.7,\n", + " guardrailConfig={\n", + " \"guardrailIdentifier\": \"REPLACE_ME\", # The identifier (ID) for the guardrail.\n", + " \"guardrailVersion\": \"REPLACE_ME\", # The version of the guardrail.\n", + " \"trace\": \"enabled\", # The trace behavior for the guardrail. Can either be \"disabled\" or \"enabled\"\n", + " },\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Streaming" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n", + " max_tokens=100,\n", + " temperature=0.7,\n", + " stream=True\n", + ")\n", + "for chunk in response:\n", + " print(chunk.choices[0].delta.content or \"\", end=\"\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}
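Review note: the notebook's introduction cites load-balancing and routing as reasons to adopt LiteLLM, but no cell demonstrates them. LiteLLM packages these in its `Router` class; the sketch below is only a conceptual illustration of the simplest strategy (round-robin over two Bedrock deployments), written in plain Python with no LiteLLM dependency. The deployment list and region names are illustrative assumptions, not values from the notebook.

```python
from itertools import cycle

# Hypothetical deployment pool: the same Bedrock model served from two regions.
deployments = [
    {"model": "bedrock/anthropic.claude-3-haiku-20240307-v1:0", "aws_region_name": "us-east-1"},
    {"model": "bedrock/anthropic.claude-3-haiku-20240307-v1:0", "aws_region_name": "us-west-2"},
]

_pool = cycle(deployments)

def pick_deployment():
    # Round-robin: each call returns the next deployment in the pool,
    # wrapping back to the first after the last.
    return next(_pool)

# Four simulated requests alternate between the two regions.
regions = [pick_deployment()["aws_region_name"] for _ in range(4)]
print(regions)  # ['us-east-1', 'us-west-2', 'us-east-1', 'us-west-2']
```

In a production setup you would hand a `model_list` like this to LiteLLM's `Router`, which layers retries, cooldowns, and budget tracking on top of the same selection idea.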