From 529701b5c5e64d14d6922c6c352c8fb351282ea5 Mon Sep 17 00:00:00 2001 From: amit-lulla Date: Wed, 21 Aug 2024 12:33:45 +0100 Subject: [PATCH] Bedrock with LiteLLM --- poc-to-prod/bedrock_with_litellm.ipynb | 234 +++++++++++++++++++++++++ 1 file changed, 234 insertions(+) create mode 100644 poc-to-prod/bedrock_with_litellm.ipynb diff --git a/poc-to-prod/bedrock_with_litellm.ipynb b/poc-to-prod/bedrock_with_litellm.ipynb new file mode 100644 index 00000000..2671838d --- /dev/null +++ b/poc-to-prod/bedrock_with_litellm.ipynb @@ -0,0 +1,234 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# How to use Bedrock with LiteLLM\n", + "This notebook demonstrates how to call various Large Language Models (LLMs) in Amazon Bedrock using the LiteLLM SDK.\n", + "\n", + "Customers implementing GenAI workloads at scale in production need load-balancing, routing, and budget controls. An SDK such as LiteLLM provides these features out of the box.\n", + "\n", + "In this example, you will learn how to make calls to the Bedrock API using LiteLLM.\n", + "1. Overview - The notebook demonstrates how to use Amazon Bedrock with LiteLLM, simplifying calls to various LLMs in Amazon Bedrock through the LiteLLM SDK.\n", + " 1. What we are demonstrating - How to make API calls to Amazon Bedrock using LiteLLM, including basic completions, function calling, guardrails, and streaming responses.\n", + " 2. Use case - This is useful for customers implementing GenAI workloads at scale in production who need the load-balancing, routing, and budget controls that LiteLLM offers.\n", + " 3. What you will learn:\n", + " - How to set up and use LiteLLM with Amazon Bedrock\n", + " - Making basic completion calls to Bedrock models\n", + " - Using function calling with Bedrock and LiteLLM\n", + " - Implementing Bedrock Guardrails with LiteLLM\n", + " - Using streaming for responses\n", + "2. Architectural pattern and why we selected it - LiteLLM acts as an abstraction layer over Amazon Bedrock. This pattern simplifies API calls and adds features such as load-balancing and budget controls.\n", + " [User Application] -> [LiteLLM SDK] -> [Amazon Bedrock API] -> [Various LLMs]\n", + "3. Libraries to install - `litellm`, `boto3`\n", + "4. Model choice - The notebook uses \"bedrock/anthropic.claude-3-haiku-20240307-v1:0\", the Claude 3 Haiku model from Anthropic on Amazon Bedrock, chosen for its speed, low cost, and availability on Bedrock.\n", + "5. Notebook structure - Each section is introduced by a markdown cell, and the code is organized into sections for basic usage, function calling, guardrails, and streaming."
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's start by installing the libraries used in this example" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!python3 -m pip install litellm --quiet\n", + "!python3 -m pip install boto3 --quiet # LiteLLM requires boto3 to be installed for Bedrock API requests" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Set up AWS Credentials\n", + "We'll now set up AWS environment variables to select a region and a profile/IAM role, as below" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# set env: uncomment these lines to use static credentials\n", + "# import os\n", + "# os.environ[\"AWS_ACCESS_KEY_ID\"] = \"\"\n", + "# os.environ[\"AWS_SECRET_ACCESS_KEY\"] = \"\"\n", + "# os.environ[\"AWS_REGION_NAME\"] = \"\"\n", + "\n", + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n", + " max_tokens=1000,\n", + " temperature=0.7\n", + " # aws_profile_name=\"\"\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "You can also specify a region in the call as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Hello, how are you?\"}],\n", + " max_tokens=1000,\n", + " temperature=0.7,\n", + " aws_region_name=\"us-west-2\"\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Function Calling with Bedrock + LiteLLM" + ] + }, + { 
+ "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "tools = [\n", + " {\n", + " \"type\": \"function\",\n", + " \"function\": {\n", + " \"name\": \"get_current_weather\",\n", + " \"description\": \"Get the current weather in a given location\",\n", + " \"parameters\": {\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"location\": {\n", + " \"type\": \"string\",\n", + " \"description\": \"The city and state, e.g. San Francisco, CA\",\n", + " },\n", + " \"unit\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]},\n", + " },\n", + " \"required\": [\"location\"],\n", + " },\n", + " },\n", + " }\n", + "]\n", + "messages = [{\"role\": \"user\", \"content\": \"What's the weather like in Boston today?\"}]\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=messages,\n", + " tools=tools,\n", + " tool_choice=\"auto\",\n", + ")\n", + "\n", + "# Assertions to check that the model returned a tool call with a name and arguments\n", + "print(response)\n", + "assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)\n", + "assert isinstance(\n", + " response.choices[0].message.tool_calls[0].function.arguments, str\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Bedrock Guardrails with LiteLLM\n", + "1. First, set up a new guardrail in the AWS Console --> Bedrock --> Guardrails.\n", + "2. Once the guardrail is set up, fetch its guardrail ID and version and use them to replace the `guardrailIdentifier` and `guardrailVersion` values in the code below."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n", + " max_tokens=100,\n", + " temperature=0.7,\n", + " guardrailConfig={\n", + " \"guardrailIdentifier\": \"REPLACE_ME\", # The identifier (ID) for the guardrail.\n", + " \"guardrailVersion\": \"REPLACE_ME\", # The version of the guardrail.\n", + " \"trace\": \"enabled\", # The trace behavior for the guardrail. Can either be \"disabled\" or \"enabled\"\n", + " },\n", + ")\n", + "print(response.choices[0].message.content)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## How to use Streaming" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from litellm import completion\n", + "\n", + "response = completion(\n", + " model=\"bedrock/anthropic.claude-3-haiku-20240307-v1:0\",\n", + " messages=[{\"role\": \"user\", \"content\": \"Who were the winners of Olympics 2021 Swimming?\"}],\n", + " max_tokens=100,\n", + " temperature=0.7,\n", + " stream=True\n", + ")\n", + "for chunk in response:\n", + " print(chunk.choices[0].delta.content or \"\", end=\"\")" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.0" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}
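Review note: the notebook's introduction cites load-balancing and routing as reasons to adopt LiteLLM, but no cell demonstrates them. LiteLLM packages these in its `Router` class; the sketch below is only a conceptual illustration of the simplest strategy (round-robin over two Bedrock deployments), written in plain Python with no LiteLLM dependency. The deployment list and region names are illustrative assumptions, not values from the notebook.

```python
from itertools import cycle

# Hypothetical deployment pool: the same Bedrock model served from two regions.
deployments = [
    {"model": "bedrock/anthropic.claude-3-haiku-20240307-v1:0", "aws_region_name": "us-east-1"},
    {"model": "bedrock/anthropic.claude-3-haiku-20240307-v1:0", "aws_region_name": "us-west-2"},
]

_pool = cycle(deployments)

def pick_deployment():
    # Round-robin: each call returns the next deployment in the pool,
    # wrapping back to the first after the last.
    return next(_pool)

# Four simulated requests alternate between the two regions.
regions = [pick_deployment()["aws_region_name"] for _ in range(4)]
print(regions)  # ['us-east-1', 'us-west-2', 'us-east-1', 'us-west-2']
```

In a production setup you would hand a `model_list` like this to LiteLLM's `Router`, which layers retries, cooldowns, and budget tracking on top of the same selection idea.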