diff --git a/examples/Data_Commons_with_Function_Calling.ipynb b/examples/Data_Commons_with_Function_Calling.ipynb new file mode 100644 index 00000000..a58d6bed --- /dev/null +++ b/examples/Data_Commons_with_Function_Calling.ipynb @@ -0,0 +1,462 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "O2J7t2PS4V9q" + }, + "source": [ + "# Connect to Data Commons Natural Language API" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ta0yNrKn251R" + }, + "source": [ + "In September 2024, Google launched the [Data Gemma model](https://blog.google/technology/ai/google-datagemma-ai-llm/). Their [RIG notebook](https://colab.sandbox.google.com/github/datacommonsorg/llm-tools/blob/master/notebooks/datagemma_rig.ipynb) connects the Data Gemma model to the [Data Commons Natural Language API](https://docs.datacommons.org/2023/09/13/explore.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WVi2F8Qf3_Xk" + }, + "source": [ + "This notebook explores the possibility of connecting the Gemini API to that same interface." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lfHmfcGD4U13" + }, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p8qK3Ow24nrd" + }, + "source": [ + "### Install\n", + "\n", + "Install the necessary Python packages." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "F8kJhIGYyA-z" + }, + "outputs": [], + "source": [ + "!pip install -Uq \"google-generativeai>=0.8.1\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pZhYYcbGXcTa" + }, + "outputs": [], + "source": [ + "!pip install -Uq datacommons" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fnHrOSeQ4rYj" + }, + "source": [ + "Use the helper code from the Data Commons `llm-tools` repository, which was written for Data Gemma." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CxXL7wbZazHO" + }, + "source": [ + "https://github.com/datacommonsorg/llm-tools/blob/main/data_gemma/datacommons.py" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "q3SkXl-nnrjU" + }, + "outputs": [], + "source": [ + "!pip install -q git+https://github.com/datacommonsorg/llm-tools@d99b583ca7aa5e7085c3181a87e23364749d7c63" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fzXMfsoR5D5n" + }, + "source": [ + "### Import\n", + "\n", + "Import the packages." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "T3bKWHScnp7y" + }, + "outputs": [], + "source": [ + "import data_gemma.datacommons as dc_lib" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "CqLB4HsRySxw" + }, + "outputs": [], + "source": [ + "import google.generativeai as genai" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HNxzzrx7p-uC" + }, + "outputs": [], + "source": [ + "from IPython import display" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Hv7g1pbIoJcd" + }, + "source": [ + "Get an API key from https://apikeys.datacommons.org and make sure to activate the NL API for your key.\n", + "\n", + "Note: The Data Commons trial key does not work for the natural language API.\n", + "\n", + "The code below fetches the keys from the \"Colab Secrets\" tab (\"🔑\" on the left of the Colab window)."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "WdTbIfcWoI5M" + }, + "outputs": [], + "source": [ + "from google.colab import userdata\n", + "\n", + "DATACOMMONS_API_KEY = userdata.get(\"DATACOMMONS_API_KEY\")\n", + "genai.configure(api_key=userdata.get('GOOGLE_API_KEY'))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Lwem9rW3Xga9" + }, + "outputs": [], + "source": [ + "dc = dc_lib.DataCommons(api_key=DATACOMMONS_API_KEY)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OgX9_-ZF5s-x" + }, + "source": [ + "## Try the Data Commons NL API" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "4FFLyn6RnbVB" + }, + "outputs": [], + "source": [ + "dc.point(\"what is the GDP of Spain?\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "K9eFEcNgupL8" + }, + "outputs": [], + "source": [ + "dc.table(\"what was the GDP of Spain over the years?\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X60ytbXM5xKG" + }, + "source": [ + "## Write some wrapper functions for Gemini\n", + "\n", + "This section makes the two methods callable by the Gemini API.\n", + "The main goal here is to give each wrapper a clear docstring that explains what the function does." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pxHPJydZxRuj" + }, + "outputs": [], + "source": [ + "import dataclasses\n", + "\n", + "def asdict(thing):\n", + " thing = dataclasses.asdict(thing)\n", + " thing = {key: value for key, value in thing.items() if value != \"\"}\n", + " thing.pop('query', None)\n", + " thing.pop('id', None)\n", + " return thing" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_sIQJuT8uwFl" + }, + "outputs": [], + "source": [ + "def datacommons_point(query: str):\n", + " \"\"\"Call the Data Commons API with a natural language query and return a single value.\n", + "\n", + " For example: \"what is the GDP of Spain?\"\n", + "\n", + " If the lookup fails, it returns an empty result.\n", + " \"\"\"\n", + " return asdict(dc.point(query))" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "rrICd1lQwxI7" + }, + "outputs": [], + "source": [ + "def datacommons_table(query: str):\n", + " \"\"\"Call the Data Commons API with a natural language query and return a table of values.\n", + "\n", + " For example: \"what was the GDP of Spain over the years?\"\n", + "\n", + " If the lookup fails, it returns an empty result.\n", + " \"\"\"\n", + " return asdict(dc.table(query))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "j-9cwuJ36LJ3" + }, + "source": [ + "## Try it" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1brrEYZb0ZyO" + }, + "outputs": [], + "source": [ + "SI = \"\"\"You are a data analyst and research assistant.\n", + "You have access to the Data Commons natural language API.\n", + "The user will send you a question, and you will query the database to research the answer.\n", + "Make sure your answers are well researched: Don't be lazy and simply copy-paste the user's question.\n", + "Any good answer will likely take several database queries.\n", + "\"\"\"\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "hNi6bMfqx939" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(model_name='gemini-1.5-flash',\n", + " # Describe the tools to 
Gemini\n", + " tools=[datacommons_point, datacommons_table],\n", + " system_instruction=SI)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OrsxHhCbyvZB" + }, + "outputs": [], + "source": [ + "chat = model.start_chat(\n", + " # Have the chat session automatically make the function calls and send back the responses.\n", + " enable_automatic_function_calling=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "RGkTQSDTzIcO" + }, + "outputs": [], + "source": [ + "# Ignore the returned response; look at the chat history for the full results.\n", + "_ = chat.send_message('Give me a report on the progress in public health in Pakistan over the last 20 years')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jocFGz7RuwKs" + }, + "source": [ + "### Show the result text\n", + "\n", + "The easiest way to show the result is to go through the chat history and print all the text chunks:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "VLQge7tY8afk" + }, + "outputs": [], + "source": [ + "for message in chat.history:\n", + " for part in message.parts:\n", + " if text := part.text:\n", + " print(f'{message.role.title()}:\\n\\n')\n", + " print(text)\n", + " print('-'*80)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Bryzd9hDuus2" + }, + "source": [ + "### Show detailed results\n", + "\n", + "You can get more detailed output if you process the different output types.\n", + "`Part`s can contain `text`, a `function_call`, or a `function_response`. For the `table` responses, convert them to Markdown." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "UPyI4LRfsIo3" + }, + "outputs": [], + "source": [ + "def markdown_table(table):\n", + " table = table.splitlines()\n", + "\n", + " def fixline(line):\n", + " if line.count('-') == len(line):\n", + " line = '|-|-|'\n", + " else:\n", + " line = f\"|{line}|\"\n", + " return line\n", + "\n", + " table = [fixline(line) for line in table if line]\n", + " table = '\\n'.join(table)\n", + " return table" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "RKLTQ7Wox3nm" + }, + "outputs": [], + "source": [ + "def table_fr(fr):\n", + " fr.pop('table', None)\n", + " fr = {key: value for key, value in fr.items() if value}\n", + " fr = [f\"|{str(key)}|{str(value)}|\" for key, value in fr.items()]\n", + " fr = '|key|value|\\n|-|-|\\n'+'\\n'.join(fr)\n", + " return fr" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "zcAbERfj7R3F" + }, + "outputs": [], + "source": [ + "for message in chat.history:\n", + " display.display(display.Markdown(f'## {message.role.title()}:\\n\\n'))\n", + " for part in message.parts:\n", + " if text := part.text:\n", + " display.display(display.Markdown(text))\n", + " if function_call := part.function_call:\n", + " display.display(display.Markdown(f\"function_call: {function_call.name}(\\\"{function_call.args['query']}\\\")\"))\n", + " if function_response := part.function_response:\n", + " fr = dict(function_response.response)\n", + " if table := function_response.response.get('table', None):\n", + " display.display(display.Markdown(f\"function_response: {function_response.name}\"))\n", + " display.display(display.Markdown(table_fr(fr)))\n", + " display.display(display.Markdown(markdown_table(table)))\n", + " else:\n", + " display.display(display.Markdown(f\"function_response: {function_response.name}\"))\n", + " 
display.display(display.Markdown(table_fr(fr)))\n", + " display.display(display.Markdown('-'*80))" + ] + } + ], + "metadata": { + "colab": { + "name": "Data_Commons_with_Function_Calling.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} diff --git a/examples/README.md b/examples/README.md index 6b9c6ca6..ca4465ec 100644 --- a/examples/README.md +++ b/examples/README.md @@ -23,6 +23,7 @@ This is a collection of fun examples for the Gemini API. * [Translate a public domain](https://github.com/google-gemini/cookbook/blob/main/examples/Translate_a_Public_Domain_Book.ipynb): In this notebook, you will explore Gemini model as a translation tool, demonstrating how to prepare data, create effective prompts, and save results into a `.txt` file. * [Working with Charts, Graphs, and Slide Decks](https://github.com/google-gemini/cookbook/blob/main/examples/Working_with_Charts_Graphs_and_Slide_Decks.ipynb): Gemini models are powerful multimodal LLMs that can process both text and image inputs. This notebook shows how Gemini 1.5 Flash model is capable of extracting data from various images. * [Entity extraction](https://github.com/google-gemini/cookbook/blob/main/examples/Entity_Extraction.ipynb): Use Gemini API to speed up some of your tasks, such as searching through text to extract needed information. Entity extraction with a Gemini model is a simple query, and you can ask it to retrieve its answer in the form that you prefer. +* [Connect to Data Commons Natural Language API](https://github.com/google-gemini/cookbook/blob/main/examples/Data_Commons_with_Function_Calling.ipynb): Uses function calling to query the Data Commons natural language API, similar to [DataGemma](https://blog.google/technology/ai/google-datagemma-ai-llm/). ### Integrations @@ -34,4 +35,4 @@ This is a collection of fun examples for the Gemini API. * [JSON Capabilities](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Tuning.ipynb): A directory with guides containing different types of tasks you can do with JSON schemas. * [Automate Google Workspace tasks with the Gemini API](https://github.com/google-gemini/cookbook/tree/main/examples/Apps_script_and_Workspace_codelab): This codelabs shows you how to connect to the Gemini API using Apps Script, and uses the function calling, vision and text capabilities to automate Google Workspace tasks - summarizing a document, analyzing a chart, sending an email and generating some slides directly. All of this is done from a free text input. -There are even more examples in the [quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts) folder and in the [Awesome Gemini page](../Awesome_gemini.md). \ No newline at end of file +There are even more examples in the [quickstarts](https://github.com/google-gemini/cookbook/tree/main/quickstarts) folder and in the [Awesome Gemini page](../Awesome_gemini.md).