diff --git a/examples/vector_databases/weaviate/generative-search-with-weaviate-and-openai.ipynb b/examples/vector_databases/weaviate/generative-search-with-weaviate-and-openai.ipynb new file mode 100644 index 0000000..6981775 --- /dev/null +++ b/examples/vector_databases/weaviate/generative-search-with-weaviate-and-openai.ipynb @@ -0,0 +1,275 @@ +{ + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "id": "cb1537e6", + "metadata": {}, + "source": [ + "# Using Weaviate with the Generative OpenAI module for Generative Search\n", + "\n", + "This notebook is prepared for a scenario where:\n", + "* Your data is already in Weaviate\n", + "* You want to use Weaviate with the Generative OpenAI module ([generative-openai](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai)).\n", + "\n" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "f1a618c5", + "metadata": {}, + "source": [ + "## Prerequisites\n", + "\n", + "This cookbook only covers Generative Search examples; it doesn't cover configuration or data imports.\n", + "\n", + "In order to make the most of this cookbook, please complete the [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb) first, where you will learn the essentials of working with Weaviate and import the demo data.\n", + "\n", + "Checklist:\n", + "* completed the [Getting Started cookbook](./getting-started-with-weaviate-and-openai.ipynb),\n", + "* created a `Weaviate` instance,\n", + "* imported data into your `Weaviate` instance,\n", + "* obtained an [OpenAI API key](https://beta.openai.com/account/api-keys)" + ] + }, + { + "cell_type": "markdown", + "id": "36fe86f4", + "metadata": {}, + "source": [ + "===========================================================\n", + "## Prepare your OpenAI API key\n", + "\n", + "The `OpenAI API key` is used for vectorization of your data at import, and for running queries.\n", + "\n", + "If you don't have an OpenAI 
API key, you can get one from [https://beta.openai.com/account/api-keys](https://beta.openai.com/account/api-keys).\n", + "\n", + "Once you get your key, please add it to your environment variables as `OPENAI_API_KEY`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "43395339", + "metadata": {}, + "outputs": [], + "source": [ + "# Export OpenAI API Key (note: `!export` runs in a subshell, so the value does not persist in the notebook kernel; see the os.environ alternative in the next cell)\n", + "!export OPENAI_API_KEY=\"your key\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "88be138c", + "metadata": {}, + "outputs": [], + "source": [ + "# Test that your OpenAI API key is correctly set as an environment variable\n", + "# Note: if you run this notebook locally, you will need to reload your terminal and the notebook for the env variables to be live.\n", + "import os\n", + "\n", + "# Note: alternatively, you can set a temporary env variable like this:\n", + "# os.environ[\"OPENAI_API_KEY\"] = 'your-key-goes-here'\n", + "\n", + "if os.getenv(\"OPENAI_API_KEY\") is not None:\n", + "    print(\"OPENAI_API_KEY is ready\")\n", + "else:\n", + "    print(\"OPENAI_API_KEY environment variable not found\")" + ] + }, + { + "cell_type": "markdown", + "id": "91df4d5b", + "metadata": {}, + "source": [ + "## Connect to your Weaviate instance\n", + "\n", + "In this section, we will:\n", + "\n", + "1. test the `OPENAI_API_KEY` environment variable (**make sure** you completed the step in [#Prepare-your-OpenAI-API-key](#Prepare-your-OpenAI-API-key))\n", + "2. connect to your Weaviate instance with your `OpenAI API Key`\n", + "3. test the client connection\n", + "\n", + "### The client \n", + "\n", + "After this step, the `client` object will be used to perform all Weaviate-related operations."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cc662c1b", + "metadata": {}, + "outputs": [], + "source": [ + "import weaviate\n", + "from datasets import load_dataset\n", + "import os\n", + "\n", + "# Connect to your Weaviate instance\n", + "client = weaviate.Client(\n", + "    url=\"https://your-wcs-instance-name.weaviate.network/\",\n", + "    # url=\"http://localhost:8080/\",\n", + "    auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n", + "    additional_headers={\n", + "        \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n", + "    }\n", + ")\n", + "\n", + "# Check if your instance is live and ready\n", + "# This should return `True`\n", + "client.is_ready()" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "id": "ceb14da9", + "metadata": {}, + "source": [ + "## Generative Search\n", + "Weaviate offers a [Generative Search OpenAI](https://weaviate.io/developers/weaviate/modules/reader-generator-modules/generative-openai) module, which generates responses based on the data stored in your Weaviate instance.\n", + "\n", + "The way you construct a generative search query is very similar to a standard semantic search query in Weaviate. \n", + "\n", + "For example:\n", + "* search in \"Article\", \n", + "* return \"title\", \"content\", \"url\"\n", + "* look for objects related to \"football clubs\"\n", + "* limit results to 5 objects\n", + "\n", + "```\n", + "    result = (\n", + "        client.query\n", + "        .get(\"Article\", [\"title\", \"content\", \"url\"])\n", + "        .with_near_text({\"concepts\": [\"football clubs\"]})\n", + "        .with_limit(5)\n", + "        # generative query will go here\n", + "        .do()\n", + "    )\n", + "```\n", + "\n", + "Now, you can add the `with_generate()` function to apply a generative transformation. 
`with_generate` takes either:\n", + "- `single_prompt`: to generate a response for each returned object,\n", + "- `grouped_task`: to generate a single response from all returned objects.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "51559251", + "metadata": {}, + "outputs": [], + "source": [ + "def generative_search_per_item(query, collection_name):\n", + "    prompt = \"Summarize in a short tweet the following content: {content}\"\n", + "\n", + "    result = (\n", + "        client.query\n", + "        .get(collection_name, [\"title\", \"content\", \"url\"])\n", + "        .with_near_text({ \"concepts\": [query], \"distance\": 0.7 })\n", + "        .with_limit(5)\n", + "        .with_generate(single_prompt=prompt)\n", + "        .do()\n", + "    )\n", + "\n", + "    # Check for errors\n", + "    if \"errors\" in result:\n", + "        print(\"\\033[91mYou have probably run out of OpenAI API calls for the current minute; the limit is set at 60 per minute.\")\n", + "        raise Exception(result[\"errors\"][0]['message'])\n", + "\n", + "    return result[\"data\"][\"Get\"][collection_name]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a4604726", + "metadata": {}, + "outputs": [], + "source": [ + "query_result = generative_search_per_item(\"football clubs\", \"Article\")\n", + "\n", + "for i, article in enumerate(query_result):\n", + " print(f\"{i+1}. 
{article['title']}\")\n", + " print(article['_additional']['generate']['singleResult']) # print generated response\n", + " print(\"-----------------------\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a45ea160", + "metadata": {}, + "outputs": [], + "source": [ + "def generative_search_group(query, collection_name):\n", + "    generate_task = \"Explain what these have in common\"\n", + "\n", + "    result = (\n", + "        client.query\n", + "        .get(collection_name, [\"title\", \"content\", \"url\"])\n", + "        .with_near_text({ \"concepts\": [query], \"distance\": 0.7 })\n", + "        .with_generate(grouped_task=generate_task)\n", + "        .with_limit(5)\n", + "        .do()\n", + "    )\n", + "\n", + "    # Check for errors\n", + "    if \"errors\" in result:\n", + "        print(\"\\033[91mYou have probably run out of OpenAI API calls for the current minute; the limit is set at 60 per minute.\")\n", + "        raise Exception(result[\"errors\"][0]['message'])\n", + "\n", + "    return result[\"data\"][\"Get\"][collection_name]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "11e0dad2", + "metadata": {}, + "outputs": [], + "source": [ + "query_result = generative_search_group(\"football clubs\", \"Article\")\n", + "\n", + "print(query_result[0]['_additional']['generate']['groupedResult'])" + ] + }, + { + "cell_type": "markdown", + "id": "2007be48", + "metadata": {}, + "source": [ + "Thanks for following along! You're now equipped to set up your own vector databases and use embeddings to do all kinds of cool things. Enjoy! For more complex use cases, please continue to work through the other cookbook examples in this repo."
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + }, + "vscode": { + "interpreter": { + "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/examples/vector_databases/weaviate/getting-started-with-weaviate-and-openai.ipynb b/examples/vector_databases/weaviate/getting-started-with-weaviate-and-openai.ipynb index de26eaa..9c68408 100644 --- a/examples/vector_databases/weaviate/getting-started-with-weaviate-and-openai.ipynb +++ b/examples/vector_databases/weaviate/getting-started-with-weaviate-and-openai.ipynb @@ -241,6 +241,7 @@ "client = weaviate.Client(\n", " url=\"https://your-wcs-instance-name.weaviate.network/\",\n", " # url=\"http://localhost:8080/\",\n", + " auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. 
for locally deployed instances)\n", " additional_headers={\n", " \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n", " }\n", diff --git a/examples/vector_databases/weaviate/hybrid-search-with-weaviate-and-openai.ipynb b/examples/vector_databases/weaviate/hybrid-search-with-weaviate-and-openai.ipynb index e4b57c2..9720479 100644 --- a/examples/vector_databases/weaviate/hybrid-search-with-weaviate-and-openai.ipynb +++ b/examples/vector_databases/weaviate/hybrid-search-with-weaviate-and-openai.ipynb @@ -241,6 +241,7 @@ "client = weaviate.Client(\n", " url=\"https://your-wcs-instance-name.weaviate.network/\",\n", "# url=\"http://localhost:8080/\",\n", + " auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n", " additional_headers={\n", " \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n", " }\n", diff --git a/examples/vector_databases/weaviate/question-answering-with-weaviate-and-openai.ipynb b/examples/vector_databases/weaviate/question-answering-with-weaviate-and-openai.ipynb index 7f13562..d2185f3 100644 --- a/examples/vector_databases/weaviate/question-answering-with-weaviate-and-openai.ipynb +++ b/examples/vector_databases/weaviate/question-answering-with-weaviate-and-openai.ipynb @@ -240,6 +240,7 @@ "client = weaviate.Client(\n", " url=\"https://your-wcs-instance-name.weaviate.network/\",\n", "# url=\"http://localhost:8080/\",\n", + " auth_client_secret=weaviate.auth.AuthApiKey(api_key=\"\"), # comment out this line if you are not using authentication for your Weaviate instance (i.e. for locally deployed instances)\n", " additional_headers={\n", " \"X-OpenAI-Api-Key\": os.getenv(\"OPENAI_API_KEY\")\n", " }\n",