{
"cells": [
{
"cell_type": "markdown",
"id": "685d4507",
"metadata": {},
"source": [
"# Enterprise Knowledge Retrieval\n",
"\n",
"This notebook contains an end-to-end workflow to set up an Enterprise Knowledge Retrieval solution from scratch.\n",
"\n",
"### Problem Statement\n",
"\n",
"LLMs have great conversational ability but their knowledge is general and often out of date. Relevant knowledge often exists, but is kept in disparate datestores that are hard to surface with current search solutions.\n",
"\n",
"\n",
"### Objective\n",
"\n",
"We want to deliver an outstanding user experience where the user is presented with the right knowledge when they need it in a clear and conversational way. To accomplish this we need an LLM-powered solution that knows our organizational context and data, that can retrieve the right knowledge when the user needs it. \n"
]
},
{
"cell_type": "markdown",
"id": "8eab9aae",
"metadata": {},
"source": [
"## Solution\n",
"\n",
"![title](img/enterprise_knowledge_retrieval.png)\n",
"\n",
"We'll build a knowledge retrieval solution that will embed a corpus of knowledge (in our case a database of Wikipedia manuals) and use it to answer user questions.\n",
"\n",
"### Learning Path\n",
"\n",
"#### Walkthrough\n",
"\n",
"You can follow on to this solution walkthrough through either the video recorded here, or the text walkthrough below. We'll build out the solution in the following stages:\n",
"- **Setup:** Initiate variables and connect to a vector database.\n",
"- **Storage:** Configure the database, prepare our data and store embeddings and metadata for retrieval.\n",
"- **Search:** Extract relevant documents back out with a basic search function and use an LLM to summarise results into a concise reply.\n",
"- **Answer:** Add a more sophisticated agent which will process the user's query and maintain a memory for follow-up questions.\n",
"- **Evaluate:** Take a sample evaluated question/answer pairs using our service and plot them to scope out remedial action."
]
},
{
"cell_type": "markdown",
"id": "ae9b1412",
"metadata": {},
"source": [
"## Walkthrough"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4e85be52",
"metadata": {},
"outputs": [],
"source": [
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "markdown",
"id": "ab1a0a6a",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Import libraries and set up a connection to a Redis vector database for our knowledge base.\n",
"\n",
"You can substitute Redis for any other vectorstore or database - there are a [selection](https://python.langchain.com/en/latest/modules/indexes/vectorstores.html) that are supported by Langchain natively, while other connectors will need to be developed yourself."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c79535f1",
"metadata": {},
"outputs": [],
"source": [
"!pip install redis"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "cd8e3d30",
"metadata": {},
"outputs": [],
"source": [
"from ast import literal_eval\n",
"import openai\n",
"import os\n",
"import numpy as np\n",
"from numpy import array, average\n",
"import pandas as pd\n",
"from typing import Iterator\n",
"import tiktoken\n",
"from tqdm.auto import tqdm\n",
"import wget\n",
"\n",
"\n",
"# Redis imports\n",
"from redis import Redis as r\n",
"from redis.commands.search.query import Query\n",
"from redis.commands.search.field import (\n",
" TextField,\n",
" VectorField,\n",
" NumericField\n",
")\n",
"from redis.commands.search.indexDefinition import (\n",
" IndexDefinition,\n",
" IndexType\n",
")\n",
"\n",
"# Langchain imports\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.chains import RetrievalQA\n",
"\n",
"CHAT_MODEL = \"gpt-3.5-turbo\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "53641bc5",
"metadata": {},
"outputs": [],
"source": [
"pd.set_option('display.max_colwidth', 0)"
]
},
{
"cell_type": "code",
"execution_count": 101,
"id": "6fbde85b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"100% [..........................................................................] 4470649 / 4470649"
]
},
{
"data": {
"text/plain": [
"'wikipedia_articles_2000 (1).csv'"
]
},
"execution_count": 101,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"embeddings_url = 'https://cdn.openai.com/API/examples/data/wikipedia_articles_2000.csv'\n",
"\n",
"# The file is ~700 MB so this will take some time\n",
"wget.download(embeddings_url)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5b873693",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Unnamed: 0</th>\n",
" <th>id</th>\n",
" <th>url</th>\n",
" <th>title</th>\n",
" <th>text</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>878</td>\n",
" <td>3661</td>\n",
" <td>https://simple.wikipedia.org/wiki/Photon</td>\n",
" <td>Photon</td>\n",
" <td>Photons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein's theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon's energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. 
The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light's circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. 
Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gauge bosons. Some mattercal</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2425</td>\n",
" <td>7796</td>\n",
" <td>https://simple.wikipedia.org/wiki/Thomas%20Dolby</td>\n",
" <td>Thomas Dolby</td>\n",
" <td>Thomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>18059</td>\n",
" <td>67912</td>\n",
" <td>https://simple.wikipedia.org/wiki/Embroidery</td>\n",
" <td>Embroidery</td>\n",
" <td>Embroidery is the art of decorating fabric or other materials with designs stitched in strands of thread or yarn using a needle. Embroidery may also incorporate other materials such as metal strips, pearls, beads, quills, and sequins. Sewing machines can be used to create machine embroidery.\\n\\nQualifications \\nCity and Guilds qualification in Embroidery allows embroiderers to become recognized for their skill. This qualification also gives them the credibility to teach. For example, the notable textiles artist, Kathleen Laurel Sage, began her teaching career by getting the City and Guilds Embroidery 1 and 2 qualifications. She has now gone on to write a book on the subject.\\n\\nReferences\\n\\nOther websites\\n The Crimson Thread of Kinship at the National Museum of Australia\\n\\nNeedlework</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>12045</td>\n",
" <td>44309</td>\n",
" <td>https://simple.wikipedia.org/wiki/Consecutive%20integer</td>\n",
" <td>Consecutive integer</td>\n",
" <td>Consecutive numbers are numbers that follow each other in order. They have a difference of 1 between every two numbers. In a set of consecutive numbers, the mean and the median are equal. \\n\\nIf n is a number, then the next numbers will be n+1 and n+2. \\n\\nExamples \\n\\nConsecutive numbers that follow each other in order:\\n\\n 1, 2, 3, 4, 5\\n -3, 2, 1, 0, 1, 2, 3, 4\\n 6, 7, 8, 9, 10, 11, 12, 13\\n\\nConsecutive even numbers \\nConsecutive even numbers are even numbers that follow each other. They have a difference of 2 between every two numbers.\\n\\nIf n is an even integer, then n, n+2, n+4 and n+6 will be consecutive even numbers.\\n\\nFor example - 2,4,6,8,10,12,14,18 etc.\\n\\nConsecutive odd numbers\\nConsecutive odd numbers are odd numbers that follow each other. Like consecutive odd numbers, they have a difference of 2 between every two numbers.\\n\\nIf n is an odd integer, then n, n+2, n+4 and n+6 will be consecutive odd numbers.\\n\\nExamples\\n\\n3, 5, 7, 9, 11, 13, etc.\\n\\n23, 21, 19, 17, 15, -13, -11\\n\\nIntegers</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>11477</td>\n",
" <td>41741</td>\n",
" <td>https://simple.wikipedia.org/wiki/German%20Empire</td>\n",
" <td>German Empire</td>\n",
" <td>The German Empire (\"Deutsches Reich\" or \"Deutsches Kaiserreich\" in the German language) is the name for a group of German countries from January 18, 1871 to November 9, 1918. This is from the Unification of Germany when Wilhelm I of Prussia was made German Kaiser to when the third Emperor Wilhelm II was removed from power at the end of the First World War. In the 1920s, German nationalists started to call it the \"Second Reich\".\\n\\nThe name of Germany was \"Deutsches Reich\" until 1945. \"Reich\" can mean many things, empire, kingdom, state, \"richness\" or \"wealth\". Most members of the Empire were previously members of the North German Confederation. \\n\\nAt different times, there were three groups of smaller countries, each group was later called a \"Reich\" by some Germans. The first was the Holy Roman Empire. The second was the German Empire. The third was the Third Reich.\\n\\nThe words \"Second Reich\" were used for the German Empire by Arthur Moeller van den Bruck, a nationalist writer in the 1920s. He was trying to make a link with the earlier Holy Roman Empire which had once been very strong. Germany had lost First World War and was suffering big problems. van den Bruck wanted to start a \"Third Reich\" to unite the country. These words were later used by the Nazis to make themselves appear stronger.\\n\\nStates in the Empire\\n\\nRelated pages\\n Germany\\n Holy Roman Empire\\n Nazi Germany, or \"Drittes Reich\"\\n\\n1870s establishments in Germany\\n \\nStates and territories disestablished in the 20th century\\nStates and territories established in the 19th century\\n1871 establishments in Europe\\n1918 disestablishments in Germany</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Unnamed: 0 id url \\\n",
"0 878 3661 https://simple.wikipedia.org/wiki/Photon \n",
"1 2425 7796 https://simple.wikipedia.org/wiki/Thomas%20Dolby \n",
"2 18059 67912 https://simple.wikipedia.org/wiki/Embroidery \n",
"3 12045 44309 https://simple.wikipedia.org/wiki/Consecutive%20integer \n",
"4 11477 41741 https://simple.wikipedia.org/wiki/German%20Empire \n",
"\n",
" title \\\n",
"0 Photon \n",
"1 Thomas Dolby \n",
"2 Embroidery \n",
"3 Consecutive integer \n",
"4 German Empire \n",
"\n",
"
"0 Photons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength).\\n\\nPhotons have a rest mass of 0 (zero). However, Einstein's theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. \\n\\nA photon is usually given the symbol γ (gamma),\\n\\nProperties \\n\\nPhotons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite.\\n\\nIn a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second.\\n\\nA photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation,\\n\\n,\\n\\nwhere is the photon's energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. \\n\\nAnother property of a photon is its wavelength. 
The frequency , wavelength , and speed of light are related by the equation,\\n\\n,\\n\\nwhere (lambda) is the wavelength, or length of the wave (typically measured in meters.)\\n\\nAnother important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. \\n\\nFinally, a photon has a property called spin. Spin is related to light's circular polarization.\\n\\nPhoton interactions with matter\\nLight is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light.\\n\\nPhotons and the electromagnetic force\\nIn particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. 
Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gauge bosons. Some mattercalled dar\n",
"1 Thomas Dolby (born Thomas Morgan Robertson; 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\".\\n\\nHe married actress Kathleen Beller in 1988. The couple have three children together.\\n\\nDiscography\\n\\nSingles\\n\\nA Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\".\\n\\nAlbums\\n\\nStudio albums\\n\\nEPs\\n\\nReferences\\n\\nEnglish musicians\\nLiving people\\n1958 births\\nNew wave musicians\\nWarner Bros. Records artists\n",
"2 Embroidery is the art of decorating fabric or other materials with designs stitched in strands of thread or yarn using a needle. Embroidery may also incorporate other materials such as metal strips, pearls, beads, quills, and sequins. Sewing machines can be used to create machine embroidery.\\n\\nQualifications \\nCity and Guilds qualification in Embroidery allows embroiderers to become recognized for their skill. This qualification also gives them the credibility to teach. For example, the notable textiles artist, Kathleen Laurel Sage, began her teaching career by getting the City and Guilds Embroidery 1 and 2 qualifications. She has now gone on to write a book on the subject.\\n\\nReferences\\n\\nOther websites\\n The Crimson Thread of Kinship at the National Museum of Australia\\n\\nNeedlework\n",
"3 Consecutive numbers are numbers that follow each other in order. They have a difference of 1 between every two numbers. In a set of consecutive numbers, the mean and the median are equal. \\n\\nIf n is a number, then the next numbers will be n+1 and n+2. \\n\\nExamples \\n\\nConsecutive numbers that follow each other in order:\\n\\n 1, 2, 3, 4, 5\\n -3, 2, 1, 0, 1, 2, 3, 4\\n 6, 7, 8, 9, 10, 11, 12, 13\\n\\nConsecutive even numbers \\nConsecutive even numbers are even numbers that follow each other. They have a difference of 2 between every two numbers.\\n\\nIf n is an even integer, then n, n+2, n+4 and n+6 will be consecutive even numbers.\\n\\nFor example - 2,4,6,8,10,12,14,18 etc.\\n\\nConsecutive odd numbers\\nConsecutive odd numbers are odd numbers that follow each other. Like consecutive odd numbers, they have a difference of 2 between every two numbers.\\n\\nIf n is an odd integer, then n, n+2, n+4 and n+6 will be consecutive odd numbers.\\n\\nExamples\\n\\n3, 5, 7, 9, 11, 13, etc.\\n\\n23, 21, 19, 17, 15, -13, -11\\n\\nIntegers\n",
"4 The German Empire (\"Deutsches Reich\" or \"Deutsches Kaiserreich\" in the German language) is the name for a group of German countries from January 18, 1871 to November 9, 1918. This is from the Unification of Germany when Wilhelm I of Prussia was made German Kaiser to when the third Emperor Wilhelm II was removed from power at the end of the First World War. In the 1920s, German nationalists started to call it the \"Second Reich\".\\n\\nThe name of Germany was \"Deutsches Reich\" until 1945. \"Reich\" can mean many things, empire, kingdom, state, \"richness\" or \"wealth\". Most members of the Empire were previously members of the North German Confederation. \\n\\nAt different times, there were three groups of smaller countries, each group was later called a \"Reich\" by some Germans. The first was the Holy Roman Empire. The second was the German Empire. The third was the Third Reich.\\n\\nThe words \"Second Reich\" were used for the German Empire by Arthur Moeller van den Bruck, a nationalist writer in the 1920s. He was trying to make a link with the earlier Holy Roman Empire which had once been very strong. Germany had lost First World War and was suffering big problems. van den Bruck wanted to start a \"Third Reich\" to unite the country. These words were later used by the Nazis to make themselves appear stronger.\\n\\nStates in the Empire\\n\\nRelated pages\\n Germany\\n Holy Roman Empire\\n Nazi Germany, or \"Drittes Reich\"\\n\\n1870s establishments in Germany\\n \\nStates and territories disestablished in the 20th century\\nStates and territories established in the 19th century\\n1871 establishments in Europe\\n1918 disestablishments in Germany"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"article_df = pd.read_csv('./data/wikipedia_articles_2000.csv')\n",
"article_df.head()"
]
},
{
"cell_type": "markdown",
"id": "ee8240c5",
"metadata": {},
"source": [
"## Storage\n",
"\n",
"We'll initialise our vector database first. Which database you choose and how you store data in it is a key decision point, and we've collated a few principles to aid your decision here:\n",
"\n",
"#### How much data to store\n",
"How much metadata do you want to include in the index. Metadata can be used to filter your queries or to bring back more information upon retrieval for your application to use, but larger indices will be slower so there is a trade-off.\n",
"\n",
"There are two common design patterns here:\n",
"- **All-in-one:** Store your metadata with the vector embeddings so you perform semantic search and retrieval on the same database. This is easier to setup and run, but can run into scaling issues when your index grows.\n",
"- **Vectors only:** Store just the embeddings and any IDs/references needed to locate the metadata that goes with the vector in a different database or location. In this pattern the vector database is only used to locate the most relevant IDs, then those are looked up from a different database. This can be more scalable if your vector database is going to be extremely large, or if you have large volumes of metadata with each vector.\n",
"\n",
"#### Which vector database to use\n",
"\n",
"The vector database market is wide and varied, so we won't recommend one over the other. For a few options you can review [this cookbook](./vector_databases/Using_vector_databases_for_embeddings_search.ipynb) and the sub-folders, which have examples supplied by many of the vector database providers in the market. \n",
"\n",
"We're going to use Redis as our database for both document contents and the vector embeddings. You will need the full Redis Stack to enable use of Redisearch, which is the module that allows semantic search - more detail is in the [docs for Redis Stack](https://redis.io/docs/stack/get-started/install/docker/).\n",
"\n",
"To set this up locally, you will need to:\n",
"- Install an appropriate version of [Docker](https://docs.docker.com/desktop/) for your OS\n",
"- Ensure Docker is running i.e. by running ```docker run hello-world```\n",
"- Run the following command: ```docker run -d --name redis-stack -p 6379:6379 -p 8001:8001 redis/redis-stack:latest```.\n",
"\n",
"The code used here draws heavily on [this repo](https://github.com/RedisAI/vecsim-demo).\n",
"\n",
"After setting up the Docker instance of Redis Stack, you can follow the below instructions to initiate a Redis connection and create a Hierarchical Navigable Small World (HNSW) index for semantic search."
]
},
{
"cell_type": "code",
"execution_count": 103,
"id": "fecba6de",
"metadata": {},
"outputs": [],
"source": [
"# Setup Redis\n",
"\n",
"\n",
"REDIS_HOST = 'localhost'\n",
"REDIS_PORT = '6380'\n",
"REDIS_DB = '0'\n",
"\n",
"redis_client = r(host=REDIS_HOST, port=REDIS_PORT, db=REDIS_DB,decode_responses=False)\n",
"\n",
"\n",
"# Constants\n",
"VECTOR_DIM = 1536 # length of the vectors\n",
"PREFIX = \"wiki\" # prefix for the document keys\n",
"DISTANCE_METRIC = \"COSINE\" # distance metric for the vectors (ex. COSINE, IP, L2)"
]
},
{
"cell_type": "code",
"execution_count": 104,
"id": "4cb5247d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create search index\n",
"\n",
"# Index\n",
"INDEX_NAME = \"wiki-index\" # name of the search index\n",
"VECTOR_FIELD_NAME = 'content_vector'\n",
"\n",
"# Define RediSearch fields for each of the columns in the dataset\n",
"# This is where you should add any additional metadata you want to capture\n",
"id = TextField(\"id\")\n",
"url = TextField(\"url\")\n",
"title = TextField(\"title\")\n",
"text_chunk = TextField(\"content\")\n",
"file_chunk_index = NumericField(\"file_chunk_index\")\n",
"\n",
"# define RediSearch vector fields to use HNSW index\n",
"\n",
"text_embedding = VectorField(VECTOR_FIELD_NAME,\n",
" \"HNSW\", {\n",
" \"TYPE\": \"FLOAT32\",\n",
" \"DIM\": VECTOR_DIM,\n",
" \"DISTANCE_METRIC\": DISTANCE_METRIC\n",
" }\n",
")\n",
"# Add all our field objects to a list to be created as an index\n",
"fields = [url,title,text_chunk,file_chunk_index,text_embedding]\n",
"\n",
"redis_client.ping()"
]
},
{
"cell_type": "code",
"execution_count": 126,
"id": "266b8aee",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 126,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"redis_client.flushall()"
]
},
{
"cell_type": "markdown",
"id": "33c07f00",
"metadata": {},
"source": [
"Optional step to drop the index if it already exists\n",
"\n",
"```redis_client.ft(INDEX_NAME).dropindex()```\n",
"\n",
"If you want to clear the whole DB use:\n",
"\n",
"```redis_client.flushall()```"
]
},
{
"cell_type": "code",
"execution_count": 127,
"id": "08f30b56",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Unknown Index name\n",
"Not there yet. Creating\n"
]
}
],
"source": [
"# Check if index exists\n",
"try:\n",
" redis_client.ft(INDEX_NAME).info()\n",
" print(\"Index already exists\")\n",
"except Exception as e:\n",
" print(e)\n",
" # Create RediSearch Index\n",
" print('Not there yet. Creating')\n",
" redis_client.ft(INDEX_NAME).create_index(\n",
" fields = fields,\n",
" definition = IndexDefinition(prefix=[PREFIX], index_type=IndexType.HASH)\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "7684b539",
"metadata": {},
"source": [
"### Data preparation\n",
"\n",
"The next step is to prepare your data. There are a few decisions to keep in mind here:\n",
"\n",
"#### Chunking your data\n",
"\n",
"In this context, \"chunking\" means cutting up the text into reasonable sizes so that the content will fit into the context length of the language model you choose. If your data is small enough or your LLM has a large enough context limit then you can proceed with no chunking, but in many cases you'll need to chunk your data. I'll share two main design patterns here:\n",
"- **Token-based:** Chunking your data based on some common token threshold i.e. 300, 500, 1000 depending on your use case. This approach works best with a grid-search evaluation to decide the optimal chunking logic over a set of evaluation questions. Variables to consider are whether chunks have overlaps, and whether you extend or truncate a section to keep full sentences and paragraphs together.\n",
"- **Deterministic:** Deterministic chunking uses some common delimiter, like a page break, paragraph end, section header etc. to chunk. This can work well if you have data of reasonable uniform structure, or if you can use GPT to help annotate the data first so you can guarantee common delimiters. However, it can be difficult to handle your chunks when you stuff them into the prompt given you need to cater for many different lengths of content, so consider that in your application design.\n",
"\n",
"#### Which vectors should you store\n",
"\n",
"It is critical to think through the user experience you're building towards because this will inform both the number and content of your vectors. Here are two example use cases that show how these can pan out:\n",
"- **Tool Manual Knowledge Base:** We have a database of manuals that our customers want to search over. For this use case, we want a vector to allow the user to identify the right manual, before searching a different set of vectors to interrogate the content of the manual to avoid any cross-pollination of similar content between different manuals. \n",
" - **Title Vector:** Could include title, author name, brand and abstract.\n",
" - **Content Vector:** Includes content only.\n",
"- **Investor Reports:** We have a database of investor reports that contain financial information about public companies. I want relevant snippets pulled out and summarised so I can decide how to invest. In this instance we want one set of content vectors, so that the retrieval can pull multiple entries on a company or industry, and summarise them to form a composite analysis.\n",
" - **Content Vector:** Includes content only, or content supplemented by other features that improve search quality such as author, industry etc.\n",
" \n",
"For this walkthrough we'll go with 1000 token-based chunking of text content with no overlap, and embed them with the article title included as a prefix."
]
},
{
"cell_type": "code",
"execution_count": 128,
"id": "948225f7",
"metadata": {},
"outputs": [],
"source": [
"# We'll use 1000 token chunks with some intelligence to not split at the end of a sentence\n",
"TEXT_EMBEDDING_CHUNK_SIZE = 1000\n",
"EMBEDDINGS_MODEL = \"text-embedding-ada-002\""
]
},
{
"cell_type": "code",
"execution_count": 129,
"id": "31004582",
"metadata": {},
"outputs": [],
"source": [
"# Create embeddings for a text using a tokenizer and an OpenAI engine\n",
"\n",
"def create_embeddings_for_text(text, tokenizer):\n",
" \"\"\"Return a list of tuples (text_chunk, embedding) and an average embedding for a text.\"\"\"\n",
" token_chunks = list(chunks(text, TEXT_EMBEDDING_CHUNK_SIZE, tokenizer))\n",
" text_chunks = [tokenizer.decode(chunk) for chunk in token_chunks]\n",
"\n",
" embeddings_response = openai.Embedding.create(input=text_chunks, model=EMBEDDINGS_MODEL)\n",
" embeddings = [embedding[\"embedding\"] for embedding in embeddings_response]\n",
" text_embeddings = list(zip(text_chunks, embeddings))\n",
"\n",
" return text_embeddings\n",
"\n",
"# Split a text into smaller chunks of size n, preferably ending at the end of a sentence\n",
"def chunks(text, n, tokenizer):\n",
" tokens = tokenizer.encode(text)\n",
" \"\"\"Yield successive n-sized chunks from text.\"\"\"\n",
" i = 0\n",
" while i < len(tokens):\n",
" # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens\n",
" j = min(i + int(1.5 * n), len(tokens))\n",
" while j > i + int(0.5 * n):\n",
" # Decode the tokens and check for full stop or newline\n",
" chunk = tokenizer.decode(tokens[i:j])\n",
" if chunk.endswith(\".\") or chunk.endswith(\"\\n\"):\n",
" break\n",
" j -= 1\n",
" # If no end of sentence found, use n tokens as the chunk size\n",
" if j == i + int(0.5 * n):\n",
" j = min(i + n, len(tokens))\n",
" yield tokens[i:j]\n",
" i = j\n",
" \n",
"def get_unique_id_for_file_chunk(title, chunk_index):\n",
" return str(title+\"-!\"+str(chunk_index))\n",
"\n",
"def process_file(x,vector_list):\n",
" url = x['url']\n",
" title = x['title']\n",
" file_body_string = x['text']\n",
"\n",
" # Clean up the file string by replacing newlines and double spaces and semi-colons\n",
" clean_file_body_string = file_body_string.replace(\" \", \" \").replace(\"\\n\", \"; \").replace(';',' ')\n",
" # \n",
" \n",
" \"\"\"Return a list of tuples (text_chunk, embedding) and an average embedding for a text.\"\"\"\n",
" token_chunks = list(chunks(clean_file_body_string, TEXT_EMBEDDING_CHUNK_SIZE, tokenizer))\n",
" text_chunks = [f'Title: {title};\\n'+ tokenizer.decode(chunk) for chunk in token_chunks]\n",
" \n",
" embeddings_response = openai.Embedding.create(input=text_chunks, model=EMBEDDINGS_MODEL)\n",
"\n",
" embeddings = [embedding[\"embedding\"] for embedding in embeddings_response['data']]\n",
" text_embeddings = list(zip(text_chunks, embeddings))\n",
"\n",
" # Get the vectors array of triples: file_chunk_id, embedding, metadata for each embedding\n",
" # Metadata is a dict with keys: filename, file_chunk_index\n",
" \n",
" for i, (text_chunk, embedding) in enumerate(text_embeddings):\n",
" id = get_unique_id_for_file_chunk(title, i)\n",
" vector_list.append(({'id': id\n",
" , \"vector\": embedding, 'metadata': {\"url\": x['url']\n",
" ,\"title\": title\n",
" , \"content\": text_chunk\n",
" , \"file_chunk_index\": i}}))"
]
},
{
"cell_type": "code",
"execution_count": 130,
"id": "dfeff174",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 12.5 s, sys: 2.48 s, total: 15 s\n",
"Wall time: 11min 47s\n"
]
},
{
"data": {
"text/plain": [
"0 None\n",
"1 None\n",
"2 None\n",
"3 None\n",
"4 None\n",
" ... \n",
"1995 None\n",
"1996 None\n",
"1997 None\n",
"1998 None\n",
"1999 None\n",
"Length: 2000, dtype: object"
]
},
"execution_count": 130,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"# This step takes about 13 minutes\n",
"\n",
"# Initialise tokenizer\n",
"tokenizer = tiktoken.get_encoding(\"cl100k_base\")\n",
"\n",
"# List to hold vectors\n",
"vector_list = []\n",
"\n",
"# Process each PDF file and prepare for embedding\n",
"article_df.apply(lambda x: process_file(x, vector_list),axis = 1)"
]
},
{
"cell_type": "code",
"execution_count": 131,
"id": "0352283a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Title: Photon;\n",
"Photons (from Greek φως, meaning light), in many atomic models in physics, are particles which transmit light. In other words, light is carried over space by photons. Photon is an elementary particle that is its own antiparticle. In quantum mechanics each photon has a characteristic quantum of energy that depends on frequency: A photon associated with light at a higher frequency will have more energy (and be associated with light at a shorter wavelength). Photons have a rest mass of 0 (zero). However, Einstein's theory of relativity says that they do have a certain amount of momentum. Before the photon got its name, Einstein revived the proposal that light is separate pieces of energy (particles). These particles came to be known as photons. A photon is usually given the symbol γ (gamma), Properties Photons are fundamental particles. Although they can be created and destroyed, their lifetime is infinite. In a vacuum, all photons move at the speed of light, c, which is equal to 299,792,458 meters (approximately 300,000 kilometers) per second. A photon has a given frequency, which determines its color. Radio technology makes great use of frequency. Beyond the visible range, frequency is less discussed, for example it is little used in distinguishing between X-Ray photons and infrared. Frequency is equivalent to the quantum energy of the photon, as related by the Planck constant equation, , where is the photon's energy, is the Plank constant, and is the frequency of the light associated with the photon. This frequency, , is typically measured in cycles per second, or equivalently, in Hz. The quantum energy of different photons is often used in cameras, and other machines that use visible and higher than visible radiation. This because these photons are energetic enough to ionize atoms. Another property of a photon is its wavelength. 
The frequency , wavelength , and speed of light are related by the equation, , where (lambda) is the wavelength, or length of the wave (typically measured in meters.) Another important property of a photon is its polarity. If you saw a giant photon coming straight at you, it could appear as a swath whipping vertically, horizontally, or somewhere in between. Polarized sunglasses stop photons swinging up and down from passing. This is how they reduce glare as light bouncing off of surfaces tend to fly that way. Liquid crystal displays also use polarity to control which light passes through. Some animals can see light polarization. Finally, a photon has a property called spin. Spin is related to light's circular polarization. Photon interactions with matter Light is often created or absorbed when an electron gains or loses energy. This energy can be in the form of heat, kinetic energy, or other form. For example, an incandescent light bulb uses heat. The increase of energy can push an electron up one level in a shell called a \"valence\". This makes it unstable, and like everything, it wants to be in the lowest energy state. (If being in the lowest energy state is confusing, pick up a pencil and drop it. Once on the ground, the pencil will be in a lower energy state). When the electron drops back down to a lower energy state, it needs to release the energy that hit it, and it must obey the conservation of energy (energy can neither be created nor destroyed). Electrons release this energy as photons, and at higher intensities, this photon can be seen as visible light. Photons and the electromagnetic force In particle physics, photons are responsible for electromagnetic force. Electromagnetism is an idea that combines electricity with magnetism. One common way that we experience electromagnetism in our daily lives is light, which is caused by electromagnetism. 
Electromagnetism is also responsible for charge, which is the reason that you can not push your hand through a table. Since photons are the force-carrying particle of electromagnetism, they are also gauge bosons. Some mattercalled dark matteris not believed to be affected by
]
}
],
"source": [
"print(vector_list[0]['metadata']['content'])"
]
},
{
"cell_type": "code",
"execution_count": 132,
"id": "b218a207",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c92d6f1199f24eaa8bd6a78fd40009f9",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
" 0%| | 0/27 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Create a Redis pipeline to load all the vectors and their metadata\n",
"def load_vectors(client:r, input_list, vector_field_name):\n",
" p = client.pipeline(transaction=False)\n",
" for text in input_list: \n",
" #hash key\n",
" key=f\"{PREFIX}:{text['id']}\"\n",
" \n",
" #hash values\n",
" item_metadata = text['metadata']\n",
" #\n",
" item_keywords_vector = np.array(text['vector'],dtype= 'float32').tobytes()\n",
" item_metadata[vector_field_name]=item_keywords_vector\n",
" \n",
" # HSET\n",
" p.hset(key,mapping=item_metadata)\n",
" \n",
" p.execute()\n",
"\n",
"batch_size = 100 # how many vectors we insert at once\n",
"\n",
"for i in tqdm(range(0, len(vector_list), batch_size)):\n",
" # find end of batch\n",
" i_end = min(len(vector_list), i+batch_size)\n",
" meta_batch = vector_list[i:i_end]\n",
" \n",
" load_vectors(redis_client,meta_batch,vector_field_name=VECTOR_FIELD_NAME)"
]
},
{
"cell_type": "code",
"execution_count": 133,
"id": "d3466f7d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'2652'"
]
},
"execution_count": 133,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"redis_client.ft(INDEX_NAME).info()['num_docs']"
]
},
{
"cell_type": "markdown",
"id": "43839177",
"metadata": {},
"source": [
"### Search\n",
"\n",
"We can now use our knowledge base to bring back search results. This is one of the areas of highest friction in enterprise knowledge retrieval use cases, with the most common being that the system is not retrieving what you intuitively think are the most relevant documents. There are a few ways of tackling this - I'll share a few options here, as well as some resources to take your research further:\n",
"\n",
"#### Vector search, keyword search or a hybrid\n",
"\n",
"Despite the strong capabilities out of the box that vector search gives, search is still not a solved problem, and there are well proven [Lucene-based](https://en.wikipedia.org/wiki/Apache_Lucene) search solutions such Elasticsearch and Solr that use methods that work well for certain use cases, as well as the sparse vector methods of traditional NLP such as [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf). If your retrieval is poor, the answer may be one of these in particular, or a combination:\n",
"- **Vector search:** Converts your text into vector embeddings which can be searched using KNN, SVM or some other model to return the most relevant results. This is the approach we take in this workbook, using a RediSearch vector DB which employs a KNN search under the hood.\n",
"- **Keyword search:** This method uses any keyword-based search approach to return a score - it could use Elasticsearch/Solr out-of-the-box, or a TF-IDF approach like BM25.\n",
"- **Hybrid search:** This last approach is a mix of the two, where you produce both a vector search and keyword search result, before using an ```alpha``` between 0 and 1 to weight the outputs. There is a great example of this explained by the Weaviate team [here](https://weaviate.io/blog/hybrid-search-explained).\n",
"\n",
"#### Hypothetical Document Embeddings (HyDE)\n",
"\n",
"This is a novel approach from [this paper](https://arxiv.org/abs/2212.10496), which states that a hypothetical answer to a question is more semantically similar to the real answer than the question is. In practice this means that your search would use GPT to generate a hypothetical answer, then embed that and use it for search. I've seen success with this both as a pure search, and as a retry step if the initial retrieval fails to retrieve relevant content. A simple example implementation is here:\n",
"```\n",
"def answer_question_hyde(question,prompt):\n",
" \n",
" hyde_prompt = '''You are OracleGPT, an helpful expert who answers user questions to the best of their ability.\n",
" Provide a confident answer to their question. If you don't know the answer, make the best guess you can based on the context of the question.\n",
"\n",
" User question: USER_QUESTION_HERE\n",
" \n",
" Answer:'''\n",
" \n",
" hypothetical_answer = openai.Completion.create(model=COMPLETIONS_MODEL,prompt=hyde_prompt.replace('USER_QUESTION_HERE',question))['choices'][0]['text']\n",
" \n",
" search_results = get_redis_results(redis_client,hypothetical_answer)\n",
" \n",
" return search_results\n",
"```\n",
"\n",
"#### Fine-tuning embeddings\n",
"\n",
"This next approach leverages the learning you gain from real question/answer pairs that your users will generate during the evaluation approach. It works by:\n",
"- Creating a dataset of positive (and optionally negative) question and answer pairs. Positive examples would be a correct retrieval to a question, while negative would be poor retrievals.\n",
"- Calculating the embeddings for both questions and answers and the cosine similarity between them.\n",
"- Train a model to optimize the embeddings matrix and test retrieval, picking the best one.\n",
"- Perform a matrix multiplication of the base Ada embeddings by this new best matrix, creating a new fine-tuned embedding to do for retrieval.\n",
"\n",
"There is a great walkthrough of both the approach and the code to perform it in [this cookbook](./Customizing_embeddings.ipynb).\n",
"\n",
"#### Reranking\n",
"\n",
"One other well-proven method from traditional search solutions that can be applied to any of the above approaches is reranking, where we over-fetch our search results, and then deterministically rerank based on a modifier or set of modifiers.\n",
"\n",
"An example is investor reports again - it is highly likely that if we have 3 reports on Apple, we'll want to make our investment decisions based on the latest one. In this instance a ```recency``` modifier could be applied to the vector scores to sort them, giving us the latest one on the top even if it is not the most semantically similar to our search question. "
]
},
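{
"cell_type": "markdown",
"id": "5c1e7d3a",
"metadata": {},
"source": [
"The recency modifier described above could be sketched as follows. This is only an illustration under stated assumptions: the function name, the ```similarity``` and ```published``` keys, and the half-life weighting are all hypothetical and not part of this notebook, and ```similarity``` is assumed to be a score where higher is better.\n",
"```\n",
"from datetime import date\n",
"\n",
"def rerank_by_recency(results, today, half_life_days=365.0, weight=0.3):\n",
"    # Each result is assumed to be a dict with 'similarity' and a 'published' date\n",
"    def score(result):\n",
"        age_days = (today - result['published']).days\n",
"        # Recency decays from 1 towards 0, halving every half_life_days\n",
"        recency = 0.5 ** (age_days / half_life_days)\n",
"        return (1 - weight) * result['similarity'] + weight * recency\n",
"    return sorted(results, key=score, reverse=True)\n",
"\n",
"# e.g. rerank_by_recency(over_fetched_results, today=date.today())\n",
"```"
]
},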
{
"cell_type": "markdown",
"id": "9b2fdc7a",
"metadata": {},
"source": [
"For this walkthrough we'll stick with a basic semantic search bringing back the top 5 chunks for a user question, and providing a summarised response using GPT."
]
},
{
"cell_type": "code",
"execution_count": 113,
"id": "89da0c45",
"metadata": {},
"outputs": [],
"source": [
"# Make query to Redis\n",
"def query_redis(redis_conn,query,index_name, top_k=5):\n",
" \n",
" \n",
"\n",
" ## Creates embedding vector from user query\n",
" embedded_query = np.array(openai.Embedding.create(\n",
" input=query,\n",
" model=EMBEDDINGS_MODEL,\n",
" )[\"data\"][0]['embedding'], dtype=np.float32).tobytes()\n",
"\n",
" #prepare the query\n",
" q = Query(f'*=>[KNN {top_k} @{VECTOR_FIELD_NAME} $vec_param AS vector_score]').sort_by('vector_score').paging(0,top_k).return_fields('vector_score','url','title','content','text_chunk_index').dialect(2) \n",
" params_dict = {\"vec_param\": embedded_query}\n",
"\n",
" \n",
" #Execute the query\n",
" results = redis_conn.ft(index_name).search(q, query_params = params_dict)\n",
" \n",
" return results\n",
"\n",
"# Get mapped documents from Redis results\n",
"def get_redis_results(redis_conn,query,index_name):\n",
" \n",
" # Get most relevant documents from Redis\n",
" query_result = query_redis(redis_conn,query,index_name)\n",
" \n",
" # Extract info into a list\n",
" query_result_list = []\n",
" for i, result in enumerate(query_result.docs):\n",
" result_order = i\n",
" url = result.url\n",
" title = result.title\n",
" text = result.content\n",
" score = result.vector_score\n",
" query_result_list.append((result_order,url,title,text,score))\n",
" \n",
" # Display result as a DataFrame for ease of us\n",
" result_df = pd.DataFrame(query_result_list)\n",
" result_df.columns = ['id','url','title','result','certainty']\n",
" return result_df"
]
},
{
"cell_type": "code",
"execution_count": 114,
"id": "f0161a54",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 7.24 ms, sys: 3.83 ms, total: 11.1 ms\n",
"Wall time: 504 ms\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>id</th>\n",
" <th>url</th>\n",
" <th>title</th>\n",
" <th>result</th>\n",
" <th>certainty</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>0</td>\n",
" <td>https://simple.wikipedia.org/wiki/Thomas%20Dolby</td>\n",
" <td>Thomas Dolby</td>\n",
" <td>Title: Thomas Dolby;\\nThomas Dolby (born Thomas Morgan Robertson 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\". He married actress Kathleen Beller in 1988. The couple have three children together. Discography Singles A Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\". Albums Studio albums EPs References English musicians Living people 1958 births New wave musicians Warner Bros. Records artists</td>\n",
" <td>0.127933084965</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>https://simple.wikipedia.org/wiki/Bobby%20Darin</td>\n",
" <td>Bobby Darin</td>\n",
" <td>Title: Bobby Darin;\\nWalden Robert Cassotto (May 14, 1936 December 20, 1973), better known as Bobby Darin, was an American pop singer, most famous during the 1950s. His hits included \"Mack the Knife\", \"Dream Lover\", \"If I Were a Carpenter\", \"Splish Splash\", and \"Beyond the Sea\". He also helped Wayne Newton begin his musical career. Career Allen Klein, an accountant who became an artist manager, first came to public attention when he audited Darin's royalty payments, and discovered Darin had been underpaid. His record company paid up, and Darin split the money with Klein. Darin was married to actress Sandra Dee from 1960 to 1967. They had a son, named Dodd. Darin died late in 1973 after heart surgery. In 2004, a movie, Beyond the Sea, was made about Darin's life and career. Actor Kevin Spacey, a longtime Darin fan, produced and starred in the movie, with Kate Bosworth as Sandra Dee. Other websites Hear Bobby Darin on the Pop Chronicles Singers from New York City Deaths from surgical complications 1936 births 1973 deaths People from the Bronx</td>\n",
" <td>0.25524866581</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" id url title \\\n",
"0 0 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby \n",
"1 1 https://simple.wikipedia.org/wiki/Bobby%20Darin Bobby Darin \n",
"\n",
" result \\\n",
"0 Title: Thomas Dolby;\\nThomas Dolby (born Thomas Morgan Robertson 14 October 1958) is a British musican and computer designer. He is probably most famous for his 1982 hit, \"She Blinded me with Science\". He married actress Kathleen Beller in 1988. The couple have three children together. Discography Singles A Track did not chart in North America until 1983, after the success of \"She Blinded Me With Science\". Albums Studio albums EPs References English musicians Living people 1958 births New wave musicians Warner Bros. Records artists \n",
"1 Title: Bobby Darin;\\nWalden Robert Cassotto (May 14, 1936 December 20, 1973), better known as Bobby Darin, was an American pop singer, most famous during the 1950s. His hits included \"Mack the Knife\", \"Dream Lover\", \"If I Were a Carpenter\", \"Splish Splash\", and \"Beyond the Sea\". He also helped Wayne Newton begin his musical career. Career Allen Klein, an accountant who became an artist manager, first came to public attention when he audited Darin's royalty payments, and discovered Darin had been underpaid. His record company paid up, and Darin split the money with Klein. Darin was married to actress Sandra Dee from 1960 to 1967. They had a son, named Dodd. Darin died late in 1973 after heart surgery. In 2004, a movie, Beyond the Sea, was made about Darin's life and career. Actor Kevin Spacey, a longtime Darin fan, produced and starred in the movie, with Kate Bosworth as Sandra Dee. Other websites Hear Bobby Darin on the Pop Chronicles Singers from New York City Deaths from surgical complications 1936 births 1973 deaths People from the Bronx \n",
"\n",
" certainty \n",
"0 0.127933084965 \n",
"1 0.25524866581 "
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"\n",
"wiki_query='What is Thomas Dolby known for?'\n",
"\n",
"result_df = get_redis_results(redis_client,wiki_query,index_name=INDEX_NAME)\n",
"result_df.head(2)"
]
},
{
"cell_type": "code",
"execution_count": 115,
"id": "48d136b0",
"metadata": {},
"outputs": [],
"source": [
"# Build a prompt to provide the original query, the result and ask to summarise for the user\n",
"retrieval_prompt = '''Use the content to answer the search query the customer has sent. Provide the source for your answer.\n",
"If you can't answer the user's question, say \"Sorry, I am unable to answer the question with the content\". Do not guess.\n",
"\n",
"Search query: \n",
"\n",
"SEARCH_QUERY_HERE\n",
"\n",
"Content: \n",
"\n",
"SEARCH_CONTENT_HERE\n",
"\n",
"Answer:\n",
"'''\n",
"\n",
"def answer_user_question(query):\n",
" \n",
" results = get_redis_results(redis_client,query,INDEX_NAME)\n",
" \n",
" retrieval_prepped = retrieval_prompt.replace('SEARCH_QUERY_HERE',query).replace('SEARCH_CONTENT_HERE',results['result'][0])\n",
" retrieval = openai.ChatCompletion.create(model=CHAT_MODEL,messages=[{'role':\"user\",'content': retrieval_prepped}],max_tokens=500)\n",
" \n",
" # Response provided by GPT-3.5\n",
" return retrieval['choices'][0]['message']['content']"
]
},
{
"cell_type": "code",
"execution_count": 116,
"id": "06f6e6ed",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Thomas Dolby is known for his 1982 hit \"She Blinded me with Science\" as well as being a British musician and computer designer. He has released multiple studio albums and singles. (Source: Content provided)\n"
]
}
],
"source": [
"print(answer_user_question(wiki_query))"
]
},
{
"cell_type": "markdown",
"id": "70e9b239",
"metadata": {},
"source": [
"### Answer\n",
"\n",
"We've now created a knowledge base that can answer user questions on Wikipedia. However, the user experience could be better, and this is where the Answer layer comes in, where an LLM Agent is used to interact with the user.\n",
"\n",
"There are different level of complexity in building a knowledge retrieval experience leveraging an LLM; there is an experience vs. effort trade-off to consider when selecting the right type of interaction. There are many patterns, but I'll highlight a few of the most common here:\n",
"\n",
"#### Choosing the user experience and architecture\n",
"\n",
"There are different level of complexity in building a knowledge retrieval experience leveraging an LLM; there is an experience vs. effort trade-off to consider when selecting the right type of interaction. There are many patterns, but I'll highlight a few of the most common here:\n",
"- **Q&A:** Your classic search engine use case, where the user inputs a question and your LLM gives them an answer either using its knowledge or, much more commonly, using a knowledge base that you prepare using the steps we've covered already. This simple use case assumes no memory of past queries is required, and no ability to clarify with the human or ask for more information.\n",
"- **Chat:** I think of Chat as being Q&A + memory - this is a slightly more sophisticated interaction where the LLM remembers what was previously asked and can delve deeper on something already covered.\n",
"- **Agent:** The most sophisticated is what LangChain calls an Agent, they leverage large language models to process and produce human-like results through a variety of tools, and will chain queries together dynamically until it has an answer that the LLM feels is appropriate to answer the user's question. However, for every \"turn\" you allow between Agent and user you increase the risks of loss of context, hallucination, or parsing errors, so be clear about the exact requirements your users have before embarking on building the Answer layer.\n",
"\n",
"Q&A use cases are the simplest to implement, while Agents can give the most sophisticated user experience - in this notebook we'll build an Agent with memory and a single Tool to give an appreciation for the flexibilty prompt chaining gives you in getting a more complete answer for your users.\n",
"\n",
"#### Ensuring reliability\n",
"\n",
"The more complexity you add, the more chance your LLM will fail to respond correctly, or a response will come back in the wrong format and break your Answer pipeline. We'll share a few methods our customers have used elsewhere to help \"channel\" the Agent down a more deterministic path, and to deal with issues when they do crop up:\n",
"- **Prompt chaining:** Prompting the model to take a step-by-step approach and think aloud using a scratchpad has been proven to deliver more consistent results. It also means that as a developer you can break up one complex prompt into many simpler, more deterministic prompts, with the output of one prompt becoming the input for the next. This approach is known as Chain-of-Thought (CoT) reasoning - I'd suggest digging deeper as this is a dynamic new area of research, with a few of the key papers referenced here:\n",
" - Chain of thought prompting [paper](https://arxiv.org/abs/2201.11903)\n",
" - Self-reflecting agent [paper](https://arxiv.org/abs/2303.11366)\n",
"- **Self-referencing:** You can return references for the LLM's answer through either your application logic, or by prompt engineering it to return references. I would generally suggest doing it in your application logic, although if you have multiple chunks then a hybrid approach where you ask the LLM to return the key of the chunk it used could be advisable. I view this as a UX opportunity, where for many search use cases giving the \"raw\" output of the chunks retrieved as well as the summarised answer can give the user the best of both worlds, but please go with whatever is most appropriate for your users.\n",
"- **Discriminator models:** The best control for unwanted outputs is undoubtably through preventing it from happening with prompt engineering, prompt chaining and retrieval. However, when all these fail then a discriminator model is a useful detective control. This is a classifier trained on past unwanted outputs, that flags the Agent's response to the user as Safe or Not, enabling you to perform some business logic to either retry, pass to a human, or say it doesn't know. \n",
" - There is an example in our [Help Center](https://help.openai.com/en/articles/5528730-fine-tuning-a-classifier-to-improve-truthfulness).\n",
"\n",
"This is a dynamic topic that has still not consolidated to a clear design that works best above all others, so for ease of implementation we will use LangChain, which supplies a framework with implementations for most of the concepts we've discussed above.\n",
"\n",
"We'll create an Agent with access to our knowledge base, give it a prompt template and a custom parser for extracting the answers, set up a prompt chain and then let it answer our Wikipedia questions.\n",
"\n",
"Our work here draws heavily on LangChain's great documentation, in particular [this guide](https://python.langchain.com/en/latest/modules/agents/agents/custom_llm_chat_agent.html)."
]
},
{
"cell_type": "code",
"execution_count": 117,
"id": "0ccca3da",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import Tool, AgentExecutor, LLMSingleActionAgent, AgentOutputParser\n",
"from langchain.prompts import BaseChatPromptTemplate\n",
"from langchain import SerpAPIWrapper, LLMChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"from typing import List, Union\n",
"from langchain.schema import AgentAction, AgentFinish, HumanMessage\n",
"from langchain.memory import ConversationBufferWindowMemory\n",
"import re"
]
},
{
"cell_type": "code",
"execution_count": 118,
"id": "68a0b8dd",
"metadata": {},
"outputs": [],
"source": [
"# Define which tools the agent can use to answer user queries\n",
"tools = [\n",
" Tool(\n",
" name = \"Search\",\n",
" func=answer_user_question,\n",
" description=\"Useful for when you need to answer general knowledge questions. Input should be a fully formed question.\"\n",
" )\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 119,
"id": "39e101ee",
"metadata": {},
"outputs": [],
"source": [
"# Set up the base template\n",
"template = \"\"\"You are WikiGPT, a helpful bot who has access to a database of Wikipedia data to answer questions.\n",
"You have access to the following tools::\n",
"\n",
"{tools}\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [{tool_names}]\n",
"Action Input: the input to the action\n",
"Observation: the result of the action\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer\n",
"Final Answer: the final answer to the original input question\n",
"\n",
"Begin! Remember to give detailed, informative answers\n",
"\n",
"Previous conversation history:\n",
"{history}\n",
"\n",
"New question: {input}\n",
"{agent_scratchpad}\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 120,
"id": "a2b9f271",
"metadata": {},
"outputs": [],
"source": [
"# Set up a prompt template\n",
"class CustomPromptTemplate(BaseChatPromptTemplate):\n",
" # The template to use\n",
" template: str\n",
" # The list of tools available\n",
" tools: List[Tool]\n",
" \n",
" def format_messages(self, **kwargs) -> str:\n",
" # Get the intermediate steps (AgentAction, Observation tuples)\n",
" # Format them in a particular way\n",
" intermediate_steps = kwargs.pop(\"intermediate_steps\")\n",
" thoughts = \"\"\n",
" for action, observation in intermediate_steps:\n",
" thoughts += action.log\n",
" thoughts += f\"\\nObservation: {observation}\\nThought: \"\n",
" # Set the agent_scratchpad variable to that value\n",
" kwargs[\"agent_scratchpad\"] = thoughts\n",
" # Create a tools variable from the list of tools provided\n",
" kwargs[\"tools\"] = \"\\n\".join([f\"{tool.name}: {tool.description}\" for tool in self.tools])\n",
" # Create a list of tool names for the tools provided\n",
" kwargs[\"tool_names\"] = \", \".join([tool.name for tool in self.tools])\n",
" formatted = self.template.format(**kwargs)\n",
" return [HumanMessage(content=formatted)]\n",
" \n",
" \n",
"class CustomOutputParser(AgentOutputParser):\n",
" \n",
" def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
" # Check if agent should finish\n",
" if \"Final Answer:\" in llm_output:\n",
" return AgentFinish(\n",
" # Return values is generally always a dictionary with a single `output` key\n",
" # It is not recommended to try anything else at the moment :)\n",
" return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action\\s*\\d*\\s*:(.*?)\\nAction\\s*\\d*\\s*Input\\s*\\d*\\s*:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
" action = match.group(1).strip()\n",
" action_input = match.group(2)\n",
" # Return the action and action input\n",
" return AgentAction(tool=action, tool_input=action_input.strip(\" \").strip('\"'), log=llm_output)"
]
},
{
"cell_type": "code",
"execution_count": 121,
"id": "454c3ca9",
"metadata": {},
"outputs": [],
"source": [
"prompt = CustomPromptTemplate(\n",
" template=template,\n",
" tools=tools,\n",
" # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically\n",
" # The history template includes \"history\" as an input variable so we can interpolate it into the prompt\n",
" input_variables=[\"input\", \"intermediate_steps\", \"history\"]\n",
")\n",
"\n",
"# Initiate the memory with k=2 to keep the last two turns\n",
"# Provide the memory to the agent\n",
"memory = ConversationBufferWindowMemory(k=2)"
]
},
{
"cell_type": "code",
"execution_count": 122,
"id": "34de07d2",
"metadata": {},
"outputs": [],
"source": [
"output_parser = CustomOutputParser()\n",
"\n",
"llm = ChatOpenAI(temperature=0,openai_organization='org-l89177bnhkme4a44292n5r3j')\n",
"\n",
"# LLM chain consisting of the LLM and a prompt\n",
"llm_chain = LLMChain(llm=llm, prompt=prompt)\n",
"\n",
"tool_names = [tool.name for tool in tools]\n",
"agent = LLMSingleActionAgent(\n",
" llm_chain=llm_chain, \n",
" output_parser=output_parser,\n",
" stop=[\"\\nObservation:\"], \n",
" allowed_tools=tool_names\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 123,
"id": "d3603d58",
"metadata": {},
"outputs": [],
"source": [
"agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=True, memory=memory)"
]
},
{
"cell_type": "code",
"execution_count": 124,
"id": "6bfb594b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I should use the Search tool to find information about Thomas Dolby\n",
"Action: Search\n",
"Action Input: \"What is Thomas Dolby known for?\"\u001b[0m\n",
"\n",
"Observation:\u001b[36;1m\u001b[1;3mThomas Dolby is known for his 1982 hit \"She Blinded Me With Science\" and his career as a British musician and computer designer. He has also released several studio albums and EPs. Source: Thomas Dolby Wikipedia page.\u001b[0m\u001b[32;1m\u001b[1;3mI now know what Thomas Dolby is known for.\n",
"Final Answer: Thomas Dolby is known for his hit song \"She Blinded Me With Science\" and his career as a British musician and computer designer.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'Thomas Dolby is known for his hit song \"She Blinded Me With Science\" and his career as a British musician and computer designer.'"
]
},
"execution_count": 124,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.run(f1_query)"
]
},
{
"cell_type": "code",
"execution_count": 125,
"id": "ba65b7e3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I got the answer from my database of Wikipedia data.\n",
"Action: Search\n",
"Action Input: \"Thomas Dolby Wikipedia\"\u001b[0m\n",
"\n",
"Observation:\u001b[36;1m\u001b[1;3mThomas Dolby is a British musician and computer designer who is best known for his 1982 hit \"She Blinded me with Science\". He was born as Thomas Morgan Robertson on October 14, 1958. He has released studio albums, EPs, and singles. He married actress Kathleen Beller in 1988 and they have three children together. (Source: Wikipedia)\u001b[0m\u001b[32;1m\u001b[1;3mI have found the source of my previous answer.\n",
"Final Answer: The source of my previous answer was the Wikipedia page for Thomas Dolby.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"'The source of my previous answer was the Wikipedia page for Thomas Dolby.'"
]
},
"execution_count": 125,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.run('What source did you get that answer from?')"
]
},
{
"cell_type": "markdown",
"id": "f19f7a7e",
"metadata": {},
"source": [
"### Evaluation\n",
"\n",
"Last comes the not-so-fun bit that will make the difference between nifty prototype and production application - the process of evaluating and tuning your results. \n",
"\n",
"The key takeaway here is to make a framework that saves the results of each evaluation, as well as the parameters. Evaluation can be a difficult task that takes significant resources, so it is best to start prepared to handle multiple iterations. Some useful principles we've seen successful deployments use are:\n",
"- **Assign clear product ownership and metrics:** Ensure you have a team aligned from the start to annotate the outputs and determine whether they're bad or good. This may seem an obvious step, but too often the focus is on the engineering challenge of successfully retrieving content rather than the product challenge of providing retrieval results that are useful.\n",
"- **Log everything:** Store all requests and responses to and from your LLM and retrieval service if you can, it builds a great base for fine-tuning both the embeddings and any fine-tuned models or few-shot LLMs in future.\n",
"- **Use GPT-4 as a labeller:** When running evaluations, it can help to use GPT-4 as a gatekeeper for human annotation. Human annotation is costly and time-consuming, so doing an initial evaluation run with GPT-4 can help set a quality bar that needs to be met to justify human labeling. At this stage I would not suggest using GPT-4 as your only labeler, but it can certainly ease the burden.\n",
" - This approach is outlined further in [this paper](https://arxiv.org/abs/2108.13487).\n",
"\n",
"We'll use these principles to make a quick evaluation framework where we will:\n",
"- Use GPT-4 to make a list of hypothetical questions on our topic\n",
"- Ask our Agent the questions and save question/answer tuples\n",
" - These two above steps simulate the actual users interacting with your application\n",
"- Get GPT-4 to evaluate whether the answers correctly respond to the questions\n",
"- Look at our results to measure how well the Agent answered the questions\n",
"- Plan remedial action"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "b8314f7b",
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"# Build a prompt to provide the original query, the result and ask to summarise for the user\n",
"evaluation_question_prompt = '''You are a helpful Wikipedia assistant who generates creative general knowledge questions.\n",
"\n",
"Examples:\n",
"- Explain how photons work\n",
"- What is Thomas Dolby known for?\n",
"- What are some key events of the 20th century?\n",
"\n",
"Begin!\n",
"\n",
"Question:'''\n",
"\n",
"evaluation_questions = []\n",
"\n",
"for i in range(0,10):\n",
" try:\n",
" question = openai.ChatCompletion.create(model='gpt-4-0314',messages=[{\"role\":\"user\",\"content\":evaluation_question_prompt}],temperature=0.9)\n",
" evaluation_questions.append(question['choices'][0]['message']['content'])\n",
" except Exception as e:\n",
" print(e)\n",
" print('Retrying')\n",
" try:\n",
" time.sleep(10)\n",
" question = openai.ChatCompletion.create(model='gpt-4-0314',messages=[{\"role\":\"user\",\"content\":evaluation_question_prompt}],temperature=0.9)\n",
" evaluation_questions.append(question['choices'][0]['message']['content'])\n",
" except Exception as e:\n",
" print(e)\n"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "9807f4f6",
"metadata": {},
"outputs": [],
"source": [
"# Clean up our lists of questions and append to one giant list\n",
"all_questions = []\n",
"for question in evaluation_questions:\n",
" question_list = question.replace('\\n\\n','\\n').split('\\n')\n",
" [all_questions.append(x) for x in question_list]"
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "ca64afd0",
"metadata": {},
"outputs": [],
"source": [
"question_answer_pairs = []"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "4446041c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"- Can you describe the evolution of transportation throughout human history?\n",
"Could not parse LLM output: `Based on my searches, the history of transportation technology is a complex and multifaceted topic that requires a more specific question to provide a comprehensive answer. However, it can be inferred that transportation technology has evolved from simple tools and methods of travel to more advanced and efficient systems such as high-speed trains like the TGV and Maglev trains.`\n",
"- What factors contributed to the fall of the Roman Empire?\n",
"Could not parse LLM output: `There are multiple factors that contributed to the fall of the Roman Empire, including barbarian invasions, economic troubles, and political instability. It's important to note that the fall of the Roman Empire was a complex process that took place over several centuries, and there is no single cause that can be attributed to its decline.`\n",
"- How has currency evolved throughout history and what is the future of money?\n",
"Could not parse LLM output: `Based on my searches, I was unable to find a comprehensive answer to the original question. It may be helpful to consult additional sources or conduct more specific searches on related topics.`\n",
"- How do convergent and divergent boundaries affect plate tectonics?\n",
"Could not parse LLM output: `So convergent boundaries result in subduction and the formation of mountains and volcanoes, while divergent boundaries create new land and earthquakes. \n",
"Action: None`\n",
"- What factors led to the rise of the European Renaissance?\n",
"Could not parse LLM output: `Based on the information I found, it seems that the rediscovery of classical Greek and Roman art and culture, as well as advancements in various fields, were key factors in the rise of the European Renaissance.\n",
"Action: None`\n",
"- Describe the impact of the Green Revolution on global food production and the environment.\n",
"Could not parse LLM output: `Based on the information I found, the Green Revolution had both positive and negative impacts on global food production and the environment. I should summarize these impacts for the final answer.\n",
"Action: None`\n",
"- How did the phenomenon of globalization impact the world economy?\n",
"Could not parse LLM output: `India is an example of a country that has benefited from globalization in terms of their economy, according to the content.`\n",
"- What roles do neurotransmitters play in the human brain?\n",
"Could not parse LLM output: `Neurotransmitters are important for many brain functions, including memory and emotions. Alzheimer's disease affects the brain's ability to handle signals for memory and movement.\n",
"Action: None`\n",
"- What role did Marie Curie play in the discovery of radioactivity?\n",
"Could not parse LLM output: `Marie Curie's contributions to the discovery of radioactivity were significant and groundbreaking.\n",
"Action: None`\n"
]
}
],
"source": [
"for question in all_questions:\n",
" memory = ConversationBufferWindowMemory(k=2)\n",
" time.sleep(5)\n",
" agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, verbose=False,memory=memory)\n",
" try:\n",
" \n",
" answer = agent_executor.run(question)\n",
" except Exception as e:\n",
" print(question)\n",
" print(e)\n",
" answer = 'Unable to answer question'\n",
" question_answer_pairs.append((question,answer))"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "94194df9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[('- How do tsunamis form and what are their impacts?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What are the primary functions of the United Nations?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Can you describe the evolution of transportation throughout human history?',\n",
" 'Unable to answer question'),\n",
" ('- What role did the printing press play in the spread of knowledge and ideas?',\n",
" 'The printing press played a significant role in the spread of knowledge and ideas by making it easier and faster to produce books and other printed materials. This led to an increase in literacy and the dissemination of new ideas and information, ultimately contributing to the development of modern society.'),\n",
" ('- What are the origins and cultural significance of the Olympic Games?',\n",
" 'The Olympic Games were first held in Ancient Greece and were revived in 1896 with the first international Olympic Games in Athens, Greece. The cultural significance of the games lies in their ability to bring together people from all over the world in a spirit of international unity, cooperation, and sporting excellence. The games are a symbol of the human spirit and the pursuit of excellence, and they continue to be an important cultural event in modern times.'),\n",
" ('- How did the invention of the internet change the way we communicate and access information?',\n",
" 'The invention of the internet has had a significant impact on communication and information access, including the rise of email as a primary communication tool and the increased access to information through search engines and online databases. Social media has also significantly changed the way we communicate and access information, making it easier for people to share and disseminate their views and opinions to a broader audience and facilitating direct communication between individuals and groups who may not have previously had access to one another. Additionally, social media has become a valuable tool for political expression and organization.'),\n",
" ('- What are the major artistic movements and their main characteristics?',\n",
" 'The major artistic movements and their main characteristics include Realism, Impressionism, Post-Impressionism, Cubism, Surrealism, and Abstract Expressionism. Realism focuses on depicting everyday life and society as it truly exists, while Impressionism captures the fleeting effects of light and atmosphere. Post-Impressionism emphasizes individual expression and subjective experience, while Cubism uses geometric shapes and fragmentation of the subject. Surrealism explores the subconscious mind through dream imagery and symbolism, and Abstract Expressionism emphasizes the physical act of creating art with large canvases and gestural brushstrokes.'),\n",
" (\"- How do the Earth's tectonic plates affect its geological features and processes?\",\n",
" \"Tectonic plates affect Earth's geological features and processes by creating movements that can lead to earthquakes, volcanic activity, mountain-building, and oceanic trench formation along plate boundaries. The movement of plates varies and can create divergent boundaries, convergent boundaries, and transform fault boundaries. There are two types of tectonic plates: oceanic and continental, and their thickness and composition differ.\"),\n",
" ('- What factors contributed to the fall of the Roman Empire?',\n",
" 'Unable to answer question'),\n",
" ('- How has currency evolved throughout history and what is the future of money?',\n",
" 'Unable to answer question'),\n",
" ('- How do convergent and divergent boundaries affect plate tectonics?',\n",
" 'Unable to answer question'),\n",
" ('- What role do neurotransmitters play in the human nervous system?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Describe the influence of the Harlem Renaissance on American culture and arts.',\n",
" 'The Harlem Renaissance had a significant impact on American culture and arts, particularly in literature and music. It provided a platform for African American writers, poets, and musicians to celebrate their heritage and culture, challenge stereotypes and discrimination, and become prominent figures in American culture.'),\n",
" ('- What factors led to the collapse of the Soviet Union?',\n",
" \"The factors that led to the collapse of the Soviet Union include the spread of communism becoming less popular, the country's inability to promote economic growth as well as Western states, and political reform allowing freedom of speech for everybody. Additionally, Gorbachev's decision to not force the countries of the Eastern bloc to stick to communism played a significant role in the collapse of the Soviet Union.\"),\n",
" (\"- How does photosynthesis contribute to the Earth's oxygen supply?\",\n",
" \"Photosynthesis contributes to the Earth's oxygen supply by producing oxygen as a by-product when plants create glucose through the process of using sunlight, carbon dioxide, and water.\"),\n",
" ('- In what ways did the Industrial Revolution transform society and the economy?',\n",
" 'The Industrial Revolution had a significant impact on society and the economy, leading to new inventions, mass production, and urbanization. It improved the standard of living and created new job opportunities, but also led to social and economic inequalities, unsustainable resource consumption, and environmental problems.'),\n",
" ('- How has the internet impacted globalization and communication?',\n",
" 'The internet has had a significant impact on globalization and communication. While it has allowed for efficient and effective communication between people from all over the world, it has also led to issues such as spamming. However, overall, the internet has helped to break down barriers between countries and cultures, making it easier for people to learn about and engage with one another, which has opened up new opportunities for businesses and individuals.'),\n",
" ('- What are the primary differences between presidential and parliamentary systems of government?',\n",
" 'The primary differences between presidential and parliamentary systems of government are that presidential systems have a clear separation of powers between the executive and legislative branches, while parliamentary systems have a fusion of powers between the executive (led by a Prime Minister) and legislative branches. In a parliamentary system, the executive is accountable to the legislature and can be removed by a vote of no confidence, while in a presidential system the executive is elected separately and cannot be removed by the legislature.'),\n",
" ('- Describe the role of mitochondria in cellular respiration.',\n",
" 'The role of mitochondria in cellular respiration is to convert the energy from the breakdown of glucose into ATP that the cell can use for energy.'),\n",
" ('- How did the discovery of penicillin revolutionize medicine?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What were the major causes and consequences of the French Revolution?',\n",
" 'The major causes of the French Revolution were social inequalities, financial crisis, political unrest, and the rise of the bourgeoisie. The Revolution resulted in the overthrow of the monarchy, the establishment of a republic, and the Reign of Terror. The consequences of the French Revolution included the rise of Napoleon Bonaparte, the spread of revolutionary ideals throughout Europe, and the Restoration period in France, which attempted to restore the pre-revolutionary order but ultimately failed to completely erase the impact of the French Revolution and its ideals.'),\n",
" ('- What are the main components and functions of the human circulatory system?',\n",
" \"The main components of the human circulatory system are the heart, blood vessels, and blood. The system functions to transport blood, oxygen, and nutrients throughout the body's tissues and organs.\"),\n",
" ('- How did the Scientific Revolution challenge traditional beliefs and catalyze progress in various fields?',\n",
" 'The Scientific Revolution challenged traditional beliefs and catalyzed progress in various fields by promoting the use of reason and observation to understand the natural world, leading to advancements in astronomy, physics, mathematics, and biology. It challenged the traditional belief in the geocentric model of the universe and established the heliocentric model. It also led to the development of the scientific method, allowing for more systematic and accurate experimentation.'),\n",
" ('- Describe the unique features and characteristics of the Galapagos Islands and their impact on the study of evolution.',\n",
" \"The Galapagos Islands are known for their unique features and characteristics that have impacted the study of evolution, including the giant tortoises and their variations between islands, as well as their role in Darwin's observations.\"),\n",
" ('- What factors contributed to the rise of the Roman Empire, and how did it ultimately fall?',\n",
" 'The factors contributing to the rise and fall of the Roman Empire include expansion through wars against other nations and assimilation of their culture, split into East and West, invasion and attacks from barbarians, and internal division. The western part collapsed due to internal problems, while the eastern part declined due to invasion and attacks from external forces. The Roman Empire ended in 1453 when Mehmed II conquered Constantinople.'),\n",
" ('- How did the invention of the printing press impact society?',\n",
" 'The invention of the printing press impacted society by allowing for mass production of books, increased literacy rates, and access to knowledge. It also played a role in the scientific revolution, the Enlightenment, and the Reformation.'),\n",
" ('- What factors contributed to the fall of the Roman Empire?',\n",
" 'The fall of the Roman Empire was caused by several factors, including attacks by barbarians, the split of the empire into East and West, and internal political instability. These factors led to the decline and eventual collapse of the western empire, while the eastern empire faced threats from the Sassanid Empire and the rise of Islam. However, the fall of Rome also led to the rise of new civilizations in Europe and Asia during the Middle Ages, including the Islamic Golden Age and the rise of the Ottoman Empire. The Renaissance marked a rebirth of culture and advancements in Europe.'),\n",
" ('- What role does the greenhouse effect play in climate change?',\n",
" \"The greenhouse effect plays a significant role in climate change by trapping heat in the Earth's atmosphere through the emission of greenhouse gases, primarily produced by human activities like agriculture. Implementing sustainable farming practices and reducing the use of fossil fuels can help to mitigate the impact of these emissions.\"),\n",
" ('- Who were some influential artists during the Harlem Renaissance?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" (\"- What are the primary functions of the human brain's amygdala?\",\n",
" \"The primary functions of the human brain's amygdala are to process emotions such as fear and aggression, and to play a role in the formation of memories associated with emotional events.\"),\n",
" ('- How do bees communicate with each other through the waggle dance?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What is the significance of Machu Picchu in Incan history?',\n",
" 'Machu Picchu was a significant site in Incan history as it was believed to be a royal estate or sacred religious site for the Incan emperor Pachacuti. It was also a symbol of Incan engineering and architecture, showcasing their ability to build structures on steep and rugged terrain. The site was abandoned during the Spanish conquest of the Inca Empire in the 16th century and was rediscovered in 1911 by Hiram Bingham, leading to increased interest in Incan history and culture. Today, Machu Picchu is a popular tourist destination and a UNESCO World Heritage site.'),\n",
" ('- What are some major milestones in the development of artificial intelligence?',\n",
" 'The major milestones in the development of artificial intelligence can be found on the \"History of artificial intelligence\" page on Wikipedia. Some of the key milestones include the development of the first electronic computer, the creation of the first AI program, the development of expert systems, and the emergence of machine learning and neural networks.'),\n",
" ('- How do tectonic plate movements cause earthquakes and volcanic eruptions?',\n",
" 'Tectonic plate movements can cause earthquakes and volcanic eruptions through different mechanisms depending on the type of plate boundary. Collisions between plates can cause volcanic activity, while sliding past each other at transform fault boundaries can cause earthquakes.'),\n",
" ('- What role did the Silk Road play in the exchange of goods and ideas between different cultures?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What factors led to the rise of the European Renaissance?',\n",
" 'Unable to answer question'),\n",
" ('- How did the Industrial Revolution transform societies globally?',\n",
" 'The Industrial Revolution impacted the economy by leading to the development of new machinery and mass production methods, which increased productivity and led to economic growth. This period saw the rise of capitalist economies and the expansion of industry and manufacturing.'),\n",
" ('- What are the major components of a typical ecosystem?',\n",
" 'The major components of a typical ecosystem are living and nonliving things, including the community of organisms (plants, animals, and microorganisms) and the nonliving components such as air, water, soil, and minerals.'),\n",
" ('- Can you describe the structure of a Shakespearean sonnet?',\n",
" 'A Shakespearean sonnet has 14 lines and follows the rhyme scheme ABAB CDCD EFEF GG. The structure also follows a specific meter involving arrangements of stressed and unstressed syllables within a line.'),\n",
" ('- Explain the significance of the Rosetta Stone in understanding ancient languages.',\n",
" 'The capital of Australia is Canberra.'),\n",
" ('- Who were the key figures in the development of modern dance, and what were their contributions?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What are some major milestones in the history of space exploration, and how have they impacted our understanding of the universe?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Can you compare and contrast the artistic styles of the Renaissance and Baroque periods?',\n",
" 'The Renaissance and Baroque periods were characterized by different artistic styles. The Renaissance focused on classical Greco-Roman art and a revival of perspective and realism in painting and sculpture, while the Baroque period was marked by dramatic and extravagant art styles and was known for religious themes. Some famous Renaissance artists include Leonardo da Vinci, Michelangelo, and Raphael, while some famous Baroque artists include Caravaggio, Rembrandt, and Bernini.'),\n",
" (\"- What role did Gutenberg's printing press play in the spread of knowledge and ideas during the 15th century?\",\n",
" 'The Gutenberg printing press played a significant role in the spread of knowledge and ideas during the 15th century by allowing for the mass production of books, which made knowledge more widely available to people beyond the wealthy and educated classes. This helped to promote the dissemination of new ideas and innovations, which in turn helped to drive the Renaissance in art, science, philosophy, and literature.'),\n",
" ('- Discuss the origins and cultural significance of jazz music in the United States.',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What were the primary causes and consequences of the Cuban Missile Crisis?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" (\"- How did the women's suffrage movement contribute to women's rights worldwide?\",\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What are some notable examples of ancient architecture and how do they reflect the societies that created them?',\n",
" \"Ancient architecture reflects the technological, religious, and societal characteristics of the time period. Notable examples of ancient architecture include the Parthenon in Athens, Greece, the Colosseum in Rome, Italy, and St. Peter's Basilica in Vatican City. These landmarks are significant for their historical and cultural importance.\"),\n",
" (\"- Explain the process of the water cycle and its importance to Earth's climate.\",\n",
" \"The water cycle is an important process for Earth's climate as it helps regulate the temperature through the process of condensation and evaporation. Condensation is when water changes from a gas to a liquid or crystal shape and is exothermic, causing a temperature increase. Evaporation causes a temperature loss. The water cycle redistributes heat and helps regulate the Earth's climate.\"),\n",
" ('- Who were the main players in the Scientific Revolution, and what were their discoveries and innovations?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Describe the impact of the Green Revolution on global food production and the environment.',\n",
" 'Unable to answer question'),\n",
" ('- What factors contributed to the fall of the Roman Empire?',\n",
" \"The fall of the Roman Empire was caused by a combination of political, economic, and military factors. The empire's vastness made it difficult to defend against invasions and internal rebellions. The split of the empire into East and West, the invasion of barbarians in the western part, and the threat from the Sassanid Empire in the eastern part were also contributing factors. Additionally, the decline of the Byzantine Empire and the rise of Islam played a role in the fall of the Roman Empire.\"),\n",
" ('- How does the process of photosynthesis occur in plants?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Name some of the most innovative inventions of the 21st century',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What are the primary differences between classical and Keynesian economics?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" (\"- How do the Earth's tectonic plates affect geological events and natural disasters?\",\n",
" 'Tectonic plates affect geological events and natural disasters by creating earthquakes, volcanic activity, mountain-building, and oceanic trench formation along plate boundaries. The type of plate and its location also affect the severity of natural disasters. Examples of natural disasters caused by tectonic plates include the Andes mountain range in South America, the Japanese island arc, and the Pacific Ring of Fire.'),\n",
" ('- What role did music play in the civil rights movement of the 1960s?',\n",
" 'Music played a significant role in the civil rights movement of the 1960s. It was used to inspire and motivate protesters and spread awareness about racial inequality. Music genres like gospel, soul, and folk were particularly popular during this time. Famous singers like Ray Charles, Nina Simone, and Sam Cooke released songs that addressed civil rights issues. Music was also an integral part of events like the March on Washington and the Selma to Montgomery marches.'),\n",
" ('- Describe the formation and significance of the European Union',\n",
" 'The European Union was formed through various treaties and agreements, with the Maastricht Treaty in 1992 establishing EU citizenship. The EU is significant in promoting peace and stability, creating a single market, increasing economic cooperation, and providing a platform for member states to work together on common issues and challenges. It also has an impact on international law and governance.'),\n",
" ('- What are some of the most enduring and influential works of modernist literature?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- How do vaccines function and contribute to public health?',\n",
" \"Vaccines function by teaching the body's immune system to recognize and fight against specific harmful viruses or bacteria, ultimately contributing to public health by preventing the spread of infectious diseases. They are important for protecting against harmful diseases and have been proven to be safe and effective in preventing infectious diseases, saving millions of lives worldwide.\"),\n",
" ('- What were the major conflicts and power struggles during the Cold War?',\n",
" 'The major conflicts and power struggles during the Cold War were between the United States, the Soviet Union, and their respective allies, with each country promoting their type of government. This tension led to fears of a potential nuclear war, and communism eventually became less popular as Western states offered better economic growth opportunities and freedoms.'),\n",
" ('What is the significance of the Rosetta Stone in understanding ancient languages?',\n",
" 'The Rosetta Stone was significant in deciphering ancient languages because it contained inscriptions in three scripts - Greek, hieroglyphic and demotic - which were used to decode ancient Egyptian hieroglyphics. This allowed scholars to gain a better understanding of ancient Egyptian culture and history.'),\n",
" ('- Can you describe the process of photosynthesis?',\n",
" 'Photosynthesis is the process by which plants use energy from sunlight to combine carbon dioxide and water to create glucose and oxygen. This process takes place in the chloroplasts of plant cells.'),\n",
" ('- How did the phenomenon of globalization impact the world economy?',\n",
" 'Unable to answer question'),\n",
" ('- Who were the key figures of the American Civil Rights movement?',\n",
" 'Martin Luther King Jr., Muhammad Ali, Bob Dylan, Joan Baez, and Jimi Hendrix were key figures of the Civil Rights movement in the 1960s.'),\n",
" ('- What are the differences between comets, asteroids, and meteoroids?',\n",
" \"Comets are mostly ice and have long tails that point away from the sun, while asteroids are rocky or metallic and have orbits closer to the ecliptic. Meteoroids are small debris that enter Earth's atmosphere and are observed as meteors or shooting stars.\"),\n",
" ('- Which discoveries led to the development of the internet?',\n",
" 'The key discoveries that led to the development of the internet include the creation of the IP address system, the development of TCP/IP protocols for network communication, packet switching, and the development of the World Wide Web by Tim Berners-Lee.'),\n",
" (\"- How do tectonic plates interact and influence Earth's geography?\",\n",
" \"Tectonic plates interact and influence Earth's geography through movement at plate boundaries, creating mountains, earthquakes, volcanoes, mid-oceanic ridges, and oceanic trenches. The movement of plates is driven by heat from the mantle, which creates convection currents in the asthenosphere that transfer heat to the surface. Plate tectonics is a theory of geology that explains the movement of the Earth's lithosphere, which is divided into plates. There are seven major plates and many minor plates. Continental and oceanic plates differ in thickness and composition.\"),\n",
" ('- What factors contributed to the fall of the Roman Empire?',\n",
" 'The fall of the Roman Empire was caused by various factors such as the split of the empire into eastern and western parts, invasion by barbarians, political instability, and economic troubles. Other factors that contributed to the fall of the Roman empire are the rise of Islam, the Black Death, and better farming technology.'),\n",
" ('- Can you list some of the most notable inventions of the Renaissance period?',\n",
" 'Notable inventions and inventors of the Renaissance period include the Gutenberg printing press, Galileo Galilei, Francis Bacon, Thomas More, Dante Alighieri, William Shakespeare, Leonardo da Vinci, Michelangelo, and Raphael.'),\n",
" ('- What are the basic principles of the theory of relativity?',\n",
" 'The basic principles of the theory of relativity are that the laws of physics are the same for all observers, the constancy of the speed of light in a vacuum, and the principle of equivalence, which states that gravitational acceleration and acceleration due to motion are indistinguishable from each other. These principles form the basis for the idea of space-time, which combines space and time into a single four-dimensional entity.'),\n",
" ('- How did the Industrial Revolution change society and the economy?',\n",
" 'The Industrial Revolution changed society and the economy by introducing new manufacturing processes, machinery, and technologies that transformed the way goods were produced and distributed. It led to the growth of cities, the rise of industrial capitalists, and the expansion of markets. It also brought about significant changes in social and working conditions, including the rise of the labor movement and the emergence of new social classes. Overall, the Industrial Revolution created a new way of life and set the stage for modern industrial economies.'),\n",
" ('- What roles do neurotransmitters play in the human brain?',\n",
" 'Unable to answer question'),\n",
" ('- Who were the leaders of the Allied Powers and the Axis Powers during World War II?',\n",
" 'The leaders of the Axis Powers during World War II were Adolf Hitler of Nazi Germany, Benito Mussolini of the Kingdom of Italy, and Emperor Hirohito of the Empire of Japan. The leaders of the Allied Powers during World War II varied by country, but some of the key leaders included President Franklin D. Roosevelt and later President Harry S. Truman of the United States, Prime Minister Winston Churchill of the United Kingdom, Premier Joseph Stalin of the Soviet Union, Chiang Kai-shek of China, Charles de Gaulle of France, and Władysław Sikorski of Poland.'),\n",
" ('- What are the main characteristics of Gothic architecture?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- Can you trace the evolution of the modern Olympic Games?',\n",
" \"The modern Olympic Games were first held in Athens, Greece in 1896 with the largest international participation in any sporting event to that date. The International Olympic Committee was also instituted during this congress. The idea for a multi-national and multi-sport event was proposed by Pierre de Coubertin, who put together a group at the Sorbonne in Paris to present his plans to representatives of sports societies from 11 countries. After the proposal's acceptance by the congress, a date for the first modern Olympic Games was chosen for 1896, with Athens as the host city. The opening ceremony had an estimated 80,000 spectators and the athletics events had the most international field of any of the sports, with the marathon being held for the first time in international competition.\"),\n",
" ('- What were the major accomplishments of the ancient Egyptian civilization?',\n",
" 'The major accomplishments of the ancient Egyptian civilization include developing a society that relied on the Nile River for irrigation and agriculture, creating a system of writing in hieroglyphs, building famous pyramids and other monumental architecture, making significant advancements in military technology, and developing a religion that encouraged respect for their rulers and their past.'),\n",
" ('What is the significance of the double helix structure in DNA?',\n",
" 'The double helix structure in DNA is significant because it allows for accurate replication through semi-conservative replication and DNA polymerases, as well as DNA repair through various mechanisms in cells.'),\n",
" (\"- How do tectonic plates influence the Earth's geography?\",\n",
" \"Tectonic plates influence Earth's geography by creating mountains, earthquakes, volcanoes, mid-ocean ridges, and oceanic trenches depending on the way the plates are moving.\"),\n",
" ('- Can you describe the process of photosynthesis in plants?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What role did Marie Curie play in the discovery of radioactivity?',\n",
" 'Unable to answer question'),\n",
" ('- How did the Space Race impact advancements in technology?',\n",
" 'Agent stopped due to iteration limit or time limit.'),\n",
" ('- What are the primary differences between the Baroque and Classical music periods?',\n",
" 'The Baroque period was characterized by a more ornate and complex style, while the Classical period was known for its simplicity and symmetry. Classical musicians focused on musical analysis and harmony and counterpoint. The greatest composers of the Baroque period include Claudio Monteverdi, Antonio Vivaldi, Johann Sebastian Bach, and George Frideric Handel, while the greatest composers of the Classical period include Joseph Haydn, Wolfgang Amadeus Mozart, and Ludwig van Beethoven.')]"
]
},
"execution_count": 87,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"question_answer_pairs"
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "b8ea3f6a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"83"
]
},
"execution_count": 88,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(question_answer_pairs)"
]
},
{
"cell_type": "code",
"execution_count": 93,
"id": "1cf3c9e1",
"metadata": {},
"outputs": [],
"source": [
"# Build a prompt to provide the original query, the result and ask to summarise for the user\n",
"gpt_evaluator_system = '''You are WikiGPT, a helpful Wikipedia expert.\n",
"You will be presented with general knowledge questions our users have asked.\n",
"\n",
"Think about this step by step:\n",
"- You need to decide whether the answer adequately answers the question\n",
"- If it answers the question, you will say \"Correct\"\n",
"- If it doesn't answer the question, you will say one of the following:\n",
" - If it couldn't answer at all, you will say \"Unable to answer\"\n",
" - If the answer was provided but was incorrect, you will say \"Incorrect\" \n",
"- If none of these rules are met, say \"Unable to evaluate\"\n",
"\n",
"Evaluation can only be \"Correct\", \"Incorrect\", and \"Unable to evaluate\"\n",
"\n",
"Example 1:\n",
"\n",
"Question: What is the cost cap for the 2023 season of Formula 1?\n",
"\n",
"Answer: The cost cap for 2023 is 95m USD.\n",
"\n",
"Evaluation: Correct\n",
"\n",
"Example 2:\n",
"\n",
"Question: What is Thomas Dolby known for?\n",
"\n",
"Answer: Inventing electricity\n",
"\n",
"Evaluation: Incorrect\n",
"\n",
"Begin!'''\n",
"\n",
"gpt_evaluator_message = '''\n",
"Question: QUESTION_HERE\n",
"\n",
"Answer: ANSWER_HERE\n",
"\n",
"Evaluation:'''"
]
},
{
"cell_type": "code",
"execution_count": 94,
"id": "8a351793",
"metadata": {},
"outputs": [],
"source": [
"evaluation_output = []"
]
},
{
"cell_type": "code",
"execution_count": 95,
"id": "3b286e5f",
"metadata": {},
"outputs": [],
"source": [
"for pair in question_answer_pairs:\n",
" \n",
" message = gpt_evaluator_message.replace('QUESTION_HERE',pair[0]).replace('ANSWER_HERE',pair[1])\n",
" evaluation = openai.ChatCompletion.create(model='gpt-3.5-turbo',messages=[{\"role\":\"system\",\"content\":gpt_evaluator_system},{\"role\":\"user\",\"content\":message}],temperature=0)\n",
" #print(evaluation)\n",
" \n",
" evaluation_output.append((pair[0],pair[1],evaluation['choices'][0]['message']['content']))"
]
},
{
"cell_type": "code",
"execution_count": 97,
"id": "f18678e0",
"metadata": {},
"outputs": [],
"source": [
"def collate_results(x):\n",
" text = x.lower()\n",
" \n",
" if 'incorrect' in text:\n",
" return 'incorrect'\n",
" elif 'correct' in text:\n",
" return 'correct'\n",
" else: \n",
" return 'unable to answer'"
]
},
{
"cell_type": "code",
"execution_count": 98,
"id": "cd29fc90",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"correct 50\n",
"unable to answer 28\n",
"incorrect 5 \n",
"Name: evaluation, dtype: int64"
]
},
"execution_count": 98,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"eval_df = pd.DataFrame(evaluation_output)\n",
"eval_df.columns = ['question','answer','evaluation']\n",
"# Replacing all the \"unable to evaluates\" with \"unable to answer\"\n",
"eval_df['evaluation'] = eval_df['evaluation'].apply(lambda x: collate_results(x))\n",
"eval_df.evaluation.value_counts()"
]
},
{
"cell_type": "markdown",
"id": "566053d6",
"metadata": {},
"source": [
"#### Analysis\n",
"\n",
"We ended up with a 60% first hit rate, which isn't great, but we've got a baseline to work from.\n",
"\n",
"Our remediation plan could be as follows:\n",
"- **Incorrect answers:** Either prompt engineering to help the model work out how to answer better (maybe even a bigger model like GPT-4), or search optimisation to return more relevant chunks. Chunking/embedding changes may help this as well - larger chunks may give more context, allowing the model to formulate a better answer.\n",
"- **Unable to answer:** This is either a retrieval problem, or the data doesn't exist in our knowledge base. We can prompt engineer to classify questions that are \"out-of-bounds\", or we can tune our search so the relevant data is returned.\n",
"\n",
"This is the framework we'll build on to get our knowledge retrieval solution to production - again, log everything and store each run down to a question level so you can track regressions and iterate towards your production solution."
]
},
{
"cell_type": "markdown",
"id": "5016fcb8",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"This concludes our Enterprise Knowledge Retrieval walkthrough. We hope you've found it useful, and that you're now in a position to scale your knowledge retrieval solutions into production confidently.\n",
"\n",
"Let us know what you think, and if you have any questions that you'd like answered then please sign up for the Q&A webinar [here](link TBD)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pchain_env",
"language": "python",
"name": "pchain_env"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
}
},
"nbformat": 4,
"nbformat_minor": 5
}