cleaned up notebook

Ted Sanders 2023-06-16 17:02:42 -07:00
parent 3b37ded49c
commit 953c57af01

@@ -5,7 +5,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"# Search augmented by query generation and embeddings reranking\n", "# Question answering using a search API and re-ranking\n",
"\n", "\n",
"Searching for relevant information can sometimes feel like looking for a needle in a haystack, but don't despair, GPTs can actually do a lot of this work for us. In this guide we explore a way to augment existing search systems with various AI techniques, helping us sift through the noise.\n", "Searching for relevant information can sometimes feel like looking for a needle in a haystack, but don't despair, GPTs can actually do a lot of this work for us. In this guide we explore a way to augment existing search systems with various AI techniques, helping us sift through the noise.\n",
"\n", "\n",
@@ -44,16 +44,17 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 1, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"%env NEWS_API_KEY = YOUR_API_KEY\n" "%%capture\n",
"%env NEWS_API_KEY = NEWS_API_KEY\n"
] ]
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 2, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -70,16 +71,17 @@
"# Load environment variables\n", "# Load environment variables\n",
"news_api_key = os.getenv(\"NEWS_API_KEY\")\n", "news_api_key = os.getenv(\"NEWS_API_KEY\")\n",
"\n", "\n",
"GPT_MODEL = \"gpt-3.5-turbo\"\n",
"\n", "\n",
"# Helper functions\n", "# Helper functions\n",
"def json_gpt(input: str):\n", "def json_gpt(input: str):\n",
" completion = openai.ChatCompletion.create(\n", " completion = openai.ChatCompletion.create(\n",
" model=\"gpt-4\",\n", " model=GPT_MODEL,\n",
" messages=[\n", " messages=[\n",
" {\"role\": \"system\", \"content\": \"Output only valid JSON\"},\n", " {\"role\": \"system\", \"content\": \"Output only valid JSON\"},\n",
" {\"role\": \"user\", \"content\": input},\n", " {\"role\": \"user\", \"content\": input},\n",
" ],\n", " ],\n",
" temperature=1,\n", " temperature=0.5,\n",
" )\n", " )\n",
"\n", "\n",
" text = completion.choices[0].message.content\n", " text = completion.choices[0].message.content\n",
@@ -106,7 +108,7 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 3, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
@@ -124,33 +126,12 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 4, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"data": {
"text/plain": [
"['NBA championship winner',\n",
" 'NBA finals MVP',\n",
" 'last NBA championship game',\n",
" 'recent NBA finals results',\n",
" 'NBA finals champions',\n",
" 'NBA finals MVP and winner',\n",
" 'latest NBA championship game details',\n",
" 'NBA championship winning team',\n",
" 'most recent NBA finals MVP',\n",
" 'last NBA finals game summary',\n",
" 'latest NBA finals champion and MVP',\n",
" 'Who won the NBA championship? And who was the MVP? Tell me a bit about the last game.']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [ "source": [
"QUERIES_INPUT = f\"\"\"\n", "QUERIES_INPUT = f\"\"\"\n",
"You have access to a search API that returns recent news articles.\n",
"Generate an array of search queries that are relevant to this question.\n", "Generate an array of search queries that are relevant to this question.\n",
"Use a variation of related keywords for the queries, trying to be as general as possible.\n", "Use a variation of related keywords for the queries, trying to be as general as possible.\n",
"Include as many queries as you can think of, including and excluding terms.\n", "Include as many queries as you can think of, including and excluding terms.\n",
@@ -180,17 +161,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 5, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 12/12 [00:04<00:00, 2.69it/s]\n"
]
}
],
"source": [ "source": [
"def search_news(\n", "def search_news(\n",
" query: str,\n", " query: str,\n",
@@ -226,41 +199,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 6, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total number of articles: 356\n",
"Top 5 articles of query 1: \n",
"\n",
"Title: Nascar takes on Le Mans as LeBron James gets centenary race under way\n",
"Description: <ul><li>Nascar has presence at iconic race for first time since 1976</li><li>NBA superstar LeBron James waves flag as honorary starter</li></ul>The crowd chanted “U-S-A! U-S-A!” as Nascar driver lineup for the 24 Hours of Le Mans passed through the city cente…\n",
"Content: The crowd chanted U-S-A! U-S-A! as Nascar driver lineup for the 24 Hours of Le Mans passed through t...\n",
"\n",
"Title: Futura and Michelob ULTRA Toast to the NBA Finals With Abstract Artwork Crafted From the Brands 2023 Limited-Edition Championship Bottles\n",
"Description: The sun is out to play, and so is Michelob ULTRA. With the 2022-2023 NBA Finals underway, the beermaker is back with its celebratory NBA Champ Bottles. This year, the self-proclaimed MVP of joy is dropping a limited-edition bottle made in collaboration with a…\n",
"Content: The sun is out to play, and so is Michelob ULTRA. With the 2022-2023 NBA Finals underway, the beerma...\n",
"\n",
"Title: Signed and Delivered, Futura and Michelob ULTRA Will Gift Hand-Painted Bottles to This Years NBA Championship Team\n",
"Description: Michelob ULTRA, the MVP of joy and official beer sponsor of the NBA is back to celebrate with basketball lovers and sports fans around the globe as the NBA 2022-2023 season comes to a nail-biting close. In collaboration with artist Futura, Michelob ULTRA will…\n",
"Content: Michelob ULTRA, the MVP of joy and official beer sponsor of the NBA is back to celebrate with basket...\n",
"\n",
"Title: Alexis Ohanian and Serena Williams are building a mini-sports empire with a new golf team that's part of a league created by Tiger Woods and Rory McIlroy\n",
"Description: Ohanian and Williams are already co-owners of the National Women's Soccer League Los Angeles team, Angel City FC.\n",
"Content: Alexis Ohanian and Serena Williams attend The 2023 Met Gala.Cindy Ord/Getty Images\n",
"<ul>\n",
"<li>Alexis ...\n",
"\n",
"Title: Las Vegas wanted the NHL. And now the city has the Stanley Cup\n",
"Description: The Golden Knights won the championship on Tuesday. Its a testament to the decision to move hockey into uncharted territoryA month after Las Vegas was awarded an NHL team in the summer of 2016, a local paper held a series of, in the publications own words, …\n",
"Content: A month after Las Vegas was awarded an NHL team in the summer of 2016, a local paper held a series o...\n",
"\n"
]
}
],
"source": [ "source": [
"print(\"Total number of articles:\", len(articles))\n", "print(\"Total number of articles:\", len(articles))\n",
"print(\"Top 5 articles of query 1:\", \"\\n\")\n", "print(\"Top 5 articles of query 1:\", \"\\n\")\n",
@@ -286,20 +227,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 7, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"data": {
"text/plain": [
"'Team NAME won the NBA championship, and PLAYER_NAME was awarded the MVP title. In the last game, NAME displayed an outstanding performance with X points, Y rebounds, and Z assists, leading their team to a thrilling victory at PLACE.'"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [ "source": [
"HA_INPUT = f\"\"\"\n", "HA_INPUT = f\"\"\"\n",
"Generate a hypothetical answer to the user's question. This answer will be used to rank search results. \n", "Generate a hypothetical answer to the user's question. This answer will be used to rank search results. \n",
@@ -326,29 +256,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 8, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"data": {
"text/plain": [
"[0.779471308147333,\n",
" 0.7800588417397301,\n",
" 0.7900892607044301,\n",
" 0.7704583513281005,\n",
" 0.7841046909560899,\n",
" 0.8246545759453099,\n",
" 0.8127694680991286,\n",
" 0.8235724294003601,\n",
" 0.7978980332478777,\n",
" 0.8273641985639677]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [ "source": [
"hypothetical_answer_embedding = embeddings(hypothetical_answer)[0]\n", "hypothetical_answer_embedding = embeddings(hypothetical_answer)[0]\n",
"article_embeddings = embeddings(\n", "article_embeddings = embeddings(\n",
@@ -376,44 +286,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 9, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"name": "stdout",
"output_type": "stream",
"text": [
"Top 5 articles: \n",
"\n",
"Title: Barack Obama and Magic Johnson lead Nuggets tributes after first-ever NBA title win\n",
"Description: Straight after the Nuggets' Game 5 win, several standout names and brands within the world of sports and elsewhere paid tribute to Nikola Jokic and his teammates for their historic season.\n",
"Content: The Denver Nuggetsclinched their first ever NBA championship on Monday night, following it's Game 5 ...\n",
"Score: 0.8440720976260312\n",
"\n",
"Title: Barack Obama and Magic Johnson lead Nuggets tributes after first-ever NBA title win\n",
"Description: Straight after the Nuggets' Game 5 win, several standout names and brands within the world of sports and elsewhere paid tribute to Nikola Jokic and his teammates for their historic season.\n",
"Content: The Denver Nuggetsclinched their first ever NBA championship on Monday night, following it's Game 5 ...\n",
"Score: 0.8440720976260312\n",
"\n",
"Title: Nikola Jokic wins NBA Finals MVP, leading Denver Nuggets to first championship\n",
"Description: In the 2023 NBA Finals, Nikola Jokic became the first player ever with a 30-point, 20-rebound triple-double in a Finals game.\n",
"Content: Nikola Jokic did something nice.\n",
"After leading the Denver Nuggets to their first championship in fr...\n",
"Score: 0.8358902345553455\n",
"\n",
"Title: Nikola Jokic named NBA Finals MVP after leading Denver Nuggets to first championship\n",
"Description: Already a two-time league MVP, Nikola Jokic added an NBA championship and Finals MVP award to his impressive resume.\n",
"Content: DENVER Nikola Jokic claimed he didnt care about winning a third consecutive regular-season MVP in 20...\n",
"Score: 0.8358544359823221\n",
"\n",
"Title: Nikola Jokic named NBA Finals MVP after leading Denver Nuggets to first championship\n",
"Description: Already a two-time league MVP, Nikola Jokic added an NBA championship and Finals MVP award to his impressive resume.\n",
"Content: DENVER Nikola Jokic claimed he didnt care about winning a third consecutive regular-season MVP in 20...\n",
"Score: 0.8358361081673931\n",
"\n"
]
}
],
"source": [ "source": [
"scored_articles = zip(articles, cosine_similarities)\n", "scored_articles = zip(articles, cosine_similarities)\n",
"\n", "\n",
@@ -443,26 +318,9 @@
}, },
{ {
"cell_type": "code", "cell_type": "code",
"execution_count": 11, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [ "outputs": [],
{
"data": {
"text/markdown": [
"The Denver Nuggets won their first-ever NBA championship, and Nikola Jokic was named the NBA Finals MVP. In the last game, Game 5 of the 2023 NBA Finals, Jokic became the first player ever with a 30-point, 20-rebound triple-double in a Finals game [^1^] [^3^]. The Nuggets' historic season was celebrated and recognized by standout names and brands within the world of sports and elsewhere, including tributes from Barack Obama and Magic Johnson [^1^] [^2^].\n",
"\n",
"[^1^]: [Denver Post](https://www.denverpost.com/2023/06/12/nikola-jokic-nba-finals-mvp-denver-nuggets/)\n",
"[^2^]: [Daily Mail](https://www.dailymail.co.uk/sport/nba/article-12189843/Barack-Obama-Magic-Johnson-lead-Nuggets-tributes-NBA-title-win.html)\n",
"[^3^]: [USA Today](https://www.usatoday.com/story/sports/nba/2023/06/12/nikola-jokic-is-nba-finals-mvp-after-nuggets-win-first-championship/70314309007/)"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [ "source": [
"formatted_top_results = [\n", "formatted_top_results = [\n",
" {\n", " {\n",
@@ -482,9 +340,9 @@
"\"\"\"\n", "\"\"\"\n",
"\n", "\n",
"completion = openai.ChatCompletion.create(\n", "completion = openai.ChatCompletion.create(\n",
" model=\"gpt-4\",\n", " model=GPT_MODEL,\n",
" messages=[{\"role\": \"user\", \"content\": ANSWER_INPUT}],\n", " messages=[{\"role\": \"user\", \"content\": ANSWER_INPUT}],\n",
" temperature=1,\n", " temperature=0.5,\n",
" stream=True,\n", " stream=True,\n",
")\n", ")\n",
"\n", "\n",
@@ -512,7 +370,7 @@
"name": "python", "name": "python",
"nbconvert_exporter": "python", "nbconvert_exporter": "python",
"pygments_lexer": "ipython3", "pygments_lexer": "ipython3",
"version": "3.11.0" "version": "3.9.9"
}, },
"orig_nbformat": 4 "orig_nbformat": 4
}, },