Merge pull request #77 from Pankaj-Baranwal/patch-1

Replaced non-existent model with the new version.
Ted Sanders 2023-01-19 14:38:35 -08:00 committed by GitHub
commit c9df23f7d0
GPG Key ID: 4AEE18F83AFDEB23

@@ -12,7 +12,7 @@
 "metadata": {},
 "source": [
 "# 2. Creating a synthetic Q&A dataset\n",
-"We use [`davinci-instruct-beta-v2`](https://beta.openai.com/docs/engines/instruct-series-beta), a model specialized in following instructions, to create questions based on the given context. Then we also use [`davinci-instruct-beta-v2`](https://beta.openai.com/docs/engines/instruct-series-beta) to answer those questions, given the same context. \n",
+"We use [`davinci-instruct-beta-v3`](https://beta.openai.com/docs/engines/instruct-series-beta), a model specialized in following instructions, to create questions based on the given context. Then we also use [`davinci-instruct-beta-v3`](https://beta.openai.com/docs/engines/instruct-series-beta) to answer those questions, given the same context. \n",
 "\n",
 "This is expensive, and will also take a long time, as we call the davinci engine for each section. You can simply download the final dataset instead.\n",
 "\n",
@@ -175,7 +175,7 @@
 "def get_questions(context):\n",
 "    try:\n",
 "        response = openai.Completion.create(\n",
-"            engine=\"davinci-instruct-beta-v2\",\n",
+"            engine=\"davinci-instruct-beta-v3\",\n",
 "            prompt=f\"Write questions based on the text below\\n\\nText: {context}\\n\\nQuestions:\\n1.\",\n",
 "            temperature=0,\n",
 "            max_tokens=257,\n",
@@ -255,7 +255,7 @@
 "def get_answers(row):\n",
 "    try:\n",
 "        response = openai.Completion.create(\n",
-"            engine=\"davinci-instruct-beta-v2\",\n",
+"            engine=\"davinci-instruct-beta-v3\",\n",
 "            prompt=f\"Write questions based on the text below\\n\\nText: {row.context}\\n\\nQuestions:\\n{row.questions}\\n\\nAnswers:\\n1.\",\n",
 "            temperature=0,\n",
 "            max_tokens=257,\n",
@@ -385,7 +385,7 @@
 }
 ],
 "source": [
-"answer_question(olympics_search_fileid, \"davinci-instruct-beta-v2\", \n",
+"answer_question(olympics_search_fileid, \"davinci-instruct-beta-v3\", \n",
 " \"Where did women's 4 x 100 metres relay event take place during the 2020 Summer Olympics?\")"
 ]
 },
@@ -393,7 +393,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"After we fine-tune the model for Q&A we'll be able to use it instead of [`davinci-instruct-beta-v2`](https://beta.openai.com/docs/engines/instruct-series-beta), to obtain better answers when the question can't be answered based on the context. We see a downside of [`davinci-instruct-beta-v2`](https://beta.openai.com/docs/engines/instruct-series-beta), which always attempts to answer the question, regardless of the relevant context being present or not. (Note the second question is asking about a future event, set in 2024.)"
+"After we fine-tune the model for Q&A we'll be able to use it instead of [`davinci-instruct-beta-v3`](https://beta.openai.com/docs/engines/instruct-series-beta), to obtain better answers when the question can't be answered based on the context. We see a downside of [`davinci-instruct-beta-v3`](https://beta.openai.com/docs/engines/instruct-series-beta), which always attempts to answer the question, regardless of the relevant context being present or not. (Note the second question is asking about a future event, set in 2024.)"
 ]
 },
 {
@@ -413,7 +413,7 @@
 }
 ],
 "source": [
-"answer_question(olympics_search_fileid, \"davinci-instruct-beta-v2\", \n",
+"answer_question(olympics_search_fileid, \"davinci-instruct-beta-v3\", \n",
 " \"Where did women's 4 x 100 metres relay event take place during the 2048 Summer Olympics?\", max_len=1000)"
 ]
 },
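
Review note: this change edits the same engine string at several call sites. A possible follow-up, sketched below, would keep the name in one constant so the next model rename touches a single line. This is only an illustration, not part of the commit: `INSTRUCT_ENGINE`, `question_request`, and `answer_request` are hypothetical names, while the prompts and parameters are copied from the hunks above. The functions only build keyword arguments, so they can be checked without calling the API.

```python
# Hypothetical refactor sketch; the names below are not in the notebook.
# The engine value is the one introduced by this commit.
INSTRUCT_ENGINE = "davinci-instruct-beta-v3"


def question_request(context, engine=INSTRUCT_ENGINE):
    """Build the keyword arguments used to generate questions for a context."""
    return {
        "engine": engine,
        "prompt": f"Write questions based on the text below\n\nText: {context}\n\nQuestions:\n1.",
        "temperature": 0,
        "max_tokens": 257,
    }


def answer_request(context, questions, engine=INSTRUCT_ENGINE):
    """Build the keyword arguments used to answer previously generated questions."""
    return {
        "engine": engine,
        # Prompt text kept exactly as it appears in the notebook's get_answers.
        "prompt": (
            f"Write questions based on the text below\n\nText: {context}"
            f"\n\nQuestions:\n{questions}\n\nAnswers:\n1."
        ),
        "temperature": 0,
        "max_tokens": 257,
    }


# Intended use inside the notebook (legacy Completion interface, as in the diff):
#   response = openai.Completion.create(**question_request(context))
```

With this shape, `get_questions` and `get_answers` would pass the returned dictionary to `openai.Completion.create`, and the `answer_question(...)` cells could take `INSTRUCT_ENGINE` in place of the literal string.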