"In this example we'll try to go over all operations that can be done using the Azure endpoints and their differences with the openAi endpoints (if any).<br>\n",
"This example focuses on finetuning but touches on the majority of operations that are also available using the API. This example is meant to be a quick way of showing simple operations and is not meant as a finetune model adaptation tutorial.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"from openai import cli"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"In the following section the endpoint and key need to be set up of the next sections to work.<br> Please go to https://portal.azure.com, find your resource and then under \"Resource Management\" -> \"Keys and Endpoints\" look for the \"Endpoint\" value and one of the Keys. They will act as api_base and api_key in the code below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"openai.api_key = '' # Please add your api key here\n",
"openai.api_base = '' # Please add your endpoint here\n",
"\n",
"openai.api_type = 'azure'\n",
"openai.api_version = '2022-03-01-preview' # this may change in the future"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Microsoft Active Directory Authentication\n",
"Instead of key based authentication, you can use Active Directory to authenticate using credential tokens. Uncomment the next code section to use credential based authentication:"
"openai.api_version = '2022-03-01-preview' # this may change in the future\n",
"\n",
"\n",
"openai.api_base = '' # Please add your endpoint here\n",
"\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Files\n",
"In the next section we will focus on the files operations: importing, listing, retrieving, deleting. For this we need to create 2 temporary files with some sample data. For the sake of simplicity, we will use the same data for training and validation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"import json\n",
"\n",
"training_file_name = 'training.jsonl'\n",
"validation_file_name = 'validation.jsonl'\n",
"\n",
"sample_data = [{\"prompt\": \"When I go to the store, I want an\", \"completion\": \"apple\"},\n",
" {\"prompt\": \"When I go to work, I want a\", \"completion\": \"coffe\"},\n",
" {\"prompt\": \"When I go home, I want a\", \"completion\": \"soda\"}]\n",
"\n",
"print(f'Generating the training file: {training_file_name}')\n",
"with open(training_file_name, 'w') as training_file:\n",
" for entry in sample_data:\n",
" json.dump(entry, training_file)\n",
" training_file.write('\\n')\n",
"\n",
"print(f'Copying the training file to the validation file')\n",
"List all of the uploaded files and check for the ones that are named \"training.jsonl\" or \"validation.jsonl\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print('Checking for existing uploaded files.')\n",
"results = []\n",
"files = openai.File.list().data\n",
"print(f'Found {len(files)} total uploaded files in the subscription.')\n",
"for item in files:\n",
" if item[\"filename\"] in [training_file_name, validation_file_name]:\n",
" results.append(item[\"id\"])\n",
"print(f'Found {len(results)} already uploaded files that match our names.')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Files: Deleting\n",
"Let's now delete those found files (if any) since we're going to be re-uploading them next."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(f'Deleting already uploaded files.')\n",
"for id in results:\n",
" openai.File.delete(sid = id)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Files: Importing & Retrieving\n",
"Now, let's import our two files ('training.jsonl' and 'validation.jsonl') and keep those IDs since we're going to use them later for finetuning.<br>\n",
"For this operation we are going to use the cli wrapper which does a bit more checks before uploading and also gives us progress. In addition, after uploading we're going to check the status our import until it has succeeded (or failed if something goes wrong)"
"if status not in [\"succeeded\", \"failed\"]:\n",
" print(f'Job not in terminal status: {status}. Waiting.')\n",
" while status not in [\"succeeded\", \"failed\"]:\n",
" time.sleep(2)\n",
" status = openai.FineTune.retrieve(id=job_id)[\"status\"]\n",
" print(f'Status: {status}')\n",
"else:\n",
" print(f'Finetune job {job_id} finished with status: {status}')\n",
"\n",
"print('Checking other finetune jobs in the subscription.')\n",
"result = openai.FineTune.list()\n",
"print(f'Found {len(result)} finetune jobs.')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Finetune: Deleting\n",
"Finally we can delete our finetune job.<br>\n",
"WARNING: Please skip this step if you want to continue with the next section as the finetune model is needed. (The delete code is commented out by default)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# openai.FineTune.delete(sid=job_id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deployments\n",
"In this section we are going to create a deployment using the finetune model that we just adapted and then used the deployment to create a simple completion operation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deployments: Create\n",
"Let's create a deployment using the fine-tune model."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#Fist let's get the model of the previous job:\n",
"result = openai.FineTune.retrieve(id=job_id)\n",
"if result[\"status\"] == 'succeeded':\n",
" model = result[\"fine_tuned_model\"]\n",
"\n",
"# Now let's create the deployment\n",
"print(f'Creating a new deployment with model: {model}')\n",