{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Azure embeddings example\n", "\n", "> Note: This notebook uses the pre-1.0 `openai` Python library. A newer version of the library is available. See https://github.com/openai/openai-python/discussions/742\n", "\n", "This example covers creating embeddings with the Azure OpenAI service." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we install the necessary dependencies." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "! pip install \"openai>=0.28.1,<1.0.0\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the following sections to work properly, we first have to set up a few things. Let's start with the `api_base` and `api_version`. To find your `api_base`, go to https://portal.azure.com, find your resource, and then under \"Resource Management\" -> \"Keys and Endpoints\" look for the \"Endpoint\" value." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import openai" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "openai.api_version = '2023-05-15'\n", "openai.api_base = '' # Please add your endpoint here" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Next, we have to set up the `api_type` and `api_key`. We can either get the key from the portal or get it through Microsoft Active Directory Authentication. Depending on which option you choose, the `api_type` is either `azure` or `azure_ad`." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Setup: Portal\n", "Let's first look at getting the key from the portal. Go to https://portal.azure.com, find your resource, and then under \"Resource Management\" -> \"Keys and Endpoints\" look for one of the \"Keys\" values." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "openai.api_type = 'azure'\n", "openai.api_key = os.environ[\"OPENAI_API_KEY\"]" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "> Note: In this example, we configured the library to use the Azure API by setting the variables in code. For development, consider setting the environment variables instead:\n", "\n", "```\n", "OPENAI_API_BASE\n", "OPENAI_API_KEY\n", "OPENAI_API_TYPE\n", "OPENAI_API_VERSION\n", "```" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### (Optional) Setup: Microsoft Active Directory Authentication\n", "Let's now see how we can get a key via Microsoft Active Directory Authentication. Uncomment the following code if you want to use Active Directory Authentication instead of keys from the portal." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# from azure.identity import DefaultAzureCredential\n", "\n", "# default_credential = DefaultAzureCredential()\n", "# token = default_credential.get_token(\"https://cognitiveservices.azure.com/.default\")\n", "\n", "# openai.api_type = 'azure_ad'\n", "# openai.api_key = token.token" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "A token is valid for a period of time, after which it will expire. 
To ensure a valid token is sent with every request, you can refresh an expiring token by hooking into `requests.auth`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import typing\n", "import time\n", "import requests\n", "\n", "if typing.TYPE_CHECKING:\n", "    from azure.core.credentials import AccessToken, TokenCredential\n", "\n", "class TokenRefresh(requests.auth.AuthBase):\n", "    \"\"\"Attach a bearer token to each request, refreshing it shortly before it expires.\"\"\"\n", "\n", "    def __init__(self, credential: \"TokenCredential\", scopes: typing.List[str]) -> None:\n", "        self.credential = credential\n", "        self.scopes = scopes\n", "        self.cached_token: typing.Optional[\"AccessToken\"] = None\n", "\n", "    def __call__(self, req):\n", "        # Fetch a new token if none is cached or the cached one expires within 5 minutes\n", "        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:\n", "            self.cached_token = self.credential.get_token(*self.scopes)\n", "        req.headers[\"Authorization\"] = f\"Bearer {self.cached_token.token}\"\n", "        return req\n", "\n", "# `default_credential` comes from the Active Directory cell above (uncomment that cell first)\n", "session = requests.Session()\n", "session.auth = TokenRefresh(default_credential, [\"https://cognitiveservices.azure.com/.default\"])\n", "\n", "openai.requestssession = session" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Deployments\n", "In this section we are going to create a deployment that we can use to create embeddings." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Deployments: Create manually\n", "Let's create a deployment using the `text-similarity-curie-001` model. Create a new deployment by going to your Resource in your portal under \"Resource Management\" -> \"Model deployments\"." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "deployment_id = '' # Fill in the deployment id from the portal here" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Deployments: Listing\n", "Because creating a new deployment takes a long time, let's instead look in the subscription for an existing deployment that has already succeeded and supports embeddings." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('While the deployment is running, selecting a completed one that supports embeddings.')\n", "deployment_id = None\n", "result = openai.Deployment.list()\n", "for deployment in result.data:\n", "    # Skip deployments that have not finished successfully\n", "    if deployment[\"status\"] != \"succeeded\":\n", "        continue\n", "\n", "    # Skip deployments whose model cannot produce embeddings\n", "    model = openai.Model.retrieve(deployment[\"model\"])\n", "    if not model[\"capabilities\"][\"embeddings\"]:\n", "        continue\n", "\n", "    deployment_id = deployment[\"id\"]\n", "    break\n", "\n", "if not deployment_id:\n", "    print('No succeeded deployment that supports embeddings was found.')\n", "else:\n", "    print(f'Found a succeeded deployment that supports embeddings with id: {deployment_id}.')" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Embeddings\n", "Now let's send a sample input to the deployment to create an embedding." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "embeddings = openai.Embedding.create(\n", "    deployment_id=deployment_id,\n", "    input=\"The food was delicious and the waiter...\",\n", ")\n", "\n", "print(embeddings)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" }, "vscode": { "interpreter": { "hash": "3a5103089ab7e7c666b279eeded403fcec76de49a40685dbdfe9f9c78ad97c17" } } }, "nbformat": 4, "nbformat_minor": 2 }