{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# GPT Action Library: BigQuery"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This page provides instructions and a guide for developers building a GPT Action for a specific application. Before you proceed, familiarize yourself with the following resources:\n",
    "- [Introduction to GPT Actions](https://platform.openai.com/docs/actions)\n",
    "- [Introduction to GPT Actions Library](https://platform.openai.com/docs/actions-library)\n",
    "- [Example of Building a GPT Action from Scratch](https://platform.openai.com/docs/getting-started)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This GPT Action shows how to connect to **Google BigQuery**, Google Cloud's analytical data warehouse. The Action takes a user’s question, scans the relevant tables to gather the data schema, then writes a SQL query to answer the question.\n",
    "\n",
    "Note: these instructions return a functioning SQL statement rather than the query result itself. Returning the result as a CSV file currently requires middleware; we’ll be posting an example of that soon."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Application Information"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Application Key Links"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Check out these links from the application before you get started:\n",
    "- Application Website: https://cloud.google.com/bigquery \n",
    "- Application API Documentation: https://cloud.google.com/bigquery/docs/reference/rest "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Application Prerequisites"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you get started, make sure you go through the following steps in your application environment:\n",
    "- Set up a GCP project \n",
    "- Set up a BQ dataset in that GCP project\n",
    "- Ensure that the user authenticating into BigQuery via ChatGPT has access to that BQ dataset "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ChatGPT Steps"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Custom GPT Instructions "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once you've created a Custom GPT, copy the text below into the Instructions panel. Have questions? Check out [Getting Started Example](https://platform.openai.com/docs/getting-started) to see how this step works in more detail."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "outputs": [],
   "source": [
    "**Context**: You are an expert at writing BigQuery SQL queries. A user is going to ask you a question. \n",
    "\n",
    "**Instructions**:\n",
    "1. No matter the user's question, start by running `runQuery` operation using this query: \"SELECT column_name, table_name, data_type, description FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`\" \n",
    "-- Assume project = \"<insert your default project here>\", dataset = \"<insert your default dataset here>\", unless the user provides different values \n",
    "-- Remember to include useLegacySql:false in the json output\n",
    "2. Convert the user's question into a SQL statement that leverages the step above and run the `runQuery` operation on that SQL statement to confirm the query works. Add a limit of 100 rows\n",
    "3. Now remove the limit of 100 rows and return the query for the user to see\n",
    "\n",
    "**Additional Notes**: If the user says \"Let's get started\", explain that the user can provide a project or dataset, along with a question they want answered. If the user has no ideas, suggest that we have a sample flights dataset they can query - ask if they want you to query that"
   ]
  },
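  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make step 1 of the instructions concrete, the cell below is a minimal sketch of the request that the `runQuery` operation would send for the schema lookup. The helper name `build_schema_request` and the placeholder values `my-project` and `my_dataset` are hypothetical, and the URL is assumed from the server and path in the OpenAPI Schema section of this page. The cell only builds and prints the request; nothing is sent over the network."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Base URL assumed from the `servers` entry of the OpenAPI schema on this page.\n",
    "BASE_URL = 'https://bigquery.googleapis.com/bigquery/v2'\n",
    "\n",
    "\n",
    "def build_schema_request(project, dataset):\n",
    "    # Hypothetical helper: builds the URL and JSON body for the schema lookup.\n",
    "    query = (\n",
    "        'SELECT column_name, table_name, data_type, description '\n",
    "        f'FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`'\n",
    "    )\n",
    "    url = f'{BASE_URL}/projects/{project}/queries'\n",
    "    # useLegacySql is false so the query runs as standard SQL, not legacy SQL.\n",
    "    body = {'query': query, 'useLegacySql': False}\n",
    "    return url, body\n",
    "\n",
    "\n",
    "# Placeholder project and dataset names (hypothetical).\n",
    "url, body = build_schema_request('my-project', 'my_dataset')\n",
    "print(url)\n",
    "print(json.dumps(body, indent=2))"
   ]
  },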
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### OpenAPI Schema "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once you've created a Custom GPT, copy the text below into the Actions panel. Have questions? Check out [Getting Started Example](https://platform.openai.com/docs/getting-started) to see how this step works in more detail."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "vscode": {
     "languageId": "yaml"
    }
   },
   "outputs": [],
   "source": [
    "openapi: 3.1.0\n",
    "info:\n",
    "  title: BigQuery API\n",
    "  description: API for querying a BigQuery table.\n",
    "  version: 1.0.0\n",
    "servers:\n",
    "  - url: https://bigquery.googleapis.com/bigquery/v2\n",
    "    description: Google BigQuery API server\n",
    "paths:\n",
    "  /projects/{projectId}/queries:\n",
    "    post:\n",
    "      operationId: runQuery\n",
    "      summary: Executes a query on a specified BigQuery table.\n",
    "      description: Submits a query to BigQuery and returns the results.\n",
    "      parameters:\n",
    "        - name: projectId\n",
    "          in: path\n",
    "          required: true\n",
    "          description: The ID of the Google Cloud project.\n",
    "          schema:\n",
    "            type: string\n",
    "      requestBody:\n",
    "        required: true\n",
    "        content:\n",
    "          application/json:\n",
    "            schema:\n",
    "              type: object\n",
    "              properties:\n",
    "                query:\n",
    "                  type: string\n",
    "                  description: The SQL query string.\n",
    "                useLegacySql:\n",
    "                  type: boolean\n",
    "                  description: Whether to use legacy SQL.\n",
    "                  default: false\n",
    "      responses:\n",
    "        '200':\n",
    "          description: Successful query execution.\n",
    "          content:\n",
    "            application/json:\n",
    "              schema:\n",
    "                type: object\n",
    "                properties:\n",
    "                  kind:\n",
    "                    type: string\n",
    "                    example: \"bigquery#queryResponse\"\n",
    "                  schema:\n",
    "                    type: object\n",
    "                    description: The schema of the results.\n",
    "                  jobReference:\n",
    "                    type: object\n",
    "                    properties:\n",
    "                      projectId:\n",
    "                        type: string\n",
    "                      jobId:\n",
    "                        type: string\n",
    "                  rows:\n",
    "                    type: array\n",
    "                    items:\n",
    "                      type: object\n",
    "                      properties:\n",
    "                        f:\n",
    "                          type: array\n",
    "                          items:\n",
    "                            type: object\n",
    "                            properties:\n",
    "                              v:\n",
    "                                type: string\n",
    "                  totalRows:\n",
    "                    type: string\n",
    "                    description: Total number of rows in the query result.\n",
    "                  pageToken:\n",
    "                    type: string\n",
    "                    description: Token for pagination of query results.\n",
    "        '400':\n",
    "          description: Bad request. The request was invalid.\n",
    "        '401':\n",
    "          description: Unauthorized. Authentication is required.\n",
    "        '403':\n",
    "          description: Forbidden. The request is not allowed.\n",
    "        '404':\n",
    "          description: Not found. The specified resource was not found.\n",
    "        '500':\n",
    "          description: Internal server error. An error occurred while processing the request."
   ]
  },
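  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `rows` structure in the 200 response above (`f` for the cells of a row, `v` for each cell value) can be hard to read at a glance. The cell below is a minimal sketch of flattening it into one dictionary per row. The helper name `rows_to_dicts` is hypothetical, the sample payload is hand-written rather than real output, and the `schema.fields[].name` layout is an assumption based on the BigQuery REST reference, since the schema above declares `schema` only as a generic object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def rows_to_dicts(response):\n",
    "    # Column names are assumed to live under schema.fields[].name.\n",
    "    fields = response.get('schema', {}).get('fields', [])\n",
    "    names = [field['name'] for field in fields]\n",
    "    # Each row looks like {'f': [{'v': value}, ...]}, with cells in column order.\n",
    "    return [\n",
    "        dict(zip(names, (cell.get('v') for cell in row.get('f', []))))\n",
    "        for row in response.get('rows', [])\n",
    "    ]\n",
    "\n",
    "\n",
    "# Hand-written sample shaped like the 200 response (not real BigQuery output).\n",
    "sample_response = {\n",
    "    'kind': 'bigquery#queryResponse',\n",
    "    'schema': {\n",
    "        'fields': [\n",
    "            {'name': 'column_name', 'type': 'STRING'},\n",
    "            {'name': 'data_type', 'type': 'STRING'},\n",
    "        ]\n",
    "    },\n",
    "    'rows': [\n",
    "        {'f': [{'v': 'flight_id'}, {'v': 'INT64'}]},\n",
    "        {'f': [{'v': 'origin'}, {'v': 'STRING'}]},\n",
    "    ],\n",
    "    'totalRows': '2',\n",
    "}\n",
    "\n",
    "for record in rows_to_dicts(sample_response):\n",
    "    print(record)"
   ]
  },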
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Authentication Instructions"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below are instructions on setting up authentication with this third-party application. Have questions? Check out [Getting Started Example](https://platform.openai.com/docs/getting-started) to see how this step works in more detail."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Pre-Action Steps"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you set up authentication in ChatGPT, please take the following steps in the application."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Go to the Google Cloud Console\n",
    "- Navigate to APIs & Services > Credentials\n",
    "- Create new OAuth credentials (or use existing ones)\n",
    "- Locate your OAuth Client ID & Client Secret and store both values securely (see screenshot below)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![gptactions_BigQuery_auth.png](../../images/gptactions_BigQuery_auth.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### In ChatGPT"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In ChatGPT, click on \"Authentication\" and choose **\"OAuth\"**. Enter the information below."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- **Client ID**: use Client ID from steps above \n",
    "- **Client Secret**: use Client Secret from steps above\n",
    "- **Authorization URL**: https://accounts.google.com/o/oauth2/auth\n",
    "- **Token URL**: https://oauth2.googleapis.com/token \n",
    "- **Scope**: https://www.googleapis.com/auth/bigquery \n",
    "- **Token Exchange Method**: Default (POST request)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Post-Action Steps"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once you've set up authentication in ChatGPT, follow the steps below in the application to finalize the Action. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Copy the callback URL from the GPT Action\n",
    "- In the “Authorized redirect URIs” (see screenshot above), add your callback URL \n"
   ]
  },
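  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why the callback URL has to be registered, the cell below sketches the kind of authorization request an OAuth 2.0 authorization-code flow sends to Google. This is an illustration under assumptions, not the exact request ChatGPT makes: ChatGPT builds the real URL itself and may add parameters such as `state`. The helper name `build_authorization_url`, the client ID, and the callback URL are all hypothetical placeholders."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from urllib.parse import urlencode\n",
    "\n",
    "# Authorization URL and scope as listed in the In ChatGPT section.\n",
    "AUTHORIZATION_URL = 'https://accounts.google.com/o/oauth2/auth'\n",
    "SCOPE = 'https://www.googleapis.com/auth/bigquery'\n",
    "\n",
    "\n",
    "def build_authorization_url(client_id, redirect_uri):\n",
    "    # Hypothetical helper: standard authorization-code request parameters.\n",
    "    params = {\n",
    "        'client_id': client_id,\n",
    "        # Google rejects the request unless this value matches one of the\n",
    "        # Authorized redirect URIs configured for the OAuth client.\n",
    "        'redirect_uri': redirect_uri,\n",
    "        'response_type': 'code',\n",
    "        'scope': SCOPE,\n",
    "    }\n",
    "    return AUTHORIZATION_URL + '?' + urlencode(params)\n",
    "\n",
    "\n",
    "# Placeholder values (hypothetical); copy the real callback URL from your GPT Action.\n",
    "print(build_authorization_url(\n",
    "    'example-client-id.apps.googleusercontent.com',\n",
    "    'https://example.com/your-gpt-callback',\n",
    "))"
   ]
  },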
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### FAQ & Troubleshooting"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- *Callback URL Error:* If you get a callback URL error in ChatGPT, pay close attention to the screenshot above. You need to add the callback URL directly into GCP for the action to authenticate correctly.\n",
    "- *Schema calls the wrong project or dataset:* If ChatGPT calls the wrong project or dataset, consider updating your instructions to either (a) state explicitly which project and dataset should be called or (b) require the user to provide those exact details before it runs the query."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Are there integrations that you’d like us to prioritize? Are there errors in our integrations? File a PR or issue in our GitHub repository, and we’ll take a look.*\n"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}