{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Multi-Tool Orchestration with a RAG Approach Using OpenAI's Responses API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "This cookbook guides you through building dynamic, multi-tool workflows using OpenAI's Responses API. It demonstrates how to implement a Retrieval-Augmented Generation (RAG) approach that routes each user query to the appropriate built-in or external tool. Whether your query calls for general knowledge or requires specific internal context from a vector database (such as Pinecone), this guide shows you how to combine function calls, the built-in web search tool, and document retrieval to generate accurate, context-aware responses." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "#%pip install datasets tqdm pandas pinecone openai --quiet\n", "\n", "import os\n", "import time\n", "from tqdm.auto import tqdm\n", "from pandas import DataFrame\n", "from datasets import load_dataset\n", "import random\n", "import string\n", "\n", "\n", "# Import OpenAI client and initialize with your API key.\n", "from openai import OpenAI\n", "\n", "client = OpenAI(api_key=os.getenv(\"OPENAI_API_KEY\"))\n", "\n", "# Import Pinecone client and related specifications.\n", "from pinecone import Pinecone\n", "from pinecone import ServerlessSpec" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example, we use a sample medical reasoning dataset from Hugging Face. We convert the dataset into a pandas DataFrame and merge the “Question” and “Response” columns into a single string. This merged text is used for embedding and is later stored as metadata." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Example merged text: Question: A 61-year-old woman with a long history of involuntary urine loss during activities like coughing or sneezing but no leakage at night undergoes a gynecological exam and Q-tip test. Based on these findings, what would cystometry most likely reveal about her residual volume and detrusor contractions? Answer: Cystometry in this case of stress urinary incontinence would most likely reveal a normal post-void residual volume, as stress incontinence typically does not involve issues with bladder emptying.
Additionally, since stress urinary incontinence is primarily related to physical exertion and not an overactive bladder, you would not expect to see any involuntary detrusor contractions during the test.\n" ] } ], "source": [ "# Load the dataset (ensure you're logged in with huggingface-cli if needed)\n", "ds = load_dataset(\"FreedomIntelligence/medical-o1-reasoning-SFT\", \"en\", split='train[:100]', trust_remote_code=True)\n", "ds_dataframe = DataFrame(ds)\n", "\n", "# Merge the Question and Response columns into a single string.\n", "ds_dataframe['merged'] = ds_dataframe.apply(\n", " lambda row: f\"Question: {row['Question']} Answer: {row['Response']}\", axis=1\n", ")\n", "print(\"Example merged text:\", ds_dataframe['merged'].iloc[0])" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "<table>\n", "<thead>\n",
"<tr><th></th><th>Question</th><th>Complex_CoT</th><th>Response</th><th>merged</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>A 61-year-old woman with a long history of inv...</td><td>Okay, let's think about this step by step. The...</td><td>Cystometry in this case of stress urinary inco...</td><td>Question: A 61-year-old woman with a long hist...</td></tr>\n",
"<tr><th>1</th><td>A 45-year-old man with a history of alcohol us...</td><td>Alright, let’s break this down. We have a 45-y...</td><td>Considering the clinical presentation of sudde...</td><td>Question: A 45-year-old man with a history of ...</td></tr>\n",
"<tr><th>2</th><td>A 45-year-old man presents with symptoms inclu...</td><td>Okay, so here's a 45-year-old guy who's experi...</td><td>Based on the clinical findings presented—wide-...</td><td>Question: A 45-year-old man presents with symp...</td></tr>\n",
"<tr><th>3</th><td>A patient with psoriasis was treated with syst...</td><td>I'm thinking about this patient with psoriasis...</td><td>The development of generalized pustules in a p...</td><td>Question: A patient with psoriasis was treated...</td></tr>\n",
"<tr><th>4</th><td>What is the most likely diagnosis for a 2-year...</td><td>Okay, so we're dealing with a 2-year-old child...</td><td>Based on the described symptoms and the unusua...</td><td>Question: What is the most likely diagnosis fo...</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>95</th><td>An electrical current flows along a flat plate...</td><td>Alright, to find out the temperature at the ce...</td><td>The correct answer is F. 1549°F.</td><td>Question: An electrical current flows along a ...</td></tr>\n",
"<tr><th>96</th><td>A herpetologist bitten by a poisonous snake is...</td><td>Alright, so we're dealing with a case where a ...</td><td>The snake venom is most likely affecting the a...</td><td>Question: A herpetologist bitten by a poisonou...</td></tr>\n",
"<tr><th>97</th><td>A 34 years old person has rapidly developing c...</td><td>Alright, let's break down what's happening wit...</td><td>The symptoms described in the question fit mos...</td><td>Question: A 34 years old person has rapidly de...</td></tr>\n",
"<tr><th>98</th><td>What is the term used to describe the type of ...</td><td>Okay, so I need to figure out what kind of inj...</td><td>The term used to describe the type of injury c...</td><td>Question: What is the term used to describe th...</td></tr>\n",
"<tr><th>99</th><td>During the process of chlorination of water, t...</td><td>Alright, let's think this through starting fro...</td><td>The effective disinfecting action during the c...</td><td>Question: During the process of chlorination o...</td></tr>\n",
"</tbody>\n", "</table>\n",
"<p>100 rows × 4 columns</p>
\n", "\n", "<table>\n", "<thead>\n",
"<tr><th></th><th>Type</th><th>Call ID</th><th>Output</th><th>Name</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>web_search_call</td><td>ws_67e6e83241ac81918f93ffc96491ec390fdddafaeef...</td><td>N/A</td><td>N/A</td></tr>\n",
"<tr><th>1</th><td>message</td><td>msg_67e6e833a2cc8191a9df22f324a876b00fdddafaee...</td><td>N/A</td><td>N/A</td></tr>\n",
"<tr><th>2</th><td>function_call</td><td>call_6YWhEw3QSI7wGZBlNs5Pz4zI</td><td>N/A</td><td>PineconeSearchDocuments</td></tr>\n",
"</tbody>\n", "</table>\n", "