Mirror of https://github.com/james-m-jordan/openai-cookbook.git (synced 2025-05-09 19:32:38 +00:00)
Update stream chat completions API cookbook (#1172)
This commit is contained in:
parent 02525b5f3c
commit dc0e64aedf
@ -19,14 +19,13 @@
"\n",
"Note that using `stream=True` in a production application makes it more difficult to moderate the content of the completions, as partial completions may be more difficult to evaluate. This may have implications for [approved usage](https://beta.openai.com/docs/usage-guidelines).\n",
"\n",
"Another small drawback of streaming responses is that the response no longer includes the `usage` field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using [`tiktoken`](How_to_count_tokens_with_tiktoken.ipynb).\n",
"\n",
"## Example code\n",
"\n",
"Below, this notebook shows:\n",
"1. What a typical chat completion response looks like\n",
"2. What a streaming chat completion response looks like\n",
"3. How much time is saved by streaming a chat completion"
"3. How much time is saved by streaming a chat completion\n",
"4. How to get token usage data for streamed chat completion response"
]
},
{
|
||||
@ -572,6 +571,65 @@
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. How to get token usage data for streamed chat completion response\n",
"\n",
"You can get token usage statistics for your streamed response by setting `stream_options={\"include_usage\": True}`. When you do so, an extra chunk will be streamed as the final chunk. You can access the usage data for the entire request via the `usage` field on this chunk. A few important notes when you set `stream_options={\"include_usage\": True}`:\n",
"* The value for the `usage` field on all chunks except for the last one will be null.\n",
"* The `usage` field on the last chunk contains token usage statistics for the entire request.\n",
"* The `choices` field on the last chunk will always be an empty array `[]`.\n",
"\n",
"Let's see how it works using the example in 2."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"choices: [Choice(delta=ChoiceDelta(content='', function_call=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)]\n",
"usage: None\n",
"****************\n",
"choices: [Choice(delta=ChoiceDelta(content='2', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)]\n",
"usage: None\n",
"****************\n",
"choices: [Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)]\n",
"usage: None\n",
"****************\n",
"choices: []\n",
"usage: CompletionUsage(completion_tokens=1, prompt_tokens=19, total_tokens=20)\n",
"****************\n"
]
}
],
"source": [
"# Example of an OpenAI ChatCompletion request with stream=True and stream_options={\"include_usage\": True}\n",
"\n",
"# a ChatCompletion request\n",
"response = client.chat.completions.create(\n",
"    model='gpt-3.5-turbo',\n",
"    messages=[\n",
"        {'role': 'user', 'content': \"What's 1+1? Answer in one word.\"}\n",
"    ],\n",
"    temperature=0,\n",
"    stream=True,\n",
"    stream_options={\"include_usage\": True}, # retrieving token usage for stream response\n",
")\n",
"\n",
"for chunk in response:\n",
"    print(f\"choices: {chunk.choices}\\nusage: {chunk.usage}\")\n",
"    print(\"****************\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
@ -207,7 +207,6 @@
- ted-at-openai
tags:
- completions
- tiktoken

- title: Multiclass Classification for Transactions
  path: examples/Multiclass_classification_for_transactions.ipynb
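The chunk-handling pattern added in this diff can be exercised without an API call. Below is a minimal sketch of combining a streamed response and reading the final usage chunk; the `Delta`/`Choice`/`Usage`/`Chunk` dataclasses are hypothetical stand-ins for the openai client's chunk types, and the sample chunk values mirror the notebook's printed output above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-ins for the openai client's streamed chunk types,
# shaped like the objects printed in the notebook output above.
@dataclass
class Delta:
    content: Optional[str]

@dataclass
class Choice:
    delta: Delta
    finish_reason: Optional[str]

@dataclass
class Usage:
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int

@dataclass
class Chunk:
    choices: List[Choice]
    usage: Optional[Usage]

def combine_stream(chunks):
    """Accumulate content deltas; capture usage from the final chunk
    (non-null only when stream_options={"include_usage": True} is set)."""
    text_parts, usage = [], None
    for chunk in chunks:
        if chunk.usage is not None:
            usage = chunk.usage
        for choice in chunk.choices:
            if choice.delta.content:
                text_parts.append(choice.delta.content)
    return "".join(text_parts), usage

# Chunks mirroring the notebook's stream for "What's 1+1? Answer in one word."
stream = [
    Chunk([Choice(Delta(""), None)], None),
    Chunk([Choice(Delta("2"), None)], None),
    Chunk([Choice(Delta(None), "stop")], None),
    Chunk([], Usage(completion_tokens=1, prompt_tokens=19, total_tokens=20)),
]

text, usage = combine_stream(stream)
print(text)                # -> 2
print(usage.total_tokens)  # -> 20
```

Note that the final chunk's `choices` list is empty, so iterating only over `chunk.choices[0]` without a guard would fail on it; looping over the list (as above) sidesteps that.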