diff --git a/examples/GPT_with_vision_for_video_understanding.ipynb b/examples/GPT_with_vision_for_video_understanding.ipynb
index 8e32ee0..6dc2bdf 100644
--- a/examples/GPT_with_vision_for_video_understanding.ipynb
+++ b/examples/GPT_with_vision_for_video_understanding.ipynb
@@ -39,7 +39,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "First we use OpenCV to extract frames from a nature [video](https://www.youtube.com/watch?v=kQ_7GtE529M) containing bisons and wolves:\n"
+    "First, we use OpenCV to extract frames from a nature [video](https://www.youtube.com/watch?v=kQ_7GtE529M) containing bison and wolves:\n"
    ]
   },
   {
@@ -104,7 +104,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Once we have the video frames we craft our prompt and send a request to GPT (Note that we don't need to send every frame for GPT to understand what's going on):\n"
+    "Once we have the video frames, we craft our prompt and send a request to GPT (Note that we don't need to send every frame for GPT to understand what's going on):\n"
    ]
   },
   {
@@ -206,7 +206,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Now we can pass the script to the TTS API where it will generate a mp3 of the voiceover:\n"
+    "Now we can pass the script to the TTS API where it will generate an mp3 of the voiceover:\n"
    ]
   },
   {
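
For context, the three markdown cells touched above introduce the notebook's frame-extraction, vision-request, and text-to-speech steps; the code cells themselves fall outside these hunks. A minimal sketch of that flow is below, assuming the OpenAI Python SDK, a vision-capable model such as `gpt-4o`, and illustrative file names, prompt, and sampling stride (none of which come from this diff):

```python
import base64

import cv2  # OpenCV, used for frame extraction as in the first markdown cell
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 1) Read the video and encode each frame as a base64 JPEG string.
video = cv2.VideoCapture("bison_and_wolves.mp4")  # hypothetical local copy of the video
frames = []
while True:
    success, frame = video.read()
    if not success:
        break
    _, buffer = cv2.imencode(".jpg", frame)
    frames.append(base64.b64encode(buffer).decode("utf-8"))
video.release()

# 2) Send a sampled subset of frames to a vision-capable chat model;
#    every frame is not needed for the model to follow the action.
response = client.chat.completions.create(
    model="gpt-4o",  # assumption; the notebook may pin a different vision model
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "These are frames from a nature video. "
                            "Write a short voiceover script in the style of a documentary.",
                },
                *(
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/jpeg;base64,{f}"}}
                    for f in frames[::50]  # illustrative stride: every 50th frame
                ),
            ],
        }
    ],
    max_tokens=400,
)
script = response.choices[0].message.content

# 3) Pass the script to the TTS endpoint to generate an mp3 of the voiceover.
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=script)
speech.stream_to_file("voiceover.mp3")
```

The stride, prompt, voice, and output path here are placeholders; the notebook's own code cells define the actual values used.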