openai-cookbook/examples/Parse_PDF_docs_for_RAG.ipynb

Added a new notebook: "Parse PDF docs for RAG applications" (#1080) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: prestontuggle <97747561+prestontuggle@users.noreply.github.com> Co-authored-by: Shyamal H Anadkat <shyamal@openai.com> Co-authored-by: Simón Fishman <simonpfish@gmail.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: aalmaksour82 <49364099+aalmaksour82@users.noreply.github.com> Co-authored-by: colin-openai <119888926+colin-openai@users.noreply.github.com> Co-authored-by: Michael Wu <mwu1993@users.noreply.github.com> Co-authored-by: Logan Kilpatrick <logan@openai.com> Co-authored-by: Viet Hoang Tran Duong <36019296+viethoangtranduong@users.noreply.github.com> Co-authored-by: Christine Belzie <105683440+CBID2@users.noreply.github.com> Co-authored-by: Eliah Kagan <degeneracypressure@gmail.com> Co-authored-by: recordcrash <recordcrash@users.noreply.github.com> Co-authored-by: Stefano Lottini <hemidactylus@users.noreply.github.com> Co-authored-by: Safa Asgar <70315479+SaFaUU@users.noreply.github.com> Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com> Co-authored-by: Will DePue <will@depue.net> Co-authored-by: ys64 <815824+ys64@users.noreply.github.com> Co-authored-by: Shawn Yuxuan Tong <tongyuxuan361@gmail.com> Co-authored-by: Steven Pousty <steve.pousty@gmail.com> Co-authored-by: Puneet Dhiman <142409038+PuneetDhimanShorthillsAI@users.noreply.github.com> Co-authored-by: Krista Pratico <krpratic@microsoft.com> Co-authored-by: dongqqcom <32085836+dongqqcom@users.noreply.github.com> Co-authored-by: Alvaro Videla <videlalvaro@gmail.com> Co-authored-by: DevilsWorkShop <ashokmanghat@gmail.com> Co-authored-by: Ashok Manghat <amanghat@rmplc.net> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Matthew Jericho Go Sy <69558553+jerichosy@users.noreply.github.com> Co-authored-by: Farzad Sunavala 
<40604067+farzad528@users.noreply.github.com> Co-authored-by: Nirant <NirantK@users.noreply.github.com> Co-authored-by: Cathy Chen <cathykaichen@gmail.com> Co-authored-by: gusmally <hannahmbmoraes@gmail.com> Co-authored-by: Chuong Ho <31106432+chuongmep@users.noreply.github.com> Co-authored-by: ridrisa <138629783+ridrisa@users.noreply.github.com> Co-authored-by: Xin(Leo) Jing <jingxin@berkeley.edu> Co-authored-by: Per Harald Borgen <perhborgen@gmail.com> Co-authored-by: Hoang Viet Khoa <khoahv92@gmail.com> Co-authored-by: Stefano Fiorucci <44616784+anakin87@users.noreply.github.com> Co-authored-by: Albarqawi <barqawi.88@outlook.com> Co-authored-by: Saarika Bhasi <55930906+saarikabhasi@users.noreply.github.com> Co-authored-by: Daniel <10074684+danieltprice@users.noreply.github.com> Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com> Co-authored-by: Jiří Hofman <jiri.hofman@gmail.com> Co-authored-by: Fayaz Rahman <fayazrahman4u@gmail.com> Co-authored-by: Anish Shah <93145909+ash0ts@users.noreply.github.com> Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com> Co-authored-by: Emil Sedgh <emilsedgh@kde.org> Co-authored-by: Megan O'Keefe <3137106+askmeegs@users.noreply.github.com> Co-authored-by: Joschka Braun <47435119+joschkabraun@users.noreply.github.com> Co-authored-by: Roger Zurawicki <zurawiki@users.noreply.github.com> Co-authored-by: pavlovp <pavel.pavlov1990@gmail.com> Co-authored-by: Surav Shrestha <98219089+suravshresth@users.noreply.github.com> Co-authored-by: vrushankportkey <134934501+vrushankportkey@users.noreply.github.com> Co-authored-by: Soonoh <chk0ndanger@gmail.com> Co-authored-by: Mayuresh Dharwadkar <98738585+Mayureshd-18@users.noreply.github.com> Co-authored-by: Yashwant Jodha <76436993+yashwantjodha@users.noreply.github.com> Co-authored-by: Guspan Tanadi <36249910+guspan-tanadi@users.noreply.github.com> Co-authored-by: Ana Martins <60753223+OutSystemsAMM@users.noreply.github.com> Co-authored-by: Greg 
Richardson <greg.nmr@gmail.com> Co-authored-by: john <johnoctubre7@gmail.com> Co-authored-by: John Octubre <johnoctubre@Johns-MacBook-Pro.local> Co-authored-by: jhills20 <70035505+jhills20@users.noreply.github.com> Co-authored-by: Tad <wptady@gmail.com> Co-authored-by: Ilan Bigio <ilanbigio@gmail.com> Co-authored-by: Ilan Bigio <ilan@openai.com> Co-authored-by: royziv11 <103690170+royziv11@users.noreply.github.com> Co-authored-by: Gabor Cselle <gaborcselle@users.noreply.github.com> Co-authored-by: D. Carpintero <6709785+dcarpintero@users.noreply.github.com> Co-authored-by: Ed Spencer <ed@edspencer.net> Co-authored-by: Ravi Theja <ravi03071991@gmail.com> Co-authored-by: dylanra-openai <149511600+dylanra-openai@users.noreply.github.com> Co-authored-by: Taranjeet Singh <reachtotj@gmail.com> Co-authored-by: Frode Jensen <jensen.frode@gmail.com> Co-authored-by: Lionel Cheng <60159831+lionelchg@users.noreply.github.com> Co-authored-by: lionelchg <Cheng.Lionel@bcg.com> Co-authored-by: Jing Ai <42414856+jingairpi@users.noreply.github.com> Co-authored-by: Jing Ai <jingai@jings-air-2020.lan> Co-authored-by: Spring_MT <today.is.sky.blue.sky@gmail.com> Co-authored-by: kevleininger <kevleininger@gmail.com> Co-authored-by: Prakul <discover.prakul@gmail.com> Co-authored-by: Logan Kilpatrick <23kilpatrick23@gmail.com> Co-authored-by: Jiang Yucheng <fatjyc@gmail.com> Co-authored-by: Haomin Liu <644074553@qq.com> Co-authored-by: Xavier Amatriain <xavier.amatriain@gmail.com> Co-authored-by: Caio Curitiba Marcellos <caiocuritiba@gmail.com> Co-authored-by: Kesku <62210496+kesku@users.noreply.github.com> Co-authored-by: markbigears <86395716+markbigears@users.noreply.github.com> Co-authored-by: bigears <mark.forsyth@yourbigears.com> Co-authored-by: Nghiauet <63385521+Nghiauet@users.noreply.github.com> Co-authored-by: Vince Fulco--Bighire.tools <vince@bighire.io> Co-authored-by: Wang22004K <152562528+Wang22004K@users.noreply.github.com> Co-authored-by: Shaurya Rohatgi <shauryr@gmail.com> 
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
{
"cells": [
{
"cell_type": "markdown",
"id": "ec5f67a9",
"metadata": {},
"source": [
"# Parsing PDF documents for RAG applications\n",
"\n",
"This notebook shows how to leverage GPT-4o to turn rich PDF documents such as slide decks or exports from web pages into usable content for your RAG application.\n",
"\n",
"This technique can be used if you have a lot of unstructured data containing valuable information that you want to be able to retrieve as part of your RAG pipeline.\n",
"\n",
"For example, you could build a Knowledge Assistant that could answer user queries about your company or product based on information contained in PDF documents. \n",
"\n",
"The example documents used in this notebook are located at [data/example_pdfs](data/example_pdfs). They are related to OpenAI's APIs and various techniques that can be used as part of LLM projects."
]
},
{
"cell_type": "markdown",
"id": "3e8f6820",
"metadata": {},
"source": [
"## Data preparation\n",
"\n",
"In this section, we will process our input data to prepare it for retrieval.\n",
"\n",
"We will do this in 2 ways:\n",
"\n",
"1. Extracting text with pdfminer\n",
"2. Converting the PDF pages to images to analyze them with GPT-4o\n",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"\n",
"You can skip the 1st method if you want to only use the content inferred from the image analysis."
]
},
{
"cell_type": "markdown",
"id": "74eb2df8",
"metadata": {},
"source": [
"### Setup\n",
"\n",
"We need to install a few libraries to convert the PDF to images and extract the text (optional).\n",
"\n",
"**Note: You need to install `poppler` on your machine for the `pdf2image` library to work. You can follow the instructions to install it [here](https://pypi.org/project/pdf2image/).**"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "6784d73e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install pdf2image -q\n",
"# pdfminer.six is the maintained fork and provides the `pdfminer` module; the legacy `pdfminer` package is not installed alongside it to avoid conflicts\n",
"%pip install pdfminer.six -q\n",
"%pip install openai -q\n",
"%pip install scikit-learn -q\n",
"%pip install rich -q\n",
"%pip install tqdm -q\n",
"%pip install pandas -q"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "f1f08b70",
"metadata": {},
"outputs": [],
"source": [
"# Imports\n",
"from pdf2image import convert_from_path\n",
"from pdf2image.exceptions import (\n",
"    PDFInfoNotInstalledError,\n",
"    PDFPageCountError,\n",
"    PDFSyntaxError\n",
")\n",
"from pdfminer.high_level import extract_text\n",
"import base64\n",
"import io\n",
"import os\n",
"import concurrent.futures\n",
"from tqdm import tqdm\n",
"from openai import OpenAI\n",
"import re\n",
"import pandas as pd\n",
"from sklearn.metrics.pairwise import cosine_similarity\n",
"import json\n",
"import numpy as np\n",
"from rich import print\n",
"from ast import literal_eval"
]
},
{
"cell_type": "markdown",
"id": "6f713d87",
"metadata": {},
"source": [
"### File processing"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "d6696a30",
"metadata": {},
"outputs": [],
"source": [
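"def convert_doc_to_images(path):\n",
"    # Render each page of the PDF at `path` as an image (one per page).\n",
"    images = convert_from_path(path)\n",
"    return images\n",
"\n",
"def extract_text_from_doc(path):\n",
"    # Pull the PDF's embedded text out as a single string.\n",
"    text = extract_text(path)\n",
"    return text"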
]
},
{
"cell_type": "markdown",
"id": "41992d04",
"metadata": {},
"source": [
"#### Testing with an example"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "88ab392d",
"metadata": {},
"outputs": [],
"source": [
"file_path = \"data/example_pdfs/fine-tuning-deck.pdf\"\n",
"\n",
"images = convert_doc_to_images(file_path)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "2575a284",
"metadata": {},
"outputs": [],
"source": [
"text = extract_text_from_doc(file_path)"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "8b2c432a",
"metadata": {},
"outputs": [
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKK
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AADve0lEQVR4Aezdd5glVdkv7DMwpCEMOecoMGQBkRwkSYYXBVFRRBAkqqC8IAIGEAliQhRQxAQiCBhIknPOkgTJOTNk/X5axzr11a69e3ec7uHuP8ZVq1a8dzVe17NXP/V//o8fAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIEC
BAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAg
QIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECBAgAABAgQIECB
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKK
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AAEAAElEQVR4AezddaAV1fr/8auComITgmIhYoGigmDitRHswro2KnZh97W7u8VEQcXGVspCBQNQBCwEBQUs1N/nun7f5667JvbsPHuf8z5/6Jo1a9asec3szTnPrHnmH//gBwEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQ
QQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQ
QQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBB
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAHCAyADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoopVUu4VepOBzigBKK1tf8ADGs+F7qK21qwks5pU8xFdlO5c4zkEjqKi0XQtT8Raiun6TaPdXbKWEaEA4AyTkkCgDOoq5qmlX2ianPp2o27W95A22SJiCVOM9uOhFW9F8L614ihvJtJsHuY7JBJcOGVRGuCcnJH90/lQBkUVqv4b1ePw5H4gaycaVJL5KXO5cF+RjGc9j2rKoAKKKKACiipPIlMJm8p/KBxv2nbn60AR0UVb0zTLzWNSg0/T4DPdztsijBALH054oAqUVa1LTbzSNRn0+/gMF1bvsljJBKn044qrQAUUUUAFFdPovw88V+ItNTUdJ0aW6tHZlWRXQAkHB6kGtD/AIVB4+/6Fu4/7+R//FUAcRRXQXngjxJYa/aaFdaXJFqd2oaC3LrlwSR1zj+E9+1ZOpabeaRqM+n38Bgurd9ksZIJU+nHFAFWiiigAooooAKKKKACipEglkR3SJ2VBlmVSQv19KjoAKKKKACiiigAorR0XQtS8RakmnaTatdXbqWWJWAJAGT1IFU7i3ltLmW2nQpNE5jdT/CwOCPzoAiooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiii
gAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACgUUUAez/FBv+Ej+FPgzxQPmlRPss7D+8Vwc/8AAo2/OqPwhceHtA8W+M3RC1hZi3t9/RpHOcfmEH41Z8IN/wAJH8BPFOiE7p9LkF5CB1C8P/7JJ+dUteP/AAjnwF0HSxhbjXLt76UdzGv3f/aZoAT45WcUviHSfEVsAbbWdPjmDDuygA/+OlK0fDjf8Iz+zxruqcLcazcG1iPQlfufy82qmpn/AISX9njTbzO+58P3xt5D3EbcD/0KP8ql+Lbf2F4J8GeEVwHgtPtdwo/vkYH6mSgC/Y6HqHiL9nnRtL0q2a4u59XYIg4wA0mST0AHcmsJ/gT4ka0lez1HRr25iGXtbe6JkHtyMZ+pFbkGr3mj/sxwPZTPDJc3z2zSIcMEZ2LAH3C4+hNcP8Jbma3+KGheTIyeZP5T7TjcrKcg+ooA5FbK6e+FitvIboyeUIQp3l84249c8Yr0m3+BniIwRtqOp6Lps0gytvdXX7z6HAI/ImoNeutQ0f486hc6HYR3eoJqLm3t2jLh3YegI9Sf1q/rfw58RavrNxrHi3XtB0e5um8147u9G5AeihRngDgDNAHI+J/h7r3hLVrOw1OKEfbGC29xHJuic5A64yMZGQR3r2aH4a63H8Frrwo1xp/2+S/E4cXB8rblTy2OvHTFYHxVt7e1+D3hGC21IanFDOY0vACBKAjDIz2GMD2ArNtSf+GYr3/sLD/0JKAPLtZ0qfQ9Zu9Lumjae1lMTmJtykj0PcVf8Gw6tceL9Mi0K4jt9UabFtLJjarYPJyD79jWFXY/Cr/kqHh//r6H/oJoAqa9o/iLUPH95pV4P7Q16W5McnkYxJJjkjgDGPYCuuT4D+IcLFNq+hQXrDK2j3Z3/Thf5Val8T2fhH9ofVNV1BGa0F1LFIyruKBlxuA74/lmrmsfC+z8X6zd6x4S8Z6bfSXczTi3uJisqMxzjIyeO2QDQB5Z4k8Nar4U1d9M1e2MFwoDDB3K6noykdRWRXefE6Txkt7p1l4wt41mtYTHbzxqCJk4yd4+8cjnuCfeuDoA9y03ULzS/wBmRruwuprW5S+IWWFyjAGYA8ivLv8AhO/Fv/Qy6t/4GP8A417B4V1ey0L9ncahqGkQatbJeurWk5ARyZQATkHp16Vyv/C0/CH/AES7Rv8Avtf/AI3QByPh/UPE/iHxvpTWurSya15gjtbm8kL7DyRywPHJ7d6lv/DviDXfiXcaFfXNvNrs9yySzFtsbSBck5C9MD0rT8C39tqnxr0q+tLCOwt57/fHaxEbYgQflGAP5V0dp/ydE3/YTk/9FmgDA074M+I7xrhru503TbeGd7dZ7y42LMynBKcZIyOvGaxfGfw91zwO8B1JYZba4yIrm3ffGxHOOQCDjnkfSrvxa1e81T4j6ulzMzRWk7W0EZPyxovGAPc8n3NdVdSPd/sx2zXDGVrbVNkRc5KKGOAPb5jQBwfhHwHrvjSaUaXAgt4P9ddTvsij9ie59hmug1n4MeI9M0mbUrO407VreAEyjT597IB14IGce3PtXo174Rku/hB4Z0TT9f03Rre4hF1dm7l8v7UzKGxnuAW5HsvpVDwD4Jn8E+KLfU08c+H3tfu3VvHdj98mOmDxkHBH0oA8Cor0u98E6PrnxK8R2Fv4i0zSrCCTzoJZpFMbhyDtU7gON36Uur/CvStM0a9v4vHuiXcltC0q28TLvlIGdo+fqaAO8+Hfw/1jT/h54mtZZ7Evrlin2UpOSF3Rtjfx8v3x6968ob4a62vjiHwkJrFtRmi81WWYmLG0t97HXAPaus+Fp/4th8Rf+vEf+gSVi/BH/kq+k/7s3/opqAOF1Gxl0zU7vT5ypmtZnhkKHI3KSDg+mRWxc+DdTtfBV
p4rka3/ALOupzBGokPmbgWHK46fIe9Q+MP+R217/sI3H/oxq9C1j/k2fw9/2FH/APQpqAOU074YeJNW0jSNSsIYJ4
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAyAAAAHCCAIAAACYATqfAADXPUlEQVR4Ae3dd4AsRdU28M/wmsWcxRwQEyBgAPQSFRQFMUs0AAZURJBXQfECBkRFiQbEQFIQURQkCBIUBTMI5oAoiDmH1/D97j1YtjO7e2fT7Ozu03/MVFefOnXqqe6qp09VV/2//5cjCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCPwXAte5znX+67zvhECTaYE+qWsjSqBHzGlPzHjJRyp+vtg8X+wcqcod05hlt+mKHocxEyYyCASBIBAE5hCBFfCYObRsQWatp7zuda+raP/617/++c9/LsgyztNCtaph/z/+8Y95WoqYHQSCQBAIAiOCwKgQLLQD4VhzzTXf9KY3/fWvf73e9a73ghe84Pvf/37Fd8GqmLXXXvuNb3zjH//4x5VWWulDH/rQYYcdJsl4/eKb3/zmhz3sYRKedNJJ73jHO2hbunTpox/9aPKvetWrvvCFL/Tn0s1xzsNVtOc///lbb701Znb44Ycr8gTlnY7BpfblL3/5k570pD/84Q+QmVgbeyT5/e9/v/POO//iF78gPBw7J7ZqAVytithggw1e85rXKM7pp5/+hje8YcRv1AUAe4oQBIJAEJgpBK4/U4qmqUc/TcNVV1211lpr3fzmNxd+1KMehWD1D45UzOMf//j1118fFbvhDW/4P//zPzhHv0OIJLW3utWttt1229vf/vZ0vu997/PrWGONNR7zmMcI3O52t/Pbn8syoRk9ypg73/nOsha+8sorv/KVr1TkCvMp8+53v/shhYQ/9alP+a3IFaadrECpfdCDHiSv//u//4PtIBoQrBvf+MYlORw7B7FqQJmqBUzdLae8v/nNby644IK6IQfUMBtiVRF3utOd6ka9+uqrZyOX6AwCQSAIBIFZQmCECJYe5Sc/+Ym+zVv79a9//Yc//OFHH310dTPdwiNSItdZZ52///3vXCx/+9vf7nvf+97tbnf70Y9+1PN+75SP6qEPfai+83e/+91f/vKXc845p1T96U9/ktxVv13lsxcuY3SWxx57rFyOP/74Zz7zmRU5YKbNP9cCAyacgtif//xnuaBN0sJtAg2IiMpCShqSzbwWmC
D5KFxyOynFPe95z5NPPhlfv/TSS90zYip+bi1srw0N3rm1J7kHgSAQBILAgAiMCsFirjERvchFF1206aabok08PfiHmG4/J0aXs/LKKz/4wQ8mc9Ob3pQT6za3uc0jHvEIBItkt9h1ajDxRje6EeVf/epXf/zjH5cGv2II9yTpJp+NsOJwC8ma2YPr19kTNgDHpSf861//2mlFDq5kUpKFD1ciSrrjjjtCaeLsVMo111yjXEjVMO2cVKEmFlYEI86KgHxPLDmcqwW4F4Pvfe97cowHaziwJ5cgEASCwEwhMEIEq3qU888/XyeNPBlpuutd73rFFVd0e/fGmQz5/fa3v7388su5r0QazzItqQeUevs37kOhoZ/PfvazYgQqvkfYKT2OimdM2dMv1hPTTeXSxAkJ4y6OllGPtjFPyxV00EEHHXLIIQRQNL8V2SM/KWN60vafshPbwFz7L00QM4idLXnX4Imha0kEppaqq2HM8PKaWfYz5tWZjewWgeb+e7Lq97TTTvv0pz9NoE77xbpWdXXOOZhdwxIOAkEgCCxCBIbRlwwIa3UeX/va1wwU6uTMnVp99dWl7e/walYKb9D+++/vFZ+M8UTMqcs5dDYU3vKWtzSdCCMhjLqNZwm/hVz0SZLUISySkvGSuFQC3VTSOqXKpW5CwiIdTWGL6UZ2k/SHFcFonaNbzBKbwJgqWr+2AWOWW72sOBWY4LcpHM/OVuSSLJ1d9FaIuYRSOfpTiWwGTCrQrBJoCVsxx4xsYv2BlrDnUsWXNr89RXDbVGRPKqcuVaUXq24C5Ls6hXt0FpjiW5L+gCQE+sEsO0v/xBr6dSYmCASBIBAECoHR8mBpzQ0wmQSzySabaP1NeP/Yxz7WrSrcwowfA4IijfedeeaZBlDMnrnPfe7D4/WNb3yDBn2SqwKEH/CABxhP1OUbYbn44ovF19WmU1/i4DATc8c73pFjTL7cNjxn5iGJdFUP1OQrULkU0bnt8sMopEsS/uxnP+Na60lIQ3WQbWSQSSzpMaYnlwFPy8IyRhEMmN7gBjeQHTNMpa/4BsuAOrtijBwThK7MIGEgOEqyakfYJG7fGQyCOeGWCm/2uYD5UirOhxG/+tWv6uoU8GSSupAcj/HroKRi6rT9DqJ8PJkW3yrLbXyPe9zDIKxL7Hczq6nBcV6O5X/ALP2q3g0AFmUBSw0lj1f1Dcxb3/rWlcrNyVtpnFSRx0vV0EggCASBIBAE5hMCeh3m/u///q/+A0XAn5zqdaoMGn2BBz7wgXoOAu9+97ud7rfffsI6GKsDOC0NLWC5AVf1l6eccopIGkrJiSeeKIl4873EP/GJT/z4xz/+gx/8wLRuvArJw9UOPPDAMb8xLHt0Y3L08fx3vvOdX/7yl6iV4+c///m3vvUtyjfccMPKDnUQeNaznvW5z33OcM+Xv/xlHRjJH/7wh2eccYbk/GpLliwpYb9jHqXEOgiUGOg0O55YRTZwfClJmyKYb64IHHv6bCtQvP71rzfSSr4KPqb+bmQB6KvMwq0IbsulK9kfHs/OylpNsd8nkHe/+92lfexjH6tS4DAg5mUDbv2BD3zgm9/8pntAMf3C/7jjjvPRA50D2lmWl1Wm6KkCVgEWaGrHZDI3nsoS89znPreE2SzGFxhHHXVUQdTNq8J47Qc/+MHzzjvv1FNPLcwLkEqr7BYWoc3EQXc4uo+LuxN8qPHTn/703HPPrWptaiutG4kZ8t1rr72kZXOZvcMOO1Aor7p5VlttNbB8+9vfrtr3+93vfvf973//KqusIlXTWWVxWjG+Jvnwhz8MwAITz/NuY50UzmOSyiILauvloZKUhvwGgSAQBILAPEOgOhUjgF6m9T3cSHe4wx2UoTqV6tjQmuJGWItGf6ONNiKMDegJmmQLmJjFMeDqHnvsIZKGUoUDicTh9DF77723cPegX+ctBh9igFwqVVOr3/r85z/fTdLCfCqSO91ll13I63T97rPPPk1AL+4g1m
IUhEyVXaD/qIIjfJVE90xGZPV5nEColUuVb1MrgEH6xbRMRJOklaI/ixZTeXUJVhXf73hHT9p+OytfdIoxKotPcdd
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=800x450>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKK
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AAEAAElEQVR4AezdB2AkZd3H8We2pCeX670DV+m9N+lFUEABpUlRmiDYUBApIgqIoLwoIipKkaLSpAtK7+0OjnqV6zV927y/2b3b7O3Mbjab3WSTfMcYZp555plnPjObm/3PM89jDBMCCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCC
CAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCC
CAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIAAAggggAACCCCAAAIIIIA
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKK
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AAEAAElEQVR4AezdB5jU1P648Z+C2EUUxQqCICoqIiJgAxGwYcHeC4qKir3ea++994q9KzYUsSCKBRURuyCiCIq99/J/r7n3/GOSmZ2dnd2dnX338eFmTk5OTj7JZO9+z8k3//d//iiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggA
IKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggA
IKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgoooIACCiiggAIKKKCAAgo
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAorofDXgbxJ4ul26Npc08QOGuGGyJfXLnAz7dfavWNI/ZrunQPrPiGKJu8VnCX/8eYj/ANBoA8Gor6eT9nDwiEUPqetlsckTRAE/Ty6X/hnHwf8A9BLXP+/8P/xqgD5gor6f/wCGcfB//QS1z/v/AA//ABqj/hnHwf8A9BLXP+/8P/xqgD5gor6f/wCGcfB//QS1z/v/AA//ABqj/hnHwf8A9BLXP+/8P
/xqgD5gor6f/wCGcfB//QS1z/v/AA//ABqj/hnHwf8A9BLXP+/8P/xqgD5gor6f/wCGcfB//QS1z/v/AA//ABqj/h
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AAEAAElEQVR4Aey9d6BtNdlv/YrYQFBRmgKiolIEEaUdUEBRBEWagocmIF1A6b13ASnSBAEphyK9K6AgUkR6kSIgVaWooIAV9RuX3C83b2ZZc9W99tpj/3FOVmaSmYyZWfLLkyf/8z/+SUACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIA
EJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCUhAAhKQgAQkIAEJSEACEpCABCQgAQlIQAISkIAEJCABCXRG4HWdZTOXBCQgAQlIQAISkIAEJCABCUhAAhIYdwTmnHPOcVdnKywBCUhgYhJ45ZVXnn/++XHXdgX3cXfJrLAEJCABCUhAAsNIYOqpp55nnnnmn3/+ueaaa+655548efIw1tI6SUACEpDABCBwzTXXIE9cdtllTzzxxK9//evxKFX05CpNO+20Cy644Mc+9rFJkybNN998CyywQE+KtRAJSEACEhg8gfBq4412yy23XH311UP+alNwH3wP8YwSkIAEJCABCYwOART2pZZaaqWVVlp22WVHp1W2RAISkIAERojAM88888PX/hApXn311RFqWXlT0Nl5NW+99da+mssBGSsBCUhg/BPg1XbcccddeOGF99577xC2RsF9CC+KVZKABCQgAQlIYNgJzDjjjGusscauu+46yyyzDHtdrZ8EJCABCUjg/ydw1FFHob3feOON/3/ESP3P2/nII490kdlIXVQbIwEJSKAVgbXWWotX21DNKCu4t7poHpeABCQgAQlIQAIJARy/HnDAAQ7mEyQGJSABCUhgnBHAMHDbbbcdNnmiG4hYtdOivffeu5tCzCsBCUhAAuOUwLC91xTcx2lHstoSkIAEJCABCQyawBJLLHHsscfqAXbQ3D2fBCQgAQn0h8CwyR
OdtZI9VFZfffUpU6Z0lt1cEpCABCQwMgR4r335y18ehlVcCu4j06lsiAQkIAEJSEAC/SKA3dyJJ56oVXu/+FquBCQ
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCARlB9ADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigA
ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKK
"image/png": "iVBORw0KGgoAAAANSUhEUgAAB9AAAARlCAIAAAButct1AAEAAElEQVR4AezdBbwc1dnHcZLcuLu7u7sRSAgJCa7BrdDipRQvBQq05S0Up0CLE9wSiBB3d3d3d8/7zz3JYZiZ3bt67+7e320+6eyZM2fOfEcueebsc846ix8EEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAA
EEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAA
EEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQQQAABBBBAAAEEEEAAAQQ
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=2000x1125>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for img in images:\n",
" display(img)"
]
},
{
"cell_type": "markdown",
"id": "be69a779",
"metadata": {},
"source": [
"### Image analysis with GPT-4o\n",
"\n",
    "After converting the PDF file to images, we'll use GPT-4o to analyze the content of each image."
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "9edcbd39",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"metadata": {},
"outputs": [],
"source": [
"# Initializing OpenAI client - see https://platform.openai.com/docs/quickstart?context=python\n",
"client = OpenAI()"
]
},
{
"cell_type": "code",
"execution_count": 76,
"id": "eb376547",
"metadata": {},
"outputs": [],
"source": [
"# Convert an image to a base64-encoded PNG data URI for use with the Chat Completions API\n",
"def get_img_uri(img):\n",
"    png_buffer = io.BytesIO()\n",
"    img.save(png_buffer, format=\"PNG\")\n",
"    png_buffer.seek(0)\n",
"\n",
"    base64_png = base64.b64encode(png_buffer.read()).decode('utf-8')\n",
"\n",
"    data_uri = f\"data:image/png;base64,{base64_png}\"\n",
"    return data_uri"
]
},
{
"cell_type": "code",
"execution_count": 70,
"id": "62a606d7",
"metadata": {},
"outputs": [],
"source": [
"system_prompt = '''\n",
"You will be provided with an image of a PDF page or a slide. Your goal is to deliver a detailed and engaging presentation about the content you see, using clear and accessible language suitable for a 101-level audience.\n",
"\n",
"If there is an identifiable title, start by stating the title to provide context for your audience.\n",
"\n",
"Describe visual elements in detail:\n",
"\n",
"- **Diagrams**: Explain each component and how they interact. For example, \"The process begins with X, which then leads to Y and results in Z.\"\n",
" \n",
"- **Tables**: Break down the information logically. For instance, \"Product A costs X dollars, while Product B is priced at Y dollars.\"\n",
"\n",
"Focus on the content itself rather than the format:\n",
"\n",
"- **DO NOT** include terms referring to the content format.\n",
" \n",
"- **DO NOT** mention the content type. Instead, directly discuss the information presented.\n",
"\n",
"Keep your explanation comprehensive yet concise:\n",
"\n",
"- Be exhaustive in describing the content, as your audience cannot see the image.\n",
" \n",
"- Exclude irrelevant details such as page numbers or the position of elements on the image.\n",
"\n",
"Use clear and accessible language:\n",
"\n",
"- Explain technical terms or concepts in simple language appropriate for a 101-level audience.\n",
"\n",
"Engage with the content:\n",
"\n",
"- Interpret and analyze the information where appropriate, offering insights to help the audience understand its significance.\n",
"\n",
"------\n",
"\n",
"If there is an identifiable title, present the output in the following format:\n",
"\n",
"{TITLE}\n",
"\n",
"{Content description}\n",
"\n",
"If there is no clear title, simply provide the content description.\n",
"'''\n",
"\n",
"def analyze_image(data_uri):\n",
"    response = client.chat.completions.create(\n",
"        model=\"gpt-4o\",\n",
"        messages=[\n",
"            {\"role\": \"system\", \"content\": system_prompt},\n",
"            {\n",
"                \"role\": \"user\",\n",
"                \"content\": [\n",
"                    {\n",
"                        \"type\": \"image_url\",\n",
"                        \"image_url\": {\n",
"                            \"url\": f\"{data_uri}\"\n",
"                        }\n",
"                    }\n",
"                ]\n",
"            },\n",
"        ],\n",
"        max_tokens=500,\n",
"        temperature=0,\n",
" top_p=0.1\n",
" )\n",
" return response.choices[0].message.content"
]
},
{
"cell_type": "markdown",
"id": "ff47a90b",
"metadata": {},
"source": [
"#### Testing with an example"
]
},
{
"cell_type": "code",
"execution_count": 81,
"id": "dea9dc0f",
"metadata": {},
"outputs": [
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAHCAyADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD5/ooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAoopVUu4VepOBzigBKK1tf8ADGs+F7qK21qwks5pU8xFdlO5c4zkEjqKi0XQtT8Raiun6TaPdXbKWEaEA4AyTkkCgDOoq5qmlX2ianPp2o27W95A22SJiCVOM9uOhFW9F8L614ihvJtJsHuY7JBJcOGVRGuCcnJH90/lQBkUVqv4b1ePw5H4gaycaVJL5KXO5cF+RjGc9j2rKoAKKKKACiipPIlMJm8p/KBxv2nbn60AR0UVb0zTLzWNSg0/T4DPdztsijBALH054oAqUVa1LTbzSNRn0+/gMF1bvsljJBKn044qrQAUUUUAFFdPovw88V+ItNTUdJ0aW6tHZlWRXQAkHB6kGtD/AIVB4+/6Fu4/7+R//FUAcRRXQXngjxJYa/aaFdaXJFqd2oaC3LrlwSR1zj+E9+1ZOpabeaRqM+n38Bgurd9ksZIJU+nHFAFWiiigAooooAKKKKACipEglkR3SJ2VBlmVSQv19KjoAKKKKACiiigAorR0XQtS8RakmnaTatdXbqWWJWAJAGT1IFU7i3ltLmW2nQpNE5jdT/CwOCPzoAiooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiii
gAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigAooooAKKKKACgUUUAez/FBv+Ej+FPgzxQPmlRPss7D+8Vwc/8AAo2/OqPwhceHtA8W+M3RC1hZi3t9/RpHOcfmEH41Z8IN/wAJH8BPFOiE7p9LkF5CB1C8P/7JJ+dUteP/AAjnwF0HSxhbjXLt76UdzGv3f/aZoAT45WcUviHSfEVsAbbWdPjmDDuygA/+OlK0fDjf8Iz+zxruqcLcazcG1iPQlfufy82qmpn/AISX9njTbzO+58P3xt5D3EbcD/0KP8ql+Lbf2F4J8GeEVwHgtPtdwo/vkYH6mSgC/Y6HqHiL9nnRtL0q2a4u59XYIg4wA0mST0AHcmsJ/gT4ka0lez1HRr25iGXtbe6JkHtyMZ+pFbkGr3mj/sxwPZTPDJc3z2zSIcMEZ2LAH3C4+hNcP8Jbma3+KGheTIyeZP5T7TjcrKcg+ooA5FbK6e+FitvIboyeUIQp3l84249c8Yr0m3+BniIwRtqOp6Lps0gytvdXX7z6HAI/ImoNeutQ0f486hc6HYR3eoJqLm3t2jLh3YegI9Sf1q/rfw58RavrNxrHi3XtB0e5um8147u9G5AeihRngDgDNAHI+J/h7r3hLVrOw1OKEfbGC29xHJuic5A64yMZGQR3r2aH4a63H8Frrwo1xp/2+S/E4cXB8rblTy2OvHTFYHxVt7e1+D3hGC21IanFDOY0vACBKAjDIz2GMD2ArNtSf+GYr3/sLD/0JKAPLtZ0qfQ9Zu9Lumjae1lMTmJtykj0PcVf8Gw6tceL9Mi0K4jt9UabFtLJjarYPJyD79jWFXY/Cr/kqHh//r6H/oJoAqa9o/iLUPH95pV4P7Q16W5McnkYxJJjkjgDGPYCuuT4D+IcLFNq+hQXrDK2j3Z3/Thf5Val8T2fhH9ofVNV1BGa0F1LFIyruKBlxuA74/lmrmsfC+z8X6zd6x4S8Z6bfSXczTi3uJisqMxzjIyeO2QDQB5Z4k8Nar4U1d9M1e2MFwoDDB3K6noykdRWRXefE6Txkt7p1l4wt41mtYTHbzxqCJk4yd4+8cjnuCfeuDoA9y03ULzS/wBmRruwuprW5S+IWWFyjAGYA8ivLv8AhO/Fv/Qy6t/4GP8A417B4V1ey0L9ncahqGkQatbJeurWk5ARyZQATkHp16Vyv/C0/CH/AES7Rv8Avtf/AI3QByPh/UPE/iHxvpTWurSya15gjtbm8kL7DyRywPHJ7d6lv/DviDXfiXcaFfXNvNrs9yySzFtsbSBck5C9MD0rT8C39tqnxr0q+tLCOwt57/fHaxEbYgQflGAP5V0dp/ydE3/YTk/9FmgDA074M+I7xrhru503TbeGd7dZ7y42LMynBKcZIyOvGaxfGfw91zwO8B1JYZba4yIrm3ffGxHOOQCDjnkfSrvxa1e81T4j6ulzMzRWk7W0EZPyxovGAPc8n3NdVdSPd/sx2zXDGVrbVNkRc5KKGOAPb5jQBwfhHwHrvjSaUaXAgt4P9ddTvsij9ie59hmug1n4MeI9M0mbUrO407VreAEyjT597IB14IGce3PtXo174Rku/hB4Z0TT9f03Rre4hF1dm7l8v7UzKGxnuAW5HsvpVDwD4Jn8E+KLfU08c+H3tfu3VvHdj98mOmDxkHBH0oA8Cor0u98E6PrnxK8R2Fv4i0zSrCCTzoJZpFMbhyDtU7gON36Uur/CvStM0a9v4vHuiXcltC0q28TLvlIGdo+fqaAO8+Hfw/1jT/h54mtZZ7Evrlin2UpOSF3Rtjfx8v3x6968ob4a62vjiHwkJrFtRmi81WWYmLG0t97HXAPaus+Fp/4th8Rf+vEf+gSVi/BH/kq+k/7s3/opqAOF1Gxl0zU7vT5ypmtZnhkKHI3KSDg+mRWxc+DdTtfBV
p4rka3/ALOupzBGokPmbgWHK46fIe9Q+MP+R217/sI3H/oxq9C1j/k2fw9/2FH/APQpqAOU074YeJNW0jSNSsIYJ4
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAyAAAAHCCAIAAACYATqfAADXPUlEQVR4Ae3dd4AsRdU28M/wmsWcxRwQEyBgAPQSFRQFMUs0AAZURJBXQfECBkRFiQbEQFIQURQkCBIUBTMI5oAoiDmH1/D97j1YtjO7e2fT7Ozu03/MVFefOnXqqe6qp09VV/2//5cjCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCASBIBAEgkAQCAJBIAgEgSAQBIJAEAgCQSAIBIEgEASCQBAIAkEgCPwXAte5znX+67zvhECTaYE+qWsjSqBHzGlPzHjJRyp+vtg8X+wcqcod05hlt+mKHocxEyYyCASBIBAE5hCBFfCYObRsQWatp7zuda+raP/617/++c9/LsgyztNCtaph/z/+8Y95WoqYHQSCQBAIAiOCwKgQLLQD4VhzzTXf9KY3/fWvf73e9a73ghe84Pvf/37Fd8GqmLXXXvuNb3zjH//4x5VWWulDH/rQYYcdJsl4/eKb3/zmhz3sYRKedNJJ73jHO2hbunTpox/9aPKvetWrvvCFL/Tn0s1xzsNVtOc///lbb701Znb44Ycr8gTlnY7BpfblL3/5k570pD/84Q+QmVgbeyT5/e9/v/POO//iF78gPBw7J7ZqAVytithggw1e85rXKM7pp5/+hje8YcRv1AUAe4oQBIJAEJgpBK4/U4qmqUc/TcNVV1211lpr3fzmNxd+1KMehWD1D45UzOMf//j1118fFbvhDW/4P//zPzhHv0OIJLW3utWttt1229vf/vZ0vu997/PrWGONNR7zmMcI3O52t/Pbn8syoRk9ypg73/nOsha+8sorv/KVr1TkCvMp8+53v/shhYQ/9alP+a3IFaadrECpfdCDHiSv//u//4PtIBoQrBvf+MYlORw7B7FqQJmqBUzdLae8v/nNby644IK6IQfUMBtiVRF3utOd6ka9+uqrZyOX6AwCQSAIBIFZQmCECJYe5Sc/+Ym+zVv79a9//Yc//OFHH310dTPdwiNSItdZZ52///3vXCx/+9vf7nvf+97tbnf70Y9+1PN+75SP6qEPfai+83e/+91f/vKXc845p1T96U9/ktxVv13lsxcuY3SWxx57rFyOP/74Zz7zmRU5YKbNP9cCAyacgtif//xnuaBN0sJtAg2IiMpCShqSzbwWmC
D5KFxyOynFPe95z5NPPhlfv/TSS90zYip+bi1srw0N3rm1J7kHgSAQBILAgAiMCsFirjERvchFF1206aabok08PfiHmG4/J0aXs/LKKz/4wQ8mc9Ob3pQT6za3uc0jHvEIBItkt9h1ajDxRje6EeVf/epXf/zjH5cGv2II9yTpJp+NsOJwC8ma2YPr19kTNgDHpSf861//2mlFDq5kUpKFD1ciSrrjjjtCaeLsVMo111yjXEjVMO2cVKEmFlYEI86KgHxPLDmcqwW4F4Pvfe97cowHaziwJ5cgEASCwEwhMEIEq3qU888/XyeNPBlpuutd73rFFVd0e/fGmQz5/fa3v7388su5r0QazzItqQeUevs37kOhoZ/PfvazYgQqvkfYKT2OimdM2dMv1hPTTeXSxAkJ4y6OllGPtjFPyxV00EEHHXLIIQRQNL8V2SM/KWN60vafshPbwFz7L00QM4idLXnX4Imha0kEppaqq2HM8PKaWfYz5tWZjewWgeb+e7Lq97TTTvv0pz9NoE77xbpWdXXOOZhdwxIOAkEgCCxCBIbRlwwIa3UeX/va1wwU6uTMnVp99dWl7e/walYKb9D+++/vFZ+M8UTMqcs5dDYU3vKWtzSdCCMhjLqNZwm/hVz0SZLUISySkvGSuFQC3VTSOqXKpW5CwiIdTWGL6UZ2k/SHFcFonaNbzBKbwJgqWr+2AWOWW72sOBWY4LcpHM/OVuSSLJ1d9FaIuYRSOfpTiWwGTCrQrBJoCVsxx4xsYv2BlrDnUsWXNr89RXDbVGRPKqcuVaUXq24C5Ls6hXt0FpjiW5L+gCQE+sEsO0v/xBr6dSYmCASBIBAECoHR8mBpzQ0wmQSzySabaP1NeP/Yxz7WrSrcwowfA4IijfedeeaZBlDMnrnPfe7D4/WNb3yDBn2SqwKEH/CABxhP1OUbYbn44ovF19WmU1/i4DATc8c73pFjTL7cNjxn5iGJdFUP1OQrULkU0bnt8sMopEsS/uxnP+Na60lIQ3WQbWSQSSzpMaYnlwFPy8IyRhEMmN7gBjeQHTNMpa/4BsuAOrtijBwThK7MIGEgOEqyakfYJG7fGQyCOeGWCm/2uYD5UirOhxG/+tWv6uoU8GSSupAcj/HroKRi6rT9DqJ8PJkW3yrLbXyPe9zDIKxL7Hczq6nBcV6O5X/ALP2q3g0AFmUBSw0lj1f1Dcxb3/rWlcrNyVtpnFSRx0vV0EggCASBIBAE5hMCeh3m/u///q/+A0XAn5zqdaoMGn2BBz7wgXoOAu9+97ud7rfffsI6GKsDOC0NLWC5AVf1l6eccopIGkrJiSeeKIl4873EP/GJT/z4xz/+gx/8wLRuvArJw9UOPPDAMb8xLHt0Y3L08fx3vvOdX/7yl6iV4+c///m3vvUtyjfccMPKDnUQeNaznvW5z33OcM+Xv/xlHRjJH/7wh2eccYbk/GpLliwpYb9jHqXEOgiUGOg0O55YRTZwfClJmyKYb64IHHv6bCtQvP71rzfSSr4KPqb+bmQB6KvMwq0IbsulK9kfHs/OylpNsd8nkHe/+92lfexjH6tS4DAg5mUDbv2BD3zgm9/8pntAMf3C/7jjjvPRA50D2lmWl1Wm6KkCVgEWaGrHZDI3nsoS89znPreE2SzGFxhHHXVUQdTNq8J47Qc/+MHzzjvv1FNPLcwLkEqr7BYWoc3EQXc4uo+LuxN8qPHTn/703HPPrWptaiutG4kZ8t1rr72kZXOZvcMOO1Aor7p5VlttNbB8+9vfrtr3+93vfvf973//KqusIlXTWWVxWjG+Jvnwhz8MwAITz/NuY50UzmOSyiILauvloZKUhvwGgSAQBILAPEOgOhUjgF6m9T3cSHe4wx2UoTqV6tjQmuJGWItGf6ONNiKMDegJmmQLmJjFMeDqHnvsIZKGUoUDicTh9DF77723cPegX+ctBh9igFwqVVOr3/r85z/fTdLCfCqSO91ll13I63T97rPPPk1AL+4g1m
IUhEyVXaD/qIIjfJVE90xGZPV5nEColUuVb1MrgEH6xbRMRJOklaI/ixZTeXUJVhXf73hHT9p+OytfdIoxKotPcdd
"text/plain": [
"<PIL.PpmImagePlugin.PpmImageFile image mode=RGB size=800x450>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"source": [
"img = images[2]\n",
"display(img)\n",
"data_uri = get_img_uri(img)"
]
},
{
"cell_type": "code",
"execution_count": 82,
"id": "a272d1e9",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What is Fine-tuning\n",
"\n",
"Fine-tuning is a process where a pre-existing model, known as a public model, is trained using specific training \n",
"data. This involves providing the model with a set of input/output examples to learn from. The goal is to adjust \n",
"the model so that it can respond accurately to similar inputs in the future.\n",
"\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better equipped to handle specific tasks or datasets.\n",
"\n",
"For effective fine-tuning, it is recommended to use between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples, although the minimum requirement is\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> examples. This ensures the model has enough data to learn from and improve its performance.\n",
"</pre>\n"
],
"text/plain": [
"What is Fine-tuning\n",
"\n",
"Fine-tuning is a process where a pre-existing model, known as a public model, is trained using specific training \n",
"data. This involves providing the model with a set of input/output examples to learn from. The goal is to adjust \n",
"the model so that it can respond accurately to similar inputs in the future.\n",
"\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better equipped to handle specific tasks or datasets.\n",
"\n",
"For effective fine-tuning, it is recommended to use between \u001b[1;36m50\u001b[0m to \u001b[1;36m100\u001b[0m examples, although the minimum requirement is\n",
"\u001b[1;36m10\u001b[0m examples. This ensures the model has enough data to learn from and improve its performance.\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"res = analyze_image(data_uri)\n",
"print(res)"
]
},
{
"cell_type": "markdown",
"id": "35b8efaf",
"metadata": {},
"source": [
"#### Processing all documents"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "f8d3f8c5",
"metadata": {},
"outputs": [],
"source": [
"files_path = \"data/example_pdfs\"\n",
"\n",
"all_items = os.listdir(files_path)\n",
"files = [item for item in all_items if os.path.isfile(os.path.join(files_path, item))]"
]
},
{
"cell_type": "code",
"execution_count": 84,
"id": "26d1021a",
"metadata": {},
"outputs": [],
"source": [
"def analyze_doc_image(img):\n",
"    # Encode the page image as a URI, then send it to the model for analysis.\n",
"    img_uri = get_img_uri(img)\n",
"    data = analyze_image(img_uri)\n",
"    return data"
]
},
{
"cell_type": "markdown",
"id": "4170b84f",
"metadata": {},
"source": [
"We will list all files in the example folder and process them by:\n",
"1. Extracting the text\n",
"2. Converting the docs to images\n",
"3. Analyzing pages with GPT-4o\n",
"\n",
"Note: This takes about 2 minutes to run. Feel free to skip this step and load the result file directly (see below)."
]
},
{
"cell_type": "code",
"execution_count": 85,
"id": "dace007b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Analyzing pages for doc rag-deck.pdf\n",
"</pre>\n"
],
"text/plain": [
"Analyzing pages for doc rag-deck.pdf\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 19/19 [00:20<00:00, 1.07s/it]\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Analyzing pages for doc models-page.pdf\n",
"</pre>\n"
],
"text/plain": [
"Analyzing pages for doc models-page.pdf\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 9/9 [00:15<00:00, 1.76s/it]\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Analyzing pages for doc evals-decks.pdf\n",
"</pre>\n"
],
"text/plain": [
"Analyzing pages for doc evals-decks.pdf\n"
]
},
"metadata": {},
"output_type": "display_data"
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 12/12 [00:12<00:00, 1.08s/it]\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Analyzing pages for doc fine-tuning-deck.pdf\n",
"</pre>\n"
],
"text/plain": [
"Analyzing pages for doc fine-tuning-deck.pdf\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 6/6 [00:07<00:00, 1.31s/it]\n"
]
}
],
"source": [
"docs = []\n",
"\n",
"for f in files:\n",
"\n",
"    path = f\"{files_path}/{f}\"\n",
"    doc = {\n",
"        \"filename\": f\n",
"    }\n",
"    text = extract_text_from_doc(path)\n",
"    doc['text'] = text\n",
"    imgs = convert_doc_to_images(path)\n",
"    pages_description = []\n",
"\n",
"    print(f\"Analyzing pages for doc {f}\")\n",
"\n",
"    # Analyze the page images concurrently\n",
"    with concurrent.futures.ThreadPoolExecutor(max_workers=8) as executor:\n",
"\n",
"        # Skip the 1st slide, as it's usually just an intro\n",
"        futures = [\n",
"            executor.submit(analyze_doc_image, img)\n",
"            for img in imgs[1:]\n",
"        ]\n",
"\n",
"        # Advance the progress bar as each page finishes\n",
"        with tqdm(total=len(imgs)-1) as pbar:\n",
"            for _ in concurrent.futures.as_completed(futures):\n",
"                pbar.update(1)\n",
"\n",
"        # Collect results in submission order so descriptions stay in page order\n",
"        for future in futures:\n",
"            res = future.result()\n",
"            pages_description.append(res)\n",
"\n",
"    doc['pages_description'] = pages_description\n",
"    docs.append(doc)"
]
},
{
"cell_type": "code",
"execution_count": 86,
"id": "b733ba0a",
"metadata": {},
"outputs": [],
"source": [
"# Saving result to file for later\n",
"json_path = \"data/parsed_pdf_docs.json\"\n",
"\n",
"with open(json_path, 'w') as f:\n",
"    json.dump(docs, f)"
]
},
{
"cell_type": "code",
"execution_count": 87,
"id": "535770e3",
"metadata": {},
"outputs": [],
"source": [
"# Optional: load content from the saved file\n",
"with open(json_path, 'r') as f:\n",
"    docs = json.load(f)"
]
},
{
"cell_type": "markdown",
"id": "e507ee4e",
"metadata": {},
"source": [
"### Embedding content\n",
"Before embedding the content, we will chunk it logically by page.\n",
"For real-world scenarios, you could explore more advanced ways to chunk the content:\n",
"- Splitting each page into smaller pieces\n",
"- Prepending metadata, such as the slide title, deck title and/or document description, to each piece of content, so that each chunk carries its own context when retrieved independently\n",
"\n",
"For the sake of brevity, we will use a very simple chunking strategy and rely on the form feed separator (`\\f`) to split the text by page."
]
},
{
"cell_type": "code",
"execution_count": 88,
"id": "05ffb8f8",
"metadata": {},
"outputs": [],
"source": [
"# Chunk content by page, merging each slide's text with its description when the titles match\n",
"content = []\n",
"for doc in docs:\n",
"    # Split the text on form feeds (page breaks), dropping the first slide as well\n",
"    text = doc['text'].split('\\f')[1:]\n",
"    description = doc['pages_description']\n",
"    description_indexes = set()\n",
"    for slide_text in text:\n",
"        slide_content = slide_text + '\\n'\n",
"        # Try to find a description whose first line matches the slide title\n",
"        slide_title = slide_text.split('\\n')[0]\n",
"        for j, page_description in enumerate(description):\n",
"            description_title = page_description.split('\\n')[0]\n",
"            if slide_title.lower() == description_title.lower():\n",
"                # Drop only the leading title so the rest of the description stays intact\n",
"                slide_content += page_description.replace(description_title, '', 1)\n",
"                # Keep track of the descriptions already merged\n",
"                description_indexes.add(j)\n",
"        # Add the slide text plus any matching description to the content pieces\n",
"        content.append(slide_content)\n",
"    # Add the descriptions that were not matched to any slide\n",
"    for j, page_description in enumerate(description):\n",
"        if j not in description_indexes:\n",
"            content.append(page_description)"
]
},
{
"cell_type": "code",
"execution_count": 90,
"id": "9f972358",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"\n",
"Retrieval-Augmented Generation \n",
"enhances the capabilities of language \n",
"models by combining them with a \n",
"retrieval system. This allows the model \n",
"to leverage external knowledge sources \n",
"to generate more accurate and \n",
"contextually relevant responses.\n",
"\n",
"Example use cases\n",
"\n",
"- Provide answers with up-to-date \n",
"\n",
"information\n",
"\n",
"- Generate contextual responses\n",
"\n",
"What well cover\n",
"\n",
"● Technical patterns\n",
"\n",
"● Best practices\n",
"\n",
"● Common pitfalls\n",
"\n",
"● Resources\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"\n",
"Retrieval-Augmented Generation \n",
"enhances the capabilities of language \n",
"models by combining them with a \n",
"retrieval system. This allows the model \n",
"to leverage external knowledge sources \n",
"to generate more accurate and \n",
"contextually relevant responses.\n",
"\n",
"Example use cases\n",
"\n",
"- Provide answers with up-to-date \n",
"\n",
"information\n",
"\n",
"- Generate contextual responses\n",
"\n",
"What well cover\n",
"\n",
"● Technical patterns\n",
"\n",
"● Best practices\n",
"\n",
"● Common pitfalls\n",
"\n",
"● Resources\n",
"\n",
"\u001b[1;36m3\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What is RAG\n",
"\n",
"Retrieve information to Augment the models knowledge and Generate the output\n",
"\n",
"“What is your \n",
"return policy?”\n",
"\n",
"ask\n",
"\n",
"result\n",
"\n",
"search\n",
"\n",
"LLM\n",
"\n",
"return information\n",
"\n",
"Total refunds: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>% of value vouchers: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"$<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> discount on next order: &gt; <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"“You can get a full refund up \n",
"to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days after the \n",
"purchase, then up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days \n",
"you would get a voucher for \n",
"half the value of your order”\n",
"\n",
"Knowledge \n",
"Base <span style=\"color: #800080; text-decoration-color: #800080\">/</span> External \n",
"sources\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"\n",
"\n",
"\n",
"RAG stands for <span style=\"color: #008000; text-decoration-color: #008000\">\"Retrieve information to Augment the models knowledge and Generate the output.\"</span> This process \n",
"involves using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> to enhance its responses by accessing external information sources.\n",
"\n",
"Here's how it works:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **User Query**: A user asks a question, such as <span style=\"color: #008000; text-decoration-color: #008000\">\"What is your return policy?\"</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **LLM Processing**: The language model receives the question and initiates a search for relevant information.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Information Retrieval**: The LLM accesses a knowledge base or external sources to find the necessary details. \n",
"In this example, the information retrieved includes:\n",
" - Total refunds available from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days.\n",
" - <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>% value vouchers for returns between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days.\n",
" - A $<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> discount on the next order for returns after <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Response Generation**: The LLM uses the retrieved information to generate a coherent response for the user. \n",
"For instance, it might say, <span style=\"color: #008000; text-decoration-color: #008000\">\"You can get a full refund up to 14 days after the purchase, then up to 30 days you </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">would get a voucher for half the value of your order.\"</span>\n",
"\n",
"This method allows the model to provide accurate and up-to-date answers by leveraging external data sources.\n",
"</pre>\n"
],
"text/plain": [
"What is RAG\n",
"\n",
"Retrieve information to Augment the models knowledge and Generate the output\n",
"\n",
"“What is your \n",
"return policy?”\n",
"\n",
"ask\n",
"\n",
"result\n",
"\n",
"search\n",
"\n",
"LLM\n",
"\n",
"return information\n",
"\n",
"Total refunds: \u001b[1;36m0\u001b[0m-\u001b[1;36m14\u001b[0m days\n",
"\u001b[1;36m50\u001b[0m% of value vouchers: \u001b[1;36m14\u001b[0m-\u001b[1;36m30\u001b[0m days\n",
"$\u001b[1;36m5\u001b[0m discount on next order: > \u001b[1;36m30\u001b[0m days\n",
"\n",
"“You can get a full refund up \n",
"to \u001b[1;36m14\u001b[0m days after the \n",
"purchase, then up to \u001b[1;36m30\u001b[0m days \n",
"you would get a voucher for \n",
"half the value of your order”\n",
"\n",
"Knowledge \n",
"Base \u001b[35m/\u001b[0m External \n",
"sources\n",
"\n",
"\u001b[1;36m4\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"RAG stands for \u001b[32m\"Retrieve information to Augment the models knowledge and Generate the output.\"\u001b[0m This process \n",
"involves using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m to enhance its responses by accessing external information sources.\n",
"\n",
"Here's how it works:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **User Query**: A user asks a question, such as \u001b[32m\"What is your return policy?\"\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **LLM Processing**: The language model receives the question and initiates a search for relevant information.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Information Retrieval**: The LLM accesses a knowledge base or external sources to find the necessary details. \n",
"In this example, the information retrieved includes:\n",
" - Total refunds available from \u001b[1;36m0\u001b[0m to \u001b[1;36m14\u001b[0m days.\n",
" - \u001b[1;36m50\u001b[0m% value vouchers for returns between \u001b[1;36m14\u001b[0m to \u001b[1;36m30\u001b[0m days.\n",
" - A $\u001b[1;36m5\u001b[0m discount on the next order for returns after \u001b[1;36m30\u001b[0m days.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Response Generation**: The LLM uses the retrieved information to generate a coherent response for the user. \n",
"For instance, it might say, \u001b[32m\"You can get a full refund up to 14 days after the purchase, then up to 30 days you \u001b[0m\n",
"\u001b[32mwould get a voucher for half the value of your order.\"\u001b[0m\n",
"\n",
"This method allows the model to provide accurate and up-to-date answers by leveraging external data sources.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to use RAG\n",
"\n",
"Good for ✅\n",
"\n",
"Not good for ❌\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Introducing new information to the model \n",
"\n",
"●\n",
"\n",
"Teaching the model a specific format, style, \n",
"\n",
"to update its knowledge\n",
"\n",
"Reducing hallucinations by controlling \n",
"\n",
"content\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span>!\\ Hallucinations can still happen with RAG\n",
"\n",
"or language\n",
"➔ Use fine-tuning or custom models instead\n",
"\n",
"●\n",
"\n",
"Reducing token usage\n",
"➔ Consider fine-tuning depending on the use \n",
"\n",
"case\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>\n",
"\n",
"\n",
"\n",
"\n",
"**Good for:**\n",
"\n",
"- **Introducing new information to the model:** RAG <span style=\"font-weight: bold\">(</span>Retrieval-Augmented Generation<span style=\"font-weight: bold\">)</span> is effective for updating a \n",
"model's knowledge by incorporating new data.\n",
"\n",
"- **Reducing hallucinations by controlling content:** While RAG can help minimize hallucinations, it's important to\n",
"note that they can still occur.\n",
"\n",
"**Not good for:**\n",
"\n",
"- **Teaching the model a specific format, style, or language:** For these tasks, it's better to use fine-tuning or \n",
"custom models.\n",
"\n",
"- **Reducing token usage:** If token usage is a concern, consider fine-tuning based on the specific use case.\n",
"</pre>\n"
],
"text/plain": [
"When to use RAG\n",
"\n",
"Good for ✅\n",
"\n",
"Not good for ❌\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Introducing new information to the model \n",
"\n",
"●\n",
"\n",
"Teaching the model a specific format, style, \n",
"\n",
"to update its knowledge\n",
"\n",
"Reducing hallucinations by controlling \n",
"\n",
"content\n",
"\n",
"\u001b[35m/\u001b[0m!\\ Hallucinations can still happen with RAG\n",
"\n",
"or language\n",
"➔ Use fine-tuning or custom models instead\n",
"\n",
"●\n",
"\n",
"Reducing token usage\n",
"➔ Consider fine-tuning depending on the use \n",
"\n",
"case\n",
"\n",
"\u001b[1;36m5\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"**Good for:**\n",
"\n",
"- **Introducing new information to the model:** RAG \u001b[1m(\u001b[0mRetrieval-Augmented Generation\u001b[1m)\u001b[0m is effective for updating a \n",
"model's knowledge by incorporating new data.\n",
"\n",
"- **Reducing hallucinations by controlling content:** While RAG can help minimize hallucinations, it's important to\n",
"note that they can still occur.\n",
"\n",
"**Not good for:**\n",
"\n",
"- **Teaching the model a specific format, style, or language:** For these tasks, it's better to use fine-tuning or \n",
"custom models.\n",
"\n",
"- **Reducing token usage:** If token usage is a concern, consider fine-tuning based on the specific use case.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"\n",
"Data preparation\n",
"\n",
"Input processing\n",
"\n",
"Retrieval\n",
"\n",
"Answer Generation\n",
"\n",
"● Chunking\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Embeddings\n",
"\n",
"Augmenting \n",
"content\n",
"\n",
"●\n",
"\n",
"Input \n",
"augmentation\n",
"\n",
"● NER\n",
"\n",
"●\n",
"\n",
"Search\n",
"\n",
"● Context window\n",
"\n",
"● Multi-step \n",
"retrieval\n",
"\n",
"● Optimisation\n",
"\n",
"●\n",
"\n",
"Safety checks\n",
"\n",
"●\n",
"\n",
"Embeddings\n",
"\n",
"● Re-ranking\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"\n",
"Data preparation\n",
"\n",
"Input processing\n",
"\n",
"Retrieval\n",
"\n",
"Answer Generation\n",
"\n",
"● Chunking\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Embeddings\n",
"\n",
"Augmenting \n",
"content\n",
"\n",
"●\n",
"\n",
"Input \n",
"augmentation\n",
"\n",
"● NER\n",
"\n",
"●\n",
"\n",
"Search\n",
"\n",
"● Context window\n",
"\n",
"● Multi-step \n",
"retrieval\n",
"\n",
"● Optimisation\n",
"\n",
"●\n",
"\n",
"Safety checks\n",
"\n",
"●\n",
"\n",
"Embeddings\n",
"\n",
"● Re-ranking\n",
"\n",
"\u001b[1;36m6\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation\n",
"\n",
"chunk documents into multiple \n",
"pieces for easier consumption\n",
"\n",
"content\n",
"\n",
"embeddings\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
"\n",
"Augment content \n",
"using LLMs\n",
"\n",
"Ex: parse text only, ask gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> to rephrase &amp; \n",
"summarize each part, generate bullet points…\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Pre-process content for LLM \n",
"consumption: \n",
"Add summary, headers for each \n",
"part, etc.\n",
"+ curate relevant data sources\n",
"\n",
"Knowledge \n",
"Base\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Having too much low-quality \n",
"\n",
"content\n",
"\n",
"➔ Having too large documents\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation\n",
"\n",
"chunk documents into multiple \n",
"pieces for easier consumption\n",
"\n",
"content\n",
"\n",
"embeddings\n",
"\n",
"\u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
"\n",
"\u001b[1;36m0.876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
"\n",
"\u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
"\n",
"Augment content \n",
"using LLMs\n",
"\n",
"Ex: parse text only, ask gpt-\u001b[1;36m4\u001b[0m to rephrase & \n",
"summarize each part, generate bullet points…\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Pre-process content for LLM \n",
"consumption: \n",
"Add summary, headers for each \n",
"part, etc.\n",
"+ curate relevant data sources\n",
"\n",
"Knowledge \n",
"Base\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Having too much low-quality \n",
"\n",
"content\n",
"\n",
"➔ Having too large documents\n",
"\n",
"\u001b[1;36m7\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: chunking\n",
"\n",
"Why chunking?\n",
"\n",
"If your system doesnt require \n",
"entire documents to provide \n",
"relevant answers, you can \n",
"chunk them into multiple pieces \n",
"for easier consumption <span style=\"font-weight: bold\">(</span>reduced \n",
"cost &amp; latency<span style=\"font-weight: bold\">)</span>.\n",
"\n",
"Other approaches: graphs or \n",
"map-reduce\n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Overlap:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should chunks be independent or overlap one \n",
"another?\n",
"If they overlap, by how much?\n",
"\n",
"●\n",
"\n",
"Size of chunks: \n",
"\n",
"○ What is the optimal chunk size for my use case?\n",
"○\n",
"\n",
"Do I want to include a lot in the context window or \n",
"just the minimum?\n",
"\n",
"● Where to chunk:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should I chunk every N tokens or use specific \n",
"separators? \n",
"Is there a logical way to split the context that would \n",
"help the retrieval process?\n",
"\n",
"● What to return:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should I return chunks across multiple documents \n",
"or top chunks within the same doc?\n",
"Should chunks be linked together with metadata to \n",
"indicate common properties?\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: chunking\n",
"\n",
"Why chunking?\n",
"\n",
"If your system doesnt require \n",
"entire documents to provide \n",
"relevant answers, you can \n",
"chunk them into multiple pieces \n",
"for easier consumption \u001b[1m(\u001b[0mreduced \n",
"cost & latency\u001b[1m)\u001b[0m.\n",
"\n",
"Other approaches: graphs or \n",
"map-reduce\n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Overlap:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should chunks be independent or overlap one \n",
"another?\n",
"If they overlap, by how much?\n",
"\n",
"●\n",
"\n",
"Size of chunks: \n",
"\n",
"○ What is the optimal chunk size for my use case?\n",
"○\n",
"\n",
"Do I want to include a lot in the context window or \n",
"just the minimum?\n",
"\n",
"● Where to chunk:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should I chunk every N tokens or use specific \n",
"separators? \n",
"Is there a logical way to split the context that would \n",
"help the retrieval process?\n",
"\n",
"● What to return:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Should I return chunks across multiple documents \n",
"or top chunks within the same doc?\n",
"Should chunks be linked together with metadata to \n",
"indicate common properties?\n",
"\n",
"\u001b[1;36m8\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: embeddings\n",
"\n",
"What to embed?\n",
"\n",
"Depending on your use case \n",
"you might not want just to \n",
"embed the text in the \n",
"documents but metadata as well \n",
"- anything that will make it easier \n",
"to surface this specific chunk or \n",
"document when performing a \n",
"search\n",
"\n",
"Examples\n",
"\n",
"Embedding Q&amp;A posts in a forum\n",
"You might want to embed the title of the posts, \n",
"the text of the original question and the content of \n",
"the top answers.\n",
"Additionally, if the posts are tagged by topic or \n",
"with keywords, you can embed those too.\n",
"\n",
"Embedding product specs\n",
"In additional to embedding the text contained in \n",
"documents describing the products, you might \n",
"want to add metadata that you have on the \n",
"product such as the color, size, etc. in your \n",
"embeddings.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">9</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: embeddings\n",
"\n",
"What to embed?\n",
"\n",
"Depending on your use case \n",
"you might not want just to \n",
"embed the text in the \n",
"documents but metadata as well \n",
"- anything that will make it easier \n",
"to surface this specific chunk or \n",
"document when performing a \n",
"search\n",
"\n",
"Examples\n",
"\n",
"Embedding Q&A posts in a forum\n",
"You might want to embed the title of the posts, \n",
"the text of the original question and the content of \n",
"the top answers.\n",
"Additionally, if the posts are tagged by topic or \n",
"with keywords, you can embed those too.\n",
"\n",
"Embedding product specs\n",
"In additional to embedding the text contained in \n",
"documents describing the products, you might \n",
"want to add metadata that you have on the \n",
"product such as the color, size, etc. in your \n",
"embeddings.\n",
"\n",
"\u001b[1;36m9\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: augmenting content\n",
"\n",
"What does “Augmenting \n",
"content” mean?\n",
"\n",
"Augmenting content refers to \n",
"modifications of the original content \n",
"to make it more digestible for a \n",
"system relying on RAG. The \n",
"modifications could be a change in \n",
"format, wording, or adding \n",
"descriptive content such as \n",
"summaries or keywords.\n",
"\n",
"Example approaches\n",
"\n",
"Make it a guide*\n",
"Reformat the content to look more like \n",
"a step-by-step guide with clear \n",
"headings and bullet-points, as this \n",
"format is more easily understandable \n",
"by an LLM.\n",
"\n",
"Add descriptive metadata*\n",
"Consider adding keywords or text that \n",
"users might search for when thinking \n",
"of a specific product or service.\n",
"\n",
"Multimodality\n",
"Leverage models \n",
"such as Whisper or \n",
"GPT-4V to \n",
"transform audio or \n",
"visual content into \n",
"text.\n",
"For example, you \n",
"can use GPT-4V to \n",
"generate tags for \n",
"images or to \n",
"describe slides.\n",
"\n",
"* GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can do this for you with the right prompt\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: augmenting content\n",
"\n",
"What does “Augmenting \n",
"content” mean?\n",
"\n",
"Augmenting content refers to \n",
"modifications of the original content \n",
"to make it more digestible for a \n",
"system relying on RAG. The \n",
"modifications could be a change in \n",
"format, wording, or adding \n",
"descriptive content such as \n",
"summaries or keywords.\n",
"\n",
"Example approaches\n",
"\n",
"Make it a guide*\n",
"Reformat the content to look more like \n",
"a step-by-step guide with clear \n",
"headings and bullet-points, as this \n",
"format is more easily understandable \n",
"by an LLM.\n",
"\n",
"Add descriptive metadata*\n",
"Consider adding keywords or text that \n",
"users might search for when thinking \n",
"of a specific product or service.\n",
"\n",
"Multimodality\n",
"Leverage models \n",
"such as Whisper or \n",
"GPT-4V to \n",
"transform audio or \n",
"visual content into \n",
"text.\n",
"For example, you \n",
"can use GPT-4V to \n",
"generate tags for \n",
"images or to \n",
"describe slides.\n",
"\n",
"* GPT-\u001b[1;36m4\u001b[0m can do this for you with the right prompt\n",
"\n",
"\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing\n",
"\n",
"Process input according to task\n",
"\n",
"Q&amp;A\n",
"HyDE: Ask LLM to hypothetically answer the \n",
"question &amp; use the answer to search the KB\n",
"\n",
"embeddings\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
"\n",
"Content search\n",
"Prompt LLM to rephrase input &amp; optionally add \n",
"more context\n",
"\n",
"query\n",
"\n",
"SELECT * from items…\n",
"\n",
"DB search\n",
"NER: Find relevant entities to be used for a \n",
"keyword search or to construct a search query\n",
"\n",
"keywords\n",
"\n",
"red\n",
"\n",
"summer\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Consider how to transform the \n",
"input to match content in the \n",
"database\n",
"Consider using metadata to \n",
"augment the user input\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Comparing directly the input \n",
"to the database without \n",
"considering the task \n",
"specificities \n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">11</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing\n",
"\n",
"Process input according to task\n",
"\n",
"Q&A\n",
"HyDE: Ask LLM to hypothetically answer the \n",
"question & use the answer to search the KB\n",
"\n",
"embeddings\n",
"\n",
"\u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
"\n",
"\u001b[1;36m0.876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
"\n",
"Content search\n",
"Prompt LLM to rephrase input & optionally add \n",
"more context\n",
"\n",
"query\n",
"\n",
"SELECT * from items…\n",
"\n",
"DB search\n",
"NER: Find relevant entities to be used for a \n",
"keyword search or to construct a search query\n",
"\n",
"keywords\n",
"\n",
"red\n",
"\n",
"summer\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Consider how to transform the \n",
"input to match content in the \n",
"database\n",
"Consider using metadata to \n",
"augment the user input\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Comparing directly the input \n",
"to the database without \n",
"considering the task \n",
"specificities \n",
"\n",
"\u001b[1;36m11\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing: input augmentation\n",
"\n",
"What is input augmentation?\n",
"\n",
"Example approaches\n",
"\n",
"Augmenting the input means turning \n",
"it into something different, either \n",
"rephrasing it, splitting it in several \n",
"inputs or expanding it.\n",
"This helps boost performance as \n",
"the LLM might understand better \n",
"the user intent.\n",
"\n",
"Query \n",
"expansion*\n",
"Rephrase the \n",
"query to be \n",
"more \n",
"descriptive\n",
"\n",
"HyDE*\n",
"Hypothetically \n",
"answer the \n",
"question &amp; use \n",
"the answer to \n",
"search the KB\n",
"\n",
"Splitting a query in N*\n",
"When there is more than <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> question or \n",
"intent in a user query, consider \n",
"splitting it in several queries\n",
"\n",
"Fallback\n",
"Consider \n",
"implementing a \n",
"flow where the LLM \n",
"can ask for \n",
"clarification when \n",
"there is not enough \n",
"information in the \n",
"original user query \n",
"to get a result\n",
"<span style=\"font-weight: bold\">(</span>Especially relevant \n",
"with tool usage<span style=\"font-weight: bold\">)</span>\n",
"\n",
"* GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can do this for you with the right prompt\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing: input augmentation\n",
"\n",
"What is input augmentation?\n",
"\n",
"Example approaches\n",
"\n",
"Augmenting the input means turning \n",
"it into something different, either \n",
"rephrasing it, splitting it in several \n",
"inputs or expanding it.\n",
"This helps boost performance as \n",
"the LLM might understand better \n",
"the user intent.\n",
"\n",
"Query \n",
"expansion*\n",
"Rephrase the \n",
"query to be \n",
"more \n",
"descriptive\n",
"\n",
"HyDE*\n",
"Hypothetically \n",
"answer the \n",
"question & use \n",
"the answer to \n",
"search the KB\n",
"\n",
"Splitting a query in N*\n",
"When there is more than \u001b[1;36m1\u001b[0m question or \n",
"intent in a user query, consider \n",
"splitting it in several queries\n",
"\n",
"Fallback\n",
"Consider \n",
"implementing a \n",
"flow where the LLM \n",
"can ask for \n",
"clarification when \n",
"there is not enough \n",
"information in the \n",
"original user query \n",
"to get a result\n",
"\u001b[1m(\u001b[0mEspecially relevant \n",
"with tool usage\u001b[1m)\u001b[0m\n",
"\n",
"* GPT-\u001b[1;36m4\u001b[0m can do this for you with the right prompt\n",
"\n",
"\u001b[1;36m12\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing: NER\n",
"\n",
"Why use NER?\n",
"\n",
"Using NER <span style=\"font-weight: bold\">(</span>Named Entity \n",
"Recognition<span style=\"font-weight: bold\">)</span> allows to extract \n",
"relevant entities from the input, that \n",
"can then be used for more \n",
"deterministic search queries. \n",
"This can be useful when the scope \n",
"is very constrained.\n",
"\n",
"Example\n",
"\n",
"Searching for movies\n",
"If you have a structured database containing \n",
"metadata on movies, you can extract genre, \n",
"actors or directors names, etc. from the user \n",
"query and use this to search the database\n",
"\n",
"Note: You can use exact values or embeddings after \n",
"having extracted the relevant entities\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing: NER\n",
"\n",
"Why use NER?\n",
"\n",
"Using NER \u001b[1m(\u001b[0mNamed Entity \n",
"Recognition\u001b[1m)\u001b[0m allows to extract \n",
"relevant entities from the input, that \n",
"can then be used for more \n",
"deterministic search queries. \n",
"This can be useful when the scope \n",
"is very constrained.\n",
"\n",
"Example\n",
"\n",
"Searching for movies\n",
"If you have a structured database containing \n",
"metadata on movies, you can extract genre, \n",
"actors or directors names, etc. from the user \n",
"query and use this to search the database\n",
"\n",
"Note: You can use exact values or embeddings after \n",
"having extracted the relevant entities\n",
"\n",
"\u001b[1;36m13\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval\n",
"\n",
"re-ranking\n",
"\n",
"INPUT\n",
"\n",
"embeddings\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
"\n",
"query\n",
"\n",
"SELECT * from items…\n",
"\n",
"keywords\n",
"\n",
"red\n",
"\n",
"summer\n",
"\n",
"Semantic \n",
"search\n",
"\n",
"RESULTS\n",
"\n",
"RESULTS\n",
"\n",
"vector DB\n",
"\n",
"relational <span style=\"color: #800080; text-decoration-color: #800080\">/</span> \n",
"nosql db\n",
"\n",
"FINAL RESULT\n",
"\n",
"Used to \n",
"generate output\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Use a combination of semantic \n",
"search and deterministic queries \n",
"where possible\n",
"\n",
"+ Cache output where possible\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ The wrong elements could be \n",
"compared when looking at \n",
"text similarity, that is why \n",
"re-ranking is important\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval\n",
"\n",
"re-ranking\n",
"\n",
"INPUT\n",
"\n",
"embeddings\n",
"\n",
"\u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
"\n",
"\u001b[1;36m0.876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
"\n",
"query\n",
"\n",
"SELECT * from items…\n",
"\n",
"keywords\n",
"\n",
"red\n",
"\n",
"summer\n",
"\n",
"Semantic \n",
"search\n",
"\n",
"RESULTS\n",
"\n",
"RESULTS\n",
"\n",
"vector DB\n",
"\n",
"relational \u001b[35m/\u001b[0m \n",
"nosql db\n",
"\n",
"FINAL RESULT\n",
"\n",
"Used to \n",
"generate output\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Use a combination of semantic \n",
"search and deterministic queries \n",
"where possible\n",
"\n",
"+ Cache output where possible\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ The wrong elements could be \n",
"compared when looking at \n",
"text similarity, that is why \n",
"re-ranking is important\n",
"\n",
"\u001b[1;36m14\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: search\n",
"\n",
"How to search?\n",
"\n",
"Semantic search\n",
"\n",
"Keyword search\n",
"\n",
"Search query\n",
"\n",
"There are many different \n",
"approaches to search depending on \n",
"the use case and the existing \n",
"system.\n",
"\n",
"Using embeddings, you \n",
"can perform semantic \n",
"searches. You can \n",
"compare embeddings \n",
"with what is in your \n",
"database and find the \n",
"most similar.\n",
"\n",
"If you have extracted \n",
"specific entities or \n",
"keywords to search for, \n",
"you can search for these \n",
"in your database.\n",
"\n",
"Based on the extracted \n",
"entities you have or the \n",
"user input as is, you can \n",
"construct search queries \n",
"<span style=\"font-weight: bold\">(</span>SQL, cypher…<span style=\"font-weight: bold\">)</span> and use \n",
"these queries to search \n",
"your database.\n",
"\n",
"You can use a hybrid approach and combine several of these.\n",
"You can perform multiple searches in parallel or in sequence, or \n",
"search for keywords with their embeddings for example.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">15</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: search\n",
"\n",
"How to search?\n",
"\n",
"Semantic search\n",
"\n",
"Keyword search\n",
"\n",
"Search query\n",
"\n",
"There are many different \n",
"approaches to search depending on \n",
"the use case and the existing \n",
"system.\n",
"\n",
"Using embeddings, you \n",
"can perform semantic \n",
"searches. You can \n",
"compare embeddings \n",
"with what is in your \n",
"database and find the \n",
"most similar.\n",
"\n",
"If you have extracted \n",
"specific entities or \n",
"keywords to search for, \n",
"you can search for these \n",
"in your database.\n",
"\n",
"Based on the extracted \n",
"entities you have or the \n",
"user input as is, you can \n",
"construct search queries \n",
"\u001b[1m(\u001b[0mSQL, cypher…\u001b[1m)\u001b[0m and use \n",
"these queries to search \n",
"your database.\n",
"\n",
"You can use a hybrid approach and combine several of these.\n",
"You can perform multiple searches in parallel or in sequence, or \n",
"search for keywords with their embeddings for example.\n",
"\n",
"\u001b[1;36m15\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: multi-step retrieval\n",
"\n",
"What is multi-step retrieval?\n",
"\n",
"In some cases, there might be \n",
"several actions to be performed to \n",
"get the required information to \n",
"generate an answer.\n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Framework to be used:\n",
"\n",
"○ When there are multiple steps to perform, \n",
"consider whether you want to handle this \n",
"yourself or use a framework to make it easier\n",
"\n",
"●\n",
"\n",
"Cost &amp; Latency:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Performing multiple steps at the retrieval \n",
"stage can increase latency and cost \n",
"significantly\n",
"Consider performing actions in parallel to \n",
"reduce latency\n",
"\n",
"●\n",
"\n",
"Chain of Thought:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Guide the assistant with the chain of thought \n",
"approach: break down instructions into \n",
"several steps, with clear guidelines on \n",
"whether to continue, stop or do something \n",
"else. \n",
"This is more appropriate when tasks need to \n",
"be performed sequentially - for example: “if \n",
"this didnt work, then do this”\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: multi-step retrieval\n",
"\n",
"What is multi-step retrieval?\n",
"\n",
"In some cases, there might be \n",
"several actions to be performed to \n",
"get the required information to \n",
"generate an answer.\n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Framework to be used:\n",
"\n",
"○ When there are multiple steps to perform, \n",
"consider whether you want to handle this \n",
"yourself or use a framework to make it easier\n",
"\n",
"●\n",
"\n",
"Cost & Latency:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Performing multiple steps at the retrieval \n",
"stage can increase latency and cost \n",
"significantly\n",
"Consider performing actions in parallel to \n",
"reduce latency\n",
"\n",
"●\n",
"\n",
"Chain of Thought:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"Guide the assistant with the chain of thought \n",
"approach: break down instructions into \n",
"several steps, with clear guidelines on \n",
"whether to continue, stop or do something \n",
"else. \n",
"This is more appropriate when tasks need to \n",
"be performed sequentially - for example: “if \n",
"this didnt work, then do this”\n",
"\n",
"\u001b[1;36m16\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: re-ranking\n",
"\n",
"What is re-ranking?\n",
"\n",
"Example approaches\n",
"\n",
"Re-ranking means re-ordering the \n",
"results of the retrieval process to \n",
"surface more relevant results.\n",
"This is particularly important when \n",
"doing semantic searches.\n",
"\n",
"Rule-based re-ranking\n",
"You can use metadata to rank results by relevance. For \n",
"example, you can look at the recency of the documents, at \n",
"tags, specific keywords in the title, etc.\n",
"\n",
"Re-ranking algorithms\n",
"There are several existing algorithms/approaches you can use \n",
"based on your use case: BERT-based re-rankers, \n",
"cross-encoder re-ranking, TF-IDF algorithms…\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">17</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: re-ranking\n",
"\n",
"What is re-ranking?\n",
"\n",
"Example approaches\n",
"\n",
"Re-ranking means re-ordering the \n",
"results of the retrieval process to \n",
"surface more relevant results.\n",
"This is particularly important when \n",
"doing semantic searches.\n",
"\n",
"Rule-based re-ranking\n",
"You can use metadata to rank results by relevance. For \n",
"example, you can look at the recency of the documents, at \n",
"tags, specific keywords in the title, etc.\n",
"\n",
"Re-ranking algorithms\n",
"There are several existing algorithms/approaches you can use \n",
"based on your use case: BERT-based re-rankers, \n",
"cross-encoder re-ranking, TF-IDF algorithms…\n",
"\n",
"\u001b[1;36m17\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation\n",
"\n",
"FINAL RESULT\n",
"\n",
"Piece of content \n",
"retrieved\n",
"\n",
"LLM\n",
"\n",
"Prompt including \n",
"the content\n",
"\n",
"User sees the \n",
"final result\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Evaluate performance after each \n",
"experimentation to assess if its \n",
"worth exploring other paths\n",
"+ Implement guardrails if applicable\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Going for fine-tuning without \n",
"trying other approaches\n",
"➔ Not paying attention to the \n",
"way the model is prompted\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">18</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation\n",
"\n",
"FINAL RESULT\n",
"\n",
"Piece of content \n",
"retrieved\n",
"\n",
"LLM\n",
"\n",
"Prompt including \n",
"the content\n",
"\n",
"User sees the \n",
"final result\n",
"\n",
"BEST PRACTICES\n",
"\n",
"Evaluate performance after each \n",
"experimentation to assess if its \n",
"worth exploring other paths\n",
"+ Implement guardrails if applicable\n",
"\n",
"COMMON PITFALLS\n",
"\n",
"➔ Going for fine-tuning without \n",
"trying other approaches\n",
"➔ Not paying attention to the \n",
"way the model is prompted\n",
"\n",
"\u001b[1;36m18\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: context window\n",
"\n",
"How to manage context?\n",
"\n",
"Depending on your use case, there are \n",
"several things to consider when \n",
"including retrieved content into the \n",
"context window to generate an answer. \n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Context window max size:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"There is a maximum size, so putting too \n",
"much content is not ideal\n",
"In conversation use cases, the \n",
"conversation will be part of the context \n",
"as well and will add to that size\n",
"\n",
"●\n",
"\n",
"Cost &amp; Latency vs Accuracy:\n",
"\n",
"○ More context results in increased \n",
"\n",
"latency and additional costs since there \n",
"will be more input tokens\n",
"Less context might also result in \n",
"decreased accuracy\n",
"\n",
"○\n",
"\n",
"●\n",
"\n",
"“Lost in the middle” problem:\n",
"\n",
"○ When there is too much context, LLMs \n",
"tend to forget the text “in the middle” of \n",
"the content and might look over some \n",
"important information.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">19</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: context window\n",
"\n",
"How to manage context?\n",
"\n",
"Depending on your use case, there are \n",
"several things to consider when \n",
"including retrieved content into the \n",
"context window to generate an answer. \n",
"\n",
"Things to consider\n",
"\n",
"●\n",
"\n",
"Context window max size:\n",
"\n",
"○\n",
"\n",
"○\n",
"\n",
"There is a maximum size, so putting too \n",
"much content is not ideal\n",
"In conversation use cases, the \n",
"conversation will be part of the context \n",
"as well and will add to that size\n",
"\n",
"●\n",
"\n",
"Cost & Latency vs Accuracy:\n",
"\n",
"○ More context results in increased \n",
"\n",
"latency and additional costs since there \n",
"will be more input tokens\n",
"Less context might also result in \n",
"decreased accuracy\n",
"\n",
"○\n",
"\n",
"●\n",
"\n",
"“Lost in the middle” problem:\n",
"\n",
"○ When there is too much context, LLMs \n",
"tend to forget the text “in the middle” of \n",
"the content and might look over some \n",
"important information.\n",
"\n",
"\u001b[1;36m19\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: optimisation\n",
"\n",
"How to optimise?\n",
"\n",
"There are a few different \n",
"methods to consider when \n",
"optimising a RAG application.\n",
"Try them from left to right, and \n",
"iterate with several of these \n",
"approaches if needed.\n",
"\n",
"Prompt Engineering\n",
"\n",
"Few-shot examples\n",
"\n",
"Fine-tuning\n",
"\n",
"At each point of the \n",
"process, experiment with \n",
"different prompts to get \n",
"the expected input format \n",
"or generate a relevant \n",
"output.\n",
"Try guiding the model if \n",
"the process to get to the \n",
"final outcome contains \n",
"several steps.\n",
"\n",
"If the model doesnt \n",
"behave as expected, \n",
"provide examples of what \n",
"you want e.g. provide \n",
"example user inputs and \n",
"the expected processing \n",
"format.\n",
"\n",
"If giving a few examples \n",
"isnt enough, consider \n",
"fine-tuning a model with \n",
"more examples for each \n",
"step of the process: you \n",
"can fine-tune to get a \n",
"specific input processing \n",
"or output format.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">20</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: optimisation\n",
"\n",
"How to optimise?\n",
"\n",
"There are a few different \n",
"methods to consider when \n",
"optimising a RAG application.\n",
"Try them from left to right, and \n",
"iterate with several of these \n",
"approaches if needed.\n",
"\n",
"Prompt Engineering\n",
"\n",
"Few-shot examples\n",
"\n",
"Fine-tuning\n",
"\n",
"At each point of the \n",
"process, experiment with \n",
"different prompts to get \n",
"the expected input format \n",
"or generate a relevant \n",
"output.\n",
"Try guiding the model if \n",
"the process to get to the \n",
"final outcome contains \n",
"several steps.\n",
"\n",
"If the model doesnt \n",
"behave as expected, \n",
"provide examples of what \n",
"you want e.g. provide \n",
"example user inputs and \n",
"the expected processing \n",
"format.\n",
"\n",
"If giving a few examples \n",
"isnt enough, consider \n",
"fine-tuning a model with \n",
"more examples for each \n",
"step of the process: you \n",
"can fine-tune to get a \n",
"specific input processing \n",
"or output format.\n",
"\n",
"\u001b[1;36m20\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: safety checks\n",
"\n",
"Why include safety checks?\n",
"\n",
"Just because you provide the model \n",
"with <span style=\"font-weight: bold\">(</span>supposedly<span style=\"font-weight: bold\">)</span> relevant context \n",
"doesnt mean the answer will \n",
"systematically be truthful or on-point.\n",
"Depending on the use case, you \n",
"might want to double-check. \n",
"\n",
"Example evaluation framework: RAGAS\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: safety checks\n",
"\n",
"Why include safety checks?\n",
"\n",
"Just because you provide the model \n",
"with \u001b[1m(\u001b[0msupposedly\u001b[1m)\u001b[0m relevant context \n",
"doesnt mean the answer will \n",
"systematically be truthful or on-point.\n",
"Depending on the use case, you \n",
"might want to double-check. \n",
"\n",
"Example evaluation framework: RAGAS\n",
"\n",
"\u001b[1;36m21\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Overview**\n",
"\n",
"Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> enhances language models by integrating them with a retrieval system. This \n",
"combination allows the model to access external knowledge sources, resulting in more accurate and contextually \n",
"relevant responses. \n",
"\n",
"**Example Use Cases:**\n",
"- Providing answers with up-to-date information\n",
"- Generating contextual responses\n",
"\n",
"**What Well Cover:**\n",
"- Technical patterns\n",
"- Best practices\n",
"- Common pitfalls\n",
"- Resources\n",
"</pre>\n"
],
"text/plain": [
"**Overview**\n",
"\n",
"Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m enhances language models by integrating them with a retrieval system. This \n",
"combination allows the model to access external knowledge sources, resulting in more accurate and contextually \n",
"relevant responses. \n",
"\n",
"**Example Use Cases:**\n",
"- Providing answers with up-to-date information\n",
"- Generating contextual responses\n",
"\n",
"**What Well Cover:**\n",
"- Technical patterns\n",
"- Best practices\n",
"- Common pitfalls\n",
"- Resources\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns**\n",
"\n",
"This image outlines four key technical patterns involved in data processing and answer generation:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Data Preparation**\n",
" - **Chunking**: Breaking down data into smaller, manageable pieces.\n",
" - **Embeddings**: Converting data into numerical formats that can be easily processed by machine learning \n",
"models.\n",
" - **Augmenting Content**: Enhancing data with additional information to improve its quality or usefulness.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Input Processing**\n",
" - **Input Augmentation**: Adding extra data or features to the input to improve model performance.\n",
" - **NER <span style=\"font-weight: bold\">(</span>Named Entity Recognition<span style=\"font-weight: bold\">)</span>**: Identifying and classifying key entities in the text, such as names, \n",
"dates, and locations.\n",
" - **Embeddings**: Similar to data preparation, embeddings are used here to represent input data in a format \n",
"suitable for processing.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Retrieval**\n",
" - **Search**: Locating relevant information from a dataset.\n",
" - **Multi-step Retrieval**: Using multiple steps or methods to refine the search process and improve accuracy.\n",
" - **Re-ranking**: Adjusting the order of retrieved results based on relevance or other criteria.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Answer Generation**\n",
" - **Context Window**: Using a specific portion of data to generate relevant answers.\n",
" - **Optimisation**: Improving the efficiency and accuracy of the answer generation process.\n",
" - **Safety Checks**: Ensuring that the generated answers are safe and appropriate for use.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns**\n",
"\n",
"This image outlines four key technical patterns involved in data processing and answer generation:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Data Preparation**\n",
" - **Chunking**: Breaking down data into smaller, manageable pieces.\n",
" - **Embeddings**: Converting data into numerical formats that can be easily processed by machine learning \n",
"models.\n",
" - **Augmenting Content**: Enhancing data with additional information to improve its quality or usefulness.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Input Processing**\n",
" - **Input Augmentation**: Adding extra data or features to the input to improve model performance.\n",
" - **NER \u001b[1m(\u001b[0mNamed Entity Recognition\u001b[1m)\u001b[0m**: Identifying and classifying key entities in the text, such as names, \n",
"dates, and locations.\n",
" - **Embeddings**: Similar to data preparation, embeddings are used here to represent input data in a format \n",
"suitable for processing.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Retrieval**\n",
" - **Search**: Locating relevant information from a dataset.\n",
" - **Multi-step Retrieval**: Using multiple steps or methods to refine the search process and improve accuracy.\n",
" - **Re-ranking**: Adjusting the order of retrieved results based on relevance or other criteria.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Answer Generation**\n",
" - **Context Window**: Using a specific portion of data to generate relevant answers.\n",
" - **Optimisation**: Improving the efficiency and accuracy of the answer generation process.\n",
" - **Safety Checks**: Ensuring that the generated answers are safe and appropriate for use.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation**\n",
"\n",
"This presentation focuses on the process of preparing data for easier consumption by large language models <span style=\"font-weight: bold\">(</span>LLMs<span style=\"font-weight: bold\">)</span>. \n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Content Chunking**: \n",
" - Documents are divided into smaller, manageable pieces. This makes it easier for LLMs to process the \n",
"information.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Embeddings**:\n",
" - Each chunk of content is converted into embeddings, which are numerical representations <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span> that capture the semantic meaning of the text. These embeddings are then stored in a knowledge base.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Augmenting Content**:\n",
" - Content can be enhanced using LLMs. For example, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can be used to rephrase, summarize, and generate bullet\n",
"points from the text.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Best Practices**:\n",
" - Pre-process content for LLM consumption by adding summaries and headers for each part.\n",
" - Curate relevant data sources to ensure quality and relevance.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **Common Pitfalls**:\n",
" - Avoid having too much low-quality content.\n",
" - Ensure documents are not too large, as this can hinder processing efficiency.\n",
"\n",
"This approach helps in organizing and optimizing data for better performance and understanding by LLMs.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation**\n",
"\n",
"This presentation focuses on the process of preparing data for easier consumption by large language models \u001b[1m(\u001b[0mLLMs\u001b[1m)\u001b[0m. \n",
"\n",
"\u001b[1;36m1\u001b[0m. **Content Chunking**: \n",
" - Documents are divided into smaller, manageable pieces. This makes it easier for LLMs to process the \n",
"information.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Embeddings**:\n",
" - Each chunk of content is converted into embeddings, which are numerical representations \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \n",
"\u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m that capture the semantic meaning of the text. These embeddings are then stored in a knowledge base.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Augmenting Content**:\n",
" - Content can be enhanced using LLMs. For example, GPT-\u001b[1;36m4\u001b[0m can be used to rephrase, summarize, and generate bullet\n",
"points from the text.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Best Practices**:\n",
" - Pre-process content for LLM consumption by adding summaries and headers for each part.\n",
" - Curate relevant data sources to ensure quality and relevance.\n",
"\n",
"\u001b[1;36m5\u001b[0m. **Common Pitfalls**:\n",
" - Avoid having too much low-quality content.\n",
" - Ensure documents are not too large, as this can hinder processing efficiency.\n",
"\n",
"This approach helps in organizing and optimizing data for better performance and understanding by LLMs.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation - Chunking**\n",
"\n",
"**Why Chunking?**\n",
"\n",
"Chunking is a technique used when your system doesn't need entire documents to provide relevant answers. By \n",
"breaking documents into smaller pieces, you can make data easier to process, which reduces cost and latency. This \n",
"approach is beneficial for systems that need to handle large volumes of data efficiently. Other methods for data \n",
"preparation include using graphs or map-reduce.\n",
"\n",
"**Things to Consider**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Overlap:**\n",
" - Should chunks be independent or overlap with one another?\n",
" - If they overlap, by how much should they do so?\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Size of Chunks:**\n",
" - What is the optimal chunk size for your specific use case?\n",
" - Do you want to include a lot of information in the context window, or just the minimum necessary?\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Where to Chunk:**\n",
" - Should you chunk every N tokens or use specific separators?\n",
" - Is there a logical way to split the context that would aid the retrieval process?\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **What to Return:**\n",
" - Should you return chunks across multiple documents or focus on top chunks within the same document?\n",
" - Should chunks be linked together with metadata to indicate common properties?\n",
"\n",
"These considerations help in designing an efficient chunking strategy that aligns with your system's requirements \n",
"and goals.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation - Chunking**\n",
"\n",
"**Why Chunking?**\n",
"\n",
"Chunking is a technique used when your system doesn't need entire documents to provide relevant answers. By \n",
"breaking documents into smaller pieces, you can make data easier to process, which reduces cost and latency. This \n",
"approach is beneficial for systems that need to handle large volumes of data efficiently. Other methods for data \n",
"preparation include using graphs or map-reduce.\n",
"\n",
"**Things to Consider**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Overlap:**\n",
" - Should chunks be independent or overlap with one another?\n",
" - If they overlap, by how much should they do so?\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Size of Chunks:**\n",
" - What is the optimal chunk size for your specific use case?\n",
" - Do you want to include a lot of information in the context window, or just the minimum necessary?\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Where to Chunk:**\n",
" - Should you chunk every N tokens or use specific separators?\n",
" - Is there a logical way to split the context that would aid the retrieval process?\n",
"\n",
"\u001b[1;36m4\u001b[0m. **What to Return:**\n",
" - Should you return chunks across multiple documents or focus on top chunks within the same document?\n",
" - Should chunks be linked together with metadata to indicate common properties?\n",
"\n",
"These considerations help in designing an efficient chunking strategy that aligns with your system's requirements \n",
"and goals.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Data Preparation - Embeddings\n",
"\n",
"## What to Embed?\n",
"\n",
"When preparing data for embedding, it's important to consider not just the text but also the metadata. This \n",
"approach can enhance the searchability and relevance of the data. Here are some examples:\n",
"\n",
"### Examples\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Embedding Q&amp;A Posts in a Forum**\n",
" - You might want to include the title of the posts, the original question, and the top answers.\n",
" - Additionally, if the posts are tagged by topic or keywords, these can be embedded as well.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Embedding Product Specs**\n",
" - Besides embedding the text from product descriptions, you can add metadata such as color, size, and other \n",
"specifications to your embeddings.\n",
"\n",
"By embedding both text and metadata, you can improve the ability to surface specific chunks or documents during a \n",
"search.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Data Preparation - Embeddings\n",
"\n",
"## What to Embed?\n",
"\n",
"When preparing data for embedding, it's important to consider not just the text but also the metadata. This \n",
"approach can enhance the searchability and relevance of the data. Here are some examples:\n",
"\n",
"### Examples\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Embedding Q&A Posts in a Forum**\n",
" - You might want to include the title of the posts, the original question, and the top answers.\n",
" - Additionally, if the posts are tagged by topic or keywords, these can be embedded as well.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Embedding Product Specs**\n",
" - Besides embedding the text from product descriptions, you can add metadata such as color, size, and other \n",
"specifications to your embeddings.\n",
"\n",
"By embedding both text and metadata, you can improve the ability to surface specific chunks or documents during a \n",
"search.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation - Augmenting Content**\n",
"\n",
"**What does “Augmenting content” mean?**\n",
"\n",
"Augmenting content involves modifying the original material to make it more accessible and understandable for \n",
"systems that rely on Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span>. These modifications can include changes in format, \n",
"wording, or the addition of descriptive elements like summaries or keywords.\n",
"\n",
"**Example Approaches:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Make it a Guide:**\n",
" - Reformat the content into a step-by-step guide with clear headings and bullet points. This structure is more \n",
"easily understood by a Language Learning Model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span>. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can assist with this transformation using the right \n",
"prompts.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Add Descriptive Metadata:**\n",
" - Incorporate keywords or text that users might search for when considering a specific product or service. This \n",
"helps in making the content more searchable and relevant.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Multimodality:**\n",
" - Utilize models like Whisper or GPT-4V to convert audio or visual content into text. For instance, GPT-4V can \n",
"generate tags for images or describe slides, enhancing the content's accessibility and utility.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation - Augmenting Content**\n",
"\n",
"**What does “Augmenting content” mean?**\n",
"\n",
"Augmenting content involves modifying the original material to make it more accessible and understandable for \n",
"systems that rely on Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m. These modifications can include changes in format, \n",
"wording, or the addition of descriptive elements like summaries or keywords.\n",
"\n",
"**Example Approaches:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Make it a Guide:**\n",
" - Reformat the content into a step-by-step guide with clear headings and bullet points. This structure is more \n",
"easily understood by a Language Learning Model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m. GPT-\u001b[1;36m4\u001b[0m can assist with this transformation using the right \n",
"prompts.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Add Descriptive Metadata:**\n",
" - Incorporate keywords or text that users might search for when considering a specific product or service. This \n",
"helps in making the content more searchable and relevant.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Multimodality:**\n",
" - Utilize models like Whisper or GPT-4V to convert audio or visual content into text. For instance, GPT-4V can \n",
"generate tags for images or describe slides, enhancing the content's accessibility and utility.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Input Processing**\n",
"\n",
"This slide discusses methods for processing input data according to specific tasks, focusing on three main areas: \n",
"Q&amp;A, content search, and database <span style=\"font-weight: bold\">(</span>DB<span style=\"font-weight: bold\">)</span> search.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Q&amp;A**: \n",
" - Uses a technique called HyDE, where a large language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> is asked to hypothetically answer a question.\n",
"This answer is then used to search the knowledge base <span style=\"font-weight: bold\">(</span>KB<span style=\"font-weight: bold\">)</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Content Search**:\n",
" - Involves prompting the LLM to rephrase the input and optionally add more context to improve search results.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **DB Search**:\n",
" - Utilizes Named Entity Recognition <span style=\"font-weight: bold\">(</span>NER<span style=\"font-weight: bold\">)</span> to find relevant entities. These entities are then used for keyword \n",
"searches or to construct a search query.\n",
"\n",
"The slide also highlights different output formats:\n",
"- **Embeddings**: Numerical representations of data, such as vectors <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span>.\n",
"- **Query**: SQL-like statements for database searches <span style=\"font-weight: bold\">(</span>e.g., SELECT * from items<span style=\"font-weight: bold\">)</span>.\n",
"- **Keywords**: Specific terms extracted from the input <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008000; text-decoration-color: #008000\">\"red,\"</span> <span style=\"color: #008000; text-decoration-color: #008000\">\"summer\"</span><span style=\"font-weight: bold\">)</span>.\n",
"\n",
"**Best Practices**:\n",
"- Transform the input to match the content in the database.\n",
"- Use metadata to enhance user input.\n",
"\n",
"**Common Pitfalls**:\n",
"- Avoid directly comparing input to the database without considering the specific requirements of the task.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Input Processing**\n",
"\n",
"This slide discusses methods for processing input data according to specific tasks, focusing on three main areas: \n",
"Q&A, content search, and database \u001b[1m(\u001b[0mDB\u001b[1m)\u001b[0m search.\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Q&A**: \n",
" - Uses a technique called HyDE, where a large language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m is asked to hypothetically answer a question.\n",
"This answer is then used to search the knowledge base \u001b[1m(\u001b[0mKB\u001b[1m)\u001b[0m.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Content Search**:\n",
" - Involves prompting the LLM to rephrase the input and optionally add more context to improve search results.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **DB Search**:\n",
" - Utilizes Named Entity Recognition \u001b[1m(\u001b[0mNER\u001b[1m)\u001b[0m to find relevant entities. These entities are then used for keyword \n",
"searches or to construct a search query.\n",
"\n",
"The slide also highlights different output formats:\n",
"- **Embeddings**: Numerical representations of data, such as vectors \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m.\n",
"- **Query**: SQL-like statements for database searches \u001b[1m(\u001b[0me.g., SELECT * from items\u001b[1m)\u001b[0m.\n",
"- **Keywords**: Specific terms extracted from the input \u001b[1m(\u001b[0me.g., \u001b[32m\"red,\"\u001b[0m \u001b[32m\"summer\"\u001b[0m\u001b[1m)\u001b[0m.\n",
"\n",
"**Best Practices**:\n",
"- Transform the input to match the content in the database.\n",
"- Use metadata to enhance user input.\n",
"\n",
"**Common Pitfalls**:\n",
"- Avoid directly comparing input to the database without considering the specific requirements of the task.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Input Processing - Input Augmentation**\n",
"\n",
"**What is input augmentation?**\n",
"\n",
"Input augmentation involves transforming the input into something different, such as rephrasing it, splitting it \n",
"into several inputs, or expanding it. This process enhances performance by helping the language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> better \n",
"understand the user's intent.\n",
"\n",
"**Example Approaches:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Query Expansion**\n",
" - Rephrase the query to make it more descriptive. This helps the LLM grasp the context and details more \n",
"effectively.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **HyDE**\n",
" - Hypothetically answer the question and use that answer to search the knowledge base <span style=\"font-weight: bold\">(</span>KB<span style=\"font-weight: bold\">)</span>. This approach can \n",
"provide more relevant results by anticipating possible answers.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Splitting a Query in N**\n",
" - When a user query contains multiple questions or intents, consider dividing it into several queries. This \n",
"ensures each part is addressed thoroughly.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Fallback**\n",
" - Implement a flow where the LLM can ask for clarification if the original query lacks sufficient information. \n",
"This is particularly useful when using tools that require precise input.\n",
"\n",
"*Note: GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can perform these tasks with the appropriate prompt.*\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Input Processing - Input Augmentation**\n",
"\n",
"**What is input augmentation?**\n",
"\n",
"Input augmentation involves transforming the input into something different, such as rephrasing it, splitting it \n",
"into several inputs, or expanding it. This process enhances performance by helping the language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m better \n",
"understand the user's intent.\n",
"\n",
"**Example Approaches:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Query Expansion**\n",
" - Rephrase the query to make it more descriptive. This helps the LLM grasp the context and details more \n",
"effectively.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **HyDE**\n",
" - Hypothetically answer the question and use that answer to search the knowledge base \u001b[1m(\u001b[0mKB\u001b[1m)\u001b[0m. This approach can \n",
"provide more relevant results by anticipating possible answers.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Splitting a Query in N**\n",
" - When a user query contains multiple questions or intents, consider dividing it into several queries. This \n",
"ensures each part is addressed thoroughly.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Fallback**\n",
" - Implement a flow where the LLM can ask for clarification if the original query lacks sufficient information. \n",
"This is particularly useful when using tools that require precise input.\n",
"\n",
"*Note: GPT-\u001b[1;36m4\u001b[0m can perform these tasks with the appropriate prompt.*\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Input Processing - NER\n",
"\n",
"**Why use NER?**\n",
"\n",
"Named Entity Recognition <span style=\"font-weight: bold\">(</span>NER<span style=\"font-weight: bold\">)</span> is a technique used to extract relevant entities from input data. This process is \n",
"beneficial for creating more deterministic search queries, especially when the scope is very constrained. By \n",
"identifying specific entities, such as names, dates, or locations, NER helps in refining and improving the accuracy\n",
"of searches.\n",
"\n",
"**Example: Searching for Movies**\n",
"\n",
"Consider a structured database containing metadata on movies. By using NER, you can extract specific entities like \n",
"genre, actors, or directors' names from a user's query. This information can then be used to search the database \n",
"more effectively. \n",
"\n",
"**Note:** After extracting the relevant entities, you can use exact values or embeddings to enhance the search \n",
"process.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Input Processing - NER\n",
"\n",
"**Why use NER?**\n",
"\n",
"Named Entity Recognition \u001b[1m(\u001b[0mNER\u001b[1m)\u001b[0m is a technique used to extract relevant entities from input data. This process is \n",
"beneficial for creating more deterministic search queries, especially when the scope is very constrained. By \n",
"identifying specific entities, such as names, dates, or locations, NER helps in refining and improving the accuracy\n",
"of searches.\n",
"\n",
"**Example: Searching for Movies**\n",
"\n",
"Consider a structured database containing metadata on movies. By using NER, you can extract specific entities like \n",
"genre, actors, or directors' names from a user's query. This information can then be used to search the database \n",
"more effectively. \n",
"\n",
"**Note:** After extracting the relevant entities, you can use exact values or embeddings to enhance the search \n",
"process.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Retrieval\n",
"\n",
"This diagram illustrates a retrieval process using technical patterns. The process begins with three types of \n",
"input: embeddings, queries, and keywords.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Embeddings**: These are numerical representations <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span> used for semantic search. They \n",
"are processed through a vector database <span style=\"font-weight: bold\">(</span>vector DB<span style=\"font-weight: bold\">)</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Query**: This involves structured queries <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008000; text-decoration-color: #008000\">\"SELECT * from items...\"</span><span style=\"font-weight: bold\">)</span> that interact with a relational or \n",
"NoSQL database.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Keywords**: Simple search terms like <span style=\"color: #008000; text-decoration-color: #008000\">\"red\"</span> and <span style=\"color: #008000; text-decoration-color: #008000\">\"summer\"</span> are also used with the relational or NoSQL database.\n",
"\n",
"The results from both the vector and relational/NoSQL databases are combined. The initial results undergo a \n",
"re-ranking process to ensure accuracy and relevance, leading to the final result, which is then used to generate \n",
"output.\n",
"\n",
"**Best Practices**:\n",
"- Combine semantic search with deterministic queries for more effective retrieval.\n",
"- Cache outputs where possible to improve efficiency.\n",
"\n",
"**Common Pitfalls**:\n",
"- Incorrect element comparison during text similarity checks can occur, highlighting the importance of re-ranking \n",
"to ensure accurate results.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Retrieval\n",
"\n",
"This diagram illustrates a retrieval process using technical patterns. The process begins with three types of \n",
"input: embeddings, queries, and keywords.\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Embeddings**: These are numerical representations \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m used for semantic search. They \n",
"are processed through a vector database \u001b[1m(\u001b[0mvector DB\u001b[1m)\u001b[0m.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Query**: This involves structured queries \u001b[1m(\u001b[0me.g., \u001b[32m\"SELECT * from items...\"\u001b[0m\u001b[1m)\u001b[0m that interact with a relational or \n",
"NoSQL database.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Keywords**: Simple search terms like \u001b[32m\"red\"\u001b[0m and \u001b[32m\"summer\"\u001b[0m are also used with the relational or NoSQL database.\n",
"\n",
"The results from both the vector and relational/NoSQL databases are combined. The initial results undergo a \n",
"re-ranking process to ensure accuracy and relevance, leading to the final result, which is then used to generate \n",
"output.\n",
"\n",
"**Best Practices**:\n",
"- Combine semantic search with deterministic queries for more effective retrieval.\n",
"- Cache outputs where possible to improve efficiency.\n",
"\n",
"**Common Pitfalls**:\n",
"- Incorrect element comparison during text similarity checks can occur, highlighting the importance of re-ranking \n",
"to ensure accurate results.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Retrieval - Search\n",
"\n",
"**How to search?**\n",
"\n",
"There are various approaches to searching, which depend on the use case and the existing system. Here are three \n",
"main methods:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Semantic Search**:\n",
" - This method uses embeddings to perform searches. \n",
" - By comparing embeddings with the data in your database, you can find the most similar matches.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Keyword Search**:\n",
" - If you have specific entities or keywords extracted, you can search for these directly in your database.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Search Query**:\n",
" - Based on extracted entities or direct user input, you can construct search queries <span style=\"font-weight: bold\">(</span>such as SQL or Cypher<span style=\"font-weight: bold\">)</span> to \n",
"search your database.\n",
"\n",
"Additionally, you can use a hybrid approach by combining several methods. This can involve performing multiple \n",
"searches in parallel or in sequence, or searching for keywords along with their embeddings.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Retrieval - Search\n",
"\n",
"**How to search?**\n",
"\n",
"There are various approaches to searching, which depend on the use case and the existing system. Here are three \n",
"main methods:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Semantic Search**:\n",
" - This method uses embeddings to perform searches. \n",
" - By comparing embeddings with the data in your database, you can find the most similar matches.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Keyword Search**:\n",
" - If you have specific entities or keywords extracted, you can search for these directly in your database.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Search Query**:\n",
" - Based on extracted entities or direct user input, you can construct search queries \u001b[1m(\u001b[0msuch as SQL or Cypher\u001b[1m)\u001b[0m to \n",
"search your database.\n",
"\n",
"Additionally, you can use a hybrid approach by combining several methods. This can involve performing multiple \n",
"searches in parallel or in sequence, or searching for keywords along with their embeddings.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Retrieval - Multi-step Retrieval**\n",
"\n",
"**What is multi-step retrieval?**\n",
"\n",
"Multi-step retrieval involves performing several actions to obtain the necessary information to generate an answer.\n",
"This approach is useful when a single step is insufficient to gather all required data.\n",
"\n",
"**Things to Consider**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Framework to be Used:**\n",
" - When multiple steps are needed, decide whether to manage this process yourself or use a framework to simplify \n",
"the task.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Cost &amp; Latency:**\n",
" - Performing multiple steps can significantly increase both latency and cost.\n",
" - To mitigate latency, consider executing actions in parallel.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Chain of Thought:**\n",
" - Use a chain of thought approach to guide the process. Break down instructions into clear steps, providing \n",
"guidelines on whether to continue, stop, or take alternative actions.\n",
" - This method is particularly useful for tasks that must be performed sequentially, such as <span style=\"color: #008000; text-decoration-color: #008000\">\"if this didnt </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">work, then do this.\"</span>\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Retrieval - Multi-step Retrieval**\n",
"\n",
"**What is multi-step retrieval?**\n",
"\n",
"Multi-step retrieval involves performing several actions to obtain the necessary information to generate an answer.\n",
"This approach is useful when a single step is insufficient to gather all required data.\n",
"\n",
"**Things to Consider**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Framework to be Used:**\n",
" - When multiple steps are needed, decide whether to manage this process yourself or use a framework to simplify \n",
"the task.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Cost & Latency:**\n",
" - Performing multiple steps can significantly increase both latency and cost.\n",
" - To mitigate latency, consider executing actions in parallel.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Chain of Thought:**\n",
" - Use a chain of thought approach to guide the process. Break down instructions into clear steps, providing \n",
"guidelines on whether to continue, stop, or take alternative actions.\n",
" - This method is particularly useful for tasks that must be performed sequentially, such as \u001b[32m\"if this didnt \u001b[0m\n",
"\u001b[32mwork, then do this.\"\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Retrieval - Re-ranking**\n",
"\n",
"**What is re-ranking?**\n",
"\n",
"Re-ranking involves re-ordering the results of a retrieval process to highlight more relevant outcomes. This is \n",
"especially crucial in semantic searches, where understanding the context and meaning of queries is important.\n",
"\n",
"**Example Approaches**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Rule-based Re-ranking**\n",
" - This approach uses metadata to rank results by relevance. For instance, you might consider the recency of \n",
"documents, tags, or specific keywords in the title to determine their importance.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Re-ranking Algorithms**\n",
" - There are various algorithms available for re-ranking based on specific use cases. Examples include BERT-based\n",
"re-rankers, cross-encoder re-ranking, and TF-IDF algorithms. These methods apply different techniques to assess and\n",
"order the relevance of search results.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Retrieval - Re-ranking**\n",
"\n",
"**What is re-ranking?**\n",
"\n",
"Re-ranking involves re-ordering the results of a retrieval process to highlight more relevant outcomes. This is \n",
"especially crucial in semantic searches, where understanding the context and meaning of queries is important.\n",
"\n",
"**Example Approaches**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Rule-based Re-ranking**\n",
" - This approach uses metadata to rank results by relevance. For instance, you might consider the recency of \n",
"documents, tags, or specific keywords in the title to determine their importance.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Re-ranking Algorithms**\n",
" - There are various algorithms available for re-ranking based on specific use cases. Examples include BERT-based\n",
"re-rankers, cross-encoder re-ranking, and TF-IDF algorithms. These methods apply different techniques to assess and\n",
"order the relevance of search results.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Answer Generation**\n",
"\n",
"This diagram illustrates the process of generating answers using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span>. Here's a breakdown of the \n",
"components and concepts:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Process Flow:**\n",
" - A piece of content is retrieved and used to create a prompt.\n",
" - This prompt is fed into the LLM, which processes it to generate a final result.\n",
" - The user then sees this final result.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Best Practices:**\n",
" - It's important to evaluate performance after each experiment. This helps determine if exploring other methods \n",
"is beneficial.\n",
" - Implementing guardrails can be useful to ensure the model's outputs are safe and reliable.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Common Pitfalls:**\n",
" - Avoid jumping straight to fine-tuning the model without considering other approaches that might be more \n",
"effective or efficient.\n",
" - Pay close attention to how the model is prompted, as this can significantly impact the quality of the output.\n",
"\n",
"By following these guidelines, you can optimize the use of LLMs for generating accurate and useful answers.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Answer Generation**\n",
"\n",
"This diagram illustrates the process of generating answers using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m. Here's a breakdown of the \n",
"components and concepts:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Process Flow:**\n",
" - A piece of content is retrieved and used to create a prompt.\n",
" - This prompt is fed into the LLM, which processes it to generate a final result.\n",
" - The user then sees this final result.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Best Practices:**\n",
" - It's important to evaluate performance after each experiment. This helps determine if exploring other methods \n",
"is beneficial.\n",
" - Implementing guardrails can be useful to ensure the model's outputs are safe and reliable.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Common Pitfalls:**\n",
" - Avoid jumping straight to fine-tuning the model without considering other approaches that might be more \n",
"effective or efficient.\n",
" - Pay close attention to how the model is prompted, as this can significantly impact the quality of the output.\n",
"\n",
"By following these guidelines, you can optimize the use of LLMs for generating accurate and useful answers.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Answer Generation - Context Window\n",
"\n",
"## How to Manage Context?\n",
"\n",
"When generating answers using a context window, it's important to consider several factors based on your specific \n",
"use case. Here are key points to keep in mind:\n",
"\n",
"### Things to Consider\n",
"\n",
"- **Context Window Max Size:**\n",
" - The context window has a maximum size, so overloading it with too much content is not ideal.\n",
" - In conversational scenarios, the conversation itself becomes part of the context, contributing to the overall \n",
"size.\n",
"\n",
"- **Cost &amp; Latency vs. Accuracy:**\n",
" - Including more context can lead to increased latency and higher costs due to the additional input tokens \n",
"required.\n",
" - Conversely, using less context might reduce accuracy.\n",
"\n",
"- **<span style=\"color: #008000; text-decoration-color: #008000\">\"Lost in the Middle\"</span> Problem:**\n",
" - When the context is too extensive, language models may overlook or forget information that is <span style=\"color: #008000; text-decoration-color: #008000\">\"in the middle\"</span> \n",
"of the content, potentially missing important details.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Answer Generation - Context Window\n",
"\n",
"## How to Manage Context?\n",
"\n",
"When generating answers using a context window, it's important to consider several factors based on your specific \n",
"use case. Here are key points to keep in mind:\n",
"\n",
"### Things to Consider\n",
"\n",
"- **Context Window Max Size:**\n",
" - The context window has a maximum size, so overloading it with too much content is not ideal.\n",
" - In conversational scenarios, the conversation itself becomes part of the context, contributing to the overall \n",
"size.\n",
"\n",
"- **Cost & Latency vs. Accuracy:**\n",
" - Including more context can lead to increased latency and higher costs due to the additional input tokens \n",
"required.\n",
" - Conversely, using less context might reduce accuracy.\n",
"\n",
"- **\u001b[32m\"Lost in the Middle\"\u001b[0m Problem:**\n",
" - When the context is too extensive, language models may overlook or forget information that is \u001b[32m\"in the middle\"\u001b[0m \n",
"of the content, potentially missing important details.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Answer Generation Optimisation**\n",
"\n",
"**How to optimise?**\n",
"\n",
"When optimising a Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> application, there are several methods to consider. These \n",
"methods should be tried sequentially from left to right, and multiple approaches can be iterated if necessary.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Prompt Engineering**\n",
" - Experiment with different prompts at each stage of the process to achieve the desired input format or generate\n",
"relevant output.\n",
" - Guide the model through multiple steps to reach the final outcome.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Few-shot Examples**\n",
" - If the model's behavior is not as expected, provide examples of the desired outcome.\n",
" - Include sample user inputs and the expected processing format to guide the model.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Fine-tuning**\n",
" - If a few examples are insufficient, consider fine-tuning the model with more examples for each process step.\n",
" - Fine-tuning can help achieve a specific input processing or output format.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Answer Generation Optimisation**\n",
"\n",
"**How to optimise?**\n",
"\n",
"When optimising a Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m application, there are several methods to consider. These \n",
"methods should be tried sequentially from left to right, and multiple approaches can be iterated if necessary.\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Prompt Engineering**\n",
" - Experiment with different prompts at each stage of the process to achieve the desired input format or generate\n",
"relevant output.\n",
" - Guide the model through multiple steps to reach the final outcome.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Few-shot Examples**\n",
" - If the model's behavior is not as expected, provide examples of the desired outcome.\n",
" - Include sample user inputs and the expected processing format to guide the model.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Fine-tuning**\n",
" - If a few examples are insufficient, consider fine-tuning the model with more examples for each process step.\n",
" - Fine-tuning can help achieve a specific input processing or output format.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Answer Generation - Safety Checks\n",
"\n",
"**Why include safety checks?**\n",
"\n",
"Safety checks are crucial because providing a model with supposedly relevant context does not guarantee that the \n",
"generated answer will be truthful or accurate. Depending on the use case, it is important to double-check the \n",
"information to ensure reliability.\n",
"\n",
"**RAGAS Score Evaluation Framework**\n",
"\n",
"The RAGAS score is an evaluation framework that assesses both the generation and retrieval aspects of answer \n",
"generation:\n",
"\n",
"- **Generation:**\n",
" - **Faithfulness:** This measures how factually accurate the generated answer is.\n",
" - **Answer Relevancy:** This evaluates how relevant the generated answer is to the question.\n",
"\n",
"- **Retrieval:**\n",
" - **Context Precision:** This assesses the signal-to-noise ratio of the retrieved context, ensuring that the \n",
"information is precise.\n",
" - **Context Recall:** This checks if all relevant information required to answer the question is retrieved.\n",
"\n",
"By using this framework, one can systematically evaluate and improve the quality of generated answers.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Answer Generation - Safety Checks\n",
"\n",
"**Why include safety checks?**\n",
"\n",
"Safety checks are crucial because providing a model with supposedly relevant context does not guarantee that the \n",
"generated answer will be truthful or accurate. Depending on the use case, it is important to double-check the \n",
"information to ensure reliability.\n",
"\n",
"**RAGAS Score Evaluation Framework**\n",
"\n",
"The RAGAS score is an evaluation framework that assesses both the generation and retrieval aspects of answer \n",
"generation:\n",
"\n",
"- **Generation:**\n",
" - **Faithfulness:** This measures how factually accurate the generated answer is.\n",
" - **Answer Relevancy:** This evaluates how relevant the generated answer is to the question.\n",
"\n",
"- **Retrieval:**\n",
" - **Context Precision:** This assesses the signal-to-noise ratio of the retrieved context, ensuring that the \n",
"information is precise.\n",
" - **Context Recall:** This checks if all relevant information required to answer the question is retrieved.\n",
"\n",
"By using this framework, one can systematically evaluate and improve the quality of generated answers.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo , gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> , and gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview point to the latest model\n",
"version. You can verify this by looking at the response object after sending a request.\n",
"The response will include the specific model version used <span style=\"font-weight: bold\">(</span>e.g. gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> <span style=\"font-weight: bold\">)</span>.\n",
"\n",
"We also offer static model versions that developers can continue using for at least\n",
"three months after an updated model has been introduced. With the new cadence of\n",
"model updates, we are also giving people the ability to contribute evals to help us\n",
"\n",
"improve the model for different use cases. If you are interested, check out the OpenAI\n",
"Evals repository.\n",
"\n",
"Learn more about model deprecation on our deprecation page.\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is a large multimodal model <span style=\"font-weight: bold\">(</span>accepting text or image inputs and outputting text<span style=\"font-weight: bold\">)</span>\n",
"that can solve difficult problems with greater accuracy than any of our previous\n",
"\n",
"models, thanks to its broader general knowledge and advanced reasoning capabilities.\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is available in the OpenAI API to paying customers. Like gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo , GPT-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is optimized for chat but works well for traditional completions tasks using the Chat\n",
"Completions API. Learn how to use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> in our text generation guide.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview\n",
"\n",
"New GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"\n",
"Up to\n",
"\n",
"Dec\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"The latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> model\n",
"\n",
"tokens\n",
"\n",
"intended to reduce cases of\n",
"\n",
"“laziness” where the model\n",
"doesnt complete a task.\n",
"Returns a maximum of\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
"Learn more.\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview\n",
"\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview.\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-preview\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo model\n",
"featuring improved\n",
"instruction following, JSON\n",
"\n",
"mode, reproducible outputs,\n",
"parallel function calling, and\n",
"more. Returns a maximum\n",
"of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens. This\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"\n",
"Up to\n",
"Dec\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo , gpt-\u001b[1;36m4\u001b[0m , and gpt-\u001b[1;36m4\u001b[0m-turbo-preview point to the latest model\n",
"version. You can verify this by looking at the response object after sending a request.\n",
"The response will include the specific model version used \u001b[1m(\u001b[0me.g. gpt-\u001b[1;36m3.5\u001b[0m-turbo-\n",
"\u001b[1;36m0613\u001b[0m \u001b[1m)\u001b[0m.\n",
"\n",
"We also offer static model versions that developers can continue using for at least\n",
"three months after an updated model has been introduced. With the new cadence of\n",
"model updates, we are also giving people the ability to contribute evals to help us\n",
"\n",
"improve the model for different use cases. If you are interested, check out the OpenAI\n",
"Evals repository.\n",
"\n",
"Learn more about model deprecation on our deprecation page.\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m4\u001b[0m Turbo\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m is a large multimodal model \u001b[1m(\u001b[0maccepting text or image inputs and outputting text\u001b[1m)\u001b[0m\n",
"that can solve difficult problems with greater accuracy than any of our previous\n",
"\n",
"models, thanks to its broader general knowledge and advanced reasoning capabilities.\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m is available in the OpenAI API to paying customers. Like gpt-\u001b[1;36m3.5\u001b[0m-turbo , GPT-\n",
"\n",
"\u001b[1;36m4\u001b[0m is optimized for chat but works well for traditional completions tasks using the Chat\n",
"Completions API. Learn how to use GPT-\u001b[1;36m4\u001b[0m in our text generation guide.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview\n",
"\n",
"New GPT-\u001b[1;36m4\u001b[0m Turbo\n",
"\n",
"\u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"\n",
"Up to\n",
"\n",
"Dec\n",
"\n",
"\u001b[1;36m2023\u001b[0m\n",
"\n",
"The latest GPT-\u001b[1;36m4\u001b[0m model\n",
"\n",
"tokens\n",
"\n",
"intended to reduce cases of\n",
"\n",
"“laziness” where the model\n",
"doesnt complete a task.\n",
"Returns a maximum of\n",
"\n",
"\u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
"Learn more.\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-turbo-preview\n",
"\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
"\n",
"\u001b[1;36m0125\u001b[0m-preview.\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-preview\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m Turbo model\n",
"featuring improved\n",
"instruction following, JSON\n",
"\n",
"mode, reproducible outputs,\n",
"parallel function calling, and\n",
"more. Returns a maximum\n",
"of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens. This\n",
"\n",
"\u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"\n",
"Up to\n",
"Dec\n",
"\u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"is a preview model.\n",
"Learn more.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> with the ability to\n",
"understand images, in\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"Turbo capabilities. Currently\n",
"points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-\n",
"\n",
"vision-preview.\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> with the ability to\n",
"\n",
"understand images, in\n",
"addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"Turbo capabilities. Returns a\n",
"maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output\n",
"\n",
"tokens. This is a preview\n",
"\n",
"model version. Learn more.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span>\n",
"\n",
"Up to\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. See\n",
"\n",
"tokens\n",
"\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"continuous model upgrades.\n",
"\n",
"Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> from\n",
"\n",
"June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> with\n",
"\n",
"improved function calling\n",
"\n",
"support.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span>\n",
"tokens\n",
"\n",
"Up to\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k\n",
"\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"\n",
"32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. See\n",
"\n",
"continuous model upgrades.\n",
"This model was never rolled\n",
"out widely in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"Turbo.\n",
"\n",
"Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k\n",
"\n",
"from June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> with\n",
"improved function calling\n",
"support. This model was\n",
"never rolled out widely in\n",
"\n",
"favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"\n",
"tokens\n",
"\n",
"Up to\n",
"\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"\n",
"tokens\n",
"\n",
"Up to\n",
"\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"For many basic tasks, the difference between GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> models is not\n",
"significant. However, in more complex reasoning situations, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is much more\n",
"capable than any of our previous models.\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"is a preview model.\n",
"Learn more.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-vision-preview\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m with the ability to\n",
"understand images, in\n",
"\n",
"\u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"\n",
"addition to all other GPT-\u001b[1;36m4\u001b[0m\n",
"Turbo capabilities. Currently\n",
"points to gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-\n",
"\n",
"vision-preview.\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview GPT-\u001b[1;36m4\u001b[0m with the ability to\n",
"\n",
"understand images, in\n",
"addition to all other GPT-\u001b[1;36m4\u001b[0m\n",
"\n",
"Turbo capabilities. Returns a\n",
"maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output\n",
"\n",
"tokens. This is a preview\n",
"\n",
"model version. Learn more.\n",
"\n",
"\u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m\n",
"\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
"\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m\n",
"\n",
"Up to\n",
"\n",
"\u001b[1;36m0613\u001b[0m. See\n",
"\n",
"tokens\n",
"\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"\n",
"continuous model upgrades.\n",
"\n",
"Snapshot of gpt-\u001b[1;36m4\u001b[0m from\n",
"\n",
"June 13th \u001b[1;36m2023\u001b[0m with\n",
"\n",
"improved function calling\n",
"\n",
"support.\n",
"\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m\n",
"tokens\n",
"\n",
"Up to\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-32k\n",
"\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m\n",
"\n",
"32k-\u001b[1;36m0613\u001b[0m. See\n",
"\n",
"continuous model upgrades.\n",
"This model was never rolled\n",
"out widely in favor of GPT-\u001b[1;36m4\u001b[0m\n",
"\n",
"Turbo.\n",
"\n",
"Snapshot of gpt-\u001b[1;36m4\u001b[0m-32k\n",
"\n",
"from June 13th \u001b[1;36m2023\u001b[0m with\n",
"improved function calling\n",
"support. This model was\n",
"never rolled out widely in\n",
"\n",
"favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
"\n",
"\u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m\n",
"\n",
"tokens\n",
"\n",
"Up to\n",
"\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m\n",
"\n",
"tokens\n",
"\n",
"Up to\n",
"\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"\n",
"For many basic tasks, the difference between GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m3.5\u001b[0m models is not\n",
"significant. However, in more complex reasoning situations, GPT-\u001b[1;36m4\u001b[0m is much more\n",
"capable than any of our previous models.\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"Multilingual capabilities\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> outperforms both previous large language models and as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, most state-\n",
"of-the-art systems <span style=\"font-weight: bold\">(</span>which often have benchmark-specific training or hand-\n",
"engineering<span style=\"font-weight: bold\">)</span>. On the MMLU benchmark, an English-language suite of multiple-choice\n",
"questions covering <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">57</span> subjects, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> not only outperforms existing models by a\n",
"considerable margin in English, but also demonstrates strong performance in other\n",
"languages.\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo models can understand and generate natural language or code and\n",
"have been optimized for chat using the Chat Completions API but work well for non-\n",
"chat tasks as well.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>\n",
"\n",
"New Updated GPT <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"\n",
"The latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"model with higher accuracy at\n",
"\n",
"responding in requested\n",
"\n",
"formats and a fix for a bug\n",
"\n",
"which caused a text encoding\n",
"issue for non-English\n",
"\n",
"language function calls.\n",
"\n",
"Returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"\n",
"output tokens. Learn more.\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
"\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"\n",
"Up to Sep\n",
"\n",
"turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. The gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"\n",
"tokens\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"turbo model alias will be\n",
"\n",
"automatically upgraded from\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> to\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span> on\n",
"\n",
"February 16th.\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo model with\n",
"improved instruction\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"tokens\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"following, JSON mode,\n",
"reproducible outputs, parallel\n",
"function calling, and more.\n",
"Returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"\n",
"output tokens. Learn more.\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"Multilingual capabilities\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m outperforms both previous large language models and as of \u001b[1;36m2023\u001b[0m, most state-\n",
"of-the-art systems \u001b[1m(\u001b[0mwhich often have benchmark-specific training or hand-\n",
"engineering\u001b[1m)\u001b[0m. On the MMLU benchmark, an English-language suite of multiple-choice\n",
"questions covering \u001b[1;36m57\u001b[0m subjects, GPT-\u001b[1;36m4\u001b[0m not only outperforms existing models by a\n",
"considerable margin in English, but also demonstrates strong performance in other\n",
"languages.\n",
"\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo\n",
"\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo models can understand and generate natural language or code and\n",
"have been optimized for chat using the Chat Completions API but work well for non-\n",
"chat tasks as well.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m\n",
"\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m\n",
"\n",
"New Updated GPT \u001b[1;36m3.5\u001b[0m Turbo\n",
"\n",
"The latest GPT-\u001b[1;36m3.5\u001b[0m Turbo\n",
"model with higher accuracy at\n",
"\n",
"responding in requested\n",
"\n",
"formats and a fix for a bug\n",
"\n",
"which caused a text encoding\n",
"issue for non-English\n",
"\n",
"language function calls.\n",
"\n",
"Returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"\n",
"output tokens. Learn more.\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
"\n",
"Currently points to gpt-\u001b[1;36m3.5\u001b[0m-\n",
"\n",
"\u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"\n",
"Up to Sep\n",
"\n",
"turbo-\u001b[1;36m0613\u001b[0m. The gpt-\u001b[1;36m3.5\u001b[0m-\n",
"\n",
"tokens\n",
"\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"turbo model alias will be\n",
"\n",
"automatically upgraded from\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m to\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m on\n",
"\n",
"February 16th.\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m\n",
"\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo model with\n",
"improved instruction\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"following, JSON mode,\n",
"reproducible outputs, parallel\n",
"function calling, and more.\n",
"Returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"\n",
"output tokens. Learn more.\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m4\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct Similar capabilities as GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"era models. Compatible with\n",
"legacy Completions endpoint\n",
"and not Chat Completions.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"tokens\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k\n",
"\n",
"Legacy Currently points to\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"tokens\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"\n",
"Legacy Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"\n",
"turbo from June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>,\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"tokens\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"\n",
"Legacy Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"\n",
"Up to Sep\n",
"\n",
"16k-turbo from June 13th\n",
"\n",
"tokens\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Will be deprecated on\n",
"\n",
"June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
"\n",
"DALL·E\n",
"\n",
"DALL·E is a AI system that can create realistic images and art from a description in\n",
"\n",
"natural language. DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> currently supports the ability, given a prompt, to create a\n",
"\n",
"new image with a specific size. DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> also support the ability to edit an existing\n",
"\n",
"image, or create variations of a user provided image.\n",
"\n",
"DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> is available through our Images API along with DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. You can try DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"through ChatGPT Plus.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"New DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"The latest DALL·E model released in Nov <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Learn more.\n",
"\n",
"dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> The previous DALL·E model released in Nov <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2022</span>. The 2nd iteration of\n",
"DALL·E with more realistic, accurate, and 4x greater resolution images\n",
"than the original model.\n",
"\n",
"TTS\n",
"\n",
"TTS is an AI model that converts text to natural sounding spoken text. We offer two\n",
"different model variates, tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is optimized for real time text to speech use cases\n",
"and tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd is optimized for quality. These models can be used with the Speech\n",
"\n",
"endpoint in the Audio API.\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct Similar capabilities as GPT-\u001b[1;36m3\u001b[0m\n",
"era models. Compatible with\n",
"legacy Completions endpoint\n",
"and not Chat Completions.\n",
"\n",
"CONTEXT\n",
"WIND OW\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"\u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k\n",
"\n",
"Legacy Currently points to\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m.\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m\n",
"\n",
"Legacy Snapshot of gpt-\u001b[1;36m3.5\u001b[0m-\n",
"\n",
"turbo from June 13th \u001b[1;36m2023\u001b[0m.\n",
"\n",
"Will be deprecated on June \u001b[1;36m13\u001b[0m,\n",
"\u001b[1;36m2024\u001b[0m.\n",
"\n",
"\u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m\n",
"\n",
"Legacy Snapshot of gpt-\u001b[1;36m3.5\u001b[0m-\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m\n",
"\n",
"Up to Sep\n",
"\n",
"16k-turbo from June 13th\n",
"\n",
"tokens\n",
"\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m2023\u001b[0m. Will be deprecated on\n",
"\n",
"June \u001b[1;36m13\u001b[0m, \u001b[1;36m2024\u001b[0m.\n",
"\n",
"DALL·E\n",
"\n",
"DALL·E is a AI system that can create realistic images and art from a description in\n",
"\n",
"natural language. DALL·E \u001b[1;36m3\u001b[0m currently supports the ability, given a prompt, to create a\n",
"\n",
"new image with a specific size. DALL·E \u001b[1;36m2\u001b[0m also support the ability to edit an existing\n",
"\n",
"image, or create variations of a user provided image.\n",
"\n",
"DALL·E \u001b[1;36m3\u001b[0m is available through our Images API along with DALL·E \u001b[1;36m2\u001b[0m. You can try DALL·E \u001b[1;36m3\u001b[0m\n",
"\n",
"through ChatGPT Plus.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"dall-e-\u001b[1;36m3\u001b[0m\n",
"\n",
"New DALL·E \u001b[1;36m3\u001b[0m\n",
"\n",
"The latest DALL·E model released in Nov \u001b[1;36m2023\u001b[0m. Learn more.\n",
"\n",
"dall-e-\u001b[1;36m2\u001b[0m The previous DALL·E model released in Nov \u001b[1;36m2022\u001b[0m. The 2nd iteration of\n",
"DALL·E with more realistic, accurate, and 4x greater resolution images\n",
"than the original model.\n",
"\n",
"TTS\n",
"\n",
"TTS is an AI model that converts text to natural sounding spoken text. We offer two\n",
"different model variates, tts-\u001b[1;36m1\u001b[0m is optimized for real time text to speech use cases\n",
"and tts-\u001b[1;36m1\u001b[0m-hd is optimized for quality. These models can be used with the Speech\n",
"\n",
"endpoint in the Audio API.\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m5\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"\n",
"New Text-to-speech <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"The latest text to speech model, optimized for speed.\n",
"\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd\n",
"\n",
"New Text-to-speech <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> HD\n",
"The latest text to speech model, optimized for quality.\n",
"\n",
"Whisper\n",
"\n",
"Whisper is a general-purpose speech recognition model. It is trained on a large dataset\n",
"of diverse audio and is also a multi-task model that can perform multilingual speech\n",
"recognition as well as speech translation and language identification. The Whisper v2-\n",
"\n",
"large model is currently available through our API with the whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> model name.\n",
"\n",
"Currently, there is no difference between the open source version of Whisper and the\n",
"\n",
"version available through our API. However, through our API, we offer an optimized\n",
"inference process which makes running Whisper through our API much faster than\n",
"\n",
"doing it through other means. For more technical details on Whisper, you can read the\n",
"\n",
"paper.\n",
"\n",
"Embeddings\n",
"\n",
"Embeddings are a numerical representation of text that can be used to measure the\n",
"\n",
"relatedness between two pieces of text. Embeddings are useful for search, clustering,\n",
"\n",
"recommendations, anomaly detection, and classification tasks. You can read more\n",
"about our latest embedding models in the announcement blog post.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"text-embedding-\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large\n",
"\n",
"New Embedding V3 large\n",
"Most capable embedding model for both\n",
"\n",
"english and non-english tasks\n",
"\n",
"text-embedding-\n",
"\n",
"New Embedding V3 small\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small\n",
"\n",
"Increased performance over 2nd generation ada\n",
"embedding model\n",
"\n",
"text-embedding-\n",
"ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"\n",
"Most capable 2nd generation embedding\n",
"model, replacing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span> first generation models\n",
"\n",
"OUTP UT\n",
"DIMENSION\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">072</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>\n",
"\n",
"Moderation\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"tts-\u001b[1;36m1\u001b[0m\n",
"\n",
"New Text-to-speech \u001b[1;36m1\u001b[0m\n",
"The latest text to speech model, optimized for speed.\n",
"\n",
"tts-\u001b[1;36m1\u001b[0m-hd\n",
"\n",
"New Text-to-speech \u001b[1;36m1\u001b[0m HD\n",
"The latest text to speech model, optimized for quality.\n",
"\n",
"Whisper\n",
"\n",
"Whisper is a general-purpose speech recognition model. It is trained on a large dataset\n",
"of diverse audio and is also a multi-task model that can perform multilingual speech\n",
"recognition as well as speech translation and language identification. The Whisper v2-\n",
"\n",
"large model is currently available through our API with the whisper-\u001b[1;36m1\u001b[0m model name.\n",
"\n",
"Currently, there is no difference between the open source version of Whisper and the\n",
"\n",
"version available through our API. However, through our API, we offer an optimized\n",
"inference process which makes running Whisper through our API much faster than\n",
"\n",
"doing it through other means. For more technical details on Whisper, you can read the\n",
"\n",
"paper.\n",
"\n",
"Embeddings\n",
"\n",
"Embeddings are a numerical representation of text that can be used to measure the\n",
"\n",
"relatedness between two pieces of text. Embeddings are useful for search, clustering,\n",
"\n",
"recommendations, anomaly detection, and classification tasks. You can read more\n",
"about our latest embedding models in the announcement blog post.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"text-embedding-\n",
"\u001b[1;36m3\u001b[0m-large\n",
"\n",
"New Embedding V3 large\n",
"Most capable embedding model for both\n",
"\n",
"english and non-english tasks\n",
"\n",
"text-embedding-\n",
"\n",
"New Embedding V3 small\n",
"\n",
"\u001b[1;36m3\u001b[0m-small\n",
"\n",
"Increased performance over 2nd generation ada\n",
"embedding model\n",
"\n",
"text-embedding-\n",
"ada-\u001b[1;36m002\u001b[0m\n",
"\n",
"Most capable 2nd generation embedding\n",
"model, replacing \u001b[1;36m16\u001b[0m first generation models\n",
"\n",
"OUTP UT\n",
"DIMENSION\n",
"\n",
"\u001b[1;36m3\u001b[0m,\u001b[1;36m072\u001b[0m\n",
"\n",
"\u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m\n",
"\n",
"\u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m\n",
"\n",
"Moderation\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m6\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"The Moderation models are designed to check whether content complies with\n",
"OpenAI's usage policies. The models provide classification capabilities that look for\n",
"content in the following categories: hate, hate/threatening, self-harm, sexual,\n",
"sexual/minors, violence, and violence/graphic. You can find out more in our moderation\n",
"\n",
"guide.\n",
"\n",
"Moderation models take in an arbitrary sized input that is automatically broken up into\n",
"chunks of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens. In cases where the input is more than <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens,\n",
"\n",
"truncation is used which in a rare condition may omit a small number of tokens from\n",
"the moderation check.\n",
"\n",
"The final results from each request to the moderation endpoint shows the maximum\n",
"\n",
"value on a per category basis. For example, if one chunk of 4K tokens had a category\n",
"score of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> and the other had a score of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1901</span>, the results would show <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> in the\n",
"API response since it is higher.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"MAX\n",
"TOKENS\n",
"\n",
"text-moderation-latest Currently points to text-moderation-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>.\n",
"\n",
"text-moderation-stable Currently points to text-moderation-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>.\n",
"\n",
"text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>\n",
"\n",
"Most capable moderation model across\n",
"all categories.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"\n",
"GPT base\n",
"\n",
"GPT base models can understand and generate natural language or code but are not\n",
"trained with instruction following. These models are made to be replacements for our\n",
"\n",
"original GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> base models and use the legacy Completions API. Most customers\n",
"\n",
"should use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> or GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span> Replacement for the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> ada and\n",
"\n",
"babbage base models.\n",
"\n",
"davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span> Replacement for the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> curie and\n",
"\n",
"davinci base models.\n",
"\n",
"MAX\n",
"TOKENS\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span>\n",
"tokens\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span>\n",
"tokens\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"How we use your data\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"The Moderation models are designed to check whether content complies with\n",
"OpenAI's usage policies. The models provide classification capabilities that look for\n",
"content in the following categories: hate, hate/threatening, self-harm, sexual,\n",
"sexual/minors, violence, and violence/graphic. You can find out more in our moderation\n",
"\n",
"guide.\n",
"\n",
"Moderation models take in an arbitrary sized input that is automatically broken up into\n",
"chunks of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens. In cases where the input is more than \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens,\n",
"\n",
"truncation is used which in a rare condition may omit a small number of tokens from\n",
"the moderation check.\n",
"\n",
"The final results from each request to the moderation endpoint shows the maximum\n",
"\n",
"value on a per category basis. For example, if one chunk of 4K tokens had a category\n",
"score of \u001b[1;36m0.9901\u001b[0m and the other had a score of \u001b[1;36m0.1901\u001b[0m, the results would show \u001b[1;36m0.9901\u001b[0m in the\n",
"API response since it is higher.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"MAX\n",
"TOKENS\n",
"\n",
"text-moderation-latest Currently points to text-moderation-\n",
"\n",
"\u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m\n",
"\n",
"\u001b[1;36m007\u001b[0m.\n",
"\n",
"text-moderation-stable Currently points to text-moderation-\n",
"\n",
"\u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m\n",
"\n",
"\u001b[1;36m007\u001b[0m.\n",
"\n",
"text-moderation-\u001b[1;36m007\u001b[0m\n",
"\n",
"Most capable moderation model across\n",
"all categories.\n",
"\n",
"\u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m\n",
"\n",
"GPT base\n",
"\n",
"GPT base models can understand and generate natural language or code but are not\n",
"trained with instruction following. These models are made to be replacements for our\n",
"\n",
"original GPT-\u001b[1;36m3\u001b[0m base models and use the legacy Completions API. Most customers\n",
"\n",
"should use GPT-\u001b[1;36m3.5\u001b[0m or GPT-\u001b[1;36m4\u001b[0m.\n",
"\n",
"MODEL\n",
"\n",
"DE S CRIPTION\n",
"\n",
"babbage-\u001b[1;36m002\u001b[0m Replacement for the GPT-\u001b[1;36m3\u001b[0m ada and\n",
"\n",
"babbage base models.\n",
"\n",
"davinci-\u001b[1;36m002\u001b[0m Replacement for the GPT-\u001b[1;36m3\u001b[0m curie and\n",
"\n",
"davinci base models.\n",
"\n",
"MAX\n",
"TOKENS\n",
"\n",
"TRAINING\n",
"DATA\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m\n",
"tokens\n",
"\n",
"\u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m\n",
"tokens\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"Up to Sep\n",
"\u001b[1;36m2021\u001b[0m\n",
"\n",
"How we use your data\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m7\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"Your data is your data.\n",
"\n",
"As of March <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, data sent to the OpenAI API will not be used to train or improve\n",
"\n",
"OpenAI models <span style=\"font-weight: bold\">(</span>unless you explicitly opt in<span style=\"font-weight: bold\">)</span>. One advantage to opting in is that the\n",
"models may get better at your use case over time.\n",
"\n",
"To help identify abuse, API data may be retained for up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, after which it will be\n",
"\n",
"deleted <span style=\"font-weight: bold\">(</span>unless otherwise required by law<span style=\"font-weight: bold\">)</span>. For trusted customers with sensitive\n",
"applications, zero data retention may be available. With zero data retention, request\n",
"and response bodies are not persisted to any logging mechanism and exist only in\n",
"memory in order to serve the request.\n",
"\n",
"Note that this data policy does not apply to OpenAI's non-API consumer services like\n",
"ChatGPT or DALL·E Labs.\n",
"\n",
"Default usage policies by endpoint\n",
"\n",
"ENDP OINT\n",
"\n",
"DATA USED\n",
"FOR TRAINING\n",
"\n",
"DEFAULT\n",
"RETENTION\n",
"\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>*\n",
"\n",
"No\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"Yes, except\n",
"\n",
"image inputs*\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">files</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">threads</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">messages</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">runs</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/runs/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">steps</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">generations</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">edits</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">variations</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span> No\n",
"\n",
"Until deleted by\n",
"\n",
"No\n",
"\n",
"customer\n",
"\n",
"Until deleted by\n",
"\n",
"No\n",
"\n",
"customer\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days *\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days *\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days *\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days *\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"Yes\n",
"\n",
"-\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"Your data is your data.\n",
"\n",
"As of March \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m, data sent to the OpenAI API will not be used to train or improve\n",
"\n",
"OpenAI models \u001b[1m(\u001b[0munless you explicitly opt in\u001b[1m)\u001b[0m. One advantage to opting in is that the\n",
"models may get better at your use case over time.\n",
"\n",
"To help identify abuse, API data may be retained for up to \u001b[1;36m30\u001b[0m days, after which it will be\n",
"\n",
"deleted \u001b[1m(\u001b[0munless otherwise required by law\u001b[1m)\u001b[0m. For trusted customers with sensitive\n",
"applications, zero data retention may be available. With zero data retention, request\n",
"and response bodies are not persisted to any logging mechanism and exist only in\n",
"memory in order to serve the request.\n",
"\n",
"Note that this data policy does not apply to OpenAI's non-API consumer services like\n",
"ChatGPT or DALL·E Labs.\n",
"\n",
"Default usage policies by endpoint\n",
"\n",
"ENDP OINT\n",
"\n",
"DATA USED\n",
"FOR TRAINING\n",
"\n",
"DEFAULT\n",
"RETENTION\n",
"\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\n",
"\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m*\n",
"\n",
"No\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"Yes, except\n",
"\n",
"image inputs*\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mfiles\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mthreads\u001b[0m\n",
"\n",
"\u001b[35m/v1/threads/\u001b[0m\u001b[95mmessages\u001b[0m\n",
"\n",
"\u001b[35m/v1/threads/\u001b[0m\u001b[95mruns\u001b[0m\n",
"\n",
"\u001b[35m/v1/threads/runs/\u001b[0m\u001b[95msteps\u001b[0m\n",
"\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95mgenerations\u001b[0m\n",
"\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95medits\u001b[0m\n",
"\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95mvariations\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m No\n",
"\n",
"Until deleted by\n",
"\n",
"No\n",
"\n",
"customer\n",
"\n",
"Until deleted by\n",
"\n",
"No\n",
"\n",
"customer\n",
"\n",
"\u001b[1;36m60\u001b[0m days *\n",
"\n",
"\u001b[1;36m60\u001b[0m days *\n",
"\n",
"\u001b[1;36m60\u001b[0m days *\n",
"\n",
"\u001b[1;36m60\u001b[0m days *\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"Yes\n",
"\n",
"-\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m8\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"Models - OpenAI API\n",
"\n",
"ENDP OINT\n",
"\n",
"DATA USED\n",
"FOR TRAINING\n",
"\n",
"DEFAULT\n",
"RETENTION\n",
"\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>\n",
"\n",
"No\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"Until deleted by\n",
"customer\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"-\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"\n",
"Yes\n",
"\n",
"* Image inputs via the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview model are not eligible for zero\n",
"retention.\n",
"\n",
"* For the Assistants API, we are still evaluating the default retention period during the\n",
"\n",
"Beta. We expect that the default retention period will be stable after the end of the\n",
"\n",
"Beta.\n",
"\n",
"For details, see our API data usage policies. To learn more about zero retention, get in\n",
"\n",
"touch with our sales team.\n",
"\n",
"Model endpoint compatibility\n",
"\n",
"ENDP OINT\n",
"\n",
"L ATE ST MODEL S\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>\n",
"\n",
"All models except gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0301</span>\n",
"\n",
"supported. The retrieval tool requires gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"\n",
"turbo-preview <span style=\"font-weight: bold\">(</span>and subsequent dated model\n",
"\n",
"releases<span style=\"font-weight: bold\">)</span> or gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span> <span style=\"font-weight: bold\">(</span>and\n",
"\n",
"subsequent versions<span style=\"font-weight: bold\">)</span>.\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span> whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>\n",
"\n",
"whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>\n",
"\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and dated model releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-\n",
"\n",
"preview and dated model releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"\n",
"vision-preview, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k and dated model\n",
"\n",
"releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo and dated model\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">9</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"Models - OpenAI API\n",
"\n",
"ENDP OINT\n",
"\n",
"DATA USED\n",
"FOR TRAINING\n",
"\n",
"DEFAULT\n",
"RETENTION\n",
"\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m\n",
"\n",
"No\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m\n",
"\n",
"\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"Until deleted by\n",
"customer\n",
"\n",
"Zero data\n",
"retention\n",
"\n",
"-\n",
"\n",
"No\n",
"\n",
"No\n",
"\n",
"-\n",
"\n",
"\u001b[1;36m30\u001b[0m days\n",
"\n",
"Yes\n",
"\n",
"* Image inputs via the gpt-\u001b[1;36m4\u001b[0m-vision-preview model are not eligible for zero\n",
"retention.\n",
"\n",
"* For the Assistants API, we are still evaluating the default retention period during the\n",
"\n",
"Beta. We expect that the default retention period will be stable after the end of the\n",
"\n",
"Beta.\n",
"\n",
"For details, see our API data usage policies. To learn more about zero retention, get in\n",
"\n",
"touch with our sales team.\n",
"\n",
"Model endpoint compatibility\n",
"\n",
"ENDP OINT\n",
"\n",
"L ATE ST MODEL S\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m\n",
"\n",
"All models except gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0301\u001b[0m\n",
"\n",
"supported. The retrieval tool requires gpt-\u001b[1;36m4\u001b[0m-\n",
"\n",
"turbo-preview \u001b[1m(\u001b[0mand subsequent dated model\n",
"\n",
"releases\u001b[1m)\u001b[0m or gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m \u001b[1m(\u001b[0mand\n",
"\n",
"subsequent versions\u001b[1m)\u001b[0m.\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m whisper-\u001b[1;36m1\u001b[0m\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m\n",
"\n",
"whisper-\u001b[1;36m1\u001b[0m\n",
"\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m\n",
"\n",
"tts-\u001b[1;36m1\u001b[0m, tts-\u001b[1;36m1\u001b[0m-hd\n",
"\n",
"\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m and dated model releases, gpt-\u001b[1;36m4\u001b[0m-turbo-\n",
"\n",
"preview and dated model releases, gpt-\u001b[1;36m4\u001b[0m-\n",
"\n",
"vision-preview, gpt-\u001b[1;36m4\u001b[0m-32k and dated model\n",
"\n",
"releases, gpt-\u001b[1;36m3.5\u001b[0m-turbo and dated model\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m9\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"\n",
"ENDP OINT\n",
"\n",
"Models - OpenAI API\n",
"\n",
"L ATE ST MODEL S\n",
"\n",
"releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k and dated model\n",
"\n",
"releases, fine-tuned versions of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span> <span style=\"font-weight: bold\">(</span>Legacy<span style=\"font-weight: bold\">)</span> gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct, babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>,\n",
"\n",
"davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>\n",
"\n",
"text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small, text-embedding-\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large, text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo, babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>, davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>\n",
"\n",
"text-moderation-stable, text-\n",
"\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"\n",
"ENDP OINT\n",
"\n",
"Models - OpenAI API\n",
"\n",
"L ATE ST MODEL S\n",
"\n",
"releases, gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k and dated model\n",
"\n",
"releases, fine-tuned versions of gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m \u001b[1m(\u001b[0mLegacy\u001b[1m)\u001b[0m gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct, babbage-\u001b[1;36m002\u001b[0m,\n",
"\n",
"davinci-\u001b[1;36m002\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m\n",
"\n",
"text-embedding-\u001b[1;36m3\u001b[0m-small, text-embedding-\n",
"\n",
"\u001b[1;36m3\u001b[0m-large, text-embedding-ada-\u001b[1;36m002\u001b[0m\n",
"\n",
"\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo, babbage-\u001b[1;36m002\u001b[0m, davinci-\u001b[1;36m002\u001b[0m\n",
"\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m\n",
"\n",
"text-moderation-stable, text-\n",
"\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\n",
"\u001b[1;36m10\u001b[0m/\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo**\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is a sophisticated multimodal model capable of processing both text and image inputs to produce text outputs.\n",
"It is designed to tackle complex problems with higher accuracy than previous models, leveraging its extensive \n",
"general knowledge and advanced reasoning skills. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is accessible through the OpenAI API for paying customers \n",
"and is optimized for chat applications, although it can also handle traditional completion tasks using the Chat \n",
"Completions API.\n",
"\n",
"**Model Versions:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview**\n",
" - **Description:** This is the latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo model, designed to minimize instances where the model fails to\n",
"complete a task, known as <span style=\"color: #008000; text-decoration-color: #008000\">\"laziness.\"</span> It can return up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to December <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview**\n",
" - **Description:** This version currently points to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview model.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to December <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-preview**\n",
" - **Description:** This version of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo includes enhancements such as improved instruction following, \n",
"JSON mode, reproducible outputs, and parallel function calling. It also supports up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"These models are part of OpenAI's ongoing efforts to provide developers with robust tools for various applications,\n",
"ensuring flexibility and improved performance across different use cases.\n",
"</pre>\n"
],
"text/plain": [
"**GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m4\u001b[0m Turbo**\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m is a sophisticated multimodal model capable of processing both text and image inputs to produce text outputs.\n",
"It is designed to tackle complex problems with higher accuracy than previous models, leveraging its extensive \n",
"general knowledge and advanced reasoning skills. GPT-\u001b[1;36m4\u001b[0m is accessible through the OpenAI API for paying customers \n",
"and is optimized for chat applications, although it can also handle traditional completion tasks using the Chat \n",
"Completions API.\n",
"\n",
"**Model Versions:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview**\n",
" - **Description:** This is the latest GPT-\u001b[1;36m4\u001b[0m Turbo model, designed to minimize instances where the model fails to\n",
"complete a task, known as \u001b[32m\"laziness.\"\u001b[0m It can return up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to December \u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-turbo-preview**\n",
" - **Description:** This version currently points to the gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview model.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to December \u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-preview**\n",
" - **Description:** This version of GPT-\u001b[1;36m4\u001b[0m Turbo includes enhancements such as improved instruction following, \n",
"JSON mode, reproducible outputs, and parallel function calling. It also supports up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to April \u001b[1;36m2023\u001b[0m\n",
"\n",
"These models are part of OpenAI's ongoing efforts to provide developers with robust tools for various applications,\n",
"ensuring flexibility and improved performance across different use cases.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API Overview**\n",
"\n",
"This document provides an overview of various GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> models, highlighting their capabilities, context windows, and \n",
"training data timelines.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview**\n",
" - **Description**: This model has the ability to understand images, in addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo \n",
"capabilities. It currently points to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview model.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data**: Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview**\n",
" - **Description**: Similar to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview, this model can understand images and includes all GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> \n",
"Turbo capabilities. It returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens and is a preview model version.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data**: Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>**\n",
" - **Description**: This model currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> and includes continuous model upgrades.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description**: A snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> from June 13th, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, with improved function calling support.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k**\n",
" - **Description**: This model points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> and includes continuous model upgrades. It was not widely\n",
"rolled out in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description**: A snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k from June 13th, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, with improved function calling support. Like \n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k, it was not widely rolled out in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens\n",
" - **Training Data**: Up to September \n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API Overview**\n",
"\n",
"This document provides an overview of various GPT-\u001b[1;36m4\u001b[0m models, highlighting their capabilities, context windows, and \n",
"training data timelines.\n",
"\n",
"\u001b[1;36m1\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-vision-preview**\n",
" - **Description**: This model has the ability to understand images, in addition to all other GPT-\u001b[1;36m4\u001b[0m Turbo \n",
"capabilities. It currently points to the gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview model.\n",
" - **Context Window**: \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data**: Up to April \u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview**\n",
" - **Description**: Similar to the gpt-\u001b[1;36m4\u001b[0m-vision-preview, this model can understand images and includes all GPT-\u001b[1;36m4\u001b[0m \n",
"Turbo capabilities. It returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens and is a preview model version.\n",
" - **Context Window**: \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data**: Up to April \u001b[1;36m2023\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m**\n",
" - **Description**: This model currently points to gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m and includes continuous model upgrades.\n",
" - **Context Window**: \u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m4\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m**\n",
" - **Description**: A snapshot of gpt-\u001b[1;36m4\u001b[0m from June 13th, \u001b[1;36m2023\u001b[0m, with improved function calling support.\n",
" - **Context Window**: \u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m5\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-32k**\n",
" - **Description**: This model points to gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m and includes continuous model upgrades. It was not widely\n",
"rolled out in favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
" - **Context Window**: \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m6\u001b[0m. **gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m**\n",
" - **Description**: A snapshot of gpt-\u001b[1;36m4\u001b[0m-32k from June 13th, \u001b[1;36m2023\u001b[0m, with improved function calling support. Like \n",
"gpt-\u001b[1;36m4\u001b[0m-32k, it was not widely rolled out in favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
" - **Context Window**: \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens\n",
" - **Training Data**: Up to September \n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Multilingual Capabilities and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo**\n",
"\n",
"**Multilingual Capabilities**\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> surpasses previous large language models and, as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, most state-of-the-art systems. It excels in the \n",
"MMLU benchmark, which involves English-language multiple-choice questions across <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">57</span> subjects. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> not only \n",
"outperforms existing models in English but also shows strong performance in other languages.\n",
"\n",
"**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo**\n",
"\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo models are designed to understand and generate natural language or code. They are optimized for chat \n",
"using the Chat Completions API but are also effective for non-chat tasks.\n",
"\n",
"**Model Descriptions:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>**\n",
" - **Description:** Updated GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo with improved accuracy and a fix for a text encoding bug in non-English\n",
"language function calls. It returns up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo**\n",
" - **Description:** Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. The alias will automatically upgrade to \n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span> on February 16th.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>**\n",
" - **Description:** Features improved instruction following, JSON mode, reproducible outputs, and parallel \n",
"function calling. It returns up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"</pre>\n"
],
"text/plain": [
"**Multilingual Capabilities and GPT-\u001b[1;36m3.5\u001b[0m Turbo**\n",
"\n",
"**Multilingual Capabilities**\n",
"\n",
"GPT-\u001b[1;36m4\u001b[0m surpasses previous large language models and, as of \u001b[1;36m2023\u001b[0m, most state-of-the-art systems. It excels in the \n",
"MMLU benchmark, which involves English-language multiple-choice questions across \u001b[1;36m57\u001b[0m subjects. GPT-\u001b[1;36m4\u001b[0m not only \n",
"outperforms existing models in English but also shows strong performance in other languages.\n",
"\n",
"**GPT-\u001b[1;36m3.5\u001b[0m Turbo**\n",
"\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo models are designed to understand and generate natural language or code. They are optimized for chat \n",
"using the Chat Completions API but are also effective for non-chat tasks.\n",
"\n",
"**Model Descriptions:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m**\n",
" - **Description:** Updated GPT-\u001b[1;36m3.5\u001b[0m Turbo with improved accuracy and a fix for a text encoding bug in non-English\n",
"language function calls. It returns up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo**\n",
" - **Description:** Currently points to gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m. The alias will automatically upgrade to \n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m on February 16th.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m**\n",
" - **Description:** Features improved instruction following, JSON mode, reproducible outputs, and parallel \n",
"function calling. It returns up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API**\n",
"\n",
"**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Models:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct**\n",
" - **Description:** Similar capabilities to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> era models. Compatible with legacy Completions endpoint, not \n",
"Chat Completions.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k**\n",
" - **Description:** Legacy model pointing to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description:** Legacy snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo from June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description:** Legacy snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-turbo from June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>,\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"\n",
"**DALL-E:**\n",
"\n",
"- DALL-E is an AI system that creates realistic images and art from natural language descriptions. DALL-E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> \n",
"supports creating new images with specific sizes and editing existing images or creating variations. Available \n",
"through the Images API and ChatGPT Plus.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>**\n",
" - **Description:** The latest DALL-E model released in November <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>**\n",
" - **Description:** Released in November <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2022</span>, this model offers more realistic, accurate, and higher resolution \n",
"images than the original.\n",
"\n",
"**TTS <span style=\"font-weight: bold\">(</span>Text-to-Speech<span style=\"font-weight: bold\">)</span>:**\n",
"\n",
"- TTS converts text to natural-sounding spoken text. Two model variants are offered:\n",
" - **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>:** Optimized for real-time text-to-speech use cases.\n",
" - **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd:** Optimized for quality.\n",
"- These models can be used with the Speech endpoint in\n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API**\n",
"\n",
"**GPT-\u001b[1;36m3.5\u001b[0m Models:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct**\n",
" - **Description:** Similar capabilities to GPT-\u001b[1;36m3\u001b[0m era models. Compatible with legacy Completions endpoint, not \n",
"Chat Completions.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k**\n",
" - **Description:** Legacy model pointing to gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m**\n",
" - **Description:** Legacy snapshot of gpt-\u001b[1;36m3.5\u001b[0m-turbo from June \u001b[1;36m13\u001b[0m, \u001b[1;36m2023\u001b[0m. Will be deprecated on June \u001b[1;36m13\u001b[0m, \u001b[1;36m2024\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"\u001b[1;36m4\u001b[0m. **gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m**\n",
" - **Description:** Legacy snapshot of gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-turbo from June \u001b[1;36m13\u001b[0m, \u001b[1;36m2023\u001b[0m. Will be deprecated on June \u001b[1;36m13\u001b[0m,\n",
"\u001b[1;36m2024\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"\n",
"**DALL-E:**\n",
"\n",
"- DALL-E is an AI system that creates realistic images and art from natural language descriptions. DALL-E \u001b[1;36m3\u001b[0m \n",
"supports creating new images with specific sizes and editing existing images or creating variations. Available \n",
"through the Images API and ChatGPT Plus.\n",
"\n",
"\u001b[1;36m1\u001b[0m. **dall-e-\u001b[1;36m3\u001b[0m**\n",
" - **Description:** The latest DALL-E model released in November \u001b[1;36m2023\u001b[0m.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **dall-e-\u001b[1;36m2\u001b[0m**\n",
" - **Description:** Released in November \u001b[1;36m2022\u001b[0m, this model offers more realistic, accurate, and higher resolution \n",
"images than the original.\n",
"\n",
"**TTS \u001b[1m(\u001b[0mText-to-Speech\u001b[1m)\u001b[0m:**\n",
"\n",
"- TTS converts text to natural-sounding spoken text. Two model variants are offered:\n",
" - **tts-\u001b[1;36m1\u001b[0m:** Optimized for real-time text-to-speech use cases.\n",
" - **tts-\u001b[1;36m1\u001b[0m-hd:** Optimized for quality.\n",
"- These models can be used with the Speech endpoint in\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API**\n",
"\n",
"**Text-to-Speech Models:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>**: This is a new text-to-speech model optimized for speed, providing efficient conversion of text into \n",
"spoken words.\n",
" \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd**: This model is optimized for quality, offering high-definition text-to-speech conversion.\n",
"\n",
"**Whisper:**\n",
"\n",
"Whisper is a versatile speech recognition model capable of handling diverse audio inputs. It supports multilingual \n",
"speech recognition, speech translation, and language identification. The Whisper v2-large model is accessible via \n",
"the API under the name <span style=\"color: #008000; text-decoration-color: #008000\">\"whisper-1\"</span>. While the open-source version and the API version are similar, the API offers \n",
"an optimized inference process for faster performance. More technical details can be found in the associated paper.\n",
"\n",
"**Embeddings:**\n",
"\n",
"Embeddings are numerical representations of text, useful for measuring the relatedness between text pieces. They \n",
"are applied in search, clustering, recommendations, anomaly detection, and classification tasks.\n",
"\n",
"- **text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large**: The most capable embedding model for both English and non-English tasks, with an \n",
"output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">072</span>.\n",
" \n",
"- **text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small**: Offers improved performance over the second-generation ada embedding model, with an \n",
"output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>.\n",
" \n",
"- **text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: A second-generation embedding model replacing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span> first-generation models, also with \n",
"an output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>.\n",
"\n",
"**Moderation:**\n",
"\n",
"The document mentions a section on moderation, likely related to content moderation capabilities, though specific \n",
"details are not provided in the visible content.\n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API**\n",
"\n",
"**Text-to-Speech Models:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **tts-\u001b[1;36m1\u001b[0m**: This is a new text-to-speech model optimized for speed, providing efficient conversion of text into \n",
"spoken words.\n",
" \n",
"\u001b[1;36m2\u001b[0m. **tts-\u001b[1;36m1\u001b[0m-hd**: This model is optimized for quality, offering high-definition text-to-speech conversion.\n",
"\n",
"**Whisper:**\n",
"\n",
"Whisper is a versatile speech recognition model capable of handling diverse audio inputs. It supports multilingual \n",
"speech recognition, speech translation, and language identification. The Whisper v2-large model is accessible via \n",
"the API under the name \u001b[32m\"whisper-1\"\u001b[0m. While the open-source version and the API version are similar, the API offers \n",
"an optimized inference process for faster performance. More technical details can be found in the associated paper.\n",
"\n",
"**Embeddings:**\n",
"\n",
"Embeddings are numerical representations of text, useful for measuring the relatedness between text pieces. They \n",
"are applied in search, clustering, recommendations, anomaly detection, and classification tasks.\n",
"\n",
"- **text-embedding-\u001b[1;36m3\u001b[0m-large**: The most capable embedding model for both English and non-English tasks, with an \n",
"output dimension of \u001b[1;36m3\u001b[0m,\u001b[1;36m072\u001b[0m.\n",
" \n",
"- **text-embedding-\u001b[1;36m3\u001b[0m-small**: Offers improved performance over the second-generation ada embedding model, with an \n",
"output dimension of \u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m.\n",
" \n",
"- **text-embedding-ada-\u001b[1;36m002\u001b[0m**: A second-generation embedding model replacing \u001b[1;36m16\u001b[0m first-generation models, also with \n",
"an output dimension of \u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m.\n",
"\n",
"**Moderation:**\n",
"\n",
"The document mentions a section on moderation, likely related to content moderation capabilities, though specific \n",
"details are not provided in the visible content.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Moderation Models and GPT Base**\n",
"\n",
"**Moderation Models**\n",
"\n",
"The moderation models are designed to ensure content compliance with OpenAI's usage policies. They classify content\n",
"into categories such as hate, hate/threatening, self-harm, sexual, sexual/minors, violence, and violence/graphic. \n",
"These models process inputs by breaking them into chunks of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens. If the input exceeds <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens, some \n",
"tokens may be truncated, potentially omitting a few from the moderation check.\n",
"\n",
"The moderation endpoint provides the maximum score per category from each request. For instance, if one chunk \n",
"scores <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> and another scores <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1901</span> in a category, the API response will show <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span>.\n",
"\n",
"- **text-moderation-latest**: Points to text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span> with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"- **text-moderation-stable**: Also points to text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span> with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"- **text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>**: The most capable model across all categories with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"\n",
"**GPT Base**\n",
"\n",
"GPT base models are capable of understanding and generating natural language or code but are not trained for \n",
"instruction following. They serve as replacements for the original GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> base models and utilize the legacy \n",
"Completions API. Most users are advised to use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> or GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>.\n",
"\n",
"- **babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: Replaces the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> ada and babbage models, with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span> tokens and training data up to \n",
"September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>.\n",
"- **davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: Replaces the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> curie and davinci models, with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span> tokens and training data up to\n",
"September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>.\n",
"</pre>\n"
],
"text/plain": [
"**Moderation Models and GPT Base**\n",
"\n",
"**Moderation Models**\n",
"\n",
"The moderation models are designed to ensure content compliance with OpenAI's usage policies. They classify content\n",
"into categories such as hate, hate/threatening, self-harm, sexual, sexual/minors, violence, and violence/graphic. \n",
"These models process inputs by breaking them into chunks of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens. If the input exceeds \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens, some \n",
"tokens may be truncated, potentially omitting a few from the moderation check.\n",
"\n",
"The moderation endpoint provides the maximum score per category from each request. For instance, if one chunk \n",
"scores \u001b[1;36m0.9901\u001b[0m and another scores \u001b[1;36m0.1901\u001b[0m in a category, the API response will show \u001b[1;36m0.9901\u001b[0m.\n",
"\n",
"- **text-moderation-latest**: Points to text-moderation-\u001b[1;36m007\u001b[0m with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"- **text-moderation-stable**: Also points to text-moderation-\u001b[1;36m007\u001b[0m with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"- **text-moderation-\u001b[1;36m007\u001b[0m**: The most capable model across all categories with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"\n",
"**GPT Base**\n",
"\n",
"GPT base models are capable of understanding and generating natural language or code but are not trained for \n",
"instruction following. They serve as replacements for the original GPT-\u001b[1;36m3\u001b[0m base models and utilize the legacy \n",
"Completions API. Most users are advised to use GPT-\u001b[1;36m3.5\u001b[0m or GPT-\u001b[1;36m4\u001b[0m.\n",
"\n",
"- **babbage-\u001b[1;36m002\u001b[0m**: Replaces the GPT-\u001b[1;36m3\u001b[0m ada and babbage models, with a max of \u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m tokens and training data up to \n",
"September \u001b[1;36m2021\u001b[0m.\n",
"- **davinci-\u001b[1;36m002\u001b[0m**: Replaces the GPT-\u001b[1;36m3\u001b[0m curie and davinci models, with a max of \u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m tokens and training data up to\n",
"September \u001b[1;36m2021\u001b[0m.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Your Data is Your Data\n",
"\n",
"As of March <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, data sent to the OpenAI API is not used to train or improve OpenAI models unless you \n",
"explicitly opt in. Opting in can help models improve for your specific use case over time.\n",
"\n",
"To prevent abuse, API data may be retained for up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days before deletion, unless legally required otherwise. \n",
"Trusted customers with sensitive applications may have zero data retention, meaning request and response bodies are\n",
"not logged and exist only in memory to serve the request.\n",
"\n",
"This data policy does not apply to OpenAI's non-API consumer services like ChatGPT or DALL-E Labs.\n",
"\n",
"**Default Usage Policies by Endpoint**\n",
"\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: Data is not used for training. Default retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, and it is eligible for \n",
"zero retention except for image inputs.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">files</span>**: Data is not used for training. Retention is until deleted by the customer, with no zero retention \n",
"option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>**: Data is not used for training. Retention is until deleted by the customer, with no zero \n",
"retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">threads</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">messages</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">runs</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/runs/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">steps</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">generations</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">edits</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">variations</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, and it is eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span>**: Data is not used for training\n",
"</pre>\n"
],
"text/plain": [
"Your Data is Your Data\n",
"\n",
"As of March \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m, data sent to the OpenAI API is not used to train or improve OpenAI models unless you \n",
"explicitly opt in. Opting in can help models improve for your specific use case over time.\n",
"\n",
"To prevent abuse, API data may be retained for up to \u001b[1;36m30\u001b[0m days before deletion, unless legally required otherwise. \n",
"Trusted customers with sensitive applications may have zero data retention, meaning request and response bodies are\n",
"not logged and exist only in memory to serve the request.\n",
"\n",
"This data policy does not apply to OpenAI's non-API consumer services like ChatGPT or DALL-E Labs.\n",
"\n",
"**Default Usage Policies by Endpoint**\n",
"\n",
"- **\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m**: Data is not used for training. Default retention is \u001b[1;36m30\u001b[0m days, and it is eligible for \n",
"zero retention except for image inputs.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mfiles\u001b[0m**: Data is not used for training. Retention is until deleted by the customer, with no zero retention \n",
"option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m**: Data is not used for training. Retention is until deleted by the customer, with no zero \n",
"retention option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mthreads\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/\u001b[0m\u001b[95mmessages\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/\u001b[0m\u001b[95mruns\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/runs/\u001b[0m\u001b[95msteps\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95mgenerations\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95medits\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95mvariations\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, and it is eligible for zero retention.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m**: Data is not used for training\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">### Model Endpoint Compatibility and Data Retention\n",
"\n",
"#### Data Retention Details\n",
"\n",
"The table outlines the data retention policies for various API endpoints:\n",
"\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>**: No data is used for training, and there is zero data retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>**: No data is used for training, with a default retention period of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days. It is not \n",
"eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>**: No data is used for training, and data is retained until deleted by the customer. It is\n",
"not eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>**: No data is used for training, and there is zero data retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: No data is used for training, with a default retention period of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days. It is eligible for\n",
"zero retention.\n",
"\n",
"Additional notes:\n",
"- Image inputs via the `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview` model are not eligible for zero retention.\n",
"- The default retention period for the Assistants API is still being evaluated during the Beta phase.\n",
"\n",
"#### Model Endpoint Compatibility\n",
"\n",
"The table provides information on the compatibility of endpoints with the latest models:\n",
"\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>**: Supports all models except `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0301</span>`. The `retrieval` tool requires \n",
"`gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview` or `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span>**: Compatible with `whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>**: Compatible with `whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>**: Compatible with `tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>` and `tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: Compatible with `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k`, \n",
"and `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`.\n",
"\n",
"For more details, users are encouraged to refer to the API data usage policies or contact the sales team for \n",
"information on zero retention.\n",
"</pre>\n"
],
"text/plain": [
"### Model Endpoint Compatibility and Data Retention\n",
"\n",
"#### Data Retention Details\n",
"\n",
"The table outlines the data retention policies for various API endpoints:\n",
"\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m**: No data is used for training, and there is zero data retention.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m**: No data is used for training, with a default retention period of \u001b[1;36m30\u001b[0m days. It is not \n",
"eligible for zero retention.\n",
"- **\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m**: No data is used for training, and data is retained until deleted by the customer. It is\n",
"not eligible for zero retention.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m**: No data is used for training, and there is zero data retention.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m**: No data is used for training, with a default retention period of \u001b[1;36m30\u001b[0m days. It is eligible for\n",
"zero retention.\n",
"\n",
"Additional notes:\n",
"- Image inputs via the `gpt-\u001b[1;36m4\u001b[0m-vision-preview` model are not eligible for zero retention.\n",
"- The default retention period for the Assistants API is still being evaluated during the Beta phase.\n",
"\n",
"#### Model Endpoint Compatibility\n",
"\n",
"The table provides information on the compatibility of endpoints with the latest models:\n",
"\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m**: Supports all models except `gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0301\u001b[0m`. The `retrieval` tool requires \n",
"`gpt-\u001b[1;36m4\u001b[0m-turbo-preview` or `gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m**: Compatible with `whisper-\u001b[1;36m1\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m**: Compatible with `whisper-\u001b[1;36m1\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m**: Compatible with `tts-\u001b[1;36m1\u001b[0m` and `tts-\u001b[1;36m1\u001b[0m-hd`.\n",
"- **\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m**: Compatible with `gpt-\u001b[1;36m4\u001b[0m`, `gpt-\u001b[1;36m4\u001b[0m-turbo-preview`, `gpt-\u001b[1;36m4\u001b[0m-vision-preview`, `gpt-\u001b[1;36m4\u001b[0m-32k`, \n",
"and `gpt-\u001b[1;36m3.5\u001b[0m-turbo`.\n",
"\n",
"For more details, users are encouraged to refer to the API data usage policies or contact the sales team for \n",
"information on zero retention.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">LATEST MODELS\n",
"\n",
"This document outlines the latest models available for different endpoints in the OpenAI API:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span> <span style=\"font-weight: bold\">(</span>Legacy<span style=\"font-weight: bold\">)</span>**:\n",
" - Models: `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct`, `babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`, `davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models are used for generating text completions based on input prompts.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>**:\n",
" - Models: `text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small`, `text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large`, `text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models are designed to convert text into numerical vectors, which can be used for various tasks like \n",
"similarity comparison and clustering.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>**:\n",
" - Models: `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`, `babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`, `davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models support fine-tuning, allowing users to customize the models for specific tasks by training them \n",
"on additional data.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>**:\n",
" - Models: `text-moderation-stable`\n",
" - This model is used for content moderation, helping to identify and filter out inappropriate or harmful \n",
"content.\n",
"\n",
"Additionally, the document mentions the availability of `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k` and other fine-tuned versions of \n",
"`gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`, indicating enhancements in model capabilities and performance.\n",
"</pre>\n"
],
"text/plain": [
"LATEST MODELS\n",
"\n",
"This document outlines the latest models available for different endpoints in the OpenAI API:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m \u001b[1m(\u001b[0mLegacy\u001b[1m)\u001b[0m**:\n",
" - Models: `gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct`, `babbage-\u001b[1;36m002\u001b[0m`, `davinci-\u001b[1;36m002\u001b[0m`\n",
" - These models are used for generating text completions based on input prompts.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m**:\n",
" - Models: `text-embedding-\u001b[1;36m3\u001b[0m-small`, `text-embedding-\u001b[1;36m3\u001b[0m-large`, `text-embedding-ada-\u001b[1;36m002\u001b[0m`\n",
" - These models are designed to convert text into numerical vectors, which can be used for various tasks like \n",
"similarity comparison and clustering.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m**:\n",
" - Models: `gpt-\u001b[1;36m3.5\u001b[0m-turbo`, `babbage-\u001b[1;36m002\u001b[0m`, `davinci-\u001b[1;36m002\u001b[0m`\n",
" - These models support fine-tuning, allowing users to customize the models for specific tasks by training them \n",
"on additional data.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m**:\n",
" - Models: `text-moderation-stable`\n",
" - This model is used for content moderation, helping to identify and filter out inappropriate or harmful \n",
"content.\n",
"\n",
"Additionally, the document mentions the availability of `gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k` and other fine-tuned versions of \n",
"`gpt-\u001b[1;36m3.5\u001b[0m-turbo`, indicating enhancements in model capabilities and performance.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"\n",
"Evaluation is the process of validating \n",
"and testing the outputs that your LLM \n",
"applications are producing. Having \n",
"strong evaluations <span style=\"font-weight: bold\">(</span>“evals”<span style=\"font-weight: bold\">)</span> will mean a \n",
"more stable, reliable application which is \n",
"resilient to code and model changes.\n",
"\n",
"Example use cases\n",
"\n",
"- Quantify a solutions reliability\n",
"- Monitor application performance in \n",
"\n",
"production\n",
"Test for regressions \n",
"\n",
"-\n",
"\n",
"What well cover\n",
"\n",
"● What are evals\n",
"\n",
"● Technical patterns\n",
"\n",
"● Example framework\n",
"\n",
"● Best practices\n",
"\n",
"● Resources\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"\n",
"Evaluation is the process of validating \n",
"and testing the outputs that your LLM \n",
"applications are producing. Having \n",
"strong evaluations \u001b[1m(\u001b[0m“evals”\u001b[1m)\u001b[0m will mean a \n",
"more stable, reliable application which is \n",
"resilient to code and model changes.\n",
"\n",
"Example use cases\n",
"\n",
"- Quantify a solutions reliability\n",
"- Monitor application performance in \n",
"\n",
"production\n",
"Test for regressions \n",
"\n",
"-\n",
"\n",
"What well cover\n",
"\n",
"● What are evals\n",
"\n",
"● Technical patterns\n",
"\n",
"● Example framework\n",
"\n",
"● Best practices\n",
"\n",
"● Resources\n",
"\n",
"\u001b[1;36m3\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What are evals\n",
"Example\n",
"\n",
"An evaluation contains a question and a correct answer. We call this the ground truth.\n",
"\n",
"Question\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"Thought: I dont know. I \n",
"should use a tool\n",
"Action: Search\n",
"Action Input: What is the \n",
"population of Canada?\n",
"\n",
"LLM\n",
"\n",
"Search\n",
"\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people \n",
"in Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"The current population of \n",
"Canada is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> as of \n",
"Tuesday, May <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>….\n",
"\n",
"Actual result\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"\n",
"\n",
"\n",
"An evaluation, or <span style=\"color: #008000; text-decoration-color: #008000\">\"eval,\"</span> involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> \n",
"\n",
"The process begins with a person asking this question. The language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"\n",
"The search tool then provides the answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population of Canada is 39,566,248 as of Tuesday, May 23, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2023.\"</span> This result matches the actual result expected, which is that there are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people in Canada as of \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. \n",
"\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
"\n",
"This slide provides an example of an evaluation process, often referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals.\"</span> The purpose of evals is to \n",
"compare a predicted answer to a known correct answer, called the <span style=\"color: #008000; text-decoration-color: #008000\">\"ground truth,\"</span> to determine if they match.\n",
"\n",
"In this example, the question posed is: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> The ground truth states that the \n",
"population of Canada in <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people. The predicted answer is: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as of 2023.\"</span>\n",
"\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n",
"</pre>\n"
],
"text/plain": [
"What are evals\n",
"Example\n",
"\n",
"An evaluation contains a question and a correct answer. We call this the ground truth.\n",
"\n",
"Question\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"Thought: I dont know. I \n",
"should use a tool\n",
"Action: Search\n",
"Action Input: What is the \n",
"population of Canada?\n",
"\n",
"LLM\n",
"\n",
"Search\n",
"\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people \n",
"in Canada as of \u001b[1;36m2023\u001b[0m.\n",
"\n",
"The current population of \n",
"Canada is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m as of \n",
"Tuesday, May \u001b[1;36m23\u001b[0m, \u001b[1;36m2023\u001b[0m….\n",
"\n",
"Actual result\n",
"\n",
"\u001b[1;36m4\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"An evaluation, or \u001b[32m\"eval,\"\u001b[0m involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, \u001b[32m\"What is the population of Canada?\"\u001b[0m \n",
"\n",
"The process begins with a person asking this question. The language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"\n",
"The search tool then provides the answer: \u001b[32m\"The current population of Canada is 39,566,248 as of Tuesday, May 23, \u001b[0m\n",
"\u001b[32m2023.\"\u001b[0m This result matches the actual result expected, which is that there are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people in Canada as of \n",
"\u001b[1;36m2023\u001b[0m. \n",
"\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
"\n",
"This slide provides an example of an evaluation process, often referred to as \u001b[32m\"evals.\"\u001b[0m The purpose of evals is to \n",
"compare a predicted answer to a known correct answer, called the \u001b[32m\"ground truth,\"\u001b[0m to determine if they match.\n",
"\n",
"In this example, the question posed is: \u001b[32m\"What is the population of Canada?\"\u001b[0m The ground truth states that the \n",
"population of Canada in \u001b[1;36m2023\u001b[0m is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people. The predicted answer is: \u001b[32m\"There are 39,566,248 people in Canada \u001b[0m\n",
"\u001b[32mas of 2023.\"\u001b[0m\n",
"\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What are evals\n",
"Example\n",
"\n",
"Our ground truth matches the predicted answer, so the evaluation passes!\n",
"\n",
"Evaluation\n",
"\n",
"Question\n",
"\n",
"Ground Truth\n",
"\n",
"Predicted Answer\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"The population of Canada in \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people.\n",
"\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people \n",
"in Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>\n",
"\n",
"\n",
"\n",
"\n",
"An evaluation, or <span style=\"color: #008000; text-decoration-color: #008000\">\"eval,\"</span> involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> \n",
"\n",
"The process begins with a person asking this question. The language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"\n",
"The search tool then provides the answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population of Canada is 39,566,248 as of Tuesday, May 23, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2023.\"</span> This result matches the actual result expected, which is that there are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people in Canada as of \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. \n",
"\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
"\n",
"This slide provides an example of an evaluation process, often referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals.\"</span> The purpose of evals is to \n",
"compare a predicted answer to a known correct answer, called the <span style=\"color: #008000; text-decoration-color: #008000\">\"ground truth,\"</span> to determine if they match.\n",
"\n",
"In this example, the question posed is: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> The ground truth states that the \n",
"population of Canada in <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people. The predicted answer is: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as of 2023.\"</span>\n",
"\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n",
"</pre>\n"
],
"text/plain": [
"What are evals\n",
"Example\n",
"\n",
"Our ground truth matches the predicted answer, so the evaluation passes!\n",
"\n",
"Evaluation\n",
"\n",
"Question\n",
"\n",
"Ground Truth\n",
"\n",
"Predicted Answer\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"The population of Canada in \n",
"\u001b[1;36m2023\u001b[0m is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people.\n",
"\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people \n",
"in Canada as of \u001b[1;36m2023\u001b[0m.\n",
"\n",
"\u001b[1;36m5\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"An evaluation, or \u001b[32m\"eval,\"\u001b[0m involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, \u001b[32m\"What is the population of Canada?\"\u001b[0m \n",
"\n",
"The process begins with a person asking this question. The language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"\n",
"The search tool then provides the answer: \u001b[32m\"The current population of Canada is 39,566,248 as of Tuesday, May 23, \u001b[0m\n",
"\u001b[32m2023.\"\u001b[0m This result matches the actual result expected, which is that there are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people in Canada as of \n",
"\u001b[1;36m2023\u001b[0m. \n",
"\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
"\n",
"This slide provides an example of an evaluation process, often referred to as \u001b[32m\"evals.\"\u001b[0m The purpose of evals is to \n",
"compare a predicted answer to a known correct answer, called the \u001b[32m\"ground truth,\"\u001b[0m to determine if they match.\n",
"\n",
"In this example, the question posed is: \u001b[32m\"What is the population of Canada?\"\u001b[0m The ground truth states that the \n",
"population of Canada in \u001b[1;36m2023\u001b[0m is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people. The predicted answer is: \u001b[32m\"There are 39,566,248 people in Canada \u001b[0m\n",
"\u001b[32mas of 2023.\"\u001b[0m\n",
"\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"\n",
"Metric-based evaluations\n",
"\n",
"Component evaluations\n",
"\n",
"Subjective evaluations\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Comparison metrics like \n",
"BLEU, ROUGE\n",
"\n",
"Gives a score to filter and \n",
"rank results\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Compares ground \n",
"truth to prediction\n",
"\n",
"Gives Pass/Fail\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Uses a scorecard to \n",
"evaluate subjectively\n",
"\n",
"Scorecard may also \n",
"have a Pass/Fail\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"\n",
"Metric-based evaluations\n",
"\n",
"Component evaluations\n",
"\n",
"Subjective evaluations\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Comparison metrics like \n",
"BLEU, ROUGE\n",
"\n",
"Gives a score to filter and \n",
"rank results\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Compares ground \n",
"truth to prediction\n",
"\n",
"Gives Pass/Fail\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Uses a scorecard to \n",
"evaluate subjectively\n",
"\n",
"Scorecard may also \n",
"have a Pass/Fail\n",
"\n",
"\u001b[1;36m6\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"ROUGE is a common metric for evaluating machine summarizations of text\n",
"\n",
"ROUGE\n",
"\n",
"Metric for evaluating \n",
"summarization tasks\n",
"\n",
"Original\n",
"\n",
"OpenAI's mission is to ensure that \n",
"artificial general intelligence <span style=\"font-weight: bold\">(</span>AGI<span style=\"font-weight: bold\">)</span> \n",
"benefits all of humanity. OpenAI \n",
"will build safe and beneficial AGI \n",
"directly, but will also consider its \n",
"mission fulfilled if its work aids \n",
"others to achieve this outcome. \n",
"OpenAI follows several key \n",
"principles for this purpose. First, \n",
"broadly distributed benefits - any \n",
"influence over AGI's deployment \n",
"will be used for the benefit of all, \n",
"and to avoid harmful uses or undue \n",
"concentration of power…\n",
"\n",
"Machine \n",
"Summary\n",
"\n",
"OpenAI aims to ensure AGI is \n",
"for everyone's use, totally \n",
"avoiding harmful stuff or big \n",
"power concentration. \n",
"Committed to researching \n",
"AGI's safe side, promoting \n",
"these studies in AI folks. \n",
"OpenAI wants to be top in AI \n",
"things and works with \n",
"worldwide research, policy \n",
"groups to figure AGI's stuff.\n",
"\n",
"ROUGE \n",
"Score\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.51162</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"ROUGE is a common metric for evaluating machine summarizations of text\n",
"\n",
"ROUGE\n",
"\n",
"Metric for evaluating \n",
"summarization tasks\n",
"\n",
"Original\n",
"\n",
"OpenAI's mission is to ensure that \n",
"artificial general intelligence \u001b[1m(\u001b[0mAGI\u001b[1m)\u001b[0m \n",
"benefits all of humanity. OpenAI \n",
"will build safe and beneficial AGI \n",
"directly, but will also consider its \n",
"mission fulfilled if its work aids \n",
"others to achieve this outcome. \n",
"OpenAI follows several key \n",
"principles for this purpose. First, \n",
"broadly distributed benefits - any \n",
"influence over AGI's deployment \n",
"will be used for the benefit of all, \n",
"and to avoid harmful uses or undue \n",
"concentration of power…\n",
"\n",
"Machine \n",
"Summary\n",
"\n",
"OpenAI aims to ensure AGI is \n",
"for everyone's use, totally \n",
"avoiding harmful stuff or big \n",
"power concentration. \n",
"Committed to researching \n",
"AGI's safe side, promoting \n",
"these studies in AI folks. \n",
"OpenAI wants to be top in AI \n",
"things and works with \n",
"worldwide research, policy \n",
"groups to figure AGI's stuff.\n",
"\n",
"ROUGE \n",
"Score\n",
"\n",
"\u001b[1;36m0.51162\u001b[0m\n",
"\n",
"\u001b[1;36m7\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"BLEU score is another standard metric, this time focusing on machine translation tasks\n",
"\n",
"BLEU\n",
"\n",
"Original text\n",
"\n",
"Reference\n",
"Translation\n",
"\n",
"Predicted \n",
"Translation\n",
"\n",
"Metric for \n",
"evaluating \n",
"translation tasks\n",
"\n",
"Y gwir oedd \n",
"doedden nhw \n",
"ddim yn dweud \n",
"celwyddau wedi'r \n",
"cwbl.\n",
"\n",
"The truth was \n",
"they were not \n",
"telling lies after \n",
"all.\n",
"\n",
"The truth was \n",
"they weren't \n",
"telling lies after \n",
"all.\n",
"\n",
"BLEU \n",
"Score\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.39938</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"BLEU score is another standard metric, this time focusing on machine translation tasks\n",
"\n",
"BLEU\n",
"\n",
"Original text\n",
"\n",
"Reference\n",
"Translation\n",
"\n",
"Predicted \n",
"Translation\n",
"\n",
"Metric for \n",
"evaluating \n",
"translation tasks\n",
"\n",
"Y gwir oedd \n",
"doedden nhw \n",
"ddim yn dweud \n",
"celwyddau wedi'r \n",
"cwbl.\n",
"\n",
"The truth was \n",
"they were not \n",
"telling lies after \n",
"all.\n",
"\n",
"The truth was \n",
"they weren't \n",
"telling lies after \n",
"all.\n",
"\n",
"BLEU \n",
"Score\n",
"\n",
"\u001b[1;36m0.39938\u001b[0m\n",
"\n",
"\u001b[1;36m8\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"What theyre good for\n",
"\n",
"What to be aware of\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"A good starting point for evaluating a \n",
"\n",
"● Not tuned to your specific context\n",
"\n",
"fresh solution\n",
"\n",
"Useful yardstick for automated testing \n",
"\n",
"of whether a change has triggered a \n",
"\n",
"major performance shift\n",
"\n",
"● Most customers require more \n",
"\n",
"sophisticated evaluations to go to \n",
"\n",
"production\n",
"\n",
"● Cheap and fast\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">9</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"\n",
"What theyre good for\n",
"\n",
"What to be aware of\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"A good starting point for evaluating a \n",
"\n",
"● Not tuned to your specific context\n",
"\n",
"fresh solution\n",
"\n",
"Useful yardstick for automated testing \n",
"\n",
"of whether a change has triggered a \n",
"\n",
"major performance shift\n",
"\n",
"● Most customers require more \n",
"\n",
"sophisticated evaluations to go to \n",
"\n",
"production\n",
"\n",
"● Cheap and fast\n",
"\n",
"\u001b[1;36m9\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Component evaluations\n",
"\n",
"Component evaluations <span style=\"font-weight: bold\">(</span>or “unit tests”<span style=\"font-weight: bold\">)</span> cover a single input/output of the application. They check \n",
"whether each component works in isolation, comparing the input to a ground truth ideal result\n",
"\n",
"Is this the \n",
"correct action?\n",
"\n",
"Exact match \n",
"comparison\n",
"\n",
"Does this answer \n",
"use the context?\n",
"\n",
"Extract numbers \n",
"from each and \n",
"compare\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"Thought: I dont know. I \n",
"should use a tool\n",
"Action: Search\n",
"Action Input: What is the \n",
"population of Canada?\n",
"\n",
"Agent\n",
"\n",
"Search\n",
"\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people \n",
"in Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"The current population of \n",
"Canada is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> as of \n",
"Tuesday, May <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>….\n",
"\n",
"Is this the right \n",
"search result?\n",
"\n",
"Tag the right \n",
"answer and do \n",
"an exact match \n",
"comparison with \n",
"the retrieval.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Component evaluations\n",
"\n",
"Component evaluations \u001b[1m(\u001b[0mor “unit tests”\u001b[1m)\u001b[0m cover a single input/output of the application. They check \n",
"whether each component works in isolation, comparing the input to a ground truth ideal result\n",
"\n",
"Is this the \n",
"correct action?\n",
"\n",
"Exact match \n",
"comparison\n",
"\n",
"Does this answer \n",
"use the context?\n",
"\n",
"Extract numbers \n",
"from each and \n",
"compare\n",
"\n",
"What is the population \n",
"of Canada?\n",
"\n",
"Thought: I dont know. I \n",
"should use a tool\n",
"Action: Search\n",
"Action Input: What is the \n",
"population of Canada?\n",
"\n",
"Agent\n",
"\n",
"Search\n",
"\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people \n",
"in Canada as of \u001b[1;36m2023\u001b[0m.\n",
"\n",
"The current population of \n",
"Canada is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m as of \n",
"Tuesday, May \u001b[1;36m23\u001b[0m, \u001b[1;36m2023\u001b[0m….\n",
"\n",
"Is this the right \n",
"search result?\n",
"\n",
"Tag the right \n",
"answer and do \n",
"an exact match \n",
"comparison with \n",
"the retrieval.\n",
"\n",
"\u001b[1;36m10\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Subjective evaluations\n",
"\n",
"Building up a good scorecard for automated testing benefits from a few rounds of detailed human \n",
"review so we can learn what is valuable. \n",
"\n",
"A policy of “show rather than tell” is also advised for GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, so include examples of what a <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> and \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> out of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> look like so the model can appreciate the spread.\n",
"\n",
"Example \n",
"scorecard\n",
"\n",
"You are a helpful evaluation assistant who grades how well the Assistant has answered the customers query.\n",
"\n",
"You will assess each submission against these metrics, please think through these step by step:\n",
"\n",
"-\n",
"\n",
"relevance: Grade how relevant the search content is to the question from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> <span style=\"color: #800080; text-decoration-color: #800080\">//</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> being highly relevant and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> \n",
"being \n",
"not relevant at all.\n",
"\n",
"- credibility: Grade how credible the sources provided are from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> <span style=\"color: #800080; text-decoration-color: #800080\">//</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> being an established newspaper, \n",
"\n",
"-\n",
"\n",
"government agency or large company and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> being unreferenced.\n",
"result: Assess whether the question is correct given only the content returned from the search and the users \n",
"question <span style=\"color: #800080; text-decoration-color: #800080\">//</span> acceptable values are “correct” or “incorrect”\n",
"\n",
"You will output this as a JSON document: <span style=\"font-weight: bold\">{</span>relevance: integer, credibility: integer, result: string<span style=\"font-weight: bold\">}</span>\n",
"\n",
"User: What is the population of Canada?\n",
"Assistant: Canada's population was estimated at <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">858</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">480</span> on April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> by Statistics Canada.\n",
"Evaluation: <span style=\"font-weight: bold\">{</span>relevance: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, credibility: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, result: correct<span style=\"font-weight: bold\">}</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">11</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Subjective evaluations\n",
"\n",
"Building up a good scorecard for automated testing benefits from a few rounds of detailed human \n",
"review so we can learn what is valuable. \n",
"\n",
"A policy of “show rather than tell” is also advised for GPT-\u001b[1;36m4\u001b[0m, so include examples of what a \u001b[1;36m1\u001b[0m, \u001b[1;36m3\u001b[0m and \n",
"\u001b[1;36m8\u001b[0m out of \u001b[1;36m10\u001b[0m look like so the model can appreciate the spread.\n",
"\n",
"Example \n",
"scorecard\n",
"\n",
"You are a helpful evaluation assistant who grades how well the Assistant has answered the customers query.\n",
"\n",
"You will assess each submission against these metrics, please think through these step by step:\n",
"\n",
"-\n",
"\n",
"relevance: Grade how relevant the search content is to the question from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m \u001b[1;36m5\u001b[0m being highly relevant and \u001b[1;36m1\u001b[0m \n",
"being \n",
"not relevant at all.\n",
"\n",
"- credibility: Grade how credible the sources provided are from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m \u001b[1;36m5\u001b[0m being an established newspaper, \n",
"\n",
"-\n",
"\n",
"government agency or large company and \u001b[1;36m1\u001b[0m being unreferenced.\n",
"result: Assess whether the question is correct given only the content returned from the search and the users \n",
"question \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m acceptable values are “correct” or “incorrect”\n",
"\n",
"You will output this as a JSON document: \u001b[1m{\u001b[0mrelevance: integer, credibility: integer, result: string\u001b[1m}\u001b[0m\n",
"\n",
"User: What is the population of Canada?\n",
"Assistant: Canada's population was estimated at \u001b[1;36m39\u001b[0m,\u001b[1;36m858\u001b[0m,\u001b[1;36m480\u001b[0m on April \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m by Statistics Canada.\n",
"Evaluation: \u001b[1m{\u001b[0mrelevance: \u001b[1;36m5\u001b[0m, credibility: \u001b[1;36m5\u001b[0m, result: correct\u001b[1m}\u001b[0m\n",
"\n",
"\u001b[1;36m11\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Example framework\n",
"\n",
"Your evaluations can be grouped up into test suites called runs and executed in a batch to test \n",
"the effectiveness of your system.\n",
"\n",
"Each run should have its contents logged and stored at the most granular level possible \n",
"<span style=\"font-weight: bold\">(</span>“tracing”<span style=\"font-weight: bold\">)</span> so you can investigate failure reasons, make tweaks and then rerun your evals.\n",
"\n",
"Run ID Model\n",
"\n",
"Score\n",
"\n",
"Annotation feedback\n",
"\n",
"Changes since last run\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">28</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">36</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">34</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">18</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"\n",
"N/A\n",
"\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"\n",
"Model updated to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"Added few-shot examples\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">42</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> incorrect with correct search results\n",
"\n",
"Added metadata to search\n",
"Prompt engineering for Answer step\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">48</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> incorrect with correct search results\n",
"\n",
"Prompt engineering to Answer step\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span>\n",
"\n",
"\n",
"\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> system. Here's a \n",
"breakdown of the process:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both <span style=\"color: #008000; text-decoration-color: #008000\">\"return,\"</span> and the process passes this evaluation.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of purchase. The expected and predicted outcomes are <span style=\"color: #008000; text-decoration-color: #008000\">\"return_policy,\"</span> and this step also passes.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **Response to User**: The system responds to the user, confirming that the return can be processed because it is\n",
"within the <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-day window.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>. **Evaluation**: The response is evaluated for adherence to guidelines, scoring <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> for politeness, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for \n",
"coherence, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for relevancy, resulting in a pass.\n",
"\n",
"The framework uses both component evaluations <span style=\"font-weight: bold\">(</span>red dashed lines<span style=\"font-weight: bold\">)</span> and subjective evaluations <span style=\"font-weight: bold\">(</span>orange dashed lines<span style=\"font-weight: bold\">)</span> \n",
"to ensure the process is accurate and user-friendly.\n",
"</pre>\n"
],
"text/plain": [
"Example framework\n",
"\n",
"Your evaluations can be grouped up into test suites called runs and executed in a batch to test \n",
"the effectiveness of your system.\n",
"\n",
"Each run should have its contents logged and stored at the most granular level possible \n",
"\u001b[1m(\u001b[0m“tracing”\u001b[1m)\u001b[0m so you can investigate failure reasons, make tweaks and then rerun your evals.\n",
"\n",
"Run ID Model\n",
"\n",
"Score\n",
"\n",
"Annotation feedback\n",
"\n",
"Changes since last run\n",
"\n",
"\u001b[1;36m1\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m\n",
"\n",
"\u001b[1;36m4\u001b[0m\n",
"\n",
"\u001b[1;36m5\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m28\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m4\u001b[0m\n",
"\n",
"\u001b[1;36m36\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m34\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"\n",
"● \u001b[1;36m18\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"\n",
"N/A\n",
"\n",
"● \u001b[1;36m10\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"\n",
"● \u001b[1;36m12\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"\n",
"Model updated to GPT-\u001b[1;36m4\u001b[0m\n",
"\n",
"Added few-shot examples\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m42\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"\n",
"● \u001b[1;36m8\u001b[0m incorrect with correct search results\n",
"\n",
"Added metadata to search\n",
"Prompt engineering for Answer step\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m48\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"\n",
"● \u001b[1;36m2\u001b[0m incorrect with correct search results\n",
"\n",
"Prompt engineering to Answer step\n",
"\n",
"\u001b[1;36m12\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m system. Here's a \n",
"breakdown of the process:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both \u001b[32m\"return,\"\u001b[0m and the process passes this evaluation.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"\u001b[1;36m14\u001b[0m days of purchase. The expected and predicted outcomes are \u001b[32m\"return_policy,\"\u001b[0m and this step also passes.\n",
"\n",
"\u001b[1;36m5\u001b[0m. **Response to User**: The system responds to the user, confirming that the return can be processed because it is\n",
"within the \u001b[1;36m14\u001b[0m-day window.\n",
"\n",
"\u001b[1;36m6\u001b[0m. **Evaluation**: The response is evaluated for adherence to guidelines, scoring \u001b[1;36m5\u001b[0m for politeness, \u001b[1;36m4\u001b[0m for \n",
"coherence, and \u001b[1;36m4\u001b[0m for relevancy, resulting in a pass.\n",
"\n",
"The framework uses both component evaluations \u001b[1m(\u001b[0mred dashed lines\u001b[1m)\u001b[0m and subjective evaluations \u001b[1m(\u001b[0morange dashed lines\u001b[1m)\u001b[0m \n",
"to ensure the process is accurate and user-friendly.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Example framework\n",
"\n",
"I want to return a \n",
"T-shirt I bought on \n",
"Amazon on March 3rd.\n",
"\n",
"User\n",
"\n",
"Router\n",
"\n",
"LLM\n",
"\n",
"Expected: return\n",
"Predicted: return\n",
"PASS\n",
"\n",
"Return\n",
"Assistant\n",
"\n",
"LLM\n",
"\n",
"Component evals\n",
"\n",
"Subjective evals\n",
"\n",
"Expected: return_policy\n",
"Predicted: return_policy\n",
"PASS\n",
"\n",
"Knowledge \n",
"base\n",
"\n",
"Question: Does this response adhere to \n",
"our guidelines\n",
"Score: \n",
"Politeness: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, Coherence: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, Relevancy: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"PASS\n",
"\n",
"Sure - because were \n",
"within <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of the \n",
"purchase, I can \n",
"process the return\n",
"\n",
"Question: I want to return a T-shirt I \n",
"bought on Amazon on March 3rd.\n",
"Ground truth: Eligible for return\n",
"PASS\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>\n",
"\n",
"\n",
"\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> system. Here's a \n",
"breakdown of the process:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both <span style=\"color: #008000; text-decoration-color: #008000\">\"return,\"</span> and the process passes this evaluation.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of purchase. The expected and predicted outcomes are <span style=\"color: #008000; text-decoration-color: #008000\">\"return_policy,\"</span> and this step also passes.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **Response to User**: The system responds to the user, confirming that the return can be processed because it is\n",
"within the <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-day window.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>. **Evaluation**: The response is evaluated for adherence to guidelines, scoring <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> for politeness, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for \n",
"coherence, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for relevancy, resulting in a pass.\n",
"\n",
"The framework uses both component evaluations <span style=\"font-weight: bold\">(</span>red dashed lines<span style=\"font-weight: bold\">)</span> and subjective evaluations <span style=\"font-weight: bold\">(</span>orange dashed lines<span style=\"font-weight: bold\">)</span> \n",
"to ensure the process is accurate and user-friendly.\n",
"</pre>\n"
],
"text/plain": [
"Example framework\n",
"\n",
"I want to return a \n",
"T-shirt I bought on \n",
"Amazon on March 3rd.\n",
"\n",
"User\n",
"\n",
"Router\n",
"\n",
"LLM\n",
"\n",
"Expected: return\n",
"Predicted: return\n",
"PASS\n",
"\n",
"Return\n",
"Assistant\n",
"\n",
"LLM\n",
"\n",
"Component evals\n",
"\n",
"Subjective evals\n",
"\n",
"Expected: return_policy\n",
"Predicted: return_policy\n",
"PASS\n",
"\n",
"Knowledge \n",
"base\n",
"\n",
"Question: Does this response adhere to \n",
"our guidelines\n",
"Score: \n",
"Politeness: \u001b[1;36m5\u001b[0m, Coherence: \u001b[1;36m4\u001b[0m, Relevancy: \u001b[1;36m4\u001b[0m\n",
"PASS\n",
"\n",
"Sure - because were \n",
"within \u001b[1;36m14\u001b[0m days of the \n",
"purchase, I can \n",
"process the return\n",
"\n",
"Question: I want to return a T-shirt I \n",
"bought on Amazon on March 3rd.\n",
"Ground truth: Eligible for return\n",
"PASS\n",
"\n",
"\u001b[1;36m13\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m system. Here's a \n",
"breakdown of the process:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both \u001b[32m\"return,\"\u001b[0m and the process passes this evaluation.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"\u001b[1;36m14\u001b[0m days of purchase. The expected and predicted outcomes are \u001b[32m\"return_policy,\"\u001b[0m and this step also passes.\n",
"\n",
"\u001b[1;36m5\u001b[0m. **Response to User**: The system responds to the user, confirming that the return can be processed because it is\n",
"within the \u001b[1;36m14\u001b[0m-day window.\n",
"\n",
"\u001b[1;36m6\u001b[0m. **Evaluation**: The response is evaluated for adherence to guidelines, scoring \u001b[1;36m5\u001b[0m for politeness, \u001b[1;36m4\u001b[0m for \n",
"coherence, and \u001b[1;36m4\u001b[0m for relevancy, resulting in a pass.\n",
"\n",
"The framework uses both component evaluations \u001b[1m(\u001b[0mred dashed lines\u001b[1m)\u001b[0m and subjective evaluations \u001b[1m(\u001b[0morange dashed lines\u001b[1m)\u001b[0m \n",
"to ensure the process is accurate and user-friendly.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Best practices\n",
"\n",
"Log everything\n",
"\n",
"●\n",
"\n",
"Evals need test cases - log everything as you develop so you can mine your logs for good eval cases\n",
"\n",
"Create a feedback loop\n",
"\n",
"●\n",
"●\n",
"\n",
"Build evals into your application so you can quickly run them, iterate and rerun to see the impact\n",
"Evals also provide a useful structure for few-shot or fine-tuning examples when optimizing\n",
"\n",
"Employ expert labellers who know the process\n",
"\n",
"● Use experts to help create your eval cases - these need to be as lifelike as possible\n",
"\n",
"Evaluate early and often\n",
"\n",
"●\n",
"\n",
"Evals are something you should build as soon as you have your first functioning prompt - you wont be \n",
"able to optimize without this baseline, so build it early\n",
"\n",
"● Making evals early also forces you to engage with what a good response looks like\n",
"\n",
"\n",
"\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Log Everything**\n",
" - It's important to log all test cases during development. This allows you to mine your logs for effective \n",
"evaluation cases.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Create a Feedback Loop**\n",
" - Integrate evaluations into your application to quickly run, iterate, and rerun them to observe impacts.\n",
" - Evaluations provide a useful structure for few-shot or fine-tuning examples during optimization.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Employ Expert Labelers Who Know the Process**\n",
" - Use experts to help create evaluation cases, ensuring they are as realistic as possible.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Evaluate Early and Often**\n",
" - Build evaluations as soon as you have a functioning prompt. This baseline is crucial for optimization.\n",
" - Early evaluations help you understand what a good response looks like, facilitating better engagement.\n",
"</pre>\n"
],
"text/plain": [
"Best practices\n",
"\n",
"Log everything\n",
"\n",
"●\n",
"\n",
"Evals need test cases - log everything as you develop so you can mine your logs for good eval cases\n",
"\n",
"Create a feedback loop\n",
"\n",
"●\n",
"●\n",
"\n",
"Build evals into your application so you can quickly run them, iterate and rerun to see the impact\n",
"Evals also provide a useful structure for few-shot or fine-tuning examples when optimizing\n",
"\n",
"Employ expert labellers who know the process\n",
"\n",
"● Use experts to help create your eval cases - these need to be as lifelike as possible\n",
"\n",
"Evaluate early and often\n",
"\n",
"●\n",
"\n",
"Evals are something you should build as soon as you have your first functioning prompt - you wont be \n",
"able to optimize without this baseline, so build it early\n",
"\n",
"● Making evals early also forces you to engage with what a good response looks like\n",
"\n",
"\n",
"\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Log Everything**\n",
" - It's important to log all test cases during development. This allows you to mine your logs for effective \n",
"evaluation cases.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Create a Feedback Loop**\n",
" - Integrate evaluations into your application to quickly run, iterate, and rerun them to observe impacts.\n",
" - Evaluations provide a useful structure for few-shot or fine-tuning examples during optimization.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Employ Expert Labelers Who Know the Process**\n",
" - Use experts to help create evaluation cases, ensuring they are as realistic as possible.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Evaluate Early and Often**\n",
" - Build evaluations as soon as you have a functioning prompt. This baseline is crucial for optimization.\n",
" - Early evaluations help you understand what a good response looks like, facilitating better engagement.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">## Overview\n",
"\n",
"Evaluation is the process of validating and testing the outputs that your Large Language Model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> applications \n",
"are producing. Strong evaluations, referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals,\"</span> contribute to creating a more stable and reliable \n",
"application that can withstand changes in code and model updates.\n",
"\n",
"### Example Use Cases\n",
"- **Quantify a solutions reliability**: Measure how dependable your application is.\n",
"- **Monitor application performance in production**: Keep track of how well your application performs in real-world\n",
"scenarios.\n",
"- **Test for regressions**: Ensure that new updates do not negatively impact existing functionality.\n",
"\n",
"### What Well Cover\n",
"- **What are evals**: Understanding the concept and importance of evaluations.\n",
"- **Technical patterns**: Exploring common methods and strategies used in evaluations.\n",
"- **Example framework**: Providing a structured approach to implementing evaluations.\n",
"- **Best practices**: Sharing tips and guidelines for effective evaluations.\n",
"- **Resources**: Offering additional materials for further learning and exploration.\n",
"</pre>\n"
],
"text/plain": [
"## Overview\n",
"\n",
"Evaluation is the process of validating and testing the outputs that your Large Language Model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m applications \n",
"are producing. Strong evaluations, referred to as \u001b[32m\"evals,\"\u001b[0m contribute to creating a more stable and reliable \n",
"application that can withstand changes in code and model updates.\n",
"\n",
"### Example Use Cases\n",
"- **Quantify a solutions reliability**: Measure how dependable your application is.\n",
"- **Monitor application performance in production**: Keep track of how well your application performs in real-world\n",
"scenarios.\n",
"- **Test for regressions**: Ensure that new updates do not negatively impact existing functionality.\n",
"\n",
"### What Well Cover\n",
"- **What are evals**: Understanding the concept and importance of evaluations.\n",
"- **Technical patterns**: Exploring common methods and strategies used in evaluations.\n",
"- **Example framework**: Providing a structured approach to implementing evaluations.\n",
"- **Best practices**: Sharing tips and guidelines for effective evaluations.\n",
"- **Resources**: Offering additional materials for further learning and exploration.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns**\n",
"\n",
"This slide outlines three types of evaluation methods used in technical assessments:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Metric-based Evaluations**:\n",
" - These evaluations use comparison metrics such as BLEU and ROUGE. \n",
" - They provide a score that helps in filtering and ranking results, making it easier to assess the quality of \n",
"outputs quantitatively.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Component Evaluations**:\n",
" - This method involves comparing the ground truth to predictions.\n",
" - It results in a simple Pass/Fail outcome, which is useful for determining whether specific components meet the\n",
"required standards.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Subjective Evaluations**:\n",
" - These evaluations rely on a scorecard to assess outputs subjectively.\n",
" - The scorecard can also include a Pass/Fail option, allowing for a more nuanced evaluation that considers \n",
"qualitative aspects.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns**\n",
"\n",
"This slide outlines three types of evaluation methods used in technical assessments:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Metric-based Evaluations**:\n",
" - These evaluations use comparison metrics such as BLEU and ROUGE. \n",
" - They provide a score that helps in filtering and ranking results, making it easier to assess the quality of \n",
"outputs quantitatively.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Component Evaluations**:\n",
" - This method involves comparing the ground truth to predictions.\n",
" - It results in a simple Pass/Fail outcome, which is useful for determining whether specific components meet the\n",
"required standards.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Subjective Evaluations**:\n",
" - These evaluations rely on a scorecard to assess outputs subjectively.\n",
" - The scorecard can also include a Pass/Fail option, allowing for a more nuanced evaluation that considers \n",
"qualitative aspects.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Metric-based Evaluations\n",
"\n",
"ROUGE is a common metric for evaluating machine summarizations of text. It is specifically used to assess the \n",
"quality of summaries by comparing them to reference summaries. The slide provides an example of how ROUGE is \n",
"applied:\n",
"\n",
"- **Original Text**: This is a detailed description of OpenAI's mission, emphasizing the development of artificial \n",
"general intelligence <span style=\"font-weight: bold\">(</span>AGI<span style=\"font-weight: bold\">)</span> that benefits humanity. It highlights the importance of safety, broad distribution of \n",
"benefits, and avoiding harmful uses or power concentration.\n",
"\n",
"- **Machine Summary**: This is a condensed version of the original text. It focuses on ensuring AGI is safe and \n",
"accessible, avoiding harm and power concentration, and promoting research and collaboration in AI.\n",
"\n",
"- **ROUGE Score**: The score given is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.51162</span>, which quantifies the similarity between the machine-generated \n",
"summary and the original text. A higher score indicates a closer match to the reference summary.\n",
"\n",
"Overall, ROUGE helps in evaluating how well a machine-generated summary captures the essence of the original text.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Metric-based Evaluations\n",
"\n",
"ROUGE is a common metric for evaluating machine summarizations of text. It is specifically used to assess the \n",
"quality of summaries by comparing them to reference summaries. The slide provides an example of how ROUGE is \n",
"applied:\n",
"\n",
"- **Original Text**: This is a detailed description of OpenAI's mission, emphasizing the development of artificial \n",
"general intelligence \u001b[1m(\u001b[0mAGI\u001b[1m)\u001b[0m that benefits humanity. It highlights the importance of safety, broad distribution of \n",
"benefits, and avoiding harmful uses or power concentration.\n",
"\n",
"- **Machine Summary**: This is a condensed version of the original text. It focuses on ensuring AGI is safe and \n",
"accessible, avoiding harm and power concentration, and promoting research and collaboration in AI.\n",
"\n",
"- **ROUGE Score**: The score given is \u001b[1;36m0.51162\u001b[0m, which quantifies the similarity between the machine-generated \n",
"summary and the original text. A higher score indicates a closer match to the reference summary.\n",
"\n",
"Overall, ROUGE helps in evaluating how well a machine-generated summary captures the essence of the original text.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Metric-based Evaluations\n",
"\n",
"The slide discusses the BLEU score, a standard metric used to evaluate machine translation tasks. BLEU stands for \n",
"Bilingual Evaluation Understudy and is a method for assessing the quality of text that has been machine-translated \n",
"from one language to another.\n",
"\n",
"### Key Elements:\n",
"\n",
"- **BLEU**: This is a metric specifically designed for evaluating translation tasks. It compares the \n",
"machine-generated translation to one or more reference translations.\n",
"\n",
"- **Original Text**: The example given is in Welsh: <span style=\"color: #008000; text-decoration-color: #008000\">\"Y gwir oedd doedden nhw ddim yn dweud celwyddau wedi'r cwbl.\"</span>\n",
"\n",
"- **Reference Translation**: This is the human-generated translation used as a standard for comparison: <span style=\"color: #008000; text-decoration-color: #008000\">\"The truth </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">was they were not telling lies after all.\"</span>\n",
"\n",
"- **Predicted Translation**: This is the translation produced by the machine: <span style=\"color: #008000; text-decoration-color: #008000\">\"The truth was they weren't telling </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">lies after all.\"</span>\n",
"\n",
"- **BLEU Score**: The score for this translation is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.39938</span>. This score indicates how closely the machine \n",
"translation matches the reference translation, with a higher score representing a closer match.\n",
"\n",
"The BLEU score is widely used in the field of natural language processing to provide a quantitative measure of \n",
"translation quality.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Metric-based Evaluations\n",
"\n",
"The slide discusses the BLEU score, a standard metric used to evaluate machine translation tasks. BLEU stands for \n",
"Bilingual Evaluation Understudy and is a method for assessing the quality of text that has been machine-translated \n",
"from one language to another.\n",
"\n",
"### Key Elements:\n",
"\n",
"- **BLEU**: This is a metric specifically designed for evaluating translation tasks. It compares the \n",
"machine-generated translation to one or more reference translations.\n",
"\n",
"- **Original Text**: The example given is in Welsh: \u001b[32m\"Y gwir oedd doedden nhw ddim yn dweud celwyddau wedi'r cwbl.\"\u001b[0m\n",
"\n",
"- **Reference Translation**: This is the human-generated translation used as a standard for comparison: \u001b[32m\"The truth \u001b[0m\n",
"\u001b[32mwas they were not telling lies after all.\"\u001b[0m\n",
"\n",
"- **Predicted Translation**: This is the translation produced by the machine: \u001b[32m\"The truth was they weren't telling \u001b[0m\n",
"\u001b[32mlies after all.\"\u001b[0m\n",
"\n",
"- **BLEU Score**: The score for this translation is \u001b[1;36m0.39938\u001b[0m. This score indicates how closely the machine \n",
"translation matches the reference translation, with a higher score representing a closer match.\n",
"\n",
"The BLEU score is widely used in the field of natural language processing to provide a quantitative measure of \n",
"translation quality.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Metric-based Evaluations\n",
"\n",
"**What theyre good for:**\n",
"\n",
"- **Starting Point**: They provide a good starting point for evaluating a new solution, helping to establish \n",
"initial benchmarks.\n",
"- **Automated Testing**: These evaluations serve as a useful yardstick for automated testing, particularly in \n",
"determining if a change has caused a significant performance shift.\n",
"- **Cost-Effective**: They are cheap and fast, making them accessible for quick assessments.\n",
"\n",
"**What to be aware of:**\n",
"\n",
"- **Context Specificity**: These evaluations are not tailored to specific contexts, which can limit their \n",
"effectiveness in certain situations.\n",
"- **Sophistication Needs**: Most customers require more sophisticated evaluations before moving to production, \n",
"indicating that metric-based evaluations might not be sufficient on their own for final decision-making.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Metric-based Evaluations\n",
"\n",
"**What theyre good for:**\n",
"\n",
"- **Starting Point**: They provide a good starting point for evaluating a new solution, helping to establish \n",
"initial benchmarks.\n",
"- **Automated Testing**: These evaluations serve as a useful yardstick for automated testing, particularly in \n",
"determining if a change has caused a significant performance shift.\n",
"- **Cost-Effective**: They are cheap and fast, making them accessible for quick assessments.\n",
"\n",
"**What to be aware of:**\n",
"\n",
"- **Context Specificity**: These evaluations are not tailored to specific contexts, which can limit their \n",
"effectiveness in certain situations.\n",
"- **Sophistication Needs**: Most customers require more sophisticated evaluations before moving to production, \n",
"indicating that metric-based evaluations might not be sufficient on their own for final decision-making.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Component Evaluations**\n",
"\n",
"Component evaluations, also known as <span style=\"color: #008000; text-decoration-color: #008000\">\"unit tests,\"</span> focus on assessing a single input/output of an application. The \n",
"goal is to verify that each component functions correctly in isolation by comparing the input to a predefined ideal\n",
"result, known as the ground truth.\n",
"\n",
"**Process Overview:**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Input Question:** \n",
" - The process begins with a question: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Agent's Role:**\n",
" - The agent receives the question and processes it. The agent's thought process is: <span style=\"color: #008000; text-decoration-color: #008000\">\"I dont know. I should use </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">a tool.\"</span>\n",
" - The agent decides on an action: <span style=\"color: #008000; text-decoration-color: #008000\">\"Search.\"</span>\n",
" - The action input is the original question: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Search Component:**\n",
" - The search component is tasked with finding the answer. It retrieves the information: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of Canada is 39,566,248 as of Tuesday, May 23, 2023….\"</span>\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Evaluation Steps:**\n",
" - **Correct Action Check:** Is the agent's decision to search the correct action?\n",
" - **Exact Match Comparison:** Does the retrieved answer match the expected result exactly?\n",
" - **Contextual Relevance:** Does the answer use the context provided in the question?\n",
" - **Number Extraction and Comparison:** Extract numbers from both the expected and retrieved answers and compare\n",
"them for accuracy.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **Final Output:**\n",
" - The final output is the verified answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada as of 2023.\"</span>\n",
"\n",
"This process ensures that each component of the application is functioning correctly and producing accurate results\n",
"by systematically evaluating each step against the ground truth.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Component Evaluations**\n",
"\n",
"Component evaluations, also known as \u001b[32m\"unit tests,\"\u001b[0m focus on assessing a single input/output of an application. The \n",
"goal is to verify that each component functions correctly in isolation by comparing the input to a predefined ideal\n",
"result, known as the ground truth.\n",
"\n",
"**Process Overview:**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Input Question:** \n",
" - The process begins with a question: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Agent's Role:**\n",
" - The agent receives the question and processes it. The agent's thought process is: \u001b[32m\"I dont know. I should use \u001b[0m\n",
"\u001b[32ma tool.\"\u001b[0m\n",
" - The agent decides on an action: \u001b[32m\"Search.\"\u001b[0m\n",
" - The action input is the original question: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Search Component:**\n",
" - The search component is tasked with finding the answer. It retrieves the information: \u001b[32m\"The current population \u001b[0m\n",
"\u001b[32mof Canada is 39,566,248 as of Tuesday, May 23, 2023….\"\u001b[0m\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Evaluation Steps:**\n",
" - **Correct Action Check:** Is the agent's decision to search the correct action?\n",
" - **Exact Match Comparison:** Does the retrieved answer match the expected result exactly?\n",
" - **Contextual Relevance:** Does the answer use the context provided in the question?\n",
" - **Number Extraction and Comparison:** Extract numbers from both the expected and retrieved answers and compare\n",
"them for accuracy.\n",
"\n",
"\u001b[1;36m5\u001b[0m. **Final Output:**\n",
" - The final output is the verified answer: \u001b[32m\"There are 39,566,248 people in Canada as of 2023.\"\u001b[0m\n",
"\n",
"This process ensures that each component of the application is functioning correctly and producing accurate results\n",
"by systematically evaluating each step against the ground truth.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Subjective Evaluations**\n",
"\n",
"Building an effective scorecard for automated testing is enhanced by incorporating detailed human reviews. This \n",
"process helps identify what is truly valuable. The approach of <span style=\"color: #008000; text-decoration-color: #008000\">\"show rather than tell\"</span> is recommended for GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, \n",
"meaning that examples of scores like <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> out of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> should be provided to help the model understand the \n",
"range.\n",
"\n",
"**Example Scorecard:**\n",
"\n",
"- **Role**: You are an evaluation assistant assessing how well the Assistant has answered a customer's query.\n",
" \n",
"- **Metrics for Assessment**:\n",
" - **Relevance**: Rate the relevance of the search content to the question on a scale from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, where <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> is \n",
"highly relevant and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is not relevant at all.\n",
" - **Credibility**: Rate the credibility of the sources from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, where <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> is an established newspaper, \n",
"government agency, or large company, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is unreferenced.\n",
" - **Result**: Determine if the question is answered correctly based on the search content and the user's \n",
"question. Acceptable values are <span style=\"color: #008000; text-decoration-color: #008000\">\"correct\"</span> or <span style=\"color: #008000; text-decoration-color: #008000\">\"incorrect.\"</span>\n",
"\n",
"- **Output Format**: Provide the evaluation as a JSON document with fields for relevance, credibility, and result.\n",
"\n",
"**Example Evaluation**:\n",
"- **User Query**: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"- **Assistant's Response**: <span style=\"color: #008000; text-decoration-color: #008000\">\"Canada's population was estimated at 39,858,480 on April 1, 2023, by Statistics </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Canada.\"</span>\n",
"- **Evaluation**: `<span style=\"font-weight: bold\">{</span>relevance: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, credibility: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, result: correct<span style=\"font-weight: bold\">}</span>`\n",
"\n",
"This structured approach ensures clarity and consistency in evaluating the performance of automated systems.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Subjective Evaluations**\n",
"\n",
"Building an effective scorecard for automated testing is enhanced by incorporating detailed human reviews. This \n",
"process helps identify what is truly valuable. The approach of \u001b[32m\"show rather than tell\"\u001b[0m is recommended for GPT-\u001b[1;36m4\u001b[0m, \n",
"meaning that examples of scores like \u001b[1;36m1\u001b[0m, \u001b[1;36m3\u001b[0m, and \u001b[1;36m8\u001b[0m out of \u001b[1;36m10\u001b[0m should be provided to help the model understand the \n",
"range.\n",
"\n",
"**Example Scorecard:**\n",
"\n",
"- **Role**: You are an evaluation assistant assessing how well the Assistant has answered a customer's query.\n",
" \n",
"- **Metrics for Assessment**:\n",
" - **Relevance**: Rate the relevance of the search content to the question on a scale from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m, where \u001b[1;36m5\u001b[0m is \n",
"highly relevant and \u001b[1;36m1\u001b[0m is not relevant at all.\n",
" - **Credibility**: Rate the credibility of the sources from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m, where \u001b[1;36m5\u001b[0m is an established newspaper, \n",
"government agency, or large company, and \u001b[1;36m1\u001b[0m is unreferenced.\n",
" - **Result**: Determine if the question is answered correctly based on the search content and the user's \n",
"question. Acceptable values are \u001b[32m\"correct\"\u001b[0m or \u001b[32m\"incorrect.\"\u001b[0m\n",
"\n",
"- **Output Format**: Provide the evaluation as a JSON document with fields for relevance, credibility, and result.\n",
"\n",
"**Example Evaluation**:\n",
"- **User Query**: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"- **Assistant's Response**: \u001b[32m\"Canada's population was estimated at 39,858,480 on April 1, 2023, by Statistics \u001b[0m\n",
"\u001b[32mCanada.\"\u001b[0m\n",
"- **Evaluation**: `\u001b[1m{\u001b[0mrelevance: \u001b[1;36m5\u001b[0m, credibility: \u001b[1;36m5\u001b[0m, result: correct\u001b[1m}\u001b[0m`\n",
"\n",
"This structured approach ensures clarity and consistency in evaluating the performance of automated systems.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Example Framework**\n",
"\n",
"This framework outlines a method for evaluating the effectiveness of a system by grouping evaluations into test \n",
"suites called <span style=\"color: #008000; text-decoration-color: #008000\">\"runs.\"</span> These runs are executed in batches, and each run's contents are logged and stored at a \n",
"detailed level, known as <span style=\"color: #008000; text-decoration-color: #008000\">\"tracing.\"</span> This allows for investigation of failures, making adjustments, and rerunning \n",
"evaluations.\n",
"\n",
"The table provides a summary of different runs:\n",
"\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>**: \n",
" - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">28</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">18</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: N/A\n",
"\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>**: \n",
" - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">36</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: Model updated to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>**: \n",
" - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">34</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: Added few-shot examples\n",
"\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>**: \n",
" - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">42</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> incorrect with correct search results\n",
" - Changes: Added metadata to search, Prompt engineering for Answer step\n",
"\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>**: \n",
" - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">48</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> incorrect with correct search results\n",
" - Changes: Prompt engineering to Answer step\n",
"\n",
"This framework emphasizes the importance of detailed logging and iterative improvements to enhance system \n",
"performance.\n",
"</pre>\n"
],
"text/plain": [
"**Example Framework**\n",
"\n",
"This framework outlines a method for evaluating the effectiveness of a system by grouping evaluations into test \n",
"suites called \u001b[32m\"runs.\"\u001b[0m These runs are executed in batches, and each run's contents are logged and stored at a \n",
"detailed level, known as \u001b[32m\"tracing.\"\u001b[0m This allows for investigation of failures, making adjustments, and rerunning \n",
"evaluations.\n",
"\n",
"The table provides a summary of different runs:\n",
"\n",
"- **Run ID \u001b[1;36m1\u001b[0m**: \n",
" - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m28\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m18\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: N/A\n",
"\n",
"- **Run ID \u001b[1;36m2\u001b[0m**: \n",
" - Model: gpt-\u001b[1;36m4\u001b[0m\n",
" - Score: \u001b[1;36m36\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m10\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: Model updated to GPT-\u001b[1;36m4\u001b[0m\n",
"\n",
"- **Run ID \u001b[1;36m3\u001b[0m**: \n",
" - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m34\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m12\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: Added few-shot examples\n",
"\n",
"- **Run ID \u001b[1;36m4\u001b[0m**: \n",
" - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m42\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m8\u001b[0m incorrect with correct search results\n",
" - Changes: Added metadata to search, Prompt engineering for Answer step\n",
"\n",
"- **Run ID \u001b[1;36m5\u001b[0m**: \n",
" - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m48\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m2\u001b[0m incorrect with correct search results\n",
" - Changes: Prompt engineering to Answer step\n",
"\n",
"This framework emphasizes the importance of detailed logging and iterative improvements to enhance system \n",
"performance.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"\n",
"Fine-tuning involves adjusting the \n",
"parameters of pre-trained models on a \n",
"specific dataset or task. This process \n",
"enhances the model's ability to generate \n",
"more accurate and relevant responses for \n",
"the given context by adapting it to the \n",
"nuances and specific requirements of the \n",
"task at hand.\n",
"\n",
"Example use cases\n",
"\n",
"- Generate output in a consistent \n",
"\n",
"-\n",
"\n",
"format\n",
"Process input by following specific \n",
"instructions\n",
"\n",
"What well cover\n",
"\n",
"● When to fine-tune\n",
"\n",
"● Preparing the dataset\n",
"\n",
"● Best practices\n",
"\n",
"● Hyperparameters\n",
"\n",
"● Fine-tuning advances\n",
"\n",
"● Resources\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"\n",
"Fine-tuning involves adjusting the \n",
"parameters of pre-trained models on a \n",
"specific dataset or task. This process \n",
"enhances the model's ability to generate \n",
"more accurate and relevant responses for \n",
"the given context by adapting it to the \n",
"nuances and specific requirements of the \n",
"task at hand.\n",
"\n",
"Example use cases\n",
"\n",
"- Generate output in a consistent \n",
"\n",
"-\n",
"\n",
"format\n",
"Process input by following specific \n",
"instructions\n",
"\n",
"What well cover\n",
"\n",
"● When to fine-tune\n",
"\n",
"● Preparing the dataset\n",
"\n",
"● Best practices\n",
"\n",
"● Hyperparameters\n",
"\n",
"● Fine-tuning advances\n",
"\n",
"● Resources\n",
"\n",
"\u001b[1;36m3\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What is Fine-tuning\n",
"\n",
"Public Model\n",
"\n",
"Training data\n",
"\n",
"Training\n",
"\n",
"Fine-tuned \n",
"model\n",
"\n",
"Fine-tuning a model consists of training the \n",
"model to follow a set of given input/output \n",
"examples.\n",
"\n",
"This will teach the model to behave in a \n",
"certain way when confronted with a similar \n",
"input in the future.\n",
"\n",
"We recommend using <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples \n",
"\n",
"even if the minimum is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"\n",
"\n",
"\n",
"\n",
"Fine-tuning is a process in machine learning where a pre-existing model, known as a public model, is further \n",
"trained using specific training data. This involves adjusting the model to follow a set of given input/output \n",
"examples. The goal is to teach the model to respond in a particular way when it encounters similar inputs in the \n",
"future.\n",
"\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better suited to specific tasks or datasets.\n",
"\n",
"It is recommended to use <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples for effective fine-tuning, although the minimum requirement is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> \n",
"examples. This ensures the model learns adequately from the examples provided.\n",
"</pre>\n"
],
"text/plain": [
"What is Fine-tuning\n",
"\n",
"Public Model\n",
"\n",
"Training data\n",
"\n",
"Training\n",
"\n",
"Fine-tuned \n",
"model\n",
"\n",
"Fine-tuning a model consists of training the \n",
"model to follow a set of given input/output \n",
"examples.\n",
"\n",
"This will teach the model to behave in a \n",
"certain way when confronted with a similar \n",
"input in the future.\n",
"\n",
"We recommend using \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples \n",
"\n",
"even if the minimum is \u001b[1;36m10\u001b[0m.\n",
"\n",
"\u001b[1;36m4\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"Fine-tuning is a process in machine learning where a pre-existing model, known as a public model, is further \n",
"trained using specific training data. This involves adjusting the model to follow a set of given input/output \n",
"examples. The goal is to teach the model to respond in a particular way when it encounters similar inputs in the \n",
"future.\n",
"\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better suited to specific tasks or datasets.\n",
"\n",
"It is recommended to use \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples for effective fine-tuning, although the minimum requirement is \u001b[1;36m10\u001b[0m \n",
"examples. This ensures the model learns adequately from the examples provided.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to fine-tune\n",
"\n",
"Good for ✅\n",
"\n",
"Not good for ❌\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Following a given format or tone for the \n",
"\n",
"output\n",
"\n",
"Processing the input following specific, \n",
"\n",
"complex instructions\n",
"\n",
"Improving latency\n",
"\n",
"Reducing token usage\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Teaching the model new knowledge\n",
"➔ Use RAG or custom models instead\n",
"\n",
"Performing well at multiple, unrelated tasks\n",
"➔ Do prompt-engineering or create multiple \n",
"\n",
"FT models instead\n",
"\n",
"Include up-to-date content in responses\n",
"➔ Use RAG instead\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"When to fine-tune\n",
"\n",
"Good for ✅\n",
"\n",
"Not good for ❌\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Following a given format or tone for the \n",
"\n",
"output\n",
"\n",
"Processing the input following specific, \n",
"\n",
"complex instructions\n",
"\n",
"Improving latency\n",
"\n",
"Reducing token usage\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"●\n",
"\n",
"Teaching the model new knowledge\n",
"➔ Use RAG or custom models instead\n",
"\n",
"Performing well at multiple, unrelated tasks\n",
"➔ Do prompt-engineering or create multiple \n",
"\n",
"FT models instead\n",
"\n",
"Include up-to-date content in responses\n",
"➔ Use RAG instead\n",
"\n",
"\u001b[1;36m5\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Preparing the dataset\n",
"\n",
"Example format\n",
"\n",
"<span style=\"font-weight: bold\">{</span>\n",
"\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"messages\"</span>: <span style=\"font-weight: bold\">[</span>\n",
"\n",
"<span style=\"font-weight: bold\">{</span>\n",
"\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"system\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: \"Marv is a factual chatbot \n",
"that is also sarcastic.\"\n",
"\n",
"<span style=\"font-weight: bold\">}</span>,\n",
"<span style=\"font-weight: bold\">{</span>\n",
"\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"user\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: \"What's the capital of \n",
"France?\"\n",
"\n",
"<span style=\"font-weight: bold\">}</span>,\n",
"<span style=\"font-weight: bold\">{</span>\n",
"\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"assistant\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: \"Paris, as if everyone \n",
"doesn't know that already.\"\n",
"\n",
"<span style=\"font-weight: bold\">}</span>\n",
"\n",
"<span style=\"font-weight: bold\">]</span>\n",
"\n",
"<span style=\"font-weight: bold\">}</span>\n",
"\n",
".jsonl\n",
"\n",
"➔ Take the set of instructions and prompts that you \n",
"\n",
"found worked best for the model prior to fine-tuning. \n",
"Include them in every training example\n",
"\n",
"➔ If you would like to shorten the instructions or \n",
"\n",
"prompts, it may take more training examples to arrive \n",
"at good results\n",
"\n",
"We recommend using <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples \n",
"\n",
"even if the minimum is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">6</span>\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Preparing the dataset\n",
"\n",
"Example format\n",
"\n",
"\u001b[1m{\u001b[0m\n",
"\n",
"\u001b[32m\"messages\"\u001b[0m: \u001b[1m[\u001b[0m\n",
"\n",
"\u001b[1m{\u001b[0m\n",
"\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"system\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \"Marv is a factual chatbot \n",
"that is also sarcastic.\"\n",
"\n",
"\u001b[1m}\u001b[0m,\n",
"\u001b[1m{\u001b[0m\n",
"\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"user\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \"What's the capital of \n",
"France?\"\n",
"\n",
"\u001b[1m}\u001b[0m,\n",
"\u001b[1m{\u001b[0m\n",
"\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"assistant\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \"Paris, as if everyone \n",
"doesn't know that already.\"\n",
"\n",
"\u001b[1m}\u001b[0m\n",
"\n",
"\u001b[1m]\u001b[0m\n",
"\n",
"\u001b[1m}\u001b[0m\n",
"\n",
".jsonl\n",
"\n",
"➔ Take the set of instructions and prompts that you \n",
"\n",
"found worked best for the model prior to fine-tuning. \n",
"Include them in every training example\n",
"\n",
"➔ If you would like to shorten the instructions or \n",
"\n",
"prompts, it may take more training examples to arrive \n",
"at good results\n",
"\n",
"We recommend using \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples \n",
"\n",
"even if the minimum is \u001b[1;36m10\u001b[0m.\n",
"\n",
"\u001b[1;36m6\u001b[0m\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Best practices\n",
"\n",
"Curate examples carefully\n",
"\n",
"Datasets can be difficult to build, start \n",
"small and invest intentionally. \n",
"Optimize for fewer high-quality \n",
"training examples.\n",
"\n",
"● Consider “prompt baking”, or using a basic \n",
"prompt to generate your initial examples\n",
"● If your conversations are multi-turn, ensure \n",
"\n",
"your examples are representative\n",
"\n",
"● Collect examples to target issues detected \n",
"\n",
"in evaluation\n",
"\n",
"● Consider the balance &amp; diversity of data\n",
"● Make sure your examples contain all the \n",
"\n",
"information needed in the response\n",
"\n",
"Iterate on hyperparameters\n",
"\n",
"Establish a baseline\n",
"\n",
"Start with the defaults and adjust \n",
"based on performance.\n",
"\n",
"● If the model does not appear to converge, \n",
"\n",
"increase the learning rate multiplier\n",
"● If the model does not follow the training \n",
"data as much as expected increase the \n",
"number of epochs\n",
"\n",
"● If the model becomes less diverse than \n",
"\n",
"expected decrease the # of epochs by <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>\n",
"\n",
"Automate your feedback \n",
"pipeline\n",
"\n",
"Introduce automated evaluations to \n",
"highlight potential problem cases to \n",
"clean up and use as training data.\n",
"\n",
"Consider the G-Eval approach of \n",
"using GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> to perform automated \n",
"testing using a scorecard.\n",
"\n",
"Often users start with a \n",
"zero-shot or few-shot prompt to \n",
"build a baseline evaluation \n",
"before graduating to fine-tuning.\n",
"\n",
"Often users start with a \n",
"zero-shot or few-shot prompt to \n",
"build a baseline evaluation \n",
"Optimize for latency and \n",
"before graduating to fine-tuning.\n",
"token efficiency\n",
"\n",
"When using GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, once you \n",
"have a baseline evaluation and \n",
"training examples consider \n",
"fine-tuning <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> to get similar \n",
"performance for less cost and \n",
"latency.\n",
"\n",
"Experiment with reducing or \n",
"removing system instructions \n",
"with subsequent fine-tuned \n",
"model versions.\n",
"\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Best practices\n",
"\n",
"Curate examples carefully\n",
"\n",
"Datasets can be difficult to build, start \n",
"small and invest intentionally. \n",
"Optimize for fewer high-quality \n",
"training examples.\n",
"\n",
"● Consider “prompt baking”, or using a basic \n",
"prompt to generate your initial examples\n",
"● If your conversations are multi-turn, ensure \n",
"\n",
"your examples are representative\n",
"\n",
"● Collect examples to target issues detected \n",
"\n",
"in evaluation\n",
"\n",
"● Consider the balance & diversity of data\n",
"● Make sure your examples contain all the \n",
"\n",
"information needed in the response\n",
"\n",
"Iterate on hyperparameters\n",
"\n",
"Establish a baseline\n",
"\n",
"Start with the defaults and adjust \n",
"based on performance.\n",
"\n",
"● If the model does not appear to converge, \n",
"\n",
"increase the learning rate multiplier\n",
"● If the model does not follow the training \n",
"data as much as expected increase the \n",
"number of epochs\n",
"\n",
"● If the model becomes less diverse than \n",
"\n",
"expected decrease the # of epochs by \u001b[1;36m1\u001b[0m-\u001b[1;36m2\u001b[0m\n",
"\n",
"Automate your feedback \n",
"pipeline\n",
"\n",
"Introduce automated evaluations to \n",
"highlight potential problem cases to \n",
"clean up and use as training data.\n",
"\n",
"Consider the G-Eval approach of \n",
"using GPT-\u001b[1;36m4\u001b[0m to perform automated \n",
"testing using a scorecard.\n",
"\n",
"Often users start with a \n",
"zero-shot or few-shot prompt to \n",
"build a baseline evaluation \n",
"before graduating to fine-tuning.\n",
"\n",
"Often users start with a \n",
"zero-shot or few-shot prompt to \n",
"build a baseline evaluation \n",
"Optimize for latency and \n",
"before graduating to fine-tuning.\n",
"token efficiency\n",
"\n",
"When using GPT-\u001b[1;36m4\u001b[0m, once you \n",
"have a baseline evaluation and \n",
"training examples consider \n",
"fine-tuning \u001b[1;36m3.5\u001b[0m to get similar \n",
"performance for less cost and \n",
"latency.\n",
"\n",
"Experiment with reducing or \n",
"removing system instructions \n",
"with subsequent fine-tuned \n",
"model versions.\n",
"\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Hyperparameters\n",
"\n",
"Epochs\n",
"Refers to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> full cycle through the training dataset\n",
"If you have hundreds of thousands of examples, we would recommend \n",
"experimenting with two epochs <span style=\"font-weight: bold\">(</span>or one<span style=\"font-weight: bold\">)</span> to avoid overfitting.\n",
"\n",
"default: auto <span style=\"font-weight: bold\">(</span>standard is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span><span style=\"font-weight: bold\">)</span>\n",
"\n",
"Batch size\n",
"Number of training examples used to train a single \n",
"forward &amp; backward pass\n",
"In general, we've found that larger batch sizes tend to work better for larger datasets\n",
"\n",
"default: ~<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>% x N* <span style=\"font-weight: bold\">(</span>max <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">256</span><span style=\"font-weight: bold\">)</span>\n",
"\n",
"*N = number of training examples\n",
"\n",
"Learning rate multiplier\n",
"Scaling factor for the original learning rate\n",
"We recommend experimenting with values between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.02</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>. We've found that \n",
"larger learning rates often perform better with larger batch sizes.\n",
"\n",
"default: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.05</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1</span> or <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>*\n",
"\n",
"*depends on final batch size\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>\n",
"\n",
"\n",
"\n",
"\n",
"**Epochs**\n",
"- An epoch refers to one complete cycle through the training dataset.\n",
"- For datasets with hundreds of thousands of examples, it is recommended to use fewer epochs <span style=\"font-weight: bold\">(</span>one or two<span style=\"font-weight: bold\">)</span> to \n",
"prevent overfitting.\n",
"- Default setting is auto, with a standard of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> epochs.\n",
"\n",
"**Batch Size**\n",
"- This is the number of training examples used to train in a single forward and backward pass.\n",
"- Larger batch sizes are generally more effective for larger datasets.\n",
"- The default batch size is approximately <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>% of the total number of training examples <span style=\"font-weight: bold\">(</span>N<span style=\"font-weight: bold\">)</span>, with a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">256</span>.\n",
"\n",
"**Learning Rate Multiplier**\n",
"- This is a scaling factor for the original learning rate.\n",
"- Experimentation with values between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.02</span> and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span> is recommended.\n",
"- Larger learning rates often yield better results with larger batch sizes.\n",
"- Default values are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.05</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1</span>, or <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>, depending on the final batch size.\n",
"</pre>\n"
],
"text/plain": [
"Hyperparameters\n",
"\n",
"Epochs\n",
"Refers to \u001b[1;36m1\u001b[0m full cycle through the training dataset\n",
"If you have hundreds of thousands of examples, we would recommend \n",
"experimenting with two epochs \u001b[1m(\u001b[0mor one\u001b[1m)\u001b[0m to avoid overfitting.\n",
"\n",
"default: auto \u001b[1m(\u001b[0mstandard is \u001b[1;36m4\u001b[0m\u001b[1m)\u001b[0m\n",
"\n",
"Batch size\n",
"Number of training examples used to train a single \n",
"forward & backward pass\n",
"In general, we've found that larger batch sizes tend to work better for larger datasets\n",
"\n",
"default: ~\u001b[1;36m0.2\u001b[0m% x N* \u001b[1m(\u001b[0mmax \u001b[1;36m256\u001b[0m\u001b[1m)\u001b[0m\n",
"\n",
"*N = number of training examples\n",
"\n",
"Learning rate multiplier\n",
"Scaling factor for the original learning rate\n",
"We recommend experimenting with values between \u001b[1;36m0.02\u001b[0m-\u001b[1;36m0.2\u001b[0m. We've found that \n",
"larger learning rates often perform better with larger batch sizes.\n",
"\n",
"default: \u001b[1;36m0.05\u001b[0m, \u001b[1;36m0.1\u001b[0m or \u001b[1;36m0.2\u001b[0m*\n",
"\n",
"*depends on final batch size\n",
"\n",
"\u001b[1;36m8\u001b[0m\n",
"\n",
"\n",
"\n",
"\n",
"**Epochs**\n",
"- An epoch refers to one complete cycle through the training dataset.\n",
"- For datasets with hundreds of thousands of examples, it is recommended to use fewer epochs \u001b[1m(\u001b[0mone or two\u001b[1m)\u001b[0m to \n",
"prevent overfitting.\n",
"- Default setting is auto, with a standard of \u001b[1;36m4\u001b[0m epochs.\n",
"\n",
"**Batch Size**\n",
"- This is the number of training examples used to train in a single forward and backward pass.\n",
"- Larger batch sizes are generally more effective for larger datasets.\n",
"- The default batch size is approximately \u001b[1;36m0.2\u001b[0m% of the total number of training examples \u001b[1m(\u001b[0mN\u001b[1m)\u001b[0m, with a maximum of \u001b[1;36m256\u001b[0m.\n",
"\n",
"**Learning Rate Multiplier**\n",
"- This is a scaling factor for the original learning rate.\n",
"- Experimentation with values between \u001b[1;36m0.02\u001b[0m and \u001b[1;36m0.2\u001b[0m is recommended.\n",
"- Larger learning rates often yield better results with larger batch sizes.\n",
"- Default values are \u001b[1;36m0.05\u001b[0m, \u001b[1;36m0.1\u001b[0m, or \u001b[1;36m0.2\u001b[0m, depending on the final batch size.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Overview**\n",
"\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This process \n",
"enhances the model's ability to generate more accurate and relevant responses for the given context by adapting it \n",
"to the nuances and specific requirements of the task at hand.\n",
"\n",
"**Example Use Cases:**\n",
"- Generate output in a consistent format.\n",
"- Process input by following specific instructions.\n",
"\n",
"**What Well Cover:**\n",
"- When to fine-tune\n",
"- Preparing the dataset\n",
"- Best practices\n",
"- Hyperparameters\n",
"- Fine-tuning advances\n",
"- Resources\n",
"</pre>\n"
],
"text/plain": [
"**Overview**\n",
"\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This process \n",
"enhances the model's ability to generate more accurate and relevant responses for the given context by adapting it \n",
"to the nuances and specific requirements of the task at hand.\n",
"\n",
"**Example Use Cases:**\n",
"- Generate output in a consistent format.\n",
"- Process input by following specific instructions.\n",
"\n",
"**What Well Cover:**\n",
"- When to fine-tune\n",
"- Preparing the dataset\n",
"- Best practices\n",
"- Hyperparameters\n",
"- Fine-tuning advances\n",
"- Resources\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to Fine-Tune\n",
"\n",
"**Good for:**\n",
"\n",
"- **Following a given format or tone for the output:** Fine-tuning is effective when you need the model to adhere \n",
"to a specific style or structure in its responses.\n",
" \n",
"- **Processing the input following specific, complex instructions:** It helps in handling detailed and intricate \n",
"instructions accurately.\n",
"\n",
"- **Improving latency:** Fine-tuning can enhance the speed of the model's responses.\n",
"\n",
"- **Reducing token usage:** It can optimize the model to use fewer tokens, making it more efficient.\n",
"\n",
"**Not good for:**\n",
"\n",
"- **Teaching the model new knowledge:** Fine-tuning is not suitable for adding new information to the model. \n",
"Instead, use Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> or custom models.\n",
"\n",
"- **Performing well at multiple, unrelated tasks:** For diverse tasks, it's better to use prompt engineering or \n",
"create multiple fine-tuned models.\n",
"\n",
"- **Including up-to-date content in responses:** Fine-tuning is not ideal for ensuring the model has the latest \n",
"information. RAG is recommended for this purpose.\n",
"</pre>\n"
],
"text/plain": [
"When to Fine-Tune\n",
"\n",
"**Good for:**\n",
"\n",
"- **Following a given format or tone for the output:** Fine-tuning is effective when you need the model to adhere \n",
"to a specific style or structure in its responses.\n",
" \n",
"- **Processing the input following specific, complex instructions:** It helps in handling detailed and intricate \n",
"instructions accurately.\n",
"\n",
"- **Improving latency:** Fine-tuning can enhance the speed of the model's responses.\n",
"\n",
"- **Reducing token usage:** It can optimize the model to use fewer tokens, making it more efficient.\n",
"\n",
"**Not good for:**\n",
"\n",
"- **Teaching the model new knowledge:** Fine-tuning is not suitable for adding new information to the model. \n",
"Instead, use Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m or custom models.\n",
"\n",
"- **Performing well at multiple, unrelated tasks:** For diverse tasks, it's better to use prompt engineering or \n",
"create multiple fine-tuned models.\n",
"\n",
"- **Including up-to-date content in responses:** Fine-tuning is not ideal for ensuring the model has the latest \n",
"information. RAG is recommended for this purpose.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Preparing the Dataset**\n",
"\n",
"This slide provides guidance on preparing a dataset for training a chatbot model. It includes an example format \n",
"using JSONL <span style=\"font-weight: bold\">(</span>JSON Lines<span style=\"font-weight: bold\">)</span> to structure the data. The example shows a conversation with three roles:\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **System**: Sets the context by describing the chatbot as <span style=\"color: #008000; text-decoration-color: #008000\">\"Marv is a factual chatbot that is also sarcastic.\"</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **User**: Asks a question, <span style=\"color: #008000; text-decoration-color: #008000\">\"What's the capital of France?\"</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Assistant**: Responds with a sarcastic answer, <span style=\"color: #008000; text-decoration-color: #008000\">\"Paris, as if everyone doesn't know that already.\"</span>\n",
"\n",
"Key recommendations for dataset preparation include:\n",
"\n",
"- Use a set of instructions and prompts that have proven effective for the model before fine-tuning. These should \n",
"be included in every training example.\n",
"- If you choose to shorten instructions or prompts, be aware that more training examples may be needed to achieve \n",
"good results.\n",
"- It is recommended to use <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples, even though the minimum required is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"</pre>\n"
],
"text/plain": [
"**Preparing the Dataset**\n",
"\n",
"This slide provides guidance on preparing a dataset for training a chatbot model. It includes an example format \n",
"using JSONL \u001b[1m(\u001b[0mJSON Lines\u001b[1m)\u001b[0m to structure the data. The example shows a conversation with three roles:\n",
"\n",
"\u001b[1;36m1\u001b[0m. **System**: Sets the context by describing the chatbot as \u001b[32m\"Marv is a factual chatbot that is also sarcastic.\"\u001b[0m\n",
"\u001b[1;36m2\u001b[0m. **User**: Asks a question, \u001b[32m\"What's the capital of France?\"\u001b[0m\n",
"\u001b[1;36m3\u001b[0m. **Assistant**: Responds with a sarcastic answer, \u001b[32m\"Paris, as if everyone doesn't know that already.\"\u001b[0m\n",
"\n",
"Key recommendations for dataset preparation include:\n",
"\n",
"- Use a set of instructions and prompts that have proven effective for the model before fine-tuning. These should \n",
"be included in every training example.\n",
"- If you choose to shorten instructions or prompts, be aware that more training examples may be needed to achieve \n",
"good results.\n",
"- It is recommended to use \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples, even though the minimum required is \u001b[1;36m10\u001b[0m.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Best Practices**\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **Curate Examples Carefully**\n",
" - Building datasets can be challenging, so start small and focus on high-quality examples.\n",
" - Use <span style=\"color: #008000; text-decoration-color: #008000\">\"prompt baking\"</span> to generate initial examples.\n",
" - Ensure multi-turn conversations are well-represented.\n",
" - Collect examples to address issues found during evaluation.\n",
" - Balance and diversify your data.\n",
" - Ensure examples contain all necessary information for responses.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **Iterate on Hyperparameters**\n",
" - Begin with default settings and adjust based on performance.\n",
" - Increase the learning rate multiplier if the model doesn't converge.\n",
" - Increase the number of epochs if the model doesn't follow training data closely.\n",
" - Decrease the number of epochs by <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> if the model becomes less diverse.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>. **Establish a Baseline**\n",
" - Start with zero-shot or few-shot prompts to create a baseline before fine-tuning.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>. **Automate Your Feedback Pipeline**\n",
" - Use automated evaluations to identify and clean up problem cases for training data.\n",
" - Consider using the G-Eval approach with GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for automated testing with a scorecard.\n",
"\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>. **Optimize for Latency and Token Efficiency**\n",
" - After establishing a baseline, consider fine-tuning with GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> for similar performance at lower cost and \n",
"latency.\n",
" - Experiment with reducing or removing system instructions in subsequent fine-tuned versions.\n",
"</pre>\n"
],
"text/plain": [
"**Best Practices**\n",
"\n",
"\u001b[1;36m1\u001b[0m. **Curate Examples Carefully**\n",
" - Building datasets can be challenging, so start small and focus on high-quality examples.\n",
" - Use \u001b[32m\"prompt baking\"\u001b[0m to generate initial examples.\n",
" - Ensure multi-turn conversations are well-represented.\n",
" - Collect examples to address issues found during evaluation.\n",
" - Balance and diversify your data.\n",
" - Ensure examples contain all necessary information for responses.\n",
"\n",
"\u001b[1;36m2\u001b[0m. **Iterate on Hyperparameters**\n",
" - Begin with default settings and adjust based on performance.\n",
" - Increase the learning rate multiplier if the model doesn't converge.\n",
" - Increase the number of epochs if the model doesn't follow training data closely.\n",
" - Decrease the number of epochs by \u001b[1;36m1\u001b[0m-\u001b[1;36m2\u001b[0m if the model becomes less diverse.\n",
"\n",
"\u001b[1;36m3\u001b[0m. **Establish a Baseline**\n",
" - Start with zero-shot or few-shot prompts to create a baseline before fine-tuning.\n",
"\n",
"\u001b[1;36m4\u001b[0m. **Automate Your Feedback Pipeline**\n",
" - Use automated evaluations to identify and clean up problem cases for training data.\n",
" - Consider using the G-Eval approach with GPT-\u001b[1;36m4\u001b[0m for automated testing with a scorecard.\n",
"\n",
"\u001b[1;36m5\u001b[0m. **Optimize for Latency and Token Efficiency**\n",
" - After establishing a baseline, consider fine-tuning with GPT-\u001b[1;36m3.5\u001b[0m for similar performance at lower cost and \n",
"latency.\n",
" - Experiment with reducing or removing system instructions in subsequent fine-tuned versions.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for c in content:\n",
" print(c)\n",
" print(\"\\n\\n-------------------------------\\n\\n\")"
]
},
{
"cell_type": "code",
"execution_count": 91,
"id": "f8b84f96",
"metadata": {},
"outputs": [],
"source": [
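"import re\n",
"\n",
"# Cleaning up content\n",
"# Removing page numbers, trailing spaces, additional line breaks and references to the content being a slide\n",
"clean_content = []\n",
"for c in content:\n",
"    # Drop lines that contain only a 1-2 digit page number (numbered list items such as '1. ...' are kept)\n",
"    text = re.sub(r\"^\\d{1,2}[ \\t]*$\\n?\", \"\", c, flags=re.MULTILINE)\n",
"    # Re-join lines that were wrapped after a trailing space, keeping one space between the words\n",
"    text = text.replace(' \\n', ' ')\n",
"    # Collapse any run of consecutive line breaks into a single one\n",
"    text = re.sub(r\"\\n{2,}\", \"\\n\", text)\n",
"    # Remove phrases such as 'This slide provides'\n",
"    text = re.sub(r\"\\b(?:the|this)\\s*slide\\s*\\w+\\b\", \"\", text, flags=re.IGNORECASE)\n",
"    clean_content.append(text.strip())"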
]
},
{
"cell_type": "code",
"execution_count": 92,
"id": "7a13c8c3",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"Retrieval-Augmented Generationenhances the capabilities of languagemodels by combining them with aretrieval system.\n",
"This allows the modelto leverage external knowledge sourcesto generate more accurate andcontextually relevant \n",
"responses.\n",
"Example use cases\n",
"- Provide answers with up-to-date\n",
"information\n",
"- Generate contextual responses\n",
"What well cover\n",
"● Technical patterns\n",
"● Best practices\n",
"● Common pitfalls\n",
"● Resources\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"Retrieval-Augmented Generationenhances the capabilities of languagemodels by combining them with aretrieval system.\n",
"This allows the modelto leverage external knowledge sourcesto generate more accurate andcontextually relevant \n",
"responses.\n",
"Example use cases\n",
"- Provide answers with up-to-date\n",
"information\n",
"- Generate contextual responses\n",
"What well cover\n",
"● Technical patterns\n",
"● Best practices\n",
"● Common pitfalls\n",
"● Resources\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What is RAG\n",
"Retrieve information to Augment the models knowledge and Generate the output\n",
"“What is yourreturn policy?”\n",
"ask\n",
"result\n",
"search\n",
"LLM\n",
"return information\n",
"Total refunds: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days\n",
"% of value vouchers: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"$<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> discount on next order: &gt; <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days\n",
"“You can get a full refund upto <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days after thepurchase, then up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> daysyou would get a voucher forhalf the \n",
"value of your order”\n",
"KnowledgeBase <span style=\"color: #800080; text-decoration-color: #800080\">/</span> Externalsources\n",
"\n",
"RAG stands for <span style=\"color: #008000; text-decoration-color: #008000\">\"Retrieve information to Augment the models knowledge and Generate the output.\"</span> This process \n",
"involves using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> to enhance its responses by accessing external information sources.\n",
"Here's how it works:\n",
". **User Query**: A user asks a question, such as <span style=\"color: #008000; text-decoration-color: #008000\">\"What is your return policy?\"</span>\n",
". **LLM Processing**: The language model receives the question and initiates a search for relevant information.\n",
". **Information Retrieval**: The LLM accesses a knowledge base or external sources to find the necessary details. \n",
"In this example, the information retrieved includes:\n",
" - Total refunds available from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days.\n",
" - <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>% value vouchers for returns between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days.\n",
" - A $<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> discount on the next order for returns after <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days.\n",
". **Response Generation**: The LLM uses the retrieved information to generate a coherent response for the user. For\n",
"instance, it might say, <span style=\"color: #008000; text-decoration-color: #008000\">\"You can get a full refund up to 14 days after the purchase, then up to 30 days you would </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">get a voucher for half the value of your order.\"</span>\n",
"This method allows the model to provide accurate and up-to-date answers by leveraging external data sources.\n",
"</pre>\n"
],
"text/plain": [
"What is RAG\n",
"Retrieve information to Augment the models knowledge and Generate the output\n",
"“What is yourreturn policy?”\n",
"ask\n",
"result\n",
"search\n",
"LLM\n",
"return information\n",
"Total refunds: \u001b[1;36m0\u001b[0m-\u001b[1;36m14\u001b[0m days\n",
"% of value vouchers: \u001b[1;36m14\u001b[0m-\u001b[1;36m30\u001b[0m days\n",
"$\u001b[1;36m5\u001b[0m discount on next order: > \u001b[1;36m30\u001b[0m days\n",
"“You can get a full refund upto \u001b[1;36m14\u001b[0m days after thepurchase, then up to \u001b[1;36m30\u001b[0m daysyou would get a voucher forhalf the \n",
"value of your order”\n",
"KnowledgeBase \u001b[35m/\u001b[0m Externalsources\n",
"\n",
"RAG stands for \u001b[32m\"Retrieve information to Augment the models knowledge and Generate the output.\"\u001b[0m This process \n",
"involves using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m to enhance its responses by accessing external information sources.\n",
"Here's how it works:\n",
". **User Query**: A user asks a question, such as \u001b[32m\"What is your return policy?\"\u001b[0m\n",
". **LLM Processing**: The language model receives the question and initiates a search for relevant information.\n",
". **Information Retrieval**: The LLM accesses a knowledge base or external sources to find the necessary details. \n",
"In this example, the information retrieved includes:\n",
" - Total refunds available from \u001b[1;36m0\u001b[0m to \u001b[1;36m14\u001b[0m days.\n",
" - \u001b[1;36m50\u001b[0m% value vouchers for returns between \u001b[1;36m14\u001b[0m to \u001b[1;36m30\u001b[0m days.\n",
" - A $\u001b[1;36m5\u001b[0m discount on the next order for returns after \u001b[1;36m30\u001b[0m days.\n",
". **Response Generation**: The LLM uses the retrieved information to generate a coherent response for the user. For\n",
"instance, it might say, \u001b[32m\"You can get a full refund up to 14 days after the purchase, then up to 30 days you would \u001b[0m\n",
"\u001b[32mget a voucher for half the value of your order.\"\u001b[0m\n",
"This method allows the model to provide accurate and up-to-date answers by leveraging external data sources.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to use RAG\n",
"Good for ✅\n",
"Not good for ❌\n",
"●\n",
"●\n",
"Introducing new information to the model\n",
"●\n",
"Teaching the model a specific format, style,\n",
"to update its knowledge\n",
"Reducing hallucinations by controlling\n",
"content\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span>!\\ Hallucinations can still happen with RAG\n",
"or language\n",
"➔ Use fine-tuning or custom models instead\n",
"●\n",
"Reducing token usage\n",
"➔ Consider fine-tuning depending on the use\n",
"case\n",
"\n",
"**Good for:**\n",
"- **Introducing new information to the model:** RAG <span style=\"font-weight: bold\">(</span>Retrieval-Augmented Generation<span style=\"font-weight: bold\">)</span> is effective for updating a \n",
"model's knowledge by incorporating new data.\n",
"- **Reducing hallucinations by controlling content:** While RAG can help minimize hallucinations, it's important to\n",
"note that they can still occur.\n",
"**Not good for:**\n",
"- **Teaching the model a specific format, style, or language:** For these tasks, it's better to use fine-tuning or \n",
"custom models.\n",
"- **Reducing token usage:** If token usage is a concern, consider fine-tuning based on the specific use case.\n",
"</pre>\n"
],
"text/plain": [
"When to use RAG\n",
"Good for ✅\n",
"Not good for ❌\n",
"●\n",
"●\n",
"Introducing new information to the model\n",
"●\n",
"Teaching the model a specific format, style,\n",
"to update its knowledge\n",
"Reducing hallucinations by controlling\n",
"content\n",
"\u001b[35m/\u001b[0m!\\ Hallucinations can still happen with RAG\n",
"or language\n",
"➔ Use fine-tuning or custom models instead\n",
"●\n",
"Reducing token usage\n",
"➔ Consider fine-tuning depending on the use\n",
"case\n",
"\n",
"**Good for:**\n",
"- **Introducing new information to the model:** RAG \u001b[1m(\u001b[0mRetrieval-Augmented Generation\u001b[1m)\u001b[0m is effective for updating a \n",
"model's knowledge by incorporating new data.\n",
"- **Reducing hallucinations by controlling content:** While RAG can help minimize hallucinations, it's important to\n",
"note that they can still occur.\n",
"**Not good for:**\n",
"- **Teaching the model a specific format, style, or language:** For these tasks, it's better to use fine-tuning or \n",
"custom models.\n",
"- **Reducing token usage:** If token usage is a concern, consider fine-tuning based on the specific use case.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation\n",
"Input processing\n",
"Retrieval\n",
"Answer Generation\n",
"● Chunking\n",
"●\n",
"●\n",
"Embeddings\n",
"Augmentingcontent\n",
"●\n",
"Inputaugmentation\n",
"● NER\n",
"●\n",
"Search\n",
"● Context window\n",
"● Multi-stepretrieval\n",
"● Optimisation\n",
"●\n",
"Safety checks\n",
"●\n",
"Embeddings\n",
"● Re-ranking\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation\n",
"Input processing\n",
"Retrieval\n",
"Answer Generation\n",
"● Chunking\n",
"●\n",
"●\n",
"Embeddings\n",
"Augmentingcontent\n",
"●\n",
"Inputaugmentation\n",
"● NER\n",
"●\n",
"Search\n",
"● Context window\n",
"● Multi-stepretrieval\n",
"● Optimisation\n",
"●\n",
"Safety checks\n",
"●\n",
"Embeddings\n",
"● Re-ranking\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation\n",
"chunk documents into multiplepieces for easier consumption\n",
"content\n",
"embeddings\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
"Augment contentusing LLMs\n",
"Ex: parse text only, ask gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> to rephrase &amp;summarize each part, generate bullet points…\n",
"BEST PRACTICES\n",
"Pre-process content for LLMconsumption:Add summary, headers for eachpart, etc.\n",
"+ curate relevant data sources\n",
"KnowledgeBase\n",
"COMMON PITFALLS\n",
"➔ Having too much low-quality\n",
"content\n",
"➔ Having too large documents\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation\n",
"chunk documents into multiplepieces for easier consumption\n",
"content\n",
"embeddings\n",
".\u001b[1;36m983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
".\u001b[1;36m876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
".\u001b[1;36m983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
"Augment contentusing LLMs\n",
"Ex: parse text only, ask gpt-\u001b[1;36m4\u001b[0m to rephrase &summarize each part, generate bullet points…\n",
"BEST PRACTICES\n",
"Pre-process content for LLMconsumption:Add summary, headers for eachpart, etc.\n",
"+ curate relevant data sources\n",
"KnowledgeBase\n",
"COMMON PITFALLS\n",
"➔ Having too much low-quality\n",
"content\n",
"➔ Having too large documents\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: chunking\n",
"Why chunking?\n",
"If your system doesnt requireentire documents to providerelevant answers, you canchunk them into multiple \n",
"piecesfor easier consumption <span style=\"font-weight: bold\">(</span>reducedcost &amp; latency<span style=\"font-weight: bold\">)</span>.\n",
"Other approaches: graphs ormap-reduce\n",
"Things to consider\n",
"●\n",
"Overlap:\n",
"○\n",
"○\n",
"Should chunks be independent or overlap oneanother?\n",
"If they overlap, by how much?\n",
"●\n",
"Size of chunks:\n",
"○ What is the optimal chunk size for my use case?\n",
"○\n",
"Do I want to include a lot in the context window orjust the minimum?\n",
"● Where to chunk:\n",
"○\n",
"○\n",
"Should I chunk every N tokens or use specificseparators?Is there a logical way to split the context that wouldhelp \n",
"the retrieval process?\n",
"● What to return:\n",
"○\n",
"○\n",
"Should I return chunks across multiple documentsor top chunks within the same doc?\n",
"Should chunks be linked together with metadata toindicate common properties?\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: chunking\n",
"Why chunking?\n",
"If your system doesnt requireentire documents to providerelevant answers, you canchunk them into multiple \n",
"piecesfor easier consumption \u001b[1m(\u001b[0mreducedcost & latency\u001b[1m)\u001b[0m.\n",
"Other approaches: graphs ormap-reduce\n",
"Things to consider\n",
"●\n",
"Overlap:\n",
"○\n",
"○\n",
"Should chunks be independent or overlap oneanother?\n",
"If they overlap, by how much?\n",
"●\n",
"Size of chunks:\n",
"○ What is the optimal chunk size for my use case?\n",
"○\n",
"Do I want to include a lot in the context window orjust the minimum?\n",
"● Where to chunk:\n",
"○\n",
"○\n",
"Should I chunk every N tokens or use specificseparators?Is there a logical way to split the context that wouldhelp \n",
"the retrieval process?\n",
"● What to return:\n",
"○\n",
"○\n",
"Should I return chunks across multiple documentsor top chunks within the same doc?\n",
"Should chunks be linked together with metadata toindicate common properties?\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: embeddings\n",
"What to embed?\n",
"Depending on your use caseyou might not want just toembed the text in thedocuments but metadata as well- anything \n",
"that will make it easierto surface this specific chunk ordocument when performing asearch\n",
"Examples\n",
"Embedding Q&amp;A posts in a forum\n",
"You might want to embed the title of the posts,the text of the original question and the content ofthe top answers.\n",
"Additionally, if the posts are tagged by topic orwith keywords, you can embed those too.\n",
"Embedding product specs\n",
"In additional to embedding the text contained indocuments describing the products, you mightwant to add metadata \n",
"that you have on theproduct such as the color, size, etc. in yourembeddings.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: embeddings\n",
"What to embed?\n",
"Depending on your use caseyou might not want just toembed the text in thedocuments but metadata as well- anything \n",
"that will make it easierto surface this specific chunk ordocument when performing asearch\n",
"Examples\n",
"Embedding Q&A posts in a forum\n",
"You might want to embed the title of the posts,the text of the original question and the content ofthe top answers.\n",
"Additionally, if the posts are tagged by topic orwith keywords, you can embed those too.\n",
"Embedding product specs\n",
"In additional to embedding the text contained indocuments describing the products, you mightwant to add metadata \n",
"that you have on theproduct such as the color, size, etc. in yourembeddings.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Data preparation: augmenting content\n",
"What does “Augmentingcontent” mean?\n",
"Augmenting content refers tomodifications of the original contentto make it more digestible for asystem relying on \n",
"RAG. Themodifications could be a change informat, wording, or addingdescriptive content such assummaries or \n",
"keywords.\n",
"Example approaches\n",
"Make it a guide*\n",
"Reformat the content to look more likea step-by-step guide with clearheadings and bullet-points, as thisformat is \n",
"more easily understandableby an LLM.\n",
"Add descriptive metadata*\n",
"Consider adding keywords or text thatusers might search for when thinkingof a specific product or service.\n",
"Multimodality\n",
"Leverage modelssuch as Whisper orGPT-4V totransform audio orvisual content intotext.\n",
"For example, youcan use GPT-4V togenerate tags forimages or todescribe slides.\n",
"* GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can do this for you with the right prompt\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Data preparation: augmenting content\n",
"What does “Augmentingcontent” mean?\n",
"Augmenting content refers tomodifications of the original contentto make it more digestible for asystem relying on \n",
"RAG. Themodifications could be a change informat, wording, or addingdescriptive content such assummaries or \n",
"keywords.\n",
"Example approaches\n",
"Make it a guide*\n",
"Reformat the content to look more likea step-by-step guide with clearheadings and bullet-points, as thisformat is \n",
"more easily understandableby an LLM.\n",
"Add descriptive metadata*\n",
"Consider adding keywords or text thatusers might search for when thinkingof a specific product or service.\n",
"Multimodality\n",
"Leverage modelssuch as Whisper orGPT-4V totransform audio orvisual content intotext.\n",
"For example, youcan use GPT-4V togenerate tags forimages or todescribe slides.\n",
"* GPT-\u001b[1;36m4\u001b[0m can do this for you with the right prompt\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing\n",
"Process input according to task\n",
"Q&amp;A\n",
"HyDE: Ask LLM to hypothetically answer thequestion &amp; use the answer to search the KB\n",
"embeddings\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
"Content search\n",
"Prompt LLM to rephrase input &amp; optionally addmore context\n",
"query\n",
"SELECT * from items…\n",
"DB search\n",
"NER: Find relevant entities to be used for akeyword search or to construct a search query\n",
"keywords\n",
"red\n",
"summer\n",
"BEST PRACTICES\n",
"Consider how to transform theinput to match content in thedatabase\n",
"Consider using metadata toaugment the user input\n",
"COMMON PITFALLS\n",
"➔ Comparing directly the inputto the database withoutconsidering the taskspecificities\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing\n",
"Process input according to task\n",
"Q&A\n",
"HyDE: Ask LLM to hypothetically answer thequestion & use the answer to search the KB\n",
"embeddings\n",
".\u001b[1;36m983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
".\u001b[1;36m876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
"Content search\n",
"Prompt LLM to rephrase input & optionally addmore context\n",
"query\n",
"SELECT * from items…\n",
"DB search\n",
"NER: Find relevant entities to be used for akeyword search or to construct a search query\n",
"keywords\n",
"red\n",
"summer\n",
"BEST PRACTICES\n",
"Consider how to transform theinput to match content in thedatabase\n",
"Consider using metadata toaugment the user input\n",
"COMMON PITFALLS\n",
"➔ Comparing directly the inputto the database withoutconsidering the taskspecificities\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing: input augmentation\n",
"What is input augmentation?\n",
"Example approaches\n",
"Augmenting the input means turningit into something different, eitherrephrasing it, splitting it in severalinputs or\n",
"expanding it.\n",
"This helps boost performance asthe LLM might understand betterthe user intent.\n",
"Queryexpansion*\n",
"Rephrase thequery to bemoredescriptive\n",
"HyDE*\n",
"Hypotheticallyanswer thequestion &amp; usethe answer tosearch the KB\n",
"Splitting a query in N*\n",
"When there is more than <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> question orintent in a user query, considersplitting it in several queries\n",
"Fallback\n",
"Considerimplementing aflow where the LLMcan ask forclarification whenthere is not enoughinformation in theoriginal \n",
"user queryto get a result\n",
"<span style=\"font-weight: bold\">(</span>Especially relevantwith tool usage<span style=\"font-weight: bold\">)</span>\n",
"* GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can do this for you with the right prompt\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing: input augmentation\n",
"What is input augmentation?\n",
"Example approaches\n",
"Augmenting the input means turningit into something different, eitherrephrasing it, splitting it in severalinputs or\n",
"expanding it.\n",
"This helps boost performance asthe LLM might understand betterthe user intent.\n",
"Queryexpansion*\n",
"Rephrase thequery to bemoredescriptive\n",
"HyDE*\n",
"Hypotheticallyanswer thequestion & usethe answer tosearch the KB\n",
"Splitting a query in N*\n",
"When there is more than \u001b[1;36m1\u001b[0m question orintent in a user query, considersplitting it in several queries\n",
"Fallback\n",
"Considerimplementing aflow where the LLMcan ask forclarification whenthere is not enoughinformation in theoriginal \n",
"user queryto get a result\n",
"\u001b[1m(\u001b[0mEspecially relevantwith tool usage\u001b[1m)\u001b[0m\n",
"* GPT-\u001b[1;36m4\u001b[0m can do this for you with the right prompt\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Input processing: NER\n",
"Why use NER?\n",
"Using NER <span style=\"font-weight: bold\">(</span>Named EntityRecognition<span style=\"font-weight: bold\">)</span> allows to extractrelevant entities from the input, thatcan then be used for \n",
"moredeterministic search queries.This can be useful when the scopeis very constrained.\n",
"Example\n",
"Searching for movies\n",
"If you have a structured database containingmetadata on movies, you can extract genre,actors or directors names, \n",
"etc. from the userquery and use this to search the database\n",
"Note: You can use exact values or embeddings afterhaving extracted the relevant entities\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Input processing: NER\n",
"Why use NER?\n",
"Using NER \u001b[1m(\u001b[0mNamed EntityRecognition\u001b[1m)\u001b[0m allows to extractrelevant entities from the input, thatcan then be used for \n",
"moredeterministic search queries.This can be useful when the scopeis very constrained.\n",
"Example\n",
"Searching for movies\n",
"If you have a structured database containingmetadata on movies, you can extract genre,actors or directors names, \n",
"etc. from the userquery and use this to search the database\n",
"Note: You can use exact values or embeddings afterhaving extracted the relevant entities\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval\n",
"re-ranking\n",
"INPUT\n",
"embeddings\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span>…\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">876</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.145</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.179</span>…\n",
"query\n",
"SELECT * from items…\n",
"keywords\n",
"red\n",
"summer\n",
"Semanticsearch\n",
"RESULTS\n",
"RESULTS\n",
"vector DB\n",
"relational <span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">nosql</span> db\n",
"FINAL RESULT\n",
"Used togenerate output\n",
"BEST PRACTICES\n",
"Use a combination of semanticsearch and deterministic querieswhere possible\n",
"+ Cache output where possible\n",
"COMMON PITFALLS\n",
"➔ The wrong elements could becompared when looking attext similarity, that is whyre-ranking is important\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval\n",
"re-ranking\n",
"INPUT\n",
"embeddings\n",
".\u001b[1;36m983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m…\n",
".\u001b[1;36m876\u001b[0m, \u001b[1;36m0.145\u001b[0m, \u001b[1;36m0.179\u001b[0m…\n",
"query\n",
"SELECT * from items…\n",
"keywords\n",
"red\n",
"summer\n",
"Semanticsearch\n",
"RESULTS\n",
"RESULTS\n",
"vector DB\n",
"relational \u001b[35m/\u001b[0m\u001b[95mnosql\u001b[0m db\n",
"FINAL RESULT\n",
"Used togenerate output\n",
"BEST PRACTICES\n",
"Use a combination of semanticsearch and deterministic querieswhere possible\n",
"+ Cache output where possible\n",
"COMMON PITFALLS\n",
"➔ The wrong elements could becompared when looking attext similarity, that is whyre-ranking is important\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: search\n",
"How to search?\n",
"Semantic search\n",
"Keyword search\n",
"Search query\n",
"There are many differentapproaches to search depending onthe use case and the existingsystem.\n",
"Using embeddings, youcan perform semanticsearches. You cancompare embeddingswith what is in yourdatabase and find \n",
"themost similar.\n",
"If you have extractedspecific entities orkeywords to search for,you can search for thesein your database.\n",
"Based on the extractedentities you have or theuser input as is, you canconstruct search <span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">queries</span><span style=\"font-weight: bold\">(</span>SQL, cypher…<span style=\"font-weight: bold\">)</span> and \n",
"usethese queries to searchyour database.\n",
"You can use a hybrid approach and combine several of these.\n",
"You can perform multiple searches in parallel or in sequence, orsearch for keywords with their embeddings for \n",
"example.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: search\n",
"How to search?\n",
"Semantic search\n",
"Keyword search\n",
"Search query\n",
"There are many differentapproaches to search depending onthe use case and the existingsystem.\n",
"Using embeddings, youcan perform semanticsearches. You cancompare embeddingswith what is in yourdatabase and find \n",
"themost similar.\n",
"If you have extractedspecific entities orkeywords to search for,you can search for thesein your database.\n",
"Based on the extractedentities you have or theuser input as is, you canconstruct search \u001b[1;35mqueries\u001b[0m\u001b[1m(\u001b[0mSQL, cypher…\u001b[1m)\u001b[0m and \n",
"usethese queries to searchyour database.\n",
"You can use a hybrid approach and combine several of these.\n",
"You can perform multiple searches in parallel or in sequence, orsearch for keywords with their embeddings for \n",
"example.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: multi-step retrieval\n",
"What is multi-step retrieval?\n",
"In some cases, there might beseveral actions to be performed toget the required information togenerate an answer.\n",
"Things to consider\n",
"●\n",
"Framework to be used:\n",
"○ When there are multiple steps to perform,consider whether you want to handle thisyourself or use a framework to \n",
"make it easier\n",
"●\n",
"Cost &amp; Latency:\n",
"○\n",
"○\n",
"Performing multiple steps at the retrievalstage can increase latency and costsignificantly\n",
"Consider performing actions in parallel toreduce latency\n",
"●\n",
"Chain of Thought:\n",
"○\n",
"○\n",
"Guide the assistant with the chain of thoughtapproach: break down instructions intoseveral steps, with clear \n",
"guidelines onwhether to continue, stop or do somethingelse.This is more appropriate when tasks need tobe performed \n",
"sequentially - for example: “ifthis didnt work, then do this”\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: multi-step retrieval\n",
"What is multi-step retrieval?\n",
"In some cases, there might beseveral actions to be performed toget the required information togenerate an answer.\n",
"Things to consider\n",
"●\n",
"Framework to be used:\n",
"○ When there are multiple steps to perform,consider whether you want to handle thisyourself or use a framework to \n",
"make it easier\n",
"●\n",
"Cost & Latency:\n",
"○\n",
"○\n",
"Performing multiple steps at the retrievalstage can increase latency and costsignificantly\n",
"Consider performing actions in parallel toreduce latency\n",
"●\n",
"Chain of Thought:\n",
"○\n",
"○\n",
"Guide the assistant with the chain of thoughtapproach: break down instructions intoseveral steps, with clear \n",
"guidelines onwhether to continue, stop or do somethingelse.This is more appropriate when tasks need tobe performed \n",
"sequentially - for example: “ifthis didnt work, then do this”\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Retrieval: re-ranking\n",
"What is re-ranking?\n",
"Example approaches\n",
"Re-ranking means re-ordering theresults of the retrieval process tosurface more relevant results.\n",
"This is particularly important whendoing semantic searches.\n",
"Rule-based re-ranking\n",
"You can use metadata to rank results by relevance. Forexample, you can look at the recency of the documents, \n",
"attags, specific keywords in the title, etc.\n",
"Re-ranking algorithms\n",
"There are several existing algorithms/approaches you can usebased on your use case: BERT-based \n",
"re-rankers,cross-encoder re-ranking, TF-IDF algorithms…\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Retrieval: re-ranking\n",
"What is re-ranking?\n",
"Example approaches\n",
"Re-ranking means re-ordering theresults of the retrieval process tosurface more relevant results.\n",
"This is particularly important whendoing semantic searches.\n",
"Rule-based re-ranking\n",
"You can use metadata to rank results by relevance. Forexample, you can look at the recency of the documents, \n",
"attags, specific keywords in the title, etc.\n",
"Re-ranking algorithms\n",
"There are several existing algorithms/approaches you can usebased on your use case: BERT-based \n",
"re-rankers,cross-encoder re-ranking, TF-IDF algorithms…\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation\n",
"FINAL RESULT\n",
"Piece of contentretrieved\n",
"LLM\n",
"Prompt includingthe content\n",
"User sees thefinal result\n",
"BEST PRACTICES\n",
"Evaluate performance after eachexperimentation to assess if itsworth exploring other paths\n",
"+ Implement guardrails if applicable\n",
"COMMON PITFALLS\n",
"➔ Going for fine-tuning withouttrying other approaches\n",
"➔ Not paying attention to theway the model is prompted\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation\n",
"FINAL RESULT\n",
"Piece of contentretrieved\n",
"LLM\n",
"Prompt includingthe content\n",
"User sees thefinal result\n",
"BEST PRACTICES\n",
"Evaluate performance after eachexperimentation to assess if itsworth exploring other paths\n",
"+ Implement guardrails if applicable\n",
"COMMON PITFALLS\n",
"➔ Going for fine-tuning withouttrying other approaches\n",
"➔ Not paying attention to theway the model is prompted\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: context window\n",
"How to manage context?\n",
"Depending on your use case, there areseveral things to consider whenincluding retrieved content into thecontext \n",
"window to generate an answer.\n",
"Things to consider\n",
"●\n",
"Context window max size:\n",
"○\n",
"○\n",
"There is a maximum size, so putting toomuch content is not ideal\n",
"In conversation use cases, theconversation will be part of the contextas well and will add to that size\n",
"●\n",
"Cost &amp; Latency vs Accuracy:\n",
"○ More context results in increased\n",
"latency and additional costs since therewill be more input tokens\n",
"Less context might also result indecreased accuracy\n",
"○\n",
"●\n",
"“Lost in the middle” problem:\n",
"○ When there is too much context, LLMstend to forget the text “in the middle” ofthe content and might look over \n",
"someimportant information.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: context window\n",
"How to manage context?\n",
"Depending on your use case, there areseveral things to consider whenincluding retrieved content into thecontext \n",
"window to generate an answer.\n",
"Things to consider\n",
"●\n",
"Context window max size:\n",
"○\n",
"○\n",
"There is a maximum size, so putting toomuch content is not ideal\n",
"In conversation use cases, theconversation will be part of the contextas well and will add to that size\n",
"●\n",
"Cost & Latency vs Accuracy:\n",
"○ More context results in increased\n",
"latency and additional costs since therewill be more input tokens\n",
"Less context might also result indecreased accuracy\n",
"○\n",
"●\n",
"“Lost in the middle” problem:\n",
"○ When there is too much context, LLMstend to forget the text “in the middle” ofthe content and might look over \n",
"someimportant information.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: optimisation\n",
"How to optimise?\n",
"There are a few differentmethods to consider whenoptimising a RAG application.\n",
"Try them from left to right, anditerate with several of theseapproaches if needed.\n",
"Prompt Engineering\n",
"Few-shot examples\n",
"Fine-tuning\n",
"At each point of theprocess, experiment withdifferent prompts to getthe expected input formator generate a \n",
"relevantoutput.\n",
"Try guiding the model ifthe process to get to thefinal outcome containsseveral steps.\n",
"If the model doesntbehave as expected,provide examples of whatyou want e.g. provideexample user inputs andthe \n",
"expected processingformat.\n",
"If giving a few examplesisnt enough, considerfine-tuning a model withmore examples for eachstep of the process: \n",
"youcan fine-tune to get aspecific input processingor output format.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: optimisation\n",
"How to optimise?\n",
"There are a few differentmethods to consider whenoptimising a RAG application.\n",
"Try them from left to right, anditerate with several of theseapproaches if needed.\n",
"Prompt Engineering\n",
"Few-shot examples\n",
"Fine-tuning\n",
"At each point of theprocess, experiment withdifferent prompts to getthe expected input formator generate a \n",
"relevantoutput.\n",
"Try guiding the model ifthe process to get to thefinal outcome containsseveral steps.\n",
"If the model doesntbehave as expected,provide examples of whatyou want e.g. provideexample user inputs andthe \n",
"expected processingformat.\n",
"If giving a few examplesisnt enough, considerfine-tuning a model withmore examples for eachstep of the process: \n",
"youcan fine-tune to get aspecific input processingor output format.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Answer Generation: safety checks\n",
"Why include safety checks?\n",
"Just because you provide the modelwith <span style=\"font-weight: bold\">(</span>supposedly<span style=\"font-weight: bold\">)</span> relevant contextdoesnt mean the answer willsystematically be \n",
"truthful or on-point.\n",
"Depending on the use case, youmight want to double-check.\n",
"Example evaluation framework: RAGAS\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Answer Generation: safety checks\n",
"Why include safety checks?\n",
"Just because you provide the modelwith \u001b[1m(\u001b[0msupposedly\u001b[1m)\u001b[0m relevant contextdoesnt mean the answer willsystematically be \n",
"truthful or on-point.\n",
"Depending on the use case, youmight want to double-check.\n",
"Example evaluation framework: RAGAS\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Overview**\n",
"Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> enhances language models by integrating them with a retrieval system. This \n",
"combination allows the model to access external knowledge sources, resulting in more accurate and contextually \n",
"relevant responses.\n",
"**Example Use Cases:**\n",
"- Providing answers with up-to-date information\n",
"- Generating contextual responses\n",
"**What Well Cover:**\n",
"- Technical patterns\n",
"- Best practices\n",
"- Common pitfalls\n",
"- Resources\n",
"</pre>\n"
],
"text/plain": [
"**Overview**\n",
"Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m enhances language models by integrating them with a retrieval system. This \n",
"combination allows the model to access external knowledge sources, resulting in more accurate and contextually \n",
"relevant responses.\n",
"**Example Use Cases:**\n",
"- Providing answers with up-to-date information\n",
"- Generating contextual responses\n",
"**What Well Cover:**\n",
"- Technical patterns\n",
"- Best practices\n",
"- Common pitfalls\n",
"- Resources\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns**\n",
"This image outlines four key technical patterns involved in data processing and answer generation:\n",
". **Data Preparation**\n",
" - **Chunking**: Breaking down data into smaller, manageable pieces.\n",
" - **Embeddings**: Converting data into numerical formats that can be easily processed by machine learning \n",
"models.\n",
" - **Augmenting Content**: Enhancing data with additional information to improve its quality or usefulness.\n",
". **Input Processing**\n",
" - **Input Augmentation**: Adding extra data or features to the input to improve model performance.\n",
" - **NER <span style=\"font-weight: bold\">(</span>Named Entity Recognition<span style=\"font-weight: bold\">)</span>**: Identifying and classifying key entities in the text, such as names, \n",
"dates, and locations.\n",
" - **Embeddings**: Similar to data preparation, embeddings are used here to represent input data in a format \n",
"suitable for processing.\n",
". **Retrieval**\n",
" - **Search**: Locating relevant information from a dataset.\n",
" - **Multi-step Retrieval**: Using multiple steps or methods to refine the search process and improve accuracy.\n",
" - **Re-ranking**: Adjusting the order of retrieved results based on relevance or other criteria.\n",
". **Answer Generation**\n",
" - **Context Window**: Using a specific portion of data to generate relevant answers.\n",
" - **Optimisation**: Improving the efficiency and accuracy of the answer generation process.\n",
" - **Safety Checks**: Ensuring that the generated answers are safe and appropriate for use.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns**\n",
"This image outlines four key technical patterns involved in data processing and answer generation:\n",
". **Data Preparation**\n",
" - **Chunking**: Breaking down data into smaller, manageable pieces.\n",
" - **Embeddings**: Converting data into numerical formats that can be easily processed by machine learning \n",
"models.\n",
" - **Augmenting Content**: Enhancing data with additional information to improve its quality or usefulness.\n",
". **Input Processing**\n",
" - **Input Augmentation**: Adding extra data or features to the input to improve model performance.\n",
" - **NER \u001b[1m(\u001b[0mNamed Entity Recognition\u001b[1m)\u001b[0m**: Identifying and classifying key entities in the text, such as names, \n",
"dates, and locations.\n",
" - **Embeddings**: Similar to data preparation, embeddings are used here to represent input data in a format \n",
"suitable for processing.\n",
". **Retrieval**\n",
" - **Search**: Locating relevant information from a dataset.\n",
" - **Multi-step Retrieval**: Using multiple steps or methods to refine the search process and improve accuracy.\n",
" - **Re-ranking**: Adjusting the order of retrieved results based on relevance or other criteria.\n",
". **Answer Generation**\n",
" - **Context Window**: Using a specific portion of data to generate relevant answers.\n",
" - **Optimisation**: Improving the efficiency and accuracy of the answer generation process.\n",
" - **Safety Checks**: Ensuring that the generated answers are safe and appropriate for use.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation**\n",
"This presentation focuses on the process of preparing data for easier consumption by large language models <span style=\"font-weight: bold\">(</span>LLMs<span style=\"font-weight: bold\">)</span>.\n",
". **Content Chunking**: - Documents are divided into smaller, manageable pieces. This makes it easier for LLMs to\n",
"process the information.\n",
". **Embeddings**:\n",
" - Each chunk of content is converted into embeddings, which are numerical representations <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span> that capture the semantic meaning of the text. These embeddings are then stored in a knowledge base.\n",
". **Augmenting Content**:\n",
" - Content can be enhanced using LLMs. For example, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can be used to rephrase, summarize, and generate bullet\n",
"points from the text.\n",
". **Best Practices**:\n",
" - Pre-process content for LLM consumption by adding summaries and headers for each part.\n",
" - Curate relevant data sources to ensure quality and relevance.\n",
". **Common Pitfalls**:\n",
" - Avoid having too much low-quality content.\n",
" - Ensure documents are not too large, as this can hinder processing efficiency.\n",
"This approach helps in organizing and optimizing data for better performance and understanding by LLMs.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation**\n",
"This presentation focuses on the process of preparing data for easier consumption by large language models \u001b[1m(\u001b[0mLLMs\u001b[1m)\u001b[0m.\n",
". **Content Chunking**: - Documents are divided into smaller, manageable pieces. This makes it easier for LLMs to\n",
"process the information.\n",
". **Embeddings**:\n",
" - Each chunk of content is converted into embeddings, which are numerical representations \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \n",
"\u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m that capture the semantic meaning of the text. These embeddings are then stored in a knowledge base.\n",
". **Augmenting Content**:\n",
" - Content can be enhanced using LLMs. For example, GPT-\u001b[1;36m4\u001b[0m can be used to rephrase, summarize, and generate bullet\n",
"points from the text.\n",
". **Best Practices**:\n",
" - Pre-process content for LLM consumption by adding summaries and headers for each part.\n",
" - Curate relevant data sources to ensure quality and relevance.\n",
". **Common Pitfalls**:\n",
" - Avoid having too much low-quality content.\n",
" - Ensure documents are not too large, as this can hinder processing efficiency.\n",
"This approach helps in organizing and optimizing data for better performance and understanding by LLMs.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation - Chunking**\n",
"**Why Chunking?**\n",
"Chunking is a technique used when your system doesn't need entire documents to provide relevant answers. By \n",
"breaking documents into smaller pieces, you can make data easier to process, which reduces cost and latency. This \n",
"approach is beneficial for systems that need to handle large volumes of data efficiently. Other methods for data \n",
"preparation include using graphs or map-reduce.\n",
"**Things to Consider**\n",
". **Overlap:**\n",
" - Should chunks be independent or overlap with one another?\n",
" - If they overlap, by how much should they do so?\n",
". **Size of Chunks:**\n",
" - What is the optimal chunk size for your specific use case?\n",
" - Do you want to include a lot of information in the context window, or just the minimum necessary?\n",
". **Where to Chunk:**\n",
" - Should you chunk every N tokens or use specific separators?\n",
" - Is there a logical way to split the context that would aid the retrieval process?\n",
". **What to Return:**\n",
" - Should you return chunks across multiple documents or focus on top chunks within the same document?\n",
" - Should chunks be linked together with metadata to indicate common properties?\n",
"These considerations help in designing an efficient chunking strategy that aligns with your system's requirements \n",
"and goals.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation - Chunking**\n",
"**Why Chunking?**\n",
"Chunking is a technique used when your system doesn't need entire documents to provide relevant answers. By \n",
"breaking documents into smaller pieces, you can make data easier to process, which reduces cost and latency. This \n",
"approach is beneficial for systems that need to handle large volumes of data efficiently. Other methods for data \n",
"preparation include using graphs or map-reduce.\n",
"**Things to Consider**\n",
". **Overlap:**\n",
" - Should chunks be independent or overlap with one another?\n",
" - If they overlap, by how much should they do so?\n",
". **Size of Chunks:**\n",
" - What is the optimal chunk size for your specific use case?\n",
" - Do you want to include a lot of information in the context window, or just the minimum necessary?\n",
". **Where to Chunk:**\n",
" - Should you chunk every N tokens or use specific separators?\n",
" - Is there a logical way to split the context that would aid the retrieval process?\n",
". **What to Return:**\n",
" - Should you return chunks across multiple documents or focus on top chunks within the same document?\n",
" - Should chunks be linked together with metadata to indicate common properties?\n",
"These considerations help in designing an efficient chunking strategy that aligns with your system's requirements \n",
"and goals.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Data Preparation - Embeddings\n",
"## What to Embed?\n",
"When preparing data for embedding, it's important to consider not just the text but also the metadata. This \n",
"approach can enhance the searchability and relevance of the data. Here are some examples:\n",
"### Examples\n",
". **Embedding Q&amp;A Posts in a Forum**\n",
" - You might want to include the title of the posts, the original question, and the top answers.\n",
" - Additionally, if the posts are tagged by topic or keywords, these can be embedded as well.\n",
". **Embedding Product Specs**\n",
" - Besides embedding the text from product descriptions, you can add metadata such as color, size, and other \n",
"specifications to your embeddings.\n",
"By embedding both text and metadata, you can improve the ability to surface specific chunks or documents during a \n",
"search.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Data Preparation - Embeddings\n",
"## What to Embed?\n",
"When preparing data for embedding, it's important to consider not just the text but also the metadata. This \n",
"approach can enhance the searchability and relevance of the data. Here are some examples:\n",
"### Examples\n",
". **Embedding Q&A Posts in a Forum**\n",
" - You might want to include the title of the posts, the original question, and the top answers.\n",
" - Additionally, if the posts are tagged by topic or keywords, these can be embedded as well.\n",
". **Embedding Product Specs**\n",
" - Besides embedding the text from product descriptions, you can add metadata such as color, size, and other \n",
"specifications to your embeddings.\n",
"By embedding both text and metadata, you can improve the ability to surface specific chunks or documents during a \n",
"search.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Data Preparation - Augmenting Content**\n",
"**What does “Augmenting content” mean?**\n",
"Augmenting content involves modifying the original material to make it more accessible and understandable for \n",
"systems that rely on Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span>. These modifications can include changes in format, \n",
"wording, or the addition of descriptive elements like summaries or keywords.\n",
"**Example Approaches:**\n",
". **Make it a Guide:**\n",
" - Reformat the content into a step-by-step guide with clear headings and bullet points. This structure is more \n",
"easily understood by a Language Learning Model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span>. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can assist with this transformation using the right \n",
"prompts.\n",
". **Add Descriptive Metadata:**\n",
" - Incorporate keywords or text that users might search for when considering a specific product or service. This \n",
"helps in making the content more searchable and relevant.\n",
". **Multimodality:**\n",
" - Utilize models like Whisper or GPT-4V to convert audio or visual content into text. For instance, GPT-4V can \n",
"generate tags for images or describe slides, enhancing the content's accessibility and utility.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Data Preparation - Augmenting Content**\n",
"**What does “Augmenting content” mean?**\n",
"Augmenting content involves modifying the original material to make it more accessible and understandable for \n",
"systems that rely on Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m. These modifications can include changes in format, \n",
"wording, or the addition of descriptive elements like summaries or keywords.\n",
"**Example Approaches:**\n",
". **Make it a Guide:**\n",
" - Reformat the content into a step-by-step guide with clear headings and bullet points. This structure is more \n",
"easily understood by a Language Learning Model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m. GPT-\u001b[1;36m4\u001b[0m can assist with this transformation using the right \n",
"prompts.\n",
". **Add Descriptive Metadata:**\n",
" - Incorporate keywords or text that users might search for when considering a specific product or service. This \n",
"helps in making the content more searchable and relevant.\n",
". **Multimodality:**\n",
" - Utilize models like Whisper or GPT-4V to convert audio or visual content into text. For instance, GPT-4V can \n",
"generate tags for images or describe slides, enhancing the content's accessibility and utility.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Input Processing**\n",
" methods for processing input data according to specific tasks, focusing on three main areas: Q&amp;A, content search, \n",
"and database <span style=\"font-weight: bold\">(</span>DB<span style=\"font-weight: bold\">)</span> search.\n",
". **Q&amp;A**: - Uses a technique called HyDE, where a large language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> is asked to hypothetically answer a\n",
"question. This answer is then used to search the knowledge base <span style=\"font-weight: bold\">(</span>KB<span style=\"font-weight: bold\">)</span>.\n",
". **Content Search**:\n",
" - Involves prompting the LLM to rephrase the input and optionally add more context to improve search results.\n",
". **DB Search**:\n",
" - Utilizes Named Entity Recognition <span style=\"font-weight: bold\">(</span>NER<span style=\"font-weight: bold\">)</span> to find relevant entities. These entities are then used for keyword \n",
"searches or to construct a search query.\n",
" highlights different output formats:\n",
"- **Embeddings**: Numerical representations of data, such as vectors <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span>.\n",
"- **Query**: SQL-like statements for database searches <span style=\"font-weight: bold\">(</span>e.g., SELECT * from items<span style=\"font-weight: bold\">)</span>.\n",
"- **Keywords**: Specific terms extracted from the input <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008000; text-decoration-color: #008000\">\"red,\"</span> <span style=\"color: #008000; text-decoration-color: #008000\">\"summer\"</span><span style=\"font-weight: bold\">)</span>.\n",
"**Best Practices**:\n",
"- Transform the input to match the content in the database.\n",
"- Use metadata to enhance user input.\n",
"**Common Pitfalls**:\n",
"- Avoid directly comparing input to the database without considering the specific requirements of the task.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Input Processing**\n",
" methods for processing input data according to specific tasks, focusing on three main areas: Q&A, content search, \n",
"and database \u001b[1m(\u001b[0mDB\u001b[1m)\u001b[0m search.\n",
". **Q&A**: - Uses a technique called HyDE, where a large language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m is asked to hypothetically answer a\n",
"question. This answer is then used to search the knowledge base \u001b[1m(\u001b[0mKB\u001b[1m)\u001b[0m.\n",
". **Content Search**:\n",
" - Involves prompting the LLM to rephrase the input and optionally add more context to improve search results.\n",
". **DB Search**:\n",
" - Utilizes Named Entity Recognition \u001b[1m(\u001b[0mNER\u001b[1m)\u001b[0m to find relevant entities. These entities are then used for keyword \n",
"searches or to construct a search query.\n",
" highlights different output formats:\n",
"- **Embeddings**: Numerical representations of data, such as vectors \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m.\n",
"- **Query**: SQL-like statements for database searches \u001b[1m(\u001b[0me.g., SELECT * from items\u001b[1m)\u001b[0m.\n",
"- **Keywords**: Specific terms extracted from the input \u001b[1m(\u001b[0me.g., \u001b[32m\"red,\"\u001b[0m \u001b[32m\"summer\"\u001b[0m\u001b[1m)\u001b[0m.\n",
"**Best Practices**:\n",
"- Transform the input to match the content in the database.\n",
"- Use metadata to enhance user input.\n",
"**Common Pitfalls**:\n",
"- Avoid directly comparing input to the database without considering the specific requirements of the task.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Input Processing - Input Augmentation**\n",
"**What is input augmentation?**\n",
"Input augmentation involves transforming the input into something different, such as rephrasing it, splitting it \n",
"into several inputs, or expanding it. This process enhances performance by helping the language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> better \n",
"understand the user's intent.\n",
"**Example Approaches:**\n",
". **Query Expansion**\n",
" - Rephrase the query to make it more descriptive. This helps the LLM grasp the context and details more \n",
"effectively.\n",
". **HyDE**\n",
" - Hypothetically answer the question and use that answer to search the knowledge base <span style=\"font-weight: bold\">(</span>KB<span style=\"font-weight: bold\">)</span>. This approach can \n",
"provide more relevant results by anticipating possible answers.\n",
". **Splitting a Query in N**\n",
" - When a user query contains multiple questions or intents, consider dividing it into several queries. This \n",
"ensures each part is addressed thoroughly.\n",
". **Fallback**\n",
" - Implement a flow where the LLM can ask for clarification if the original query lacks sufficient information. \n",
"This is particularly useful when using tools that require precise input.\n",
"*Note: GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> can perform these tasks with the appropriate prompt.*\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Input Processing - Input Augmentation**\n",
"**What is input augmentation?**\n",
"Input augmentation involves transforming the input into something different, such as rephrasing it, splitting it \n",
"into several inputs, or expanding it. This process enhances performance by helping the language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m better \n",
"understand the user's intent.\n",
"**Example Approaches:**\n",
". **Query Expansion**\n",
" - Rephrase the query to make it more descriptive. This helps the LLM grasp the context and details more \n",
"effectively.\n",
". **HyDE**\n",
" - Hypothetically answer the question and use that answer to search the knowledge base \u001b[1m(\u001b[0mKB\u001b[1m)\u001b[0m. This approach can \n",
"provide more relevant results by anticipating possible answers.\n",
". **Splitting a Query in N**\n",
" - When a user query contains multiple questions or intents, consider dividing it into several queries. This \n",
"ensures each part is addressed thoroughly.\n",
". **Fallback**\n",
" - Implement a flow where the LLM can ask for clarification if the original query lacks sufficient information. \n",
"This is particularly useful when using tools that require precise input.\n",
"*Note: GPT-\u001b[1;36m4\u001b[0m can perform these tasks with the appropriate prompt.*\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Input Processing - NER\n",
"**Why use NER?**\n",
"Named Entity Recognition <span style=\"font-weight: bold\">(</span>NER<span style=\"font-weight: bold\">)</span> is a technique used to extract relevant entities from input data. This process is \n",
"beneficial for creating more deterministic search queries, especially when the scope is very constrained. By \n",
"identifying specific entities, such as names, dates, or locations, NER helps in refining and improving the accuracy\n",
"of searches.\n",
"**Example: Searching for Movies**\n",
"Consider a structured database containing metadata on movies. By using NER, you can extract specific entities like \n",
"genre, actors, or directors' names from a user's query. This information can then be used to search the database \n",
"more effectively.\n",
"**Note:** After extracting the relevant entities, you can use exact values or embeddings to enhance the search \n",
"process.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Input Processing - NER\n",
"**Why use NER?**\n",
"Named Entity Recognition \u001b[1m(\u001b[0mNER\u001b[1m)\u001b[0m is a technique used to extract relevant entities from input data. This process is \n",
"beneficial for creating more deterministic search queries, especially when the scope is very constrained. By \n",
"identifying specific entities, such as names, dates, or locations, NER helps in refining and improving the accuracy\n",
"of searches.\n",
"**Example: Searching for Movies**\n",
"Consider a structured database containing metadata on movies. By using NER, you can extract specific entities like \n",
"genre, actors, or directors' names from a user's query. This information can then be used to search the database \n",
"more effectively.\n",
"**Note:** After extracting the relevant entities, you can use exact values or embeddings to enhance the search \n",
"process.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Retrieval\n",
"This diagram illustrates a retrieval process using technical patterns. The process begins with three types of \n",
"input: embeddings, queries, and keywords.\n",
". **Embeddings**: These are numerical representations <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.983</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.123</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.289</span><span style=\"font-weight: bold\">)</span> used for semantic search. They \n",
"are processed through a vector database <span style=\"font-weight: bold\">(</span>vector DB<span style=\"font-weight: bold\">)</span>.\n",
". **Query**: This involves structured queries <span style=\"font-weight: bold\">(</span>e.g., <span style=\"color: #008000; text-decoration-color: #008000\">\"SELECT * from items...\"</span><span style=\"font-weight: bold\">)</span> that interact with a relational or \n",
"NoSQL database.\n",
". **Keywords**: Simple search terms like <span style=\"color: #008000; text-decoration-color: #008000\">\"red\"</span> and <span style=\"color: #008000; text-decoration-color: #008000\">\"summer\"</span> are also used with the relational or NoSQL database.\n",
"The results from both the vector and relational/NoSQL databases are combined. The initial results undergo a \n",
"re-ranking process to ensure accuracy and relevance, leading to the final result, which is then used to generate \n",
"output.\n",
"**Best Practices**:\n",
"- Combine semantic search with deterministic queries for more effective retrieval.\n",
"- Cache outputs where possible to improve efficiency.\n",
"**Common Pitfalls**:\n",
"- Incorrect element comparison during text similarity checks can occur, highlighting the importance of re-ranking \n",
"to ensure accurate results.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Retrieval\n",
"This diagram illustrates a retrieval process using technical patterns. The process begins with three types of \n",
"input: embeddings, queries, and keywords.\n",
". **Embeddings**: These are numerical representations \u001b[1m(\u001b[0me.g., \u001b[1;36m0.983\u001b[0m, \u001b[1;36m0.123\u001b[0m, \u001b[1;36m0.289\u001b[0m\u001b[1m)\u001b[0m used for semantic search. They \n",
"are processed through a vector database \u001b[1m(\u001b[0mvector DB\u001b[1m)\u001b[0m.\n",
". **Query**: This involves structured queries \u001b[1m(\u001b[0me.g., \u001b[32m\"SELECT * from items...\"\u001b[0m\u001b[1m)\u001b[0m that interact with a relational or \n",
"NoSQL database.\n",
". **Keywords**: Simple search terms like \u001b[32m\"red\"\u001b[0m and \u001b[32m\"summer\"\u001b[0m are also used with the relational or NoSQL database.\n",
"The results from both the vector and relational/NoSQL databases are combined. The initial results undergo a \n",
"re-ranking process to ensure accuracy and relevance, leading to the final result, which is then used to generate \n",
"output.\n",
"**Best Practices**:\n",
"- Combine semantic search with deterministic queries for more effective retrieval.\n",
"- Cache outputs where possible to improve efficiency.\n",
"**Common Pitfalls**:\n",
"- Incorrect element comparison during text similarity checks can occur, highlighting the importance of re-ranking \n",
"to ensure accurate results.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Retrieval - Search\n",
"**How to search?**\n",
"There are various approaches to searching, which depend on the use case and the existing system. Here are three \n",
"main methods:\n",
". **Semantic Search**:\n",
" - This method uses embeddings to perform searches. - By comparing embeddings with the data in your database, \n",
"you can find the most similar matches.\n",
". **Keyword Search**:\n",
" - If you have specific entities or keywords extracted, you can search for these directly in your database.\n",
". **Search Query**:\n",
" - Based on extracted entities or direct user input, you can construct search queries <span style=\"font-weight: bold\">(</span>such as SQL or Cypher<span style=\"font-weight: bold\">)</span> to \n",
"search your database.\n",
"Additionally, you can use a hybrid approach by combining several methods. This can involve performing multiple \n",
"searches in parallel or in sequence, or searching for keywords along with their embeddings.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Retrieval - Search\n",
"**How to search?**\n",
"There are various approaches to searching, which depend on the use case and the existing system. Here are three \n",
"main methods:\n",
". **Semantic Search**:\n",
" - This method uses embeddings to perform searches. - By comparing embeddings with the data in your database, \n",
"you can find the most similar matches.\n",
". **Keyword Search**:\n",
" - If you have specific entities or keywords extracted, you can search for these directly in your database.\n",
". **Search Query**:\n",
" - Based on extracted entities or direct user input, you can construct search queries \u001b[1m(\u001b[0msuch as SQL or Cypher\u001b[1m)\u001b[0m to \n",
"search your database.\n",
"Additionally, you can use a hybrid approach by combining several methods. This can involve performing multiple \n",
"searches in parallel or in sequence, or searching for keywords along with their embeddings.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Retrieval - Multi-step Retrieval**\n",
"**What is multi-step retrieval?**\n",
"Multi-step retrieval involves performing several actions to obtain the necessary information to generate an answer.\n",
"This approach is useful when a single step is insufficient to gather all required data.\n",
"**Things to Consider**\n",
". **Framework to be Used:**\n",
" - When multiple steps are needed, decide whether to manage this process yourself or use a framework to simplify \n",
"the task.\n",
". **Cost &amp; Latency:**\n",
" - Performing multiple steps can significantly increase both latency and cost.\n",
" - To mitigate latency, consider executing actions in parallel.\n",
". **Chain of Thought:**\n",
" - Use a chain of thought approach to guide the process. Break down instructions into clear steps, providing \n",
"guidelines on whether to continue, stop, or take alternative actions.\n",
" - This method is particularly useful for tasks that must be performed sequentially, such as <span style=\"color: #008000; text-decoration-color: #008000\">\"if this didnt </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">work, then do this.\"</span>\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Retrieval - Multi-step Retrieval**\n",
"**What is multi-step retrieval?**\n",
"Multi-step retrieval involves performing several actions to obtain the necessary information to generate an answer.\n",
"This approach is useful when a single step is insufficient to gather all required data.\n",
"**Things to Consider**\n",
". **Framework to be Used:**\n",
" - When multiple steps are needed, decide whether to manage this process yourself or use a framework to simplify \n",
"the task.\n",
". **Cost & Latency:**\n",
" - Performing multiple steps can significantly increase both latency and cost.\n",
" - To mitigate latency, consider executing actions in parallel.\n",
". **Chain of Thought:**\n",
" - Use a chain of thought approach to guide the process. Break down instructions into clear steps, providing \n",
"guidelines on whether to continue, stop, or take alternative actions.\n",
" - This method is particularly useful for tasks that must be performed sequentially, such as \u001b[32m\"if this didnt \u001b[0m\n",
"\u001b[32mwork, then do this.\"\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Retrieval - Re-ranking**\n",
"**What is re-ranking?**\n",
"Re-ranking involves re-ordering the results of a retrieval process to highlight more relevant outcomes. This is \n",
"especially crucial in semantic searches, where understanding the context and meaning of queries is important.\n",
"**Example Approaches**\n",
". **Rule-based Re-ranking**\n",
" - This approach uses metadata to rank results by relevance. For instance, you might consider the recency of \n",
"documents, tags, or specific keywords in the title to determine their importance.\n",
". **Re-ranking Algorithms**\n",
" - There are various algorithms available for re-ranking based on specific use cases. Examples include BERT-based\n",
"re-rankers, cross-encoder re-ranking, and TF-IDF algorithms. These methods apply different techniques to assess and\n",
"order the relevance of search results.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Retrieval - Re-ranking**\n",
"**What is re-ranking?**\n",
"Re-ranking involves re-ordering the results of a retrieval process to highlight more relevant outcomes. This is \n",
"especially crucial in semantic searches, where understanding the context and meaning of queries is important.\n",
"**Example Approaches**\n",
". **Rule-based Re-ranking**\n",
" - This approach uses metadata to rank results by relevance. For instance, you might consider the recency of \n",
"documents, tags, or specific keywords in the title to determine their importance.\n",
". **Re-ranking Algorithms**\n",
" - There are various algorithms available for re-ranking based on specific use cases. Examples include BERT-based\n",
"re-rankers, cross-encoder re-ranking, and TF-IDF algorithms. These methods apply different techniques to assess and\n",
"order the relevance of search results.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Answer Generation**\n",
"This diagram illustrates the process of generating answers using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span>. Here's a breakdown of the \n",
"components and concepts:\n",
". **Process Flow:**\n",
" - A piece of content is retrieved and used to create a prompt.\n",
" - This prompt is fed into the LLM, which processes it to generate a final result.\n",
" - The user then sees this final result.\n",
". **Best Practices:**\n",
" - It's important to evaluate performance after each experiment. This helps determine if exploring other methods \n",
"is beneficial.\n",
" - Implementing guardrails can be useful to ensure the model's outputs are safe and reliable.\n",
". **Common Pitfalls:**\n",
" - Avoid jumping straight to fine-tuning the model without considering other approaches that might be more \n",
"effective or efficient.\n",
" - Pay close attention to how the model is prompted, as this can significantly impact the quality of the output.\n",
"By following these guidelines, you can optimize the use of LLMs for generating accurate and useful answers.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Answer Generation**\n",
"This diagram illustrates the process of generating answers using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m. Here's a breakdown of the \n",
"components and concepts:\n",
". **Process Flow:**\n",
" - A piece of content is retrieved and used to create a prompt.\n",
" - This prompt is fed into the LLM, which processes it to generate a final result.\n",
" - The user then sees this final result.\n",
". **Best Practices:**\n",
" - It's important to evaluate performance after each experiment. This helps determine if exploring other methods \n",
"is beneficial.\n",
" - Implementing guardrails can be useful to ensure the model's outputs are safe and reliable.\n",
". **Common Pitfalls:**\n",
" - Avoid jumping straight to fine-tuning the model without considering other approaches that might be more \n",
"effective or efficient.\n",
" - Pay close attention to how the model is prompted, as this can significantly impact the quality of the output.\n",
"By following these guidelines, you can optimize the use of LLMs for generating accurate and useful answers.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Answer Generation - Context Window\n",
"## How to Manage Context?\n",
"When generating answers using a context window, it's important to consider several factors based on your specific \n",
"use case. Here are key points to keep in mind:\n",
"### Things to Consider\n",
"- **Context Window Max Size:**\n",
" - The context window has a maximum size, so overloading it with too much content is not ideal.\n",
" - In conversational scenarios, the conversation itself becomes part of the context, contributing to the overall \n",
"size.\n",
"- **Cost &amp; Latency vs. Accuracy:**\n",
" - Including more context can lead to increased latency and higher costs due to the additional input tokens \n",
"required.\n",
" - Conversely, using less context might reduce accuracy.\n",
"- **<span style=\"color: #008000; text-decoration-color: #008000\">\"Lost in the Middle\"</span> Problem:**\n",
" - When the context is too extensive, language models may overlook or forget information that is <span style=\"color: #008000; text-decoration-color: #008000\">\"in the middle\"</span> \n",
"of the content, potentially missing important details.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Answer Generation - Context Window\n",
"## How to Manage Context?\n",
"When generating answers using a context window, it's important to consider several factors based on your specific \n",
"use case. Here are key points to keep in mind:\n",
"### Things to Consider\n",
"- **Context Window Max Size:**\n",
" - The context window has a maximum size, so overloading it with too much content is not ideal.\n",
" - In conversational scenarios, the conversation itself becomes part of the context, contributing to the overall \n",
"size.\n",
"- **Cost & Latency vs. Accuracy:**\n",
" - Including more context can lead to increased latency and higher costs due to the additional input tokens \n",
"required.\n",
" - Conversely, using less context might reduce accuracy.\n",
"- **\u001b[32m\"Lost in the Middle\"\u001b[0m Problem:**\n",
" - When the context is too extensive, language models may overlook or forget information that is \u001b[32m\"in the middle\"\u001b[0m \n",
"of the content, potentially missing important details.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Answer Generation Optimisation**\n",
"**How to optimise?**\n",
"When optimising a Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> application, there are several methods to consider. These \n",
"methods should be tried sequentially from left to right, and multiple approaches can be iterated if necessary.\n",
". **Prompt Engineering**\n",
" - Experiment with different prompts at each stage of the process to achieve the desired input format or generate\n",
"relevant output.\n",
" - Guide the model through multiple steps to reach the final outcome.\n",
". **Few-shot Examples**\n",
" - If the model's behavior is not as expected, provide examples of the desired outcome.\n",
" - Include sample user inputs and the expected processing format to guide the model.\n",
". **Fine-tuning**\n",
" - If a few examples are insufficient, consider fine-tuning the model with more examples for each process step.\n",
" - Fine-tuning can help achieve a specific input processing or output format.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Answer Generation Optimisation**\n",
"**How to optimise?**\n",
"When optimising a Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m application, there are several methods to consider. These \n",
"methods should be tried sequentially from left to right, and multiple approaches can be iterated if necessary.\n",
". **Prompt Engineering**\n",
" - Experiment with different prompts at each stage of the process to achieve the desired input format or generate\n",
"relevant output.\n",
" - Guide the model through multiple steps to reach the final outcome.\n",
". **Few-shot Examples**\n",
" - If the model's behavior is not as expected, provide examples of the desired outcome.\n",
" - Include sample user inputs and the expected processing format to guide the model.\n",
". **Fine-tuning**\n",
" - If a few examples are insufficient, consider fine-tuning the model with more examples for each process step.\n",
" - Fine-tuning can help achieve a specific input processing or output format.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Answer Generation - Safety Checks\n",
"**Why include safety checks?**\n",
"Safety checks are crucial because providing a model with supposedly relevant context does not guarantee that the \n",
"generated answer will be truthful or accurate. Depending on the use case, it is important to double-check the \n",
"information to ensure reliability.\n",
"**RAGAS Score Evaluation Framework**\n",
"The RAGAS score is an evaluation framework that assesses both the generation and retrieval aspects of answer \n",
"generation:\n",
"- **Generation:**\n",
" - **Faithfulness:** This measures how factually accurate the generated answer is.\n",
" - **Answer Relevancy:** This evaluates how relevant the generated answer is to the question.\n",
"- **Retrieval:**\n",
" - **Context Precision:** This assesses the signal-to-noise ratio of the retrieved context, ensuring that the \n",
"information is precise.\n",
" - **Context Recall:** This checks if all relevant information required to answer the question is retrieved.\n",
"By using this framework, one can systematically evaluate and improve the quality of generated answers.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Answer Generation - Safety Checks\n",
"**Why include safety checks?**\n",
"Safety checks are crucial because providing a model with supposedly relevant context does not guarantee that the \n",
"generated answer will be truthful or accurate. Depending on the use case, it is important to double-check the \n",
"information to ensure reliability.\n",
"**RAGAS Score Evaluation Framework**\n",
"The RAGAS score is an evaluation framework that assesses both the generation and retrieval aspects of answer \n",
"generation:\n",
"- **Generation:**\n",
" - **Faithfulness:** This measures how factually accurate the generated answer is.\n",
" - **Answer Relevancy:** This evaluates how relevant the generated answer is to the question.\n",
"- **Retrieval:**\n",
" - **Context Precision:** This assesses the signal-to-noise ratio of the retrieved context, ensuring that the \n",
"information is precise.\n",
" - **Context Recall:** This checks if all relevant information required to answer the question is retrieved.\n",
"By using this framework, one can systematically evaluate and improve the quality of generated answers.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo , gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> , and gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview point to the latest model\n",
"version. You can verify this by looking at the response object after sending a request.\n",
"The response will include the specific model version used <span style=\"font-weight: bold\">(</span>e.g. gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span> <span style=\"font-weight: bold\">)</span>.\n",
"We also offer static model versions that developers can continue using for at least\n",
"three months after an updated model has been introduced. With the new cadence of\n",
"model updates, we are also giving people the ability to contribute evals to help us\n",
"improve the model for different use cases. If you are interested, check out the OpenAI\n",
"Evals repository.\n",
"Learn more about model deprecation on our deprecation page.\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is a large multimodal model <span style=\"font-weight: bold\">(</span>accepting text or image inputs and outputting text<span style=\"font-weight: bold\">)</span>\n",
"that can solve difficult problems with greater accuracy than any of our previous\n",
"models, thanks to its broader general knowledge and advanced reasoning capabilities.\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is available in the OpenAI API to paying customers. Like gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo , GPT-\n",
" is optimized for chat but works well for traditional completions tasks using the Chat\n",
"Completions API. Learn how to use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> in our text generation guide.\n",
"MODEL\n",
"DE S CRIPTION\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview\n",
"New GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"Up to\n",
"Dec\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>\n",
"The latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> model\n",
"tokens\n",
"intended to reduce cases of\n",
"“laziness” where the model\n",
"doesnt complete a task.\n",
"Returns a maximum of\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
"Learn more.\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">25</span>-preview.\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-preview\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo model\n",
"featuring improved\n",
"instruction following, JSON\n",
"mode, reproducible outputs,\n",
"parallel function calling, and\n",
"more. Returns a maximum\n",
"of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens. This\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"Up to\n",
"Dec\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo , gpt-\u001b[1;36m4\u001b[0m , and gpt-\u001b[1;36m4\u001b[0m-turbo-preview point to the latest model\n",
"version. You can verify this by looking at the response object after sending a request.\n",
"The response will include the specific model version used \u001b[1m(\u001b[0me.g. gpt-\u001b[1;36m3.5\u001b[0m-turbo-\n",
"\u001b[1;36m13\u001b[0m \u001b[1m)\u001b[0m.\n",
"We also offer static model versions that developers can continue using for at least\n",
"three months after an updated model has been introduced. With the new cadence of\n",
"model updates, we are also giving people the ability to contribute evals to help us\n",
"improve the model for different use cases. If you are interested, check out the OpenAI\n",
"Evals repository.\n",
"Learn more about model deprecation on our deprecation page.\n",
"GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m4\u001b[0m Turbo\n",
"GPT-\u001b[1;36m4\u001b[0m is a large multimodal model \u001b[1m(\u001b[0maccepting text or image inputs and outputting text\u001b[1m)\u001b[0m\n",
"that can solve difficult problems with greater accuracy than any of our previous\n",
"models, thanks to its broader general knowledge and advanced reasoning capabilities.\n",
"GPT-\u001b[1;36m4\u001b[0m is available in the OpenAI API to paying customers. Like gpt-\u001b[1;36m3.5\u001b[0m-turbo , GPT-\n",
" is optimized for chat but works well for traditional completions tasks using the Chat\n",
"Completions API. Learn how to use GPT-\u001b[1;36m4\u001b[0m in our text generation guide.\n",
"MODEL\n",
"DE S CRIPTION\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview\n",
"New GPT-\u001b[1;36m4\u001b[0m Turbo\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"Up to\n",
"Dec\n",
"\u001b[1;36m23\u001b[0m\n",
"The latest GPT-\u001b[1;36m4\u001b[0m model\n",
"tokens\n",
"intended to reduce cases of\n",
"“laziness” where the model\n",
"doesnt complete a task.\n",
"Returns a maximum of\n",
",\u001b[1;36m096\u001b[0m output tokens.\n",
"Learn more.\n",
"gpt-\u001b[1;36m4\u001b[0m-turbo-preview\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
"\u001b[1;36m25\u001b[0m-preview.\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-preview\n",
"GPT-\u001b[1;36m4\u001b[0m Turbo model\n",
"featuring improved\n",
"instruction following, JSON\n",
"mode, reproducible outputs,\n",
"parallel function calling, and\n",
"more. Returns a maximum\n",
"of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens. This\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"Up to\n",
"Dec\n",
"\u001b[1;36m23\u001b[0m\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"is a preview model.\n",
"Learn more.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> with the ability to\n",
"understand images, in\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"Turbo capabilities. Currently\n",
"points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-\n",
"vision-preview.\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> with the ability to\n",
"understand images, in\n",
"addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"Turbo capabilities. Returns a\n",
"maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output\n",
"tokens. This is a preview\n",
"model version. Learn more.\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span>\n",
"tokens\n",
"Up to\n",
"Apr <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span>\n",
"Up to\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>. See\n",
"tokens\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"continuous model upgrades.\n",
"Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> from\n",
"June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> with\n",
"improved function calling\n",
"support.\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span>\n",
"tokens\n",
"Up to\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. See\n",
"continuous model upgrades.\n",
"This model was never rolled\n",
"out widely in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"Turbo.\n",
"Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k\n",
"from June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> with\n",
"improved function calling\n",
"support. This model was\n",
"never rolled out widely in\n",
"favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"tokens\n",
"Up to\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"tokens\n",
"Up to\n",
"Sep <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"For many basic tasks, the difference between GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> models is not\n",
"significant. However, in more complex reasoning situations, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is much more\n",
"capable than any of our previous models.\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"is a preview model.\n",
"Learn more.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
"gpt-\u001b[1;36m4\u001b[0m-vision-preview\n",
"GPT-\u001b[1;36m4\u001b[0m with the ability to\n",
"understand images, in\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"addition to all other GPT-\u001b[1;36m4\u001b[0m\n",
"Turbo capabilities. Currently\n",
"points to gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-\n",
"vision-preview.\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview GPT-\u001b[1;36m4\u001b[0m with the ability to\n",
"understand images, in\n",
"addition to all other GPT-\u001b[1;36m4\u001b[0m\n",
"Turbo capabilities. Returns a\n",
"maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output\n",
"tokens. This is a preview\n",
"model version. Learn more.\n",
"\u001b[1;36m8\u001b[0m,\u001b[1;36m000\u001b[0m\n",
"tokens\n",
"Up to\n",
"Apr \u001b[1;36m2023\u001b[0m\n",
"gpt-\u001b[1;36m4\u001b[0m\n",
"gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
",\u001b[1;36m192\u001b[0m\n",
"Up to\n",
"\u001b[1;36m13\u001b[0m. See\n",
"tokens\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"continuous model upgrades.\n",
"Snapshot of gpt-\u001b[1;36m4\u001b[0m from\n",
"June 13th \u001b[1;36m2023\u001b[0m with\n",
"improved function calling\n",
"support.\n",
",\u001b[1;36m192\u001b[0m\n",
"tokens\n",
"Up to\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"gpt-\u001b[1;36m4\u001b[0m-32k\n",
"Currently points to gpt-\u001b[1;36m4\u001b[0m-\n",
"gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m\n",
"k-\u001b[1;36m0613\u001b[0m. See\n",
"continuous model upgrades.\n",
"This model was never rolled\n",
"out widely in favor of GPT-\u001b[1;36m4\u001b[0m\n",
"Turbo.\n",
"Snapshot of gpt-\u001b[1;36m4\u001b[0m-32k\n",
"from June 13th \u001b[1;36m2023\u001b[0m with\n",
"improved function calling\n",
"support. This model was\n",
"never rolled out widely in\n",
"favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
",\u001b[1;36m768\u001b[0m\n",
"tokens\n",
"Up to\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
",\u001b[1;36m768\u001b[0m\n",
"tokens\n",
"Up to\n",
"Sep \u001b[1;36m2021\u001b[0m\n",
"For many basic tasks, the difference between GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m3.5\u001b[0m models is not\n",
"significant. However, in more complex reasoning situations, GPT-\u001b[1;36m4\u001b[0m is much more\n",
"capable than any of our previous models.\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"Multilingual capabilities\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> outperforms both previous large language models and as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, most state-\n",
"of-the-art systems <span style=\"font-weight: bold\">(</span>which often have benchmark-specific training or hand-\n",
"engineering<span style=\"font-weight: bold\">)</span>. On the MMLU benchmark, an English-language suite of multiple-choice\n",
"questions covering <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">57</span> subjects, GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> not only outperforms existing models by a\n",
"considerable margin in English, but also demonstrates strong performance in other\n",
"languages.\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo models can understand and generate natural language or code and\n",
"have been optimized for chat using the Chat Completions API but work well for non-\n",
"chat tasks as well.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"MODEL\n",
"DE S CRIPTION\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>\n",
"New Updated GPT <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"The latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo\n",
"model with higher accuracy at\n",
"responding in requested\n",
"formats and a fix for a bug\n",
"which caused a text encoding\n",
"issue for non-English\n",
"language function calls.\n",
"Returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"output tokens. Learn more.\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
"Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"Up to Sep\n",
"turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. The gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"tokens\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"turbo model alias will be\n",
"automatically upgraded from\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> to\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span> on\n",
"February 16th.\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo model with\n",
"improved instruction\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"following, JSON mode,\n",
"reproducible outputs, parallel\n",
"function calling, and more.\n",
"Returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"output tokens. Learn more.\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"Multilingual capabilities\n",
"GPT-\u001b[1;36m4\u001b[0m outperforms both previous large language models and as of \u001b[1;36m2023\u001b[0m, most state-\n",
"of-the-art systems \u001b[1m(\u001b[0mwhich often have benchmark-specific training or hand-\n",
"engineering\u001b[1m)\u001b[0m. On the MMLU benchmark, an English-language suite of multiple-choice\n",
"questions covering \u001b[1;36m57\u001b[0m subjects, GPT-\u001b[1;36m4\u001b[0m not only outperforms existing models by a\n",
"considerable margin in English, but also demonstrates strong performance in other\n",
"languages.\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo models can understand and generate natural language or code and\n",
"have been optimized for chat using the Chat Completions API but work well for non-\n",
"chat tasks as well.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
",\u001b[1;36m385\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"MODEL\n",
"DE S CRIPTION\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m\n",
"New Updated GPT \u001b[1;36m3.5\u001b[0m Turbo\n",
"The latest GPT-\u001b[1;36m3.5\u001b[0m Turbo\n",
"model with higher accuracy at\n",
"responding in requested\n",
"formats and a fix for a bug\n",
"which caused a text encoding\n",
"issue for non-English\n",
"language function calls.\n",
"Returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"output tokens. Learn more.\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
"Currently points to gpt-\u001b[1;36m3.5\u001b[0m-\n",
",\u001b[1;36m096\u001b[0m\n",
"Up to Sep\n",
"turbo-\u001b[1;36m0613\u001b[0m. The gpt-\u001b[1;36m3.5\u001b[0m-\n",
"tokens\n",
"\u001b[1;36m21\u001b[0m\n",
"turbo model alias will be\n",
"automatically upgraded from\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m to\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m on\n",
"February 16th.\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo model with\n",
"improved instruction\n",
",\u001b[1;36m385\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"following, JSON mode,\n",
"reproducible outputs, parallel\n",
"function calling, and more.\n",
"Returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m\n",
"output tokens. Learn more.\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct Similar capabilities as GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"era models. Compatible with\n",
"legacy Completions endpoint\n",
"and not Chat Completions.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k\n",
"Legacy Currently points to\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>.\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"Legacy Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
"turbo from June 13th <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>,\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">24</span>.\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>\n",
"Legacy Snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span>\n",
"Up to Sep\n",
"k-turbo from June 13th\n",
"tokens\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>. Will be deprecated on\n",
"June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
"DALL·E\n",
"DALL·E is a AI system that can create realistic images and art from a description in\n",
"natural language. DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> currently supports the ability, given a prompt, to create a\n",
"new image with a specific size. DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> also support the ability to edit an existing\n",
"image, or create variations of a user provided image.\n",
"DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> is available through our Images API along with DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. You can try DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"through ChatGPT Plus.\n",
"MODEL\n",
"DE S CRIPTION\n",
"dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"New DALL·E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>\n",
"The latest DALL·E model released in Nov <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Learn more.\n",
"dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> The previous DALL·E model released in Nov <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2022</span>. The 2nd iteration of\n",
"DALL·E with more realistic, accurate, and 4x greater resolution images\n",
"than the original model.\n",
"TTS\n",
"TTS is an AI model that converts text to natural sounding spoken text. We offer two\n",
"different model variates, tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is optimized for real time text to speech use cases\n",
"and tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd is optimized for quality. These models can be used with the Speech\n",
"endpoint in the Audio API.\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct Similar capabilities as GPT-\u001b[1;36m3\u001b[0m\n",
"era models. Compatible with\n",
"legacy Completions endpoint\n",
"and not Chat Completions.\n",
"CONTEXT\n",
"WIND OW\n",
"TRAINING\n",
"DATA\n",
",\u001b[1;36m096\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k\n",
"Legacy Currently points to\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m.\n",
",\u001b[1;36m385\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m\n",
"Legacy Snapshot of gpt-\u001b[1;36m3.5\u001b[0m-\n",
"turbo from June 13th \u001b[1;36m2023\u001b[0m.\n",
"Will be deprecated on June \u001b[1;36m13\u001b[0m,\n",
"\u001b[1;36m24\u001b[0m.\n",
",\u001b[1;36m096\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m\n",
"Legacy Snapshot of gpt-\u001b[1;36m3.5\u001b[0m-\n",
",\u001b[1;36m385\u001b[0m\n",
"Up to Sep\n",
"k-turbo from June 13th\n",
"tokens\n",
"\u001b[1;36m21\u001b[0m\n",
"\u001b[1;36m23\u001b[0m. Will be deprecated on\n",
"June \u001b[1;36m13\u001b[0m, \u001b[1;36m2024\u001b[0m.\n",
"DALL·E\n",
"DALL·E is a AI system that can create realistic images and art from a description in\n",
"natural language. DALL·E \u001b[1;36m3\u001b[0m currently supports the ability, given a prompt, to create a\n",
"new image with a specific size. DALL·E \u001b[1;36m2\u001b[0m also support the ability to edit an existing\n",
"image, or create variations of a user provided image.\n",
"DALL·E \u001b[1;36m3\u001b[0m is available through our Images API along with DALL·E \u001b[1;36m2\u001b[0m. You can try DALL·E \u001b[1;36m3\u001b[0m\n",
"through ChatGPT Plus.\n",
"MODEL\n",
"DE S CRIPTION\n",
"dall-e-\u001b[1;36m3\u001b[0m\n",
"New DALL·E \u001b[1;36m3\u001b[0m\n",
"The latest DALL·E model released in Nov \u001b[1;36m2023\u001b[0m. Learn more.\n",
"dall-e-\u001b[1;36m2\u001b[0m The previous DALL·E model released in Nov \u001b[1;36m2022\u001b[0m. The 2nd iteration of\n",
"DALL·E with more realistic, accurate, and 4x greater resolution images\n",
"than the original model.\n",
"TTS\n",
"TTS is an AI model that converts text to natural sounding spoken text. We offer two\n",
"different model variates, tts-\u001b[1;36m1\u001b[0m is optimized for real time text to speech use cases\n",
"and tts-\u001b[1;36m1\u001b[0m-hd is optimized for quality. These models can be used with the Speech\n",
"endpoint in the Audio API.\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"New Text-to-speech <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"The latest text to speech model, optimized for speed.\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd\n",
"New Text-to-speech <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> HD\n",
"The latest text to speech model, optimized for quality.\n",
"Whisper\n",
"Whisper is a general-purpose speech recognition model. It is trained on a large dataset\n",
"of diverse audio and is also a multi-task model that can perform multilingual speech\n",
"recognition as well as speech translation and language identification. The Whisper v2-\n",
"large model is currently available through our API with the whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> model name.\n",
"Currently, there is no difference between the open source version of Whisper and the\n",
"version available through our API. However, through our API, we offer an optimized\n",
"inference process which makes running Whisper through our API much faster than\n",
"doing it through other means. For more technical details on Whisper, you can read the\n",
"paper.\n",
"Embeddings\n",
"Embeddings are a numerical representation of text that can be used to measure the\n",
"relatedness between two pieces of text. Embeddings are useful for search, clustering,\n",
"recommendations, anomaly detection, and classification tasks. You can read more\n",
"about our latest embedding models in the announcement blog post.\n",
"MODEL\n",
"DE S CRIPTION\n",
"text-embedding-\n",
"-large\n",
"New Embedding V3 large\n",
"Most capable embedding model for both\n",
"english and non-english tasks\n",
"text-embedding-\n",
"New Embedding V3 small\n",
"-small\n",
"Increased performance over 2nd generation ada\n",
"embedding model\n",
"text-embedding-\n",
"ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"Most capable 2nd generation embedding\n",
"model, replacing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span> first generation models\n",
"OUTP UT\n",
"DIMENSION\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">072</span>\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>\n",
"Moderation\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"MODEL\n",
"DE S CRIPTION\n",
"tts-\u001b[1;36m1\u001b[0m\n",
"New Text-to-speech \u001b[1;36m1\u001b[0m\n",
"The latest text to speech model, optimized for speed.\n",
"tts-\u001b[1;36m1\u001b[0m-hd\n",
"New Text-to-speech \u001b[1;36m1\u001b[0m HD\n",
"The latest text to speech model, optimized for quality.\n",
"Whisper\n",
"Whisper is a general-purpose speech recognition model. It is trained on a large dataset\n",
"of diverse audio and is also a multi-task model that can perform multilingual speech\n",
"recognition as well as speech translation and language identification. The Whisper v2-\n",
"large model is currently available through our API with the whisper-\u001b[1;36m1\u001b[0m model name.\n",
"Currently, there is no difference between the open source version of Whisper and the\n",
"version available through our API. However, through our API, we offer an optimized\n",
"inference process which makes running Whisper through our API much faster than\n",
"doing it through other means. For more technical details on Whisper, you can read the\n",
"paper.\n",
"Embeddings\n",
"Embeddings are a numerical representation of text that can be used to measure the\n",
"relatedness between two pieces of text. Embeddings are useful for search, clustering,\n",
"recommendations, anomaly detection, and classification tasks. You can read more\n",
"about our latest embedding models in the announcement blog post.\n",
"MODEL\n",
"DE S CRIPTION\n",
"text-embedding-\n",
"-large\n",
"New Embedding V3 large\n",
"Most capable embedding model for both\n",
"english and non-english tasks\n",
"text-embedding-\n",
"New Embedding V3 small\n",
"-small\n",
"Increased performance over 2nd generation ada\n",
"embedding model\n",
"text-embedding-\n",
"ada-\u001b[1;36m002\u001b[0m\n",
"Most capable 2nd generation embedding\n",
"model, replacing \u001b[1;36m16\u001b[0m first generation models\n",
"OUTP UT\n",
"DIMENSION\n",
",\u001b[1;36m072\u001b[0m\n",
",\u001b[1;36m536\u001b[0m\n",
",\u001b[1;36m536\u001b[0m\n",
"Moderation\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"The Moderation models are designed to check whether content complies with\n",
"OpenAI's usage policies. The models provide classification capabilities that look for\n",
"content in the following categories: hate, hate/threatening, self-harm, sexual,\n",
"sexual/minors, violence, and violence/graphic. You can find out more in our moderation\n",
"guide.\n",
"Moderation models take in an arbitrary sized input that is automatically broken up into\n",
"chunks of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens. In cases where the input is more than <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens,\n",
"truncation is used which in a rare condition may omit a small number of tokens from\n",
"the moderation check.\n",
"The final results from each request to the moderation endpoint shows the maximum\n",
"value on a per category basis. For example, if one chunk of 4K tokens had a category\n",
"score of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> and the other had a score of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1901</span>, the results would show <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> in the\n",
"API response since it is higher.\n",
"MODEL\n",
"DE S CRIPTION\n",
"MAX\n",
"TOKENS\n",
"text-moderation-latest Currently points to text-moderation-\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>.\n",
"text-moderation-stable Currently points to text-moderation-\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">7</span>.\n",
"text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>\n",
"Most capable moderation model across\n",
"all categories.\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span>\n",
"GPT base\n",
"GPT base models can understand and generate natural language or code but are not\n",
"trained with instruction following. These models are made to be replacements for our\n",
"original GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> base models and use the legacy Completions API. Most customers\n",
"should use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> or GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>.\n",
"MODEL\n",
"DE S CRIPTION\n",
"babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span> Replacement for the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> ada and\n",
"babbage base models.\n",
"davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span> Replacement for the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> curie and\n",
"davinci base models.\n",
"MAX\n",
"TOKENS\n",
"TRAINING\n",
"DATA\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span>\n",
"tokens\n",
",<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span>\n",
"tokens\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"Up to Sep\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">21</span>\n",
"How we use your data\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"The Moderation models are designed to check whether content complies with\n",
"OpenAI's usage policies. The models provide classification capabilities that look for\n",
"content in the following categories: hate, hate/threatening, self-harm, sexual,\n",
"sexual/minors, violence, and violence/graphic. You can find out more in our moderation\n",
"guide.\n",
"Moderation models take in an arbitrary sized input that is automatically broken up into\n",
"chunks of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens. In cases where the input is more than \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens,\n",
"truncation is used which in a rare condition may omit a small number of tokens from\n",
"the moderation check.\n",
"The final results from each request to the moderation endpoint shows the maximum\n",
"value on a per category basis. For example, if one chunk of 4K tokens had a category\n",
"score of \u001b[1;36m0.9901\u001b[0m and the other had a score of \u001b[1;36m0.1901\u001b[0m, the results would show \u001b[1;36m0.9901\u001b[0m in the\n",
"API response since it is higher.\n",
"MODEL\n",
"DE S CRIPTION\n",
"MAX\n",
"TOKENS\n",
"text-moderation-latest Currently points to text-moderation-\n",
",\u001b[1;36m768\u001b[0m\n",
"\u001b[1;36m7\u001b[0m.\n",
"text-moderation-stable Currently points to text-moderation-\n",
",\u001b[1;36m768\u001b[0m\n",
"\u001b[1;36m7\u001b[0m.\n",
"text-moderation-\u001b[1;36m007\u001b[0m\n",
"Most capable moderation model across\n",
"all categories.\n",
",\u001b[1;36m768\u001b[0m\n",
"GPT base\n",
"GPT base models can understand and generate natural language or code but are not\n",
"trained with instruction following. These models are made to be replacements for our\n",
"original GPT-\u001b[1;36m3\u001b[0m base models and use the legacy Completions API. Most customers\n",
"should use GPT-\u001b[1;36m3.5\u001b[0m or GPT-\u001b[1;36m4\u001b[0m.\n",
"MODEL\n",
"DE S CRIPTION\n",
"babbage-\u001b[1;36m002\u001b[0m Replacement for the GPT-\u001b[1;36m3\u001b[0m ada and\n",
"babbage base models.\n",
"davinci-\u001b[1;36m002\u001b[0m Replacement for the GPT-\u001b[1;36m3\u001b[0m curie and\n",
"davinci base models.\n",
"MAX\n",
"TOKENS\n",
"TRAINING\n",
"DATA\n",
",\u001b[1;36m384\u001b[0m\n",
"tokens\n",
",\u001b[1;36m384\u001b[0m\n",
"tokens\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"Up to Sep\n",
"\u001b[1;36m21\u001b[0m\n",
"How we use your data\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"Your data is your data.\n",
"As of March <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, data sent to the OpenAI API will not be used to train or improve\n",
"OpenAI models <span style=\"font-weight: bold\">(</span>unless you explicitly opt in<span style=\"font-weight: bold\">)</span>. One advantage to opting in is that the\n",
"models may get better at your use case over time.\n",
"To help identify abuse, API data may be retained for up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, after which it will be\n",
"deleted <span style=\"font-weight: bold\">(</span>unless otherwise required by law<span style=\"font-weight: bold\">)</span>. For trusted customers with sensitive\n",
"applications, zero data retention may be available. With zero data retention, request\n",
"and response bodies are not persisted to any logging mechanism and exist only in\n",
"memory in order to serve the request.\n",
"Note that this data policy does not apply to OpenAI's non-API consumer services like\n",
"ChatGPT or DALL·E Labs.\n",
"Default usage policies by endpoint\n",
"ENDP OINT\n",
"DATA USED\n",
"FOR TRAINING\n",
"DEFAULT\n",
"RETENTION\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>*\n",
"No\n",
" days\n",
"Yes, except\n",
"image inputs*\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">files</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">threads</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">messages</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">runs</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/runs/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">steps</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">generations</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">edits</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">variations</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span> No\n",
"Until deleted by\n",
"No\n",
"customer\n",
"Until deleted by\n",
"No\n",
"customer\n",
" days *\n",
" days *\n",
" days *\n",
" days *\n",
" days\n",
" days\n",
" days\n",
" days\n",
"Zero data\n",
"retention\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"Yes\n",
"-\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"Your data is your data.\n",
"As of March \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m, data sent to the OpenAI API will not be used to train or improve\n",
"OpenAI models \u001b[1m(\u001b[0munless you explicitly opt in\u001b[1m)\u001b[0m. One advantage to opting in is that the\n",
"models may get better at your use case over time.\n",
"To help identify abuse, API data may be retained for up to \u001b[1;36m30\u001b[0m days, after which it will be\n",
"deleted \u001b[1m(\u001b[0munless otherwise required by law\u001b[1m)\u001b[0m. For trusted customers with sensitive\n",
"applications, zero data retention may be available. With zero data retention, request\n",
"and response bodies are not persisted to any logging mechanism and exist only in\n",
"memory in order to serve the request.\n",
"Note that this data policy does not apply to OpenAI's non-API consumer services like\n",
"ChatGPT or DALL·E Labs.\n",
"Default usage policies by endpoint\n",
"ENDP OINT\n",
"DATA USED\n",
"FOR TRAINING\n",
"DEFAULT\n",
"RETENTION\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m*\n",
"No\n",
" days\n",
"Yes, except\n",
"image inputs*\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mfiles\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mthreads\u001b[0m\n",
"\u001b[35m/v1/threads/\u001b[0m\u001b[95mmessages\u001b[0m\n",
"\u001b[35m/v1/threads/\u001b[0m\u001b[95mruns\u001b[0m\n",
"\u001b[35m/v1/threads/runs/\u001b[0m\u001b[95msteps\u001b[0m\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95mgenerations\u001b[0m\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95medits\u001b[0m\n",
"\u001b[35m/v1/images/\u001b[0m\u001b[95mvariations\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m No\n",
"Until deleted by\n",
"No\n",
"customer\n",
"Until deleted by\n",
"No\n",
"customer\n",
" days *\n",
" days *\n",
" days *\n",
" days *\n",
" days\n",
" days\n",
" days\n",
" days\n",
"Zero data\n",
"retention\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"No\n",
"Yes\n",
"-\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"Models - OpenAI API\n",
"ENDP OINT\n",
"DATA USED\n",
"FOR TRAINING\n",
"DEFAULT\n",
"RETENTION\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>\n",
"No\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>\n",
"No\n",
"No\n",
"No\n",
"No\n",
"Zero data\n",
"retention\n",
" days\n",
"Until deleted by\n",
"customer\n",
"Zero data\n",
"retention\n",
"-\n",
"No\n",
"No\n",
"-\n",
" days\n",
"Yes\n",
"* Image inputs via the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview model are not eligible for zero\n",
"retention.\n",
"* For the Assistants API, we are still evaluating the default retention period during the\n",
"Beta. We expect that the default retention period will be stable after the end of the\n",
"Beta.\n",
"For details, see our API data usage policies. To learn more about zero retention, get in\n",
"touch with our sales team.\n",
"Model endpoint compatibility\n",
"ENDP OINT\n",
"L ATE ST MODEL S\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>\n",
"All models except gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0301</span>\n",
"supported. The retrieval tool requires gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"turbo-preview <span style=\"font-weight: bold\">(</span>and subsequent dated model\n",
"releases<span style=\"font-weight: bold\">)</span> or gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span> <span style=\"font-weight: bold\">(</span>and\n",
"subsequent versions<span style=\"font-weight: bold\">)</span>.\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span> whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>\n",
"whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>\n",
"tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and dated model releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-\n",
"preview and dated model releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-\n",
"vision-preview, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k and dated model\n",
"releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo and dated model\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"Models - OpenAI API\n",
"ENDP OINT\n",
"DATA USED\n",
"FOR TRAINING\n",
"DEFAULT\n",
"RETENTION\n",
"ELIGIBLE FOR\n",
"ZERO RETENTION\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m\n",
"No\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m\n",
"\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m\n",
"No\n",
"No\n",
"No\n",
"No\n",
"Zero data\n",
"retention\n",
" days\n",
"Until deleted by\n",
"customer\n",
"Zero data\n",
"retention\n",
"-\n",
"No\n",
"No\n",
"-\n",
" days\n",
"Yes\n",
"* Image inputs via the gpt-\u001b[1;36m4\u001b[0m-vision-preview model are not eligible for zero\n",
"retention.\n",
"* For the Assistants API, we are still evaluating the default retention period during the\n",
"Beta. We expect that the default retention period will be stable after the end of the\n",
"Beta.\n",
"For details, see our API data usage policies. To learn more about zero retention, get in\n",
"touch with our sales team.\n",
"Model endpoint compatibility\n",
"ENDP OINT\n",
"L ATE ST MODEL S\n",
"\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m\n",
"All models except gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0301\u001b[0m\n",
"supported. The retrieval tool requires gpt-\u001b[1;36m4\u001b[0m-\n",
"turbo-preview \u001b[1m(\u001b[0mand subsequent dated model\n",
"releases\u001b[1m)\u001b[0m or gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m \u001b[1m(\u001b[0mand\n",
"subsequent versions\u001b[1m)\u001b[0m.\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m whisper-\u001b[1;36m1\u001b[0m\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m\n",
"whisper-\u001b[1;36m1\u001b[0m\n",
"\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m\n",
"tts-\u001b[1;36m1\u001b[0m, tts-\u001b[1;36m1\u001b[0m-hd\n",
"\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m\n",
"gpt-\u001b[1;36m4\u001b[0m and dated model releases, gpt-\u001b[1;36m4\u001b[0m-turbo-\n",
"preview and dated model releases, gpt-\u001b[1;36m4\u001b[0m-\n",
"vision-preview, gpt-\u001b[1;36m4\u001b[0m-32k and dated model\n",
"releases, gpt-\u001b[1;36m3.5\u001b[0m-turbo and dated model\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">26</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">02</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>, <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">17:58</span>\n",
"ENDP OINT\n",
"Models - OpenAI API\n",
"L ATE ST MODEL S\n",
"releases, gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k and dated model\n",
"releases, fine-tuned versions of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span> <span style=\"font-weight: bold\">(</span>Legacy<span style=\"font-weight: bold\">)</span> gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct, babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>,\n",
"davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>\n",
"text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small, text-embedding-\n",
"-large, text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo, babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>, davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>\n",
"text-moderation-stable, text-\n",
"<span style=\"color: #0000ff; text-decoration-color: #0000ff; text-decoration: underline\">https://platform.openai.com/docs/models/overview</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">10</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;36m26\u001b[0m/\u001b[1;36m02\u001b[0m/\u001b[1;36m2024\u001b[0m, \u001b[1;92m17:58\u001b[0m\n",
"ENDP OINT\n",
"Models - OpenAI API\n",
"L ATE ST MODEL S\n",
"releases, gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k and dated model\n",
"releases, fine-tuned versions of gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m \u001b[1m(\u001b[0mLegacy\u001b[1m)\u001b[0m gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct, babbage-\u001b[1;36m002\u001b[0m,\n",
"davinci-\u001b[1;36m002\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m\n",
"text-embedding-\u001b[1;36m3\u001b[0m-small, text-embedding-\n",
"-large, text-embedding-ada-\u001b[1;36m002\u001b[0m\n",
"\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo, babbage-\u001b[1;36m002\u001b[0m, davinci-\u001b[1;36m002\u001b[0m\n",
"\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m\n",
"text-moderation-stable, text-\n",
"\u001b[4;94mhttps://platform.openai.com/docs/models/overview\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m10\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo**\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is a sophisticated multimodal model capable of processing both text and image inputs to produce text outputs.\n",
"It is designed to tackle complex problems with higher accuracy than previous models, leveraging its extensive \n",
"general knowledge and advanced reasoning skills. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> is accessible through the OpenAI API for paying customers \n",
"and is optimized for chat applications, although it can also handle traditional completion tasks using the Chat \n",
"Completions API.\n",
"**Model Versions:**\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview**\n",
" - **Description:** This is the latest GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo model, designed to minimize instances where the model fails to\n",
"complete a task, known as <span style=\"color: #008000; text-decoration-color: #008000\">\"laziness.\"</span> It can return up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to December <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview**\n",
" - **Description:** This version currently points to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>-preview model.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to December <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-preview**\n",
" - **Description:** This version of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo includes enhancements such as improved instruction following, \n",
"JSON mode, reproducible outputs, and parallel function calling. It also supports up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data:** Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
"These models are part of OpenAI's ongoing efforts to provide developers with robust tools for various applications,\n",
"ensuring flexibility and improved performance across different use cases.\n",
"</pre>\n"
],
"text/plain": [
"**GPT-\u001b[1;36m4\u001b[0m and GPT-\u001b[1;36m4\u001b[0m Turbo**\n",
"GPT-\u001b[1;36m4\u001b[0m is a sophisticated multimodal model capable of processing both text and image inputs to produce text outputs.\n",
"It is designed to tackle complex problems with higher accuracy than previous models, leveraging its extensive \n",
"general knowledge and advanced reasoning skills. GPT-\u001b[1;36m4\u001b[0m is accessible through the OpenAI API for paying customers \n",
"and is optimized for chat applications, although it can also handle traditional completion tasks using the Chat \n",
"Completions API.\n",
"**Model Versions:**\n",
". **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview**\n",
" - **Description:** This is the latest GPT-\u001b[1;36m4\u001b[0m Turbo model, designed to minimize instances where the model fails to\n",
"complete a task, known as \u001b[32m\"laziness.\"\u001b[0m It can return up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to December \u001b[1;36m2023\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-turbo-preview**\n",
" - **Description:** This version currently points to the gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0125\u001b[0m-preview model.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to December \u001b[1;36m2023\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-preview**\n",
" - **Description:** This version of GPT-\u001b[1;36m4\u001b[0m Turbo includes enhancements such as improved instruction following, \n",
"JSON mode, reproducible outputs, and parallel function calling. It also supports up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data:** Up to April \u001b[1;36m2023\u001b[0m\n",
"These models are part of OpenAI's ongoing efforts to provide developers with robust tools for various applications,\n",
"ensuring flexibility and improved performance across different use cases.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API Overview**\n",
"This document provides an overview of various GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> models, highlighting their capabilities, context windows, and \n",
"training data timelines.\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview**\n",
" - **Description**: This model has the ability to understand images, in addition to all other GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo \n",
"capabilities. It currently points to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview model.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data**: Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>-vision-preview**\n",
" - **Description**: Similar to the gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview, this model can understand images and includes all GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> \n",
"Turbo capabilities. It returns a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens and is a preview model version.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">128</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">000</span> tokens\n",
" - **Training Data**: Up to April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>**\n",
" - **Description**: This model currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> and includes continuous model upgrades.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description**: A snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> from June 13th, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, with improved function calling support.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">192</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k**\n",
" - **Description**: This model points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span> and includes continuous model upgrades. It was not widely\n",
"rolled out in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens\n",
" - **Training Data**: Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description**: A snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k from June 13th, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, with improved function calling support. Like \n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k, it was not widely rolled out in favor of GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> Turbo.\n",
" - **Context Window**: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens\n",
" - **Training Data**: Up to September\n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API Overview**\n",
"This document provides an overview of various GPT-\u001b[1;36m4\u001b[0m models, highlighting their capabilities, context windows, and \n",
"training data timelines.\n",
". **gpt-\u001b[1;36m4\u001b[0m-vision-preview**\n",
" - **Description**: This model has the ability to understand images, in addition to all other GPT-\u001b[1;36m4\u001b[0m Turbo \n",
"capabilities. It currently points to the gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview model.\n",
" - **Context Window**: \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data**: Up to April \u001b[1;36m2023\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m1106\u001b[0m-vision-preview**\n",
" - **Description**: Similar to the gpt-\u001b[1;36m4\u001b[0m-vision-preview, this model can understand images and includes all GPT-\u001b[1;36m4\u001b[0m \n",
"Turbo capabilities. It returns a maximum of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens and is a preview model version.\n",
" - **Context Window**: \u001b[1;36m128\u001b[0m,\u001b[1;36m000\u001b[0m tokens\n",
" - **Training Data**: Up to April \u001b[1;36m2023\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m**\n",
" - **Description**: This model currently points to gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m and includes continuous model upgrades.\n",
" - **Context Window**: \u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-\u001b[1;36m0613\u001b[0m**\n",
" - **Description**: A snapshot of gpt-\u001b[1;36m4\u001b[0m from June 13th, \u001b[1;36m2023\u001b[0m, with improved function calling support.\n",
" - **Context Window**: \u001b[1;36m8\u001b[0m,\u001b[1;36m192\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-32k**\n",
" - **Description**: This model points to gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m and includes continuous model upgrades. It was not widely\n",
"rolled out in favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
" - **Context Window**: \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens\n",
" - **Training Data**: Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m4\u001b[0m-32k-\u001b[1;36m0613\u001b[0m**\n",
" - **Description**: A snapshot of gpt-\u001b[1;36m4\u001b[0m-32k from June 13th, \u001b[1;36m2023\u001b[0m, with improved function calling support. Like \n",
"gpt-\u001b[1;36m4\u001b[0m-32k, it was not widely rolled out in favor of GPT-\u001b[1;36m4\u001b[0m Turbo.\n",
" - **Context Window**: \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens\n",
" - **Training Data**: Up to September\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Multilingual Capabilities and GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo**\n",
"**Multilingual Capabilities**\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> surpasses previous large language models and, as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, most state-of-the-art systems. It excels in the \n",
"MMLU benchmark, which involves English-language multiple-choice questions across <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">57</span> subjects. GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> not only \n",
"outperforms existing models in English but also shows strong performance in other languages.\n",
"**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo**\n",
"GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo models are designed to understand and generate natural language or code. They are optimized for chat \n",
"using the Chat Completions API but are also effective for non-chat tasks.\n",
"**Model Descriptions:**\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span>**\n",
" - **Description:** Updated GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Turbo with improved accuracy and a fix for a text encoding bug in non-English\n",
"language function calls. It returns up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo**\n",
" - **Description:** Currently points to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>. The alias will automatically upgrade to \n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0125</span> on February 16th.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>**\n",
" - **Description:** Features improved instruction following, JSON mode, reproducible outputs, and parallel \n",
"function calling. It returns up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> output tokens.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"</pre>\n"
],
"text/plain": [
"**Multilingual Capabilities and GPT-\u001b[1;36m3.5\u001b[0m Turbo**\n",
"**Multilingual Capabilities**\n",
"GPT-\u001b[1;36m4\u001b[0m surpasses previous large language models and, as of \u001b[1;36m2023\u001b[0m, most state-of-the-art systems. It excels in the \n",
"MMLU benchmark, which involves English-language multiple-choice questions across \u001b[1;36m57\u001b[0m subjects. GPT-\u001b[1;36m4\u001b[0m not only \n",
"outperforms existing models in English but also shows strong performance in other languages.\n",
"**GPT-\u001b[1;36m3.5\u001b[0m Turbo**\n",
"GPT-\u001b[1;36m3.5\u001b[0m Turbo models are designed to understand and generate natural language or code. They are optimized for chat \n",
"using the Chat Completions API but are also effective for non-chat tasks.\n",
"**Model Descriptions:**\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m**\n",
" - **Description:** Updated GPT-\u001b[1;36m3.5\u001b[0m Turbo with improved accuracy and a fix for a text encoding bug in non-English\n",
"language function calls. It returns up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo**\n",
" - **Description:** Currently points to gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m. The alias will automatically upgrade to \n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0125\u001b[0m on February 16th.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m**\n",
" - **Description:** Features improved instruction following, JSON mode, reproducible outputs, and parallel \n",
"function calling. It returns up to \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m output tokens.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API**\n",
"**GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> Models:**\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct**\n",
" - **Description:** Similar capabilities to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> era models. Compatible with legacy Completions endpoint, not \n",
"Chat Completions.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k**\n",
" - **Description:** Legacy model pointing to gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description:** Legacy snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo from June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
". **gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0613</span>**\n",
" - **Description:** Legacy snapshot of gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k from June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>. Will be deprecated on June <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">13</span>,\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>.\n",
" - **Context Window:** <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">385</span> tokens\n",
" - **Training Data:** Up to September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>\n",
"**DALL-E:**\n",
"- DALL-E is an AI system that creates realistic images and art from natural language descriptions. DALL-E <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> \n",
"supports creating new images with specific sizes and editing existing images or creating variations. Available \n",
"through the Images API and ChatGPT Plus.\n",
". **dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>**\n",
" - **Description:** The latest DALL-E model released in November <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
". **dall-e-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>**\n",
" - **Description:** Released in November <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2022</span>, this model offers more realistic, accurate, and higher resolution \n",
"images than the original.\n",
"**TTS <span style=\"font-weight: bold\">(</span>Text-to-Speech<span style=\"font-weight: bold\">)</span>:**\n",
"- TTS converts text to natural-sounding spoken text. Two model variants are offered:\n",
" - **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>:** Optimized for real-time text-to-speech use cases.\n",
" - **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd:** Optimized for quality.\n",
"- These models can be used with the Speech endpoint in\n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API**\n",
"**GPT-\u001b[1;36m3.5\u001b[0m Models:**\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct**\n",
" - **Description:** Similar capabilities to GPT-\u001b[1;36m3\u001b[0m era models. Compatible with legacy Completions endpoint, not \n",
"Chat Completions.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k**\n",
" - **Description:** Legacy model pointing to gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0613\u001b[0m**\n",
" - **Description:** Legacy snapshot of gpt-\u001b[1;36m3.5\u001b[0m-turbo from June \u001b[1;36m13\u001b[0m, \u001b[1;36m2023\u001b[0m. Will be deprecated on June \u001b[1;36m13\u001b[0m, \u001b[1;36m2024\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
". **gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k-\u001b[1;36m0613\u001b[0m**\n",
" - **Description:** Legacy snapshot of gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k from June \u001b[1;36m13\u001b[0m, \u001b[1;36m2023\u001b[0m. Will be deprecated on June \u001b[1;36m13\u001b[0m,\n",
"\u001b[1;36m2024\u001b[0m.\n",
" - **Context Window:** \u001b[1;36m16\u001b[0m,\u001b[1;36m385\u001b[0m tokens\n",
" - **Training Data:** Up to September \u001b[1;36m2021\u001b[0m\n",
"**DALL-E:**\n",
"- DALL-E is an AI system that creates realistic images and art from natural language descriptions. DALL-E \u001b[1;36m3\u001b[0m \n",
"supports creating new images with specific sizes and editing existing images or creating variations. Available \n",
"through the Images API and ChatGPT Plus.\n",
". **dall-e-\u001b[1;36m3\u001b[0m**\n",
" - **Description:** The latest DALL-E model released in November \u001b[1;36m2023\u001b[0m.\n",
". **dall-e-\u001b[1;36m2\u001b[0m**\n",
" - **Description:** Released in November \u001b[1;36m2022\u001b[0m, this model offers more realistic, accurate, and higher resolution \n",
"images than the original.\n",
"**TTS \u001b[1m(\u001b[0mText-to-Speech\u001b[1m)\u001b[0m:**\n",
"- TTS converts text to natural-sounding spoken text. Two model variants are offered:\n",
" - **tts-\u001b[1;36m1\u001b[0m:** Optimized for real-time text-to-speech use cases.\n",
" - **tts-\u001b[1;36m1\u001b[0m-hd:** Optimized for quality.\n",
"- These models can be used with the Speech endpoint in\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Models - OpenAI API**\n",
"**Text-to-Speech Models:**\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>. **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>**: This is a new text-to-speech model optimized for speed, providing efficient conversion of text into \n",
"spoken words.\n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>. **tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd**: This model is optimized for quality, offering high-definition text-to-speech conversion.\n",
"**Whisper:**\n",
"Whisper is a versatile speech recognition model capable of handling diverse audio inputs. It supports multilingual \n",
"speech recognition, speech translation, and language identification. The Whisper v2-large model is accessible via \n",
"the API under the name <span style=\"color: #008000; text-decoration-color: #008000\">\"whisper-1.\"</span> While the open-source version and the API version are similar, the API offers \n",
"an optimized inference process for faster performance. More technical details can be found in the associated paper.\n",
"**Embeddings:**\n",
"Embeddings are numerical representations of text, useful for measuring the relatedness between text pieces. They \n",
"are applied in search, clustering, recommendations, anomaly detection, and classification tasks.\n",
"- **text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large**: The most capable embedding model for both English and non-English tasks, with an \n",
"output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">072</span>.\n",
"- **text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small**: Offers improved performance over the second-generation ada embedding model, with an \n",
"output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>.\n",
"- **text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: A second-generation embedding model replacing <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span> first-generation models, also with \n",
"an output dimension of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">536</span>.\n",
"**Moderation:**\n",
"The document mentions a section on moderation, likely related to content moderation capabilities, though specific \n",
"details are not provided in the visible content.\n",
"</pre>\n"
],
"text/plain": [
"**Models - OpenAI API**\n",
"**Text-to-Speech Models:**\n",
"\u001b[1;36m1\u001b[0m. **tts-\u001b[1;36m1\u001b[0m**: This is a new text-to-speech model optimized for speed, providing efficient conversion of text into \n",
"spoken words.\n",
"\u001b[1;36m2\u001b[0m. **tts-\u001b[1;36m1\u001b[0m-hd**: This model is optimized for quality, offering high-definition text-to-speech conversion.\n",
"**Whisper:**\n",
"Whisper is a versatile speech recognition model capable of handling diverse audio inputs. It supports multilingual \n",
"speech recognition, speech translation, and language identification. The Whisper v2-large model is accessible via \n",
"the API under the name \u001b[32m\"whisper-1.\"\u001b[0m While the open-source version and the API version are similar, the API offers \n",
"an optimized inference process for faster performance. More technical details can be found in the associated paper.\n",
"**Embeddings:**\n",
"Embeddings are numerical representations of text, useful for measuring the relatedness between text pieces. They \n",
"are applied in search, clustering, recommendations, anomaly detection, and classification tasks.\n",
"- **text-embedding-\u001b[1;36m3\u001b[0m-large**: The most capable embedding model for both English and non-English tasks, with an \n",
"output dimension of \u001b[1;36m3\u001b[0m,\u001b[1;36m072\u001b[0m.\n",
"- **text-embedding-\u001b[1;36m3\u001b[0m-small**: Offers improved performance over the second-generation ada embedding model, with an \n",
"output dimension of \u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m.\n",
"- **text-embedding-ada-\u001b[1;36m002\u001b[0m**: A second-generation embedding model replacing \u001b[1;36m16\u001b[0m first-generation models, also with \n",
"an output dimension of \u001b[1;36m1\u001b[0m,\u001b[1;36m536\u001b[0m.\n",
"**Moderation:**\n",
"The document mentions a section on moderation, likely related to content moderation capabilities, though specific \n",
"details are not provided in the visible content.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Moderation Models and GPT Base**\n",
"**Moderation Models**\n",
"The moderation models are designed to ensure content compliance with OpenAI's usage policies. They classify content\n",
"into categories such as hate, hate/threatening, self-harm, sexual, sexual/minors, violence, and violence/graphic. \n",
"These models process inputs by breaking them into chunks of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">096</span> tokens. If the input exceeds <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens, some \n",
"tokens may be truncated, potentially omitting a few from the moderation check.\n",
"The moderation endpoint provides the maximum score per category from each request. For instance, if one chunk \n",
"scores <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span> and another scores <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1901</span> in a category, the API response will show <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.9901</span>.\n",
"- **text-moderation-latest**: Points to text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span> with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"- **text-moderation-stable**: Also points to text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span> with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"- **text-moderation-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">007</span>**: The most capable model across all categories with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">32</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">768</span> tokens.\n",
"**GPT Base**\n",
"GPT base models are capable of understanding and generating natural language or code but are not trained for \n",
"instruction following. They serve as replacements for the original GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> base models and utilize the legacy \n",
"Completions API. Most users are advised to use GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> or GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>.\n",
"- **babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: Replaces the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> ada and babbage models, with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span> tokens and training data up to \n",
"September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>.\n",
"- **davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>**: Replaces the GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> curie and davinci models, with a max of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">16</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">384</span> tokens and training data up to\n",
"September <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2021</span>.\n",
"</pre>\n"
],
"text/plain": [
"**Moderation Models and GPT Base**\n",
"**Moderation Models**\n",
"The moderation models are designed to ensure content compliance with OpenAI's usage policies. They classify content\n",
"into categories such as hate, hate/threatening, self-harm, sexual, sexual/minors, violence, and violence/graphic. \n",
"These models process inputs by breaking them into chunks of \u001b[1;36m4\u001b[0m,\u001b[1;36m096\u001b[0m tokens. If the input exceeds \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens, some \n",
"tokens may be truncated, potentially omitting a few from the moderation check.\n",
"The moderation endpoint provides the maximum score per category from each request. For instance, if one chunk \n",
"scores \u001b[1;36m0.9901\u001b[0m and another scores \u001b[1;36m0.1901\u001b[0m in a category, the API response will show \u001b[1;36m0.9901\u001b[0m.\n",
"- **text-moderation-latest**: Points to text-moderation-\u001b[1;36m007\u001b[0m with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"- **text-moderation-stable**: Also points to text-moderation-\u001b[1;36m007\u001b[0m with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"- **text-moderation-\u001b[1;36m007\u001b[0m**: The most capable model across all categories with a max of \u001b[1;36m32\u001b[0m,\u001b[1;36m768\u001b[0m tokens.\n",
"**GPT Base**\n",
"GPT base models are capable of understanding and generating natural language or code but are not trained for \n",
"instruction following. They serve as replacements for the original GPT-\u001b[1;36m3\u001b[0m base models and utilize the legacy \n",
"Completions API. Most users are advised to use GPT-\u001b[1;36m3.5\u001b[0m or GPT-\u001b[1;36m4\u001b[0m.\n",
"- **babbage-\u001b[1;36m002\u001b[0m**: Replaces the GPT-\u001b[1;36m3\u001b[0m ada and babbage models, with a max of \u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m tokens and training data up to \n",
"September \u001b[1;36m2021\u001b[0m.\n",
"- **davinci-\u001b[1;36m002\u001b[0m**: Replaces the GPT-\u001b[1;36m3\u001b[0m curie and davinci models, with a max of \u001b[1;36m16\u001b[0m,\u001b[1;36m384\u001b[0m tokens and training data up to\n",
"September \u001b[1;36m2021\u001b[0m.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Your Data is Your Data\n",
"As of March <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>, data sent to the OpenAI API is not used to train or improve OpenAI models unless you \n",
"explicitly opt in. Opting in can help models improve for your specific use case over time.\n",
"To prevent abuse, API data may be retained for up to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days before deletion, unless legally required otherwise. \n",
"Trusted customers with sensitive applications may have zero data retention, meaning request and response bodies are\n",
"not logged and exist only in memory to serve the request.\n",
"This data policy does not apply to OpenAI's non-API consumer services like ChatGPT or DALL-E Labs.\n",
"**Default Usage Policies by Endpoint**\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: Data is not used for training. Default retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, and it is eligible for \n",
"zero retention except for image inputs.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">files</span>**: Data is not used for training. Retention is until deleted by the customer, with no zero retention \n",
"option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>**: Data is not used for training. Retention is until deleted by the customer, with no zero \n",
"retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">threads</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">messages</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">runs</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/threads/runs/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">steps</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">60</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">generations</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">edits</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/images/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">variations</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, with no zero retention option.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>**: Data is not used for training. Retention is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days, and it is eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span>**: Data is not used for training\n",
"</pre>\n"
],
"text/plain": [
"Your Data is Your Data\n",
"As of March \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m, data sent to the OpenAI API is not used to train or improve OpenAI models unless you \n",
"explicitly opt in. Opting in can help models improve for your specific use case over time.\n",
"To prevent abuse, API data may be retained for up to \u001b[1;36m30\u001b[0m days before deletion, unless legally required otherwise. \n",
"Trusted customers with sensitive applications may have zero data retention, meaning request and response bodies are\n",
"not logged and exist only in memory to serve the request.\n",
"This data policy does not apply to OpenAI's non-API consumer services like ChatGPT or DALL-E Labs.\n",
"**Default Usage Policies by Endpoint**\n",
"- **\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m**: Data is not used for training. Default retention is \u001b[1;36m30\u001b[0m days, and it is eligible for \n",
"zero retention except for image inputs.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mfiles\u001b[0m**: Data is not used for training. Retention is until deleted by the customer, with no zero retention \n",
"option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m**: Data is not used for training. Retention is until deleted by the customer, with no zero \n",
"retention option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mthreads\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/\u001b[0m\u001b[95mmessages\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/\u001b[0m\u001b[95mruns\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/threads/runs/\u001b[0m\u001b[95msteps\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m60\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95mgenerations\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95medits\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/images/\u001b[0m\u001b[95mvariations\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, with no zero retention option.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m**: Data is not used for training. Retention is \u001b[1;36m30\u001b[0m days, and it is eligible for zero retention.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m**: Data is not used for training\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">### Model Endpoint Compatibility and Data Retention\n",
"#### Data Retention Details\n",
"The table outlines the data retention policies for various API endpoints:\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>**: No data is used for training, and there is zero data retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>**: No data is used for training, with a default retention period of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days. It is not \n",
"eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>**: No data is used for training, and data is retained until deleted by the customer. It is\n",
"not eligible for zero retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>**: No data is used for training, and there is zero data retention.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: No data is used for training, with a default retention period of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">30</span> days. It is eligible for\n",
"zero retention.\n",
"Additional notes:\n",
"- Image inputs via the `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview` model are not eligible for zero retention.\n",
"- The default retention period for the Assistants API is still being evaluated during the Beta phase.\n",
"#### Model Endpoint Compatibility\n",
"The table provides information on the compatibility of endpoints with the latest models:\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">assistants</span>**: Supports all models except `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0301</span>`. The `retrieval` tool requires \n",
"`gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview` or `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1106</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">transcriptions</span>**: Compatible with `whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">translations</span>**: Compatible with `whisper-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/audio/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">speech</span>**: Compatible with `tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>` and `tts-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-hd`.\n",
"- **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/chat/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span>**: Compatible with `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-turbo-preview`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-vision-preview`, `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>-32k`, \n",
"and `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`.\n",
"For more details, users are encouraged to refer to the API data usage policies or contact the sales team for \n",
"information on zero retention.\n",
"</pre>\n"
],
"text/plain": [
"### Model Endpoint Compatibility and Data Retention\n",
"#### Data Retention Details\n",
"The table outlines the data retention policies for various API endpoints:\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m**: No data is used for training, and there is zero data retention.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m**: No data is used for training, with a default retention period of \u001b[1;36m30\u001b[0m days. It is not \n",
"eligible for zero retention.\n",
"- **\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m**: No data is used for training, and data is retained until deleted by the customer. It is\n",
"not eligible for zero retention.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m**: No data is used for training, and there is zero data retention.\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m**: No data is used for training, with a default retention period of \u001b[1;36m30\u001b[0m days. It is eligible for\n",
"zero retention.\n",
"Additional notes:\n",
"- Image inputs via the `gpt-\u001b[1;36m4\u001b[0m-vision-preview` model are not eligible for zero retention.\n",
"- The default retention period for the Assistants API is still being evaluated during the Beta phase.\n",
"#### Model Endpoint Compatibility\n",
"The table provides information on the compatibility of endpoints with the latest models:\n",
"- **\u001b[35m/v1/\u001b[0m\u001b[95massistants\u001b[0m**: Supports all models except `gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m0301\u001b[0m`. The `retrieval` tool requires \n",
"`gpt-\u001b[1;36m4\u001b[0m-turbo-preview` or `gpt-\u001b[1;36m3.5\u001b[0m-turbo-\u001b[1;36m1106\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranscriptions\u001b[0m**: Compatible with `whisper-\u001b[1;36m1\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mtranslations\u001b[0m**: Compatible with `whisper-\u001b[1;36m1\u001b[0m`.\n",
"- **\u001b[35m/v1/audio/\u001b[0m\u001b[95mspeech\u001b[0m**: Compatible with `tts-\u001b[1;36m1\u001b[0m` and `tts-\u001b[1;36m1\u001b[0m-hd`.\n",
"- **\u001b[35m/v1/chat/\u001b[0m\u001b[95mcompletions\u001b[0m**: Compatible with `gpt-\u001b[1;36m4\u001b[0m`, `gpt-\u001b[1;36m4\u001b[0m-turbo-preview`, `gpt-\u001b[1;36m4\u001b[0m-vision-preview`, `gpt-\u001b[1;36m4\u001b[0m-32k`, \n",
"and `gpt-\u001b[1;36m3.5\u001b[0m-turbo`.\n",
"For more details, users are encouraged to refer to the API data usage policies or contact the sales team for \n",
"information on zero retention.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">LATEST MODELS\n",
"This document outlines the latest models available for different endpoints in the OpenAI API:\n",
". **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">completions</span> <span style=\"font-weight: bold\">(</span>Legacy<span style=\"font-weight: bold\">)</span>**:\n",
" - Models: `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-instruct`, `babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`, `davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models are used for generating text completions based on input prompts.\n",
". **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">embeddings</span>**:\n",
" - Models: `text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-small`, `text-embedding-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>-large`, `text-embedding-ada-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models are designed to convert text into numerical vectors, which can be used for various tasks like \n",
"similarity comparison and clustering.\n",
". **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/fine_tuning/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">jobs</span>**:\n",
" - Models: `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`, `babbage-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`, `davinci-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">002</span>`\n",
" - These models support fine-tuning, allowing users to customize the models for specific tasks by training them \n",
"on additional data.\n",
". **<span style=\"color: #800080; text-decoration-color: #800080\">/v1/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">moderations</span>**:\n",
" - Models: `text-moderation-stable`\n",
" - This model is used for content moderation, helping to identify and filter out inappropriate or harmful \n",
"content.\n",
"Additionally, the document mentions the availability of `gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo-16k` and other fine-tuned versions of \n",
"`gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo`, indicating enhancements in model capabilities and performance.\n",
"</pre>\n"
],
"text/plain": [
"LATEST MODELS\n",
"This document outlines the latest models available for different endpoints in the OpenAI API:\n",
". **\u001b[35m/v1/\u001b[0m\u001b[95mcompletions\u001b[0m \u001b[1m(\u001b[0mLegacy\u001b[1m)\u001b[0m**:\n",
" - Models: `gpt-\u001b[1;36m3.5\u001b[0m-turbo-instruct`, `babbage-\u001b[1;36m002\u001b[0m`, `davinci-\u001b[1;36m002\u001b[0m`\n",
" - These models are used for generating text completions based on input prompts.\n",
". **\u001b[35m/v1/\u001b[0m\u001b[95membeddings\u001b[0m**:\n",
" - Models: `text-embedding-\u001b[1;36m3\u001b[0m-small`, `text-embedding-\u001b[1;36m3\u001b[0m-large`, `text-embedding-ada-\u001b[1;36m002\u001b[0m`\n",
" - These models are designed to convert text into numerical vectors, which can be used for various tasks like \n",
"similarity comparison and clustering.\n",
". **\u001b[35m/v1/fine_tuning/\u001b[0m\u001b[95mjobs\u001b[0m**:\n",
" - Models: `gpt-\u001b[1;36m3.5\u001b[0m-turbo`, `babbage-\u001b[1;36m002\u001b[0m`, `davinci-\u001b[1;36m002\u001b[0m`\n",
" - These models support fine-tuning, allowing users to customize the models for specific tasks by training them \n",
"on additional data.\n",
". **\u001b[35m/v1/\u001b[0m\u001b[95mmoderations\u001b[0m**:\n",
" - Models: `text-moderation-stable`\n",
" - This model is used for content moderation, helping to identify and filter out inappropriate or harmful \n",
"content.\n",
"Additionally, the document mentions the availability of `gpt-\u001b[1;36m3.5\u001b[0m-turbo-16k` and other fine-tuned versions of \n",
"`gpt-\u001b[1;36m3.5\u001b[0m-turbo`, indicating enhancements in model capabilities and performance.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"Evaluation is the process of validatingand testing the outputs that your LLMapplications are producing. \n",
"Havingstrong evaluations <span style=\"font-weight: bold\">(</span>“evals”<span style=\"font-weight: bold\">)</span> will mean amore stable, reliable application which isresilient to code and model\n",
"changes.\n",
"Example use cases\n",
"- Quantify a solutions reliability\n",
"- Monitor application performance in\n",
"production\n",
"Test for regressions\n",
"-\n",
"What well cover\n",
"● What are evals\n",
"● Technical patterns\n",
"● Example framework\n",
"● Best practices\n",
"● Resources\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"Evaluation is the process of validatingand testing the outputs that your LLMapplications are producing. \n",
"Havingstrong evaluations \u001b[1m(\u001b[0m“evals”\u001b[1m)\u001b[0m will mean amore stable, reliable application which isresilient to code and model\n",
"changes.\n",
"Example use cases\n",
"- Quantify a solutions reliability\n",
"- Monitor application performance in\n",
"production\n",
"Test for regressions\n",
"-\n",
"What well cover\n",
"● What are evals\n",
"● Technical patterns\n",
"● Example framework\n",
"● Best practices\n",
"● Resources\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What are evals\n",
"Example\n",
"An evaluation contains a question and a correct answer. We call this the ground truth.\n",
"Question\n",
"What is the populationof Canada?\n",
"Thought: I dont know. Ishould use a tool\n",
"Action: Search\n",
"Action Input: What is thepopulation of Canada?\n",
"LLM\n",
"Search\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> peoplein Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"The current population ofCanada is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> as ofTuesday, May <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>….\n",
"Actual result\n",
"\n",
"An evaluation, or <span style=\"color: #008000; text-decoration-color: #008000\">\"eval,\"</span> involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"The process begins with a person asking this question. The language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"The search tool then provides the answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population of Canada is 39,566,248 as of Tuesday, May 23, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2023.\"</span> This result matches the actual result expected, which is that there are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people in Canada as of \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
" an example of an evaluation process, often referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals.\"</span> The purpose of evals is to compare a predicted \n",
"answer to a known correct answer, called the <span style=\"color: #008000; text-decoration-color: #008000\">\"ground truth,\"</span> to determine if they match.\n",
"In this example, the question posed is: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> The ground truth states that the \n",
"population of Canada in <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people. The predicted answer is: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as of 2023.\"</span>\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n",
"</pre>\n"
],
"text/plain": [
"What are evals\n",
"Example\n",
"An evaluation contains a question and a correct answer. We call this the ground truth.\n",
"Question\n",
"What is the populationof Canada?\n",
"Thought: I dont know. Ishould use a tool\n",
"Action: Search\n",
"Action Input: What is thepopulation of Canada?\n",
"LLM\n",
"Search\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m peoplein Canada as of \u001b[1;36m2023\u001b[0m.\n",
"The current population ofCanada is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m as ofTuesday, May \u001b[1;36m23\u001b[0m, \u001b[1;36m2023\u001b[0m….\n",
"Actual result\n",
"\n",
"An evaluation, or \u001b[32m\"eval,\"\u001b[0m involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"The process begins with a person asking this question. The language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"The search tool then provides the answer: \u001b[32m\"The current population of Canada is 39,566,248 as of Tuesday, May 23, \u001b[0m\n",
"\u001b[32m2023.\"\u001b[0m This result matches the actual result expected, which is that there are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people in Canada as of \n",
"\u001b[1;36m2023\u001b[0m.\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
" an example of an evaluation process, often referred to as \u001b[32m\"evals.\"\u001b[0m The purpose of evals is to compare a predicted \n",
"answer to a known correct answer, called the \u001b[32m\"ground truth,\"\u001b[0m to determine if they match.\n",
"In this example, the question posed is: \u001b[32m\"What is the population of Canada?\"\u001b[0m The ground truth states that the \n",
"population of Canada in \u001b[1;36m2023\u001b[0m is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people. The predicted answer is: \u001b[32m\"There are 39,566,248 people in Canada \u001b[0m\n",
"\u001b[32mas of 2023.\"\u001b[0m\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What are evals\n",
"Example\n",
"Our ground truth matches the predicted answer, so the evaluation passes!\n",
"Evaluation\n",
"Question\n",
"Ground Truth\n",
"Predicted Answer\n",
"What is the populationof Canada?\n",
"The population of Canada in2023 is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people.\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> peoplein Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"\n",
"An evaluation, or <span style=\"color: #008000; text-decoration-color: #008000\">\"eval,\"</span> involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"The process begins with a person asking this question. The language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"The search tool then provides the answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population of Canada is 39,566,248 as of Tuesday, May 23, </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">2023.\"</span> This result matches the actual result expected, which is that there are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people in Canada as of \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
" an example of an evaluation process, often referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals.\"</span> The purpose of evals is to compare a predicted \n",
"answer to a known correct answer, called the <span style=\"color: #008000; text-decoration-color: #008000\">\"ground truth,\"</span> to determine if they match.\n",
"In this example, the question posed is: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span> The ground truth states that the \n",
"population of Canada in <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> people. The predicted answer is: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">as of 2023.\"</span>\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n",
"</pre>\n"
],
"text/plain": [
"What are evals\n",
"Example\n",
"Our ground truth matches the predicted answer, so the evaluation passes!\n",
"Evaluation\n",
"Question\n",
"Ground Truth\n",
"Predicted Answer\n",
"What is the populationof Canada?\n",
"The population of Canada in2023 is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people.\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m peoplein Canada as of \u001b[1;36m2023\u001b[0m.\n",
"\n",
"An evaluation, or \u001b[32m\"eval,\"\u001b[0m involves a question and a correct answer, known as the ground truth. In this example, the\n",
"question posed is, \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"The process begins with a person asking this question. The language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m initially does not know the answer \n",
"and decides to use a tool to find it. The LLM takes the action of searching, with the input being the question \n",
"about Canada's population.\n",
"The search tool then provides the answer: \u001b[32m\"The current population of Canada is 39,566,248 as of Tuesday, May 23, \u001b[0m\n",
"\u001b[32m2023.\"\u001b[0m This result matches the actual result expected, which is that there are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people in Canada as of \n",
"\u001b[1;36m2023\u001b[0m.\n",
"This example illustrates how evaluations are used to verify the accuracy of information provided by a language \n",
"model.\n",
" an example of an evaluation process, often referred to as \u001b[32m\"evals.\"\u001b[0m The purpose of evals is to compare a predicted \n",
"answer to a known correct answer, called the \u001b[32m\"ground truth,\"\u001b[0m to determine if they match.\n",
"In this example, the question posed is: \u001b[32m\"What is the population of Canada?\"\u001b[0m The ground truth states that the \n",
"population of Canada in \u001b[1;36m2023\u001b[0m is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m people. The predicted answer is: \u001b[32m\"There are 39,566,248 people in Canada \u001b[0m\n",
"\u001b[32mas of 2023.\"\u001b[0m\n",
"Since the predicted answer matches the ground truth, the evaluation is successful, as indicated by a checkmark. \n",
"This process is crucial for verifying the accuracy of predictions in various applications.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"Component evaluations\n",
"Subjective evaluations\n",
"●\n",
"●\n",
"Comparison metrics likeBLEU, ROUGE\n",
"Gives a score to filter andrank results\n",
"●\n",
"●\n",
"Compares groundtruth to prediction\n",
"Gives Pass/Fail\n",
"●\n",
"●\n",
"Uses a scorecard toevaluate subjectively\n",
"Scorecard may alsohave a Pass/Fail\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"Component evaluations\n",
"Subjective evaluations\n",
"●\n",
"●\n",
"Comparison metrics likeBLEU, ROUGE\n",
"Gives a score to filter andrank results\n",
"●\n",
"●\n",
"Compares groundtruth to prediction\n",
"Gives Pass/Fail\n",
"●\n",
"●\n",
"Uses a scorecard toevaluate subjectively\n",
"Scorecard may alsohave a Pass/Fail\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"ROUGE is a common metric for evaluating machine summarizations of text\n",
"ROUGE\n",
"Metric for evaluatingsummarization tasks\n",
"Original\n",
"OpenAI's mission is to ensure thatartificial general intelligence <span style=\"font-weight: bold\">(</span>AGI<span style=\"font-weight: bold\">)</span>benefits all of humanity. OpenAIwill build \n",
"safe and beneficial AGIdirectly, but will also consider itsmission fulfilled if its work aidsothers to achieve this \n",
"outcome.OpenAI follows several keyprinciples for this purpose. First,broadly distributed benefits - anyinfluence over\n",
"AGI's deploymentwill be used for the benefit of all,and to avoid harmful uses or undueconcentration of power…\n",
"MachineSummary\n",
"OpenAI aims to ensure AGI isfor everyone's use, totallyavoiding harmful stuff or bigpower concentration.Committed \n",
"to researchingAGI's safe side, promotingthese studies in AI folks.OpenAI wants to be top in AIthings and works \n",
"withworldwide research, policygroups to figure AGI's stuff.\n",
"ROUGEScore\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">51162</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"ROUGE is a common metric for evaluating machine summarizations of text\n",
"ROUGE\n",
"Metric for evaluatingsummarization tasks\n",
"Original\n",
"OpenAI's mission is to ensure thatartificial general intelligence \u001b[1m(\u001b[0mAGI\u001b[1m)\u001b[0mbenefits all of humanity. OpenAIwill build \n",
"safe and beneficial AGIdirectly, but will also consider itsmission fulfilled if its work aidsothers to achieve this \n",
"outcome.OpenAI follows several keyprinciples for this purpose. First,broadly distributed benefits - anyinfluence over\n",
"AGI's deploymentwill be used for the benefit of all,and to avoid harmful uses or undueconcentration of power…\n",
"MachineSummary\n",
"OpenAI aims to ensure AGI isfor everyone's use, totallyavoiding harmful stuff or bigpower concentration.Committed \n",
"to researchingAGI's safe side, promotingthese studies in AI folks.OpenAI wants to be top in AIthings and works \n",
"withworldwide research, policygroups to figure AGI's stuff.\n",
"ROUGEScore\n",
".\u001b[1;36m51162\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"BLEU score is another standard metric, this time focusing on machine translation tasks\n",
"BLEU\n",
"Original text\n",
"Reference\n",
"Translation\n",
"PredictedTranslation\n",
"Metric forevaluatingtranslation tasks\n",
"Y gwir oedddoedden nhwddim yn dweudcelwyddau wedi'rcwbl.\n",
"The truth wasthey were nottelling lies afterall.\n",
"The truth wasthey weren'ttelling lies afterall.\n",
"BLEUScore\n",
".<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39938</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"BLEU score is another standard metric, this time focusing on machine translation tasks\n",
"BLEU\n",
"Original text\n",
"Reference\n",
"Translation\n",
"PredictedTranslation\n",
"Metric forevaluatingtranslation tasks\n",
"Y gwir oedddoedden nhwddim yn dweudcelwyddau wedi'rcwbl.\n",
"The truth wasthey were nottelling lies afterall.\n",
"The truth wasthey weren'ttelling lies afterall.\n",
"BLEUScore\n",
".\u001b[1;36m39938\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Metric-based evaluations\n",
"What theyre good for\n",
"What to be aware of\n",
"●\n",
"●\n",
"A good starting point for evaluating a\n",
"● Not tuned to your specific context\n",
"fresh solution\n",
"Useful yardstick for automated testing\n",
"of whether a change has triggered a\n",
"major performance shift\n",
"● Most customers require more\n",
"sophisticated evaluations to go to\n",
"production\n",
"● Cheap and fast\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Metric-based evaluations\n",
"What theyre good for\n",
"What to be aware of\n",
"●\n",
"●\n",
"A good starting point for evaluating a\n",
"● Not tuned to your specific context\n",
"fresh solution\n",
"Useful yardstick for automated testing\n",
"of whether a change has triggered a\n",
"major performance shift\n",
"● Most customers require more\n",
"sophisticated evaluations to go to\n",
"production\n",
"● Cheap and fast\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Component evaluations\n",
"Component evaluations <span style=\"font-weight: bold\">(</span>or “unit tests”<span style=\"font-weight: bold\">)</span> cover a single input/output of the application. They checkwhether each \n",
"component works in isolation, comparing the input to a ground truth ideal result\n",
"Is this thecorrect action?\n",
"Exact matchcomparison\n",
"Does this answeruse the context?\n",
"Extract numbersfrom each andcompare\n",
"What is the populationof Canada?\n",
"Thought: I dont know. Ishould use a tool\n",
"Action: Search\n",
"Action Input: What is thepopulation of Canada?\n",
"Agent\n",
"Search\n",
"There are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> peoplein Canada as of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>.\n",
"The current population ofCanada is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">566</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">248</span> as ofTuesday, May <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">23</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span>….\n",
"Is this the rightsearch result?\n",
"Tag the rightanswer and doan exact matchcomparison withthe retrieval.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Component evaluations\n",
"Component evaluations \u001b[1m(\u001b[0mor “unit tests”\u001b[1m)\u001b[0m cover a single input/output of the application. They checkwhether each \n",
"component works in isolation, comparing the input to a ground truth ideal result\n",
"Is this thecorrect action?\n",
"Exact matchcomparison\n",
"Does this answeruse the context?\n",
"Extract numbersfrom each andcompare\n",
"What is the populationof Canada?\n",
"Thought: I dont know. Ishould use a tool\n",
"Action: Search\n",
"Action Input: What is thepopulation of Canada?\n",
"Agent\n",
"Search\n",
"There are \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m peoplein Canada as of \u001b[1;36m2023\u001b[0m.\n",
"The current population ofCanada is \u001b[1;36m39\u001b[0m,\u001b[1;36m566\u001b[0m,\u001b[1;36m248\u001b[0m as ofTuesday, May \u001b[1;36m23\u001b[0m, \u001b[1;36m2023\u001b[0m….\n",
"Is this the rightsearch result?\n",
"Tag the rightanswer and doan exact matchcomparison withthe retrieval.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical patterns\n",
"Subjective evaluations\n",
"Building up a good scorecard for automated testing benefits from a few rounds of detailed humanreview so we can \n",
"learn what is valuable.\n",
"A policy of “show rather than tell” is also advised for GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, so include examples of what a <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span> and8 out of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> \n",
"look like so the model can appreciate the spread.\n",
"Examplescorecard\n",
"You are a helpful evaluation assistant who grades how well the Assistant has answered the customers query.\n",
"You will assess each submission against these metrics, please think through these step by step:\n",
"-\n",
"relevance: Grade how relevant the search content is to the question from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> <span style=\"color: #800080; text-decoration-color: #800080\">//</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> being highly relevant and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> \n",
"beingnot relevant at all.\n",
"- credibility: Grade how credible the sources provided are from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> <span style=\"color: #800080; text-decoration-color: #800080\">//</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> being an established newspaper,\n",
"-\n",
"government agency or large company and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> being unreferenced.\n",
"result: Assess whether the question is correct given only the content returned from the search and the \n",
"usersquestion <span style=\"color: #800080; text-decoration-color: #800080\">//</span> acceptable values are “correct” or “incorrect”\n",
"You will output this as a JSON document: <span style=\"font-weight: bold\">{</span>relevance: integer, credibility: integer, result: string<span style=\"font-weight: bold\">}</span>\n",
"User: What is the population of Canada?\n",
"Assistant: Canada's population was estimated at <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">39</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">858</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">480</span> on April <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2023</span> by Statistics Canada.\n",
"Evaluation: <span style=\"font-weight: bold\">{</span>relevance: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, credibility: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, result: correct<span style=\"font-weight: bold\">}</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Technical patterns\n",
"Subjective evaluations\n",
"Building up a good scorecard for automated testing benefits from a few rounds of detailed humanreview so we can \n",
"learn what is valuable.\n",
"A policy of “show rather than tell” is also advised for GPT-\u001b[1;36m4\u001b[0m, so include examples of what a \u001b[1;36m1\u001b[0m, \u001b[1;36m3\u001b[0m and8 out of \u001b[1;36m10\u001b[0m \n",
"look like so the model can appreciate the spread.\n",
"Examplescorecard\n",
"You are a helpful evaluation assistant who grades how well the Assistant has answered the customers query.\n",
"You will assess each submission against these metrics, please think through these step by step:\n",
"-\n",
"relevance: Grade how relevant the search content is to the question from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m \u001b[1;36m5\u001b[0m being highly relevant and \u001b[1;36m1\u001b[0m \n",
"beingnot relevant at all.\n",
"- credibility: Grade how credible the sources provided are from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m \u001b[1;36m5\u001b[0m being an established newspaper,\n",
"-\n",
"government agency or large company and \u001b[1;36m1\u001b[0m being unreferenced.\n",
"result: Assess whether the question is correct given only the content returned from the search and the \n",
"usersquestion \u001b[35m/\u001b[0m\u001b[35m/\u001b[0m acceptable values are “correct” or “incorrect”\n",
"You will output this as a JSON document: \u001b[1m{\u001b[0mrelevance: integer, credibility: integer, result: string\u001b[1m}\u001b[0m\n",
"User: What is the population of Canada?\n",
"Assistant: Canada's population was estimated at \u001b[1;36m39\u001b[0m,\u001b[1;36m858\u001b[0m,\u001b[1;36m480\u001b[0m on April \u001b[1;36m1\u001b[0m, \u001b[1;36m2023\u001b[0m by Statistics Canada.\n",
"Evaluation: \u001b[1m{\u001b[0mrelevance: \u001b[1;36m5\u001b[0m, credibility: \u001b[1;36m5\u001b[0m, result: correct\u001b[1m}\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Example framework\n",
"Your evaluations can be grouped up into test suites called runs and executed in a batch to testthe effectiveness of \n",
"your system.\n",
"Each run should have its contents logged and stored at the most granular level <span style=\"color: #800080; text-decoration-color: #800080; font-weight: bold\">possible</span><span style=\"font-weight: bold\">(</span>“tracing”<span style=\"font-weight: bold\">)</span> so you can \n",
"investigate failure reasons, make tweaks and then rerun your evals.\n",
"Run ID Model\n",
"Score\n",
"Annotation feedback\n",
"Changes since last run\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">28</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"<span style=\"color: #800080; text-decoration-color: #800080\">/</span><span style=\"color: #ff00ff; text-decoration-color: #ff00ff\">50</span>\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">34</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">18</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"N/A\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span> incorrect with correct search results\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
"Model updated to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"Added few-shot examples\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">42</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> incorrect with correct search results\n",
"Added metadata to search\n",
"Prompt engineering for Answer step\n",
"gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">48</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
"● <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> incorrect with correct search results\n",
"Prompt engineering to Answer step\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> system. Here's a \n",
"breakdown of the process:\n",
". **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
". **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both <span style=\"color: #008000; text-decoration-color: #008000\">\"return,\"</span> and the process passes this evaluation.\n",
". **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
". **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of purchase. The expected and predicted outcomes are <span style=\"color: #008000; text-decoration-color: #008000\">\"return_policy,\"</span> and this step also passes.\n",
". **Response to User**: The system responds to the user, confirming that the return can be processed because it is \n",
"within the <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-day window.\n",
". **Evaluation**: The response is evaluated for adherence to guidelines, scoring <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> for politeness, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for coherence,\n",
"and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for relevancy, resulting in a pass.\n",
"The framework uses both component evaluations <span style=\"font-weight: bold\">(</span>red dashed lines<span style=\"font-weight: bold\">)</span> and subjective evaluations <span style=\"font-weight: bold\">(</span>orange dashed lines<span style=\"font-weight: bold\">)</span> \n",
"to ensure the process is accurate and user-friendly.\n",
"</pre>\n"
],
"text/plain": [
"Example framework\n",
"Your evaluations can be grouped up into test suites called runs and executed in a batch to testthe effectiveness of \n",
"your system.\n",
"Each run should have its contents logged and stored at the most granular level \u001b[1;35mpossible\u001b[0m\u001b[1m(\u001b[0m“tracing”\u001b[1m)\u001b[0m so you can \n",
"investigate failure reasons, make tweaks and then rerun your evals.\n",
"Run ID Model\n",
"Score\n",
"Annotation feedback\n",
"Changes since last run\n",
"\n",
"\n",
"\n",
"\n",
"\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m28\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"gpt-\u001b[1;36m4\u001b[0m\n",
"\u001b[35m/\u001b[0m\u001b[95m50\u001b[0m\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m34\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"● \u001b[1;36m18\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"N/A\n",
"● \u001b[1;36m10\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"● \u001b[1;36m12\u001b[0m incorrect with correct search results\n",
"● \u001b[1;36m4\u001b[0m incorrect searches\n",
"Model updated to GPT-\u001b[1;36m4\u001b[0m\n",
"Added few-shot examples\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m42\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"● \u001b[1;36m8\u001b[0m incorrect with correct search results\n",
"Added metadata to search\n",
"Prompt engineering for Answer step\n",
"gpt-\u001b[1;36m3.5\u001b[0m-turbo \u001b[1;36m48\u001b[0m/\u001b[1;36m50\u001b[0m\n",
"● \u001b[1;36m2\u001b[0m incorrect with correct search results\n",
"Prompt engineering to Answer step\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m system. Here's a \n",
"breakdown of the process:\n",
". **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
". **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both \u001b[32m\"return,\"\u001b[0m and the process passes this evaluation.\n",
". **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
". **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"\u001b[1;36m14\u001b[0m days of purchase. The expected and predicted outcomes are \u001b[32m\"return_policy,\"\u001b[0m and this step also passes.\n",
". **Response to User**: The system responds to the user, confirming that the return can be processed because it is \n",
"within the \u001b[1;36m14\u001b[0m-day window.\n",
". **Evaluation**: The response is evaluated for adherence to guidelines, scoring \u001b[1;36m5\u001b[0m for politeness, \u001b[1;36m4\u001b[0m for coherence,\n",
"and \u001b[1;36m4\u001b[0m for relevancy, resulting in a pass.\n",
"The framework uses both component evaluations \u001b[1m(\u001b[0mred dashed lines\u001b[1m)\u001b[0m and subjective evaluations \u001b[1m(\u001b[0morange dashed lines\u001b[1m)\u001b[0m \n",
"to ensure the process is accurate and user-friendly.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Example framework\n",
"I want to return aT-shirt I bought onAmazon on March 3rd.\n",
"User\n",
"Router\n",
"LLM\n",
"Expected: return\n",
"Predicted: return\n",
"PASS\n",
"Return\n",
"Assistant\n",
"LLM\n",
"Component evals\n",
"Subjective evals\n",
"Expected: return_policy\n",
"Predicted: return_policy\n",
"PASS\n",
"Knowledgebase\n",
"Question: Does this response adhere toour guidelines\n",
"Score:Politeness: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, Coherence: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, Relevancy: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"PASS\n",
"Sure - because werewithin <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of thepurchase, I canprocess the return\n",
"Question: I want to return a T-shirt Ibought on Amazon on March 3rd.\n",
"Ground truth: Eligible for return\n",
"PASS\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> system. Here's a \n",
"breakdown of the process:\n",
". **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
". **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both <span style=\"color: #008000; text-decoration-color: #008000\">\"return,\"</span> and the process passes this evaluation.\n",
". **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
". **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span> days of purchase. The expected and predicted outcomes are <span style=\"color: #008000; text-decoration-color: #008000\">\"return_policy,\"</span> and this step also passes.\n",
". **Response to User**: The system responds to the user, confirming that the return can be processed because it is \n",
"within the <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">14</span>-day window.\n",
". **Evaluation**: The response is evaluated for adherence to guidelines, scoring <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> for politeness, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for coherence,\n",
"and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for relevancy, resulting in a pass.\n",
"The framework uses both component evaluations <span style=\"font-weight: bold\">(</span>red dashed lines<span style=\"font-weight: bold\">)</span> and subjective evaluations <span style=\"font-weight: bold\">(</span>orange dashed lines<span style=\"font-weight: bold\">)</span> \n",
"to ensure the process is accurate and user-friendly.\n",
"</pre>\n"
],
"text/plain": [
"Example framework\n",
"I want to return aT-shirt I bought onAmazon on March 3rd.\n",
"User\n",
"Router\n",
"LLM\n",
"Expected: return\n",
"Predicted: return\n",
"PASS\n",
"Return\n",
"Assistant\n",
"LLM\n",
"Component evals\n",
"Subjective evals\n",
"Expected: return_policy\n",
"Predicted: return_policy\n",
"PASS\n",
"Knowledgebase\n",
"Question: Does this response adhere toour guidelines\n",
"Score:Politeness: \u001b[1;36m5\u001b[0m, Coherence: \u001b[1;36m4\u001b[0m, Relevancy: \u001b[1;36m4\u001b[0m\n",
"PASS\n",
"Sure - because werewithin \u001b[1;36m14\u001b[0m days of thepurchase, I canprocess the return\n",
"Question: I want to return a T-shirt Ibought on Amazon on March 3rd.\n",
"Ground truth: Eligible for return\n",
"PASS\n",
"\n",
"This diagram illustrates a framework for processing a return request using a language model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m system. Here's a \n",
"breakdown of the process:\n",
". **User Input**: The user wants to return a T-shirt purchased on Amazon on March 3rd.\n",
". **Router**: The initial input is processed by a router LLM, which determines the nature of the request. The \n",
"expected and predicted outcomes are both \u001b[32m\"return,\"\u001b[0m and the process passes this evaluation.\n",
". **Return Assistant**: The request is then handled by a return assistant LLM. It interacts with a knowledge base \n",
"to verify the return policy.\n",
". **Knowledge Base**: The system checks the return policy, confirming that the item is eligible for return within \n",
"\u001b[1;36m14\u001b[0m days of purchase. The expected and predicted outcomes are \u001b[32m\"return_policy,\"\u001b[0m and this step also passes.\n",
". **Response to User**: The system responds to the user, confirming that the return can be processed because it is \n",
"within the \u001b[1;36m14\u001b[0m-day window.\n",
". **Evaluation**: The response is evaluated for adherence to guidelines, scoring \u001b[1;36m5\u001b[0m for politeness, \u001b[1;36m4\u001b[0m for coherence,\n",
"and \u001b[1;36m4\u001b[0m for relevancy, resulting in a pass.\n",
"The framework uses both component evaluations \u001b[1m(\u001b[0mred dashed lines\u001b[1m)\u001b[0m and subjective evaluations \u001b[1m(\u001b[0morange dashed lines\u001b[1m)\u001b[0m \n",
"to ensure the process is accurate and user-friendly.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Best practices\n",
"Log everything\n",
"●\n",
"Evals need test cases - log everything as you develop so you can mine your logs for good eval cases\n",
"Create a feedback loop\n",
"●\n",
"●\n",
"Build evals into your application so you can quickly run them, iterate and rerun to see the impact\n",
"Evals also provide a useful structure for few-shot or fine-tuning examples when optimizing\n",
"Employ expert labellers who know the process\n",
"● Use experts to help create your eval cases - these need to be as lifelike as possible\n",
"Evaluate early and often\n",
"●\n",
"Evals are something you should build as soon as you have your first functioning prompt - you wont beable to \n",
"optimize without this baseline, so build it early\n",
"● Making evals early also forces you to engage with what a good response looks like\n",
". **Log Everything**\n",
" - It's important to log all test cases during development. This allows you to mine your logs for effective \n",
"evaluation cases.\n",
". **Create a Feedback Loop**\n",
" - Integrate evaluations into your application to quickly run, iterate, and rerun them to observe impacts.\n",
" - Evaluations provide a useful structure for few-shot or fine-tuning examples during optimization.\n",
". **Employ Expert Labelers Who Know the Process**\n",
" - Use experts to help create evaluation cases, ensuring they are as realistic as possible.\n",
". **Evaluate Early and Often**\n",
" - Build evaluations as soon as you have a functioning prompt. This baseline is crucial for optimization.\n",
" - Early evaluations help you understand what a good response looks like, facilitating better engagement.\n",
"</pre>\n"
],
"text/plain": [
"Best practices\n",
"Log everything\n",
"●\n",
"Evals need test cases - log everything as you develop so you can mine your logs for good eval cases\n",
"Create a feedback loop\n",
"●\n",
"●\n",
"Build evals into your application so you can quickly run them, iterate and rerun to see the impact\n",
"Evals also provide a useful structure for few-shot or fine-tuning examples when optimizing\n",
"Employ expert labellers who know the process\n",
"● Use experts to help create your eval cases - these need to be as lifelike as possible\n",
"Evaluate early and often\n",
"●\n",
"Evals are something you should build as soon as you have your first functioning prompt - you wont beable to \n",
"optimize without this baseline, so build it early\n",
"● Making evals early also forces you to engage with what a good response looks like\n",
". **Log Everything**\n",
" - It's important to log all test cases during development. This allows you to mine your logs for effective \n",
"evaluation cases.\n",
". **Create a Feedback Loop**\n",
" - Integrate evaluations into your application to quickly run, iterate, and rerun them to observe impacts.\n",
" - Evaluations provide a useful structure for few-shot or fine-tuning examples during optimization.\n",
". **Employ Expert Labelers Who Know the Process**\n",
" - Use experts to help create evaluation cases, ensuring they are as realistic as possible.\n",
". **Evaluate Early and Often**\n",
" - Build evaluations as soon as you have a functioning prompt. This baseline is crucial for optimization.\n",
" - Early evaluations help you understand what a good response looks like, facilitating better engagement.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">## Overview\n",
"Evaluation is the process of validating and testing the outputs that your Large Language Model <span style=\"font-weight: bold\">(</span>LLM<span style=\"font-weight: bold\">)</span> applications \n",
"are producing. Strong evaluations, referred to as <span style=\"color: #008000; text-decoration-color: #008000\">\"evals,\"</span> contribute to creating a more stable and reliable \n",
"application that can withstand changes in code and model updates.\n",
"### Example Use Cases\n",
"- **Quantify a solutions reliability**: Measure how dependable your application is.\n",
"- **Monitor application performance in production**: Keep track of how well your application performs in real-world\n",
"scenarios.\n",
"- **Test for regressions**: Ensure that new updates do not negatively impact existing functionality.\n",
"### What Well Cover\n",
"- **What are evals**: Understanding the concept and importance of evaluations.\n",
"- **Technical patterns**: Exploring common methods and strategies used in evaluations.\n",
"- **Example framework**: Providing a structured approach to implementing evaluations.\n",
"- **Best practices**: Sharing tips and guidelines for effective evaluations.\n",
"- **Resources**: Offering additional materials for further learning and exploration.\n",
"</pre>\n"
],
"text/plain": [
"## Overview\n",
"Evaluation is the process of validating and testing the outputs that your Large Language Model \u001b[1m(\u001b[0mLLM\u001b[1m)\u001b[0m applications \n",
"are producing. Strong evaluations, referred to as \u001b[32m\"evals,\"\u001b[0m contribute to creating a more stable and reliable \n",
"application that can withstand changes in code and model updates.\n",
"### Example Use Cases\n",
"- **Quantify a solutions reliability**: Measure how dependable your application is.\n",
"- **Monitor application performance in production**: Keep track of how well your application performs in real-world\n",
"scenarios.\n",
"- **Test for regressions**: Ensure that new updates do not negatively impact existing functionality.\n",
"### What Well Cover\n",
"- **What are evals**: Understanding the concept and importance of evaluations.\n",
"- **Technical patterns**: Exploring common methods and strategies used in evaluations.\n",
"- **Example framework**: Providing a structured approach to implementing evaluations.\n",
"- **Best practices**: Sharing tips and guidelines for effective evaluations.\n",
"- **Resources**: Offering additional materials for further learning and exploration.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns**\n",
" three types of evaluation methods used in technical assessments:\n",
". **Metric-based Evaluations**:\n",
" - These evaluations use comparison metrics such as BLEU and ROUGE. - They provide a score that helps in \n",
"filtering and ranking results, making it easier to assess the quality of outputs quantitatively.\n",
". **Component Evaluations**:\n",
" - This method involves comparing the ground truth to predictions.\n",
" - It results in a simple Pass/Fail outcome, which is useful for determining whether specific components meet the\n",
"required standards.\n",
". **Subjective Evaluations**:\n",
" - These evaluations rely on a scorecard to assess outputs subjectively.\n",
" - The scorecard can also include a Pass/Fail option, allowing for a more nuanced evaluation that considers \n",
"qualitative aspects.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns**\n",
" three types of evaluation methods used in technical assessments:\n",
". **Metric-based Evaluations**:\n",
" - These evaluations use comparison metrics such as BLEU and ROUGE. - They provide a score that helps in \n",
"filtering and ranking results, making it easier to assess the quality of outputs quantitatively.\n",
". **Component Evaluations**:\n",
" - This method involves comparing the ground truth to predictions.\n",
" - It results in a simple Pass/Fail outcome, which is useful for determining whether specific components meet the\n",
"required standards.\n",
". **Subjective Evaluations**:\n",
" - These evaluations rely on a scorecard to assess outputs subjectively.\n",
" - The scorecard can also include a Pass/Fail option, allowing for a more nuanced evaluation that considers \n",
"qualitative aspects.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Metric-based Evaluations\n",
"ROUGE is a common metric for evaluating machine summarizations of text. It is specifically used to assess the \n",
"quality of summaries by comparing them to reference summaries. an example of how ROUGE is applied:\n",
"- **Original Text**: This is a detailed description of OpenAI's mission, emphasizing the development of artificial \n",
"general intelligence <span style=\"font-weight: bold\">(</span>AGI<span style=\"font-weight: bold\">)</span> that benefits humanity. It highlights the importance of safety, broad distribution of \n",
"benefits, and avoiding harmful uses or power concentration.\n",
"- **Machine Summary**: This is a condensed version of the original text. It focuses on ensuring AGI is safe and \n",
"accessible, avoiding harm and power concentration, and promoting research and collaboration in AI.\n",
"- **ROUGE Score**: The score given is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.51162</span>, which quantifies the similarity between the machine-generated \n",
"summary and the original text. A higher score indicates a closer match to the reference summary.\n",
"Overall, ROUGE helps in evaluating how well a machine-generated summary captures the essence of the original text.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Metric-based Evaluations\n",
"ROUGE is a common metric for evaluating machine summarizations of text. It is specifically used to assess the \n",
"quality of summaries by comparing them to reference summaries. an example of how ROUGE is applied:\n",
"- **Original Text**: This is a detailed description of OpenAI's mission, emphasizing the development of artificial \n",
"general intelligence \u001b[1m(\u001b[0mAGI\u001b[1m)\u001b[0m that benefits humanity. It highlights the importance of safety, broad distribution of \n",
"benefits, and avoiding harmful uses or power concentration.\n",
"- **Machine Summary**: This is a condensed version of the original text. It focuses on ensuring AGI is safe and \n",
"accessible, avoiding harm and power concentration, and promoting research and collaboration in AI.\n",
"- **ROUGE Score**: The score given is \u001b[1;36m0.51162\u001b[0m, which quantifies the similarity between the machine-generated \n",
"summary and the original text. A higher score indicates a closer match to the reference summary.\n",
"Overall, ROUGE helps in evaluating how well a machine-generated summary captures the essence of the original text.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"># Technical Patterns: Metric-based Evaluations\n",
" the BLEU score, a standard metric used to evaluate machine translation tasks. BLEU stands for Bilingual Evaluation\n",
"Understudy and is a method for assessing the quality of text that has been machine-translated from one language to \n",
"another.\n",
"### Key Elements:\n",
"- **BLEU**: This is a metric specifically designed for evaluating translation tasks. It compares the \n",
"machine-generated translation to one or more reference translations.\n",
"- **Original Text**: The example given is in Welsh: <span style=\"color: #008000; text-decoration-color: #008000\">\"Y gwir oedd doedden nhw ddim yn dweud celwyddau wedi'r cwbl.\"</span>\n",
"- **Reference Translation**: This is the human-generated translation used as a standard for comparison: <span style=\"color: #008000; text-decoration-color: #008000\">\"The truth </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">was they were not telling lies after all.\"</span>\n",
"- **Predicted Translation**: This is the translation produced by the machine: <span style=\"color: #008000; text-decoration-color: #008000\">\"The truth was they weren't telling </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">lies after all.\"</span>\n",
"- **BLEU Score**: The score for this translation is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.39938</span>. This score indicates how closely the machine \n",
"translation matches the reference translation, with a higher score representing a closer match.\n",
"The BLEU score is widely used in the field of natural language processing to provide a quantitative measure of \n",
"translation quality.\n",
"</pre>\n"
],
"text/plain": [
"# Technical Patterns: Metric-based Evaluations\n",
" the BLEU score, a standard metric used to evaluate machine translation tasks. BLEU stands for Bilingual Evaluation\n",
"Understudy and is a method for assessing the quality of text that has been machine-translated from one language to \n",
"another.\n",
"### Key Elements:\n",
"- **BLEU**: This is a metric specifically designed for evaluating translation tasks. It compares the \n",
"machine-generated translation to one or more reference translations.\n",
"- **Original Text**: The example given is in Welsh: \u001b[32m\"Y gwir oedd doedden nhw ddim yn dweud celwyddau wedi'r cwbl.\"\u001b[0m\n",
"- **Reference Translation**: This is the human-generated translation used as a standard for comparison: \u001b[32m\"The truth \u001b[0m\n",
"\u001b[32mwas they were not telling lies after all.\"\u001b[0m\n",
"- **Predicted Translation**: This is the translation produced by the machine: \u001b[32m\"The truth was they weren't telling \u001b[0m\n",
"\u001b[32mlies after all.\"\u001b[0m\n",
"- **BLEU Score**: The score for this translation is \u001b[1;36m0.39938\u001b[0m. This score indicates how closely the machine \n",
"translation matches the reference translation, with a higher score representing a closer match.\n",
"The BLEU score is widely used in the field of natural language processing to provide a quantitative measure of \n",
"translation quality.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Technical Patterns: Metric-based Evaluations\n",
"**What theyre good for:**\n",
"- **Starting Point**: They provide a good starting point for evaluating a new solution, helping to establish \n",
"initial benchmarks.\n",
"- **Automated Testing**: These evaluations serve as a useful yardstick for automated testing, particularly in \n",
"determining if a change has caused a significant performance shift.\n",
"- **Cost-Effective**: They are cheap and fast, making them accessible for quick assessments.\n",
"**What to be aware of:**\n",
"- **Context Specificity**: These evaluations are not tailored to specific contexts, which can limit their \n",
"effectiveness in certain situations.\n",
"- **Sophistication Needs**: Most customers require more sophisticated evaluations before moving to production, \n",
"indicating that metric-based evaluations might not be sufficient on their own for final decision-making.\n",
"</pre>\n"
],
"text/plain": [
"Technical Patterns: Metric-based Evaluations\n",
"**What theyre good for:**\n",
"- **Starting Point**: They provide a good starting point for evaluating a new solution, helping to establish \n",
"initial benchmarks.\n",
"- **Automated Testing**: These evaluations serve as a useful yardstick for automated testing, particularly in \n",
"determining if a change has caused a significant performance shift.\n",
"- **Cost-Effective**: They are cheap and fast, making them accessible for quick assessments.\n",
"**What to be aware of:**\n",
"- **Context Specificity**: These evaluations are not tailored to specific contexts, which can limit their \n",
"effectiveness in certain situations.\n",
"- **Sophistication Needs**: Most customers require more sophisticated evaluations before moving to production, \n",
"indicating that metric-based evaluations might not be sufficient on their own for final decision-making.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Component Evaluations**\n",
"Component evaluations, also known as <span style=\"color: #008000; text-decoration-color: #008000\">\"unit tests,\"</span> focus on assessing a single input/output of an application. The \n",
"goal is to verify that each component functions correctly in isolation by comparing the input to a predefined ideal\n",
"result, known as the ground truth.\n",
"**Process Overview:**\n",
"1. **Input Question:**\n",
" - The process begins with a question: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"2. **Agent's Role:**\n",
" - The agent receives the question and processes it. The agent's thought process is: <span style=\"color: #008000; text-decoration-color: #008000\">\"I don't know. I should use </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">a tool.\"</span>\n",
" - The agent decides on an action: <span style=\"color: #008000; text-decoration-color: #008000\">\"Search.\"</span>\n",
" - The action input is the original question: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"3. **Search Component:**\n",
" - The search component is tasked with finding the answer. It retrieves the information: <span style=\"color: #008000; text-decoration-color: #008000\">\"The current population </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">of Canada is 39,566,248 as of Tuesday, May 23, 2023….\"</span>\n",
"4. **Evaluation Steps:**\n",
" - **Correct Action Check:** Is the agent's decision to search the correct action?\n",
" - **Exact Match Comparison:** Does the retrieved answer match the expected result exactly?\n",
" - **Contextual Relevance:** Does the answer use the context provided in the question?\n",
" - **Number Extraction and Comparison:** Extract numbers from both the expected and retrieved answers and compare\n",
"them for accuracy.\n",
"5. **Final Output:**\n",
" - The final output is the verified answer: <span style=\"color: #008000; text-decoration-color: #008000\">\"There are 39,566,248 people in Canada as of 2023.\"</span>\n",
"This process ensures that each component of the application is functioning correctly and producing accurate results\n",
"by systematically evaluating each step against the ground truth.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Component Evaluations**\n",
"Component evaluations, also known as \u001b[32m\"unit tests,\"\u001b[0m focus on assessing a single input/output of an application. The \n",
"goal is to verify that each component functions correctly in isolation by comparing its output to a predefined ideal\n",
"result, known as the ground truth.\n",
"**Process Overview:**\n",
"1. **Input Question:**\n",
" - The process begins with a question: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"2. **Agent's Role:**\n",
" - The agent receives the question and processes it. The agent's thought process is: \u001b[32m\"I don't know. I should use \u001b[0m\n",
"\u001b[32ma tool.\"\u001b[0m\n",
" - The agent decides on an action: \u001b[32m\"Search.\"\u001b[0m\n",
" - The action input is the original question: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"3. **Search Component:**\n",
" - The search component is tasked with finding the answer. It retrieves the information: \u001b[32m\"The current population \u001b[0m\n",
"\u001b[32mof Canada is 39,566,248 as of Tuesday, May 23, 2023….\"\u001b[0m\n",
"4. **Evaluation Steps:**\n",
" - **Correct Action Check:** Is the agent's decision to search the correct action?\n",
" - **Exact Match Comparison:** Does the retrieved answer match the expected result exactly?\n",
" - **Contextual Relevance:** Does the answer use the context provided in the question?\n",
" - **Number Extraction and Comparison:** Extract numbers from both the expected and retrieved answers and compare\n",
"them for accuracy.\n",
"5. **Final Output:**\n",
" - The final output is the verified answer: \u001b[32m\"There are 39,566,248 people in Canada as of 2023.\"\u001b[0m\n",
"This process ensures that each component of the application is functioning correctly and producing accurate results\n",
"by systematically evaluating each step against the ground truth.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Technical Patterns: Subjective Evaluations**\n",
"Building an effective scorecard for automated testing is enhanced by incorporating detailed human reviews. This \n",
"process helps identify what is truly valuable. The approach of <span style=\"color: #008000; text-decoration-color: #008000\">\"show rather than tell\"</span> is recommended for GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, \n",
"meaning that examples of scores like <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> out of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> should be provided to help the model understand the \n",
"range.\n",
"**Example Scorecard:**\n",
"- **Role**: You are an evaluation assistant assessing how well the Assistant has answered a customer's query.\n",
" - **Metrics for Assessment**:\n",
" - **Relevance**: Rate the relevance of the search content to the question on a scale from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, where <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> is \n",
"highly relevant and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is not relevant at all.\n",
" - **Credibility**: Rate the credibility of the sources from <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, where <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span> is an established newspaper, \n",
"government agency, or large company, and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> is unreferenced.\n",
" - **Result**: Determine if the question is answered correctly based on the search content and the user's \n",
"question. Acceptable values are <span style=\"color: #008000; text-decoration-color: #008000\">\"correct\"</span> or <span style=\"color: #008000; text-decoration-color: #008000\">\"incorrect.\"</span>\n",
"- **Output Format**: Provide the evaluation as a JSON document with fields for relevance, credibility, and result.\n",
"**Example Evaluation**:\n",
"- **User Query**: <span style=\"color: #008000; text-decoration-color: #008000\">\"What is the population of Canada?\"</span>\n",
"- **Assistant's Response**: <span style=\"color: #008000; text-decoration-color: #008000\">\"Canada's population was estimated at 39,858,480 on April 1, 2023, by Statistics </span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">Canada.\"</span>\n",
"- **Evaluation**: `<span style=\"font-weight: bold\">{</span>relevance: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, credibility: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>, result: correct<span style=\"font-weight: bold\">}</span>`\n",
"This structured approach ensures clarity and consistency in evaluating the performance of automated systems.\n",
"</pre>\n"
],
"text/plain": [
"**Technical Patterns: Subjective Evaluations**\n",
"Building an effective scorecard for automated testing is enhanced by incorporating detailed human reviews. This \n",
"process helps identify what is truly valuable. The approach of \u001b[32m\"show rather than tell\"\u001b[0m is recommended for GPT-\u001b[1;36m4\u001b[0m, \n",
"meaning that examples of scores like \u001b[1;36m1\u001b[0m, \u001b[1;36m3\u001b[0m, and \u001b[1;36m8\u001b[0m out of \u001b[1;36m10\u001b[0m should be provided to help the model understand the \n",
"range.\n",
"**Example Scorecard:**\n",
"- **Role**: You are an evaluation assistant assessing how well the Assistant has answered a customer's query.\n",
" - **Metrics for Assessment**:\n",
" - **Relevance**: Rate the relevance of the search content to the question on a scale from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m, where \u001b[1;36m5\u001b[0m is \n",
"highly relevant and \u001b[1;36m1\u001b[0m is not relevant at all.\n",
" - **Credibility**: Rate the credibility of the sources from \u001b[1;36m1\u001b[0m to \u001b[1;36m5\u001b[0m, where \u001b[1;36m5\u001b[0m is an established newspaper, \n",
"government agency, or large company, and \u001b[1;36m1\u001b[0m is unreferenced.\n",
" - **Result**: Determine if the question is answered correctly based on the search content and the user's \n",
"question. Acceptable values are \u001b[32m\"correct\"\u001b[0m or \u001b[32m\"incorrect.\"\u001b[0m\n",
"- **Output Format**: Provide the evaluation as a JSON document with fields for relevance, credibility, and result.\n",
"**Example Evaluation**:\n",
"- **User Query**: \u001b[32m\"What is the population of Canada?\"\u001b[0m\n",
"- **Assistant's Response**: \u001b[32m\"Canada's population was estimated at 39,858,480 on April 1, 2023, by Statistics \u001b[0m\n",
"\u001b[32mCanada.\"\u001b[0m\n",
"- **Evaluation**: `\u001b[1m{\u001b[0mrelevance: \u001b[1;36m5\u001b[0m, credibility: \u001b[1;36m5\u001b[0m, result: correct\u001b[1m}\u001b[0m`\n",
"This structured approach ensures clarity and consistency in evaluating the performance of automated systems.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Example Framework**\n",
"This framework outlines a method for evaluating the effectiveness of a system by grouping evaluations into test \n",
"suites called <span style=\"color: #008000; text-decoration-color: #008000\">\"runs.\"</span> These runs are executed in batches, and each run's contents are logged and stored at a \n",
"detailed level, known as <span style=\"color: #008000; text-decoration-color: #008000\">\"tracing.\"</span> This allows for investigation of failures, making adjustments, and rerunning \n",
"evaluations.\n",
"The table provides a summary of different runs:\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>**: - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">28</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">18</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: N/A\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>**: - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">36</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: Model updated to GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3</span>**: - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">34</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">12</span> incorrect with correct search results, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> incorrect searches\n",
" - Changes: Added few-shot examples\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>**: - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">42</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">8</span> incorrect with correct search results\n",
" - Changes: Added metadata to search, Prompt engineering for Answer step\n",
"- **Run ID <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">5</span>**: - Model: gpt-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span>-turbo\n",
" - Score: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">48</span>/<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>\n",
" - Annotation Feedback: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> incorrect with correct search results\n",
" - Changes: Prompt engineering to Answer step\n",
"This framework emphasizes the importance of detailed logging and iterative improvements to enhance system \n",
"performance.\n",
"</pre>\n"
],
"text/plain": [
"**Example Framework**\n",
"This framework outlines a method for evaluating the effectiveness of a system by grouping evaluations into test \n",
"suites called \u001b[32m\"runs.\"\u001b[0m These runs are executed in batches, and each run's contents are logged and stored at a \n",
"detailed level, known as \u001b[32m\"tracing.\"\u001b[0m This allows for investigation of failures, making adjustments, and rerunning \n",
"evaluations.\n",
"The table provides a summary of different runs:\n",
"- **Run ID \u001b[1;36m1\u001b[0m**: - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m28\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m18\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: N/A\n",
"- **Run ID \u001b[1;36m2\u001b[0m**: - Model: gpt-\u001b[1;36m4\u001b[0m\n",
" - Score: \u001b[1;36m36\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m10\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: Model updated to GPT-\u001b[1;36m4\u001b[0m\n",
"- **Run ID \u001b[1;36m3\u001b[0m**: - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m34\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m12\u001b[0m incorrect with correct search results, \u001b[1;36m4\u001b[0m incorrect searches\n",
" - Changes: Added few-shot examples\n",
"- **Run ID \u001b[1;36m4\u001b[0m**: - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m42\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m8\u001b[0m incorrect with correct search results\n",
" - Changes: Added metadata to search, Prompt engineering for Answer step\n",
"- **Run ID \u001b[1;36m5\u001b[0m**: - Model: gpt-\u001b[1;36m3.5\u001b[0m-turbo\n",
" - Score: \u001b[1;36m48\u001b[0m/\u001b[1;36m50\u001b[0m\n",
" - Annotation Feedback: \u001b[1;36m2\u001b[0m incorrect with correct search results\n",
" - Changes: Prompt engineering to Answer step\n",
"This framework emphasizes the importance of detailed logging and iterative improvements to enhance system \n",
"performance.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Overview\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This \n",
"process enhances the model's ability to generate more accurate and relevant responses for the given context by \n",
"adapting it to the nuances and specific requirements of the task at hand.\n",
"Example use cases\n",
"- Generate output in a consistent format\n",
"- Process input by following specific instructions\n",
"What we'll cover\n",
"● When to fine-tune\n",
"● Preparing the dataset\n",
"● Best practices\n",
"● Hyperparameters\n",
"● Fine-tuning advances\n",
"● Resources\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Overview\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This \n",
"process enhances the model's ability to generate more accurate and relevant responses for the given context by \n",
"adapting it to the nuances and specific requirements of the task at hand.\n",
"Example use cases\n",
"- Generate output in a consistent format\n",
"- Process input by following specific instructions\n",
"What we'll cover\n",
"● When to fine-tune\n",
"● Preparing the dataset\n",
"● Best practices\n",
"● Hyperparameters\n",
"● Fine-tuning advances\n",
"● Resources\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">What is Fine-tuning\n",
"[Diagram: Public Model + Training data → Training → Fine-tuned model]\n",
"Fine-tuning a model consists of training the model to follow a set of given input/output examples.\n",
"This will teach the model to behave in a certain way when confronted with a similar input in the future.\n",
"We recommend using <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples even if the minimum is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"\n",
"Fine-tuning is a process in machine learning where a pre-existing model, known as a public model, is further \n",
"trained using specific training data. This involves adjusting the model to follow a set of given input/output \n",
"examples. The goal is to teach the model to respond in a particular way when it encounters similar inputs in the \n",
"future.\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better suited to specific tasks or datasets.\n",
"It is recommended to use <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples for effective fine-tuning, although the minimum requirement is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span> \n",
"examples. This ensures the model learns adequately from the examples provided.\n",
"</pre>\n"
],
"text/plain": [
"What is Fine-tuning\n",
"[Diagram: Public Model + Training data → Training → Fine-tuned model]\n",
"Fine-tuning a model consists of training the model to follow a set of given input/output examples.\n",
"This will teach the model to behave in a certain way when confronted with a similar input in the future.\n",
"We recommend using \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples even if the minimum is \u001b[1;36m10\u001b[0m.\n",
"\n",
"Fine-tuning is a process in machine learning where a pre-existing model, known as a public model, is further \n",
"trained using specific training data. This involves adjusting the model to follow a set of given input/output \n",
"examples. The goal is to teach the model to respond in a particular way when it encounters similar inputs in the \n",
"future.\n",
"The diagram illustrates this process: starting with a public model, training data is used in a training phase to \n",
"produce a fine-tuned model. This refined model is better suited to specific tasks or datasets.\n",
"It is recommended to use \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples for effective fine-tuning, although the minimum requirement is \u001b[1;36m10\u001b[0m \n",
"examples. This ensures the model learns adequately from the examples provided.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to fine-tune\n",
"Good for ✅\n",
"● Following a given format or tone for the output\n",
"● Processing the input following specific, complex instructions\n",
"● Improving latency\n",
"● Reducing token usage\n",
"Not good for ❌\n",
"● Teaching the model new knowledge\n",
"  ➔ Use RAG or custom models instead\n",
"● Performing well at multiple, unrelated tasks\n",
"  ➔ Do prompt-engineering or create multiple FT models instead\n",
"● Include up-to-date content in responses\n",
"  ➔ Use RAG instead\n",
"\n",
"</pre>\n"
],
"text/plain": [
"When to fine-tune\n",
"Good for ✅\n",
"● Following a given format or tone for the output\n",
"● Processing the input following specific, complex instructions\n",
"● Improving latency\n",
"● Reducing token usage\n",
"Not good for ❌\n",
"● Teaching the model new knowledge\n",
"  ➔ Use RAG or custom models instead\n",
"● Performing well at multiple, unrelated tasks\n",
"  ➔ Do prompt-engineering or create multiple FT models instead\n",
"● Include up-to-date content in responses\n",
"  ➔ Use RAG instead\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Preparing the dataset\n",
"Example format\n",
"<span style=\"font-weight: bold\">{</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"messages\"</span>: <span style=\"font-weight: bold\">[</span>\n",
"<span style=\"font-weight: bold\">{</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"system\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"Marv is a factual chatbot that is also sarcastic.\"</span>\n",
"<span style=\"font-weight: bold\">}</span>,\n",
"<span style=\"font-weight: bold\">{</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"user\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"What's the capital of France?\"</span>\n",
"<span style=\"font-weight: bold\">}</span>,\n",
"<span style=\"font-weight: bold\">{</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"role\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"assistant\"</span>,\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">\"content\"</span>: <span style=\"color: #008000; text-decoration-color: #008000\">\"Paris, as if everyone doesn't know that already.\"</span>\n",
"<span style=\"font-weight: bold\">}</span>\n",
"<span style=\"font-weight: bold\">]</span>\n",
"<span style=\"font-weight: bold\">}</span>\n",
".jsonl\n",
"➔ Take the set of instructions and prompts that you found worked best for the model prior to fine-tuning. Include them in every training example\n",
"➔ If you would like to shorten the instructions or prompts, it may take more training examples to arrive at good results\n",
"We recommend using <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples even if the minimum is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"\n",
"</pre>\n"
],
"text/plain": [
"Preparing the dataset\n",
"Example format\n",
"\u001b[1m{\u001b[0m\n",
"\u001b[32m\"messages\"\u001b[0m: \u001b[1m[\u001b[0m\n",
"\u001b[1m{\u001b[0m\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"system\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \u001b[32m\"Marv is a factual chatbot that is also sarcastic.\"\u001b[0m\n",
"\u001b[1m}\u001b[0m,\n",
"\u001b[1m{\u001b[0m\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"user\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \u001b[32m\"What's the capital of France?\"\u001b[0m\n",
"\u001b[1m}\u001b[0m,\n",
"\u001b[1m{\u001b[0m\n",
"\u001b[32m\"role\"\u001b[0m: \u001b[32m\"assistant\"\u001b[0m,\n",
"\u001b[32m\"content\"\u001b[0m: \u001b[32m\"Paris, as if everyone doesn't know that already.\"\u001b[0m\n",
"\u001b[1m}\u001b[0m\n",
"\u001b[1m]\u001b[0m\n",
"\u001b[1m}\u001b[0m\n",
".jsonl\n",
"➔ Take the set of instructions and prompts that you found worked best for the model prior to fine-tuning. Include them in every training example\n",
"➔ If you would like to shorten the instructions or prompts, it may take more training examples to arrive at good results\n",
"We recommend using \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples even if the minimum is \u001b[1;36m10\u001b[0m.\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Best practices\n",
"Curate examples carefully\n",
"Datasets can be difficult to build, start small and invest intentionally. Optimize for fewer high-quality training \n",
"examples.\n",
"● Consider “prompt baking”, or using a basic prompt to generate your initial examples\n",
"● If your conversations are multi-turn, ensure your examples are representative\n",
"● Collect examples to target issues detected in evaluation\n",
"● Consider the balance &amp; diversity of data\n",
"● Make sure your examples contain all the information needed in the response\n",
"Iterate on hyperparameters\n",
"Start with the defaults and adjust based on performance.\n",
"● If the model does not appear to converge, increase the learning rate multiplier\n",
"● If the model does not follow the training data as much as expected, increase the number of epochs\n",
"● If the model becomes less diverse than expected, decrease the # of epochs by <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span>\n",
"Establish a baseline\n",
"Often users start with a zero-shot or few-shot prompt to build a baseline evaluation before graduating to fine-tuning.\n",
"Automate your feedback pipeline\n",
"Introduce automated evaluations to highlight potential problem cases to clean up and use as training data.\n",
"Consider the G-Eval approach of using GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> to perform automated testing using a scorecard.\n",
"Optimize for latency and token efficiency\n",
"When using GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span>, once you have a baseline evaluation and training examples, consider fine-tuning <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> to get \n",
"similar performance for less cost and latency.\n",
"Experiment with reducing or removing system instructions with subsequent fine-tuned model versions.\n",
"</pre>\n"
],
"text/plain": [
"Best practices\n",
"Curate examples carefully\n",
"Datasets can be difficult to build, start small and invest intentionally. Optimize for fewer high-quality training \n",
"examples.\n",
"● Consider “prompt baking”, or using a basic prompt to generate your initial examples\n",
"● If your conversations are multi-turn, ensure your examples are representative\n",
"● Collect examples to target issues detected in evaluation\n",
"● Consider the balance & diversity of data\n",
"● Make sure your examples contain all the information needed in the response\n",
"Iterate on hyperparameters\n",
"Start with the defaults and adjust based on performance.\n",
"● If the model does not appear to converge, increase the learning rate multiplier\n",
"● If the model does not follow the training data as much as expected, increase the number of epochs\n",
"● If the model becomes less diverse than expected, decrease the # of epochs by \u001b[1;36m1\u001b[0m-\u001b[1;36m2\u001b[0m\n",
"Establish a baseline\n",
"Often users start with a zero-shot or few-shot prompt to build a baseline evaluation before graduating to fine-tuning.\n",
"Automate your feedback pipeline\n",
"Introduce automated evaluations to highlight potential problem cases to clean up and use as training data.\n",
"Consider the G-Eval approach of using GPT-\u001b[1;36m4\u001b[0m to perform automated testing using a scorecard.\n",
"Optimize for latency and token efficiency\n",
"When using GPT-\u001b[1;36m4\u001b[0m, once you have a baseline evaluation and training examples, consider fine-tuning \u001b[1;36m3.5\u001b[0m to get \n",
"similar performance for less cost and latency.\n",
"Experiment with reducing or removing system instructions with subsequent fine-tuned model versions.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">Hyperparameters\n",
"Epochs\n",
"Refers to <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span> full cycle through the training dataset\n",
"If you have hundreds of thousands of examples, we would recommendexperimenting with two epochs <span style=\"font-weight: bold\">(</span>or one<span style=\"font-weight: bold\">)</span> to avoid \n",
"overfitting.\n",
"default: auto <span style=\"font-weight: bold\">(</span>standard is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span><span style=\"font-weight: bold\">)</span>\n",
"Batch size\n",
"Number of training examples used to train a singleforward &amp; backward pass\n",
"In general, we've found that larger batch sizes tend to work better for larger datasets\n",
"default: ~<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>% x N* <span style=\"font-weight: bold\">(</span>max <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">256</span><span style=\"font-weight: bold\">)</span>\n",
"*N = number of training examples\n",
"Learning rate multiplier\n",
"Scaling factor for the original learning rate\n",
"We recommend experimenting with values between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.02</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>. We've found thatlarger learning rates often perform better\n",
"with larger batch sizes.\n",
"default: <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.05</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1</span> or <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>*\n",
"*depends on final batch size\n",
"\n",
"**Epochs**\n",
"- An epoch refers to one complete cycle through the training dataset.\n",
"- For datasets with hundreds of thousands of examples, it is recommended to use fewer epochs <span style=\"font-weight: bold\">(</span>one or two<span style=\"font-weight: bold\">)</span> to \n",
"prevent overfitting.\n",
"- Default setting is auto, with a standard of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> epochs.\n",
"**Batch Size**\n",
"- This is the number of training examples used to train in a single forward and backward pass.\n",
"- Larger batch sizes are generally more effective for larger datasets.\n",
"- The default batch size is approximately <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>% of the total number of training examples <span style=\"font-weight: bold\">(</span>N<span style=\"font-weight: bold\">)</span>, with a maximum of <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">256</span>.\n",
"**Learning Rate Multiplier**\n",
"- This is a scaling factor for the original learning rate.\n",
"- Experimentation with values between <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.02</span> and <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span> is recommended.\n",
"- Larger learning rates often yield better results with larger batch sizes.\n",
"- Default values are <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.05</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.1</span>, or <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">0.2</span>, depending on the final batch size.\n",
"</pre>\n"
],
"text/plain": [
"Hyperparameters\n",
"Epochs\n",
"Refers to \u001b[1;36m1\u001b[0m full cycle through the training dataset\n",
"If you have hundreds of thousands of examples, we would recommendexperimenting with two epochs \u001b[1m(\u001b[0mor one\u001b[1m)\u001b[0m to avoid \n",
"overfitting.\n",
"default: auto \u001b[1m(\u001b[0mstandard is \u001b[1;36m4\u001b[0m\u001b[1m)\u001b[0m\n",
"Batch size\n",
"Number of training examples used to train a singleforward & backward pass\n",
"In general, we've found that larger batch sizes tend to work better for larger datasets\n",
"default: ~\u001b[1;36m0.2\u001b[0m% x N* \u001b[1m(\u001b[0mmax \u001b[1;36m256\u001b[0m\u001b[1m)\u001b[0m\n",
"*N = number of training examples\n",
"Learning rate multiplier\n",
"Scaling factor for the original learning rate\n",
"We recommend experimenting with values between \u001b[1;36m0.02\u001b[0m-\u001b[1;36m0.2\u001b[0m. We've found thatlarger learning rates often perform better\n",
"with larger batch sizes.\n",
"default: \u001b[1;36m0.05\u001b[0m, \u001b[1;36m0.1\u001b[0m or \u001b[1;36m0.2\u001b[0m*\n",
"*depends on final batch size\n",
"\n",
"**Epochs**\n",
"- An epoch refers to one complete cycle through the training dataset.\n",
"- For datasets with hundreds of thousands of examples, it is recommended to use fewer epochs \u001b[1m(\u001b[0mone or two\u001b[1m)\u001b[0m to \n",
"prevent overfitting.\n",
"- Default setting is auto, with a standard of \u001b[1;36m4\u001b[0m epochs.\n",
"**Batch Size**\n",
"- This is the number of training examples used to train in a single forward and backward pass.\n",
"- Larger batch sizes are generally more effective for larger datasets.\n",
"- The default batch size is approximately \u001b[1;36m0.2\u001b[0m% of the total number of training examples \u001b[1m(\u001b[0mN\u001b[1m)\u001b[0m, with a maximum of \u001b[1;36m256\u001b[0m.\n",
"**Learning Rate Multiplier**\n",
"- This is a scaling factor for the original learning rate.\n",
"- Experimentation with values between \u001b[1;36m0.02\u001b[0m and \u001b[1;36m0.2\u001b[0m is recommended.\n",
"- Larger learning rates often yield better results with larger batch sizes.\n",
"- Default values are \u001b[1;36m0.05\u001b[0m, \u001b[1;36m0.1\u001b[0m, or \u001b[1;36m0.2\u001b[0m, depending on the final batch size.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Overview**\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This process \n",
"enhances the model's ability to generate more accurate and relevant responses for the given context by adapting it \n",
"to the nuances and specific requirements of the task at hand.\n",
"**Example Use Cases:**\n",
"- Generate output in a consistent format.\n",
"- Process input by following specific instructions.\n",
"**What Well Cover:**\n",
"- When to fine-tune\n",
"- Preparing the dataset\n",
"- Best practices\n",
"- Hyperparameters\n",
"- Fine-tuning advances\n",
"- Resources\n",
"</pre>\n"
],
"text/plain": [
"**Overview**\n",
"Fine-tuning involves adjusting the parameters of pre-trained models on a specific dataset or task. This process \n",
"enhances the model's ability to generate more accurate and relevant responses for the given context by adapting it \n",
"to the nuances and specific requirements of the task at hand.\n",
"**Example Use Cases:**\n",
"- Generate output in a consistent format.\n",
"- Process input by following specific instructions.\n",
"**What Well Cover:**\n",
"- When to fine-tune\n",
"- Preparing the dataset\n",
"- Best practices\n",
"- Hyperparameters\n",
"- Fine-tuning advances\n",
"- Resources\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">When to Fine-Tune\n",
"**Good for:**\n",
"- **Following a given format or tone for the output:** Fine-tuning is effective when you need the model to adhere \n",
"to a specific style or structure in its responses.\n",
" - **Processing the input following specific, complex instructions:** It helps in handling detailed and intricate \n",
"instructions accurately.\n",
"- **Improving latency:** Fine-tuning can enhance the speed of the model's responses.\n",
"- **Reducing token usage:** It can optimize the model to use fewer tokens, making it more efficient.\n",
"**Not good for:**\n",
"- **Teaching the model new knowledge:** Fine-tuning is not suitable for adding new information to the model. \n",
"Instead, use Retrieval-Augmented Generation <span style=\"font-weight: bold\">(</span>RAG<span style=\"font-weight: bold\">)</span> or custom models.\n",
"- **Performing well at multiple, unrelated tasks:** For diverse tasks, it's better to use prompt engineering or \n",
"create multiple fine-tuned models.\n",
"- **Including up-to-date content in responses:** Fine-tuning is not ideal for ensuring the model has the latest \n",
"information. RAG is recommended for this purpose.\n",
"</pre>\n"
],
"text/plain": [
"When to Fine-Tune\n",
"**Good for:**\n",
"- **Following a given format or tone for the output:** Fine-tuning is effective when you need the model to adhere \n",
"to a specific style or structure in its responses.\n",
" - **Processing the input following specific, complex instructions:** It helps in handling detailed and intricate \n",
"instructions accurately.\n",
"- **Improving latency:** Fine-tuning can enhance the speed of the model's responses.\n",
"- **Reducing token usage:** It can optimize the model to use fewer tokens, making it more efficient.\n",
"**Not good for:**\n",
"- **Teaching the model new knowledge:** Fine-tuning is not suitable for adding new information to the model. \n",
"Instead, use Retrieval-Augmented Generation \u001b[1m(\u001b[0mRAG\u001b[1m)\u001b[0m or custom models.\n",
"- **Performing well at multiple, unrelated tasks:** For diverse tasks, it's better to use prompt engineering or \n",
"create multiple fine-tuned models.\n",
"- **Including up-to-date content in responses:** Fine-tuning is not ideal for ensuring the model has the latest \n",
"information. RAG is recommended for this purpose.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Preparing the Dataset**\n",
" guidance on preparing a dataset for training a chatbot model. It includes an example format using JSONL <span style=\"font-weight: bold\">(</span>JSON \n",
"Lines<span style=\"font-weight: bold\">)</span> to structure the data. The example shows a conversation with three roles:\n",
". **System**: Sets the context by describing the chatbot as <span style=\"color: #008000; text-decoration-color: #008000\">\"Marv is a factual chatbot that is also sarcastic.\"</span>\n",
". **User**: Asks a question, <span style=\"color: #008000; text-decoration-color: #008000\">\"What's the capital of France?\"</span>\n",
". **Assistant**: Responds with a sarcastic answer, <span style=\"color: #008000; text-decoration-color: #008000\">\"Paris, as if everyone doesn't know that already.\"</span>\n",
"Key recommendations for dataset preparation include:\n",
"- Use a set of instructions and prompts that have proven effective for the model before fine-tuning. These should \n",
"be included in every training example.\n",
"- If you choose to shorten instructions or prompts, be aware that more training examples may be needed to achieve \n",
"good results.\n",
"- It is recommended to use <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">50</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">100</span> examples, even though the minimum required is <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">10</span>.\n",
"</pre>\n"
],
"text/plain": [
"**Preparing the Dataset**\n",
" guidance on preparing a dataset for training a chatbot model. It includes an example format using JSONL \u001b[1m(\u001b[0mJSON \n",
"Lines\u001b[1m)\u001b[0m to structure the data. The example shows a conversation with three roles:\n",
". **System**: Sets the context by describing the chatbot as \u001b[32m\"Marv is a factual chatbot that is also sarcastic.\"\u001b[0m\n",
". **User**: Asks a question, \u001b[32m\"What's the capital of France?\"\u001b[0m\n",
". **Assistant**: Responds with a sarcastic answer, \u001b[32m\"Paris, as if everyone doesn't know that already.\"\u001b[0m\n",
"Key recommendations for dataset preparation include:\n",
"- Use a set of instructions and prompts that have proven effective for the model before fine-tuning. These should \n",
"be included in every training example.\n",
"- If you choose to shorten instructions or prompts, be aware that more training examples may be needed to achieve \n",
"good results.\n",
"- It is recommended to use \u001b[1;36m50\u001b[0m-\u001b[1;36m100\u001b[0m examples, even though the minimum required is \u001b[1;36m10\u001b[0m.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">**Best Practices**\n",
". **Curate Examples Carefully**\n",
" - Building datasets can be challenging, so start small and focus on high-quality examples.\n",
" - Use <span style=\"color: #008000; text-decoration-color: #008000\">\"prompt baking\"</span> to generate initial examples.\n",
" - Ensure multi-turn conversations are well-represented.\n",
" - Collect examples to address issues found during evaluation.\n",
" - Balance and diversify your data.\n",
" - Ensure examples contain all necessary information for responses.\n",
". **Iterate on Hyperparameters**\n",
" - Begin with default settings and adjust based on performance.\n",
" - Increase the learning rate multiplier if the model doesn't converge.\n",
" - Increase the number of epochs if the model doesn't follow training data closely.\n",
" - Decrease the number of epochs by <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2</span> if the model becomes less diverse.\n",
". **Establish a Baseline**\n",
" - Start with zero-shot or few-shot prompts to create a baseline before fine-tuning.\n",
". **Automate Your Feedback Pipeline**\n",
" - Use automated evaluations to identify and clean up problem cases for training data.\n",
" - Consider using the G-Eval approach with GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">4</span> for automated testing with a scorecard.\n",
". **Optimize for Latency and Token Efficiency**\n",
" - After establishing a baseline, consider fine-tuning with GPT-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">3.5</span> for similar performance at lower cost and \n",
"latency.\n",
" - Experiment with reducing or removing system instructions in subsequent fine-tuned versions.\n",
"</pre>\n"
],
"text/plain": [
"**Best Practices**\n",
". **Curate Examples Carefully**\n",
" - Building datasets can be challenging, so start small and focus on high-quality examples.\n",
" - Use \u001b[32m\"prompt baking\"\u001b[0m to generate initial examples.\n",
" - Ensure multi-turn conversations are well-represented.\n",
" - Collect examples to address issues found during evaluation.\n",
" - Balance and diversify your data.\n",
" - Ensure examples contain all necessary information for responses.\n",
". **Iterate on Hyperparameters**\n",
" - Begin with default settings and adjust based on performance.\n",
" - Increase the learning rate multiplier if the model doesn't converge.\n",
" - Increase the number of epochs if the model doesn't follow training data closely.\n",
" - Decrease the number of epochs by \u001b[1;36m1\u001b[0m-\u001b[1;36m2\u001b[0m if the model becomes less diverse.\n",
". **Establish a Baseline**\n",
" - Start with zero-shot or few-shot prompts to create a baseline before fine-tuning.\n",
". **Automate Your Feedback Pipeline**\n",
" - Use automated evaluations to identify and clean up problem cases for training data.\n",
" - Consider using the G-Eval approach with GPT-\u001b[1;36m4\u001b[0m for automated testing with a scorecard.\n",
". **Optimize for Latency and Token Efficiency**\n",
" - After establishing a baseline, consider fine-tuning with GPT-\u001b[1;36m3.5\u001b[0m for similar performance at lower cost and \n",
"latency.\n",
" - Experiment with reducing or removing system instructions in subsequent fine-tuned versions.\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"\n",
"-------------------------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\n",
"\n",
"-------------------------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"for c in clean_content:\n",
" print(c)\n",
" print(\"\\n\\n-------------------------------\\n\\n\")"
]
},
{
"cell_type": "code",
"execution_count": 93,
"id": "c183f248",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-weight: bold\">(</span><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">88</span>, <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1</span><span style=\"font-weight: bold\">)</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1m(\u001b[0m\u001b[1;36m88\u001b[0m, \u001b[1;36m1\u001b[0m\u001b[1m)\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>content</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Overview\\nRetrieval-Augmented Generationenhanc...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>What is RAG\\nRetrieve information to Augment t...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>When to use RAG\\nGood for ✅\\nNot good for ❌\\...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Technical patterns\\nData preparation\\nInput pr...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Technical patterns\\nData preparation\\nchunk do...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" content\n",
"0 Overview\\nRetrieval-Augmented Generationenhanc...\n",
"1 What is RAG\\nRetrieve information to Augment t...\n",
"2 When to use RAG\\nGood for ✅\\nNot good for ❌\\...\n",
"3 Technical patterns\\nData preparation\\nInput pr...\n",
"4 Technical patterns\\nData preparation\\nchunk do..."
]
},
"execution_count": 93,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load the cleaned content into a DataFrame before creating the embeddings\n",
"# We'll save to a CSV file here for testing purposes, but this is where you would load the content into your vector database.\n",
"df = pd.DataFrame(clean_content, columns=['content'])\n",
"print(df.shape)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 94,
"id": "99e498ce",
"metadata": {},
"outputs": [],
"source": [
"# Embedding model name, defined once and passed to the API call below.\n",
"embeddings_model = \"text-embedding-3-small\"\n",
"\n",
"def get_embeddings(text):\n",
"    # Request a single embedding vector for the given text.\n",
"    embeddings = client.embeddings.create(\n",
"        model=embeddings_model,\n",
"        input=text,\n",
"        encoding_format=\"float\"\n",
"    )\n",
"    return embeddings.data[0].embedding"
]
},
{
"cell_type": "code",
"execution_count": 95,
"id": "c55ffea5",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>content</th>\n",
" <th>embeddings</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Overview\\nRetrieval-Augmented Generationenhanc...</td>\n",
" <td>[-0.013741373, 0.029359376, 0.054372873, 0.022...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>What is RAG\\nRetrieve information to Augment t...</td>\n",
" <td>[-0.018389475, 0.030965596, 0.0056745913, 0.01...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>When to use RAG\\nGood for ✅\\nNot good for ❌\\...</td>\n",
" <td>[-0.008419483, 0.021529013, -0.0060885856, 0.0...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Technical patterns\\nData preparation\\nInput pr...</td>\n",
" <td>[-0.0034501953, 0.03871357, 0.07771268, 0.0041...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Technical patterns\\nData preparation\\nchunk do...</td>\n",
" <td>[-0.0024594103, 0.023041151, 0.053115055, -0.0...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" content \\\n",
"0 Overview\\nRetrieval-Augmented Generationenhanc... \n",
"1 What is RAG\\nRetrieve information to Augment t... \n",
"2 When to use RAG\\nGood for ✅\\nNot good for ❌\\... \n",
"3 Technical patterns\\nData preparation\\nInput pr... \n",
"4 Technical patterns\\nData preparation\\nchunk do... \n",
"\n",
" embeddings \n",
"0 [-0.013741373, 0.029359376, 0.054372873, 0.022... \n",
"1 [-0.018389475, 0.030965596, 0.0056745913, 0.01... \n",
"2 [-0.008419483, 0.021529013, -0.0060885856, 0.0... \n",
"3 [-0.0034501953, 0.03871357, 0.07771268, 0.0041... \n",
"4 [-0.0024594103, 0.023041151, 0.053115055, -0.0... "
]
},
"execution_count": 95,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
    "df['embeddings'] = df['content'].apply(get_embeddings)\n",
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 97,
"id": "4ed508fc",
"metadata": {},
"outputs": [],
"source": [
"# Saving locally for later\n",
"data_path = \"data/parsed_pdf_docs_with_embeddings.csv\"\n",
"df.to_csv(data_path, index=False)"
]
},
{
"cell_type": "code",
"execution_count": 98,
"id": "b2d46009",
"metadata": {},
"outputs": [],
"source": [
"# Optional: load data from saved file\n",
"df = pd.read_csv(data_path)\n",
"df[\"embeddings\"] = df.embeddings.apply(literal_eval).apply(np.array)"
]
},
{
"cell_type": "markdown",
"id": "20f28788",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"metadata": {},
"source": [
"## Retrieval-augmented generation\n",
"\n",
"The last step of the process is to generate outputs in response to input queries, after retrieving content as context to reply."
]
},
{
"cell_type": "code",
"execution_count": 99,
"id": "d7edc01b",
"metadata": {},
"outputs": [],
"source": [
"system_prompt = '''\n",
" You will be provided with an input prompt and content as context that can be used to reply to the prompt.\n",
" \n",
" You will do 2 things:\n",
" \n",
" 1. First, you will internally assess whether the content provided is relevant to reply to the input prompt. \n",
" \n",
" 2a. If that is the case, answer directly using this content. If the content is relevant, use elements found in the content to craft a reply to the input prompt.\n",
"\n",
" 2b. If the content is not relevant, use your own knowledge to reply or say that you don't know how to respond if your knowledge is not sufficient to answer.\n",
" \n",
" Stay concise with your answer, replying specifically to the input prompt without mentioning additional information provided in the context content.\n",
"'''\n",
"\n",
"model=\"gpt-4o\"\n",
"\n",
"def search_content(df, input_text, top_k):\n",
"    # Embed the query, score every row against it, and keep the top_k best matches\n",
"    embedded_value = get_embeddings(input_text)\n",
"    df[\"similarity\"] = df.embeddings.apply(lambda x: cosine_similarity(np.array(x).reshape(1,-1), np.array(embedded_value).reshape(1, -1)))\n",
"    res = df.sort_values('similarity', ascending=False).head(top_k)\n",
"    return res\n",
"\n",
"def get_similarity(row):\n",
"    # cosine_similarity returns a 1x1 array, so unwrap it to a scalar\n",
"    similarity_score = row['similarity']\n",
"    if isinstance(similarity_score, np.ndarray):\n",
"        similarity_score = similarity_score[0][0]\n",
"    return similarity_score\n",
"\n",
"def generate_output(input_prompt, similar_content, threshold=0.5):\n",
"\n",
"    # Always use the best match as context\n",
"    content = similar_content.iloc[0]['content']\n",
"\n",
"    # Add further matches if their similarity is above the threshold.\n",
"    # The best match is already included, so start from the second row.\n",
"    if len(similar_content) > 1:\n",
"        for i, row in similar_content.iloc[1:].iterrows():\n",
"            similarity_score = get_similarity(row)\n",
"            if similarity_score > threshold:\n",
"                content += f\"\\n\\n{row['content']}\"\n",
"\n",
"    prompt = f\"INPUT PROMPT:\\n{input_prompt}\\n-------\\nCONTENT:\\n{content}\"\n",
"\n",
"    completion = client.chat.completions.create(\n",
"        model=model,\n",
"        temperature=0.5,\n",
"        messages=[\n",
"            {\n",
"                \"role\": \"system\",\n",
"                \"content\": system_prompt\n",
"            },\n",
"            {\n",
"                \"role\": \"user\",\n",
"                \"content\": prompt\n",
"            }\n",
"        ]\n",
"    )\n",
"\n",
"    return completion.choices[0].message.content"
]
},
{
"cell_type": "code",
"execution_count": 100,
"id": "54f9fb11",
"metadata": {},
"outputs": [],
"source": [
"# Example user queries related to the content\n",
"example_inputs = [\n",
" 'What are the main models you offer?',\n",
" 'Do you have a speech recognition model?',\n",
" 'Which embedding model should I use for non-English use cases?',\n",
" 'Can I introduce new knowledge in my LLM app using RAG?',\n",
" 'How many examples do I need to fine-tune a model?',\n",
" 'Which metric can I use to evaluate a summarization task?',\n",
" 'Give me a detailed example for an evaluation process where we are looking for a clear answer to compare to a ground truth.',\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 101,
"id": "313d2f7e",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> What are the main models you offer?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m What are the main models you offer?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.42</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.42\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">LATEST MODELS</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">This document outlines the latest models available for different endpoints in the Open...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mLATEST MODELS\u001b[0m\n",
"\u001b[38;5;59mThis document outlines the latest models available for different endpoints in the Open\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.39</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.39\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">26</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">02</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">2024</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">, </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">17:58</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Models - OpenAI API</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">The Moderation models are designed to check whether content co...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59m26\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m02\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m2024\u001b[0m\u001b[38;5;59m, \u001b[0m\u001b[1;38;5;59m17:58\u001b[0m\n",
"\u001b[38;5;59mModels - OpenAI API\u001b[0m\n",
"\u001b[38;5;59mThe Moderation models are designed to check whether content co\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.38</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.38\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">26</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">02</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">2024</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">, </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">17:58</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Models - OpenAI API</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">MODEL</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">DE S CRIPTION</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">tts-</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">New Text-to-speech </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">The latest tex...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59m26\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m02\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m2024\u001b[0m\u001b[38;5;59m, \u001b[0m\u001b[1;38;5;59m17:58\u001b[0m\n",
"\u001b[38;5;59mModels - OpenAI API\u001b[0m\n",
"\u001b[38;5;59mMODEL\u001b[0m\n",
"\u001b[38;5;59mDE S CRIPTION\u001b[0m\n",
"\u001b[38;5;59mtts-\u001b[0m\u001b[1;38;5;59m1\u001b[0m\n",
"\u001b[38;5;59mNew Text-to-speech \u001b[0m\u001b[1;38;5;59m1\u001b[0m\n",
"\u001b[38;5;59mThe latest tex\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">We offer the following main models:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">- **/v1/completions </span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">(</span><span style=\"color: #00875f; text-decoration-color: #00875f\">Legacy</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">)</span><span style=\"color: #00875f; text-decoration-color: #00875f\">**: `gpt-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">3.5</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-turbo-instruct`, `babbage-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">002</span><span style=\"color: #00875f; text-decoration-color: #00875f\">`, `davinci-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">002</span><span style=\"color: #00875f; text-decoration-color: #00875f\">`</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">- **/v1/embeddings**: `text-embedding-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">3</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-small`, `text-embedding-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">3</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-large`, `text-embedding-ada-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">002</span><span style=\"color: #00875f; text-decoration-color: #00875f\">`</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">- **/v1/fine_tuning/jobs**: `gpt-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">3.5</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-turbo`, `babbage-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">002</span><span style=\"color: #00875f; text-decoration-color: #00875f\">`, `davinci-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">002</span><span style=\"color: #00875f; text-decoration-color: #00875f\">`</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">- **/v1/moderations**: `text-moderation-stable`</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">Additionally, there are enhanced versions like `gpt-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">3.5</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-turbo-16k` and other fine-tuned models.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mWe offer the following main models:\u001b[0m\n",
"\n",
"\u001b[38;5;29m- **\u001b[0m\u001b[38;5;29m/v1/\u001b[0m\u001b[38;5;29mcompletions\u001b[0m\u001b[38;5;29m \u001b[0m\u001b[1;38;5;29m(\u001b[0m\u001b[38;5;29mLegacy\u001b[0m\u001b[1;38;5;29m)\u001b[0m\u001b[38;5;29m**: `gpt-\u001b[0m\u001b[1;38;5;29m3.5\u001b[0m\u001b[38;5;29m-turbo-instruct`, `babbage-\u001b[0m\u001b[1;38;5;29m002\u001b[0m\u001b[38;5;29m`, `davinci-\u001b[0m\u001b[1;38;5;29m002\u001b[0m\u001b[38;5;29m`\u001b[0m\n",
"\u001b[38;5;29m- **\u001b[0m\u001b[38;5;29m/v1/\u001b[0m\u001b[38;5;29membeddings\u001b[0m\u001b[38;5;29m**: `text-embedding-\u001b[0m\u001b[1;38;5;29m3\u001b[0m\u001b[38;5;29m-small`, `text-embedding-\u001b[0m\u001b[1;38;5;29m3\u001b[0m\u001b[38;5;29m-large`, `text-embedding-ada-\u001b[0m\u001b[1;38;5;29m002\u001b[0m\u001b[38;5;29m`\u001b[0m\n",
"\u001b[38;5;29m- **\u001b[0m\u001b[38;5;29m/v1/fine_tuning/\u001b[0m\u001b[38;5;29mjobs\u001b[0m\u001b[38;5;29m**: `gpt-\u001b[0m\u001b[1;38;5;29m3.5\u001b[0m\u001b[38;5;29m-turbo`, `babbage-\u001b[0m\u001b[1;38;5;29m002\u001b[0m\u001b[38;5;29m`, `davinci-\u001b[0m\u001b[1;38;5;29m002\u001b[0m\u001b[38;5;29m`\u001b[0m\n",
"\u001b[38;5;29m- **\u001b[0m\u001b[38;5;29m/v1/\u001b[0m\u001b[38;5;29mmoderations\u001b[0m\u001b[38;5;29m**: `text-moderation-stable`\u001b[0m\n",
"\n",
"\u001b[38;5;29mAdditionally, there are enhanced versions like `gpt-\u001b[0m\u001b[1;38;5;29m3.5\u001b[0m\u001b[38;5;29m-turbo-16k` and other fine-tuned models.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> Do you have a speech recognition model?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m Do you have a speech recognition model?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.51</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.51\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Models - OpenAI API**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Text-to-Speech Models:**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">. **tts-</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**: This is a new text-to-speech model o...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59m**Models - OpenAI API**\u001b[0m\n",
"\u001b[38;5;59m**Text-to-Speech Models:**\u001b[0m\n",
"\u001b[38;5;59m. **tts-\u001b[0m\u001b[1;38;5;59m1\u001b[0m\u001b[38;5;59m**: This is a new text-to-speech model o\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.50</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.50\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">26</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">02</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">2024</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">, </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">17:58</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Models - OpenAI API</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">MODEL</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">DE S CRIPTION</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">tts-</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">New Text-to-speech </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">The latest tex...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59m26\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m02\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m2024\u001b[0m\u001b[38;5;59m, \u001b[0m\u001b[1;38;5;59m17:58\u001b[0m\n",
"\u001b[38;5;59mModels - OpenAI API\u001b[0m\n",
"\u001b[38;5;59mMODEL\u001b[0m\n",
"\u001b[38;5;59mDE S CRIPTION\u001b[0m\n",
"\u001b[38;5;59mtts-\u001b[0m\u001b[1;38;5;59m1\u001b[0m\n",
"\u001b[38;5;59mNew Text-to-speech \u001b[0m\u001b[1;38;5;59m1\u001b[0m\n",
"\u001b[38;5;59mThe latest tex\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.44</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.44\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">26</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">02</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">2024</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">, </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">17:58</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Models - OpenAI API</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">ENDP OINT</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">DATA USED</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">FOR TRAINING</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">DEFAULT</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">RETENTION</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">ELIGIBLE FO...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59m26\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m02\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m2024\u001b[0m\u001b[38;5;59m, \u001b[0m\u001b[1;38;5;59m17:58\u001b[0m\n",
"\u001b[38;5;59mModels - OpenAI API\u001b[0m\n",
"\u001b[38;5;59mENDP OINT\u001b[0m\n",
"\u001b[38;5;59mDATA USED\u001b[0m\n",
"\u001b[38;5;59mFOR TRAINING\u001b[0m\n",
"\u001b[38;5;59mDEFAULT\u001b[0m\n",
"\u001b[38;5;59mRETENTION\u001b[0m\n",
"\u001b[38;5;59mELIGIBLE FO\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">Yes, there is a speech recognition model called Whisper, which is capable of handling diverse audio inputs and </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">supports multilingual speech recognition, speech translation, and language identification. The Whisper v2-large </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">model is accessible via the API under the name </span><span style=\"color: #00875f; text-decoration-color: #00875f\">\"whisper-1.\"</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mYes, there is a speech recognition model called Whisper, which is capable of handling diverse audio inputs and \u001b[0m\n",
"\u001b[38;5;29msupports multilingual speech recognition, speech translation, and language identification. The Whisper v2-large \u001b[0m\n",
"\u001b[38;5;29mmodel is accessible via the API under the name \u001b[0m\u001b[38;5;29m\"whisper-1.\"\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> Which embedding model should I use for non-English use cases?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m Which embedding model should I use for non-English use cases?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.49</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.49\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\"># Technical Patterns: Data Preparation - Embeddings</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">## What to Embed?</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">When preparing data for embedd...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59m# Technical Patterns: Data Preparation - Embeddings\u001b[0m\n",
"\u001b[38;5;59m## What to Embed?\u001b[0m\n",
"\u001b[38;5;59mWhen preparing data for embedd\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.48</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.48\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Technical patterns</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Data preparation: embeddings</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">What to embed?</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Depending on your use caseyou might n...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mTechnical patterns\u001b[0m\n",
"\u001b[38;5;59mData preparation: embeddings\u001b[0m\n",
"\u001b[38;5;59mWhat to embed?\u001b[0m\n",
"\u001b[38;5;59mDepending on your use caseyou might n\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.48</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.48\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Models - OpenAI API**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Text-to-Speech Models:**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">. **tts-</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">1</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**: This is a new text-to-speech model o...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59m**Models - OpenAI API**\u001b[0m\n",
"\u001b[38;5;59m**Text-to-Speech Models:**\u001b[0m\n",
"\u001b[38;5;59m. **tts-\u001b[0m\u001b[1;38;5;59m1\u001b[0m\u001b[38;5;59m**: This is a new text-to-speech model o\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">The content provided does not address which embedding model to use for non-English use cases. For non-English use </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">cases, you might consider using multilingual models like Google's mBERT or Facebook's XLM-R, which are designed to </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">handle multiple languages effectively.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mThe content provided does not address which embedding model to use for non-English use cases. For non-English use \u001b[0m\n",
"\u001b[38;5;29mcases, you might consider using multilingual models like Google's mBERT or Facebook's XLM-R, which are designed to \u001b[0m\n",
"\u001b[38;5;29mhandle multiple languages effectively.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> Can I introduce new knowledge in my LLM app using RAG?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m Can I introduce new knowledge in my LLM app using RAG?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.54</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.54\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">What is RAG</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Retrieve information to Augment the models knowledge and Generate the output</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">“What is y...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhat is RAG\u001b[0m\n",
"\u001b[38;5;59mRetrieve information to Augment the models knowledge and Generate the output\u001b[0m\n",
"\u001b[38;5;59m“What is y\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.50</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.50\u001b[0m\n"
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Overview**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Retrieval-Augmented Generation </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">(</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">RAG</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">)</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\"> enhances language models by integrating them with ...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59m**Overview**\u001b[0m\n",
"\u001b[38;5;59mRetrieval-Augmented Generation \u001b[0m\u001b[1;38;5;59m(\u001b[0m\u001b[38;5;59mRAG\u001b[0m\u001b[1;38;5;59m)\u001b[0m\u001b[38;5;59m enhances language models by integrating them with \u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.49</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.49\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">When to use RAG</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Good for ✅</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Not good for ❌</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Introducing new information to the model</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Teaching ...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhen to use RAG\u001b[0m\n",
"\u001b[38;5;59mGood for ✅\u001b[0m\n",
"\u001b[38;5;59mNot good for ❌\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59mIntroducing new information to the model\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59mTeaching \u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">Yes, you can introduce new knowledge in your LLM app using Retrieval-Augmented Generation </span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">(</span><span style=\"color: #00875f; text-decoration-color: #00875f\">RAG</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">)</span><span style=\"color: #00875f; text-decoration-color: #00875f\">. This method allows</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">the language model to access external knowledge sources, enhancing its responses with up-to-date and contextually </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">relevant information.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mYes, you can introduce new knowledge in your LLM app using Retrieval-Augmented Generation \u001b[0m\u001b[1;38;5;29m(\u001b[0m\u001b[38;5;29mRAG\u001b[0m\u001b[1;38;5;29m)\u001b[0m\u001b[38;5;29m. This method allows\u001b[0m\n",
"\u001b[38;5;29mthe language model to access external knowledge sources, enhancing its responses with up-to-date and contextually \u001b[0m\n",
"\u001b[38;5;29mrelevant information.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> How many examples do I need to fine-tune a model?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m How many examples do I need to fine-tune a model?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.71</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.71\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">What is Fine-tuning</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Public Model</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Training data</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Training</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Fine-tunedmodel</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Fine-tuning a model consists...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhat is Fine-tuning\u001b[0m\n",
"\u001b[38;5;59mPublic Model\u001b[0m\n",
"\u001b[38;5;59mTraining data\u001b[0m\n",
"\u001b[38;5;59mTraining\u001b[0m\n",
"\u001b[38;5;59mFine-tunedmodel\u001b[0m\n",
"\u001b[38;5;59mFine-tuning a model consists\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.62</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.62\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">When to Fine-Tune</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">**Good for:**</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">- **Following a given format or tone for the output:** Fine-tuning i...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhen to Fine-Tune\u001b[0m\n",
"\u001b[38;5;59m**Good for:**\u001b[0m\n",
"\u001b[38;5;59m- **Following a given format or tone for the output:** Fine-tuning i\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.60</span>\n",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.60\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Best practices</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Curate examples carefully</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Datasets can be difficult to build, startsmall and invest int...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mBest practices\u001b[0m\n",
"\u001b[38;5;59mCurate examples carefully\u001b[0m\n",
"\u001b[38;5;59mDatasets can be difficult to build, startsmall and invest int\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">For effective fine-tuning of a model, it is recommended to use </span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">50</span><span style=\"color: #00875f; text-decoration-color: #00875f\">-</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">100</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> examples. However, the minimum requirement is</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">10</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> examples.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mFor effective fine-tuning of a model, it is recommended to use \u001b[0m\u001b[1;38;5;29m50\u001b[0m\u001b[38;5;29m-\u001b[0m\u001b[1;38;5;29m100\u001b[0m\u001b[38;5;29m examples. However, the minimum requirement is\u001b[0m\n",
"\u001b[1;38;5;29m10\u001b[0m\u001b[38;5;29m examples.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> Which metric can I use to evaluate a summarization task?</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m Which metric can I use to evaluate a summarization task?\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.61</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.61\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Technical Patterns: Metric-based Evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">ROUGE is a common metric for evaluating machine summari...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mTechnical Patterns: Metric-based Evaluations\u001b[0m\n",
"\u001b[38;5;59mROUGE is a common metric for evaluating machine summari\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.54</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.54\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Technical patterns</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Metric-based evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">ROUGE is a common metric for evaluating machine summariz...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mTechnical patterns\u001b[0m\n",
"\u001b[38;5;59mMetric-based evaluations\u001b[0m\n",
"\u001b[38;5;59mROUGE is a common metric for evaluating machine summariz\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.48</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.48\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Technical patterns</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Metric-based evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Component evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Subjective evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Compari...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mTechnical patterns\u001b[0m\n",
"\u001b[38;5;59mMetric-based evaluations\u001b[0m\n",
"\u001b[38;5;59mComponent evaluations\u001b[0m\n",
"\u001b[38;5;59mSubjective evaluations\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59mCompari\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">You can use the ROUGE metric to evaluate a summarization task. ROUGE assesses the quality of summaries by comparing</span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">them to reference summaries, quantifying how well a machine-generated summary captures the essence of the original </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">text.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mYou can use the ROUGE metric to evaluate a summarization task. ROUGE assesses the quality of summaries by comparing\u001b[0m\n",
"\u001b[38;5;29mthem to reference summaries, quantifying how well a machine-generated summary captures the essence of the original \u001b[0m\n",
"\u001b[38;5;29mtext.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #af005f; text-decoration-color: #af005f; font-weight: bold\">QUERY:</span><span style=\"color: #af005f; text-decoration-color: #af005f\"> Give me a detailed example for an evaluation process where we are looking for a clear answer to compare to a</span>\n",
"<span style=\"color: #af005f; text-decoration-color: #af005f\">ground truth.</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;125mQUERY:\u001b[0m\u001b[38;5;125m Give me a detailed example for an evaluation process where we are looking for a clear answer to compare to a\u001b[0m\n",
"\u001b[38;5;125mground truth.\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">Matching content:</span>\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;59mMatching content:\u001b[0m\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.56</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.56\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">What are evals</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Example</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Our ground truth matches the predicted answer, so the evaluation passes!</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Eval...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhat are evals\u001b[0m\n",
"\u001b[38;5;59mExample\u001b[0m\n",
"\u001b[38;5;59mOur ground truth matches the predicted answer, so the evaluation passes!\u001b[0m\n",
"\u001b[38;5;59mEval\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.55</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.55\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">What are evals</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Example</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">An evaluation contains a question and a correct answer. We call this the grou...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mWhat are evals\u001b[0m\n",
"\u001b[38;5;59mExample\u001b[0m\n",
"\u001b[38;5;59mAn evaluation contains a question and a correct answer. We call this the grou\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-style: italic\">Similarity: </span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold; font-style: italic\">0.55</span>\n",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"</pre>\n"
],
"text/plain": [
"\u001b[3;38;5;59mSimilarity: \u001b[0m\u001b[1;3;38;5;59m0.55\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Technical patterns</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Metric-based evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Component evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Subjective evaluations</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">●</span>\n",
"<span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">Compari...</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">[</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f\">/</span><span style=\"color: #5f5f5f; text-decoration-color: #5f5f5f; font-weight: bold\">]</span>\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[38;5;59mTechnical patterns\u001b[0m\n",
"\u001b[38;5;59mMetric-based evaluations\u001b[0m\n",
"\u001b[38;5;59mComponent evaluations\u001b[0m\n",
"\u001b[38;5;59mSubjective evaluations\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59m●\u001b[0m\n",
"\u001b[38;5;59mCompari\u001b[0m\u001b[38;5;59m...\u001b[0m\u001b[1;38;5;59m[\u001b[0m\u001b[38;5;59m/\u001b[0m\u001b[1;38;5;59m]\u001b[0m\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008787; text-decoration-color: #008787; font-weight: bold\">REPLY:</span>\n",
"\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">An example of an evaluation process where we look for a clear answer to compare to a ground truth is when </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">determining the population of a country. In this case, the question is </span><span style=\"color: #00875f; text-decoration-color: #00875f\">\"What is the population of Canada?\"</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> The </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">ground truth, or correct answer, is that the population of Canada in </span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">2023</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> is </span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">39</span><span style=\"color: #00875f; text-decoration-color: #00875f\">,</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">566</span><span style=\"color: #00875f; text-decoration-color: #00875f\">,</span><span style=\"color: #00875f; text-decoration-color: #00875f; font-weight: bold\">248</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> people. The predicted </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">answer, obtained through a search tool, is </span><span style=\"color: #00875f; text-decoration-color: #00875f\">\"There are 39,566,248 people in Canada as of 2023.\"</span><span style=\"color: #00875f; text-decoration-color: #00875f\"> Since the predicted </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">answer matches the ground truth, the evaluation is successful. This process is used to verify the accuracy of </span>\n",
"<span style=\"color: #00875f; text-decoration-color: #00875f\">information provided by a language model or other predictive tools.</span>\n",
"\n",
"--------------\n",
"\n",
"\n",
"</pre>\n"
],
"text/plain": [
"\u001b[1;38;5;30mREPLY:\u001b[0m\n",
"\n",
"\u001b[38;5;29mAn example of an evaluation process where we look for a clear answer to compare to a ground truth is when \u001b[0m\n",
"\u001b[38;5;29mdetermining the population of a country. In this case, the question is \u001b[0m\u001b[38;5;29m\"What is the population of Canada?\"\u001b[0m\u001b[38;5;29m The \u001b[0m\n",
"\u001b[38;5;29mground truth, or correct answer, is that the population of Canada in \u001b[0m\u001b[1;38;5;29m2023\u001b[0m\u001b[38;5;29m is \u001b[0m\u001b[1;38;5;29m39\u001b[0m\u001b[38;5;29m,\u001b[0m\u001b[1;38;5;29m566\u001b[0m\u001b[38;5;29m,\u001b[0m\u001b[1;38;5;29m248\u001b[0m\u001b[38;5;29m people. The predicted \u001b[0m\n",
"\u001b[38;5;29manswer, obtained through a search tool, is \u001b[0m\u001b[38;5;29m\"There are 39,566,248 people in Canada as of 2023.\"\u001b[0m\u001b[38;5;29m Since the predicted \u001b[0m\n",
"\u001b[38;5;29manswer matches the ground truth, the evaluation is successful. This process is used to verify the accuracy of \u001b[0m\n",
"\u001b[38;5;29minformation provided by a language model or other predictive tools.\u001b[0m\n",
"\n",
"--------------\n",
"\n",
"\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Running the RAG pipeline on each example\n",
"for ex in example_inputs:\n",
"    print(f\"[deep_pink4][bold]QUERY:[/bold] {ex}[/deep_pink4]\\n\\n\")\n",
"    # Retrieve the top 3 rows most similar to the query\n",
"    matching_content = search_content(df, ex, 3)\n",
"    print(\"[grey37][b]Matching content:[/b][/grey37]\\n\")\n",
"    for _, match in matching_content.iterrows():\n",
"        print(f\"[grey37][i]Similarity: {get_similarity(match):.2f}[/i][/grey37]\")\n",
"        # Show a preview of at most 100 characters of each match\n",
"        print(f\"[grey37]{match['content'][:100]}{'...' if len(match['content']) > 100 else ''}[/grey37]\\n\\n\")\n",
"    # Generate the answer from the query and the retrieved content\n",
"    reply = generate_output(ex, matching_content)\n",
"    print(f\"[turquoise4][b]REPLY:[/b][/turquoise4]\\n\\n[spring_green4]{reply}[/spring_green4]\\n\\n--------------\\n\\n\")"
]
},
{
"cell_type": "markdown",
"id": "0d0d14e4",
"metadata": {},
"source": [
"## Wrapping up\n",
"\n",
"In this notebook, we have learned how to develop a basic RAG pipeline based on PDF documents. This includes:\n",
"\n",
"- How to parse PDF documents, taking slide decks and an export from an HTML page as examples, using a Python library as well as GPT-4o to interpret the visuals\n",
Co-authored-by: Dhruv Singh <ds3638@columbia.edu> Co-authored-by: Adam Hendel <ChuckHend@users.noreply.github.com> Co-authored-by: Enoch Cheung <enoch@enochc.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: rissois <44072214+rissois@users.noreply.github.com> Co-authored-by: ayush rajgor <ayushrajgorar@gmail.com> Co-authored-by: teomusatoiu <156829031+teomusatoiu@users.noreply.github.com> Co-authored-by: James Briggs <35938317+jamescalam@users.noreply.github.com> Co-authored-by: Shivam Rastogi <shivamsupr@gmail.com> Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Elmira Ghorbani <elmira.ghorbani96@gmail.com> Co-authored-by: gloryjain <glory@openai.com> Co-authored-by: Andrew Peng <apeng@berkeley.edu>
2024-02-29 13:54:06 +00:00
"- How to process the extracted content, clean it and chunk it into several pieces\n",
"- How to embed the processed content using OpenAI embeddings\n",
"- How to retrieve content that is relevant to an input query\n",
"- How to use GPT-4o to generate an answer using the retrieved content as context\n",
"\n",
"If you want to explore further, consider these optimisations:\n",
"\n",
"- Experimenting with the prompts provided as examples\n",
"- Chunking the content further and adding metadata as context to each chunk\n",
"- Adding rule-based filtering on the retrieval results or re-ranking results to surface the most relevant content\n",
"\n",
"You can apply the techniques covered in this notebook to multiple use cases, such as assistants that can access your proprietary data, customer service or FAQ bots that can read from your internal policies, or anything that requires leveraging rich documents that would be better understood as images."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}