49 Commits

Author SHA1 Message Date
Adityavardhan Agrawal
5ae396cddb Squashed changes from hosted-service 2025-04-03 12:54:54 -07:00
Adityavardhan Agrawal
bf7c90164f change document endpoint to POST, and adjust API calls for consistency 2025-04-03 10:47:34 -07:00
Adityavardhan Agrawal
08893733f6
Add custom prompt and example injections for query and graph creation (#68) 2025-03-31 21:30:48 -07:00
Adityavardhan Agrawal
f3a0ea7876
Add update graphs, and custom open ai url (#63) 2025-03-29 23:22:47 -07:00
Adityavardhan Agrawal
9ce0507616
Add delete document endpoint (#62) 2025-03-29 18:42:52 -07:00
Adityavardhan Agrawal
6ef3ec207e Reduce extra logging, change to debugs 2025-03-27 20:05:27 -07:00
Adityavardhan Agrawal
7eb5887d2f
Add hosted tier limits, cloud uri gen (#59) 2025-03-27 17:30:02 -07:00
Adityavardhan Agrawal
adc0b2dbb8
Add batch ingestion (#55) 2025-03-18 23:27:53 -04:00
Adityavardhan Agrawal
4ae132ff46
Implement knowledge graphs, and graph enhanced querying (#48) 2025-03-17 17:36:43 -04:00
Adityavardhan Agrawal
32a5d787fe
Add update methods with add update strategy (#53) 2025-03-13 11:26:01 -04:00
Adityavardhan Agrawal
38683df0f3
Add completion sources and batch retrieval for docs and chunks (#51) 2025-03-09 18:42:04 -04:00
LukeZekes
e56691a1c5
add filename option for text documents (#47) 2025-03-05 10:56:02 -05:00
Arnav Agrawal
821e9d7e20
Add support for ColPali (#43)
* debug mps not supported

* further debug (i think i lost some braincells)

* fix mps bug and resolve dependency issues

* remove libmagic dependence

* add colpali embedding model

* multi-vector store works - verified with testing

* add integration testing

* support text embedding in colpali

* complete colpali integration and testing

* formatting + some PR comments

* remove experimental files

* resolve PR comments
2025-02-26 20:17:12 -05:00
Adityavardhan Agrawal
c8ed46b12b
Separate parsing and chunking into different function for easy rules processing (#41) 2025-02-15 13:02:15 -05:00
Adityavardhan Agrawal
a46fa064c7
Add natural language rules based ingestion (#34) 2025-02-07 21:08:40 -05:00
Arnav Agrawal
d124e6aa0d
Add support for cache-augmented-generation (#30) 2025-01-28 23:49:28 -05:00
Adityavardhan Agrawal
f4c14fc71b
Streamline dev experience with optional auth and simplified config (#27) 2025-01-11 11:24:00 -05:00
Adityavardhan Agrawal
0a933e5fd6
Add docker support (#24) 2025-01-09 05:17:25 -05:00
Arnav Agrawal
f72f6f0249
Config improvements (#17) 2025-01-07 01:42:10 -05:00
Arnav Agrawal
c3726504f7
add support for PostgreSQL and pgvector (#15)
Co-authored-by: Adityavardhan Agrawal <aa729@cornell.edu>
2025-01-04 08:14:52 -05:00
Adityavardhan Agrawal
273dfcc5e7
Add PostgreSQL support (#13)
Co-authored-by: Arnav Agrawal <aa779@cornell.edu>
2025-01-04 08:11:09 -05:00
Arnav Agrawal
20faae8903
Add reranking (#14) 2025-01-02 03:42:47 -05:00
Arnav Agrawal
48e6aeb8b7
use local unstructured by default (#12) 2025-01-01 09:18:23 -05:00
Arnav Agrawal
abccf99974
add contextual embedding with claude prompt caching (#11)
* add context augmentation while chunking

* add contextual embeddings

* default config should be combined

* fix comments on PR

* update example environment

* update config and api to support env-variable optionality
2024-12-31 06:58:34 -05:00
Adityavardhan Agrawal
367dc079e8
Add local file system for storage (#10) 2024-12-31 06:25:51 -05:00
Adityavardhan Agrawal
3e4a9999ad
Add open telemetry and shell (#5) 2024-12-30 23:52:25 -05:00
Arnav Agrawal
0e4a43645a reformat files 2024-12-29 12:48:41 +05:30
Arnav Agrawal
16e5decc4b fix linting issues 2024-12-28 17:29:33 +05:30
Arnav Agrawal
b883f52a11 add ollama embeddings and test them out 2024-12-27 12:17:16 +05:30
Arnav Agrawal
418054e9a3 update configuration style to support easy model editing 2024-12-27 11:19:07 +05:30
Arnav Agrawal
13ab54fbf8
add a video parser + formatting changes (#4) 2024-12-26 11:34:24 -05:00
Adityavardhan Agrawal
03345dcc07
Add completions API (#3) 2024-12-26 08:52:25 -05:00
Arnav Agrawal
4f2f221d40 bug fixes and end-to-end testing 2024-12-17 21:40:38 -05:00
Adityavardhan Agrawal
df8d7fcdd0
refactor some stuff (#2)
* refactor some stuff, remove bare try catches
2024-12-15 14:31:25 -05:00
Adityavardhan Agrawal
251e38828a clean up 2024-12-04 20:26:14 -05:00
Adityavardhan Agrawal
1f68fb99d3 sdk and querying in api works 2024-12-03 21:46:25 -05:00
Arnav Agrawal
f1f52d9b67 pass api tests 2024-12-02 20:03:35 -05:00
Arnav Agrawal
000887a4dc pass all tests apart from querying 2024-11-28 19:09:40 -05:00
Adityavardhan Agrawal
983a4ee854 separate text and doc ingestion pathways 2024-11-24 14:29:25 -05:00
Adityavardhan Agrawal
84f620437d fix imports 2024-11-23 13:49:19 -05:00
Adityavardhan Agrawal
f7ecdc708d SDK changes and tests 2024-11-23 13:32:47 -05:00
Adityavardhan Agrawal
d70f53cf86 system changes 2024-11-22 20:58:17 -05:00
Adityavardhan Agrawal
c3cb888aaa add get document by id 2024-11-20 18:42:19 -05:00
Arnav Agrawal
d47240ea5c add presigned urls fix multiple doc bug 2024-11-18 20:37:37 -05:00
Arnav Agrawal
ab4fd6def2 add document retrieval endpoint 2024-11-18 18:41:23 -05:00
Arnav Agrawal
56fe944326 add s3 uploading and fly deploy 2024-11-18 10:45:07 -05:00
Adityavardhan Agrawal
97ac012364 pdf ingestion, simplify doc, remove chunk id, content type 2024-11-17 15:38:03 -05:00
Adityavardhan Agrawal
3251236fbe basic usage for string content works 2024-11-16 14:37:01 -05:00
Adityavardhan Agrawal
1a926c7be0 restructuring and WIP api and sdk changes 2024-11-16 01:48:15 -05:00