dependabot[bot]
0f7d648975
Bump tornado from 6.2 to 6.3.2 in /apps/web-crawl-q-and-a (#459)
...
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.2 to 6.3.2.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.2.0...v6.3.2)
---
updated-dependencies:
- dependency-name: tornado
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-11 16:05:37 -07:00
DevilsWorkShop
39b62a6c09
Catch the exception thrown by the With.Open and continue with the queue (#155)
...
Co-authored-by: Ashok Manghat <amanghat@rmplc.net>
2023-09-11 15:55:57 -07:00
ys64
7cda7e2df7
add the last chunk to the list of chunks in web-qa.ipynb (#691)
2023-09-11 14:54:40 -07:00
Simón Fishman
b2ca4d395c
Revert "File name sanitization (#630)" (#668)
...
This reverts commit 169f5e02c8ab13372bb066263424f9ddb31f7f9f.
2023-08-29 17:45:47 -07:00
Safa Asgar
169f5e02c8
File name sanitization (#630)
...
* File name sanitization
URL containing reserved characters blocks file name creation.
* Regular Expression fix for Sanitized URL
Co-authored-by: Simón Fishman <simonpfish@gmail.com>
---------
Co-authored-by: Simón Fishman <simonpfish@gmail.com>
2023-08-29 10:49:23 -07:00
Tomas Dulka
4fd2b1a6d2
replace eval with safer literal_eval (#561)
2023-07-17 16:40:54 -07:00
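The idea behind the commit title above can be sketched as follows. This is not the commit's actual diff: the sample string and the `parse_or_none` helper are hypothetical, chosen only to show why `ast.literal_eval` is the safer choice when reading serialized Python literals back from text.

```python
import ast

# Hypothetical serialized value, e.g. a list stored as text in a file.
serialized = "[0.12, -0.5, 3.0]"

# ast.literal_eval accepts only Python literals (numbers, strings, lists, dicts, ...).
values = ast.literal_eval(serialized)


def parse_or_none(text):
    """Return the parsed literal, or None if text is not a plain literal (illustrative helper)."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


# Unlike eval, an expression containing a function call is rejected, not executed.
rejected = parse_or_none("__import__('os').getcwd()")
```

With `eval`, the second string would have been executed as code; with `literal_eval` it is refused.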
Darshan Panchal
e66613331a
Update requirements.txt
...
removed html since it was not required
2023-05-11 09:21:34 +05:30
Alexander Khapaev
ee9b6268d4
Updated the get_domain_hyperlinks function to include handling of tel: links in addition to mailto: links, to exclude them from the clean links list.
2023-04-07 18:28:44 +03:00
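A minimal sketch of the filtering this commit describes, assuming links are plain strings. The helper name `drop_non_page_links` and the sample URLs are hypothetical and are not taken from `get_domain_hyperlinks` itself.

```python
# Schemes that do not point at crawlable pages (the two named in the commit message).
NON_PAGE_PREFIXES = ("mailto:", "tel:")


def drop_non_page_links(links):
    """Keep only links that do not start with a mailto: or tel: scheme (illustrative helper)."""
    return [
        link for link in links
        if not link.lower().startswith(NON_PAGE_PREFIXES)
    ]


# Hypothetical sample input.
sample = [
    "https://example.com/docs",
    "mailto:team@example.com",
    "tel:+15550100",
    "/pricing",
]
clean_links = drop_non_page_links(sample)
```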
fabiofranco85
5a80ef2571
Improve regex
2023-03-27 07:38:35 -03:00
William Buck
ca9b9d485d
remove duplicate import of distances_from_embeddings
2023-03-20 13:02:37 -07:00
Sung Kim
3210b38e35
Add handling for last chunk in split_into_sentences function
...
I have added handling for the last chunk in the split_into_sentences function. Previously, the function did not account for the last chunk, which could lead to incomplete sentences in the output.
To solve this, I added a conditional statement to check if the last chunk is non-empty. If it is, I append it to the list of chunks with a period to ensure the last sentence is complete.
This change improves the accuracy of the split_into_sentences function and ensures that all sentences in the input text are properly segmented. Please review and let me know if you have any feedback or concerns.
2023-02-19 11:00:27 +09:00
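The fix described in this commit message can be illustrated with a small sketch. The function below is a hypothetical stand-in written for this example, not the repository's code: it groups sentences into word-limited chunks and shows the final check that keeps the last chunk from being dropped.

```python
def chunk_sentences(text, max_words=6):
    """Group sentences into chunks of at most max_words words (illustrative helper)."""
    sentences = [s.strip().rstrip(".") for s in text.split(". ")]
    sentences = [s for s in sentences if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk when adding this sentence would exceed the limit.
        if current and count + n > max_words:
            chunks.append(". ".join(current) + ".")
            current, count = [], 0
        current.append(sentence)
        count += n

    # Without this final check, the sentences still held in `current` would be lost.
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks


chunks = chunk_sentences("One two three. Four five six. Seven eight nine.")
```

Here the third sentence only reaches the output because of the trailing `if current:` block.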
Logan Kilpatrick
3826607431
Add comment on where to learn about rate limits
2023-02-17 06:16:14 -06:00
Daniel Zhukovsky
be9877edbf
Redefinition of unused 'pd'
2023-02-16 15:05:04 +00:00
isafulf
daf8e0d011
rename web crawl q and a
2023-02-11 16:37:29 -08:00