Commit Graph

1887 Commits

Author SHA1 Message Date
Lorenzo Paleari
66ea166438
fix: Added support for nested structure 2024-09-13 04:18:53 +02:00
Lorenzo Paleari
039ba2e95a
fix: Fixed pydantic error on SearchGraphs
Changed instantiation location of iterated graph classes
2024-09-13 01:56:58 +02:00
semantic-release-bot
88b2c469ae ci(release): 1.19.0-beta.8 [skip ci]
## [1.19.0-beta.8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.7...v1.19.0-beta.8) (2024-09-12)

### Features

* refactoring of the tokenization function ([ec6b164](ec6b164653))
2024-09-12 18:22:24 +00:00
Marco Vinciguerra
ec6b164653 feat: refactoring of the tokenization function 2024-09-12 20:21:00 +02:00
semantic-release-bot
4ab26a24a3 ci(release): 1.19.0-beta.7 [skip ci]
## [1.19.0-beta.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.6...v1.19.0-beta.7) (2024-09-12)

### Bug Fixes

* pyproject.toml dependencies ([b805aea](b805aea1de))
2024-09-12 08:26:37 +00:00
Marco Vinciguerra
b805aea1de fix: pyproject.toml dependencies 2024-09-12 10:25:23 +02:00
Marco Vinciguerra
4a16f14b25
Merge pull request #660 from tm-robinson/651-add-tokenization-for-ollama-and-mistral
651 add tokenization for ollama and mistral
2024-09-12 10:22:20 +02:00
Marco Vinciguerra
c64ce88db8 refactoring of imports 2024-09-12 10:16:15 +02:00
Tom Robinson
dc4a76b9c5 use semchunk by default, as the other code causes tokenizers to be called for every individual word, which is very slow, especially with the mistral tokenizer 2024-09-12 08:46:52 +01:00
Tom Robinson
da9726f738 updates to tokenization for #651 to implement for mistral and ollama 2024-09-12 08:28:30 +01:00
semantic-release-bot
ed8e1738c3 ci(release): 1.19.0-beta.6 [skip ci]
## [1.19.0-beta.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.5...v1.19.0-beta.6) (2024-09-12)

### Bug Fixes

* models tokens ([039fe3c](039fe3c6d9))

### Docs

* Updated the graph_config in the documentation. ([57a58e1](57a58e162e))

### CI

* **release:** 1.18.2 [skip ci] ([e1a9caa](e1a9caa905))
* **release:** 1.18.3 [skip ci] ([4bd4659](4bd4659dc1))
2024-09-12 07:11:03 +00:00
Marco Vinciguerra
5ff8cc706f
Merge pull request #659 from ScrapeGraphAI/temp
alignment
2024-09-12 09:09:45 +02:00
Marco Vinciguerra
18277c1109
Merge branch 'pre/beta' into temp 2024-09-12 09:09:38 +02:00
Marco Vinciguerra
ca31bd9412
Merge pull request #658 from shenghongtw/docs/Updated_the_graph_config_in_the_documentation
problem
2024-09-12 09:07:59 +02:00
Marco Vinciguerra
9eb40e1eae Update script_generator_openai.py 2024-09-12 09:06:15 +02:00
roryhaung
57a58e162e docs: Updated the graph_config in the documentation. 2024-09-12 14:50:02 +08:00
Marco Vinciguerra
fe3aa28fe7 refactoring of the code
2024-09-11 16:04:43 +02:00
semantic-release-bot
4bd4659dc1 ci(release): 1.18.3 [skip ci]
## [1.18.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.18.2...v1.18.3) (2024-09-11)

### Bug Fixes

* models tokens ([039fe3c](039fe3c6d9))
2024-09-11 09:47:05 +00:00
Marco Vinciguerra
039fe3c6d9 fix: models tokens 2024-09-11 11:45:44 +02:00
semantic-release-bot
7621a7c7b7 ci(release): 1.19.0-beta.5 [skip ci]
## [1.19.0-beta.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.4...v1.19.0-beta.5) (2024-09-10)

### Bug Fixes

* models tokens ([b2be6b7](b2be6b739e))
2024-09-10 13:58:33 +00:00
semantic-release-bot
e1a9caa905 ci(release): 1.18.2 [skip ci]
## [1.18.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.18.1...v1.18.2) (2024-09-10)

### Bug Fixes

* models tokens ([b2be6b7](b2be6b739e))
2024-09-10 13:58:18 +00:00
Marco Vinciguerra
4ee7753895
Merge pull request #654 from ScrapeGraphAI/main
alignment
2024-09-10 15:57:18 +02:00
Marco Vinciguerra
b2be6b739e fix: models tokens 2024-09-10 15:56:52 +02:00
semantic-release-bot
24c38f945a ci(release): 1.19.0-beta.4 [skip ci]
## [1.19.0-beta.4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.3...v1.19.0-beta.4) (2024-09-10)

### Features

* removed semchunk and used tiktoken ([1a7f21f](1a7f21fbf3))
2024-09-10 12:05:18 +00:00
Marco Vinciguerra
1a7f21fbf3 feat: removed semchunk and used tiktoken 2024-09-10 14:03:52 +02:00
Marco Vinciguerra
380174d490 add chunking function 2024-09-10 13:52:15 +02:00
semantic-release-bot
38cba96ea3 ci(release): 1.19.0-beta.3 [skip ci]
## [1.19.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.2...v1.19.0-beta.3) (2024-09-10)

### Bug Fixes

* parse node ([947ebd2](947ebd2895))
2024-09-10 09:20:45 +00:00
Federico Aguzzi
4c14fd79b2
Merge pull request #650 from ScrapeGraphAI/637-it-can´t-scrape-urls-from-the-source
fix: parse node
2024-09-10 11:19:24 +02:00
Marco Vinciguerra
cf0397bc57
Update pyproject.toml 2024-09-10 11:10:22 +02:00
Marco Vinciguerra
947ebd2895 fix: parse node
2024-09-10 08:41:08 +02:00
semantic-release-bot
23a260c51e ci(release): 1.19.0-beta.2 [skip ci]
## [1.19.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.19.0-beta.1...v1.19.0-beta.2) (2024-09-09)

### Features

* return urls in searchgraph ([afb6eb7](afb6eb7e47))

### Bug Fixes

* temporary fix for parse_node ([f2bb22d](f2bb22d8e9))
2024-09-09 10:06:07 +00:00
Federico Aguzzi
32a102af3c
Merge pull request #648 from ScrapeGraphAI/637-it-can´t-scrape-urls-from-the-source
637 it can´t scrape urls from the source
2024-09-09 12:04:46 +02:00
Federico Aguzzi
8a0d46b714
Merge pull request #641 from ScrapeGraphAI/urls_search_graph
feat: return urls in searchgraph
2024-09-09 12:03:36 +02:00
Marco Vinciguerra
f2bb22d8e9 fix: temporary fix for parse_node 2024-09-09 11:42:33 +02:00
semantic-release-bot
eddcb79486 ci(release): 1.19.0-beta.1 [skip ci]
## [1.19.0-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.18.1...v1.19.0-beta.1) (2024-09-08)

### Features

* **AbstractGraph:** add adjustable rate limit ([2859fb7](2859fb72d6))
* add scrape_do_integration ([94e69a0](94e69a0515))
* add togetherai ([8f615ad](8f615adef3))
* ConcatNode.py added for heavy merge operations ([bd4b26d](bd4b26d7d7))
* fetch_node improved ([167f970](167f97040f))

### Bug Fixes

* **AbstractGraph:** Bedrock init issues ([63a5d18](63a5d18486)), closes [#633](https://github.com/ScrapeGraphAI/Scrapegraph-ai/issues/633)
* correctly parsing output when using structured_output ([8e74ac5](8e74ac55a1))
* **ScreenshotScraper:** impose dynamic imports ([b8ef937](b8ef93738e))
* **Ollama:** instance model from correct package ([398b2c5](398b2c556f))
* Parse Node scraping link and img urls allowing OmniScraper to work ([66a3b6d](66a3b6d6a3))
* **SmartScraper:** pass llm_model to ParseNode ([5242166](5242166575))
* **DeepSeek:** proper model initialization ([74dfc69](74dfc693f6))
* Removed link_urls and img_ulrs from FetchNode output ([57337a0](57337a0a8c))
* screenshot scraper ([388630c](388630c0ff))
* screenshot_scraper ([ef7a589](ef7a5891dc))
* **ScreenShotScraper:** static import of optional dependencies ([52fe441](52fe441c5a))
* update generate answernode ([c348f67](c348f674ad))

### chore

* **examples:** create Together AI examples ([34942de](34942deca5))

### CI

* **release:** 1.16.0-beta.1 [skip ci] ([d7f6036](d7f6036f90))
* **release:** 1.16.0-beta.2 [skip ci] ([1c37d5d](1c37d5db1c))
* **release:** 1.16.0-beta.3 [skip ci] ([886c987](886c987172))
* **release:** 1.16.0-beta.4 [skip ci] ([ba5c7ad](ba5c7adcea))
* **release:** 1.17.0-beta.1 [skip ci] ([13efd4e](13efd4e3a4))
* **release:** 1.17.0-beta.10 [skip ci] ([af28885](af2888539e))
* **release:** 1.17.0-beta.11 [skip ci] ([a73fec5](a73fec5a98))
* **release:** 1.17.0-beta.2 [skip ci] ([08afc92](08afc9292e))
* **release:** 1.17.0-beta.3 [skip ci] ([fc55418](fc55418a45))
* **release:** 1.17.0-beta.4 [skip ci] ([5e99071](5e990719cf))
* **release:** 1.17.0-beta.5 [skip ci] ([16ab1bf](16ab1bf3d9))
* **release:** 1.17.0-beta.6 [skip ci] ([50c9c6b](50c9c6bd8c))
* **release:** 1.17.0-beta.7 [skip ci] ([4347afb](4347afb8d4)), closes [#633](https://github.com/ScrapeGraphAI/Scrapegraph-ai/issues/633)
* **release:** 1.17.0-beta.8 [skip ci] ([85c374e](85c374e4b3))
* **release:** 1.17.0-beta.9 [skip ci] ([77d0fd3](77d0fd3dba))
2024-09-08 11:13:21 +00:00
Marco Vinciguerra
9f52602d74
Merge pull request #646 from ScrapeGraphAI/temp
alignment
2024-09-08 13:11:46 +02:00
Marco Vinciguerra
14c5e6baf9
Merge branch 'pre/beta' into temp 2024-09-08 13:11:37 +02:00
Marco Vinciguerra
fc738cacac Update parse_node.py 2024-09-08 11:54:11 +02:00
semantic-release-bot
c5ffdef4ff ci(release): 1.18.1 [skip ci]
## [1.18.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.18.0...v1.18.1) (2024-09-08)

### Bug Fixes

* **browser_base_fetch:** correct function signature and async_mode handling ([007ff08](007ff084c6))
2024-09-08 09:45:47 +00:00
Marco Vinciguerra
5f09b1f698
Merge pull request #645 from tuhinmallick/main 2024-09-08 11:44:33 +02:00
Tuhin Mallick
007ff084c6
fix(browser_base_fetch): correct function signature and async_mode handling
- Added missing `async_mode` parameter to the function signature.
2024-09-08 10:59:04 +02:00
semantic-release-bot
29ef63d85a ci(release): 1.18.0 [skip ci]
## [1.18.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.17.0...v1.18.0) (2024-09-08)

### Features

* **browser_base_fetch:** add async_mode to support both synchronous and asynchronous execution ([d56253d](d56253d183))
2024-09-08 08:54:33 +00:00
Marco Vinciguerra
e9a74e1644
Merge pull request #644 from tuhinmallick/main
feat(browser_base_fetch): add async_mode to support both synchronous …
2024-09-08 10:53:14 +02:00
Marco Vinciguerra
02eed1ac9d
Merge branch 'main' into main 2024-09-08 10:51:31 +02:00
Tuhin Mallick
d56253d183 feat(browser_base_fetch): add async_mode to support both synchronous and asynchronous execution
- Introduced an async_mode flag to allow users to choose between synchronous and asynchronous fetching using Browserbase.
- Refactored common logic (browserbase initialization and result list) to avoid redundancy.
- Added internal async handling with asyncio.to_thread() for non-blocking execution in async_mode.
- Maintained backward compatibility for existing synchronous functionality.
2024-09-08 08:49:08 +00:00
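The async_mode behaviour described in the commit above can be illustrated roughly as follows. This is a minimal sketch, not the project's actual code: `fetch_pages` and `_load_page` are hypothetical names, and `_load_page` is a stand-in for the blocking Browserbase call, which is not reproduced here. It assumes Python 3.9+ for `asyncio.to_thread()`.

```python
import asyncio
from typing import Awaitable, List, Union


def _load_page(url: str) -> str:
    """Hypothetical stand-in for a blocking page load (not the Browserbase SDK)."""
    return f"content of {url}"


def fetch_pages(
    urls: List[str], async_mode: bool = False
) -> Union[List[str], Awaitable[List[str]]]:
    """Fetch pages synchronously, or return an awaitable when async_mode is True."""
    if not async_mode:
        # Existing synchronous behaviour: load each page in turn.
        return [_load_page(url) for url in urls]

    async def _run() -> List[str]:
        # Push each blocking load onto a worker thread so the event loop stays free.
        tasks = [asyncio.to_thread(_load_page, url) for url in urls]
        return list(await asyncio.gather(*tasks))

    return _run()
```

In this sketch a synchronous caller uses `fetch_pages(urls)` unchanged, while an async caller writes `await fetch_pages(urls, async_mode=True)`. Whether the real function returns an awaitable or runs the loop itself is not stated in the commit message.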
semantic-release-bot
cd4ffd761a ci(release): 1.17.0 [skip ci]
## [1.17.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.16.0...v1.17.0) (2024-09-08)

### Features

* **docloaders:** Enhance browser_base_fetch function flexibility ([57fd01f](57fd01f9a7))

### Docs

* **sponsor:** 🅱️ Browserbase sponsor 🅱️ ([a540139](a5401394cc))
2024-09-08 07:07:13 +00:00
Marco Vinciguerra
7d39019f39
Merge pull request #642 from tuhinmallick/patch-1 2024-09-08 09:06:00 +02:00
Tuhin Mallick
57fd01f9a7
feat(docloaders): Enhance browser_base_fetch function flexibility
- Update browser_base_fetch to accept single URL or list of URLs
- Add text_content parameter for choosing between text-only and HTML output
- Improve type hinting and function documentation
- Ensure compatibility with latest Browserbase SDK interface
2024-09-08 01:41:39 +02:00
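A rough sketch of the two interface changes listed in the commit above (single URL or list of URLs, and a text_content switch). The names `fetch_documents` and `_load_html` are hypothetical, the loader is a stub in place of the Browserbase SDK, and returning a bare string for a single URL is an assumption made for illustration only.

```python
import re
from typing import List, Union


def _load_html(url: str) -> str:
    """Hypothetical stub that returns HTML for a URL (not the Browserbase SDK)."""
    return f"<html><body><p>Page at {url}</p></body></html>"


def fetch_documents(
    link: Union[str, List[str]], text_content: bool = True
) -> Union[str, List[str]]:
    """Accept one URL or a list; return text-only or raw HTML output."""
    single = isinstance(link, str)
    links = [link] if single else list(link)

    pages = [_load_html(url) for url in links]
    if text_content:
        # Naive tag stripping, only to illustrate the text-only mode.
        pages = [re.sub(r"<[^>]+>", "", page) for page in pages]

    # Assumption: mirror the input shape (string in, string out).
    return pages[0] if single else pages
```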
Marco Perini
a5401394cc
docs(sponsor): 🅱️ Browserbase sponsor 🅱️ 2024-09-08 01:12:29 +02:00
semantic-release-bot
a73fec5a98 ci(release): 1.17.0-beta.11 [skip ci]
## [1.17.0-beta.11](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.17.0-beta.10...v1.17.0-beta.11) (2024-09-07)

### Features

* add scrape_do_integration ([94e69a0](94e69a0515))
* fetch_node improved ([167f970](167f97040f))
2024-09-07 15:07:49 +00:00