Commit Graph

2237 Commits

Author SHA1 Message Date
semantic-release-bot
fd57cc7c12 ci(release): 1.27.0-beta.9 [skip ci]
## [1.27.0-beta.9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.8...v1.27.0-beta.9) (2024-10-24)

### Features

* add model integration gpt4 ([51c55eb](51c55eb3a2))
2024-10-24 22:39:44 +00:00
Marco Vinciguerra
9e5e76abbb
Merge pull request #765 from ScrapeGraphAI/add-model-integration-for-images
feat: add model integration gpt4
2024-10-25 00:38:16 +02:00
Marco Vinciguerra
51c55eb3a2 feat: add model integration gpt4 2024-10-24 09:10:51 +02:00
semantic-release-bot
4f1ed939e6 ci(release): 1.27.0-beta.8 [skip ci]
## [1.27.0-beta.8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.7...v1.27.0-beta.8) (2024-10-24)

### Bug Fixes

* removed tokenizer ([a184716](a18471688f))

### CI

* **release:** 1.26.7 [skip ci] ([ec9ef2b](ec9ef2bcda))
2024-10-24 06:55:58 +00:00
Marco Vinciguerra
066e77dbe7
Merge branch 'main' into pre/beta 2024-10-24 08:54:17 +02:00
semantic-release-bot
407f1ce4eb ci(release): 1.27.0-beta.7 [skip ci]
## [1.27.0-beta.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.6...v1.27.0-beta.7) (2024-10-24)

### Features

* refactoring of get_probable_tags node ([f658092](f658092dff))
2024-10-24 06:45:14 +00:00
Marco Vinciguerra
a1bd05da10
Merge pull request #763 from ScrapeGraphAI/refactoring-get-probable-tags
feat: refactoring of get_probable_tags node
2024-10-24 08:43:49 +02:00
Marco Vinciguerra
f658092dff feat: refactoring of get_probable_tags node 2024-10-23 12:15:16 +02:00
semantic-release-bot
94b9836ef6 ci(release): 1.27.0-beta.6 [skip ci]
## [1.27.0-beta.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.5...v1.27.0-beta.6) (2024-10-23)

### Features

* add integration with scrape.do ([ae275ec](ae275ec5e8))
2024-10-23 10:09:36 +00:00
Marco Vinciguerra
ae275ec5e8 feat: add integration with scrape.do 2024-10-23 12:08:00 +02:00
semantic-release-bot
5002c713d5 ci(release): 1.27.0-beta.5 [skip ci]
## [1.27.0-beta.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.4...v1.27.0-beta.5) (2024-10-22)

### Features

* refactoring of export functions ([0ea00c0](0ea00c078f))
2024-10-22 07:06:26 +00:00
Marco Vinciguerra
34d2964f08
Merge pull request #761 from ScrapeGraphAI/refactoring-export-functions
feat: refactoring of export functions
2024-10-22 09:04:57 +02:00
Marco Vinciguerra
11ae717623 add new doc
2024-10-21 11:16:29 +02:00
Marco Vinciguerra
0ea00c078f feat: refactoring of export functions 2024-10-21 10:30:21 +02:00
semantic-release-bot
3d6bbcdaa3 ci(release): 1.27.0-beta.4 [skip ci]
## [1.27.0-beta.4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.3...v1.27.0-beta.4) (2024-10-21)

### Features

* refactoring of ScrapeGraph to SmartScraperLiteGraph ([52b6bf5](52b6bf5fb8))
2024-10-21 08:14:25 +00:00
Marco Vinciguerra
52b6bf5fb8 feat: refactoring of ScrapeGraph to SmartScraperLiteGraph 2024-10-21 10:12:53 +02:00
Marco Vinciguerra
b84883bfd1 add smartscraper lite 2024-10-21 09:39:17 +02:00
Marco Vinciguerra
2991ca8dd2 add examples smart scraper lite 2024-10-21 09:33:40 +02:00
semantic-release-bot
f576afaf0c ci(release): 1.27.0-beta.3 [skip ci]
## [1.27.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.2...v1.27.0-beta.3) (2024-10-20)

### Features

* implement ScrapeGraph class for only web scraping automation ([612c644](612c644623))
* Implement SmartScraperMultiParseMergeFirstGraph class that scrapes a list of URLs, merges the content first, and finally generates answers to a given prompt. ([3e3e1b2](3e3e1b2f3a))

### Bug Fixes

* fix the example variable name ([69ff649](69ff649556))

### chore

* fix example ([9cd9a87](9cd9a874f9))

### Test

* Add scrape_graph test ([cdb3c11](cdb3c1100e))
* Add smart_scraper_multi_parse_merge_first_graph test ([464b8b0](464b8b04ea))
2024-10-20 08:15:19 +00:00
Marco Vinciguerra
ffa1067f0d
Merge pull request #756 from shenghongtw/pre/beta
The smart_scraper_multi_graph method is too expensive
2024-10-20 10:13:47 +02:00
Marco Vinciguerra
b912904313
Merge pull request #758 from ScrapeGraphAI/fix-together-ai
chore: fix example
2024-10-19 07:25:57 +02:00
semantic-release-bot
ec9ef2bcda ci(release): 1.26.7 [skip ci]
## [1.26.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6...v1.26.7) (2024-10-19)

### Bug Fixes

* removed tokenizer ([a184716](a18471688f))
2024-10-19 05:20:39 +00:00
Marco Vinciguerra
a18471688f fix: removed tokenizer 2024-10-19 07:18:56 +02:00
Federico Aguzzi
9cd9a874f9 chore: fix example
Committing even though this is not the bug we were looking for
2024-10-18 22:35:33 +02:00
semantic-release-bot
d84d295389 ci(release): 1.27.0-beta.2 [skip ci]
## [1.27.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.1...v1.27.0-beta.2) (2024-10-18)

### Bug Fixes

* refactoring of gpt2 tokenizer ([44c3f9c](44c3f9c989))

### CI

* **release:** 1.26.6 [skip ci] ([a4634c7](a4634c7331))
2024-10-18 20:18:25 +00:00
Federico Aguzzi
8cb9646a45 Merge branch 'main' into pre/beta 2024-10-18 22:16:39 +02:00
Marco Vinciguerra
58b11334d3 Merge branch 'main' of https://github.com/ScrapeGraphAI/Scrapegraph-ai 2024-10-18 17:11:36 +02:00
Marco Vinciguerra
3f71f103a7 scrape do key added 2024-10-18 17:11:33 +02:00
semantic-release-bot
a4634c7331 ci(release): 1.26.6 [skip ci]
## [1.26.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.5...v1.26.6) (2024-10-18)

### Bug Fixes

* refactoring of gpt2 tokenizer ([44c3f9c](44c3f9c989))
2024-10-18 07:00:26 +00:00
Marco Vinciguerra
44c3f9c989 fix: refactoring of gpt2 tokenizer 2024-10-18 08:58:53 +02:00
Marco Vinciguerra
bde1e0fbad
Merge pull request #757 from yusefes/fix-tokenizer-loading
Fix tokenizer loading for GPT2
2024-10-18 08:57:42 +02:00
roryhaung
da2a3c8ec7 add smart_scraper_multi_lite_graph example 2024-10-18 03:19:00 +08:00
roryhaung
28dda2b476 rename graph name 2024-10-18 03:14:08 +08:00
roryhaung
3e8f047ab6 Renamed smart_scraper_multi_abstract_graph back to smart_scraper_multi_graph. 2024-10-18 03:10:57 +08:00
roryhaung
974f88a77e rename SmartScraperMultiGraph to SmartScraperMultiLiteGraph 2024-10-18 03:01:59 +08:00
roryhaung
6dbac93668 rename the SmartScraperMultiParseMergeFirstGraph to SmartScraperMultiGraph 2024-10-18 01:52:39 +08:00
roryhaung
78bd40c3b5 modify the graph name 2024-10-18 01:51:26 +08:00
roryhaung
dfc67c670d rename the smart_scraper_multi_parse_merge_first_graph to smart_scraper_multi_graph, so delete this file 2024-10-18 01:49:54 +08:00
roryhaung
94d8042c2a rename smart_scraper_multi_graph to smart_scraper_multi_abstract_graph 2024-10-18 01:39:42 +08:00
roryhaung
69ff649556 fix: fix the example variable name 2024-10-18 01:36:29 +08:00
yusefes
d291819be3 Fix tokenizer loading for GPT2
Fixes #752

Fix the issue with loading the tokenizer for 'gpt2'.

* **scrapegraphai/utils/tokenizer.py**
  - Add a check for `GPT2TokenizerFast` in the `num_tokens_calculus` function.
  - Import `GPT2TokenizerFast` from `transformers`.

* **scrapegraphai/utils/tokenizers/tokenizer_ollama.py**
  - Modify the `num_tokens_ollama` function to handle `GPT2TokenizerFast`.

* **tests/graphs/smart_scraper_ollama_test.py**
  - Add a test case to verify the tokenizer loading for `GPT2TokenizerFast`.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/ScrapeGraphAI/Scrapegraph-ai/issues/752?shareId=XXXX-XXXX-XXXX-XXXX).
2024-10-17 16:34:13 +03:30
shenghong
2512262be8
Rename smart_scraper_multi_parse_merge_first_graph_test.py to smart_scraper_multi_parse_merge_first_graph_openai_test.py 2024-10-17 06:46:34 +08:00
roryhaung
9b78e2d755 Merge branch 'pre/beta' of https://github.com/shenghongtw/Scrapegraph-ai into pre/beta 2024-10-17 03:20:46 +08:00
semantic-release-bot
9266a36b2e ci(release): 1.27.0-beta.1 [skip ci]
## [1.27.0-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6-beta.1...v1.27.0-beta.1) (2024-10-16)

### Features

* add conditional node structure to the smart_scraper_graph and implement a structured way to check conditions ([cacd9cd](cacd9cde00))
2024-10-16 15:54:35 +00:00
Marco Vinciguerra
aaa011cc89
Merge pull request #754 from ekinsenler/cond_node_refactor
feat: add conditional node to the smart_scraper_graph
2024-10-16 17:53:10 +02:00
ekinsenler
eaa83edc04 update project requirement and add example 2024-10-16 15:21:23 +03:00
Marco Vinciguerra
488821abfa Update smart_scraper_openai.py 2024-10-16 14:07:01 +02:00
roryhaung
464b8b04ea test: Add smart_scraper_multi_parse_merge_first_graph test 2024-10-16 20:05:36 +08:00
roryhaung
cdb3c1100e test: Add scrape_graph test 2024-10-16 20:05:03 +08:00
roryhaung
3e3e1b2f3a feat: Implement SmartScraperMultiParseMergeFirstGraph class that scrapes a list of URLs, merges the content first, and finally generates answers to a given prompt.
(Unlike SmartScraperMultiGraph, in this case the content is merged before being processed by the LLM.)
2024-10-16 19:38:53 +08:00
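
The merge-first idea described in commit 3e3e1b2f3a can be illustrated with a minimal sketch. This is not the library's implementation: the function names, the `fetch` and `llm` callables, the prompt layout, and the example URLs are all hypothetical stand-ins, and the per-URL variant is only an assumed reading of how the other graph differs. The sketch just counts how many LLM calls each ordering needs.

```python
from typing import Callable, List

# Hypothetical stand-ins (not ScrapeGraphAI APIs):
# `fetch` returns the text of a page, `llm` answers a prompt string.
Fetch = Callable[[str], str]
LLM = Callable[[str], str]


def answer_per_url_then_merge(urls: List[str], prompt: str, fetch: Fetch, llm: LLM) -> str:
    """Ask the LLM once per page, then once more to merge the partial answers."""
    partial = [llm(f"{prompt}\n\n{fetch(url)}") for url in urls]
    return llm(f"{prompt}\n\nMerge these answers:\n" + "\n".join(partial))


def merge_first_then_answer(urls: List[str], prompt: str, fetch: Fetch, llm: LLM) -> str:
    """Concatenate all page contents first, then ask the LLM a single time."""
    merged = "\n\n".join(fetch(url) for url in urls)
    return llm(f"{prompt}\n\n{merged}")


class CountingLLM:
    """Fake LLM that only records how many times it was called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, text: str) -> str:
        self.calls += 1
        return f"answer #{self.calls}"


# Fake pages so the sketch runs without network access.
PAGES = {
    "https://example.com/a": "page A",
    "https://example.com/b": "page B",
    "https://example.com/c": "page C",
}


def fake_fetch(url: str) -> str:
    return PAGES[url]


if __name__ == "__main__":
    per_url_llm, merge_first_llm = CountingLLM(), CountingLLM()
    answer_per_url_then_merge(list(PAGES), "Summarize", fake_fetch, per_url_llm)
    merge_first_then_answer(list(PAGES), "Summarize", fake_fetch, merge_first_llm)
    print(per_url_llm.calls, merge_first_llm.calls)
```

Under these assumptions the merge-first ordering issues one LLM call regardless of the number of URLs, at the price of a longer single prompt, which may explain the cost concern raised in pull request #756.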