730 Commits

Author SHA1 Message Date
Federico Minutoli
fc2aa3ac1c Merge branch 'pre/beta' of https://github.com/DiTo97/Scrapegraph-ai into fix/fetch-node-proxybroker 2024-05-10 21:20:40 +02:00
Federico Minutoli
768719cce8 feat(safe-web-driver): enhanced the original AsyncChromiumLoader web driver with proxy protection and flexible kwargs and backend
the original class prevents passing kwargs down to the playwright backend, which makes some configurations unfeasible, including passing a proxy server to the web driver.

the new class is backward compatible with the original, but 1) allows any kwarg to be passed down to the web driver, 2) allows specifying the web driver backend (only playwright is supported for now) in case more backends (e.g., selenium) are supported in the future, and 3) automatically fetches a suitable proxy if one is not passed already
2024-05-10 21:13:38 +02:00
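The three behaviours listed in the commit message above can be pictured with a minimal sketch. This is not the code from the commit: `SafeLoaderSketch`, `default_proxy_finder`, and `launch_options` are hypothetical names, the proxy address is a placeholder from a documentation-only IP range, and the `{"server": ...}` proxy shape is an assumption modelled on how playwright accepts proxy settings at browser launch.

```python
from typing import Any, Callable, Dict, Iterable, Optional

# Assumption: only one backend is accepted for now, mirroring the commit message.
SUPPORTED_BACKENDS = {"playwright"}


def default_proxy_finder() -> str:
    """Hypothetical stand-in for a proxy search; returns a placeholder address."""
    return "http://203.0.113.10:8080"  # documentation-range IP, not a real proxy


class SafeLoaderSketch:
    """Illustrative loader: forwards extra kwargs and guarantees a proxy is set."""

    def __init__(
        self,
        urls: Iterable[str],
        *,
        backend: str = "playwright",
        headless: bool = True,
        proxy: Optional[str] = None,
        find_proxy: Callable[[], str] = default_proxy_finder,
        **kwargs: Any,
    ) -> None:
        if backend not in SUPPORTED_BACKENDS:
            raise ValueError(f"unsupported backend: {backend!r}")
        self.urls = list(urls)
        self.backend = backend
        self.headless = headless
        # 3) fall back to an automatically fetched proxy when none is passed
        self.proxy = proxy if proxy is not None else find_proxy()
        # 1) keep every extra kwarg so it can reach the web driver untouched
        self.browser_kwargs = kwargs

    def launch_options(self) -> Dict[str, Any]:
        """Build the options that would be handed to the backend at launch."""
        return {
            "headless": self.headless,
            "proxy": {"server": self.proxy},
            **self.browser_kwargs,
        }
```

Calling `SafeLoaderSketch(["https://example.com"], slow_mo=50).launch_options()` would carry `slow_mo` through unchanged next to the fallback proxy, while an unknown `backend` value fails early with a `ValueError`.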
Federico Minutoli
217013181d feat(proxy-rotation): add parse (IP address) or search (from broker) functionality for proxy rotation
the broker has been made fully configurable for anonymity level, admissible locations, scheme and max shape, so as not to waste resources, unlike the original `free-proxy` package.

other options have been explored (e.g., `proxybroker`, `proxybroker2`) due to their built-in proxy server and rotation capabilities, but the former is no longer maintained, and the latter has issues with any Python version other than 3.9
2024-05-10 21:09:48 +02:00
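The "parse or search" idea in the commit message above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's implementation: `Candidate`, `parse_proxy`, `search_proxies`, and `resolve_proxy` are hypothetical names, the broker is replaced by an in-memory pool, and only plain `ip:port` IPv4 strings are handled.

```python
import ipaddress
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Sequence, Union


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one proxy offered by a broker."""
    address: str    # "ip:port"
    anonymity: str  # e.g. "elite", "anonymous", "transparent"
    country: str    # two-letter country code
    scheme: str     # "http" or "https"


def parse_proxy(server: str) -> str:
    """Validate an 'ip:port' string and return it unchanged."""
    host, sep, port = server.rpartition(":")
    if not sep:
        raise ValueError(f"missing port in {server!r}")
    ipaddress.ip_address(host)  # raises ValueError for a malformed IP
    if not port.isdigit() or not 0 < int(port) < 65536:
        raise ValueError(f"invalid port in {server!r}")
    return server


def search_proxies(
    pool: Iterable[Candidate],
    *,
    anonymity: Optional[str] = None,
    countries: Optional[Sequence[str]] = None,
    scheme: str = "http",
    max_results: int = 5,
) -> List[str]:
    """Filter the pool by the given criteria and stop once the cap is reached."""
    found: List[str] = []
    for candidate in pool:
        if candidate.scheme != scheme:
            continue
        if anonymity is not None and candidate.anonymity != anonymity:
            continue
        if countries is not None and candidate.country not in countries:
            continue
        found.append(candidate.address)
        if len(found) >= max_results:
            break  # cap the search so no extra candidates are examined
    return found


def resolve_proxy(spec: Union[str, Dict[str, Any]], pool: Iterable[Candidate]) -> str:
    """Parse an explicit address, or search the pool when given criteria."""
    if isinstance(spec, str):
        return parse_proxy(spec)
    matches = search_proxies(pool, **spec)
    if not matches:
        raise LookupError("no proxy matches the given criteria")
    return matches[0]
```

A string such as `"198.51.100.7:3128"` takes the parse branch, while a dict such as `{"anonymity": "elite", "countries": ["IT"]}` takes the search branch and returns the first matching candidate.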
Federico Minutoli
db2234bf5d feat(webdriver-backend): add dynamic import scripts from module and file 2024-05-10 21:06:05 +02:00
Federico Minutoli
2f4fd45700 fix(pytest): add dependency for mocking testing functions 2024-05-10 21:05:48 +02:00
semantic-release-bot
7ae50c035e ci(release): 0.11.0-beta.2 [skip ci]
## [0.11.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.1...v0.11.0-beta.2) (2024-05-10)

### Features

* revert fetch_node ([864aa91](864aa91326))
2024-05-10 13:13:20 +00:00
Marco Perini
864aa91326 feat: revert fetch_node 2024-05-10 15:11:54 +02:00
semantic-release-bot
63c0dd9372 ci(release): 0.11.0-beta.1 [skip ci]
## [0.11.0-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0...v0.11.0-beta.1) (2024-05-10)

### Features

* Add support for passing pdf path as source ([f10f3b1](f10f3b1438))
* update info ([4ed0fb8](4ed0fb89c3))

### Bug Fixes

* add json integration ([0ab31c3](0ab31c3fdb))
* Augment the information getting fetched from a webpage ([f8ce3d5](f8ce3d5916))
* fixed bugs for csv and xml ([324e977](324e977b85))
* limit python version to < 3.12 ([a37fbbc](a37fbbcbcf))

### CI

* **release:** 0.10.0-beta.3 [skip ci] ([ad32298](ad32298e70))
* **release:** 0.10.0-beta.4 [skip ci] ([548bff9](548bff9d77))
* **release:** 0.10.0-beta.5 [skip ci] ([28c9dce](28c9dce7cb))
* **release:** 0.10.0-beta.6 [skip ci] ([460d292](460d292af2))
2024-05-10 09:15:24 +00:00
Marco Vinciguerra
4e62689eaa
Merge pull request #203 from mayurdb/fetchNodeFix
fix: Augment the information getting fetched from a webpage
2024-05-10 11:14:01 +02:00
Marco Vinciguerra
99adc9799f
Merge branch 'pre/beta' into fetchNodeFix 2024-05-10 11:13:54 +02:00
mayurdb
f8ce3d5916 fix: Augment the information getting fetched from a webpage 2024-05-10 13:28:53 +05:30
semantic-release-bot
460d292af2 ci(release): 0.10.0-beta.6 [skip ci]
## [0.10.0-beta.6](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0-beta.5...v0.10.0-beta.6) (2024-05-09)

### Bug Fixes

* add json integration ([0ab31c3](0ab31c3fdb))
2024-05-09 19:08:34 +00:00
VinciGit00
0ab31c3fdb fix: add json integration 2024-05-09 21:07:07 +02:00
semantic-release-bot
28c9dce7cb ci(release): 0.10.0-beta.5 [skip ci]
## [0.10.0-beta.5](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0-beta.4...v0.10.0-beta.5) (2024-05-09)

### Bug Fixes

* fixed bugs for csv and xml ([324e977](324e977b85))
2024-05-09 18:49:05 +00:00
VinciGit00
c32caadf00 Merge branch 'pre/beta' of https://github.com/VinciGit00/Scrapegraph-ai into pre/beta 2024-05-09 20:47:40 +02:00
VinciGit00
324e977b85 fix: fixed bugs for csv and xml 2024-05-09 20:46:46 +02:00
semantic-release-bot
548bff9d77 ci(release): 0.10.0-beta.4 [skip ci]
## [0.10.0-beta.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0-beta.3...v0.10.0-beta.4) (2024-05-09)

### Features

* Add support for passing pdf path as source ([f10f3b1](f10f3b1438))

### Bug Fixes

* limit python version to < 3.12 ([a37fbbc](a37fbbcbcf))
2024-05-09 18:17:45 +00:00
VinciGit00
84e8d12793 update lock 2024-05-09 20:16:07 +02:00
Marco Vinciguerra
a1d580c4eb
Merge pull request #195 from shorthills-ai/pre/beta 2024-05-09 19:11:58 +02:00
Shubham Kamboj
905b34510f
Merge pull request #4 from shkamboj1/pre/beta
feat: Add support for passing pdf path as source
2024-05-09 16:26:31 +00:00
Shubham Kamboj
f10f3b1438 feat: Add support for passing pdf path as source 2024-05-09 21:55:05 +05:30
Marco Vinciguerra
590aab792d
Merge pull request #193 from daniele-roncaglioni/189-poetry-python-version-issue 2024-05-09 15:55:57 +02:00
roncaglionidaniele
a37fbbcbcf fix: limit python version to < 3.12 2024-05-09 15:47:01 +02:00
semantic-release-bot
ad32298e70 ci(release): 0.10.0-beta.3 [skip ci]
## [0.10.0-beta.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0-beta.2...v0.10.0-beta.3) (2024-05-09)

### Features

* update info ([4ed0fb8](4ed0fb89c3))
2024-05-09 13:32:21 +00:00
Marco Perini
7e00c1497e
Merge pull request #183 from VinciGit00/182-googlegenerativeaiembeddings-is-not-defined
fixed gemini embeddings
2024-05-09 15:30:55 +02:00
VinciGit00
94156755d1 Update abstract_graph.py 2024-05-09 12:29:39 +02:00
VinciGit00
403979337c Update abstract_graph.py 2024-05-09 10:46:31 +02:00
VinciGit00
8272d736a6 add tokenization for mxbai-embed-large 2024-05-08 21:50:42 +02:00
VinciGit00
4ed0fb89c3 feat: update info 2024-05-08 21:25:03 +02:00
VinciGit00
e7d39a5daf fixed gemini embeddings 2024-05-08 21:24:17 +02:00
semantic-release-bot
0ca52b1da6 ci(release): 0.10.0 [skip ci]
## [0.10.0](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.9.0...v0.10.0) (2024-05-08)

### Features

* add claude documentation ([5bdee55](5bdee55876))
* add gemini embeddings ([79daa4c](79daa4c112))
* add llava integration ([019b722](019b7223dc))
* add new hugging_face models ([d5547a4](d5547a450c))
* Fix bug for gemini case when embeddings config not passed ([726de28](726de28898))
* fixed custom_graphs example and robots_node ([84fcb44](84fcb44aaa))
* multiple graph instances ([dbb614a](dbb614a8dd))
* **node:** multiple url search in SearchGraph + fixes ([930adb3](930adb38f2))
* refactoring search function ([aeb1acb](aeb1acbf05))

### Bug Fixes

* bug on .toml ([f7d66f5](f7d66f5181))
* **llm:** fixed gemini api_key ([fd01b73](fd01b73b71))
* **examples:** local, mixed models and fixed SearchGraph embeddings problem ([6b71ec1](6b71ec1d2b))
* **examples:** openai std examples ([186c0d0](186c0d035d))
* removed .lock file for deployment ([d4c7d4e](d4c7d4e7fc))

### Docs

* update README.md ([17ec992](17ec992b49))

### CI

* **release:** 0.10.0-beta.1 [skip ci] ([c47a505](c47a505750))
* **release:** 0.10.0-beta.2 [skip ci] ([3f0e069](3f0e0694f3))
* **release:** 0.9.0-beta.2 [skip ci] ([5aa600c](5aa600cb0a))
* **release:** 0.9.0-beta.3 [skip ci] ([da8c72c](da8c72ce13))
* **release:** 0.9.0-beta.4 [skip ci] ([8c5397f](8c5397f67a))
* **release:** 0.9.0-beta.5 [skip ci] ([532adb6](532adb639d))
* **release:** 0.9.0-beta.6 [skip ci] ([8c0b46e](8c0b46eb40))
* **release:** 0.9.0-beta.7 [skip ci] ([6911e21](6911e21584))
* **release:** 0.9.0-beta.8 [skip ci] ([739aaa3](739aaa33c3))
2024-05-08 13:52:51 +00:00
Marco Perini
5ea4df4f7d
Merge pull request #170 from VinciGit00/pre/beta
New release, many new features and bug fixes
2024-05-08 15:49:21 +02:00
semantic-release-bot
3f0e0694f3 ci(release): 0.10.0-beta.2 [skip ci]
## [0.10.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.10.0-beta.1...v0.10.0-beta.2) (2024-05-08)

### Bug Fixes

* **examples:** local, mixed models and fixed SearchGraph embeddings problem ([6b71ec1](6b71ec1d2b))
* **examples:** openai std examples ([186c0d0](186c0d035d))
* removed .lock file for deployment ([d4c7d4e](d4c7d4e7fc))

### Docs

* update README.md ([17ec992](17ec992b49))
2024-05-08 13:47:59 +00:00
Marco Perini
d4c7d4e7fc
fix: removed .lock file for deployment 2024-05-08 15:44:21 +02:00
Marco Perini
71fcdfaaf9
Merge pull request #177 from VinciGit00/fix-bugs
Fix embeddings and local models bugs
2024-05-08 15:40:19 +02:00
Marco Perini
6b71ec1d2b fix(examples): local, mixed models and fixed SearchGraph embeddings problem 2024-05-08 15:36:26 +02:00
Marco Perini
186c0d035d fix(examples): openai std examples 2024-05-08 14:56:44 +02:00
Marco Vinciguerra
d86437fd72
Merge pull request #176 from kahwoo/patch-1
Update examples.rst
2024-05-08 12:59:07 +02:00
kahwoo
b1df161818
Update examples.rst
fix formatting and add other needed models
2024-05-08 20:39:54 +10:00
Marco Vinciguerra
8632c0a06d
Merge pull request #169 from VinciGit00/main
alignment
2024-05-07 22:24:27 +02:00
Marco Vinciguerra
9a873ca10a
Merge pull request #167 from KPCOFGS/main
Update README.md
2024-05-07 13:37:34 +02:00
Shixian Sheng
96f9d63ac5
Update README.md 2024-05-07 07:13:58 -04:00
Marco Vinciguerra
cbc7b1f1cd
Merge pull request #165 from eltociear/patch-1 2024-05-07 08:41:25 +02:00
Ikko Eltociear Ashimine
17ec992b49
docs: update README.md
avaiable -> available
2024-05-07 11:36:41 +09:00
VinciGit00
2258fe5ac0 add new search graph examples 2024-05-06 22:31:48 +02:00
semantic-release-bot
c47a505750 ci(release): 0.10.0-beta.1 [skip ci]
## [0.10.0-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.9.0...v0.10.0-beta.1) (2024-05-06)

### Features

* add claude documentation ([5bdee55](5bdee55876))
* add gemini embeddings ([79daa4c](79daa4c112))
* add llava integration ([019b722](019b7223dc))
* add new hugging_face models ([d5547a4](d5547a450c))
* Fix bug for gemini case when embeddings config not passed ([726de28](726de28898))
* fixed custom_graphs example and robots_node ([84fcb44](84fcb44aaa))
* multiple graph instances ([dbb614a](dbb614a8dd))
* **node:** multiple url search in SearchGraph + fixes ([930adb3](930adb38f2))
* refactoring search function ([aeb1acb](aeb1acbf05))

### Bug Fixes

* bug on .toml ([f7d66f5](f7d66f5181))
* **llm:** fixed gemini api_key ([fd01b73](fd01b73b71))

### CI

* **release:** 0.9.0-beta.2 [skip ci] ([5aa600c](5aa600cb0a))
* **release:** 0.9.0-beta.3 [skip ci] ([da8c72c](da8c72ce13))
* **release:** 0.9.0-beta.4 [skip ci] ([8c5397f](8c5397f67a))
* **release:** 0.9.0-beta.5 [skip ci] ([532adb6](532adb639d))
* **release:** 0.9.0-beta.6 [skip ci] ([8c0b46e](8c0b46eb40))
* **release:** 0.9.0-beta.7 [skip ci] ([6911e21](6911e21584))
* **release:** 0.9.0-beta.8 [skip ci] ([739aaa3](739aaa33c3))
2024-05-06 12:53:53 +00:00
Marco Vinciguerra
88f04bf212
Merge pull request #161 from cemkod/main
Support for Anthropic Claude 3 models
2024-05-06 14:50:32 +02:00
VinciGit00
5a67bca0db Merge branch 'pre/beta' into pr/161 2024-05-06 14:50:04 +02:00
VinciGit00
ac6d2005bb Merge branch 'pre/beta' of https://github.com/VinciGit00/Scrapegraph-ai into pre/beta 2024-05-06 14:46:25 +02:00
VinciGit00
cbd77dfdb0 removed claude 2024-05-06 14:46:24 +02:00