1145 Commits

Author SHA1 Message Date
octo-patch
6a2f8ecc7b feat: add MiniMax as a supported LLM provider
MiniMax provides an OpenAI-compatible API, making integration
straightforward. This adds:

- MiniMax model wrapper class (OpenAI-compatible)
- Model token mappings for MiniMax-M1, M2, and M2.5 models
- Provider routing in abstract_graph factory
- README update listing MiniMax as a supported provider
2026-03-14 22:54:38 +08:00
Vikrant-Khedkar
96dc59c797 remove client side validation to save cpu usage for user 2026-02-16 15:41:32 +05:30
Vikrant-Khedkar
b17b154bff fix: handle list content in telemetry event validation 2026-02-16 15:20:04 +05:30
Vikrant-Khedkar
518945dd75 use custom api for tracing 2026-02-16 13:24:45 +05:30
Marco Vinciguerra
9c24ecc180 feat: update model tokens 2026-01-30 16:45:57 +01:00
Marco Vinciguerra
f315f3a8c0 feat: add new tests 2026-01-20 14:01:09 +01:00
majiayu000
ebd909ad74 fix: use 'content' instead of 'context' in generate_answer_node_k_level
The PromptTemplate expects 'content' variable but the code was passing
'context', causing KeyError during graph execution.

Fixes #995

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-04 20:23:17 +08:00
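The KeyError described in ebd909ad74 can be reproduced in miniature with plain string formatting. This is a minimal sketch, not the repository's code: the template text and the `render` / `try_render` helpers are hypothetical, and `str.format` stands in for the PromptTemplate.

```python
# Hypothetical template; the real prompt text in the repository is not shown here.
TEMPLATE = "Content: {content}\nQuestion: {question}"


def render(template: str, **variables: str) -> str:
    """Fill the template; str.format raises KeyError if a placeholder has no value."""
    return template.format(**variables)


def try_render(**variables: str) -> str:
    """Render the template, reporting a missing placeholder instead of raising."""
    try:
        return render(TEMPLATE, **variables)
    except KeyError as exc:
        # exc.args[0] is the placeholder name that had no matching argument.
        return f"missing variable: {exc.args[0]}"


# Passing 'context' leaves the 'content' placeholder unfilled.
print(try_render(context="page text", question="What is the title?"))
# Passing 'content' matches the template and renders normally.
print(try_render(content="page text", question="What is the title?"))
```

The extra `context` argument is silently ignored; the failure comes only from the unfilled `content` placeholder, which is why renaming the argument is enough to fix it.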
majiayu000
621d3a5bba fix: update langchain imports for v1.0+ compatibility
- Change ResponseSchema and StructuredOutputParser imports to use
  langchain_classic.output_parsers instead of langchain_core.output_parsers
- Change create_extraction_chain import to use langchain_classic.chains
  instead of langchain.chains
- Add langchain-classic>=1.0.0 as explicit dependency
- Relax async-timeout requirement to >=4.0.0 for compatibility

Fixes #1017

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-01-04 20:18:49 +08:00
Jesse Peters
e81db730a2
Updates dependencies 2025-12-18 18:34:11 -06:00
Marco Vinciguerra
2cd3c8c6d0 feat: add openai gpt 5.2 2025-12-12 16:23:38 -08:00
mohammadehsanansari
e230856fbe added posthog proxy 2025-12-08 14:23:35 +05:30
Marco Vinciguerra
ece2bb4fa3
Merge pull request #1029 from ScrapeGraphAI/copilot/fix-e402-import-issues
Fix E402 import ordering in smart_scraper_graph.py
2025-12-04 08:07:54 -08:00
copilot-swe-agent[bot]
ced0373951 Fix E402 errors in smart_scraper_graph.py by moving imports to top
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-12-03 23:35:17 +00:00
copilot-swe-agent[bot]
6deac76bec Apply black and isort formatting to modified files
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-12-03 23:17:37 +00:00
copilot-swe-agent[bot]
7cb49e450a Fix whitespace formatting errors (W291, W292, W293)
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-12-03 23:15:33 +00:00
Denis Ershov
6c5f7bb155
fix: add null check for document.body when reading scrollHeight
On some pages document.body can be null (non-standard DOM structure or early script execution).
Accessing document.body.scrollHeight caused errors in these cases.
2025-12-03 13:53:06 +03:00
copilot-swe-agent[bot]
8cf81c986a Add documentation explaining __new__ usage in Nvidia class
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-11-26 20:01:37 +00:00
copilot-swe-agent[bot]
f23072cc8f Fix linting issues - remove unused imports and whitespace
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-11-26 19:59:34 +00:00
copilot-swe-agent[bot]
cddf497c81 Add NVIDIA LLM integration support
- Created Nvidia wrapper model class in scrapegraphai/models/nvidia.py
- Updated models/__init__.py to export Nvidia class
- Updated abstract_graph.py to use Nvidia wrapper instead of direct ChatNVIDIA import
- Added nvidia as optional dependency in pyproject.toml
- Created example usage file for NVIDIA in examples/smart_scraper_graph/nvidia/

Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-11-26 19:56:44 +00:00
copilot-swe-agent[bot]
9439fe5932 Fix langchain import issues blocking tests
Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>
2025-11-26 17:33:59 +00:00
Harsh Abasaheb Chavan
e81a4ed745 feat: Add configurable timeout to FetchNode
- Add timeout parameter to FetchNode (default: 30 seconds)
- Apply timeout to requests.get() calls to prevent indefinite hangs
- Implement timeout for PDF parsing using ThreadPoolExecutor
- Propagate timeout to ChromiumLoader via loader_kwargs
- Add comprehensive unit tests for timeout functionality
- Fully backward compatible (timeout can be disabled with None)

Fixes issue with requests.get() and PDF parsing blocking indefinitely
on slow/unresponsive servers or large documents.

Usage:
  node_config={'timeout': 30}  # Custom timeout
  node_config={'timeout': None}  # Disable timeout
  node_config={}  # Use default 30s timeout
2025-11-01 09:08:13 +00:00
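The timeout behaviour listed in e81a4ed745 (30-second default, `None` to disable, a thread pool around blocking work) can be sketched as follows. This is an illustration under stated assumptions, not FetchNode's actual implementation: `resolve_timeout`, `run_with_timeout` and `slow_parse` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

DEFAULT_TIMEOUT = 30  # seconds, the default stated in the commit message


def resolve_timeout(node_config: dict):
    """Return the configured timeout; an explicit None disables it."""
    return node_config.get("timeout", DEFAULT_TIMEOUT)


def run_with_timeout(func, timeout, *args):
    """Run func(*args) in a worker thread and stop waiting after `timeout` seconds.

    timeout=None waits indefinitely.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        raise TimeoutError(f"operation exceeded {timeout}s") from None
    finally:
        # The worker thread cannot be killed; it is left to finish on its own.
        executor.shutdown(wait=False)


def slow_parse(seconds: float) -> str:
    """Hypothetical stand-in for a blocking step such as PDF parsing."""
    time.sleep(seconds)
    return "parsed"
```

For example, `run_with_timeout(slow_parse, resolve_timeout({'timeout': 0.05}), 0.3)` raises `TimeoutError`, while `resolve_timeout({'timeout': None})` returns `None` and the call waits for the result. Note the caller stops waiting, but the underlying work keeps running until it completes.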
Lorenzo Padoan
8f0433cfb6 fix: url redirect 2025-10-23 19:11:16 -07:00
Marco Vinciguerra
79db9b9f13 feat: update model tokens 2025-10-22 09:45:34 -07:00
Mirza-Samad-Ahmed-Baig
4cc1fc5061 Fix critical schema transformation bugs and improve logging
- Fixed typo in docstring (trasfrom -> transforms)
- Added comprehensive error handling for missing schema keys
- Added fallback values for malformed array items and missing references
- Improved logging in SmartScraperGraph (replaced print with logger)
- Added proper validation for pydantic schema structure

These fixes prevent KeyError exceptions and improve production reliability.
2025-07-25 13:33:48 +05:00
Marco Vinciguerra
10c39b8978
Merge pull request #993 from ScrapeGraphAI/main
alignment
2025-06-24 17:29:48 +02:00
Samuel Yiu
df24c391a3
removed an extra space in TEMPLATE_CHUKS_CSV in generate_answer_node_csv_prompts.py 2025-06-21 15:42:42 +01:00
Marco Vinciguerra
0c2481fffe feat: add new oss link 2025-06-21 13:09:47 +02:00
Marco Vinciguerra
73403755da feat: add markdownify endpoint 2025-06-13 12:41:21 +02:00
Marco Vinciguerra
8c54162087 feat: update logs 2025-06-07 16:53:55 +02:00
Marco Vinciguerra
cd29791894 feat: add adv 2025-06-07 16:41:11 +02:00
Marco Vinciguerra
e846a14155 fix: bug on generate answer node 2025-06-06 20:42:20 +02:00
Marco Vinciguerra
9b4efaf287
Merge branch 'main' into pre/beta 2025-06-06 12:46:34 +02:00
Vinh Thieu
3f1827274c fix: grok integration and add new grok models 2025-05-31 00:13:44 +07:00
Marco Vinciguerra
0c476a4a7b feat: add grok integration 2025-05-30 14:25:24 +02:00
Marco Vinciguerra
8e706d43ef feat: add new models 2025-05-23 12:25:35 +02:00
flst01
e660914994
Fixed Issue: Burr integration ParseNode by updating parse_node.py
Before: Using Burr Integration in SmartScraperGraph resulted in error: ValueError: Action ParseNode attempted to write to keys {'content'} that it did not declare. It declared: (['parsed_doc'])!
2025-05-08 02:30:39 +02:00
flst01
79c8046711
Fix issue: Burr integration by updating fetch_node.py
Before Burr integration would cause an error in fetch:
ValueError: Action Fetch attempted to write to keys {'original_html'} that it did not declare. It declared: (['doc'])!
2025-05-08 02:10:55 +02:00
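Both Burr errors quoted above (e660914994 and 79c8046711) report the same condition: an action wrote a state key it had not declared. The check can be pictured roughly as below. This is a hypothetical sketch of the idea, not Burr's implementation; `check_writes` and its error wording are assumptions.

```python
def check_writes(action_name: str, declared: list, result: dict) -> dict:
    """Reject any result key that the action did not declare as a write."""
    undeclared = set(result) - set(declared)
    if undeclared:
        raise ValueError(
            f"Action {action_name} attempted to write to keys "
            f"{sorted(undeclared)} that it did not declare. "
            f"It declared: {declared}"
        )
    return result


# Declaring only 'doc' while also writing 'original_html' fails the check.
try:
    check_writes("Fetch", ["doc"], {"doc": "...", "original_html": "<html>"})
except ValueError as err:
    print(err)

# Declaring both keys lets the same result through.
check_writes("Fetch", ["doc", "original_html"], {"doc": "...", "original_html": "<html>"})
```

Under this reading, the fix is to make the set of declared keys match what the node actually writes.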
Marco Vinciguerra
5c37f3e490
Merge branch 'main' into pre/beta 2025-04-29 10:05:18 +02:00
Marco Vinciguerra
97ee48cb52 feat: add new openai models 2025-04-29 09:47:01 +02:00
souvik03-136
b552aa902f feat: enhance error handling and validation across utility modules
- Add Pydantic models for state validation in code_error_analysis.py and code_error_correction.py
- Implement comprehensive key existence checks to prevent KeyError exceptions
- Create custom exception hierarchy for better error management
- Add improved PDF detection with regex pattern matching in research_web.py
- Implement input validation for all public functions
- Add detailed error messages and type hints
2025-04-28 21:09:13 +05:30
Marco Vinciguerra
54d5e46d4c feat: add 4.1 integration 2025-04-15 15:05:41 +02:00
Marco Vinciguerra
3df0eaf405
Merge branch 'main' into pre/beta 2025-04-15 14:57:24 +02:00
CodeBeaver
b09a5838d1 codebeaver/pre/beta-963 - . 2025-04-14 07:50:46 +00:00
Marco Vinciguerra
df4aa5fd65
Merge pull request #962 from lrdoflnlss/add-js-scraping
tune scraper
2025-04-14 09:14:11 +02:00
lrd
98a7bab7b1 for test 2025-04-11 17:15:08 +03:00
Marco Vinciguerra
087cbcbc8f feat: add new model openai support 2025-03-26 08:44:04 +01:00
Pedro Perez de Ayala
562a97c3d2 Fix schema option not working 2025-03-13 21:57:19 +01:00
Marco Vinciguerra
fc0a148017 feat: add integration for o3min 2025-03-13 14:36:05 +01:00
Marco Vinciguerra
cff799b50d fix: add new gpt model 2025-03-12 16:39:52 +01:00
Marco Vinciguerra
18dbd05725 Merge branch 'main' of https://github.com/ScrapeGraphAI/Scrapegraph-ai 2025-03-10 11:27:35 +01:00