Scrapegraph-ai

mirror of https://github.com/VinciGit00/Scrapegraph-ai.git synced 2026-06-04 21:01:04 +08:00

Author	SHA1	Message	Date
octo-patch	6a2f8ecc7b	feat: add MiniMax as a supported LLM provider MiniMax provides an OpenAI-compatible API, making integration straightforward. This adds: - MiniMax model wrapper class (OpenAI-compatible) - Model token mappings for MiniMax-M1, M2, and M2.5 models - Provider routing in abstract_graph factory - README update listing MiniMax as a supported provider	2026-03-14 22:54:38 +08:00
Vikrant-Khedkar	96dc59c797	remove client side validation to save cpu usage for user	2026-02-16 15:41:32 +05:30
Vikrant-Khedkar	b17b154bff	fix: handle list content in telemetry event validation	2026-02-16 15:20:04 +05:30
Vikrant-Khedkar	518945dd75	use custom api for tracing	2026-02-16 13:24:45 +05:30
Marco Vinciguerra	9c24ecc180	feat: update model tokens	2026-01-30 16:45:57 +01:00
Marco Vinciguerra	f315f3a8c0	feat: add new tests	2026-01-20 14:01:09 +01:00
majiayu000	ebd909ad74	fix: use 'content' instead of 'context' in generate_answer_node_k_level The PromptTemplate expects 'content' variable but the code was passing 'context', causing KeyError during graph execution. Fixes #995 Signed-off-by: majiayu000 <1835304752@qq.com>	2026-01-04 20:23:17 +08:00
majiayu000	621d3a5bba	fix: update langchain imports for v1.0+ compatibility - Change ResponseSchema and StructuredOutputParser imports to use langchain_classic.output_parsers instead of langchain_core.output_parsers - Change create_extraction_chain import to use langchain_classic.chains instead of langchain.chains - Add langchain-classic>=1.0.0 as explicit dependency - Relax async-timeout requirement to >=4.0.0 for compatibility Fixes #1017 Signed-off-by: majiayu000 <1835304752@qq.com>	2026-01-04 20:18:49 +08:00
Jesse Peters	e81db730a2	Updates dependencies	2025-12-18 18:34:11 -06:00
Marco Vinciguerra	2cd3c8c6d0	feat: add openai gpt 5.2	2025-12-12 16:23:38 -08:00
mohammadehsanansari	e230856fbe	added posthog proxy	2025-12-08 14:23:35 +05:30
Marco Vinciguerra	ece2bb4fa3	Merge pull request #1029 from ScrapeGraphAI/copilot/fix-e402-import-issues Fix E402 import ordering in smart_scraper_graph.py	2025-12-04 08:07:54 -08:00
copilot-swe-agent[bot]	ced0373951	Fix E402 errors in smart_scraper_graph.py by moving imports to top Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-12-03 23:35:17 +00:00
copilot-swe-agent[bot]	6deac76bec	Apply black and isort formatting to modified files Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-12-03 23:17:37 +00:00
copilot-swe-agent[bot]	7cb49e450a	Fix whitespace formatting errors (W291, W292, W293) Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-12-03 23:15:33 +00:00
Denis Ershov	6c5f7bb155	fix: add null check for document.body when reading scrollHeight On some pages document.body can be null (non-standard DOM structure or early script execution). Accessing document.body.scrollHeight caused errors in these cases.	2025-12-03 13:53:06 +03:00
copilot-swe-agent[bot]	8cf81c986a	Add documentation explaining __new__ usage in Nvidia class Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-11-26 20:01:37 +00:00
copilot-swe-agent[bot]	f23072cc8f	Fix linting issues - remove unused imports and whitespace Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-11-26 19:59:34 +00:00
copilot-swe-agent[bot]	cddf497c81	Add NVIDIA LLM integration support - Created Nvidia wrapper model class in scrapegraphai/models/nvidia.py - Updated models/__init__.py to export Nvidia class - Updated abstract_graph.py to use Nvidia wrapper instead of direct ChatNVIDIA import - Added nvidia as optional dependency in pyproject.toml - Created example usage file for NVIDIA in examples/smart_scraper_graph/nvidia/ Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-11-26 19:56:44 +00:00
copilot-swe-agent[bot]	9439fe5932	Fix langchain import issues blocking tests Co-authored-by: VinciGit00 <88108002+VinciGit00@users.noreply.github.com>	2025-11-26 17:33:59 +00:00
Harsh Abasaheb Chavan	e81a4ed745	feat: Add configurable timeout to FetchNode - Add timeout parameter to FetchNode (default: 30 seconds) - Apply timeout to requests.get() calls to prevent indefinite hangs - Implement timeout for PDF parsing using ThreadPoolExecutor - Propagate timeout to ChromiumLoader via loader_kwargs - Add comprehensive unit tests for timeout functionality - Fully backward compatible (timeout can be disabled with None) Fixes issue with requests.get() and PDF parsing blocking indefinitely on slow/unresponsive servers or large documents. Usage: node_config={'timeout': 30} (custom timeout); node_config={'timeout': None} (disable timeout); node_config={} (use default 30s timeout)	2025-11-01 09:08:13 +00:00
Lorenzo Padoan	8f0433cfb6	fix: url redirect	2025-10-23 19:11:16 -07:00
Marco Vinciguerra	79db9b9f13	feat: update model tokens	2025-10-22 09:45:34 -07:00
Mirza-Samad-Ahmed-Baig	4cc1fc5061	Fix critical schema transformation bugs and improve logging - Fixed typo in docstring (trasfrom -> transforms) - Added comprehensive error handling for missing schema keys - Added fallback values for malformed array items and missing references - Improved logging in SmartScraperGraph (replaced print with logger) - Added proper validation for pydantic schema structure These fixes prevent KeyError exceptions and improve production reliability.	2025-07-25 13:33:48 +05:00
Marco Vinciguerra	10c39b8978	Merge pull request #993 from ScrapeGraphAI/main alignment	2025-06-24 17:29:48 +02:00
Samuel Yiu	df24c391a3	removed an extra space in TEMPLATE_CHUKS_CSV in generate_answer_node_csv_prompts.py	2025-06-21 15:42:42 +01:00
Marco Vinciguerra	0c2481fffe	feat: add new oss link	2025-06-21 13:09:47 +02:00
Marco Vinciguerra	73403755da	feat: add markdownify endpoint	2025-06-13 12:41:21 +02:00
Marco Vinciguerra	8c54162087	feat: update logs	2025-06-07 16:53:55 +02:00
Marco Vinciguerra	cd29791894	feat: add adv	2025-06-07 16:41:11 +02:00
Marco Vinciguerra	e846a14155	fix: bug on generate answer node	2025-06-06 20:42:20 +02:00
Marco Vinciguerra	9b4efaf287	Merge branch 'main' into pre/beta	2025-06-06 12:46:34 +02:00
Vinh Thieu	3f1827274c	fix: grok integration and add new grok models	2025-05-31 00:13:44 +07:00
Marco Vinciguerra	0c476a4a7b	feat: add grok integration	2025-05-30 14:25:24 +02:00
Marco Vinciguerra	8e706d43ef	feat: add new models	2025-05-23 12:25:35 +02:00
flst01	e660914994	Fixed Issue: Burr integration ParseNode by updating parse_node.py Before: Using Burr Integration in SmartScraperGraph resulted in error: ValueError: Action ParseNode attempted to write to keys {'content'} that it did not declare. It declared: (['parsed_doc'])!	2025-05-08 02:30:39 +02:00
flst01	79c8046711	Fix issue: Burr integration by updating fetch_node.py Before Burr integration would cause an error in fetch: ValueError: Action Fetch attempted to write to keys {'original_html'} that it did not declare. It declared: (['doc'])!	2025-05-08 02:10:55 +02:00
Marco Vinciguerra	5c37f3e490	Merge branch 'main' into pre/beta	2025-04-29 10:05:18 +02:00
Marco Vinciguerra	97ee48cb52	feat: add new openai models	2025-04-29 09:47:01 +02:00
souvik03-136	b552aa902f	feat: enhance error handling and validation across utility modules - Add Pydantic models for state validation in code_error_analysis.py and code_error_correction.py - Implement comprehensive key existence checks to prevent KeyError exceptions - Create custom exception hierarchy for better error management - Add improved PDF detection with regex pattern matching in research_web.py - Implement input validation for all public functions - Add detailed error messages and type hints	2025-04-28 21:09:13 +05:30
Marco Vinciguerra	54d5e46d4c	feat: add 4.1 integration	2025-04-15 15:05:41 +02:00
Marco Vinciguerra	3df0eaf405	Merge branch 'main' into pre/beta	2025-04-15 14:57:24 +02:00
CodeBeaver	b09a5838d1	codebeaver/pre/beta-963 - .	2025-04-14 07:50:46 +00:00
Marco Vinciguerra	df4aa5fd65	Merge pull request #962 from lrdoflnlss/add-js-scraping tune scraper	2025-04-14 09:14:11 +02:00
lrd	98a7bab7b1	for test	2025-04-11 17:15:08 +03:00
Marco Vinciguerra	087cbcbc8f	feat: add new model openai support	2025-03-26 08:44:04 +01:00
Pedro Perez de Ayala	562a97c3d2	Fix schema option not working	2025-03-13 21:57:19 +01:00
Marco Vinciguerra	fc0a148017	feat: add integration for o3min	2025-03-13 14:36:05 +01:00
Marco Vinciguerra	cff799b50d	fix: add new gpt model	2025-03-12 16:39:52 +01:00
Marco Vinciguerra	18dbd05725	Merge branch 'main' of https://github.com/ScrapeGraphAI/Scrapegraph-ai	2025-03-10 11:27:35 +01:00

1145 Commits