Mirror of https://github.com/VinciGit00/Scrapegraph-ai.git (synced 2026-06-12 21:01:54 +08:00)
- Add timeout parameter to FetchNode (default: 30 seconds)
- Apply timeout to requests.get() calls to prevent indefinite hangs
- Implement timeout for PDF parsing using ThreadPoolExecutor
- Propagate timeout to ChromiumLoader via loader_kwargs
- Add comprehensive unit tests for timeout functionality
- Fully backward compatible (timeout can be disabled with None)
Fixes issue with requests.get() and PDF parsing blocking indefinitely
on slow/unresponsive servers or large documents.
Usage:
node_config={'timeout': 30} # Custom timeout
node_config={'timeout': None} # Disable timeout
node_config={} # Use default 30s timeout
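The timeout behaviour described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the actual FetchNode code: `DEFAULT_TIMEOUT`, `resolve_timeout`, and `run_with_timeout` are hypothetical names introduced here. It shows how a missing `timeout` key could fall back to 30 seconds, how `None` could disable the limit, and how a blocking call such as PDF parsing could be bounded with `ThreadPoolExecutor`.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

# Assumed constant mirroring the 30-second default named in the commit message.
DEFAULT_TIMEOUT = 30


def resolve_timeout(node_config):
    """Hypothetical helper: a missing key means the default, None disables the timeout."""
    return node_config.get("timeout", DEFAULT_TIMEOUT)


def run_with_timeout(func, timeout, *args, **kwargs):
    """Hypothetical helper: run func, raising TimeoutError if it exceeds `timeout` seconds."""
    if timeout is None:
        # Timeout disabled: call directly and wait as long as it takes.
        return func(*args, **kwargs)
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(func, *args, **kwargs)
        return future.result(timeout=timeout)
    except FutureTimeoutError:
        raise TimeoutError(f"call exceeded {timeout} seconds") from None
    finally:
        # Do not block on the worker; a thread that is still running
        # cannot be killed and finishes in the background.
        executor.shutdown(wait=False)


if __name__ == "__main__":
    print(resolve_timeout({}))
    print(resolve_timeout({"timeout": None}))
    try:
        run_with_timeout(time.sleep, 0.05, 0.3)
    except TimeoutError as exc:
        print("timed out:", exc)
```

For the HTTP side, `requests.get()` accepts a `timeout` argument directly, so no worker thread is needed there. The thread-based wrapper only stops waiting for the result. It does not interrupt the underlying work, which is a known limitation of this approach.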
| Name |
|---|
| graphs |
| inputs |
| nodes |
| utils |
| Readme.md |
| test_chromium.py |
| test_cleanup_html.py |
| test_csv_scraper_multi_graph.py |
| test_depth_search_graph.py |
| test_fetch_node_timeout.py |
| test_generate_answer_node.py |
| test_json_scraper_graph.py |
| test_json_scraper_multi_graph.py |
| test_models_tokens.py |
| test_omni_search_graph.py |
| test_scrape_do.py |
| test_script_creator_multi_graph.py |
| test_search_graph.py |
| test_smart_scraper_multi_concat_graph.py |
Test section
For the tests in the graphs and nodes folders, a dedicated repo was created as an example (link of the repo). The test website is hosted here. Remember to start Ollama and to have the LLM installed on your PC.
To run the tests, use the command:
pytest