Scrapegraph-ai/scrapegraphai
Umut CAN 827f7260ad This commit focuses on optimizing the utility modules in the codebase for better performance and maintainability. Key improvements include:
- More efficient HTML processing with combined regex operations and optimized tag handling
- Enhanced deep copy functionality with better type handling and optimized recursion
- Refactored web search with improved error handling and modular helper functions

The changes maintain all existing functionality while improving code quality, performance, and maintainability. Documentation and type hints have been enhanced throughout.
Optimize utils modules for better performance and maintainability

- Improve HTML cleanup and minification:
  - Combine regex operations for better performance
  - Add better error handling for HTML processing
  - Optimize tag removal and attribute filtering
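
As a rough illustration of what "combining regex operations" can look like, the sketch below removes comments, `<script>` blocks, and `<style>` blocks in a single precompiled pass, then collapses whitespace. The name `minify_html` and the patterns are assumptions made for this example, not the project's actual code, and regex-based cleanup of this kind is only a heuristic for well-behaved HTML.

```python
import re

# Hypothetical helper names; a sketch, not the repository's implementation.
# One alternation handles comments, <script> blocks and <style> blocks together,
# so the input is scanned once instead of three times.
_REMOVE_BLOCKS = re.compile(
    r"<!--.*?-->|<(script|style)\b[^>]*>.*?</\1\s*>",
    re.DOTALL | re.IGNORECASE,
)
_WHITESPACE_BETWEEN_TAGS = re.compile(r">\s+<")
_WHITESPACE_RUNS = re.compile(r"\s+")


def minify_html(html: str) -> str:
    """Strip comments/script/style and collapse whitespace."""
    if not isinstance(html, str):
        raise TypeError("html must be a string")
    html = _REMOVE_BLOCKS.sub("", html)
    html = _WHITESPACE_BETWEEN_TAGS.sub("><", html)
    return _WHITESPACE_RUNS.sub(" ", html).strip()
```

Compiling the patterns at module level means repeated calls reuse the same pattern objects, which is usually where most of the saving comes from.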

- Enhance deep copy functionality:
  - Add special case handling for primitive types
  - Improve type checking and error handling
  - Optimize recursive copying for collections
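
A minimal sketch of the idea, assuming a helper named `safe_deepcopy` (the name and the exact fallback behaviour are assumptions, not taken from the commit): immutable primitives are returned as-is, built-in collections are copied recursively, and anything else falls back to `copy.deepcopy`. Unlike `copy.deepcopy`, this sketch does not track visited objects, so it is not suitable for cyclic structures.

```python
import copy
from typing import Any

# Immutable built-ins can be shared safely, so copying them is skipped.
_PRIMITIVES = (int, float, complex, bool, str, bytes, type(None))


def safe_deepcopy(obj: Any) -> Any:
    """Deep-copy obj, with fast paths for primitives and built-in collections."""
    if isinstance(obj, _PRIMITIVES):
        return obj
    kind = type(obj)  # exact type checks: subclasses go to copy.deepcopy below
    if kind is dict:
        return {safe_deepcopy(k): safe_deepcopy(v) for k, v in obj.items()}
    if kind is list:
        return [safe_deepcopy(item) for item in obj]
    if kind is tuple:
        return tuple(safe_deepcopy(item) for item in obj)
    if kind is set or kind is frozenset:
        return kind(safe_deepcopy(item) for item in obj)
    try:
        return copy.deepcopy(obj)
    except (TypeError, ValueError):
        # Assumption for this sketch: objects that cannot be copied
        # (for example locks) are shared by reference instead of failing.
        return obj
```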

- Refactor web search functionality:
  - Add input validation and error handling
  - Split search logic into separate helper functions
  - Improve proxy handling and configuration
  - Add better timeout and error management
  - Optimize URL filtering and processing
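
The split into validation, backend call, and URL filtering might look roughly like the sketch below. All names (`validate_query`, `filter_urls`, `search`) and the injected `backend` callable are hypothetical; the real module talks to actual search engines, which are left out here so the example stays self-contained.

```python
from typing import Callable, Iterable, List, Tuple
from urllib.parse import urlparse


def validate_query(query: str, max_results: int) -> None:
    """Reject empty queries and non-positive result counts early."""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    if not isinstance(max_results, int) or max_results < 1:
        raise ValueError("max_results must be a positive integer")


def filter_urls(urls: Iterable[str],
                skip_suffixes: Tuple[str, ...] = (".pdf",)) -> List[str]:
    """Keep unique http(s) URLs, dropping paths with unwanted suffixes."""
    seen = set()
    kept = []
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        if parsed.path.lower().endswith(skip_suffixes):
            continue
        if url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


def search(query: str, max_results: int,
           backend: Callable[[str, int], Iterable[str]]) -> List[str]:
    """Validate input, call the (hypothetical) backend, then filter results."""
    validate_query(query, max_results)
    try:
        raw = backend(query.strip(), max_results)
    except Exception as exc:
        raise RuntimeError(f"search backend failed: {exc}") from exc
    return filter_urls(raw)[:max_results]
```

Passing the backend in as a callable keeps the validation and filtering helpers testable without network access.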

Technical improvements:
- Better type hints and documentation
- More efficient data structures
- Improved error handling and validation
- Reduced code duplication
- Better separation of concerns

No breaking changes - all existing functionality maintained
2024-10-28 22:40:32 +03:00
builders add docstring files 2024-10-24 15:28:27 +02:00
docloaders add docstring files 2024-10-24 15:28:27 +02:00
graphs Merge pull request #764 from ScrapeGraphAI/pre/beta 2024-10-26 10:05:15 +02:00
helpers add docstring files 2024-10-24 15:28:27 +02:00
integrations removed unused files 2024-10-12 09:41:02 +02:00
models add docstring files 2024-10-24 15:28:27 +02:00
nodes Merge pull request #764 from ScrapeGraphAI/pre/beta 2024-10-26 10:05:15 +02:00
prompts Merge pull request #764 from ScrapeGraphAI/pre/beta 2024-10-26 10:05:15 +02:00
telemetry removed unused files 2024-10-12 09:41:02 +02:00
utils Optimize utils modules for better performance and maintainability 2024-10-28 22:40:32 +03:00
__init__.py add docstring files 2024-10-24 15:28:27 +02:00