Scrapegraph-ai/examples

Benchmark analysis

Local models

The two benchmarked websites are:

Example 1: https://perinim.github.io/projects
Example 2: https://www.wired.com

Both are stored locally as .txt files, so the measurements do not depend on the internet connection.

Times are measured in seconds.

The model run for this benchmark is Mistral on Ollama, with nomic-embed-text as the embedding model.

| Hardware | Example 1 | Example 2 |
| --- | --- | --- |
| MacBook Pro 14" M1 | 11.60s | 26.61s |
| MacBook Pro 16" M2 Max | 8.05s | 12.17s |
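A minimal sketch of how timings like these could be collected from a locally stored page. This is not the benchmark script used for the table: `time_run` and the `run` callable are hypothetical names, and excluding the file read from the measured interval is an assumption.

```python
import tempfile
import time
from pathlib import Path
from typing import Callable, Tuple


def time_run(run: Callable[[str], object], source_path: Path) -> Tuple[object, float]:
    """Run `run` on the text of a locally stored page; return (result, seconds)."""
    # The page is read from disk, so no network access is involved.
    text = source_path.read_text(encoding="utf-8")
    # Assumption: only the pipeline call is timed, not the file read.
    start = time.perf_counter()
    result = run(text)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in for a real scraping pipeline: it only counts words.
    def count_words(text: str) -> int:
        return len(text.split())

    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "example.txt"  # hypothetical local copy of a page
        page.write_text("project one project two", encoding="utf-8")
        result, seconds = time_run(count_words, page)
        print(f"result={result} time={seconds:.2f}s")
```

In a real run, `run` would wrap the call into the scraping graph configured with the local model, and `source_path` would point at one of the two stored .txt files.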

Note: the Docker examples were not run on devices other than the MacBook because performance is too slow (roughly 10 times slower than Ollama). The results are the following:

| Hardware | Example 1 | Example 2 |
| --- | --- | --- |
| MacBook 14" M1 Pro | 139.89s | Too long |

Performance on API services

Example 1: personal portfolio

URL: https://perinim.github.io/projects

Task: List me all the projects with their description.

| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
| --- | --- | --- | --- | --- | --- | --- |
| gpt-3.5-turbo | 25.22 | 445 | 272 | 173 | 1 | 0.000754 |
| gpt-4-turbo-preview | 9.53 | 449 | 272 | 177 | 1 | 0.00803 |

Example 2: Wired

URL: https://www.wired.com

Task: List me all the articles with their description.

| Name | Execution time (seconds) | total_tokens | prompt_tokens | completion_tokens | successful_requests | total_cost_USD |
| --- | --- | --- | --- | --- | --- | --- |
| gpt-3.5-turbo | 25.89 | 445 | 272 | 173 | 1 | 0.000754 |
| gpt-4-turbo-preview | 64.70 | 3573 | 2199 | 1374 | 1 | 0.06321 |
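The token and cost columns can be cross-checked with simple arithmetic: total_tokens is prompt_tokens plus completion_tokens, and the cost is a weighted sum of the two. The sketch below is illustrative only. The per-1,000-token prices are assumptions that do not appear in this README (they were chosen because they are consistent with the reported totals), and `estimate_cost_usd` is a hypothetical helper.

```python
# Assumed USD prices per 1,000 tokens as (prompt, completion).
# These values are NOT stated in this README; they are assumptions.
ASSUMED_PRICES_PER_1K = {
    "gpt-3.5-turbo": (0.0015, 0.002),
    "gpt-4-turbo-preview": (0.01, 0.03),
}


def estimate_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    prompt_price, completion_price = ASSUMED_PRICES_PER_1K[model]
    return (prompt_tokens * prompt_price + completion_tokens * completion_price) / 1000


# (model, prompt_tokens, completion_tokens) taken from the tables above.
rows = [
    ("gpt-3.5-turbo", 272, 173),
    ("gpt-4-turbo-preview", 272, 177),
    ("gpt-4-turbo-preview", 2199, 1374),
]

for model, prompt_tokens, completion_tokens in rows:
    total_tokens = prompt_tokens + completion_tokens
    cost = estimate_cost_usd(model, prompt_tokens, completion_tokens)
    print(f"{model}: total_tokens={total_tokens} cost=${cost:.6f}")
```

Under these assumed prices the three rows come out to 0.000754, 0.008030 and 0.063210 USD, which agrees with the total_cost_USD values reported above.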