Scrapegraph-ai/examples/extras/undected_playwright.py
Abe Flansburg a86e7d6210
enhancement: add support for Playwright's storage_state parameter (#832)
* add support for Playwright `storage_state`

* add storage_state param to node_config

* add sleep for testing

* add sleep in _with_js_support for testing

* remove asyncio.sleep() from tests

* fix typo in existing example filename; add auth example

* add example `authenticated_playwright`

* update source link in example to /feed

* add `storage_state` to missing graphs
2024-12-03 09:02:11 +01:00


"""
Basic example of a scraping pipeline using SmartScraperGraph
with the undetected_chromedriver backend
"""
import os
from dotenv import load_dotenv
from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info
load_dotenv()
# ************************************************
# Define the configuration for the graph
# ************************************************
groq_key = os.getenv("GROQ_APIKEY")
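graph_config = {
    "llm": {
        "model": "groq/gemma-7b-it",
        "api_key": groq_key,
        "temperature": 0
    },
    # Run the browser with a visible window instead of headless mode
    "headless": False,
    # Browser backend used to fetch the page
    "backend": "undetected_chromedriver"
}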
# ************************************************
# Create the SmartScraperGraph instance and run it
# ************************************************
smart_scraper_graph = SmartScraperGraph(
    prompt="List me all the projects with their description.",
    # also accepts a string with the already downloaded HTML code
    source="https://perinim.github.io/projects/",
    config=graph_config
)
result = smart_scraper_graph.run()
print(result)
# ************************************************
# Get graph execution info
# ************************************************
graph_exec_info = smart_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))
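
The commit message above adds support for Playwright's `storage_state` parameter, which this example does not use. The following is a minimal sketch of how that option might be threaded into a config shaped like `graph_config`. The helper `build_graph_config` is hypothetical, `./state.json` is a placeholder path, and placing `storage_state` at the top level of the config is an assumption drawn from the commit message, not something this file confirms. The `backend` key is left out because `storage_state` is a Playwright feature.

```python
from typing import Any, Dict, Optional


def build_graph_config(
    api_key: Optional[str],
    storage_state: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a config dict, optionally adding a storage_state path.

    Hypothetical helper for illustration; it is not part of scrapegraphai.
    """
    config: Dict[str, Any] = {
        "llm": {
            "model": "groq/gemma-7b-it",
            "api_key": api_key,
            "temperature": 0,
        },
        "headless": False,
    }
    if storage_state is not None:
        # Assumption: the option is read from the top level of the config.
        config["storage_state"] = storage_state
    return config


if __name__ == "__main__":
    # "./state.json" is a placeholder for a saved Playwright storage state file.
    print(build_graph_config("dummy-key", storage_state="./state.json"))
```

The resulting dict could then be passed as `config=` to `SmartScraperGraph` in the same way as `graph_config` above, provided the installed version accepts the key.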