mirror of
https://github.com/VinciGit00/Scrapegraph-ai.git
synced 2026-06-04 21:01:04 +08:00
Markdownify Graph Example
This example demonstrates how to use the Markdownify graph to convert HTML content to Markdown format.
Features
- Convert HTML content to clean, readable Markdown
- Support for both URL and direct HTML input
- Maintains formatting and structure of the original content
- Handles complex HTML elements and nested structures
Usage
from scrapegraphai import Client
from scrapegraphai.logger import sgai_logger
# Set up logging
sgai_logger.set_logging(level="INFO")
# Initialize the client
sgai_client = Client(api_key="your-api-key")
# Example 1: Convert a website to Markdown
response = sgai_client.markdownify(
    website_url="https://example.com"
)
print(response.markdown)
# Example 2: Convert HTML content directly
html_content = """
<div>
<h1>Hello World</h1>
<p>This is a <strong>test</strong> paragraph.</p>
</div>
"""
response = sgai_client.markdownify(
    html_content=html_content
)
print(response.markdown)
Parameters
The markdownify method accepts the following parameters:
- website_url (str, optional): The URL of the website to convert to Markdown
- html_content (str, optional): Direct HTML content to convert to Markdown
Note: You must provide either website_url or html_content, but not both.
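The "either one, but not both" rule can be checked before any request is sent. The helper below is a minimal sketch of that check; build_markdownify_kwargs is a hypothetical name introduced here for illustration and is not part of the library.

```python
from typing import Optional


def build_markdownify_kwargs(
    website_url: Optional[str] = None,
    html_content: Optional[str] = None,
) -> dict:
    """Return keyword arguments for markdownify, enforcing exactly one input.

    Hypothetical helper: it only mirrors the rule stated in this README.
    """
    # True when both are missing or both are present; either case is invalid.
    if (website_url is None) == (html_content is None):
        raise ValueError("Provide exactly one of website_url or html_content")
    if website_url is not None:
        return {"website_url": website_url}
    return {"html_content": html_content}
```

The returned dictionary could then be unpacked into the call, for example sgai_client.markdownify(**build_markdownify_kwargs(website_url="https://example.com")).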
Response
The response object contains:
- markdown (str): The converted Markdown content
- metadata (dict): Additional information about the conversion process
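As a rough illustration of reading these two fields, the sketch below summarizes a response without assuming either field is populated. summarize_response is a hypothetical helper, and the SimpleNamespace object is a stand-in for a real response so that the snippet runs without an API key. The keys inside metadata are not documented here, so the "source" key is invented for the example.

```python
from types import SimpleNamespace


def summarize_response(response) -> str:
    """Build a one-line summary from the markdown and metadata fields."""
    # Fall back to empty values if a field is missing or None.
    markdown = getattr(response, "markdown", "") or ""
    metadata = getattr(response, "metadata", None) or {}
    stripped = markdown.strip()
    first_line = stripped.splitlines()[0] if stripped else "(empty)"
    return (
        f"{len(markdown)} chars, {len(metadata)} metadata keys, "
        f"starts with: {first_line}"
    )


# Stand-in for a real response object (assumption for illustration only).
fake = SimpleNamespace(
    markdown="# Hello World\n\nThis is a **test** paragraph.",
    metadata={"source": "inline"},
)
print(summarize_response(fake))
```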
Error Handling
The graph handles various edge cases:
- Invalid URLs
- Malformed HTML
- Network errors
- Timeout issues
If an error occurs, it is logged and then raised with a descriptive error message.
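Because errors are raised to the caller, transient failures such as network errors or timeouts can be retried from application code. The following is a minimal sketch, not the library's own mechanism. call_with_retries is a hypothetical helper, and it catches Exception broadly only because the specific exception classes raised by the client are not listed in this README.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("markdownify_example")


def call_with_retries(
    call: Callable[[], T], attempts: int = 3, base_delay: float = 1.0
) -> T:
    """Run call(), retrying with exponential backoff when it raises."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:  # assumption: exact error types are unknown
            logger.warning("Attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call could be wrapped as call_with_retries(lambda: sgai_client.markdownify(website_url="https://example.com")). Errors caused by invalid input, such as a malformed URL, will not succeed on retry, so a real application would likely re-raise those immediately.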
Best Practices
- Always provide a valid URL or well-formed HTML content
- Use appropriate logging levels for debugging
- Check the returned markdown and metadata before passing them on in your application
- Consider rate limiting for large-scale conversions
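For the rate-limiting suggestion, one simple client-side approach is to enforce a minimum interval between consecutive requests. The class below is a sketch under that assumption. MinIntervalLimiter is a hypothetical name, and the appropriate interval depends on the limits of your API plan, which this README does not state.

```python
import time
from typing import Optional


class MinIntervalLimiter:
    """Block so that successive wait() calls are at least min_interval apart."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        # Record the moment the caller is released, not the moment it arrived.
        self._last = time.monotonic()
```

In a batch job, calling limiter.wait() immediately before each sgai_client.markdownify(...) call keeps the request rate at or below one per min_interval seconds.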