Mirror of https://github.com/VinciGit00/Scrapegraph-ai.git, synced 2026-06-12 21:01:54 +08:00
Document Scraper Graph Example
This example demonstrates how to use Scrapegraph-ai to extract data from various document formats (PDF, DOC, DOCX, etc.).
Features
- Multi-format document support
- Text extraction
- Document parsing
- Metadata extraction
Setup
- Install required dependencies
- Copy .env.example to .env
- Configure your API keys in the .env file
Usage
from scrapegraphai.graphs import DocumentScraperGraph
graph = DocumentScraperGraph()
content = graph.scrape("document.pdf")
Environment Variables
Required environment variables:
- OPENAI_API_KEY: Your OpenAI API key
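As a minimal sketch of how the .env file could be read and the required key checked before running the example, the snippet below uses only the Python standard library. The helper names load_env_file and require_api_key are hypothetical and not part of Scrapegraph-ai. The parser is a deliberately simplified stand-in for a dedicated dotenv loader: it handles plain KEY=VALUE lines, comments, and one layer of quotes, and nothing more.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching quotes.
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def require_api_key(env_values, name="OPENAI_API_KEY"):
    """Return the key from the process environment or the parsed file, else raise."""
    key = os.environ.get(name) or env_values.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return key
```

Calling require_api_key(load_env_file()) at the top of a script would then fail early with a readable error if the key is missing, rather than failing later inside an API call.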