+ State of the Art models for Document Intelligence
-
-
-
-
- olmOCR-bench score vs open-weight model size — see detailed scores
-
| Detection | OCR |
@@ -38,15 +57,10 @@ For our managed API or on-prem document intelligence solution, check out [our pl
Surya is named for the [Hindu sun god](https://en.wikipedia.org/wiki/Surya), who has universal vision.
-## Community
-
-[Discord](https://discord.gg//KuZwXNGnfH) is where we discuss future development.
-
## Examples
Each row links to four annotated views of the same page: text-line detection,
-layout, reading order, and (when present) table recognition. Generated with
-the latest Surya 2 checkpoint.
+layout, reading order, and (when present) table recognition.
| Name | Detection | Layout | Order | Table Rec |
|------------------|:-----------------------------------:|---------------------------------------------:|------------------------------------------------:|------------------------------------------------:|
@@ -56,18 +70,9 @@ the latest Surya 2 checkpoint.
| Handwritten Notes | [Image](static/images/handwritten.png) | [Image](static/images/handwritten_layout.png) | [Image](static/images/handwritten_reading.png) | [Image](static/images/handwritten_tablerec.png) |
| Corporate Doc | [Image](static/images/corporate.png) | [Image](static/images/corporate_layout.png) | [Image](static/images/corporate_reading.png) | [Image](static/images/corporate_tablerec.png) |
-# Hosted API
-
-There is a hosted API for all surya models available [here](https://www.datalab.to?utm_source=gh-surya):
-
-- Works with PDF, images, word docs, and powerpoints
-- Consistent speed, with no latency spikes
-- High reliability and uptime
-
# Commercial usage
-Our model weights use a modified AI Pubs Open Rail-M license (free for research, personal use, and startups under $2M funding/revenue) and our code is GPL. For broader commercial licensing or to remove GPL requirements, visit our pricing page [here](https://www.datalab.to/pricing?utm_source=gh-surya).
-
+The Surya code is licensed under Apache 2.0. The model weights use a modified AI Pubs Open Rail-M license (free for research, personal use, and startups under $2M funding/revenue). For broader commercial licensing of the model weights, visit our pricing page [here](https://www.datalab.to/pricing?utm_source=gh-surya).
# Installation
@@ -81,19 +86,7 @@ pip install surya-ocr
## Upgrading from Surya v1
-Surya 2 replaces the per-task encoder-decoder models (`FoundationPredictor` + `RecognitionPredictor` + `LayoutPredictor` + `TableRecPredictor` each holding their own torch checkpoints) with a single vision-language model served by `vllm` (Docker, GPU) or `llama-server` (Apple Silicon / CPU). If you have v1 code:
-
-```python
-# v1 (no longer works)
-from surya.foundation import FoundationPredictor
-from surya.recognition import RecognitionPredictor
-
-foundation = FoundationPredictor()
-rec = RecognitionPredictor(foundation)
-predictions = rec([image], det_predictor=det)
-```
-
-Migrate to:
+Surya 2 replaces the per-task encoder-decoder models (`FoundationPredictor` + `RecognitionPredictor` + `LayoutPredictor` + `TableRecPredictor`, each holding its own torch checkpoint) with a single vision-language model served by `vllm` (Docker, GPU) or `llama-server` (Apple Silicon / CPU). If you have v1 code, migrate it as follows:
```python
# v2
@@ -107,13 +100,8 @@ predictions = rec([image]) # full-page OCR by default
What's different:
- `SuryaInferenceManager` replaces `FoundationPredictor`. The same manager instance is shared across `LayoutPredictor`, `RecognitionPredictor`, and `TableRecPredictor`.
-- `RecognitionPredictor` defaults to **full-page mode** (one VLM call per page). Pass `layout_results` to opt into per-block mode.
-- `surya.texify` / `TexifyPredictor` is gone — math is recognized inline by the OCR pass and returned inside `