OCR Endpoint
Extract text and data from documents
Process a document (image or PDF) with the specified vision LLM.
URL: POST https://api.docsrouter.com/v1/ocr
Request Body
The request body should be a JSON object with the following properties:
| Field | Type | Required | Description |
|---|---|---|---|
| url | string | No* | Publicly accessible URL of the image/document. |
| base64 | string | No* | Base64-encoded image data (without data URI prefix). |
| model | string | No | The ID of the model to use. Default: google/gemini-2.0-flash-001. |
| options | object | No | Additional processing options. |
* Either url or base64 is required.
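The request body rules above can be sketched in Python. This is a minimal illustration, not an official client: `build_ocr_payload` and its `image_bytes` parameter are hypothetical names, and rejecting a request that supplies both `url` and `base64` is a conservative assumption, since the table only states that one of them is required.

```python
import base64
import json


def build_ocr_payload(url=None, image_bytes=None, model=None, options=None):
    """Build the JSON body for POST /v1/ocr (hypothetical helper)."""
    # Assumption: accept exactly one source. The docs only say that
    # either url or base64 is required, not what happens with both.
    if (url is None) == (image_bytes is None):
        raise ValueError("provide exactly one of url or image_bytes")

    payload = {}
    if url is not None:
        payload["url"] = url
    else:
        # Plain base64 text, without a "data:...;base64," prefix.
        payload["base64"] = base64.b64encode(image_bytes).decode("ascii")

    # Optional fields are left out so the documented defaults apply.
    if model is not None:
        payload["model"] = model
    if options:
        payload["options"] = options
    return payload


body = json.dumps(build_ocr_payload(url="https://example.com/invoice.jpg"))
```

Leaving `model` out of the payload relies on the documented default rather than hard-coding the model ID in client code.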
Options Object
| Field | Type | Default | Description |
|---|---|---|---|
| extract_tables | boolean | false | If true, attempts to parse tables into structured JSON. |
| language | string | auto | Hint for the document language (e.g., "en", "es"). |
| output_format | enum | text | Preferred output format: text, json, markdown. |
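As a rough sketch, the options object can be assembled and checked on the client before sending. The helper name `build_options` is hypothetical, and the client-side validation is an assumption made for illustration; the defaults and the allowed `output_format` values are taken from the table above.

```python
ALLOWED_OUTPUT_FORMATS = ("text", "json", "markdown")


def build_options(extract_tables=False, language="auto", output_format="text"):
    """Return an options object using the documented defaults (hypothetical helper)."""
    if not isinstance(extract_tables, bool):
        raise TypeError("extract_tables must be a boolean")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(
            f"output_format must be one of {ALLOWED_OUTPUT_FORMATS}, got {output_format!r}"
        )
    return {
        "extract_tables": extract_tables,
        "language": language,  # a hint such as "en" or "es"; "auto" is the default
        "output_format": output_format,
    }


options = build_options(extract_tables=True, output_format="markdown")
```

Checking the enum locally surfaces a typo such as "md" before a request is made; how the API itself responds to an unknown value is not covered here.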
Example Request
curl -X POST https://api.docsrouter.com/v1/ocr \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/invoice.jpg",
    "model": "openai/gpt-4o",
    "options": {
      "extract_tables": true
    }
  }'
Response Body
{
  "id": "req_123...",
  "object": "ocr.result",
  "created": 1716312345,
  "model": "openai/gpt-4o",
  "result": {
    "text": "INVOICE #001...",
    "blocks": [
      { "type": "text", "content": "INVOICE #001" },
      ...
    ],
    "tables": [
      {
        "headers": ["Item", "Price"],
        "rows": [["Widget", "$10.00"]]
      }
    ],
    "confidence": 95,
    "language": "en"
  },
  "usage": {
    "pages_processed": 1,
    "tokens_used": 450,
    "provider_cost_cents": 2,
    "platform_fee_cents": 1,
    "total_cost_cents": 3,
    "processing_time_ms": 1800
  }
}
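For readers working in Python rather than curl, the request and response shown above might be handled roughly as follows, using only the standard library. The function names (`make_request`, `send_ocr`, `tables_as_records`) are hypothetical, and the sketch assumes a successful response shaped like the sample; error handling and retries are omitted.

```python
import json
import urllib.request

OCR_URL = "https://api.docsrouter.com/v1/ocr"


def make_request(payload, api_key):
    """Build the POST request with the same headers as the curl example."""
    return urllib.request.Request(
        OCR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_ocr(payload, api_key):
    """Send the request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(make_request(payload, api_key)) as resp:
        return json.load(resp)


def tables_as_records(response):
    """Turn each table's headers/rows into a list of dicts keyed by header."""
    records = []
    # "tables" may be absent when extract_tables is false (an assumption).
    for table in response.get("result", {}).get("tables", []):
        headers = table["headers"]
        records.append([dict(zip(headers, row)) for row in table["rows"]])
    return records
```

Calling `tables_as_records` on the sample response would give one table containing a single record that maps "Item" to "Widget" and "Price" to "$10.00".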