OCR Router
OCR Endpoint

Extract text and data from documents

Process a document (image or PDF) using a specified vision LLM.

URL: POST https://api.docsrouter.com/v1/ocr

Request Body

The request body should be a JSON object with the following properties:

| Field | Type | Required | Description |
|---|---|---|---|
| url | string | No* | Publicly accessible URL of the image/document. |
| base64 | string | No* | Base64-encoded image data (without data URI prefix). |
| model | string | No | The ID of the model to use. Default: google/gemini-2.0-flash-001. |
| options | object | No | Additional processing options. |

* Either url or base64 is required.
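
As a minimal sketch of how a client might assemble this body, the Python helper below builds the JSON object from either a URL or raw file bytes. The function name `build_ocr_body` and its parameters are hypothetical and not part of the API. The reference does not say what happens when both `url` and `base64` are sent, so the sketch only enforces that at least one is present.

```python
import base64
import json


def build_ocr_body(url=None, image_bytes=None, model=None, options=None):
    """Build the JSON body for POST /v1/ocr (hypothetical helper)."""
    if url is None and image_bytes is None:
        raise ValueError("either url or image_bytes must be provided")
    body = {}
    if url is not None:
        body["url"] = url
    if image_bytes is not None:
        # Plain base64 text, with no "data:...;base64," prefix.
        body["base64"] = base64.b64encode(image_bytes).decode("ascii")
    if model is not None:
        # When omitted, the documented server-side default model applies.
        body["model"] = model
    if options:
        body["options"] = options
    return body


print(json.dumps(build_ocr_body(url="https://example.com/invoice.jpg"), indent=2))
```

For a local file, the bytes could be read with `open(path, "rb").read()` and passed as `image_bytes`.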

Options Object

| Field | Type | Default | Description |
|---|---|---|---|
| extract_tables | boolean | false | If true, attempts to parse tables into structured JSON. |
| language | string | auto | Hint for the document language (e.g., "en", "es"). |
| output_format | enum | text | Preferred output format: text, json, markdown. |
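
The options above can be validated on the client before sending. The following Python sketch is illustrative only: `build_options` and `ALLOWED_OUTPUT_FORMATS` are hypothetical names, and omitting fields that equal their documented defaults is a design choice of this sketch, not a requirement stated by the API.

```python
# The three output formats listed in the options table.
ALLOWED_OUTPUT_FORMATS = {"text", "json", "markdown"}


def build_options(extract_tables=False, language="auto", output_format="text"):
    """Return an options dict containing only non-default settings."""
    if not isinstance(extract_tables, bool):
        raise TypeError("extract_tables must be a boolean")
    if output_format not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    options = {}
    if extract_tables:
        options["extract_tables"] = True
    if language != "auto":
        options["language"] = language
    if output_format != "text":
        options["output_format"] = output_format
    return options


print(build_options(extract_tables=True, output_format="markdown"))
```

Rejecting an unknown `output_format` locally avoids a round trip for a request that would presumably fail validation on the server.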

Example Request

curl -X POST https://api.docsrouter.com/v1/ocr \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/invoice.jpg",
    "model": "openai/gpt-4o",
    "options": {
      "extract_tables": true
    }
  }'
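
The same request can be prepared in Python using only the standard library. This is a sketch, not an official client: `make_ocr_request` is a hypothetical helper, and the endpoint URL and headers are taken from the curl example above. The network call itself is left commented out because it needs a valid API key.

```python
import json
import urllib.request

API_URL = "https://api.docsrouter.com/v1/ocr"


def make_ocr_request(api_key, body):
    """Prepare (but do not send) a POST request carrying a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = make_ocr_request(
    "YOUR_API_KEY",
    {
        "url": "https://example.com/invoice.jpg",
        "model": "openai/gpt-4o",
        "options": {"extract_tables": True},
    },
)

# Sending the request requires a valid key and network access:
# with urllib.request.urlopen(request, timeout=60) as response:
#     result = json.load(response)
```

Error responses are not described in this reference, so a real client would likely wrap the `urlopen` call in handling for `urllib.error.HTTPError`.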

Response Body

{
  "id": "req_123...",
  "object": "ocr.result",
  "created": 1716312345,
  "model": "openai/gpt-4o",
  "result": {
    "text": "INVOICE #001...",
    "blocks": [
      { "type": "text", "content": "INVOICE #001" },
      ...
    ],
    "tables": [
      {
        "headers": ["Item", "Price"],
        "rows": [["Widget", "$10.00"]]
      }
    ],
    "confidence": 95,
    "language": "en"
  },
  "usage": {
    "pages_processed": 1,
    "tokens_used": 450,
    "provider_cost_cents": 2,
    "platform_fee_cents": 1,
    "total_cost_cents": 3,
    "processing_time_ms": 1800
  }
}
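
To illustrate how a client might consume this response, the Python sketch below parses a shortened copy of the sample and turns each table into a list of row dictionaries keyed by header. The helper `table_to_records` is hypothetical, and the sketch assumes every row has the same number of cells as `headers`, which the sample suggests but the reference does not guarantee.

```python
import json

# Shortened copy of the sample response above; values are illustrative.
SAMPLE_RESPONSE = """
{
  "id": "req_123...",
  "object": "ocr.result",
  "model": "openai/gpt-4o",
  "result": {
    "text": "INVOICE #001...",
    "tables": [
      {"headers": ["Item", "Price"], "rows": [["Widget", "$10.00"]]}
    ],
    "confidence": 95,
    "language": "en"
  },
  "usage": {"pages_processed": 1, "total_cost_cents": 3}
}
"""


def table_to_records(table):
    """Pair each row's cells with the table headers."""
    headers = table["headers"]
    return [dict(zip(headers, row)) for row in table["rows"]]


response = json.loads(SAMPLE_RESPONSE)
# "tables" may be absent when extract_tables is false, so default to empty.
for table in response["result"].get("tables", []):
    for record in table_to_records(table):
        print(record)

total_cost_dollars = response["usage"]["total_cost_cents"] / 100
print(f"cost: ${total_cost_dollars:.2f}")
```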
