OCR Router

Unified Vision LLM & OCR API Gateway

Introduction

DocsRouter is a unified API gateway that simplifies integrating multiple Optical Character Recognition (OCR) and Vision LLM providers into your applications.

Instead of managing multiple API keys, distinct response formats, and separate billing for Google Gemini, OpenAI GPT-4o, Anthropic Claude, and others, DocsRouter provides a single standardized API.

Why DocsRouter?

  • Unified API: One standard request/response format for all providers. Switch models by changing just one string.
  • Vision LLM Power: Leverage state-of-the-art multimodal LLMs like Gemini 2.0 Flash, GPT-4o, and Claude 3.5 Sonnet for superior document understanding.
  • Standardized Output: Get clean, structured JSON output regardless of the underlying provider.
  • Failover & Routing: (Coming Soon) Automatically route to the cheapest or most accurate provider.
  • Unified Billing: Manage a single credit balance for all your OCR needs.
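
To make the "change just one string" idea concrete, here is a minimal sketch of how a unified request might be assembled. The endpoint URL, the field names (`model`, `image_url`), the model identifier strings, and the bearer-token header are all assumptions for illustration and are not taken from the DocsRouter API reference. The sketch only builds the request description and does not send it.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical endpoint: the real DocsRouter URL and schema may differ.
API_URL = "https://api.docsrouter.example/v1/ocr"


@dataclass
class OcrRequest:
    model: str      # assumed provider/model identifier string
    image_url: str  # assumed field pointing at the document image


def build_request(model: str, image_url: str, api_key: str) -> dict:
    """Assemble (but do not send) a description of one OCR request."""
    payload = OcrRequest(model=model, image_url=image_url)
    return {
        "url": API_URL,
        "headers": {
            # Assumed auth scheme: a single key for every provider.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(asdict(payload)),
    }


# Switching providers changes only the model string; the URL, headers,
# and the rest of the body stay identical.
doc = "https://example.com/invoice.png"
gemini_req = build_request("google/gemini-2.0-flash", doc, "YOUR_API_KEY")
gpt4o_req = build_request("openai/gpt-4o", doc, "YOUR_API_KEY")
```

Under these assumptions, the two requests differ only in the `model` field of the JSON body, which is what lets application code stay unchanged when a different provider is selected.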

Supported Providers

DocsRouter currently supports the following Vision LLM providers via OpenRouter:

  • Google: Gemini 2.0 Flash, Gemini 1.5 Pro
  • OpenAI: GPT-4o, GPT-4o Mini
  • Anthropic: Claude 3.5 Sonnet
  • Mistral: Mistral OCR (Coming Soon)
