AP Automation 2025 - How to Extract Line Items from Any Invoice

by DocsRouter Team

Accounts Payable (AP) departments are drowning in paper. Processing a single invoice manually costs $12.90 on average and takes 9.7 days.

DocsRouter allows you to bring that cost down to pennies and the time down to seconds.

The Challenge: Line Item Extraction

Getting the "Total Amount" is easy. Even legacy OCR can do that. The real challenge is line items.

An invoice might have a table that spans 3 pages. It might have merged cells. It might be handwritten. Most legacy tools (like Amazon Textract or Azure Form Recognizer, since renamed Azure AI Document Intelligence) struggle with complex, nested tables.

The Solution: Vision LLMs

Models like GPT-5.2 and Gemini 3 Pro treat the document as a visual scene. They understand that a row continues on the next page because they "read" the context, not just the pixels.

How to Implement 2-Way Matching

2-way matching verifies that the Invoice amount matches the Purchase Order amount.

  1. Ingest: Send the PDF to POST /v1/ocr with model: "openai/gpt-5-2".
  2. Extract: Request a schema that includes line_items[] and po_number.
  3. Match: DocsRouter returns structured JSON. Your backend simply checks:
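    // Compare in integer cents so string/number or float rounding differences cannot cause a false mismatch.
    const toCents = (amount) => Math.round(Number(amount) * 100);
    if (toCents(ocrResult.total) === toCents(databasePO.total)) {
      approveInvoice();
    }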
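
The three steps above can be sketched end to end. This is a minimal illustration, not the documented DocsRouter API: only the endpoint path, the model string, and the line_items / po_number fields come from the steps above. The `document` and `schema` keys, the schema shape, and the helper names `buildOcrRequest` and `twoWayMatch` are assumptions made for the example.

```javascript
// Hypothetical request body for POST /v1/ocr.
// Assumption: the API accepts a base64 document and a schema object under
// these key names. Check the real API reference before relying on them.
function buildOcrRequest(pdfBase64) {
  return {
    model: "openai/gpt-5-2",
    document: pdfBase64, // assumed field name
    schema: {            // assumed schema shape
      po_number: "string",
      total: "number",
      line_items: [
        { description: "string", quantity: "number", amount: "number" },
      ],
    },
  };
}

// Convert a decimal amount (string or number) to integer cents.
const toCents = (amount) => Math.round(Number(amount) * 100);

// Two-way match: the invoice must reference the same PO and carry the same total.
// Returns the reasons for a rejection so a reviewer can see what failed.
function twoWayMatch(ocrResult, databasePO) {
  const reasons = [];
  if (ocrResult.po_number !== databasePO.po_number) {
    reasons.push("po_number mismatch");
  }
  if (toCents(ocrResult.total) !== toCents(databasePO.total)) {
    reasons.push("total mismatch");
  }
  return { approved: reasons.length === 0, reasons };
}
```

Returning a list of reasons instead of a bare boolean keeps failed matches reviewable: an invoice that fails can be routed to a person along with the specific field that disagreed.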

Why accurate line items matter

It's not just about paying the bill. It's about data.

If you only extract the Total, you can't answer questions about what was actually bought: which items, in what quantities, and at what price. DocsRouter unlocks the data trapped in your PDFs.
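
As one illustration of what line-level data makes possible, here is a minimal sketch that totals quantity and spend per item across several extracted invoices. The field names (`line_items`, `description`, `quantity`, `amount`) and the helper name `spendByItem` are assumptions about the extracted JSON, not a documented DocsRouter response format.

```javascript
// Group extracted line items by description and total the quantity and spend.
// Assumption: each invoice has a line_items array whose entries carry
// description (string), quantity (number), and amount (number, line total).
function spendByItem(invoices) {
  const totals = new Map();
  for (const invoice of invoices) {
    for (const item of invoice.line_items) {
      // Normalize the description so "Widget" and "widget " land in one bucket.
      const key = item.description.trim().toLowerCase();
      const entry = totals.get(key) ?? { quantity: 0, cents: 0 };
      entry.quantity += item.quantity;
      entry.cents += Math.round(item.amount * 100); // keep money in integer cents
      totals.set(key, entry);
    }
  }
  return totals;
}
```

Calling `spendByItem(invoices).get("widget")` would then return the combined quantity and spend in cents for that item. Real invoices rarely describe the same product identically, so a production version would likely key on a SKU or a mapped catalog ID instead of the free-text description.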