How to Automate Invoice Processing with an API

February 14, 2026 · Smole Team

Invoice processing is one of the most common, and most tedious, document workflows in any business. Someone receives an invoice, opens it, types the vendor name into a spreadsheet, copies the line items, double-checks the total, and files it away. Multiply that by hundreds of invoices per month, and you have days of repetitive data entry that could be spent on more valuable work.

Here's how to automate the entire process using an API.

What Gets Extracted from an Invoice

A typical invoice contains a consistent set of data points:

  • Vendor information — Name, address, VAT ID, bank details
  • Invoice metadata — Invoice number, date, due date, payment terms
  • Customer details — Who the invoice is addressed to
  • Line items — Description, quantity, unit price, total per item
  • Totals — Subtotal, tax rate, tax amount, grand total
  • Payment information — IBAN, BIC, payment method

With schema-based extraction, you define exactly which of these fields you need, and the API returns them as clean JSON.

Building an Invoice Extraction Schema

Simple Schema (Key Fields Only)

If you just need the basics for bookkeeping:

{
  "type": "object",
  "properties": {
    "vendor_name": { "type": "string" },
    "invoice_number": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "due_date": { "type": "string", "format": "date" },
    "total": { "type": "number" },
    "currency": { "type": "string" }
  }
}
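
Before a result goes into your books, it can be worth checking that it has the shape the schema describes. The following is a minimal sketch, not a full JSON Schema validator: the function name findSchemaMismatches is made up for illustration, and it only understands flat string and number fields like the ones in the simple schema.

```javascript
// Minimal type check against the simple schema above.
// NOT a full JSON Schema validator: it only handles flat "string" and
// "number" properties, and treats null/undefined as "not found in the invoice".
const simpleInvoiceSchema = {
  type: "object",
  properties: {
    vendor_name: { type: "string" },
    invoice_number: { type: "string" },
    date: { type: "string", format: "date" },
    due_date: { type: "string", format: "date" },
    total: { type: "number" },
    currency: { type: "string" },
  },
};

// Hypothetical helper: returns a list of human-readable problems.
function findSchemaMismatches(schema, data) {
  const problems = [];
  for (const [field, rule] of Object.entries(schema.properties)) {
    const value = data[field];
    if (value === null || value === undefined) continue; // missing is allowed
    if (typeof value !== rule.type) {
      problems.push(`${field}: expected ${rule.type}, got ${typeof value}`);
      continue;
    }
    // "format": "date" is checked as an ISO calendar date (YYYY-MM-DD).
    if (rule.format === "date" && !/^\d{4}-\d{2}-\d{2}$/.test(value)) {
      problems.push(`${field}: expected YYYY-MM-DD, got "${value}"`);
    }
  }
  return problems;
}
```

An empty list means every field that is present has the expected type. For production use, an established JSON Schema validator is the safer choice.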

Full Schema (Complete Extraction)

For accounts payable automation where you need every detail:

{
  "type": "object",
  "properties": {
    "vendor": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" },
        "vat_id": { "type": "string" },
        "iban": { "type": "string" },
        "bic": { "type": "string" }
      }
    },
    "invoice_number": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "due_date": { "type": "string", "format": "date" },
    "payment_terms": { "type": "string" },
    "customer": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" }
      }
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "unit_price": { "type": "number" },
          "vat_rate": { "type": "number" },
          "total": { "type": "number" }
        }
      }
    },
    "subtotal": { "type": "number" },
    "vat_amount": { "type": "number" },
    "total": { "type": "number" },
    "currency": { "type": "string" }
  }
}

Processing an Invoice

1. Register Your Schema

curl -X POST https://api.smole.tech/api/schemas \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "invoice-full",
    "schema": { ... }
  }'

Save the returned schema ID — you'll use it for every invoice.
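
If you would rather register the schema from application code, the same request can be sketched in JavaScript. This assumes a runtime with a built-in fetch (Node 18+ or a browser). The function name registerSchema and the fetchImpl parameter are illustrative, and because this post does not show the response body, the sketch returns it untouched rather than guessing which field holds the schema ID.

```javascript
// Sketch: the curl call above, issued from JavaScript.
// ASSUMPTION: the response body is JSON. The field that carries the schema
// ID is not shown in this post, so the parsed body is returned as-is.
async function registerSchema(name, schema, apiKey, fetchImpl = fetch) {
  const response = await fetchImpl("https://api.smole.tech/api/schemas", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ name, schema }),
  });
  if (!response.ok) {
    throw new Error(`Schema registration failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

Passing fetchImpl explicitly is optional. It exists so the call can be exercised with a stub in tests.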

2. Upload an Invoice

curl -X POST https://api.smole.tech/api/pipeline/file \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "schemaId=SCHEMA_ID"

This works with PDF invoices (digital or scanned), photographed invoices, Word documents, and even HTML invoices.
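
The upload can be sketched from JavaScript as well, again assuming a runtime with built-in fetch, FormData, and Blob (Node 18+ or a browser). The function name uploadInvoice and its parameters are illustrative. The shape of the response is not shown in this post, so the parsed body is returned unchanged.

```javascript
// Sketch: the multipart upload above, issued from JavaScript.
// fileBytes is the file content, for example a Buffer or Uint8Array.
async function uploadInvoice(fileBytes, filename, schemaId, apiKey, fetchImpl = fetch) {
  const form = new FormData();
  form.append("file", new Blob([fileBytes]), filename);
  form.append("schemaId", schemaId);

  // No Content-Type header here: fetch sets the multipart boundary itself.
  const response = await fetchImpl("https://api.smole.tech/api/pipeline/file", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  // ASSUMPTION: the JSON body identifies the pipeline to query in step 3.
  return response.json();
}
```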

3. Retrieve the Extracted Data

curl https://api.smole.tech/api/pipeline/PIPELINE_ID \
  -H "Authorization: Bearer YOUR_API_KEY"
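
Extraction may not be finished the first time you ask, so application code usually repeats this request until the result is ready. The sketch below is one way to do that. The name waitForExtraction, the interval, and the attempt limit are illustrative, and because this post does not show how an unfinished pipeline is reported, the caller passes in an isDone function instead of the sketch guessing at a status field.

```javascript
// Sketch: repeat the retrieval request until the result is ready.
// ASSUMPTION: how an unfinished pipeline is reported is not shown in this
// post, so the caller supplies isDone(body) to decide when to stop.
async function waitForExtraction(pipelineId, apiKey, isDone, options = {}) {
  const { intervalMs = 2000, maxAttempts = 30, fetchImpl = fetch } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(
      `https://api.smole.tech/api/pipeline/${encodeURIComponent(pipelineId)}`,
      { headers: { Authorization: `Bearer ${apiKey}` } }
    );
    if (!response.ok) {
      throw new Error(`Retrieval failed with HTTP ${response.status}`);
    }
    const body = await response.json();
    if (isDone(body)) return body;
    // Pause between attempts so the API is not hammered.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Pipeline ${pipelineId} not finished after ${maxAttempts} attempts`);
}
```

The attempt limit keeps a stuck pipeline from blocking your process forever.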

Example response:

{
  "vendor": {
    "name": "TechParts Distribution GmbH",
    "address": "Industriestr. 15, 70469 Stuttgart",
    "vat_id": "DE298374651",
    "iban": "DE89370400440532013000",
    "bic": "COBADEFFXXX"
  },
  "invoice_number": "TP-2025-1847",
  "date": "2025-11-20",
  "due_date": "2025-12-20",
  "payment_terms": "Net 30",
  "customer": {
    "name": "Your Company GmbH",
    "address": "Musterstr. 10, 10115 Berlin"
  },
  "line_items": [
    { "description": "Server RAM 64GB DDR5", "quantity": 4, "unit_price": 189.00, "vat_rate": 0.19, "total": 756.00 },
    { "description": "NVMe SSD 2TB", "quantity": 2, "unit_price": 245.00, "vat_rate": 0.19, "total": 490.00 },
    { "description": "Network Cable Cat6 (50m)", "quantity": 10, "unit_price": 12.50, "vat_rate": 0.19, "total": 125.00 }
  ],
  "subtotal": 1371.00,
  "vat_amount": 260.49,
  "total": 1631.49,
  "currency": "EUR"
}
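
Because the full schema captures line items and totals separately, you can cross-check the arithmetic before trusting a result. The following is a minimal sketch with a made-up function name, checkInvoiceTotals. It assumes every amount field is present (not null) and compares amounts in whole cents to sidestep floating-point noise.

```javascript
// Sketch: sanity-check the arithmetic of an extracted invoice.
// Amounts are converted to integer cents before comparing.
// ASSUMPTION: all amount fields are present; null handling is left out.
const toCents = (amount) => Math.round(amount * 100);

function checkInvoiceTotals(invoice) {
  const problems = [];
  let lineSum = 0;

  for (const item of invoice.line_items) {
    // quantity x unit_price should equal the line total
    if (toCents(item.quantity * item.unit_price) !== toCents(item.total)) {
      problems.push(`Line "${item.description}": quantity x unit_price does not match total`);
    }
    lineSum += toCents(item.total);
  }

  if (lineSum !== toCents(invoice.subtotal)) {
    problems.push("Line totals do not add up to subtotal");
  }
  if (toCents(invoice.subtotal) + toCents(invoice.vat_amount) !== toCents(invoice.total)) {
    problems.push("subtotal + vat_amount does not match total");
  }
  return problems;
}
```

Run against the example response above, this returns an empty list: the line totals sum to 1371.00, and 1371.00 + 260.49 = 1631.49. A non-empty list is a good reason to route the invoice to a person.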

Handling Invoice Variations

Invoices vary wildly in format. Some are clean PDFs from accounting software, others are handwritten notes, and most fall somewhere in between. Schema-based extraction handles this because it understands the content, not the layout.

Different Layouts

The same schema works whether the vendor name is at the top-left, top-right, or in a header. The extraction engine finds the data by context, not by position.

Different Languages

Invoices in German, English, French, or any other language are processed the same way. The field names in your schema are in your preferred language — the extraction maps document content accordingly.

Missing Fields

If a field doesn't exist in the invoice (e.g., no BIC code), the API returns null for that field. Your code should handle optional fields gracefully.
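
In JavaScript, optional chaining and the nullish coalescing operator keep this handling short. The sketch below uses a made-up function, summarizePayment, and an example policy (no IBAN means manual review) that you would replace with your own rules.

```javascript
// Sketch: read optional fields without crashing when the API returned null.
function summarizePayment(invoice) {
  // Optional chaining covers a null "vendor" object as well as null fields.
  const iban = invoice.vendor?.iban ?? null;
  const bic = invoice.vendor?.bic ?? null;
  return {
    payee: invoice.vendor?.name ?? "(unknown vendor)",
    iban,
    bic,
    // Example policy only: without an IBAN, a person should look at it.
    needsManualReview: iban === null,
  };
}
```

An invoice without a BIC still yields a usable summary with bic set to null, while one with no vendor block at all is flagged for review instead of throwing a TypeError.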

Integrating with Your Systems

Once you have structured JSON, feeding it into your existing tools is straightforward:

Accounting Software

Push extracted data to QuickBooks, Xero, Datev, or Lexware via their APIs. Map Smole's JSON fields to the accounting system's expected format.

ERP Systems

Feed invoice data into SAP, Oracle, or Microsoft Dynamics. The structured JSON maps cleanly to ERP entry formats.

Spreadsheets

For simpler workflows, write extracted data to Google Sheets or Excel via their APIs. Each invoice becomes a row.
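
The "one invoice, one row" mapping can be sketched as follows. The column selection and the helper names invoiceToRow and toCsvLine are illustrative. The sketch stops at producing a row and a CSV line, and leaves the actual Google Sheets or Excel call to whichever client you use.

```javascript
// Sketch: turn one extracted invoice (full schema) into one spreadsheet row.
// The column list is an example; pick the fields your sheet needs.
const COLUMNS = [
  "invoice_number", "date", "due_date", "vendor_name",
  "subtotal", "vat_amount", "total", "currency",
];

function invoiceToRow(invoice) {
  // Lift the nested vendor name to the top level so every column is flat.
  const flat = { ...invoice, vendor_name: invoice.vendor?.name };
  // Fields the API returned as null become empty cells.
  return COLUMNS.map((column) => flat[column] ?? "");
}

function toCsvLine(row) {
  // Quote every cell and double any embedded quotes (RFC 4180 style).
  return row.map((cell) => `"${String(cell).replace(/"/g, '""')}"`).join(",");
}
```

Line items do not fit into a single row. If you need them, a second sheet with one row per line item, keyed by invoice number, is a common layout.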

Databases

Insert extracted data directly into PostgreSQL, MySQL, or any database. The JSON structure maps naturally to relational tables.
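
As an illustration of that mapping, the sketch below turns one invoice into a header insert plus one insert per line item. The table names, column names, and the function buildInsertStatements are hypothetical, and the $1-style placeholders follow PostgreSQL conventions. The sketch only builds the statements; executing them is left to your database client.

```javascript
// Sketch: map one invoice to parameterized INSERT statements.
// Table and column names (invoices, invoice_line_items) are hypothetical.
function buildInsertStatements(invoice) {
  const header = {
    text:
      "INSERT INTO invoices (invoice_number, vendor_name, invoice_date, due_date, total, currency) " +
      "VALUES ($1, $2, $3, $4, $5, $6)",
    values: [
      invoice.invoice_number,
      invoice.vendor?.name ?? null,
      invoice.date,
      invoice.due_date,
      invoice.total,
      invoice.currency,
    ],
  };

  // One row per line item, linked back to the invoice by its number.
  const lines = (invoice.line_items ?? []).map((item, index) => ({
    text:
      "INSERT INTO invoice_line_items (invoice_number, position, description, quantity, unit_price, vat_rate, total) " +
      "VALUES ($1, $2, $3, $4, $5, $6, $7)",
    values: [
      invoice.invoice_number, index + 1, item.description,
      item.quantity, item.unit_price, item.vat_rate, item.total,
    ],
  }));

  return [header, ...lines];
}
```

Keeping values separate from the SQL text means vendor names and descriptions taken from a document are never concatenated into a query. Run the statements for one invoice inside a single transaction so a failure does not leave a header without its line items.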

Batch Processing Invoices

For high-volume scenarios — processing a month's worth of invoices at once:

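// submitPipeline(file, schemaId) and pollForResult(pipelineId) are helpers
// you supply: thin wrappers around the upload (step 2) and retrieval
// (step 3) calls shown earlier.
async function processInvoiceBatch(files, schemaId) {
  // Submit all invoices in parallel
  const pipelines = await Promise.all(
    files.map((file) => submitPipeline(file, schemaId))
  );

  // Poll for results. Promise.all rejects as soon as any one invoice fails.
  const results = await Promise.all(
    pipelines.map((p) => pollForResult(p.id))
  );

  return results;
}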

Smole handles concurrent requests, so batch processing is efficient even at scale.
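
For very large batches you may still want to cap how many invoices are in flight and keep one failed invoice from aborting the rest. This post does not state any rate limits, so the cap of five below is an arbitrary, conservative default. The function processWithLimit and its processOne callback are illustrative names; processOne is whatever submits one invoice and waits for its result.

```javascript
// Sketch: process a batch with a cap on in-flight work and per-invoice
// error reporting. processOne(file) is supplied by the caller.
async function processWithLimit(files, processOne, limit = 5) {
  const results = new Array(files.length);
  let next = 0;

  // Each worker pulls the next unprocessed index until none are left.
  async function worker() {
    while (next < files.length) {
      const index = next++;
      try {
        results[index] = { status: "fulfilled", value: await processOne(files[index]) };
      } catch (error) {
        // Record the failure and keep going with the remaining invoices.
        results[index] = { status: "rejected", reason: error };
      }
    }
  }

  const workerCount = Math.min(limit, files.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

The result entries use the same shape as Promise.allSettled, in input order, so you can filter for status === "rejected" afterwards and retry or review only those invoices.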

Cost of Manual vs Automated Processing

                     Manual                  Automated
Time per invoice     15-30 minutes           Seconds
Error rate           2-5%                    Near zero
Scales with volume   No (need more people)   Yes (same API)
Works after hours    No                      Yes

For a company processing 200 invoices per month at 20 minutes each, that's 4,000 minutes, or nearly 67 hours of manual work per month. Assuming a 40-hour week, that is roughly 40% of a full-time position.

Try It Now

Upload an invoice in the Playground to see extraction results instantly. Define your schema, drop in a PDF, and get JSON back in seconds.

For full API integration details, see the documentation.