Invoice & Receipt Extraction

Manual invoice processing is slow, error-prone, and doesn't scale. Whether you're handling 10 invoices a week or 10,000, extracting line items, totals, and vendor details shouldn't require manual data entry.

Invoice extraction: from PDF to structured JSON

Input document
INVOICE #INV-2024-0847

Vendor: Meridian Office Supplies GmbH
VAT ID: DE298374651
Date: 2024-11-15
Due Date: 2024-12-15
Payment Terms: Net 30

Bill To:
  Acme Corp
  Friedrichstrasse 123
  10117 Berlin, Germany

-----------------------------------------
Item              Qty   Unit Price   Total
-----------------------------------------
A4 Copy Paper      25     4.99     124.75
Ink Cartridge BK    4    29.90     119.60
Ink Cartridge CL    2    34.50      69.00
Desk Organizer      3    18.75      56.25
USB-C Hub           1    45.00      45.00
-----------------------------------------

                    Subtotal:     414.60
                    VAT (19%):     78.77
                    TOTAL:        493.37

Payment: Bank Transfer
IBAN: DE89370400440532013000
BIC: COBADEFFXXX

Extracted JSON
{
  "vendor": {
    "name": "Meridian Office Supplies GmbH",
    "vat_id": "DE298374651"
  },
  "invoice_number": "INV-2024-0847",
  "date": "2024-11-15",
  "due_date": "2024-12-15",
  "payment_terms": "Net 30",
  "bill_to": {
    "name": "Acme Corp",
    "address": "Friedrichstrasse 123, 10117 Berlin, Germany"
  },
  "line_items": [
    { "description": "A4 Copy Paper", "quantity": 25, "unit_price": 4.99, "total": 124.75 },
    { "description": "Ink Cartridge BK", "quantity": 4, "unit_price": 29.90, "total": 119.60 },
    { "description": "Ink Cartridge CL", "quantity": 2, "unit_price": 34.50, "total": 69.00 },
    { "description": "Desk Organizer", "quantity": 3, "unit_price": 18.75, "total": 56.25 },
    { "description": "USB-C Hub", "quantity": 1, "unit_price": 45.00, "total": 45.00 }
  ],
  "subtotal": 414.60,
  "vat_rate": 0.19,
  "vat_amount": 78.77,
  "total": 493.37,
  "payment": {
    "method": "Bank Transfer",
    "iban": "DE89370400440532013000",
    "bic": "COBADEFFXXX"
  }
}
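
Because the extracted values are plain numbers, the arithmetic on the invoice can be re-checked after extraction. The sketch below is illustrative only and not part of Smole: the function name `check_invoice_totals` and the rounding rule (half-up to two decimal places) are assumptions, and real invoices may round VAT differently. It verifies that each line total equals quantity times unit price, that the line totals add up to the subtotal, and that subtotal plus VAT equals the total.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def to_cents(value):
    """Convert a JSON number (or Decimal) to a Decimal rounded to two places."""
    return Decimal(str(value)).quantize(CENT, rounding=ROUND_HALF_UP)

def check_invoice_totals(invoice):
    """Return a list of arithmetic problems found; an empty list means consistent."""
    problems = []
    running = Decimal("0.00")
    for item in invoice["line_items"]:
        expected = to_cents(Decimal(str(item["quantity"])) * Decimal(str(item["unit_price"])))
        if expected != to_cents(item["total"]):
            problems.append(f"{item['description']}: expected {expected}, got {item['total']}")
        running += to_cents(item["total"])
    if running != to_cents(invoice["subtotal"]):
        problems.append(f"subtotal: line items sum to {running}, got {invoice['subtotal']}")
    # Assumed rounding: VAT is computed on the subtotal and rounded half-up.
    vat = to_cents(Decimal(str(invoice["subtotal"])) * Decimal(str(invoice["vat_rate"])))
    if vat != to_cents(invoice["vat_amount"]):
        problems.append(f"vat_amount: expected {vat}, got {invoice['vat_amount']}")
    grand = to_cents(invoice["subtotal"]) + to_cents(invoice["vat_amount"])
    if grand != to_cents(invoice["total"]):
        problems.append(f"total: expected {grand}, got {invoice['total']}")
    return problems

# The numeric fields from the extracted JSON above.
invoice = {
    "line_items": [
        {"description": "A4 Copy Paper", "quantity": 25, "unit_price": 4.99, "total": 124.75},
        {"description": "Ink Cartridge BK", "quantity": 4, "unit_price": 29.90, "total": 119.60},
        {"description": "Ink Cartridge CL", "quantity": 2, "unit_price": 34.50, "total": 69.00},
        {"description": "Desk Organizer", "quantity": 3, "unit_price": 18.75, "total": 56.25},
        {"description": "USB-C Hub", "quantity": 1, "unit_price": 45.00, "total": 45.00},
    ],
    "subtotal": 414.60,
    "vat_rate": 0.19,
    "vat_amount": 78.77,
    "total": 493.37,
}

print(check_invoice_totals(invoice))  # [] for the sample invoice
```

Decimal arithmetic is used instead of floats so that comparisons at the cent level are exact.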

Define your schema

Tell Smole which fields to extract by providing a JSON Schema. The schema below describes the invoice fields shown in the extracted JSON above.

{
  "type": "object",
  "properties": {
    "vendor": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "vat_id": { "type": "string" }
      }
    },
    "invoice_number": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "due_date": { "type": "string", "format": "date" },
    "payment_terms": { "type": "string" },
    "bill_to": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" }
      }
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity": { "type": "number" },
          "unit_price": { "type": "number" },
          "total": { "type": "number" }
        }
      }
    },
    "subtotal": { "type": "number" },
    "vat_rate": { "type": "number" },
    "vat_amount": { "type": "number" },
    "total": { "type": "number" },
    "payment": {
      "type": "object",
      "properties": {
        "method": { "type": "string" },
        "iban": { "type": "string" },
        "bic": { "type": "string" }
      }
    }
  }
}
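
To make concrete what a schema like this constrains, here is a minimal sketch of a conformance check written with the Python standard library only. It is not Smole's implementation and not a full JSON Schema validator: the helper names `type_ok` and `find_violations` are hypothetical, and only the keywords used above (`type`, `properties`, `items`) are handled, so `format` is ignored. In practice a dedicated validation library would be the usual choice.

```python
def type_ok(value, expected):
    """Check a Python value against one of the JSON Schema types used above."""
    if expected == "object":
        return isinstance(value, dict)
    if expected == "array":
        return isinstance(value, list)
    if expected == "string":
        return isinstance(value, str)
    if expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return True  # other types are not checked in this sketch

def find_violations(value, schema, path="$"):
    """Return a list of type mismatches between value and schema."""
    expected = schema.get("type")
    if expected and not type_ok(value, expected):
        return [f"{path}: expected {expected}, got {type(value).__name__}"]
    errors = []
    if isinstance(value, dict):
        # Properties absent from the value are allowed: the schema has no "required".
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors += find_violations(value[key], sub_schema, f"{path}.{key}")
    if isinstance(value, list) and "items" in schema:
        for index, element in enumerate(value):
            errors += find_violations(element, schema["items"], f"{path}[{index}]")
    return errors

# A trimmed version of the schema above, for illustration.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "number"},
                },
            },
        },
        "total": {"type": "number"},
    },
}

sample = {
    "invoice_number": "INV-2024-0847",
    "line_items": [{"description": "USB-C Hub", "quantity": 1}],
    "total": 493.37,
}

print(find_violations(sample, schema))  # [] because every field has the declared type
```

A value of the wrong type, such as a quantity delivered as the string "25", is reported with its path so the offending field is easy to locate.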

Try with your own documents

Upload a document and define your schema. See results in seconds.