Tags: receipts, ocr, extraction, expense-management

Receipt OCR API: Extract Data from Receipts Automatically

February 6, 2026 · Smole Team

Receipts are small, crumpled, faded, and full of valuable data. Whether you're building an expense management app, automating bookkeeping, or tracking business purchases, extracting data from receipts is a common need — and a surprisingly hard one.

Traditional OCR reads the text but doesn't understand it. Schema-based extraction reads the text and turns it into structured data you can use directly.

The Challenge with Receipts

Receipts are harder than most documents because:

  • Small text — Thermal printers use tiny fonts
  • Faded ink — Thermal paper degrades over time
  • Varying layouts — Every store has a different format
  • Abbreviations — "CHKN BRST" instead of "Chicken Breast"
  • Crumpled and skewed — Receipts get folded, wrinkled, and photographed at angles
  • Multiple languages — Store names and items in local languages

Despite all this, schema-based extraction produces reliable results because it understands the structure of a receipt, not just the characters.

Receipt Extraction Schema

Basic Schema

For expense tracking — just the key fields:

{
  "type": "object",
  "properties": {
    "store_name": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "total": { "type": "number" },
    "currency": { "type": "string" },
    "payment_method": { "type": "string" }
  }
}
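Once a result for the basic schema comes back, it helps to convert it into a typed record before storing it. The following is a minimal sketch, not part of the API: the `BasicReceipt` class and `parse_basic_receipt` function are hypothetical names, and it assumes that fields the extractor could not find arrive as null or are absent.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class BasicReceipt:
    """Typed view of a result matching the basic schema (hypothetical helper)."""
    store_name: Optional[str]
    purchased_on: Optional[date]
    total: Optional[Decimal]
    currency: Optional[str]
    payment_method: Optional[str]


def parse_basic_receipt(data: dict) -> BasicReceipt:
    """Convert raw JSON values into Python types, keeping missing fields as None."""
    raw_date = data.get("date")
    raw_total = data.get("total")
    return BasicReceipt(
        store_name=data.get("store_name"),
        # "format": "date" means an ISO 8601 date such as 2025-12-18.
        purchased_on=date.fromisoformat(raw_date) if raw_date else None,
        # Go through str() so the Decimal holds 14.64, not the float's binary expansion.
        total=Decimal(str(raw_total)) if raw_total is not None else None,
        currency=data.get("currency"),
        payment_method=data.get("payment_method"),
    )


receipt = parse_basic_receipt({
    "store_name": "REWE City",
    "date": "2025-12-18",
    "total": 14.64,
    "currency": "EUR",
    "payment_method": "EC-Karte",
})
```

Using `Decimal` for money avoids rounding surprises when totals are later summed across many receipts.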

Detailed Schema

For full receipt digitization with line items:

{
  "type": "object",
  "properties": {
    "store": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "address": { "type": "string" },
        "phone": { "type": "string" },
        "tax_id": { "type": "string" }
      }
    },
    "date": { "type": "string", "format": "date" },
    "time": { "type": "string" },
    "receipt_number": { "type": "string" },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "quantity": { "type": "number" },
          "unit_price": { "type": "number" },
          "total": { "type": "number" }
        }
      }
    },
    "subtotal": { "type": "number" },
    "tax_rate": { "type": "number" },
    "tax_amount": { "type": "number" },
    "total": { "type": "number" },
    "payment_method": { "type": "string" },
    "currency": { "type": "string" }
  }
}

Example: Grocery Receipt

Upload a photo of a grocery receipt:

curl -X POST https://api.smole.tech/api/pipeline/file \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@grocery-receipt.jpg" \
  -F "schemaId=YOUR_SCHEMA_ID"

Get structured data:

{
  "store": {
    "name": "REWE City",
    "address": "Friedrichstr. 67, 10117 Berlin",
    "phone": null,
    "tax_id": "DE136695976"
  },
  "date": "2025-12-18",
  "time": "14:32",
  "receipt_number": "4821-0094",
  "items": [
    { "name": "Bio Bananen", "quantity": 1, "unit_price": 1.99, "total": 1.99 },
    { "name": "Hafermilch 1L", "quantity": 2, "unit_price": 1.49, "total": 2.98 },
    { "name": "Vollkornbrot", "quantity": 1, "unit_price": 2.79, "total": 2.79 },
    { "name": "Tomaten 500g", "quantity": 1, "unit_price": 1.89, "total": 1.89 },
    { "name": "Olivenöl 500ml", "quantity": 1, "unit_price": 4.99, "total": 4.99 }
  ],
  "subtotal": 14.64,
  "tax_rate": 0.07,
  "tax_amount": 1.02,
  "total": 15.66,
  "payment_method": "EC-Karte",
  "currency": "EUR"
}

Every item is extracted with its correct price, even from a thermal-printed receipt photographed on a phone.

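Example: Restaurant Receipt

The same detailed schema applied to a restaurant bill. Fields that do not appear on the receipt, such as the phone number or receipt number, are absent from the result: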

{
  "store": {
    "name": "Trattoria Milano",
    "address": "Kantstr. 42, 10625 Berlin"
  },
  "date": "2025-12-20",
  "items": [
    { "name": "Pizza Margherita", "quantity": 2, "unit_price": 11.50, "total": 23.00 },
    { "name": "Insalata Mista", "quantity": 1, "unit_price": 8.90, "total": 8.90 },
    { "name": "Tiramisu", "quantity": 2, "unit_price": 6.50, "total": 13.00 },
    { "name": "Acqua 0.75L", "quantity": 1, "unit_price": 3.50, "total": 3.50 }
  ],
  "subtotal": 48.40,
  "tax_rate": 0.19,
  "tax_amount": 9.20,
  "total": 57.60,
  "payment_method": "Visa ending 4821",
  "currency": "EUR"
}
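Extracted amounts can be cross-checked with simple arithmetic before they reach accounting. The sketch below is one possible check, not something the API performs: `check_totals` is a hypothetical helper, and it assumes tax is added on top of the subtotal, as in the restaurant example above. Receipts that print tax-inclusive prices would need a different rule.

```python
from decimal import Decimal


def to_decimal(value) -> Decimal:
    """Convert a JSON number to Decimal via str() to avoid float artifacts."""
    return Decimal(str(value))


def check_totals(receipt: dict, tolerance: Decimal = Decimal("0.01")) -> list:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    items = receipt.get("items") or []
    subtotal = receipt.get("subtotal")
    tax_amount = receipt.get("tax_amount")
    total = receipt.get("total")

    # Rule 1: line-item totals should add up to the subtotal.
    if items and subtotal is not None:
        item_sum = sum(
            (to_decimal(i["total"]) for i in items if i.get("total") is not None),
            Decimal("0"),
        )
        if abs(item_sum - to_decimal(subtotal)) > tolerance:
            problems.append(f"items sum to {item_sum}, subtotal is {subtotal}")

    # Rule 2 (assumption: tax is added on top): subtotal + tax = total.
    if subtotal is not None and tax_amount is not None and total is not None:
        expected = to_decimal(subtotal) + to_decimal(tax_amount)
        if abs(expected - to_decimal(total)) > tolerance:
            problems.append(f"subtotal + tax is {expected}, total is {total}")

    return problems


restaurant = {
    "items": [
        {"name": "Pizza Margherita", "quantity": 2, "unit_price": 11.50, "total": 23.00},
        {"name": "Insalata Mista", "quantity": 1, "unit_price": 8.90, "total": 8.90},
        {"name": "Tiramisu", "quantity": 2, "unit_price": 6.50, "total": 13.00},
        {"name": "Acqua 0.75L", "quantity": 1, "unit_price": 3.50, "total": 3.50},
    ],
    "subtotal": 48.40,
    "tax_amount": 9.20,
    "total": 57.60,
}

print(check_totals(restaurant))  # prints []
```

A non-empty list is a good signal to route the receipt to manual review rather than reject it outright, since the photo may simply have cut off a line.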

Building a Receipt Processing App

Mobile Expense Tracker

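import os

import requests
from fastapi import FastAPI, UploadFile

# Base URL from the curl example above. The environment variable names are
# placeholders; use whatever your deployment provides.
API_BASE = "https://api.smole.tech/api"
API_KEY = os.environ["SMOLE_API_KEY"]
RECEIPT_SCHEMA_ID = os.environ["RECEIPT_SCHEMA_ID"]

app = FastAPI()


@app.post("/scan-receipt")
def scan_receipt(file: UploadFile):
    """Upload a receipt photo and get structured data back."""
    # A plain `def` lets FastAPI run the blocking `requests` call in a worker
    # thread instead of stalling the event loop.
    resp = requests.post(
        f"{API_BASE}/pipeline/file",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": (file.filename, file.file)},
        data={"schemaId": RECEIPT_SCHEMA_ID},
        timeout=30,
    )
    resp.raise_for_status()
    pipeline_id = resp.json()["id"]

    # Wait for the result. poll_for_result is a helper you supply: it should
    # poll the pipeline until extraction finishes and return the extracted data.
    result = poll_for_result(pipeline_id)

    return {
        "store": (result.get("store") or {}).get("name"),
        "date": result.get("date"),
        "total": result.get("total"),
        "items": result.get("items") or [],
        "category": classify_expense(result),
    }


def classify_expense(receipt):
    """Simple category classification based on store name."""
    # Guard against a missing store or a null name in the extraction result.
    store = ((receipt.get("store") or {}).get("name") or "").lower()
    if any(w in store for w in ["rewe", "edeka", "aldi", "lidl"]):
        return "Groceries"
    if any(w in store for w in ["restaurant", "trattoria", "café"]):
        return "Dining"
    if any(w in store for w in ["shell", "aral", "esso"]):
        return "Transport"
    return "Other"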
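The handler above calls `poll_for_result` without defining it. One way to build it is on top of a generic polling loop like the sketch below. Everything here is an assumption for illustration: the `poll_until_done` name, the `"status"` and `"result"` keys, and the `"completed"` and `"failed"` values are hypothetical, so check the API documentation for the real response shape and status endpoint.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() repeatedly until it reports success, failure, or timeout.

    fetch_status is any zero-argument callable returning the parsed JSON of a
    status request. The field names checked below are assumed, not documented.
    """
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch_status()
        status = payload.get("status")
        if status == "completed":  # assumed success value
            return payload.get("result") or {}
        if status == "failed":  # assumed failure value
            raise RuntimeError(f"extraction failed: {payload.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(interval)


# Usage with a stand-in for the real status request:
fake_responses = iter([
    {"status": "processing"},
    {"status": "completed", "result": {"total": 57.6}},
])
result = poll_until_done(lambda: next(fake_responses), sleep=lambda _: None)
print(result)  # prints {'total': 57.6}
```

Passing the status request in as a callable keeps the loop independent of any particular endpoint, so `poll_for_result(pipeline_id)` can be a thin wrapper that supplies the actual HTTP call.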

Tips for Better Receipt Extraction

Photo Quality

  • Flatten the receipt before photographing — smooth out wrinkles and folds
  • Use good lighting — even, diffused light without shadows
  • Fill the frame — get the receipt to fill most of the photo
  • Focus on the text — make sure the smallest text is readable in the photo

Schema Tips

  • Include currency — Essential for international expense tracking
  • Use payment_method — Useful for reconciliation with bank statements
  • Keep items flexible — Not every receipt has quantity and unit_price; some just have item name and total
  • Add store.tax_id — Needed for VAT reclaim in many European countries
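Keeping items flexible means downstream code has to tolerate line items that carry only a name and a total. A minimal sketch of that handling follows; `normalize_item` is a hypothetical helper, and it deliberately leaves unknown values as `None` rather than guessing a quantity of 1.

```python
from decimal import Decimal
from typing import Optional


def _dec(value) -> Optional[Decimal]:
    """Convert a JSON number to Decimal, passing None through unchanged."""
    return None if value is None else Decimal(str(value))


def normalize_item(item: dict) -> dict:
    """Return a line item with all four keys present, deriving total when possible."""
    quantity = _dec(item.get("quantity"))
    unit_price = _dec(item.get("unit_price"))
    total = _dec(item.get("total"))

    # If only quantity and unit price were printed, the line total follows from them.
    if total is None and quantity is not None and unit_price is not None:
        total = (quantity * unit_price).quantize(Decimal("0.01"))

    return {
        "name": item.get("name"),
        "quantity": quantity,
        "unit_price": unit_price,
        "total": total,
    }


# A line with only name and total keeps quantity and unit_price as None.
sparse = normalize_item({"name": "Tomaten 500g", "total": 1.89})

# A line without a total gets one computed from quantity and unit price.
derived = normalize_item({"name": "Hafermilch 1L", "quantity": 2, "unit_price": 1.49})
```

Here `sparse["quantity"]` stays `None` and `derived["total"]` is `Decimal("2.98")`, matching the grocery example earlier in the post.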

Use Cases

  • Expense management — Employees snap receipts, data flows to accounting
  • Bookkeeping automation — Small businesses processing daily sales receipts
  • VAT recovery — Extracting tax details for cross-border VAT reclaims
  • Personal finance — Tracking spending by category from receipt photos
  • Audit trails — Digitizing paper receipts for compliance documentation

Try It Now

Photograph a receipt and upload it in the Playground. Define your schema and see structured data in seconds — even from faded, crumpled thermal paper.

For API integration, see the documentation.