Internal Document Automation

Many organizations still process internal forms, expense reports, and operational documents by hand. Automate the data capture step by extracting structured data from any internal document format.

Document automation: from form to structured JSON

Input document
EXPENSE REPORT

Submitted by: Julia Bergmann
Employee ID: EMP-4821
Department: Marketing
Submission Date: 2024-10-28
Reporting Period: October 1-31, 2024
Approver: Stefan Haas, Marketing Director

Purpose: Q4 Campaign Launch - Client meetings
         and trade show attendance

-------------------------------------------------
Date        Category        Description     Amount
-------------------------------------------------
2024-10-03  Travel          Train BER-MUC    189.00
2024-10-03  Accommodation   Hotel Munich     245.00
2024-10-04  Meals           Client dinner    127.50
2024-10-04  Travel          Taxi to venue     34.80
2024-10-15  Conference      AdTech Expo      599.00
2024-10-15  Travel          Flight BER-HAM   156.00
2024-10-16  Accommodation   Hotel Hamburg    198.00
2024-10-16  Meals           Team lunch        89.40
2024-10-22  Office          Print materials  312.00
2024-10-28  Software        Design tool sub   49.99
-------------------------------------------------
                            TOTAL:         2,000.69

Payment Method: Corporate Card ending 4821
Receipts Attached: 10 of 10

Employee Signature: J. Bergmann
Date: 2024-10-28

Extracted JSON
{
  "employee": {
    "name": "Julia Bergmann",
    "employee_id": "EMP-4821",
    "department": "Marketing"
  },
  "submission_date": "2024-10-28",
  "reporting_period": {
    "start": "2024-10-01",
    "end": "2024-10-31"
  },
  "approver": {
    "name": "Stefan Haas",
    "title": "Marketing Director"
  },
  "purpose": "Q4 Campaign Launch - Client meetings and trade show attendance",
  "expenses": [
    { "date": "2024-10-03", "category": "Travel", "description": "Train BER-MUC", "amount": 189.00 },
    { "date": "2024-10-03", "category": "Accommodation", "description": "Hotel Munich", "amount": 245.00 },
    { "date": "2024-10-04", "category": "Meals", "description": "Client dinner", "amount": 127.50 },
    { "date": "2024-10-04", "category": "Travel", "description": "Taxi to venue", "amount": 34.80 },
    { "date": "2024-10-15", "category": "Conference", "description": "AdTech Expo", "amount": 599.00 },
    { "date": "2024-10-15", "category": "Travel", "description": "Flight BER-HAM", "amount": 156.00 },
    { "date": "2024-10-16", "category": "Accommodation", "description": "Hotel Hamburg", "amount": 198.00 },
    { "date": "2024-10-16", "category": "Meals", "description": "Team lunch", "amount": 89.40 },
    { "date": "2024-10-22", "category": "Office", "description": "Print materials", "amount": 312.00 },
    { "date": "2024-10-28", "category": "Software", "description": "Design tool sub", "amount": 49.99 }
  ],
  "total": 2000.69,
  "payment_method": "Corporate Card ending 4821",
  "receipts_attached": 10,
  "receipts_expected": 10
}
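
Once a report is available as JSON, simple consistency checks become easy to automate. The following is a minimal sketch, not part of Smole itself: the function name `check_expense_report` is hypothetical, and the sample is an abbreviated copy of the output above (same amounts, fewer fields). It re-adds the line items and compares the result with the stated total, then compares the two receipt counts.

```python
import json
from decimal import Decimal

# Abbreviated copy of the extracted JSON above (amounts and counts only).
RAW = """
{
  "expenses": [
    {"amount": 189.00}, {"amount": 245.00}, {"amount": 127.50},
    {"amount": 34.80},  {"amount": 599.00}, {"amount": 156.00},
    {"amount": 198.00}, {"amount": 89.40},  {"amount": 312.00},
    {"amount": 49.99}
  ],
  "total": 2000.69,
  "receipts_attached": 10,
  "receipts_expected": 10
}
"""

# parse_float=Decimal keeps currency amounts exact instead of binary floats.
extracted = json.loads(RAW, parse_float=Decimal)


def check_expense_report(report):
    """Return a list of problems; an empty list means the report is consistent."""
    problems = []

    # Sum the line items and compare with the total stated on the document.
    line_sum = sum((item["amount"] for item in report["expenses"]), Decimal("0"))
    if line_sum != report["total"]:
        problems.append(
            f"line items sum to {line_sum}, but total is {report['total']}"
        )

    # Every expected receipt should be attached.
    if report["receipts_attached"] != report["receipts_expected"]:
        problems.append(
            f"{report['receipts_attached']} of "
            f"{report['receipts_expected']} receipts attached"
        )
    return problems


print(check_expense_report(extracted))  # [] for the sample report
```

Amounts are parsed as `Decimal` because sums of binary floating-point values can differ from the printed total by a tiny rounding error, which would trigger false alarms in an equality check.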

Define your schema

Tell Smole which fields to extract, and what type each one has, using a JSON Schema. The schema below describes the expense report output shown above.

{
  "type": "object",
  "properties": {
    "employee": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "employee_id": { "type": "string" },
        "department": { "type": "string" }
      }
    },
    "submission_date": { "type": "string", "format": "date" },
    "reporting_period": {
      "type": "object",
      "properties": {
        "start": { "type": "string", "format": "date" },
        "end": { "type": "string", "format": "date" }
      }
    },
    "approver": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "title": { "type": "string" }
      }
    },
    "purpose": { "type": "string" },
    "expenses": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "date": { "type": "string", "format": "date" },
          "category": { "type": "string" },
          "description": { "type": "string" },
          "amount": { "type": "number" }
        }
      }
    },
    "total": { "type": "number" },
    "payment_method": { "type": "string" },
    "receipts_attached": { "type": "integer" },
    "receipts_expected": { "type": "integer" }
  }
}
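
To show what such a schema constrains, here is a minimal sketch of a validator using only the Python standard library. It is an illustration under stated assumptions, not Smole's implementation: it handles only the keywords used in the schema above (`type`, `properties`, `items`, `format: date`), and the helper names `validate` and `is_iso_date` are hypothetical. It also chooses to enforce `format`, which recent JSON Schema drafts treat as an annotation by default. A complete validator library covers far more of the specification.

```python
import re
from datetime import date

# JSON Schema type name -> accepted Python types.
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
}


def is_iso_date(text):
    """True if text is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def validate(value, schema, path="$"):
    """Collect type and date-format mismatches for a small keyword subset."""
    py_type = TYPES.get(schema.get("type"))
    # bool is a subclass of int in Python, so reject it explicitly.
    if py_type is not None and (
        isinstance(value, bool) or not isinstance(value, py_type)
    ):
        return [f"{path}: expected {schema['type']}, got {type(value).__name__}"]

    errors = []
    if schema.get("format") == "date" and isinstance(value, str):
        if not is_iso_date(value):
            errors.append(f"{path}: {value!r} is not a YYYY-MM-DD date")
    if isinstance(value, dict):
        for key, subschema in schema.get("properties", {}).items():
            # Missing keys pass: the schema above declares no "required" list.
            if key in value:
                errors.extend(validate(value[key], subschema, f"{path}.{key}"))
    if isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Abbreviated version of the schema above.
schema = {
    "type": "object",
    "properties": {
        "submission_date": {"type": "string", "format": "date"},
        "expenses": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "date": {"type": "string", "format": "date"},
                    "amount": {"type": "number"},
                },
            },
        },
        "receipts_attached": {"type": "integer"},
    },
}

good = {
    "submission_date": "2024-10-28",
    "expenses": [{"date": "2024-10-03", "amount": 189.00}],
    "receipts_attached": 10,
}
bad = {
    "submission_date": "28.10.2024",  # not ISO formatted
    "expenses": [{"date": "2024-10-03", "amount": "189,00"}],  # string, not number
    "receipts_attached": 10,
}

print(validate(good, schema))  # []
for problem in validate(bad, schema):
    print(problem)
```

Because the schema lists no `required` fields, a document with a missing field still passes this check; only fields that are present are compared against their declared types.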

Try with your own documents

Upload a document and define your schema. See results in seconds.