How to Extract Data from Images with an API
Not every document arrives as a neat PDF. Receipts get photographed on phones. Whiteboards get snapped in meetings. Forms get scanned on flatbed scanners. Business cards get captured at conferences. The data in these images is valuable — but locked inside pixels.
Here's how to extract structured data from any image using OCR and schema-based extraction.
Supported Image Formats
Smole processes all common image formats:
| Format | Common Sources |
|---|---|
| JPEG/JPG | Phone photos, camera captures |
| PNG | Screenshots, digital exports |
| TIFF | Enterprise scanners, fax machines |
| BMP | Legacy systems |
| WEBP | Web downloads |
| GIF | Older document captures |
You can also send images embedded in PDFs — the system detects image-based pages automatically.
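Before uploading a batch, it can help to filter out files the service is unlikely to accept. The following is a minimal sketch, assuming the usual file extensions for the formats in the table above (`.tif` alongside `.tiff`, plus `.pdf` for image-based PDFs). The helper names are hypothetical, and the extension set is an assumption rather than an official list from the API.

```python
from pathlib import Path

# Assumed extensions for the formats in the table above; adjust to match the API docs.
SUPPORTED_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".webp", ".gif", ".pdf",
}


def is_supported_image(path):
    """Return True if the file extension matches a listed format (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(paths):
    """Split paths into (supported, skipped) lists, preserving order."""
    supported = [p for p in paths if is_supported_image(p)]
    skipped = [p for p in paths if not is_supported_image(p)]
    return supported, skipped
```

This only inspects the file name, not the file contents, so a mislabeled file would still pass the check.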
How It Works
- OCR reads the text from the image, handling rotation, skew, and varying quality
- Text reconstruction assembles the OCR output into structured content
- Schema-based extraction maps the content to your defined JSON structure
All three steps happen in a single API call.
Example: Extracting a Photographed Receipt
You snap a photo of a receipt with your phone. Define what you want:
{
  "type": "object",
  "properties": {
    "store_name": { "type": "string" },
    "date": { "type": "string", "format": "date" },
    "items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "price": { "type": "number" }
        }
      }
    },
    "subtotal": { "type": "number" },
    "tax": { "type": "number" },
    "total": { "type": "number" },
    "payment_method": { "type": "string" }
  }
}
Upload the photo:
curl -X POST https://api.smole.tech/api/pipeline/file \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@receipt.jpg" \
  -F "schemaId=YOUR_SCHEMA_ID"
Get structured data:
{
  "store_name": "Edeka Markt Berlin",
  "date": "2025-12-15",
  "items": [
    { "name": "Bio Vollmilch 1L", "price": 1.29 },
    { "name": "Dinkelbrötchen 4x", "price": 2.49 },
    { "name": "Tomaten 500g", "price": 1.99 },
    { "name": "Olivenöl 500ml", "price": 4.99 }
  ],
  "subtotal": 10.76,
  "tax": 0.75,
  "total": 11.51,
  "payment_method": "EC Card"
}
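Because OCR can misread digits, it may be worth sanity-checking the numbers before trusting them. The sketch below is one possible check, assuming the receipt follows the simple layout of this example (item prices add up to the subtotal, and subtotal plus tax equals the total). Receipts with discounts, deposits, or tax-inclusive prices would need different rules. The function name `check_receipt_totals` is hypothetical and not part of the API.

```python
from decimal import Decimal


def check_receipt_totals(receipt, tolerance="0.01"):
    """Return a list of problems found; an empty list means the numbers add up."""
    tol = Decimal(tolerance)

    def to_dec(value):
        # Go through str() so that 1.29 becomes Decimal("1.29"), not a binary float artifact.
        return Decimal(str(value))

    items_sum = sum((to_dec(i["price"]) for i in receipt.get("items", [])), Decimal("0"))
    subtotal = to_dec(receipt["subtotal"])
    tax = to_dec(receipt["tax"])
    total = to_dec(receipt["total"])

    problems = []
    if abs(items_sum - subtotal) > tol:
        problems.append(f"items sum to {items_sum}, but subtotal is {subtotal}")
    if abs(subtotal + tax - total) > tol:
        problems.append(f"subtotal + tax is {subtotal + tax}, but total is {total}")
    return problems
```

For the receipt above, the item prices sum to 10.76 and 10.76 + 0.75 = 11.51, so the check returns an empty list. A non-empty result is a signal to route the receipt to manual review rather than proof that the extraction is wrong.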
Example: Business Card Capture
{
  "type": "object",
  "properties": {
    "name": { "type": "string" },
    "title": { "type": "string" },
    "company": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "phone": { "type": "string" },
    "address": { "type": "string" },
    "website": { "type": "string" }
  }
}
Upload a photo of a business card, get a clean contact record you can push to your CRM.
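Extracted contact fields often need light cleanup before they go into a CRM. The following is a rough sketch of that step, assuming the extraction result is a flat dictionary with the field names from the schema above. The function `normalize_contact` and its normalization rules are illustrative choices, not something the API provides.

```python
import re


def normalize_contact(card):
    """Trim whitespace, drop empty fields, lowercase the email, and strip phone punctuation."""
    contact = {
        key: value.strip()
        for key, value in card.items()
        if isinstance(value, str) and value.strip()
    }
    if "email" in contact:
        contact["email"] = contact["email"].lower()
    if "phone" in contact:
        # Keep only digits and "+", so "+49 (30) 1234-5678" becomes "+493012345678".
        contact["phone"] = re.sub(r"[^\d+]", "", contact["phone"])
    return contact
```

Dropping empty fields instead of storing empty strings makes it easier to tell "not printed on the card" apart from real values when merging with existing CRM records.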
Example: Whiteboard Notes
After a meeting, capture the whiteboard:
{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "action_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "task": { "type": "string" },
          "assignee": { "type": "string" },
          "deadline": { "type": "string" }
        }
      }
    },
    "key_decisions": {
      "type": "array",
      "items": { "type": "string" }
    },
    "notes": { "type": "string" }
  }
}
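Once the whiteboard is extracted, the action items can be turned into something you can paste into a ticket or a follow-up email. This is a small sketch, assuming the result matches the schema above and that missing assignees or deadlines come back as empty or null values. The function name `format_action_items` and the checklist layout are hypothetical.

```python
def format_action_items(notes):
    """Render extracted action items as a plain-text checklist, one line per task."""
    lines = []
    for item in notes.get("action_items", []):
        task = (item.get("task") or "").strip()
        if not task:
            continue  # skip entries where no task text was recognized
        assignee = item.get("assignee") or "unassigned"
        deadline = item.get("deadline") or "none"
        lines.append(f"- [ ] {task} (owner: {assignee}; deadline: {deadline})")
    return "\n".join(lines)
```

Handwriting on whiteboards is harder to read than print, so treating assignee and deadline as optional keeps the output usable even when only the task text is recovered.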
Tips for Better Image Extraction
Photo Quality
- Lighting: Even lighting without harsh shadows produces the best results. Avoid glare on glossy paper.
- Focus: Make sure text is sharp. Blurry images significantly reduce OCR accuracy.
- Angle: Shoot straight-on when possible. Slight angles are handled, but extreme perspective distortion can cause issues.
- Resolution: Higher is better, but even phone cameras at default settings work well.
Schema Design
- Be specific with field names: `store_name` works better than `name` when there are multiple name-like fields in the image
- Use appropriate types: `number` for prices so `€4.99` becomes `4.99`, `string` with `format: "date"` for dates
- Keep schemas focused: Extract the data you need, not everything visible in the image
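One way to keep a schema focused is to start from a fuller schema and trim it to the fields a given workflow needs. The helper below is a minimal sketch of that idea, operating on a JSON Schema held as a Python dictionary. `focus_schema` is a hypothetical name, and how you register the resulting schema with the service is not shown here.

```python
def focus_schema(schema, wanted):
    """Return a copy of an object schema that keeps only the wanted top-level properties."""
    props = schema.get("properties", {})
    missing = [name for name in wanted if name not in props]
    if missing:
        raise KeyError(f"fields not in schema: {missing}")

    focused = dict(schema)
    focused["properties"] = {name: props[name] for name in wanted}
    if "required" in schema:
        # Keep the "required" list consistent with the trimmed properties.
        focused["required"] = [name for name in schema["required"] if name in wanted]
    return focused
```

For an expense workflow that needs only the merchant and the amount, `focus_schema(receipt_schema, ["store_name", "total"])` would produce a two-field schema while leaving the original untouched.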
Building an Image Processing Pipeline
For automated workflows — expense tracking, document scanning, form processing:
import time
from pathlib import Path

import requests

API_KEY = "YOUR_API_KEY"  # same placeholder as in the curl example


def process_image(image_path, schema_id):
    """Extract structured data from an image."""
    # Upload the image together with the ID of the schema to extract into
    with open(image_path, "rb") as f:
        resp = requests.post(
            "https://api.smole.tech/api/pipeline/file",
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"file": f},
            data={"schemaId": schema_id},
        )
    resp.raise_for_status()
    pipeline_id = resp.json()["id"]

    # Poll for result
    while True:
        poll = requests.get(
            f"https://api.smole.tech/api/pipeline/{pipeline_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
        poll.raise_for_status()
        result = poll.json()
        if result["status"] == "completed":
            return result["extraction"]["data"]
        if result["status"] == "failed":
            raise RuntimeError(f"Failed: {result.get('error')}")
        time.sleep(2)


# Process all images in a folder.
# receipt_schema_id and save_to_database are placeholders for your own
# schema ID and storage function.
for img in Path("./receipts").glob("*.jpg"):
    data = process_image(str(img), receipt_schema_id)
    save_to_database(data)
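The polling loop above runs until the pipeline reports "completed" or "failed", so a job that never reaches either state would block the batch forever. A possible refinement is to add a deadline. The sketch below is a generic helper, not part of the API: `poll_until_done` is a hypothetical name, `fetch_status` stands for any function that returns the parsed status response, and the default timeout and interval are arbitrary assumptions.

```python
import time


def poll_until_done(fetch_status, timeout_s=120.0, interval_s=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports completion, failure, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"Pipeline failed: {result.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"No final status after {timeout_s} seconds")
        sleep(interval_s)
```

Inside `process_image`, `fetch_status` could be a small function that performs the GET request and returns its JSON body. Passing `sleep` and `clock` as parameters makes the helper easy to test without real delays.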
Use Cases
- Expense management — Employees photograph receipts, data flows into your accounting system automatically
- Inventory counting — Photograph shelf labels or inventory sheets, extract quantities and SKUs
- Form digitization — Scan paper forms, extract structured data for processing
- Document archiving — Photograph legacy documents, extract key metadata for searchable archives
- Field data capture — Technicians photograph serial plates, inspection reports, or delivery notes on-site
Try It Now
Take a photo of a receipt, invoice, or business card and upload it in the Playground. Define your schema and see extraction results in seconds.
For API integration, see the documentation.
