Add AI receipt scanning with OCR pipeline and debug toggle

- OCR pipeline: Tesseract (images) + pdfplumber (PDFs) → AI text prompt →
  rule-based regex fallback; works with any text model, not just vision models
- Scan Receipt toolbar button parses a photo and pre-fills the transaction form;
  receipt image is automatically attached to the created transaction
- AI settings page: provider, API key (AES-256-GCM encrypted), custom URL,
  model, and per-user debug toggle that gates the OCR/AI debug panel
- Fix CSRF cookie secure=False so HTTP deployments work; add 7-day max_age
- Fix attachment_refs missing from _to_response (attachments never appeared in UI)
- Fix multipart boundary lost when Content-Type was set manually in axios calls
- nginx: raise client_max_body_size to 15 MB, add 120s proxy timeout for OCR
- Migration 0005: add ai_debug boolean to users table
- Update README and CLAUDE.md with AI scanning docs and architecture notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
megaproxy 2026-04-22 22:07:38 +00:00
parent a7c54ca61c
commit 26e2a055db
16 changed files with 397 additions and 99 deletions

View file

@ -70,7 +70,7 @@ backend/app/
dependencies.py — get_db, get_redis, get_current_user dependencies.py — get_db, get_redis, get_current_user
api/ api/
router.py — Central router; investments/reports/budgets have no prefix (paths self-contained) router.py — Central router; investments/reports/budgets have no prefix (paths self-contained)
v1/ — One file per domain (auth, accounts, transactions, budgets, reports, investments, predictions) v1/ — One file per domain (auth, accounts, transactions, budgets, reports, investments, predictions, settings)
db/models/ — SQLAlchemy 2.0 Mapped models db/models/ — SQLAlchemy 2.0 Mapped models
schemas/ — Pydantic request/response models (separate Create/Update/Response per domain) schemas/ — Pydantic request/response models (separate Create/Update/Response per domain)
services/ — Business logic; each service owns one domain services/ — Business logic; each service owns one domain
@ -88,6 +88,21 @@ backend/app/
- Import deduplication via SHA-256 `import_hash` on transactions - Import deduplication via SHA-256 `import_hash` on transactions
- Every mutation writes to `AuditLog` (append-only; app role has no UPDATE/DELETE on that table) - Every mutation writes to `AuditLog` (append-only; app role has no UPDATE/DELETE on that table)
- Soft deletes: `deleted_at` timestamp; all queries must filter `WHERE deleted_at IS NULL` - Soft deletes: `deleted_at` timestamp; all queries must filter `WHERE deleted_at IS NULL`
- `_to_response()` in `transaction_service.py` must include all fields returned to the frontend — omitting a field here makes it invisible to the UI even if it's in the DB
### AI / receipt parsing (`api/v1/settings.py`, `api/v1/transactions.py`)
- User AI config (provider, encrypted API key, base URL, model, debug flag) lives on the `users` table; managed via `GET/PUT/DELETE /settings/ai`
- `ai_api_key_enc` is AES-256-GCM encrypted with `encrypt_field`/`decrypt_field`
- Receipt parsing pipeline in `_call_ai_parse()`: OCR text extraction (`_extract_ocr_text`) → AI text prompt → rule-based fallback (`_rule_based_parse`)
- Images: pytesseract; PDFs: pdfplumber (text layer) → pdf2image + tesseract (scanned fallback)
- AI receives OCR text, not the image — works with any text model, not just vision models
- `_RECEIPT_TEXT_PROMPT` uses `.format(ocr_text=...)` — escape literal braces in the JSON example with `{{` and `}}`
- `POST /transactions/parse-receipt` — scan without an existing transaction (used by "Scan Receipt" toolbar button)
- `POST /transactions/{id}/attachments/{att_id}/parse` — parse an already-uploaded attachment
- `ai_debug` boolean on user controls whether the OCR/AI debug panel shows in the transaction form; check `aiSettings?.debug` on the frontend via the `["ai-settings"]` query key
### CSRF cookie
- Set with `secure=False` and `max_age=604800` (7 days) intentionally — the CSRF token is a public value readable by JS; `Secure` would break HTTP deployments. Session/auth cookies remain properly secured.
### Frontend layout ### Frontend layout
``` ```

View file

@ -14,6 +14,7 @@ Runs entirely on your own hardware via Docker Compose. Designed for LAN access w
- Transfer detection between accounts - Transfer detection between accounts
- Recurring transaction rules (rrule) - Recurring transaction rules (rrule)
- Receipt and document attachments on transactions (JPEG, PNG, WebP, PDF — up to 10 MB each) - Receipt and document attachments on transactions (JPEG, PNG, WebP, PDF — up to 10 MB each)
- **AI receipt scanning** — photograph a receipt to auto-extract merchant, amount, date, and description into a new transaction; receipt is automatically attached
- CSV import with **auto-detection** for 10 UK bank formats: Monzo, Starling, Revolut, Barclays, Lloyds, NatWest, HSBC, Santander, Nationwide, and generic fallback - CSV import with **auto-detection** for 10 UK bank formats: Monzo, Starling, Revolut, Barclays, Lloyds, NatWest, HSBC, Santander, Nationwide, and generic fallback
- SHA-256 deduplication prevents re-importing the same transactions - SHA-256 deduplication prevents re-importing the same transactions
@ -101,6 +102,7 @@ Ten independent security layers:
| Database | PostgreSQL 16 with pgcrypto and RLS | | Database | PostgreSQL 16 with pgcrypto and RLS |
| Cache / Sessions | Redis 7 | | Cache / Sessions | Redis 7 |
| ML | Prophet, statsmodels, NumPy, SciPy | | ML | Prophet, statsmodels, NumPy, SciPy |
| OCR | Tesseract 5, pdfplumber, pdf2image |
| Background jobs | APScheduler (in-process) | | Background jobs | APScheduler (in-process) |
| Containerisation | Docker Compose | | Containerisation | Docker Compose |
@ -172,6 +174,34 @@ Forward your domain to `http://<host>:4000`. The frontend nginx serves the React
--- ---
## AI Receipt Scanning
Receipt scanning uses OCR (Tesseract) to extract text from the image first, then optionally passes that text to an AI model to parse it into structured fields. This means **any text-capable LLM works** — you're not limited to vision models.
If no AI is configured, or if the AI call fails, a rule-based parser runs on the OCR text as a fallback (finds totals, dates, and merchant names via regex).
### Setup
Go to **Settings → AI** and fill in:
| Field | Description |
|-------|-------------|
| Provider | `Anthropic` or `OpenAI-compatible` |
| API Key | Your key (stored AES-256-GCM encrypted on your server) |
| Custom API URL | Optional — for Open WebUI, LM Studio, Ollama, etc. |
| Model | Optional — defaults to `claude-haiku-4-5-20251001` or `gpt-4o-mini` |
| Debug mode | Shows OCR text and raw AI response in the scan form when enabled |
For **Open WebUI**: set the provider to `OpenAI-compatible`, enter `http://your-server:port` as the URL (MyMidas appends `/v1/chat/completions`), and enter the model name exactly as shown in Open WebUI's interface.
Use the **Test connection** button to verify your settings before scanning.
### Usage
Click **Scan Receipt** in the transactions toolbar, select a photo or PDF. The form opens pre-filled with extracted fields — review and save. The receipt image is automatically attached to the created transaction.
---
## Backups ## Backups
Encrypted backups run automatically every night at 3 AM (GPG AES-256 symmetric encryption). Backups are stored in `./data/backups/` and retained for 30 days. Encrypted backups run automatically every night at 3 AM (GPG AES-256 symmetric encryption). Backups are stored in `./data/backups/` and retained for 30 days.

View file

@ -5,6 +5,9 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
gnupg \ gnupg \
gzip \ gzip \
gosu \ gosu \
tesseract-ocr \
tesseract-ocr-eng \
poppler-utils \
&& rm -rf /var/lib/apt/lists/* && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir uv RUN pip install --no-cache-dir uv
WORKDIR /app WORKDIR /app

View file

@ -0,0 +1,21 @@
"""add ai_debug flag to users
Revision ID: 0005
Revises: 0004
Create Date: 2026-04-22
"""
from alembic import op
import sqlalchemy as sa
revision = "0005"
down_revision = "0004"
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column("users", sa.Column("ai_debug", sa.Boolean(), nullable=False, server_default="false"))
def downgrade() -> None:
op.drop_column("users", "ai_debug")

View file

@ -22,6 +22,7 @@ class AiSettingsResponse(BaseModel):
has_api_key: bool has_api_key: bool
base_url: str | None base_url: str | None
model: str | None model: str | None
debug: bool
class AiSettingsSave(BaseModel): class AiSettingsSave(BaseModel):
@ -29,6 +30,7 @@ class AiSettingsSave(BaseModel):
api_key: str = "" api_key: str = ""
base_url: str = "" base_url: str = ""
model: str = "" model: str = ""
debug: bool = False
@router.get("/ai", response_model=AiSettingsResponse) @router.get("/ai", response_model=AiSettingsResponse)
@ -38,6 +40,7 @@ async def get_ai_settings(user: User = Depends(get_current_user)):
has_api_key=bool(user.ai_api_key_enc), has_api_key=bool(user.ai_api_key_enc),
base_url=user.ai_base_url, base_url=user.ai_base_url,
model=user.ai_model, model=user.ai_model,
debug=user.ai_debug,
) )
@ -54,6 +57,7 @@ async def save_ai_settings(
"ai_provider": body.provider, "ai_provider": body.provider,
"ai_base_url": body.base_url.rstrip("/") or None, "ai_base_url": body.base_url.rstrip("/") or None,
"ai_model": body.model.strip() or None, "ai_model": body.model.strip() or None,
"ai_debug": body.debug,
} }
if body.api_key.strip(): if body.api_key.strip():
@ -68,6 +72,7 @@ async def save_ai_settings(
has_api_key=True, has_api_key=True,
base_url=values["ai_base_url"], base_url=values["ai_base_url"],
model=values["ai_model"], model=values["ai_model"],
debug=body.debug,
) )

View file

@ -278,84 +278,205 @@ async def delete_attachment(
await db.commit() await db.commit()
_RECEIPT_PROMPT = ( _RECEIPT_TEXT_PROMPT = (
"You are a receipt parser. Extract information from this receipt and return ONLY a JSON object " "You are a receipt parser. Below is the raw text extracted from a receipt via OCR.\n\n"
"with exactly these keys (use null for any field you cannot determine):\n" "Receipt text:\n{ocr_text}\n\n"
'{"merchant": "store name", "amount": 0.00, "currency": "GBP", ' "Extract the information and return ONLY a JSON object with exactly these keys "
"(use null for any field you cannot determine):\n"
'{{"merchant": "store name", "amount": 0.00, "currency": "GBP", '
'"date": "YYYY-MM-DD", "description": "brief description", ' '"date": "YYYY-MM-DD", "description": "brief description", '
'"category": "one of: Food & Drink, Transport, Shopping, Entertainment, Health, Travel, Bills & Utilities, Other"}\n' '"category": "one of: Food & Drink, Transport, Shopping, Entertainment, Health, Travel, Bills & Utilities, Other"}}\n'
"Return ONLY the JSON object. No markdown, no explanation, no code fences." "Return ONLY the JSON object. No markdown, no explanation, no code fences."
) )
_EMPTY_RESULT: dict = {
"merchant": None, "amount": None, "currency": None,
"date": None, "description": None, "category": None,
"raw": None, "ocr_text": None,
}
def _extract_ocr_text(file_bytes: bytes, mime_type: str) -> str:
"""Extract text from an image or PDF. Returns empty string on failure."""
if mime_type == "application/pdf":
import io
import pdfplumber
try:
with pdfplumber.open(io.BytesIO(file_bytes)) as pdf:
pages_text = [page.extract_text() or "" for page in pdf.pages[:4]]
text = "\n".join(pages_text).strip()
if text:
return text
except Exception:
pass
# Scanned PDF — convert first page to image then OCR
try:
from pdf2image import convert_from_bytes
import pytesseract
images = convert_from_bytes(file_bytes, first_page=1, last_page=1, dpi=200)
if images:
return pytesseract.image_to_string(images[0])
except Exception:
pass
return ""
else:
import io
import pytesseract
from PIL import Image
try:
img = Image.open(io.BytesIO(file_bytes))
return pytesseract.image_to_string(img)
except Exception:
return ""
def _rule_based_parse(ocr_text: str) -> dict:
"""Extract receipt fields from OCR text using regex. Best-effort."""
import re
from datetime import datetime
lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
# Merchant: skip very short lines and lines that look like addresses/phone numbers
merchant = None
for ln in lines[:5]:
if len(ln) > 2 and not re.match(r"^[\d\s\-\+\(\)]+$", ln) and not re.match(r"^\d+\s+\w+", ln):
merchant = ln
break
# Currency from symbols
currency = None
if "£" in ocr_text:
currency = "GBP"
elif "" in ocr_text:
currency = "EUR"
elif "$" in ocr_text:
currency = "USD"
# Amount: prefer lines containing total/amount keywords, then fall back to largest number
amount = None
total_line_pat = re.compile(
r"(?:total|amount\s*due|grand\s*total|balance\s*due|subtotal|net\s*total)"
r"[^\d£$€]*([£$€]?\s*\d{1,6}[.,]\d{2})\b",
re.IGNORECASE,
)
all_amount_pat = re.compile(r"[£$€]?\s*(\d{1,6}[.,]\d{2})\b")
for m in total_line_pat.finditer(ocr_text):
raw = re.sub(r"[£$€\s]", "", m.group(1)).replace(",", ".")
try:
amount = float(raw)
break
except ValueError:
pass
if amount is None:
candidates = []
for m in all_amount_pat.finditer(ocr_text):
try:
candidates.append(float(m.group(1).replace(",", ".")))
except ValueError:
pass
if candidates:
amount = max(candidates)
# Date: try common formats
date = None
date_patterns = [
(r"\b(\d{4}[-/]\d{2}[-/]\d{2})\b", ["%Y-%m-%d", "%Y/%m/%d"]),
(r"\b(\d{2}[-/]\d{2}[-/]\d{4})\b", ["%d-%m-%Y", "%d/%m/%Y", "%m/%d/%Y"]),
(r"\b(\d{1,2}\s+(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+\d{4})\b", ["%d %B %Y", "%d %b %Y"]),
(r"\b((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?\s+\d{1,2},?\s+\d{4})\b", ["%B %d, %Y", "%b %d, %Y"]),
]
for pattern, fmts in date_patterns:
m = re.search(pattern, ocr_text, re.IGNORECASE)
if m:
raw_date = m.group(1).rstrip(".")
for fmt in fmts:
try:
date = datetime.strptime(raw_date, fmt).strftime("%Y-%m-%d")
break
except ValueError:
pass
if date:
break
description = merchant # simple default
return {
"merchant": merchant,
"amount": amount,
"currency": currency,
"date": date,
"description": description,
"category": None,
"raw": None,
"ocr_text": ocr_text,
}
def _strip_code_fence(text: str) -> str:
if text.startswith("```"):
parts = text.split("```")
text = parts[1] if len(parts) > 1 else text
if text.startswith("json"):
text = text[4:]
return text.strip()
async def _call_ai_parse(file_bytes: bytes, mime_type: str, user_row) -> dict: async def _call_ai_parse(file_bytes: bytes, mime_type: str, user_row) -> dict:
"""Call the configured AI provider and return parsed receipt fields.""" """
import base64 Parse a receipt: OCR text extraction AI (text prompt) rule-based fallback.
AI is optional; rules always run as fallback if AI is unconfigured or fails.
"""
import json import json
import httpx import httpx
from app.core.security import decrypt_field from app.core.security import decrypt_field
if not user_row.ai_provider or not user_row.ai_api_key_enc: # Step 1: extract text via OCR / PDF text layer
raise HTTPException(status_code=400, detail="No AI provider configured. Add your API key in Settings → AI.") ocr_text = _extract_ocr_text(file_bytes, mime_type)
has_ai = bool(user_row and user_row.ai_provider and user_row.ai_api_key_enc)
# Step 2: attempt AI parse if configured
if has_ai and ocr_text.strip():
api_key = decrypt_field(user_row.ai_api_key_enc) api_key = decrypt_field(user_row.ai_api_key_enc)
b64 = base64.standard_b64encode(file_bytes).decode()
custom_base_url = (user_row.ai_base_url or "").rstrip("/") custom_base_url = (user_row.ai_base_url or "").rstrip("/")
custom_model = (user_row.ai_model or "").strip() custom_model = (user_row.ai_model or "").strip()
prompt = _RECEIPT_TEXT_PROMPT.format(ocr_text=ocr_text)
try: try:
if user_row.ai_provider == "anthropic": if user_row.ai_provider == "anthropic":
base_url = custom_base_url or "https://api.anthropic.com" base_url = custom_base_url or "https://api.anthropic.com"
model = custom_model or "claude-haiku-4-5-20251001" model = custom_model or "claude-haiku-4-5-20251001"
if mime_type == "application/pdf":
content_block = {"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": b64}}
else:
content_block = {"type": "image", "source": {"type": "base64", "media_type": mime_type, "data": b64}}
async with httpx.AsyncClient(timeout=60) as client: async with httpx.AsyncClient(timeout=60) as client:
resp = await client.post( resp = await client.post(
f"{base_url}/v1/messages", f"{base_url}/v1/messages",
headers={"x-api-key": api_key, "anthropic-version": "2023-06-01", "content-type": "application/json"}, headers={"x-api-key": api_key, "anthropic-version": "2023-06-01", "content-type": "application/json"},
json={"model": model, "max_tokens": 512, "messages": [{"role": "user", "content": [content_block, {"type": "text", "text": _RECEIPT_PROMPT}]}]}, json={"model": model, "max_tokens": 512, "messages": [{"role": "user", "content": prompt}]},
) )
resp.raise_for_status() resp.raise_for_status()
text = resp.json()["content"][0]["text"].strip() raw = resp.json()["content"][0]["text"].strip()
elif user_row.ai_provider == "openai": elif user_row.ai_provider == "openai":
base_url = custom_base_url or "https://api.openai.com" base_url = custom_base_url or "https://api.openai.com"
model = custom_model or "gpt-4o-mini" model = custom_model or "gpt-4o-mini"
if mime_type == "application/pdf" and not custom_base_url:
raise HTTPException(status_code=400, detail="PDF parsing is not supported with the OpenAI provider. Use an image format or switch to Anthropic.")
async with httpx.AsyncClient(timeout=60) as client: async with httpx.AsyncClient(timeout=60) as client:
resp = await client.post( resp = await client.post(
f"{base_url}/v1/chat/completions", f"{base_url}/v1/chat/completions",
headers={"Authorization": f"Bearer {api_key}", "content-type": "application/json"}, headers={"Authorization": f"Bearer {api_key}", "content-type": "application/json"},
json={"model": model, "max_tokens": 512, "messages": [{"role": "user", "content": [ json={"model": model, "max_tokens": 512, "messages": [{"role": "user", "content": prompt}]},
{"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{b64}"}},
{"type": "text", "text": _RECEIPT_PROMPT},
]}]},
) )
resp.raise_for_status() resp.raise_for_status()
text = resp.json()["choices"][0]["message"]["content"].strip() raw = resp.json()["choices"][0]["message"]["content"].strip()
else: else:
raise HTTPException(status_code=400, detail="Unknown provider") raw = None
except httpx.HTTPStatusError as e:
raise HTTPException(status_code=502, detail=f"AI provider error: {e.response.status_code}")
except httpx.RequestError:
raise HTTPException(status_code=502, detail="Could not reach AI provider")
if text.startswith("```"):
text = text.split("```")[1]
if text.startswith("json"):
text = text[4:]
text = text.strip()
if raw:
cleaned = _strip_code_fence(raw)
try: try:
parsed = json.loads(text) parsed = json.loads(cleaned)
except json.JSONDecodeError:
raise HTTPException(status_code=502, detail="AI returned an unexpected response. Try again.")
return { return {
"merchant": parsed.get("merchant"), "merchant": parsed.get("merchant"),
"amount": parsed.get("amount"), "amount": parsed.get("amount"),
@ -363,8 +484,24 @@ async def _call_ai_parse(file_bytes: bytes, mime_type: str, user_row) -> dict:
"date": parsed.get("date"), "date": parsed.get("date"),
"description": parsed.get("description"), "description": parsed.get("description"),
"category": parsed.get("category"), "category": parsed.get("category"),
"raw": text, "raw": raw,
"ocr_text": ocr_text,
} }
except json.JSONDecodeError:
# AI returned something non-JSON — fall through to rules, keep raw for debug
pass
except (httpx.HTTPStatusError, httpx.RequestError):
pass # fall through to rule-based
# Step 3: rule-based fallback (also used when AI is not configured)
if ocr_text.strip():
return _rule_based_parse(ocr_text)
# Nothing worked
if has_ai:
raise HTTPException(status_code=400, detail="Could not extract any text from the file. Try a clearer image.")
raise HTTPException(status_code=400, detail="No AI configured and OCR extracted no text. Add an API key in Settings → AI or try a clearer image.")
@router.post("/parse-receipt") @router.post("/parse-receipt")

View file

@ -7,7 +7,6 @@ from fastapi import Request, Response
from starlette.middleware.base import BaseHTTPMiddleware from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse from starlette.responses import JSONResponse
from app.config import get_settings
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"} SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}
@ -57,7 +56,8 @@ class CSRFMiddleware(BaseHTTPMiddleware):
"csrf_token", token, "csrf_token", token,
httponly=False, # must be readable by JS httponly=False, # must be readable by JS
samesite="lax", samesite="lax",
secure=not get_settings().is_development, secure=False, # CSRF token is public by design; Secure would break HTTP deployments
max_age=604800, # 7 days — survive browser restarts
) )
return response return response
@ -65,7 +65,7 @@ class CSRFMiddleware(BaseHTTPMiddleware):
response = await call_next(request) response = await call_next(request)
if not existing_csrf: if not existing_csrf:
token = str(uuid.uuid4()) token = str(uuid.uuid4())
response.set_cookie("csrf_token", token, httponly=False, samesite="lax", secure=not get_settings().is_development) response.set_cookie("csrf_token", token, httponly=False, samesite="lax", secure=False, max_age=604800)
return response return response
if request.url.path in {"/api/v1/auth/login", "/api/v1/auth/login/totp"}: if request.url.path in {"/api/v1/auth/login", "/api/v1/auth/login/totp"}:

View file

@ -33,6 +33,7 @@ class User(Base):
ai_api_key_enc: Mapped[bytes | None] = mapped_column(LargeBinary, nullable=True) ai_api_key_enc: Mapped[bytes | None] = mapped_column(LargeBinary, nullable=True)
ai_base_url: Mapped[str | None] = mapped_column(Text, nullable=True) ai_base_url: Mapped[str | None] = mapped_column(Text, nullable=True)
ai_model: Mapped[str | None] = mapped_column(Text, nullable=True) ai_model: Mapped[str | None] = mapped_column(Text, nullable=True)
ai_debug: Mapped[bool] = mapped_column(Boolean, default=False, nullable=False)
accounts: Mapped[list["Account"]] = relationship(back_populates="user", lazy="noload") # type: ignore[name-defined] accounts: Mapped[list["Account"]] = relationship(back_populates="user", lazy="noload") # type: ignore[name-defined]
sessions: Mapped[list["Session"]] = relationship(back_populates="user", lazy="noload") # type: ignore[name-defined] sessions: Mapped[list["Session"]] = relationship(back_populates="user", lazy="noload") # type: ignore[name-defined]

View file

@ -47,6 +47,7 @@ def _to_response(t: Transaction) -> dict:
"notes": _dec(t.notes_enc), "notes": _dec(t.notes_enc),
"tags": t.tags or [], "tags": t.tags or [],
"is_recurring": t.is_recurring, "is_recurring": t.is_recurring,
"attachment_refs": t.attachment_refs or [],
"created_at": t.created_at, "created_at": t.created_at,
"updated_at": t.updated_at, "updated_at": t.updated_at,
} }

View file

@ -31,6 +31,9 @@ dependencies = [
"structlog>=24.0", "structlog>=24.0",
"pillow>=11.0", "pillow>=11.0",
"python-magic>=0.4", "python-magic>=0.4",
"pytesseract>=0.3",
"pdfplumber>=0.11",
"pdf2image>=1.17",
"psycopg2-binary>=2.9", "psycopg2-binary>=2.9",
] ]

View file

@ -3,6 +3,9 @@ server {
root /usr/share/nginx/html; root /usr/share/nginx/html;
index index.html; index index.html;
# Allow uploads up to 15 MB (receipt images can be several MB)
client_max_body_size 15m;
# Proxy API calls to the backend container # Proxy API calls to the backend container
location /api/ { location /api/ {
proxy_pass http://backend:8000; proxy_pass http://backend:8000;
@ -10,6 +13,8 @@ server {
proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme; proxy_set_header X-Forwarded-Proto $scheme;
proxy_read_timeout 120s; # OCR + AI can take a few seconds
proxy_send_timeout 120s;
} }
# All other routes index.html (React SPA) # All other routes index.html (React SPA)

View file

@ -5,6 +5,7 @@ export interface AiSettings {
has_api_key: boolean; has_api_key: boolean;
base_url: string | null; base_url: string | null;
model: string | null; model: string | null;
debug: boolean;
} }
export interface AiSettingsSave { export interface AiSettingsSave {
@ -12,6 +13,7 @@ export interface AiSettingsSave {
api_key?: string; api_key?: string;
base_url?: string; base_url?: string;
model?: string; model?: string;
debug?: boolean;
} }
export interface ParsedReceipt { export interface ParsedReceipt {
@ -22,6 +24,7 @@ export interface ParsedReceipt {
description: string | null; description: string | null;
category: string | null; category: string | null;
raw: string | null; raw: string | null;
ocr_text: string | null;
} }
export async function getAiSettings(): Promise<AiSettings> { export async function getAiSettings(): Promise<AiSettings> {
@ -51,8 +54,7 @@ export async function parseReceipt(txnId: string, attachmentId: string): Promise
export async function parseReceiptFile(file: File): Promise<ParsedReceipt> { export async function parseReceiptFile(file: File): Promise<ParsedReceipt> {
const form = new FormData(); const form = new FormData();
form.append("file", file); form.append("file", file);
const { data } = await api.post("/transactions/parse-receipt", form, { // Do NOT set Content-Type manually — axios sets it with the multipart boundary automatically
headers: { "Content-Type": "multipart/form-data" }, const { data } = await api.post("/transactions/parse-receipt", form);
});
return data; return data;
} }

View file

@ -113,9 +113,8 @@ export async function importCsv(
export async function uploadAttachment(txnId: string, file: File): Promise<AttachmentRef> { export async function uploadAttachment(txnId: string, file: File): Promise<AttachmentRef> {
const form = new FormData(); const form = new FormData();
form.append("file", file); form.append("file", file);
const res = await api.post(`/transactions/${txnId}/attachments`, form, { // Do NOT set Content-Type manually — axios sets it with the multipart boundary automatically
headers: { "Content-Type": "multipart/form-data" }, const res = await api.post(`/transactions/${txnId}/attachments`, form);
});
return res.data; return res.data;
} }

View file

@ -660,6 +660,7 @@ function AiSection() {
const [apiKey, setApiKey] = useState(""); const [apiKey, setApiKey] = useState("");
const [baseUrl, setBaseUrl] = useState(""); const [baseUrl, setBaseUrl] = useState("");
const [model, setModel] = useState(""); const [model, setModel] = useState("");
const [debug, setDebug] = useState(false);
const [showKey, setShowKey] = useState(false); const [showKey, setShowKey] = useState(false);
const [success, setSuccess] = useState(""); const [success, setSuccess] = useState("");
@ -670,6 +671,7 @@ function AiSection() {
if (d.provider) setProvider(d.provider); if (d.provider) setProvider(d.provider);
if (d.base_url) setBaseUrl(d.base_url); if (d.base_url) setBaseUrl(d.base_url);
if (d.model) setModel(d.model); if (d.model) setModel(d.model);
setDebug(d.debug);
return d; return d;
}, },
}); });
@ -680,6 +682,7 @@ function AiSection() {
api_key: apiKey, api_key: apiKey,
base_url: baseUrl, base_url: baseUrl,
model, model,
debug,
}), }),
onSuccess: () => { onSuccess: () => {
qc.invalidateQueries({ queryKey: ["ai-settings"] }); qc.invalidateQueries({ queryKey: ["ai-settings"] });
@ -800,6 +803,28 @@ function AiSection() {
</p> </p>
</div> </div>
<div className="flex items-center justify-between py-1">
<div>
<p className="text-sm font-medium">Debug mode</p>
<p className="text-xs text-muted-foreground">Show OCR text and raw AI responses when scanning receipts</p>
</div>
<button
type="button"
role="switch"
aria-checked={debug}
onClick={() => setDebug(v => !v)}
className={cn(
"relative inline-flex h-6 w-11 items-center rounded-full transition-colors focus:outline-none",
debug ? "bg-primary" : "bg-secondary border border-input"
)}
>
<span className={cn(
"inline-block h-4 w-4 transform rounded-full bg-white shadow transition-transform",
debug ? "translate-x-6" : "translate-x-1"
)} />
</button>
</div>
<div className="flex gap-3 flex-wrap"> <div className="flex gap-3 flex-wrap">
<button <button
onClick={() => saveMutation.mutate()} onClick={() => saveMutation.mutate()}

View file

@ -27,6 +27,8 @@ export interface TransactionInitialValues {
amount?: number; amount?: number;
date?: string; date?: string;
currency?: string; currency?: string;
raw?: string | null;
ocr_text?: string | null;
} }
interface Props { interface Props {
@ -37,9 +39,10 @@ interface Props {
isLoading: boolean; isLoading: boolean;
initialValues?: TransactionInitialValues; initialValues?: TransactionInitialValues;
parsedFromReceipt?: boolean; parsedFromReceipt?: boolean;
showAiDebug?: boolean;
} }
export default function TransactionFormModal({ accounts, categories, onClose, onSubmit, isLoading, initialValues, parsedFromReceipt }: Props) { export default function TransactionFormModal({ accounts, categories, onClose, onSubmit, isLoading, initialValues, parsedFromReceipt, showAiDebug }: Props) {
const { register, handleSubmit, watch, formState: { errors } } = useForm<Form>({ const { register, handleSubmit, watch, formState: { errors } } = useForm<Form>({
resolver: zodResolver(schema), resolver: zodResolver(schema),
defaultValues: { defaultValues: {
@ -83,12 +86,38 @@ export default function TransactionFormModal({ accounts, categories, onClose, on
</div> </div>
<form onSubmit={handleSubmit(handleFormSubmit)} className="p-6 space-y-4"> <form onSubmit={handleSubmit(handleFormSubmit)} className="p-6 space-y-4">
{parsedFromReceipt && ( {parsedFromReceipt && !showAiDebug && (
<div className="flex items-center gap-2 bg-primary/10 border border-primary/20 rounded-lg px-3 py-2 text-xs text-primary"> <div className="flex items-center gap-2 bg-primary/10 border border-primary/20 rounded-lg px-3 py-2 text-xs text-primary">
<Sparkles className="w-3.5 h-3.5 shrink-0" /> <Sparkles className="w-3.5 h-3.5 shrink-0" />
Fields pre-filled from receipt review before saving Fields pre-filled from receipt review before saving
</div> </div>
)} )}
{parsedFromReceipt && showAiDebug && (
<div className="bg-primary/5 border border-primary/20 rounded-lg p-3 space-y-2">
<p className="text-xs font-semibold text-primary flex items-center gap-1.5">
<Sparkles className="w-3.5 h-3.5" /> AI scan result review before saving
</p>
<div className="grid grid-cols-2 gap-x-4 gap-y-1 text-xs">
<span className="text-muted-foreground">Merchant</span><span className="font-medium truncate">{initialValues?.merchant ?? <span className="text-destructive/70 italic">not detected</span>}</span>
<span className="text-muted-foreground">Amount</span><span className="font-medium">{initialValues?.amount != null ? initialValues.amount : <span className="text-destructive/70 italic">not detected</span>}</span>
<span className="text-muted-foreground">Date</span><span className="font-medium">{initialValues?.date ?? <span className="text-destructive/70 italic">not detected</span>}</span>
<span className="text-muted-foreground">Currency</span><span className="font-medium">{initialValues?.currency ?? <span className="text-destructive/70 italic">not detected</span>}</span>
<span className="text-muted-foreground">Description</span><span className="font-medium truncate">{initialValues?.description ?? <span className="text-destructive/70 italic">not detected</span>}</span>
</div>
{initialValues?.ocr_text && (
<details className="text-xs">
<summary className="text-muted-foreground cursor-pointer hover:text-foreground select-none">OCR extracted text</summary>
<pre className="mt-1.5 bg-background border border-border rounded p-2 text-xs overflow-x-auto whitespace-pre-wrap break-all">{initialValues.ocr_text}</pre>
</details>
)}
{initialValues?.raw && (
<details className="text-xs">
<summary className="text-muted-foreground cursor-pointer hover:text-foreground select-none">Raw AI response</summary>
<pre className="mt-1.5 bg-background border border-border rounded p-2 text-xs overflow-x-auto whitespace-pre-wrap break-all">{initialValues.raw}</pre>
</details>
)}
</div>
)}
{/* Type */} {/* Type */}
<div> <div>

View file

@ -1,9 +1,9 @@
import { useRef, useState } from "react"; import { useRef, useState } from "react";
import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query"; import { useQuery, useMutation, useQueryClient } from "@tanstack/react-query";
import { getTransactions, deleteTransaction, createTransaction, getCategories } from "@/api/transactions"; import { getTransactions, deleteTransaction, createTransaction, getCategories, uploadAttachment } from "@/api/transactions";
import type { Transaction } from "@/api/transactions"; import type { Transaction } from "@/api/transactions";
import { getAccounts } from "@/api/accounts"; import { getAccounts } from "@/api/accounts";
import { parseReceiptFile } from "@/api/settings"; import { parseReceiptFile, getAiSettings } from "@/api/settings";
import type { ParsedReceipt } from "@/api/settings"; import type { ParsedReceipt } from "@/api/settings";
import { formatCurrency } from "@/utils/currency"; import { formatCurrency } from "@/utils/currency";
import { cn } from "@/utils/cn"; import { cn } from "@/utils/cn";
@ -52,6 +52,7 @@ export default function TransactionList() {
const [showForm, setShowForm] = useState(false); const [showForm, setShowForm] = useState(false);
const [selectedTxn, setSelectedTxn] = useState<Transaction | null>(null); const [selectedTxn, setSelectedTxn] = useState<Transaction | null>(null);
const [receiptParsed, setReceiptParsed] = useState<ParsedReceipt | null>(null); const [receiptParsed, setReceiptParsed] = useState<ParsedReceipt | null>(null);
const receiptFileRef = useRef<File | null>(null);
const [scanError, setScanError] = useState<string | null>(null); const [scanError, setScanError] = useState<string | null>(null);
const [scanning, setScanning] = useState(false); const [scanning, setScanning] = useState(false);
const receiptInputRef = useRef<HTMLInputElement>(null); const receiptInputRef = useRef<HTMLInputElement>(null);
@ -80,6 +81,7 @@ export default function TransactionList() {
const { data: accounts = [] } = useQuery({ queryKey: ["accounts"], queryFn: getAccounts }); const { data: accounts = [] } = useQuery({ queryKey: ["accounts"], queryFn: getAccounts });
const { data: categories = [] } = useQuery({ queryKey: ["categories"], queryFn: getCategories }); const { data: categories = [] } = useQuery({ queryKey: ["categories"], queryFn: getCategories });
const { data: aiSettings } = useQuery({ queryKey: ["ai-settings"], queryFn: getAiSettings });
const deleteMutation = useMutation({ const deleteMutation = useMutation({
mutationFn: deleteTransaction, mutationFn: deleteTransaction,
@ -89,30 +91,47 @@ export default function TransactionList() {
}, },
}); });
const createMutation = useMutation({ const createMutation = useMutation({ mutationFn: createTransaction });
mutationFn: createTransaction,
onSuccess: () => { async function handleCreateTransaction(data: any) {
try {
const txn = await createMutation.mutateAsync(data);
if (receiptFileRef.current) {
try {
await uploadAttachment(txn.id, receiptFileRef.current);
} catch (e: any) {
setScanError(`Transaction saved but receipt attachment failed: ${e?.response?.data?.detail ?? e?.message ?? "unknown error"}`);
}
receiptFileRef.current = null;
}
qc.invalidateQueries({ queryKey: ["transactions"] }); qc.invalidateQueries({ queryKey: ["transactions"] });
qc.invalidateQueries({ queryKey: ["accounts"] }); qc.invalidateQueries({ queryKey: ["accounts"] });
setShowForm(false); setShowForm(false);
setReceiptParsed(null); setReceiptParsed(null);
}, } catch {
}); // Transaction creation failed — createMutation.error has the detail, form stays open
}
}
async function handleReceiptFile(file: File) { async function handleReceiptFile(file: File) {
setScanning(true); setScanning(true);
setScanError(null); setScanError(null);
try { try {
const parsed = await parseReceiptFile(file); const parsed = await parseReceiptFile(file);
const hasAnyField = parsed.merchant || parsed.amount || parsed.description || parsed.date; // Always open the form — the modal shows a debug panel with what was/wasn't detected
if (!hasAnyField && parsed.raw) {
setScanError(`AI couldn't extract any fields. Raw response: "${parsed.raw}"`);
} else {
setReceiptParsed(parsed); setReceiptParsed(parsed);
receiptFileRef.current = file;
setShowForm(true); setShowForm(true);
}
} catch (e: any) { } catch (e: any) {
setScanError(e?.response?.data?.detail ?? "Could not parse receipt. Check your AI settings."); const detail = e?.response?.data?.detail;
const status = e?.response?.status;
if (detail) {
setScanError(typeof detail === "string" ? detail : `HTTP ${status}: ${JSON.stringify(detail)}`);
} else if (status) {
setScanError(`Server error ${status} — check backend logs (docker compose logs backend).`);
} else {
setScanError(`Network error — backend may be unreachable. ${e?.message ?? ""}`);
}
} finally { } finally {
setScanning(false); setScanning(false);
if (receiptInputRef.current) receiptInputRef.current.value = ""; if (receiptInputRef.current) receiptInputRef.current.value = "";
@ -351,8 +370,8 @@ export default function TransactionList() {
<TransactionFormModal <TransactionFormModal
accounts={accounts} accounts={accounts}
categories={categories} categories={categories}
onClose={() => { setShowForm(false); setReceiptParsed(null); }} onClose={() => { setShowForm(false); setReceiptParsed(null); receiptFileRef.current = null; }}
onSubmit={(data) => createMutation.mutate(data)} onSubmit={handleCreateTransaction}
isLoading={createMutation.isPending} isLoading={createMutation.isPending}
initialValues={receiptParsed ? { initialValues={receiptParsed ? {
description: receiptParsed.description ?? undefined, description: receiptParsed.description ?? undefined,
@ -360,8 +379,11 @@ export default function TransactionList() {
amount: receiptParsed.amount ?? undefined, amount: receiptParsed.amount ?? undefined,
date: receiptParsed.date ?? undefined, date: receiptParsed.date ?? undefined,
currency: receiptParsed.currency ?? undefined, currency: receiptParsed.currency ?? undefined,
raw: receiptParsed.raw,
ocr_text: receiptParsed.ocr_text,
} : undefined} } : undefined}
parsedFromReceipt={!!receiptParsed} parsedFromReceipt={!!receiptParsed}
showAiDebug={aiSettings?.debug ?? false}
/> />
)} )}