Add AI receipt scanning with OCR pipeline and debug toggle

- OCR pipeline: Tesseract (images) + pdfplumber (PDFs) → AI text prompt →
  rule-based regex fallback; works with any text model, not just vision models
- Scan Receipt toolbar button parses a photo and pre-fills the transaction form;
  receipt image is automatically attached to the created transaction
- AI settings page: provider, API key (AES-256-GCM encrypted), custom URL,
  model, and per-user debug toggle that gates the OCR/AI debug panel
- Set CSRF cookie secure=False so HTTP deployments work; add 7-day max_age
- Fix attachment_refs missing from _to_response (attachments never appeared in UI)
- Fix multipart boundary lost when Content-Type was set manually in axios calls
- nginx: raise client_max_body_size to 15 MB, add 120s proxy timeout for OCR
- Migration 0005: add ai_debug boolean to users table
- Update README and CLAUDE.md with AI scanning docs and architecture notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
megaproxy 2026-04-22 22:07:38 +00:00
parent a7c54ca61c
commit 26e2a055db
16 changed files with 397 additions and 99 deletions

@@ -70,7 +70,7 @@ backend/app/
dependencies.py — get_db, get_redis, get_current_user
api/
router.py — Central router; investments/reports/budgets have no prefix (paths self-contained)
v1/ — One file per domain (auth, accounts, transactions, budgets, reports, investments, predictions)
v1/ — One file per domain (auth, accounts, transactions, budgets, reports, investments, predictions, settings)
db/models/ — SQLAlchemy 2.0 Mapped models
schemas/ — Pydantic request/response models (separate Create/Update/Response per domain)
services/ — Business logic; each service owns one domain
@@ -88,6 +88,21 @@ backend/app/
- Import deduplication via SHA-256 `import_hash` on transactions
- Every mutation writes to `AuditLog` (append-only; app role has no UPDATE/DELETE on that table)
- Soft deletes: `deleted_at` timestamp; all queries must filter `WHERE deleted_at IS NULL`
- `_to_response()` in `transaction_service.py` must include all fields returned to the frontend — omitting a field here makes it invisible to the UI even if it's in the DB
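
One way to guard against the omitted-field problem described above is to build the response by iterating over the response schema's own fields, so a newly added field cannot be silently skipped. The sketch below is illustrative only: `TransactionRow`, `TransactionResponse`, and `to_response` are hypothetical stand-ins written with dataclasses, not the project's SQLAlchemy model, Pydantic schema, or actual `_to_response()`.

```python
from dataclasses import dataclass, field, fields


@dataclass
class TransactionRow:
    """Hypothetical stand-in for the ORM row."""
    id: int
    amount: float
    attachment_refs: list[str] = field(default_factory=list)


@dataclass
class TransactionResponse:
    """Hypothetical stand-in for the response schema.

    A field with a default (like attachment_refs) is the risky kind:
    forgetting to copy it raises no error, it just arrives empty.
    """
    id: int
    amount: float
    attachment_refs: list[str] = field(default_factory=list)


def to_response(row: TransactionRow) -> TransactionResponse:
    # Copy every field declared on the response type by name, so the
    # mapping stays complete when the schema grows.
    values = {f.name: getattr(row, f.name) for f in fields(TransactionResponse)}
    return TransactionResponse(**values)
```

With this shape, a row carrying `attachment_refs=["receipt.png"]` yields a response with the same list, and a response field missing from the row fails loudly with `AttributeError` instead of disappearing.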

### AI / receipt parsing (`api/v1/settings.py`, `api/v1/transactions.py`)
- User AI config (provider, encrypted API key, base URL, model, debug flag) lives on the `users` table; managed via `GET/PUT/DELETE /settings/ai`
- `ai_api_key_enc` is AES-256-GCM encrypted with `encrypt_field`/`decrypt_field`
- Receipt parsing pipeline in `_call_ai_parse()`: OCR text extraction (`_extract_ocr_text`) → AI text prompt → rule-based fallback (`_rule_based_parse`)
- Images: pytesseract; PDFs: pdfplumber (text layer) → pdf2image + tesseract (scanned fallback)
- AI receives OCR text, not the image — works with any text model, not just vision models
- `_RECEIPT_TEXT_PROMPT` uses `.format(ocr_text=...)` — escape literal braces in the JSON example with `{{` and `}}`
- `POST /transactions/parse-receipt` — scan without an existing transaction (used by "Scan Receipt" toolbar button)
- `POST /transactions/{id}/attachments/{att_id}/parse` — parse an already-uploaded attachment
- `ai_debug` boolean on user controls whether the OCR/AI debug panel shows in the transaction form; check `aiSettings?.debug` on the frontend via the `["ai-settings"]` query key
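
The brace-escaping rule and the AI-then-regex fallback order from the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the prompt text, `rule_based_parse`, `parse_receipt`, and the `ask_model` callback are hypothetical and are not the project's `_RECEIPT_TEXT_PROMPT`, `_rule_based_parse`, or `_call_ai_parse()`. The OCR step is assumed to have already produced `ocr_text`.

```python
import json
import re
from typing import Callable, Optional

# Hypothetical prompt. Literal braces in the JSON example are doubled so
# str.format() only substitutes {ocr_text}.
RECEIPT_TEXT_PROMPT = (
    "Extract the fields from this receipt text and reply with JSON only, "
    'for example {{"merchant": "Acme", "total": 12.34}}.\n\n'
    "Receipt text:\n{ocr_text}"
)


def rule_based_parse(ocr_text: str) -> dict:
    """Regex fallback: first non-empty line as merchant, amount after 'total'."""
    lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
    # \b keeps "Subtotal" from matching before the real total line.
    match = re.search(r"\btotal\b\D*(\d+[.,]\d{2})", ocr_text, re.IGNORECASE)
    total = float(match.group(1).replace(",", ".")) if match else None
    return {"merchant": lines[0] if lines else None, "total": total}


def parse_receipt(
    ocr_text: str, ask_model: Optional[Callable[[str], str]] = None
) -> dict:
    """Try the text model first; fall back to regex rules on any failure."""
    if ask_model is not None:
        try:
            prompt = RECEIPT_TEXT_PROMPT.format(ocr_text=ocr_text)
            return json.loads(ask_model(prompt))
        except Exception:
            # Provider errors and non-JSON replies both land here.
            pass
    return rule_based_parse(ocr_text)
```

Because the model only ever sees `ocr_text`, any callable that maps a prompt string to a reply string can serve as `ask_model`. Braces inside the OCR text itself are safe, since `.format()` does not re-parse substituted values.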

### CSRF cookie
- Set with `secure=False` and `max_age=604800` (7 days) intentionally — the CSRF token is a public value readable by JS; `Secure` would break HTTP deployments. Session/auth cookies remain properly secured.
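
To show what these settings mean on the wire, the sketch below builds an equivalent `Set-Cookie` value with the standard library. The cookie name `csrf_token`, the `Path=/` attribute, and the helper `csrf_set_cookie_header` are assumptions made for illustration. The backend presumably sets the cookie through its web framework rather than this way.

```python
from http.cookies import SimpleCookie

SEVEN_DAYS = 7 * 24 * 60 * 60  # 604800 seconds


def csrf_set_cookie_header(token: str) -> str:
    """Build a Set-Cookie value for a JS-readable CSRF token (illustrative)."""
    cookie = SimpleCookie()
    cookie["csrf_token"] = token  # cookie name is an assumption
    cookie["csrf_token"]["max-age"] = SEVEN_DAYS
    cookie["csrf_token"]["path"] = "/"  # path is an assumption
    # "secure" and "httponly" are deliberately left unset: the browser must
    # accept the cookie over plain HTTP, and frontend JS must be able to read it.
    return cookie["csrf_token"].OutputString()
```

The resulting header carries `Max-Age=604800` and no `Secure` or `HttpOnly` attribute, which matches the behaviour described above.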

### Frontend layout
```