megaproxy 26e2a055db Add AI receipt scanning with OCR pipeline and debug toggle

- OCR pipeline: Tesseract (images) + pdfplumber (PDFs) → AI text prompt →
  rule-based regex fallback; works with any text model, not just vision models
- Scan Receipt toolbar button parses a photo and pre-fills the transaction form;
  receipt image is automatically attached to the created transaction
- AI settings page: provider, API key (AES-256-GCM encrypted), custom URL,
  model, and per-user debug toggle that gates the OCR/AI debug panel
- Fix CSRF cookie secure=False so HTTP deployments work; add 7-day max_age
- Fix attachment_refs missing from _to_response (attachments never appeared in UI)
- Fix multipart boundary lost when Content-Type was set manually in axios calls
- nginx: raise client_max_body_size to 15 MB, add 120s proxy timeout for OCR
- Migration 0005: add ai_debug boolean to users table
- Update README and CLAUDE.md with AI scanning docs and architecture notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-22 22:07:38 +00:00

6.9 KiB


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

MyMidas — a self-hosted personal finance tracker. Full-stack: FastAPI backend + React/TypeScript frontend, running in Docker Compose with PostgreSQL and Redis.

Commands

Backend (Python 3.12 + FastAPI)

# Dev server (from backend/)
python -m uvicorn app.main:app --reload --port 8080

# Apply migrations
python -m alembic upgrade head

# Create a new migration
python -m alembic revision --autogenerate -m "description"

# Run tests
pytest tests/
pytest tests/test_foo.py::test_bar   # single test

Frontend (React 18 + Vite)

cd frontend
npm run dev      # Vite dev server on :5173, proxies /api → localhost:8080
npm run build    # TypeScript check + production build
npm run lint     # ESLint

Docker (production)

docker compose up -d --build          # Full rebuild and start
docker compose up -d --build backend  # Rebuild backend only
docker compose logs -f backend        # Follow backend logs

Architecture

Request flow

Browser → Frontend nginx (:4000) or Vite dev (:5173)
  → /api/* proxied to FastAPI (:8000 in Docker, :8080 in dev)
    → Middleware: SecurityHeaders → CSRF → CORSMiddleware
    → Rate limiter (Redis sliding window)
    → Route handler → get_current_user dependency
      → Service layer (business logic + encryption)
        → AsyncSession → PostgreSQL (RLS enforced)
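The rate-limiting step in the flow above uses a sliding window. A minimal sketch of that technique follows, using an in-memory dict as a stand-in for Redis; the class name, the per-key deque, and the limit/window values are illustrative assumptions, not the repository's implementation.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis-backed sliding-window limiter (hypothetical)."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A Redis version would usually keep the same timestamps in a sorted set per key so that all backend workers share one window.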

Auth architecture

JWT: RS256, 15-min access tokens + 7-day HttpOnly refresh cookie
get_current_user dependency validates the JWT, checks session exists and isn't revoked in DB, then calls SET LOCAL app.current_user_id to activate PostgreSQL RLS
TOTP: optional second factor on login; until the TOTP code is verified, a short-lived challenge_token is issued
CSRF: double-submit cookie — backend sets csrf_token cookie on first GET; frontend reads it and sends X-CSRF-Token header on all mutating requests
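The double-submit check described above can be sketched roughly as follows. The function names and the set of exempt methods are assumptions for illustration; the real CSRFMiddleware in core/middleware.py may differ.

```python
import hmac
import secrets
from typing import Optional

# Assumption: only these methods are treated as non-mutating.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def new_csrf_token() -> str:
    """Random token to place in the csrf_token cookie."""
    return secrets.token_urlsafe(32)


def csrf_ok(method: str, cookie_token: Optional[str], header_token: Optional[str]) -> bool:
    """Accept a mutating request only if cookie and X-CSRF-Token header match."""
    if method.upper() in SAFE_METHODS:
        return True
    if not cookie_token or not header_token:
        return False
    # Constant-time comparison; bytes avoid the ASCII-only restriction on str.
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```

The scheme relies on the same-origin policy: a cross-site page can make the browser send the cookie, but it cannot read the cookie to copy its value into the header.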

Backend layout

backend/app/
  main.py          — FastAPI app factory (middleware stack, router include)
  config.py        — Pydantic Settings from env vars
  dependencies.py  — get_db, get_redis, get_current_user
  api/
    router.py      — Central router; investments/reports/budgets have no prefix (paths self-contained)
    v1/            — One file per domain (auth, accounts, transactions, budgets, reports, investments, predictions, settings)
  db/models/       — SQLAlchemy 2.0 Mapped models
  schemas/         — Pydantic request/response models (separate Create/Update/Response per domain)
  services/        — Business logic; each service owns one domain
  core/
    security.py    — Crypto primitives: Argon2id, RS256 JWT, AES-256-GCM, TOTP
    middleware.py  — CSRFMiddleware, SecurityHeadersMiddleware
  workers/
    scheduler.py   — APScheduler in-process jobs
    ml/            — Prophet/SARIMA forecasts, Monte Carlo simulations

Service layer conventions

All DB ops use AsyncSession + await
PII fields (account name, transaction description/merchant/notes, TOTP secret) are AES-256-GCM encrypted; stored as bytea named with _enc suffix (e.g. description_enc). Use encrypt_field / decrypt_field from core/security.py
Import deduplication via SHA-256 import_hash on transactions
Every mutation writes to AuditLog (append-only; app role has no UPDATE/DELETE on that table)
Soft deletes: deleted_at timestamp; all queries must filter WHERE deleted_at IS NULL
_to_response() in transaction_service.py must include all fields returned to the frontend — omitting a field here makes it invisible to the UI even if it's in the DB
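As an illustration of the import deduplication convention above, here is a minimal sketch of a SHA-256 import hash. Which fields go into the hash, how they are normalised, and the separator are all assumptions; the source only states that a SHA-256 import_hash exists on transactions.

```python
import hashlib
from datetime import date
from decimal import Decimal


def import_hash(account_id: str, booked_on: date, amount: Decimal, description: str) -> str:
    """Hypothetical import hash: stable across cosmetic differences in an export."""
    parts = [
        account_id,
        booked_on.isoformat(),
        f"{amount:.2f}",                        # 12.5 and 12.50 hash the same
        " ".join(description.split()).lower(),  # collapse whitespace, ignore case
    ]
    # The unit separator keeps adjacent fields from running into each other.
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

Re-importing the same statement then produces the same hash, so a uniqueness check on import_hash can skip rows that already exist.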

AI / receipt parsing (api/v1/settings.py, api/v1/transactions.py)

User AI config (provider, encrypted API key, base URL, model, debug flag) lives on the users table; managed via GET/PUT/DELETE /settings/ai
ai_api_key_enc is AES-256-GCM encrypted with encrypt_field/decrypt_field
Receipt parsing pipeline in _call_ai_parse(): OCR text extraction (_extract_ocr_text) → AI text prompt → rule-based fallback (_rule_based_parse)
- Images: pytesseract; PDFs: pdfplumber (text layer) → pdf2image + tesseract (scanned fallback)
- AI receives OCR text, not the image — works with any text model, not just vision models
- _RECEIPT_TEXT_PROMPT uses .format(ocr_text=...) — escape literal braces in the JSON example with {{ and }}
POST /transactions/parse-receipt — scan without an existing transaction (used by "Scan Receipt" toolbar button)
POST /transactions/{id}/attachments/{att_id}/parse — parse an already-uploaded attachment
ai_debug boolean on user controls whether the OCR/AI debug panel shows in the transaction form; check aiSettings?.debug on the frontend via the ["ai-settings"] query key
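The brace-escaping rule and the AI-then-rules fallback described above can be sketched as below. The prompt text, the regex, and the names RECEIPT_TEXT_PROMPT, rule_based_parse, parse_receipt and ask_model are illustrative stand-ins, not the actual contents of _RECEIPT_TEXT_PROMPT, _rule_based_parse or _call_ai_parse.

```python
import json
import re

# Hypothetical prompt. Literal JSON braces are doubled so str.format leaves
# them alone; only {ocr_text} is a real placeholder.
RECEIPT_TEXT_PROMPT = (
    "Extract the receipt fields and reply with JSON only, e.g.\n"
    '{{"merchant": "Example Shop", "total": 12.34}}\n\n'
    "Receipt text:\n{ocr_text}"
)


def rule_based_parse(ocr_text: str) -> dict:
    """Regex fallback: first non-empty line is the merchant, 'total' line the amount."""
    lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
    match = re.search(r"total\D*(\d+[.,]\d{2})", ocr_text, re.IGNORECASE)
    return {
        "merchant": lines[0] if lines else None,
        "total": float(match.group(1).replace(",", ".")) if match else None,
    }


def parse_receipt(ocr_text: str, ask_model=None) -> dict:
    """Try the AI text prompt first; fall back to the regex rules on any failure."""
    if ask_model is not None:
        try:
            parsed = json.loads(ask_model(RECEIPT_TEXT_PROMPT.format(ocr_text=ocr_text)))
            if isinstance(parsed, dict):
                return parsed
        except Exception:
            pass  # bad JSON, network error, etc.: use the rules instead
    return rule_based_parse(ocr_text)
```

With single braces in the JSON example, .format() would treat the example as a replacement field and raise an error instead of producing the prompt. Braces inside the OCR text itself are safe, because only the template is scanned for placeholders.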

The csrf_token cookie is set with secure=False and max_age=604800 (7 days) intentionally: the CSRF token is a public value readable by JS, and Secure would break HTTP deployments. Session/auth cookies remain properly secured.
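To make the cookie attributes above concrete, this sketch builds an equivalent Set-Cookie value with the standard library. The function name and Path=/ are assumptions; the backend presumably sets the cookie through its framework's response object rather than by hand.

```python
from http.cookies import SimpleCookie


def csrf_set_cookie_header(token: str) -> str:
    """Build a Set-Cookie value for the CSRF token (illustrative only)."""
    cookie = SimpleCookie()
    cookie["csrf_token"] = token
    morsel = cookie["csrf_token"]
    morsel["path"] = "/"          # assumption: cookie applies site-wide
    morsel["max-age"] = 604800    # 7 days, as stated in the text
    # Deliberately no Secure flag (plain-HTTP deployments must still receive it)
    # and no HttpOnly flag (frontend JS has to read the value).
    return morsel.OutputString()
```

In a Starlette or FastAPI response the same settings would typically be passed as response.set_cookie("csrf_token", token, max_age=604800, secure=False, httponly=False).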

Frontend layout

frontend/src/
  api/         — Typed axios clients per domain; all use the shared client in api/client.ts
  pages/       — Route-level components (one directory per domain)
  components/  — Shared UI (layout, charts, modals)
  store/       — Zustand: authStore (token/userId/displayName), uiStore (theme/sidebar/currency)
  utils/       — Currency formatting, cn() class helper

Frontend data fetching

TanStack Query (React Query v5) for all server state. queryKey conventions determine cache invalidation — always invalidate the correct key after mutations. The axios client in api/client.ts handles token injection, CSRF header, and 401 auto-refresh transparently.

Background jobs (APScheduler, in-process)

Every 15 min: sync asset prices (yfinance + CoinGecko)
Every 60 min: sync FX rates
2 AM daily: net worth snapshot
3 AM daily: encrypted GPG backup
Sunday weekly: retrain ML prediction models

Database patterns

UUID PKs everywhere
PostgreSQL RLS policies on every table keyed to app.current_user_id (set per-request by get_current_user)
postgres/init/ contains init SQL; alembic manages schema evolution
Migrations run automatically on container start (alembic upgrade head in Dockerfile CMD)
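One way to set the per-request RLS context is sketched below. It is an assumption rather than the repository's code: PostgreSQL's SET LOCAL does not accept bind parameters, so this sketch uses set_config(..., true), the transaction-scoped function form, and validates the ID as a UUID first. The function name is hypothetical.

```python
import uuid
from typing import Dict, Tuple


def rls_context_statement(user_id: str) -> Tuple[str, Dict[str, str]]:
    """Return (sql, params) that scope app.current_user_id to the current transaction."""
    # Round-trip through uuid.UUID so only a canonical UUID string reaches SQL;
    # anything else raises ValueError.
    canonical = str(uuid.UUID(user_id))
    # The third argument (true) makes the setting transaction-local, like SET LOCAL.
    return "SELECT set_config('app.current_user_id', :uid, true)", {"uid": canonical}
```

With SQLAlchemy this could be run as await session.execute(text(sql), params) before any query that relies on the RLS policies. Because the setting is transaction-local, it has to be issued inside the same transaction as those queries.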

Themes

7 CSS variable themes in frontend/src/index.css: obsidian (default dark), arctic (light), midnight, vault, terminal, synthwave, ledger. Applied as a class on <html>.

Environment variables

See .env.example for the full list. Key ones:

ENCRYPTION_KEY — 32-byte base64 for AES-256-GCM field encryption
DATABASE_URL — asyncpg connection string
ENVIRONMENT — development enables /docs, /redoc, /openapi.json and relaxes CORS
ALLOW_REGISTRATION — gates the register endpoint (default false in prod)
BASE_CURRENCY — default display currency (default GBP)
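For ENCRYPTION_KEY, a value of the described shape (32 random bytes, base64-encoded) can be generated and checked as in this sketch. Standard rather than URL-safe base64 is an assumption, as are the function names; check config.py for what the app actually expects.

```python
import base64
import secrets


def generate_encryption_key() -> str:
    """Return 32 random bytes as a base64 string suitable for an env var."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def load_encryption_key(value: str) -> bytes:
    """Decode the env value and insist on the 32 bytes AES-256 requires."""
    key = base64.b64decode(value, validate=True)
    if len(key) != 32:
        raise ValueError("ENCRYPTION_KEY must decode to exactly 32 bytes")
    return key
```

Failing at startup on a wrong-length key is cheaper than discovering it on the first encrypt_field call.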
