Session log: MCP v2 PR-1 + PR-1b (policy engine + dispatcher)
Document the fan-out approach (3 Sonnet agents + 1 Haiku), the event/reply RPC pattern, the 10 hard-deny rules and their caveats, the audit + confirm + Always-Allow UX, and the four integration bugs worth remembering (Tauri 2 Emitter trait import, McpError 'static strings, React 18 StrictMode listen() race, lifting the audit subscription out of AuditTab).
This commit is contained in:
parent
26ffe8859a
commit
09019a0ad7
1 changed files with 89 additions and 0 deletions
89
memory.md
89
memory.md
|
|
@ -51,6 +51,95 @@ Durable memory for this project. Read at session start, update before session en
|
|||
|
||||
## Session log
|
||||
|
||||
### 2026-05-26 — MCP v2 PR-1 + PR-1b: policy engine, audit log, dispatcher, `set_label` end-to-end
|
||||
|
||||
First two of four planned PRs for the MCP write surface. Shipped via fan-out (3 Sonnet agents in parallel + 1 Haiku for fuzz tests, then sequential integration by me). Two clean commits: `464c576` (PR-1 foundation) and `26ffe88` (PR-1b dispatcher + bug fixes).
|
||||
|
||||
**Architecture: Pattern A (event/reply RPC across the IPC boundary).** Frontend keeps tree authority (it's `useState` in App.tsx); backend MCP tool handlers can't synchronously call into JS. Tauri 2's `invoke` is JS→Rust only, so a backend-initiated mutation has to round-trip through events:
|
||||
|
||||
```
|
||||
[MCP tool handler] [App.tsx]
|
||||
build {requestId, tool, args, ...} ⟶ emit "mcp://request"
|
||||
register oneshot in PendingActions frontend dispatcher:
|
||||
await rx with 30s timeout 1. policy check decided needsConfirm
|
||||
2. if needsConfirm → modal queue
|
||||
3. runMcpHandler mutates tree
|
||||
4. invoke("mcp_action_reply", {id, result})
|
||||
⟵ oneshot resolves
|
||||
emit "mcp://audit" with outcome
|
||||
return to MCP client
|
||||
```
|
||||
|
||||
`TileService` now holds an `AppHandle` and an `Arc<PendingActions>` (oneshot registry keyed by uuid-shaped id). The dispatch helper centralises policy → emit → await → audit emission for every write tool.
|
||||
|
||||
**Policy engine (`src-tauri/src/mcp_policy.rs`, 1152 lines).** Three-tier `allow / ask / deny`, deny-first precedence mirroring Claude Code's `.claude/settings.json` shape — users already know this DSL. Glob matcher (`*` only, not regex) with shell-operator-aware subcommand splitting on `&&`, `||`, `;`, `|`, `|&`, `&`, newline — a deny rule fires if ANY subcommand matches (defeats `safe-cmd && rm -rf /`).
|
||||
|
||||
**Hard-deny list — compiled-in, non-overridable, visible-only-in-UI.** Ten regex patterns the user CANNOT disable, applied to `write_pane` shell content:
|
||||
1. `rm -rf /` (and option-order variants like `-Rf`)
|
||||
2. `rm -rf ~` / `rm -rf $HOME`
|
||||
3. `rm -rf /*`
|
||||
4. `:(){ :|:& };:` (fork bomb)
|
||||
5. `mkfs.<fs> /dev/...`
|
||||
6. `dd ... of=/dev/(sd|nvme|hd|disk)...`
|
||||
7. `> /etc/(passwd|shadow|sudoers)`
|
||||
8. `curl|wget ... | (sudo )?(ba?sh|zsh)` (pipe to shell from network)
|
||||
9. `chmod -R 777 /`
|
||||
10. `find / ... -delete`
|
||||
|
||||
Caveats deliberately disclosed in the UI: best-effort accident prevention only (`\rm`, `${SHELL} -c`, aliases all bypass); POSIX-only in v2 (PowerShell equivalents deferred to v2.1); evaluated on the bytes sent in one `write_pane`, not after the remote shell composes them. *Not a sandbox.*
|
||||
|
||||
73 fuzz tests for the matcher (positive variations + lookalike negatives like `rm -rf /tmp/foo`, `dd of=backup.img`, `chmod 777 /tmp/file`). The shape-of-rule test grid is in `mod hard_deny_fuzz` at the bottom of mcp_policy.rs.
|
||||
|
||||
**Audit log surface.** Backend emits `mcp://audit` after every tool call resolves with `{tsMs, tool, argsSummary (truncated 80), result: ok|denied|failed, durationMs}`. Ring buffer of 200 entries. Args summary explicitly capped — `write_pane` text would otherwise turn the panel into a secret-leak surface if Claude pastes a token.
|
||||
|
||||
**`McpPanel` refactored into three tabs: Config / Audit / Policy.** Config kept the existing snippet/regen UI. Audit is a presentational table with chip-coloured rows. Policy is three vertically-stacked allow/ask/deny buckets with add/remove + a Save button, plus a read-only "Always blocked (built-in)" section showing the 10 hard-deny labels with "Cannot be disabled" badges.
|
||||
|
||||
**Confirm modal (`McpConfirm.tsx`).** Amber-bordered modal. Shows tool, policy reason ("default", a matched ask rule, etc.), a human-readable summary built per-tool (`Rename pane "X" → "Y"`), and an expandable raw-args block. Enter = accept, Esc = reject. Third button: **"Always allow {tool}"** — appends bare tool name to the policy allow bucket inline, then resolves the current call. Toast confirms.
|
||||
|
||||
**Default policy is empty → every call asks.** Restrictive by design; the user enables parts. Saved to `%APPDATA%\com.megaproxy.tiletopia\mcp-policy.json` via the same atomic tmp+rename pattern as `mcp.json`/`hosts.json`/`workspace.json`.
|
||||
|
||||
**Classifier hook scaffold (no-op).** `PolicyClassifier` trait + `ClassifierHint` enum + `NoopClassifier` in mcp_policy.rs. Not wired into `evaluate()` yet — placeholder for v2.1 where a small LLM (Haiku via Anthropic API, or local Ollama) classifies ambiguous Ask calls to maybe-upgrade them to Allow. Architecture supports it without further refactor.
|
||||
|
||||
**Demo tool wired end-to-end: `set_label`.** Pure metadata change; reuses the existing `ops.setLabel` → `changeLabel(tree, leafId, label)` path. No PTY, no SSH, no async spawn complexity. Perfect proof-of-concept for the dispatcher — every other v2 tool follows the same shape: arg struct, validate, dispatch_action with stable args_repr, frontend handler in `runMcpHandler` switch.
|
||||
|
||||
**Bugs hit during integration:**
|
||||
|
||||
1. **Tauri 2 trait-not-in-scope.** `AppHandle::emit` moved onto `tauri::Emitter` trait in Tauri 2. The error message helpfully says "trait `Emitter` which provides `emit` is implemented but not in scope" — just `use tauri::Emitter;` next to `Manager`. Worth remembering for any future event-emission code.
|
||||
|
||||
2. **`McpError` constructors want `'static` strings.** Signature is `impl Into<Cow<'static, str>>`. Passing `&format!(...)` or `&e.to_string()` fails (`temporary value dropped while borrowed`). Pass the owned `String` directly — auto-converts to `Cow::Owned`. Bit me at three sites in dispatch_action.
|
||||
|
||||
3. **React 18 StrictMode race with `listen()`.** Classic pattern bug: `useEffect(() => { let un; void listen(...).then(fn => { un = fn }); return () => un?.() }, []);` is broken in StrictMode because the cleanup runs before the Promise resolves on the pretend-unmount, leaving the first subscription dangling. Real-world symptom was duplicate audit entries and modal-needs-two-clicks (each event handled by both subscriptions). Fix is the cancelled-flag pattern:
|
||||
```ts
|
||||
let cancelled = false;
|
||||
let unlisten;
|
||||
void listen(...).then(fn => { if (cancelled) fn(); else unlisten = fn; });
|
||||
return () => { cancelled = true; unlisten?.(); };
|
||||
```
|
||||
Worth using *anywhere* we subscribe-via-Promise inside `useEffect`, not just for MCP events. Vite HMR also surfaces this if you're not careful — a clean restart confirmed the fix held.
|
||||
|
||||
4. **Stale state when audit subscription lived in AuditTab.** AuditTab unmounts when the user switches tabs or closes the panel; events fired during that window were dropped. Lifted subscription to App.tsx, made AuditTab presentational (props in, table out). Same pattern any "always-on log" should follow.
|
||||
|
||||
5. **rmcp's DNS-rebinding allowlist re-bit us once.** The earlier session disabled it for WSL connectivity; PR-1 didn't regress this but it's a pattern to keep flagged — `StreamableHttpServerConfig::default().disable_allowed_hosts()` stays mandatory for our use case.
|
||||
|
||||
**Frontend ↔ backend contract worth saving:**
|
||||
|
||||
- `mcp://request` event payload (camelCase): `{requestId, tool, args, needsConfirm, reason}`
|
||||
- `mcp://audit` event payload: `{tsMs, tool, argsSummary, result: {kind:"ok"|"denied"|"failed", ...}, durationMs}`
|
||||
- `mcp_action_reply` Tauri command takes `{requestId, result}` where result is externally-tagged `{Ok: value}` or `{Err: msg}` — that's serde's default tagging for `Result<T,E>`, NOT a custom shape.
|
||||
- Tauri 2 command argument-name binding: JS sends `{policy}`, Rust receives `policy: McpPolicy` — direct lowercase match. McpPolicy has no `#[serde(rename_all = ...)]`, so field keys (`version`, `permissions`, `deny`, `ask`, `allow`) match identity. Verified with debug-log instrumentation during the save-not-persisting investigation (it was actually working; user's first test predated the cargo rebuild).
|
||||
|
||||
Open follow-ups specific to this session:
|
||||
|
||||
- **PR-2 (next):** `close_pane`, `swap_panes`, `promote_pane`, `apply_preset`. Same dispatcher shape; the `apply_preset` data-loss case wants an `allow_drops: true` arg rather than a separate modal (per the earlier scope notes).
|
||||
- **PR-3 (the hard one):** `spawn_pane`, `write_pane`, `connect_host`. Needs (a) spawn-completion oneshot resolution chain (await `registerPaneId`), (b) per-host SSH confirm even on spawn (Claude opening a shell on prod is equally consequential to writing to it), (c) rate limiter on `write_pane` (per OWASP LLM06 + MCP spec MUST).
|
||||
- **PR-4:** `add_host` + `extraArgs` sanitiser (ProxyCommand exfil risk for OpenSSH).
|
||||
- **v2.1 classifier:** wire `PolicyClassifier` into `evaluate()` so Ask calls can be auto-upgraded to Allow by a small LLM. Haiku is the cheap/fast pick; needs an API key surface in settings.
|
||||
- **PowerShell hard-deny patterns** (`Remove-Item -Recurse -Force C:\`, `Format-Volume`, etc.). Deferred until users actually use PowerShell panes with MCP enabled.
|
||||
- **`.mcpb` bundle** — still on the list; PR-1b's stdio-shim recipe is what it would package up.
|
||||
- **Confirm modal queueing UX** — currently shows one at a time, FIFO. If Claude burst-sends many tool calls, the user gets serial modals. Probably fine for v2; if it gets annoying, add a "reject all pending" button.
|
||||
- **Audit log persistence** — currently ephemeral ring of 200. A `mcp-audit.jsonl` append-only file in app data dir would let users see "what did Claude do overnight". Trade-off: secrets-in-summaries risk if `write_pane` text leaks past the 80-char truncation. Deferred.
|
||||
- **xterm.js RenderService errors** (`Cannot read properties of undefined (reading 'dimensions')`) showed up in dev tools during this session — completely unrelated to MCP work, likely a pane being resized or detached mid-render. File when convenient.
|
||||
|
||||
### 2026-05-26 — MCP persistence + Claude Code OAuth bug + `mcp-remote` shim
|
||||
|
||||
Set out to fix two paper cuts (port + token re-rolled every server restart, so firewall rules and `.mcp.json` had to be re-pasted). Ended up unwinding a multi-layer breakage in Claude Code's HTTP-MCP client.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue