Fix: closing any window killed all (tokio::spawn panic on close path)

The synchronous on_window_event CloseRequested handler reached
WindowsState::schedule_save -> tokio::spawn, which panics ("no reactor
running") because that callback runs on the main thread with no ambient
Tokio runtime; the unhandled main-thread panic aborted the whole
process, taking every window + PTY down. (push_window_workspaces hit the
same line safely because it's an async tauri::command.)

- window_state.rs: tokio::spawn -> tauri::async_runtime::spawn (global
  runtime, works from any thread). Verified against tauri 2.11 source.
- lib.rs: defensive .build().run() guard — prevent_exit while any window
  remains so no path can orphan live PTYs; close logging warn!->debug!.

Source-verified; pending Windows runtime test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
megaproxy 2026-05-30 01:09:46 +01:00
parent 8b5f65a14a
commit 9144ba64b6
3 changed files with 69 additions and 6 deletions

View file

@ -108,6 +108,18 @@ Four-agent research pass (terminal-landscape, AI-orchestration, xterm/Tauri ecos
## Session log
### 2026-05-30 — FIX: closing any window killed all windows (Tokio-runtime panic)
**Symptom:** after dragging a pane out (or spawning) a daughter window, closing *either* the main or a daughter window closed them all, dumping `exit code 101`.
**Root cause (confirmed via a 3-agent Workflow + reading the installed `tauri-runtime-wry-2.11.2` / `tauri-2.11.2` source):** NOT the exit logic and NOT WebView2. It was a **panic on the main thread**. The synchronous `on_window_event` `CloseRequested` handler in `lib.rs` calls `WindowsState::forget()``schedule_save()``tokio::spawn` (`window_state.rs:95`). That callback runs on the wry event-loop main thread with **no ambient Tokio runtime**, so `tokio::spawn` panics (`there is no reactor running…`); an unhandled main-thread panic aborts the whole process, taking every window + PTY down. `push_window_workspaces` hit the same `schedule_save` line but never crashed because it's an `async #[tauri::command]` that already runs inside Tauri's managed Tokio runtime — the bug only fired on the window-close path.
**Fix (`src-tauri/src/window_state.rs`):** swap `tokio::spawn`**`tauri::async_runtime::spawn`**, which schedules onto Tauri's global lazily-init'd Tokio runtime and works from *any* thread (incl. sync callbacks). Verified against `tauri-2.11.2/src/async_runtime.rs`: same `JoinHandle` shape, has `.abort()` (needed for the debounce cancel), and `tokio::time::sleep` still works inside the spawned future. Imports: `JoinHandle`+`spawn` now from `tauri::async_runtime`, `Duration` from `std::time`, `sleep` from `tokio::time`. **Rule learned: never call `tokio::spawn`/`tokio::*` runtime APIs from `on_window_event`, the `RunEvent` `.run()` closure, `Drop` impls, or any sync helper reachable from them — use `tauri::async_runtime::spawn`. Audit found this was the ONLY unsafe instance (`mcp.rs:800` and `mcp.rs:1502` are in async contexts → safe).**
**Also `src-tauri/src/lib.rs` (defensive, not the primary fix):** switched `.run(generate_context!())``.build(…).run(|app, event| …)` and on `RunEvent::ExitRequested` call `api.prevent_exit()` iff `code.is_none() && !webview_windows().is_empty()` — belt-and-suspenders so no future path can tear down the process (and orphan live PTYs) while any window remains; explicit `AppHandle::exit(Some)` is always honored. Verified-from-source semantics: wry emits `ExitRequested{code:None}` **only** when the last window is destroyed (window store empty), and `manager.on_window_close` removes the window from `webview_windows()` *before* `ExitRequested` fires, so the count is accurate and there's no zombie risk. Window close/destroy logging demoted `warn!``debug!` (run `RUST_LOG=tiletopia=debug` to trace).
**Status: source-verified, NOT yet runtime-verified — needs a Windows `pnpm tauri dev` build** (cargo toolchain is Windows-only; can't build from WSL). Regression test = step 1: drag a pane out, close the daughter, main must survive with no exit-101. **Known minor follow-up:** a deliberately-closed window's *own* panes leak their PTYs (webview JS doesn't run XtermPane unmount cleanup on OS close), so those WSL shells linger orphaned — lower priority than persistence, not fixed.
### 2026-05-28/29 — bug fix + feature batch from the backlog (post-0.4.0)
Started from a user-reported **stuck/ghost cursor** in panes; fixed by switching xterm from the DOM renderer to `@xterm/addon-canvas` (DOM renderer leaves a stale cursor block under the Claude TUI's rapid hide/show + blink). User verified fixed on Windows.