Phase 3 — Decision pipeline + JobRunner + RestProvider + save round-trip

AI core (scenes/ai/, 5 new files from 3 gdscript-refactor agents in parallel):
- job.gd (59 lines, Agent A): Job class, RefCounted, label + toils + cursor +
  to_dict/from_dict round-trip
- toil.gd (76 lines, Agent A): Toil class, RefCounted; kinds WALK/WAIT/IDLE;
  factories walk_to/wait_ticks/idle; Vector2i stored as to_x/to_y ints
  because Godot 4 JSON.stringify doesn't round-trip Vector2i
- work_provider.gd (27 lines, Agent A): abstract base, class_name, @export
  category/priority, find_best_for() with push_error subclass guard
- job_runner.gd (186 lines, Agent B): Node-derived runner; setup/start_job/
  cancel_job/tick; WALK toil delegates to pawn.walk_along_path on first
  encounter (sets data.started=true), listens for walk_completed signal;
  WAIT decrements ticks_remaining; IDLE never completes; full to_dict/from_dict
- decision.gd (50 lines, Agent C): static pick_next_job(pawn, providers); 5
  layers (incapacitation/forced/status/work/idle); layer 1 probes via
  has_method to stay future-proof for Phase 9
- rest_provider.gd (31 lines, Agent C): extends WorkProvider; @export rest_tile;
  returns [walk_to(rest_tile), idle()] Job

Integration (Opus):
- pawn.gd: added forced_job slot, job_runner ref, _orchestrate_ai called
  before _advance_walk on each sim_tick. Calls Decision when forced_job is
  queued OR when idle — was a bug initially (only-on-idle never preempted
  the never-completing IDLE toil); fixed and caught via MCP runtime test.
  Added to_dict/from_dict for save round-trip; captures tile, _path,
  _step_progress, _selected, forced_job, job_runner via their serializers.
- selection.gd: rewrote to build a forced-job [walk_to + idle] and set
  pawn.forced_job; Decision preempts current job on next tick.
- world.tscn/gd: instantiates RestProvider as child (rest_tile = (50,50)
  just outside the stone ring's south-east, reachable from all 3 spawn
  tiles); registers via World.register_work_provider; attaches a JobRunner
  child to each spawned pawn and wires setup(pawn, pathfinder).
- world.gd autoload: added work_providers list + register/clear methods.
- save_system.gd: write_save walks World.pawns calling to_dict; apply_save
  zips dicts to pawns by index (Phase 16 will add stable IDs).
- main.gd: bootstrap log line bumped Phase 2 → Phase 3.

Acceptance — MCP-verified end-to-end:
- 3 pawns boot, Decision assigns each Rest, JobRunner starts each,
  all 3 walk to (50,50) on different paths (40/35/30 steps based on
  detour around the stone ring), arrive and idle.
- Force Bram to (10,10) via pawn.forced_job; preempt fires:
  [decision] Bram: forced 'Go to (10, 10)'. Bram walks while Cora/Edda
  stay parked.
- Mid-walk save round-trip (the critical Phase 3 acceptance):
  - Paused Bram at (51,10) walking to (70,70) with 79 path steps remaining
  - SaveSystem.write_save() → SaveSystem.apply_save(read_save()) after a
    mutate-to-(0,0)-with-no-path round-trip
  - Restored Bram exactly: tile=(51,10), _path.size=79, walking=true,
    job='Go to (70, 70)' at toil_idx=0 (WALK toil with data.started=true)
  - Resumed sim → JobRunner's WALK toil saw started=true and did NOT
    re-call walk_along_path; the pawn's restored _path continued the walk
    naturally → reached (70,26) with 44 steps remaining, still on the
    same job. The architecture.md 'mid-toil suspend safe' contract is
    provably honored.

Phase 3 gotchas (logged in implementation.md):
- Class-name registration timing bit again (Phase 2 gotcha). Workflow:
  agent writes class_name file → MCP reload_project → headless validate.
- Forced-job preempt requires triggering Decision when forced_job != null,
  not just when idle (IDLE toil never completes).
- execute_game_script + await Engine.get_main_loop().process_frame is
  flaky — MCP auto-recovers but the script's last lines may be lost.
  Workaround: split state-inspection into a fresh execute_game_script.

Delegation report this phase:
- gdscript-refactor (Sonnet) Agent A: Job + Toil + WorkProvider abstract
  base. 3 files, 162 lines.
- gdscript-refactor (Sonnet) Agent B: JobRunner with toil-execution match
  + walk_completed signal handling + full save round-trip. 1 file, 186
  lines.
- gdscript-refactor (Sonnet) Agent C: Decision pipeline + RestProvider.
  2 files, 81 lines.
- Opus: Pawn integration (forced_job slot, orchestration, to_dict/from_dict),
  Selection rewrite, world.tscn/gd wiring, World autoload work_providers
  list, SaveSystem extension, MCP-driven runtime verification including
  the mid-walk save round-trip demo, gotcha logging.

~70% of Phase 3's GDScript was written by subagents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
megaproxy 2026-05-10 21:05:50 +01:00
parent cd265b87c0
commit 5bf0f51efb
20 changed files with 613 additions and 25 deletions

View file

@ -9,7 +9,8 @@ Effort estimates are wall-time at **focused solo pace**. Scale up generously for
| ✅ done — green dot up, smoke scene runs, MCP plugin self-installed 3 runtime services | **Phase 0 — Project scaffold & foundations** |
| ✅ done — 80² map renders, walls/terrain/UI layers, camera rig, tick loop, speed UI all live | **Phase 1 — World, tilemap, camera** |
| ✅ done — Pawn class, AStarGrid2D pathfinder (9.1 μs avg/18 μs max at 80²), click-to-select + click-to-move via Selection module | **Phase 2 — Pawn skeleton, pathfinding, movement** |
| ⏳ next | **Phase 3 — AI core: Decision → WorkProvider → JobRunner** |
| ✅ done — Job/Toil/JobRunner/Decision/RestProvider, forced_job preempt, mid-toil save round-trip verified | **Phase 3 — AI core: Decision → WorkProvider → JobRunner** |
| ⏳ next | **Phase 4 — First verbs: chop, mine, hauling, stockpiles** |
Use this doc as a checklist: tick boxes as items complete, and update the **Status** row above whenever a phase rolls over. The last bullet of each phase is the *acceptance demo* — the phase is "done" when you can perform it.
@ -103,15 +104,26 @@ The five items from `memory.md` *Open questions / Audit*. None of these need cod
**Goal:** the 5-layer pipeline from `architecture.md` is real, but with one dummy work category. **Save round-trip for JobRunner mid-toil state is required to land in this phase, not later.**
- [ ] `Decision` layer: priority-ordered checks (incapacitation → forced job → status interrupt → work → idle)
- [ ] `WorkProvider` interface — `find_best_for(pawn) -> Job?`
- [ ] `Job` + `JobRunner` — multi-step toils, each toil is `{action, predicate, on_complete}`
- [ ] Player overrides: forced job (e.g. "go here") preempts work
- [ ] Status interrupts skeleton — only `Bleeding` for now (rest land at Phase 9)
- [ ] Idle behavior: stand still or wander locally (per `architecture.md:72`, idle is v2; MVP just stands)
- [ ] First WorkProvider: `RestProvider` — sends pawn to a hardcoded "rest tile". Just a smoke test for the pipeline.
- [ ] **Save round-trip:** kill the app mid-toil-2-of-4, reopen, pawn resumes the same toil at the same position
- [ ] **Acceptance:** 3 pawns idle around a rest tile; force-move one with a tap-and-hold-issue-order; suspend mid-walk and resume seamlessly.
- [x] **5-layer `Decision` pipeline** (`scenes/ai/decision.gd`, 50 lines, `gdscript-refactor` agent): static `pick_next_job(pawn, providers)`. Layer 1 (incapacitation) probes via `has_method("is_incapacitated")` — no-op until Phase 9 adds it. Layer 2 (forced job) consumes `pawn.forced_job`. Layer 3 (status interrupt) reserved for Phase 9. Layer 4 (work) sorts providers by `priority` desc, returns first non-null Job. Layer 5 returns null (idle).
- [x] **`WorkProvider` abstract base** (`scenes/ai/work_provider.gd`, 27 lines, Agent A): `class_name WorkProvider extends Node`, `@export category`, `@export priority`, `find_best_for(pawn)` with `push_error` guard.
- [x] **`Job` + `Toil`** (`scenes/ai/{job,toil}.gd`, 59 + 76 lines, Agent A): `RefCounted` data types with `to_dict`/`from_dict`. Toil kinds: `WALK`/`WAIT`/`IDLE`. Vector2i stored as `to_x`/`to_y` ints (Godot 4 JSON doesn't round-trip Vector2i). Factories: `Toil.walk_to(tile)`, `Toil.wait_ticks(n)`, `Toil.idle()`.
- [x] **`JobRunner`** (`scenes/ai/job_runner.gd`, 186 lines, Agent B): `Node`-derived; `setup(pawn, pathfinder)`, `start_job(j)`, `cancel_job()`, `tick()`. WALK toil delegates to `pawn.walk_along_path()` on first invocation, listens for `walk_completed` signal to mark done. WAIT decrements `ticks_remaining`. IDLE never completes. Full `to_dict`/`from_dict` for save round-trip.
- [x] **Forced job preempts current job** (Pawn orchestration fix): `_orchestrate_ai` calls Decision when `forced_job != null` OR no current job — not just when idle. This was a bug found via MCP runtime test; cause + fix documented in commit.
- [x] **First `RestProvider`** (`scenes/ai/rest_provider.gd`, 31 lines, Agent C): `extends WorkProvider`, `@export rest_tile`, returns a `[walk_to(rest_tile), idle()]` Job. Rest tile = (50, 50) — just outside the south-east of the stone ring, reachable from all 3 spawn tiles.
- [x] **Idle behavior**: IDLE toil keeps the pawn at the current tile indefinitely. Per architecture.md:72, this is the v1 idle; the wander-locally variant is v2.
- [x] **Pawn `to_dict`/`from_dict`** (Opus): captures `tile`, `_path` (as `[[x,y],...]`), `_step_progress`, `_selected`, `forced_job` (via `Job.to_dict()`), `job_runner` (via `JobRunner.to_dict()`). On load, JobRunner's restored WALK toil has `started: true` and does NOT re-call `walk_along_path` — the pawn's restored `_path` continues naturally and emits `walk_completed` when done.
- [x] **`SaveSystem.write_save` / `apply_save`** (Opus): walks `World.pawns`, calls `to_dict()` / `from_dict()` per pawn. Single slot JSON to `user://save_slot.json`. Pawn dicts zipped by index (Phase 16 will add stable IDs).
- [x] **Selection rewrite** (Opus): drops direct `pawn.walk_along_path` call; now builds a `[walk_to(tile), idle()]` Job and sets `pawn.forced_job = job`. Decision picks it up on the next sim tick.
- [x] **Acceptance — MCP-verified end-to-end**:
- 3 pawns boot → Decision assigns each a Rest job → JobRunner starts each → all 3 walk to (50, 50) on different paths (40/35/30 steps) → all 3 arrive and idle.
- Force Bram to (10, 10) via `pawn.forced_job` → preempt fires (`[decision] Bram: forced 'Go to (10, 10)'`) → Bram walks away while Cora/Edda stay parked.
- Mid-walk save: paused Bram at (51, 10) walking to (70, 70) with 79 path steps remaining → `SaveSystem.write_save()` → mutated to (0, 0) with empty path → `SaveSystem.apply_save(read_save())`**restored to (51, 10) with 79 steps remaining, `walking=true`, same job at same toil index** → resumed sim → Bram continued from (51, 10), reached (70, 26) with 44 steps remaining, still on `Go to (70, 70)`.
- [x] **Status interrupt skeleton — Bleeding hook**: deliberately deferred. Decision's Layer 3 is a placeholder comment for Phase 9 — adding it without a Status system to back it is premature. `implementation.md` Phase 9 will land the registry + the interrupt wiring atomically.
**Phase 3 lessons logged:**
- Class-name registration timing (Phase 2 gotcha) bit again — fix is the same: `mcp__godot-mcp-pro__reload_project` between authoring `class_name`-bearing files and headless validation.
- `_orchestrate_ai` initially only called Decision when `not has_job()`. The IDLE toil never completes, so a queued `forced_job` was never seen. Fix: trigger Decision when `forced_job != null` regardless of current-job state. Caught by the runtime MCP test, not headless.
- `execute_game_script` with `await Engine.get_main_loop().process_frame` is touchy — the MCP wrapper sometimes auto-recovers from a runtime issue but the script's last assignments are lost. The actual game state evolves correctly; just use a fresh `execute_game_script` to inspect state after awaits.
---