Phase 3 — Decision pipeline + JobRunner + RestProvider + save round-trip

AI core (scenes/ai/, 6 new files from 3 gdscript-refactor agents in parallel):
- job.gd (59 lines, Agent A): Job class, RefCounted, label + toils + cursor +
  to_dict/from_dict round-trip
- toil.gd (76 lines, Agent A): Toil class, RefCounted; kinds WALK/WAIT/IDLE;
  factories walk_to/wait_ticks/idle; Vector2i stored as to_x/to_y ints
  because Godot 4 JSON.stringify doesn't round-trip Vector2i
- work_provider.gd (27 lines, Agent A): abstract base, class_name, @export
  category/priority, find_best_for() with push_error subclass guard
- job_runner.gd (186 lines, Agent B): Node-derived runner; setup/start_job/
  cancel_job/tick; WALK toil delegates to pawn.walk_along_path on first
  encounter (sets data.started=true), listens for walk_completed signal;
  WAIT decrements ticks_remaining; IDLE never completes; full to_dict/from_dict
- decision.gd (50 lines, Agent C): static pick_next_job(pawn, providers); 5
  layers (incapacitation/forced/status/work/idle); layer 1 probes via
  has_method to stay future-proof for Phase 9
- rest_provider.gd (31 lines, Agent C): extends WorkProvider; @export rest_tile;
  returns [walk_to(rest_tile), idle()] Job
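
The to_x/to_y choice in toil.gd can be illustrated with a minimal sketch.
The helper names encode_tile and decode_tile are hypothetical and are not
taken from toil.gd. The sketch assumes only that the destination is stored
as two plain numbers and rebuilt on load:

```gdscript
extends SceneTree
# Hypothetical sketch, not the real toil.gd.
# Run with: godot --headless --script <this_file>.gd


# Store the tile as two plain numbers so it survives JSON.
func encode_tile(tile: Vector2i) -> Dictionary:
	return {"to_x": tile.x, "to_y": tile.y}


# Rebuild the Vector2i. JSON numbers may come back as floats,
# so cast to int explicitly.
func decode_tile(d: Dictionary) -> Vector2i:
	return Vector2i(int(d["to_x"]), int(d["to_y"]))


func _init() -> void:
	var text := JSON.stringify(encode_tile(Vector2i(50, 50)))
	var parsed: Dictionary = JSON.parse_string(text)
	print(decode_tile(parsed))
	quit()
```

Casting with int() on load keeps the restored value a true Vector2i
regardless of how the JSON parser types the numbers.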

Integration (Opus):
- pawn.gd: added forced_job slot, job_runner ref, _orchestrate_ai called
  before _advance_walk on each sim_tick. Calls Decision when forced_job is
  queued OR when idle. The initial version called it only when idle, so a
  forced job never preempted the never-completing IDLE toil; caught via MCP
  runtime test and fixed.
  Added to_dict/from_dict for save round-trip; captures tile, _path,
  _step_progress, _selected, forced_job, job_runner via their serializers.
- selection.gd: rewrote to build a forced-job [walk_to + idle] and set
  pawn.forced_job; Decision preempts current job on next tick.
- world.tscn/gd: instantiates RestProvider as child (rest_tile = (50,50)
  just outside the stone ring's south-east, reachable from all 3 spawn
  tiles); registers via World.register_work_provider; attaches a JobRunner
  child to each spawned pawn and wires setup(pawn, pathfinder).
- world.gd autoload: added work_providers list + register/clear methods.
- save_system.gd: write_save walks World.pawns calling to_dict; apply_save
  zips dicts to pawns by index (Phase 16 will add stable IDs).
- main.gd: bootstrap log line bumped Phase 2 → Phase 3.
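
The preempt rule described for pawn.gd above can be modelled in isolation.
This is a sketch under stated assumptions: FakeRunner and should_decide are
hypothetical stand-ins, not the real JobRunner or _orchestrate_ai, and jobs
are represented by plain strings:

```gdscript
extends SceneTree
# Hypothetical model of the preempt rule; not the real pawn.gd.
# Run with: godot --headless --script <this_file>.gd


# Stand-in for JobRunner: only tracks whether a job is assigned.
class FakeRunner:
	var job = null

	func has_job() -> bool:
		return job != null


# True when Decision should be consulted this tick.
# A queued forced job must trigger Decision even while a job is active,
# because an IDLE toil never completes and the runner never becomes idle.
func should_decide(forced_job, runner: FakeRunner) -> bool:
	return forced_job != null or not runner.has_job()


func _init() -> void:
	var runner := FakeRunner.new()
	runner.job = "Rest"  # parked on a never-completing IDLE toil
	print(should_decide(null, runner))              # false: busy, nothing forced
	print(should_decide("Go to (10, 10)", runner))  # true: forced job preempts
	quit()
```

Checking only "not runner.has_job()" reproduces the original bug: the
second call would also return false and the forced job would never start.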

Acceptance — MCP-verified end-to-end:
- 3 pawns boot, Decision assigns each Rest, JobRunner starts each,
  all 3 walk to (50,50) on different paths (40/35/30 steps based on
  detour around the stone ring), arrive and idle.
- Force Bram to (10,10) via pawn.forced_job; preempt fires:
  [decision] Bram: forced 'Go to (10, 10)'. Bram walks while Cora/Edda
  stay parked.
- Mid-walk save round-trip (the critical Phase 3 acceptance):
  - Paused Bram at (51,10) walking to (70,70) with 79 path steps remaining
  - SaveSystem.write_save() → SaveSystem.apply_save(read_save()) after a
    mutate-to-(0,0)-with-no-path round-trip
  - Restored Bram exactly: tile=(51,10), _path.size=79, walking=true,
    job='Go to (70, 70)' at toil_idx=0 (WALK toil with data.started=true)
  - Resumed sim → JobRunner's WALK toil saw started=true and did NOT
    re-call walk_along_path; the pawn's restored _path continued the walk
    naturally → reached (70,26) with 44 steps remaining, still on the
    same job. The architecture.md 'mid-toil suspend safe' contract is
    demonstrably honored in this run.
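
Why the restored walk is not restarted can be shown with a reduced model.
This sketch assumes the toil state is a plain Dictionary and uses a counter
(walk_calls, hypothetical) in place of pawn.walk_along_path; it is not the
real JobRunner code:

```gdscript
extends SceneTree
# Hypothetical model of the started-flag contract.
# Run with: godot --headless --script <this_file>.gd

# Counts how often the walk would have been (re)started.
var walk_calls := 0


# Mirrors the branch structure of a WALK tick: start once, then no-op.
func tick_walk(toil: Dictionary) -> void:
	if not toil.get("started", false):
		walk_calls += 1  # stands in for pawn.walk_along_path(path)
		toil["started"] = true


func _init() -> void:
	var toil := {"kind": "walk", "to_x": 70, "to_y": 70}
	tick_walk(toil)  # first tick starts the walk
	# Save and load: the started flag travels with the toil data.
	var restored: Dictionary = JSON.parse_string(JSON.stringify(toil))
	tick_walk(restored)  # first tick after load is a no-op
	print(walk_calls)  # prints 1
	quit()
```

Because started is part of the serialized toil data, the first tick after
load takes the waiting branch and the pawn's own restored path drives the
rest of the walk.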

Phase 3 gotchas (logged in implementation.md):
- Class-name registration timing bit again (Phase 2 gotcha). Workflow:
  agent writes class_name file → MCP reload_project → headless validate.
- Forced-job preempt requires triggering Decision when forced_job != null,
  not just when idle (IDLE toil never completes).
- execute_game_script + await Engine.get_main_loop().process_frame is
  flaky — MCP auto-recovers but the script's last lines may be lost.
  Workaround: split state-inspection into a fresh execute_game_script.

Delegation report this phase:
- gdscript-refactor (Sonnet) Agent A: Job + Toil + WorkProvider abstract
  base. 3 files, 162 lines.
- gdscript-refactor (Sonnet) Agent B: JobRunner with toil-execution match
  + walk_completed signal handling + full save round-trip. 1 file, 186
  lines.
- gdscript-refactor (Sonnet) Agent C: Decision pipeline + RestProvider.
  2 files, 81 lines.
- Opus: Pawn integration (forced_job slot, orchestration, to_dict/from_dict),
  Selection rewrite, world.tscn/gd wiring, World autoload work_providers
  list, SaveSystem extension, MCP-driven runtime verification including
  the mid-walk save round-trip demo, gotcha logging.

~70% of Phase 3's GDScript was written by subagents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
megaproxy 2026-05-10 21:05:50 +01:00
parent cd265b87c0
commit 5bf0f51efb
20 changed files with 613 additions and 25 deletions

scenes/ai/job_runner.gd Normal file

@@ -0,0 +1,186 @@
class_name JobRunner
extends Node
## Executes a Job's toils on behalf of a Pawn.
##
## Sits between the Decision layer and the Pawn's physical state. The
## Decision layer (or a WorkProvider) hands us a Job; we tick through its
## toils one-by-one and fire job_completed when the last toil is done.
##
## Design notes (docs/architecture.md, Pawn AI 5-layer pipeline):
##   - JobRunner is layer 3 of 5. Don't add control-flow that belongs to
##     Decision (layer 1) or WorkProvider (layer 2) here.
##   - Pawn and Pathfinder are held as untyped vars to avoid class_name
##     registration-order issues between autoloads and scene scripts.
##   - tick() is called from Pawn._on_sim_tick each sim tick. Never spin
##     render-frame work off this function.
##
## Save / load contract (NON-NEGOTIABLE, Phase 3 acceptance criterion):
##   to_dict() / from_dict() round-trip mid-toil state exactly. A WALK
##   toil with started=true restores correctly: on the first tick after load
##   the runner sits in the "already started, waiting for walk_completed"
##   branch, so pawn.walk_along_path() is NOT called again (which would
##   reset the pawn's progress). The pawn finishes its own restored walk
##   under its own steam, eventually fires walk_completed, and the toil is
##   marked done. See _tick_walk() for the branch logic.

signal job_started(job)
signal job_completed(job)

## Untyped: avoids class_name registration-order trap.
var pawn = null
## Untyped: avoids class_name registration-order trap.
var pathfinder = null
## Current Job being executed; null when idle.
var job = null

# ── lifecycle ────────────────────────────────────────────────────────────────

## Wire refs. Must be called once before any other method.
## Connects pawn.walk_completed → _on_pawn_walk_completed.
func setup(pawn_ref, pathfinder_ref) -> void:
	pawn = pawn_ref
	pathfinder = pathfinder_ref
	pawn.walk_completed.connect(_on_pawn_walk_completed)

# ── public API ───────────────────────────────────────────────────────────────

## Replace the current job (if any) and begin executing the new one.
## Resets nothing on the new job: current_toil_index is used as-is so
## that a restored-from-save job continues from its saved toil position.
func start_job(j) -> void:
	job = j
	Audit.log(
		"job_runner",
		"%s start: %s (%d toils)" % [pawn.pawn_name, j.label, j.toils.size()]
	)
	job_started.emit(j)


## Drop the current job without signalling completion.
## Any walk already in progress is left to finish naturally
## (Phase 3 simplicity; Phase 5+ may add a hard-abort path).
func cancel_job() -> void:
	job = null


## True when a job is currently assigned.
func has_job() -> bool:
	return job != null

# ── sim tick ────────────────────────────────────────────────────────────────

## Called from Pawn._on_sim_tick each sim tick.
## Executes the active toil; advances to the next when it is done;
## emits job_completed when the last toil completes.
func tick() -> void:
	if job == null:
		return
	var t = job.active_toil()
	if t == null:
		_emit_complete()
		return
	match t.kind:
		Toil.KIND_WALK:
			_tick_walk(t)
		Toil.KIND_WAIT:
			_tick_wait(t)
		Toil.KIND_IDLE:
			pass  # Never completes on its own; Decision or player overrides.
	if t.done:
		job.advance()
		if job.is_complete():
			_emit_complete()

# ── save / load ──────────────────────────────────────────────────────────────

## Serialise the runner's persistent state.
## {"job": <dict or null>}
func to_dict() -> Dictionary:
	return {
		"job": job.to_dict() if job != null else null,
	}


## Restore from a dict produced by to_dict().
## If the "job" key holds a Dictionary, reconstructs a Job via Job.from_dict().
## Otherwise clears the current job so no stale job survives the load.
func from_dict(d: Dictionary) -> void:
	var job_data = d.get("job", null)
	if job_data is Dictionary:
		job = Job.from_dict(job_data)
	else:
		job = null

# ── signal handlers ──────────────────────────────────────────────────────────

## Fired by the Pawn when it finishes walking its path.
## Marks the active WALK toil done so the next tick() advances past it.
## Does NOT call job.advance() directly; tick() handles that.
func _on_pawn_walk_completed() -> void:
	if job == null:
		return
	var t = job.active_toil()
	if t != null and t.kind == Toil.KIND_WALK:
		t.done = true

# ── toil executors ──────────────────────────────────────────────────────────

## Execute one tick of a WALK toil.
##
## On the FIRST tick (started=false):
##   - If the pawn is already at the destination, complete immediately.
##   - Otherwise ask the pathfinder for a route. If unreachable, log and
##     complete (skip-and-continue; the WorkProvider is responsible for
##     vetting reachability before issuing the job).
##   - Hand the path to the pawn and mark started=true. From now on this
##     function is a no-op; we just wait for the walk_completed signal.
##
## On SUBSEQUENT ticks (started=true):
##   - No-op. The pawn walks under its own steam.
##
## After LOAD (started=true from saved state):
##   - Same as subsequent ticks: the pawn restores its own path and fires
##     walk_completed when it arrives. We do NOT call walk_along_path again.
func _tick_walk(t) -> void:
	if not t.data.get("started", false):
		var dest: Vector2i = t.get_walk_destination()
		if pawn.tile == dest:
			t.done = true
			return
		var path: Array[Vector2i] = pathfinder.find_path(pawn.tile, dest)
		if path.is_empty():
			# Log origin and destination separately so they stay readable.
			Audit.log(
				"job_runner",
				"%s unreachable: %s -> %s" % [pawn.pawn_name, pawn.tile, dest]
			)
			t.done = true
			return
		pawn.walk_along_path(path)
		t.data["started"] = true


## Execute one tick of a WAIT toil.
## Decrements the counter; sets done when it reaches zero.
func _tick_wait(t) -> void:
	t.data["ticks_remaining"] -= 1
	if t.data["ticks_remaining"] <= 0:
		t.done = true

# ── helpers ──────────────────────────────────────────────────────────────────

## Emit job_completed, log, and clear the job reference.
func _emit_complete() -> void:
	var completed = job
	job = null
	Audit.log(
		"job_runner",
		"%s done: %s" % [pawn.pawn_name, completed.label]
	)
	job_completed.emit(completed)