Phase 3 — Decision pipeline + JobRunner + RestProvider + save round-trip

AI core (scenes/ai/, 6 new files from 3 gdscript-refactor agents in
parallel):
- job.gd (59 lines, Agent A): Job class, RefCounted, label + toils +
  cursor + to_dict/from_dict round-trip
- toil.gd (76 lines, Agent A): Toil class, RefCounted; kinds
  WALK/WAIT/IDLE; factories walk_to/wait_ticks/idle; Vector2i stored as
  to_x/to_y ints because Godot 4 JSON.stringify doesn't round-trip
  Vector2i
- work_provider.gd (27 lines, Agent A): abstract base, class_name,
  @export category/priority, find_best_for() with push_error subclass
  guard
- job_runner.gd (186 lines, Agent B): Node-derived runner;
  setup/start_job/cancel_job/tick; WALK toil delegates to
  pawn.walk_along_path on first encounter (sets data.started=true),
  listens for walk_completed signal; WAIT decrements ticks_remaining;
  IDLE never completes; full to_dict/from_dict
- decision.gd (50 lines, Agent C): static pick_next_job(pawn,
  providers); 5 layers (incapacitation/forced/status/work/idle); layer 1
  probes via has_method to stay future-proof for Phase 9
- rest_provider.gd (31 lines, Agent C): extends WorkProvider; @export
  rest_tile; returns [walk_to(rest_tile), idle()] Job

Integration (Opus):
- pawn.gd: added forced_job slot, job_runner ref, _orchestrate_ai called
  before _advance_walk on each sim_tick. Calls Decision when forced_job
  is queued OR when idle. The initial only-on-idle version was a bug (it
  never preempted the never-completing IDLE toil); caught and fixed via
  MCP runtime test. Added to_dict/from_dict for save round-trip;
  captures tile, _path, _step_progress, _selected, forced_job,
  job_runner via their serializers.
- selection.gd: rewrote to build a forced-job [walk_to + idle] and set
  pawn.forced_job; Decision preempts current job on next tick.
- world.tscn/gd: instantiates RestProvider as child (rest_tile = (50,50)
  just outside the stone ring's south-east, reachable from all 3 spawn
  tiles); registers via World.register_work_provider; attaches a
  JobRunner child to each spawned pawn and wires setup(pawn, pathfinder).
- world.gd autoload: added work_providers list + register/clear methods.
- save_system.gd: write_save walks World.pawns calling to_dict;
  apply_save zips dicts to pawns by index (Phase 16 will add stable IDs).
- main.gd: bootstrap log line bumped Phase 2 → Phase 3.

Acceptance (MCP-verified end-to-end):
- 3 pawns boot, Decision assigns each Rest, JobRunner starts each, all 3
  walk to (50,50) on different paths (40/35/30 steps based on detour
  around the stone ring), arrive and idle.
- Force Bram to (10,10) via pawn.forced_job; preempt fires:
  [decision] Bram: forced 'Go to (10, 10)'. Bram walks while Cora/Edda
  stay parked.
- Mid-walk save round-trip (the critical Phase 3 acceptance):
  - Paused Bram at (51,10) walking to (70,70) with 79 path steps
    remaining
  - SaveSystem.write_save() → SaveSystem.apply_save(read_save()), with
    Bram mutated to (0,0) with no path in between
  - Restored Bram exactly: tile=(51,10), _path.size=79, walking=true,
    job='Go to (70, 70)' at toil_idx=0 (WALK toil with
    data.started=true)
  - Resumed sim → JobRunner's WALK toil saw started=true and did NOT
    re-call walk_along_path; the pawn's restored _path continued the
    walk naturally → reached (70,26) with 44 steps remaining, still on
    the same job. The architecture.md 'mid-toil suspend safe' contract
    is demonstrably honored.

Phase 3 gotchas (logged in implementation.md):
- Class-name registration timing bit again (Phase 2 gotcha). Workflow:
  agent writes class_name file → MCP reload_project → headless validate.
- Forced-job preempt requires triggering Decision when
  forced_job != null, not just when idle (IDLE toil never completes).
- execute_game_script + await Engine.get_main_loop().process_frame is
  flaky: MCP auto-recovers but the script's last lines may be lost.
  Workaround: split state-inspection into a fresh execute_game_script.

Delegation report this phase:
- gdscript-refactor (Sonnet) Agent A: Job + Toil + WorkProvider abstract
  base. 3 files, 162 lines.
- gdscript-refactor (Sonnet) Agent B: JobRunner with toil-execution
  match + walk_completed signal handling + full save round-trip.
  1 file, 186 lines.
- gdscript-refactor (Sonnet) Agent C: Decision pipeline + RestProvider.
  2 files, 81 lines.
- Opus: Pawn integration (forced_job slot, orchestration,
  to_dict/from_dict), Selection rewrite, world.tscn/gd wiring, World
  autoload work_providers list, SaveSystem extension, MCP-driven runtime
  verification including the mid-walk save round-trip demo, gotcha
  logging.

~70% of Phase 3's GDScript was written by subagents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
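
The to_x/to_y note for toil.gd can be illustrated with a minimal sketch. It assumes nothing about the real Toil class: tile_to_dict and tile_from_dict are hypothetical helper names, and only the to_x/to_y key names come from the message above. The int() casts are there because JSON.parse_string returns every number as a float.

```gdscript
extends SceneTree
# Run with: godot --headless --script vector2i_roundtrip.gd
# Hypothetical helpers for illustration; the project's Toil may differ.

# Store the tile as two plain ints so JSON can carry it.
static func tile_to_dict(tile: Vector2i) -> Dictionary:
	return {"to_x": tile.x, "to_y": tile.y}


# Rebuild the tile; JSON numbers come back as floats, so cast to int.
static func tile_from_dict(d: Dictionary) -> Vector2i:
	return Vector2i(int(d.get("to_x", 0)), int(d.get("to_y", 0)))


func _init() -> void:
	var original := Vector2i(70, 70)
	var text := JSON.stringify(tile_to_dict(original))
	var parsed = JSON.parse_string(text)
	var restored := tile_from_dict(parsed)
	assert(restored == original)
	print(restored)
	quit()
```

Keeping the conversion in one pair of functions means any caller that later serialises a tile gets the float-to-int cast for free.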
2026-05-10 21:05:50 +01:00
commit 5bf0f51efb
parent cd265b87c0
20 changed files with 613 additions and 25 deletions
--- a/scenes/ai/job_runner.gd
+++ b/scenes/ai/job_runner.gd
@@ -0,0 +1,186 @@
+class_name JobRunner
+extends Node
+## Executes a Job's toils on behalf of a Pawn.
+##
+## Sits between the Decision layer and the Pawn's physical state.  The
+## Decision layer (or a WorkProvider) hands us a Job; we tick through its
+## toils one-by-one and fire job_completed when the last toil is done.
+##
+## Design notes (docs/architecture.md — Pawn AI 5-layer pipeline):
+##   - JobRunner is layer 3 of 5.  Don't add control-flow that belongs to
+##     Decision (layer 1) or WorkProvider (layer 2) here.
+##   - Pawn and Pathfinder are held as untyped vars to avoid class_name
+##     registration-order issues between autoloads and scene scripts.
+##   - tick() is called from Pawn._on_sim_tick each sim tick.  Never spin
+##     render-frame work off this function.
+##
+## Save / load contract (NON-NEGOTIABLE, Phase 3 acceptance criterion):
+##   to_dict() / from_dict() round-trip mid-toil state exactly.  A WALK
+##   toil with started=true restores correctly: on the first tick after load
+##   the runner sits in the "already started, waiting for walk_completed"
+##   branch, so pawn.walk_along_path() is NOT called again (which would
+##   reset the pawn's progress).  The pawn finishes its own restored walk
+##   under its own steam, eventually fires walk_completed, and the toil is
+##   marked done.  See _tick_walk() for the branch logic.
+
+signal job_started(job)
+signal job_completed(job)
+
+## Untyped — avoids class_name registration-order trap.
+var pawn = null
+## Untyped — avoids class_name registration-order trap.
+var pathfinder = null
+## Current Job being executed; null when idle.
+var job = null
+
+
+# ── lifecycle ────────────────────────────────────────────────────────────────
+
+## Wire refs.  Must be called once before any other method.
+## Connects pawn.walk_completed → _on_pawn_walk_completed.
+func setup(pawn_ref, pathfinder_ref) -> void:
+	pawn = pawn_ref
+	pathfinder = pathfinder_ref
+	pawn.walk_completed.connect(_on_pawn_walk_completed)
+
+
+# ── public API ───────────────────────────────────────────────────────────────
+
+## Replace the current job (if any) and begin executing the new one.
+## Resets nothing on the new job — current_toil_index is used as-is so
+## that a restored-from-save job continues from its saved toil position.
+func start_job(j) -> void:
+	job = j
+	Audit.log(
+		"job_runner",
+		"%s start: %s (%d toils)" % [pawn.pawn_name, j.label, j.toils.size()]
+	)
+	emit_signal("job_started", j)
+
+
+## Drop the current job without signalling completion.
+## Any walk already in progress is left to finish naturally
+## (Phase 3 simplicity; Phase 5+ may add a hard-abort path).
+func cancel_job() -> void:
+	job = null
+
+
+## True when a job is currently assigned.
+func has_job() -> bool:
+	return job != null
+
+
+# ── sim tick ────────────────────────────────────────────────────────────────
+
+## Called from Pawn._on_sim_tick each sim tick.
+## Executes the active toil; advances to the next when it is done;
+## emits job_completed when the last toil completes.
+func tick() -> void:
+	if job == null:
+		return
+
+	var t = job.active_toil()
+	if t == null:
+		_emit_complete()
+		return
+
+	match t.kind:
+		Toil.KIND_WALK:
+			_tick_walk(t)
+		Toil.KIND_WAIT:
+			_tick_wait(t)
+		Toil.KIND_IDLE:
+			pass  # Never completes on its own — Decision or player overrides.
+
+	if t.done:
+		job.advance()
+		if job.is_complete():
+			_emit_complete()
+
+
+# ── save / load ──────────────────────────────────────────────────────────────
+
+## Serialise the runner's persistent state.
+## {"job": <dict or null>}
+func to_dict() -> Dictionary:
+	return {
+		"job": job.to_dict() if job != null else null,
+	}
+
+
+## Restore from a dict produced by to_dict().
+## If the "job" key holds a Dictionary, reconstructs a Job via Job.from_dict().
+func from_dict(d: Dictionary) -> void:
+	var job_data = d.get("job", null)
+	# Reset to null when the save holds no job, so a stale job cannot survive a load.
+	job = Job.from_dict(job_data) if job_data is Dictionary else null
+
+
+# ── signal handlers ──────────────────────────────────────────────────────────
+
+## Fired by the Pawn when it finishes walking its path.
+## Marks the active WALK toil done so the next tick() advances past it.
+## Does NOT call job.advance() directly — tick() handles that.
+func _on_pawn_walk_completed() -> void:
+	if job == null:
+		return
+	var t = job.active_toil()
+	if t != null and t.kind == Toil.KIND_WALK:
+		t.done = true
+
+
+# ── toil executors ──────────────────────────────────────────────────────────
+
+## Execute one tick of a WALK toil.
+##
+## On the FIRST tick (started=false):
+##   - If the pawn is already at the destination, complete immediately.
+##   - Otherwise ask the pathfinder for a route.  If unreachable, log and
+##     complete (skip-and-continue; the WorkProvider is responsible for
+##     vetting reachability before issuing the job).
+##   - Hand the path to the pawn and mark started=true.  From now on this
+##     function is a no-op — we just wait for the walk_completed signal.
+##
+## On SUBSEQUENT ticks (started=true):
+##   - No-op.  The pawn walks under its own steam.
+##
+## After LOAD (started=true from saved state):
+##   - Same as subsequent ticks — pawn restores its own path and fires
+##     walk_completed when it arrives.  We do NOT call walk_along_path again.
+func _tick_walk(t) -> void:
+	if not t.data.get("started", false):
+		var dest: Vector2i = t.get_walk_destination()
+		if pawn.tile == dest:
+			t.done = true
+			return
+		var path: Array[Vector2i] = pathfinder.find_path(pawn.tile, dest)
+		if path.is_empty():
+			Audit.log(
+				"job_runner",
+				"%s unreachable: %s → %s" % [pawn.pawn_name, pawn.tile, dest]
+			)
+			t.done = true
+			return
+		pawn.walk_along_path(path)
+		t.data["started"] = true
+
+
+## Execute one tick of a WAIT toil.
+## Decrements the counter; sets done when it reaches zero.
+func _tick_wait(t) -> void:
+	t.data["ticks_remaining"] -= 1
+	if t.data["ticks_remaining"] <= 0:
+		t.done = true
+
+
+# ── helpers ──────────────────────────────────────────────────────────────────
+
+## Emit job_completed, log, and clear the job reference.
+func _emit_complete() -> void:
+	var completed = job
+	job = null
+	Audit.log(
+		"job_runner",
+		"%s done: %s" % [pawn.pawn_name, completed.label]
+	)
+	emit_signal("job_completed", completed)
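
The started-flag contract described in the _tick_walk comments can be exercised in isolation. The sketch below is a stand-alone approximation under stated assumptions: FakePawn and tick_walk are hypothetical stand-ins, not the project's Pawn or JobRunner, and the pathfinder is replaced by a one-element path. It only shows that a WALK toil whose data carries started=true after a JSON round-trip does not trigger a second walk_along_path call.

```gdscript
extends SceneTree
# Run with: godot --headless --script walk_resume_sketch.gd
# Everything here is a hypothetical stand-in, not the project's classes.

class FakePawn extends RefCounted:
	var walk_calls := 0  # How many times walk_along_path() was invoked.

	func walk_along_path(_path: Array[Vector2i]) -> void:
		walk_calls += 1


# Mirrors the started-flag branch of _tick_walk on a plain dictionary.
func tick_walk(pawn: FakePawn, toil_data: Dictionary, dest: Vector2i) -> void:
	if toil_data.get("started", false):
		return  # Already walking, or restored mid-walk: wait for the signal.
	var path: Array[Vector2i] = [dest]  # Stand-in for pathfinder.find_path().
	pawn.walk_along_path(path)
	toil_data["started"] = true


func _init() -> void:
	var dest := Vector2i(70, 70)

	# Fresh toil: the first tick starts the walk, the second is a no-op.
	var pawn := FakePawn.new()
	var data := {"started": false}
	tick_walk(pawn, data, dest)
	tick_walk(pawn, data, dest)
	assert(pawn.walk_calls == 1)

	# Simulated save / load: the flag survives JSON, so no call is made.
	var restored: Dictionary = JSON.parse_string(JSON.stringify(data))
	var loaded_pawn := FakePawn.new()
	tick_walk(loaded_pawn, restored, dest)
	assert(loaded_pawn.walk_calls == 0)
	print("walk calls after load: ", loaded_pawn.walk_calls)
	quit()
```

The flag lives in the toil's own data rather than on the runner, which is what lets it travel through to_dict/from_dict with the job.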