Phase 3 — Decision pipeline + JobRunner + RestProvider + save round-trip

AI core (scenes/ai/, 6 new files from 3 gdscript-refactor agents in
parallel):
- job.gd (59 lines, Agent A): Job class, RefCounted; label + toils +
  cursor + to_dict/from_dict round-trip.
- toil.gd (76 lines, Agent A): Toil class, RefCounted; kinds
  WALK/WAIT/IDLE; factories walk_to/wait_ticks/idle; Vector2i stored as
  to_x/to_y ints because Godot 4 JSON.stringify doesn't round-trip
  Vector2i.
- work_provider.gd (27 lines, Agent A): abstract base, class_name,
  @export category/priority, find_best_for() with push_error subclass
  guard.
- job_runner.gd (186 lines, Agent B): Node-derived runner;
  setup/start_job/cancel_job/tick; WALK toil delegates to
  pawn.walk_along_path on first encounter (sets data.started=true) and
  listens for the walk_completed signal; WAIT decrements
  ticks_remaining; IDLE never completes; full to_dict/from_dict.
- decision.gd (50 lines, Agent C): static pick_next_job(pawn,
  providers); 5 layers (incapacitation/forced/status/work/idle);
  layer 1 probes via has_method to stay future-proof for Phase 9.
- rest_provider.gd (31 lines, Agent C): extends WorkProvider; @export
  rest_tile; returns a [walk_to(rest_tile), idle()] Job.

Integration (Opus):
- pawn.gd: added forced_job slot, job_runner ref, and _orchestrate_ai,
  called before _advance_walk on each sim_tick. Calls Decision when
  forced_job is queued OR when idle. The first version called it only
  when idle, which never preempted the never-completing IDLE toil; the
  bug was caught via an MCP runtime test and fixed. Added
  to_dict/from_dict for save round-trip; captures tile, _path,
  _step_progress, _selected, forced_job, job_runner via their
  serializers.
- selection.gd: rewrote to build a forced job [walk_to + idle] and set
  pawn.forced_job; Decision preempts the current job on the next tick.
- world.tscn/gd: instantiates RestProvider as child (rest_tile = (50,50),
  just outside the stone ring's south-east, reachable from all 3 spawn
  tiles); registers via World.register_work_provider; attaches a
  JobRunner child to each spawned pawn and wires setup(pawn, pathfinder).
- world.gd autoload: added work_providers list + register/clear methods.
- save_system.gd: write_save walks World.pawns calling to_dict;
  apply_save zips dicts to pawns by index (Phase 16 will add stable IDs).
- main.gd: bootstrap log line bumped Phase 2 → Phase 3.

Acceptance (MCP-verified end-to-end):
- 3 pawns boot, Decision assigns each Rest, JobRunner starts each, all 3
  walk to (50,50) on different paths (40/35/30 steps based on detour
  around the stone ring), arrive and idle.
- Force Bram to (10,10) via pawn.forced_job; preempt fires:
  [decision] Bram: forced 'Go to (10, 10)'. Bram walks while Cora/Edda
  stay parked.
- Mid-walk save round-trip (the critical Phase 3 acceptance):
  - Paused Bram at (51,10) walking to (70,70) with 79 path steps
    remaining.
  - SaveSystem.write_save() → SaveSystem.apply_save(read_save()) after a
    mutate-to-(0,0)-with-no-path round-trip.
  - Restored Bram exactly: tile=(51,10), _path.size=79, walking=true,
    job='Go to (70, 70)' at toil_idx=0 (WALK toil with
    data.started=true).
  - Resumed sim → JobRunner's WALK toil saw started=true and did NOT
    re-call walk_along_path; the pawn's restored _path continued the
    walk naturally → reached (70,26) with 44 steps remaining, still on
    the same job.
  The architecture.md 'mid-toil suspend safe' contract is demonstrably
  honored.

Phase 3 gotchas (logged in implementation.md):
- Class-name registration timing bit again (Phase 2 gotcha). Workflow:
  agent writes class_name file → MCP reload_project → headless validate.
- Forced-job preempt requires triggering Decision when
  forced_job != null, not just when idle (IDLE toil never completes).
- execute_game_script + await Engine.get_main_loop().process_frame is
  flaky: MCP auto-recovers but the script's last lines may be lost.
  Workaround: split state-inspection into a fresh execute_game_script.

Delegation report this phase:
- gdscript-refactor (Sonnet) Agent A: Job + Toil + WorkProvider abstract
  base. 3 files, 162 lines.
- gdscript-refactor (Sonnet) Agent B: JobRunner with toil-execution
  match + walk_completed signal handling + full save round-trip.
  1 file, 186 lines.
- gdscript-refactor (Sonnet) Agent C: Decision pipeline + RestProvider.
  2 files, 81 lines.
- Opus: Pawn integration (forced_job slot, orchestration,
  to_dict/from_dict), Selection rewrite, world.tscn/gd wiring, World
  autoload work_providers list, SaveSystem extension, MCP-driven runtime
  verification including the mid-walk save round-trip demo, gotcha
  logging.

~70% of Phase 3's GDScript was written by subagents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:05:50 +01:00 · 2026-05-10 21:05:50 +01:00 · 5bf0f51efb
commit 5bf0f51efb
parent cd265b87c0
20 changed files with 613 additions and 25 deletions
--- a/scenes/ai/decision.gd
+++ b/scenes/ai/decision.gd
@@ -0,0 +1,50 @@
+class_name Decision
+## Static utility — picks the next Job for a pawn via the 5-layer pipeline.
+##
+## Layer order (top wins):
+##   1. Incapacitation  — has_method probe; implementation lands Phase 9.
+##   2. Forced job      — pawn.forced_job; cleared (consumed) on pick.
+##   3. Status interrupt — stub; implementation lands Phase 9.
+##   4. Work providers  — iterated highest priority first; first non-null Job wins.
+##   5. Idle            — returns null (caller interprets as "stand still").
+##
+## Callers pass the world-scoped provider list so Decision is fully stateless.
+## This makes it safe to call from any pawn tick without shared mutable state.
+
+
+## Returns the best Job for `pawn`, or null if the pawn should idle.
+##
+## `work_providers` is the current world-scoped list of WorkProvider nodes
+## (e.g. [RestProvider]). Order does not matter — the method sorts by priority.
+## `pawn` is duck-typed: must expose .pawn_name, .forced_job, and
+## has_method("is_incapacitated").
+static func pick_next_job(pawn, work_providers: Array) -> Job:
+	# ── Layer 1: Incapacitation ──────────────────────────────────────────────
+	# has_method probe so this doesn't break before Phase 9 adds the method.
+	if pawn.has_method("is_incapacitated") and pawn.is_incapacitated():
+		return null
+
+	# ── Layer 2: Forced job ──────────────────────────────────────────────────
+	if pawn.forced_job != null:
+		var fj: Job = pawn.forced_job
+		pawn.forced_job = null
+		Audit.log("decision", "%s: forced '%s'" % [pawn.pawn_name, fj.label])
+		return fj
+
+	# ── Layer 3: Status interrupt ─────────────────────────────────────────────
+	# Phase 9: status interrupt (Bleeding → seek bed/doctor) lands here.
+
+	# ── Layer 4: Work providers ──────────────────────────────────────────────
+	# Sort a local copy so the original list order is never mutated.
+	var sorted: Array = work_providers.duplicate()
+	sorted.sort_custom(func(a, b): return a.priority > b.priority)
+
+	for wp in sorted:
+		var j: Job = wp.find_best_for(pawn)
+		if j != null:
+			Audit.log("decision", "%s: %s → '%s'" % [pawn.pawn_name, String(wp.category), j.label])
+			return j
+
+	# ── Layer 5: Idle ────────────────────────────────────────────────────────
+	# No log — would fire every tick for every idle pawn (too chatty).
+	return null
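Layer 2 consumes whatever Job sits in `pawn.forced_job`. The sketch below shows one way a caller might build such a job, following the [walk_to + idle] shape and the 'Go to (10, 10)' label quoted in the commit message. The class name `ForcedJobs` and the helper `go_to()` are hypothetical and not part of this commit; selection.gd's actual code is not shown in this diff.

```gdscript
class_name ForcedJobs extends RefCounted
## Hypothetical helper, not part of this commit. Assumes the Job and Toil
## classes added under scenes/ai/.


## Builds a walk-then-idle Job suitable for assigning to pawn.forced_job.
static func go_to(dest: Vector2i) -> Job:
	var j := Job.new()
	j.label = "Go to %s" % dest
	j.toils.append(Toil.walk_to(dest))
	j.toils.append(Toil.idle())  # Parks the pawn until something preempts it.
	return j
```

A caller would then write `pawn.forced_job = ForcedJobs.go_to(Vector2i(10, 10))`; the next `Decision.pick_next_job()` call returns that job and resets the slot to null.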
--- a/scenes/ai/decision.gd.uid
+++ b/scenes/ai/decision.gd.uid
@@ -0,0 +1 @@
+uid://bbrqev1r5e5gh
--- a/scenes/ai/job.gd
+++ b/scenes/ai/job.gd
@@ -0,0 +1,59 @@
+class_name Job extends RefCounted
+## A sequence of Toils that describes a pawn's current task (walk to, haul,
+## build, rest, etc.). Job is plain data; JobRunner is the state machine that
+## drives execution tick-by-tick.
+##
+## Save/load contract:
+##   var j2 := Job.from_dict(j.to_dict())
+##   assert(j2.label == j.label)
+##   assert(j2.current_toil_index == j.current_toil_index)
+##   assert(j2.toils.size() == j.toils.size())
+##   # Each toil round-trips per Toil's own invariant.
+
+var label: String = ""
+var toils: Array[Toil] = []
+var current_toil_index: int = 0
+
+
+# ── queries ──────────────────────────────────────────────────────────────────
+
+## Returns the currently-executing Toil, or null when the job is done.
+func active_toil() -> Toil:
+	if is_complete():
+		return null
+	return toils[current_toil_index]
+
+
+## True once every toil has been completed.
+func is_complete() -> bool:
+	return current_toil_index >= toils.size()
+
+
+# ── state mutation ───────────────────────────────────────────────────────────
+
+## Called by JobRunner after the current toil finishes. Steps the index forward.
+func advance() -> void:
+	current_toil_index += 1
+
+
+# ── save / load ──────────────────────────────────────────────────────────────
+
+func to_dict() -> Dictionary:
+	var toil_list: Array = []
+	for toil in toils:
+		toil_list.append(toil.to_dict())
+	return {
+		"label": label,
+		"current_toil_index": current_toil_index,
+		"toils": toil_list,
+	}
+
+
+static func from_dict(d: Dictionary) -> Job:
+	var j := Job.new()
+	j.label = d.get("label", "")
+	j.current_toil_index = d.get("current_toil_index", 0)
+	var raw_toils: Array = d.get("toils", [])
+	for raw in raw_toils:
+		j.toils.append(Toil.from_dict(raw))
+	return j
--- a/scenes/ai/job.gd.uid
+++ b/scenes/ai/job.gd.uid
@@ -0,0 +1 @@
+uid://d1mksv0d6qieu
--- a/scenes/ai/job_runner.gd
+++ b/scenes/ai/job_runner.gd
@@ -0,0 +1,186 @@
+class_name JobRunner
+extends Node
+## Executes a Job's toils on behalf of a Pawn.
+##
+## Sits between the Decision layer and the Pawn's physical state.  The
+## Decision layer (or a WorkProvider) hands us a Job; we tick through its
+## toils one-by-one and fire job_completed when the last toil is done.
+##
+## Design notes (docs/architecture.md — Pawn AI 5-layer pipeline):
+##   - JobRunner is layer 3 of 5.  Don't add control-flow that belongs to
+##     Decision (layer 1) or WorkProvider (layer 2) here.
+##   - Pawn and Pathfinder are held as untyped vars to avoid class_name
+##     registration-order issues between autoloads and scene scripts.
+##   - tick() is called from Pawn._on_sim_tick each sim tick.  Never spin
+##     render-frame work off this function.
+##
+## Save / load contract (NON-NEGOTIABLE, Phase 3 acceptance criterion):
+##   to_dict() / from_dict() round-trip mid-toil state exactly.  A WALK
+##   toil with started=true restores correctly: on the first tick after load
+##   the runner sits in the "already started, waiting for walk_completed"
+##   branch, so pawn.walk_along_path() is NOT called again (which would
+##   reset the pawn's progress).  The pawn finishes its own restored walk
+##   under its own steam, eventually fires walk_completed, and the toil is
+##   marked done.  See _tick_walk() for the branch logic.
+
+signal job_started(job)
+signal job_completed(job)
+
+## Untyped — avoids class_name registration-order trap.
+var pawn = null
+## Untyped — avoids class_name registration-order trap.
+var pathfinder = null
+## Current Job being executed; null when idle.
+var job = null
+
+
+# ── lifecycle ────────────────────────────────────────────────────────────────
+
+## Wire refs.  Must be called once before any other method.
+## Connects pawn.walk_completed → _on_pawn_walk_completed.
+func setup(pawn_ref, pathfinder_ref) -> void:
+	pawn = pawn_ref
+	pathfinder = pathfinder_ref
+	pawn.walk_completed.connect(_on_pawn_walk_completed)
+
+
+# ── public API ───────────────────────────────────────────────────────────────
+
+## Replace the current job (if any) and begin executing the new one.
+## Resets nothing on the new job — current_toil_index is used as-is so
+## that a restored-from-save job continues from its saved toil position.
+func start_job(j) -> void:
+	job = j
+	Audit.log(
+		"job_runner",
+		"%s start: %s (%d toils)" % [pawn.pawn_name, j.label, j.toils.size()]
+	)
+	emit_signal("job_started", j)
+
+
+## Drop the current job without signalling completion.
+## Any walk already in progress is left to finish naturally
+## (Phase 3 simplicity; Phase 5+ may add a hard-abort path).
+func cancel_job() -> void:
+	job = null
+
+
+## True when a job is currently assigned.
+func has_job() -> bool:
+	return job != null
+
+
+# ── sim tick ────────────────────────────────────────────────────────────────
+
+## Called from Pawn._on_sim_tick each sim tick.
+## Executes the active toil; advances to the next when it is done;
+## emits job_completed when the last toil completes.
+func tick() -> void:
+	if job == null:
+		return
+
+	var t = job.active_toil()
+	if t == null:
+		_emit_complete()
+		return
+
+	match t.kind:
+		Toil.KIND_WALK:
+			_tick_walk(t)
+		Toil.KIND_WAIT:
+			_tick_wait(t)
+		Toil.KIND_IDLE:
+			pass  # Never completes on its own — Decision or player overrides.
+
+	if t.done:
+		job.advance()
+		if job.is_complete():
+			_emit_complete()
+
+
+# ── save / load ──────────────────────────────────────────────────────────────
+
+## Serialise the runner's persistent state.
+## {"job": <dict or null>}
+func to_dict() -> Dictionary:
+	return {
+		"job": job.to_dict() if job != null else null,
+	}
+
+
+## Restore from a dict produced by to_dict().
+## If the "job" key holds a Dictionary, reconstructs a Job via Job.from_dict().
+func from_dict(d: Dictionary) -> void:
+	var job_data = d.get("job", null)
+	# Assign unconditionally so a save with no job clears any stale one.
+	job = Job.from_dict(job_data) if job_data is Dictionary else null
+
+
+# ── signal handlers ──────────────────────────────────────────────────────────
+
+## Fired by the Pawn when it finishes walking its path.
+## Marks the active WALK toil done so the next tick() advances past it.
+## Does NOT call job.advance() directly — tick() handles that.
+func _on_pawn_walk_completed() -> void:
+	if job == null:
+		return
+	var t = job.active_toil()
+	if t != null and t.kind == Toil.KIND_WALK:
+		t.done = true
+
+
+# ── toil executors ──────────────────────────────────────────────────────────
+
+## Execute one tick of a WALK toil.
+##
+## On the FIRST tick (started=false):
+##   - If the pawn is already at the destination, complete immediately.
+##   - Otherwise ask the pathfinder for a route.  If unreachable, log and
+##     complete (skip-and-continue; the WorkProvider is responsible for
+##     vetting reachability before issuing the job).
+##   - Hand the path to the pawn and mark started=true.  From now on this
+##     function is a no-op — we just wait for the walk_completed signal.
+##
+## On SUBSEQUENT ticks (started=true):
+##   - No-op.  The pawn walks under its own steam.
+##
+## After LOAD (started=true from saved state):
+##   - Same as subsequent ticks — pawn restores its own path and fires
+##     walk_completed when it arrives.  We do NOT call walk_along_path again.
+func _tick_walk(t) -> void:
+	if not t.data.get("started", false):
+		var dest: Vector2i = t.get_walk_destination()
+		if pawn.tile == dest:
+			t.done = true
+			return
+		var path: Array[Vector2i] = pathfinder.find_path(pawn.tile, dest)
+		if path.is_empty():
+			Audit.log(
+				"job_runner",
+				"%s unreachable: %s → %s" % [pawn.pawn_name, pawn.tile, dest]
+			)
+			t.done = true
+			return
+		pawn.walk_along_path(path)
+		t.data["started"] = true
+
+
+## Execute one tick of a WAIT toil.
+## Decrements the counter; sets done when it reaches zero.
+func _tick_wait(t) -> void:
+	t.data["ticks_remaining"] -= 1
+	if t.data["ticks_remaining"] <= 0:
+		t.done = true
+
+
+# ── helpers ──────────────────────────────────────────────────────────────────
+
+## Emit job_completed, log, and clear the job reference.
+func _emit_complete() -> void:
+	var completed = job
+	job = null
+	Audit.log(
+		"job_runner",
+		"%s done: %s" % [pawn.pawn_name, completed.label]
+	)
+	emit_signal("job_completed", completed)
--- a/scenes/ai/job_runner.gd.uid
+++ b/scenes/ai/job_runner.gd.uid
@@ -0,0 +1 @@
+uid://8v4lqcrhx1eu
--- a/scenes/ai/rest_provider.gd
+++ b/scenes/ai/rest_provider.gd
@@ -0,0 +1,31 @@
+class_name RestProvider extends WorkProvider
+## Phase 3 smoke-test WorkProvider: sends every pawn to a shared rest tile.
+##
+## If the pawn is already at rest_tile, returns a walk-less idle-forever job.
+## Otherwise prepends a walk_to toil before the idle toil.
+##
+## No internal state beyond rest_tile — Decision's log line carries all
+## the info needed for debugging (pawn name + provider category + job label).
+
+
+## The tile pawns walk toward. Set by the world scene on instantiation.
+@export var rest_tile: Vector2i = Vector2i(40, 40)
+
+
+func _init() -> void:
+	category = &"rest"
+	priority = 0  # Only provider in Phase 3; no relative ordering needed yet.
+
+
+## Returns a Job for `pawn`. Never returns null — Rest always has something
+## to offer (walk there, or idle in place).
+## `pawn` is duck-typed: must expose .tile (Vector2i).
+func find_best_for(pawn) -> Job:
+	var j := Job.new()
+	j.label = "Rest at %s" % rest_tile
+
+	if pawn.tile != rest_tile:
+		j.toils.append(Toil.walk_to(rest_tile))
+
+	j.toils.append(Toil.idle())
+	return j
--- a/scenes/ai/rest_provider.gd.uid
+++ b/scenes/ai/rest_provider.gd.uid
@@ -0,0 +1 @@
+uid://dyacrro784lvo
--- a/scenes/ai/toil.gd
+++ b/scenes/ai/toil.gd
@@ -0,0 +1,76 @@
+class_name Toil extends RefCounted
+## A single atomic step within a Job — walk, wait, idle, etc.
+##
+## Save/load contract: every value in `data` MUST be JSON-safe.
+## Vector2i is NOT JSON-safe in Godot 4 — tile coordinates are stored as
+## "to_x"/"to_y" integer keys, never as Vector2i. get_walk_destination()
+## reconstructs Vector2i on demand.
+##
+## Round-trip invariant:
+##   var t2 := Toil.from_dict(t.to_dict())
+##   assert(t2.kind == t.kind and t2.done == t.done and t2.data == t.data)
+
+const KIND_WALK: StringName = &"walk"
+const KIND_WAIT: StringName = &"wait"
+const KIND_IDLE: StringName = &"idle"
+
+var kind: StringName = KIND_IDLE
+## Toil-specific params: all values must be int, float, bool, String, Dictionary, or Array.
+var data: Dictionary = {}
+## Set by JobRunner when this toil is complete.
+var done: bool = false
+
+
+# ── factories ────────────────────────────────────────────────────────────────
+
+## Walk to the given tile. Stores coords as separate ints for JSON safety.
+static func walk_to(tile: Vector2i) -> Toil:
+	var t := Toil.new()
+	t.kind = KIND_WALK
+	t.data = {
+		"to_x": tile.x,
+		"to_y": tile.y,
+		"started": false,
+	}
+	return t
+
+
+## Pause for `n` sim ticks.
+static func wait_ticks(n: int) -> Toil:
+	var t := Toil.new()
+	t.kind = KIND_WAIT
+	t.data = {"ticks_remaining": n}
+	return t
+
+
+## Stand idle — never completes on its own; JobRunner must cancel or replace.
+static func idle() -> Toil:
+	var t := Toil.new()
+	t.kind = KIND_IDLE
+	t.data = {}
+	return t
+
+
+# ── save / load ──────────────────────────────────────────────────────────────
+
+func to_dict() -> Dictionary:
+	return {
+		"kind": str(kind),
+		"data": data.duplicate(true),
+		"done": done,
+	}
+
+
+static func from_dict(d: Dictionary) -> Toil:
+	var t := Toil.new()
+	t.kind = StringName(d.get("kind", str(KIND_IDLE)))
+	t.data = (d.get("data", {}) as Dictionary).duplicate(true)
+	t.done = d.get("done", false)
+	return t
+
+
+# ── convenience ──────────────────────────────────────────────────────────────
+
+## Rebuild Vector2i from the JSON-safe int fields. Only valid for KIND_WALK.
+func get_walk_destination() -> Vector2i:
+	return Vector2i(data.get("to_x", 0), data.get("to_y", 0))
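The to_x/to_y convention exists so a toil survives a trip through JSON text. The following is a rough check script under stated assumptions: it is hypothetical (not part of this commit), it assumes the Toil class above is registered, and it assumes it is run from the project via Godot's `--script` option. Godot's JSON parser returns every number as a float, so the sketch compares the rebuilt Vector2i instead of the raw `data` dictionaries.

```gdscript
extends SceneTree
## Hypothetical check script, not part of this commit. Assumes the Toil
## class_name from scenes/ai/toil.gd is registered in the project.


func _init() -> void:
	var original := Toil.walk_to(Vector2i(70, 70))

	# Serialise to JSON text and back, as a save file would.
	var text: String = JSON.stringify(original.to_dict())
	var parsed = JSON.parse_string(text)
	if not (parsed is Dictionary):
		push_error("unexpected JSON payload")
		quit(1)
		return

	var restored := Toil.from_dict(parsed)
	# Numbers come back from JSON as floats, so compare the reconstructed
	# destination rather than the data dictionaries themselves.
	print(restored.kind == original.kind)
	print(restored.get_walk_destination() == original.get_walk_destination())
	quit()
```

Storing a Vector2i directly in `data` would not survive this trip as a Vector2i, which is why the factory splits it into two integer keys.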
--- a/scenes/ai/toil.gd.uid
+++ b/scenes/ai/toil.gd.uid
@@ -0,0 +1 @@
+uid://djmc0woq4u65m
--- a/scenes/ai/work_provider.gd
+++ b/scenes/ai/work_provider.gd
@@ -0,0 +1,27 @@
+class_name WorkProvider extends Node
+## Abstract base for all work-category providers (Construction, Mining,
+## Hauling, Cooking, …). Subclass this and override find_best_for().
+##
+## Pawn AI layer 2: each pawn iterates its ordered list of WorkProviders
+## and calls find_best_for(self) until one returns a non-null Job.
+##
+## `pawn` is intentionally untyped (duck-typed) to avoid class_name
+## init-order issues. Concrete providers access pawn.tile, pawn.pawn_name,
+## pawn.is_walking(), etc. — the same public API exposed by Pawn.
+
+## Work category key used to identify this provider. Must be unique per
+## provider instance; used by the priority matrix and Decision layer.
+@export var category: StringName = &"unspecified"
+
+## Priority slot in the pawn's work-priority matrix.
+## Higher values are scanned first by the Decision layer.
+@export var priority: int = 0
+
+
+# ── abstract interface ───────────────────────────────────────────────────────
+
+## Concrete providers MUST override this.
+## Return a Job for `pawn` to execute, or null if no suitable work exists.
+func find_best_for(pawn) -> Job:
+	push_error("WorkProvider.find_best_for: subclass '%s' must override this method" % name)
+	return null
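To illustrate the subclassing contract, here is a minimal sketch of a second provider that outranks RestProvider and sometimes declines by returning null. `PatrolProvider`, `waypoint`, and `pause_ticks` are hypothetical names invented for this example; nothing like it ships in this commit.

```gdscript
class_name PatrolProvider extends WorkProvider
## Hypothetical example provider, not part of this commit. Assumes the
## Job, Toil and WorkProvider classes added under scenes/ai/.

## Tile the pawn is sent to (illustrative default).
@export var waypoint: Vector2i = Vector2i(10, 10)
## How long the pawn pauses on arrival, in sim ticks (illustrative default).
@export var pause_ticks: int = 30


func _init() -> void:
	category = &"patrol"
	priority = 10  # Higher than RestProvider's 0, so Decision asks us first.


## Returns a walk-then-wait Job, or null when the pawn already stands on
## the waypoint so that Decision falls through to the next provider.
func find_best_for(pawn) -> Job:
	if pawn.tile == waypoint:
		return null
	var j := Job.new()
	j.label = "Patrol to %s" % waypoint
	j.toils.append(Toil.walk_to(waypoint))
	j.toils.append(Toil.wait_ticks(pause_ticks))
	return j
```

Because this job ends with a WAIT toil instead of IDLE, it completes on its own. The next Decision pass then finds the pawn on the waypoint, gets null from this provider, and falls back to Rest.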
--- a/scenes/ai/work_provider.gd.uid
+++ b/scenes/ai/work_provider.gd.uid
@@ -0,0 +1 @@
+uid://vi08by1dh0lb