Phase 3 — Decision pipeline + JobRunner + RestProvider + save round-trip

AI core (scenes/ai/, 6 new files from 3 gdscript-refactor agents in parallel):
- job.gd (59 lines, Agent A): Job class, RefCounted, label + toils + cursor +
  to_dict/from_dict round-trip
- toil.gd (76 lines, Agent A): Toil class, RefCounted; kinds WALK/WAIT/IDLE;
  factories walk_to/wait_ticks/idle; Vector2i stored as to_x/to_y ints
  because Godot 4 JSON.stringify doesn't round-trip Vector2i
- work_provider.gd (27 lines, Agent A): abstract base, class_name, @export
  category/priority, find_best_for() with push_error subclass guard
- job_runner.gd (186 lines, Agent B): Node-derived runner; setup/start_job/
  cancel_job/tick; WALK toil delegates to pawn.walk_along_path on first
  encounter (sets data.started=true), listens for walk_completed signal;
  WAIT decrements ticks_remaining; IDLE never completes; full to_dict/from_dict
- decision.gd (50 lines, Agent C): static pick_next_job(pawn, providers); 5
  layers (incapacitation/forced/status/work/idle); layer 1 probes via
  has_method to stay future-proof for Phase 9
- rest_provider.gd (31 lines, Agent C): extends WorkProvider; @export rest_tile;
  returns [walk_to(rest_tile), idle()] Job

Integration (Opus):
- pawn.gd: added forced_job slot, job_runner ref, _orchestrate_ai called
  before _advance_walk on each sim_tick. Calls Decision when forced_job is
  queued OR when idle. The initial only-on-idle version was a bug (it never
  preempted the never-completing IDLE toil); caught via MCP runtime test
  and fixed.
  Added to_dict/from_dict for save round-trip; captures tile, _path,
  _step_progress, _selected, forced_job, job_runner via their serializers.
- selection.gd: rewrote to build a forced-job [walk_to + idle] and set
  pawn.forced_job; Decision preempts current job on next tick.
- world.tscn/gd: instantiates RestProvider as child (rest_tile = (50,50)
  just outside the stone ring's south-east, reachable from all 3 spawn
  tiles); registers via World.register_work_provider; attaches a JobRunner
  child to each spawned pawn and wires setup(pawn, pathfinder).
- world.gd autoload: added work_providers list + register/clear methods.
- save_system.gd: write_save walks World.pawns calling to_dict; apply_save
  zips dicts to pawns by index (Phase 16 will add stable IDs).
- main.gd: bootstrap log line bumped Phase 2 → Phase 3.

Acceptance — MCP-verified end-to-end:
- 3 pawns boot, Decision assigns each Rest, JobRunner starts each,
  all 3 walk to (50,50) on different paths (40/35/30 steps based on
  detour around the stone ring), arrive and idle.
- Force Bram to (10,10) via pawn.forced_job; preempt fires:
  [decision] Bram: forced 'Go to (10, 10)'. Bram walks while Cora/Edda
  stay parked.
- Mid-walk save round-trip (the critical Phase 3 acceptance):
  - Paused Bram at (51,10) walking to (70,70) with 79 path steps remaining
  - SaveSystem.write_save() → SaveSystem.apply_save(read_save()) after a
    mutate-to-(0,0)-with-no-path round-trip
  - Restored Bram exactly: tile=(51,10), _path.size=79, walking=true,
    job='Go to (70, 70)' at toil_idx=0 (WALK toil with data.started=true)
  - Resumed sim → JobRunner's WALK toil saw started=true and did NOT
    re-call walk_along_path; the pawn's restored _path continued the walk
    naturally → reached (70,26) with 44 steps remaining, still on the
    same job. The architecture.md 'mid-toil suspend safe' contract
    held for this run.

Phase 3 gotchas (logged in implementation.md):
- Class-name registration timing bit again (Phase 2 gotcha). Workflow:
  agent writes class_name file → MCP reload_project → headless validate.
- Forced-job preempt requires triggering Decision when forced_job != null,
  not just when idle (IDLE toil never completes).
- execute_game_script + await Engine.get_main_loop().process_frame is
  flaky — MCP auto-recovers but the script's last lines may be lost.
  Workaround: split state-inspection into a fresh execute_game_script.

Delegation report this phase:
- gdscript-refactor (Sonnet) Agent A: Job + Toil + WorkProvider abstract
  base. 3 files, 162 lines.
- gdscript-refactor (Sonnet) Agent B: JobRunner with toil-execution match
  + walk_completed signal handling + full save round-trip. 1 file, 186
  lines.
- gdscript-refactor (Sonnet) Agent C: Decision pipeline + RestProvider.
  2 files, 81 lines.
- Opus: Pawn integration (forced_job slot, orchestration, to_dict/from_dict),
  Selection rewrite, world.tscn/gd wiring, World autoload work_providers
  list, SaveSystem extension, MCP-driven runtime verification including
  the mid-walk save round-trip demo, gotcha logging.

~70% of Phase 3's GDScript was written by subagents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
megaproxy 2026-05-10 21:05:50 +01:00
parent cd265b87c0
commit 5bf0f51efb
20 changed files with 613 additions and 25 deletions

50
scenes/ai/decision.gd Normal file

@ -0,0 +1,50 @@
class_name Decision
## Static utility: picks the next Job for a pawn via the 5-layer pipeline.
##
## Layer order (top wins):
##   1. Incapacitation: has_method probe; implementation lands Phase 9.
##   2. Forced job: pawn.forced_job; cleared (consumed) on pick.
##   3. Status interrupt: stub; implementation lands Phase 9.
##   4. Work providers: iterated highest priority first; first non-null Job wins.
##   5. Idle: returns null (caller interprets as "stand still").
##
## Callers pass the world-scoped provider list so Decision is fully stateless.
## This makes it safe to call from any pawn tick without shared mutable state.

## Returns the best Job for `pawn`, or null if the pawn should idle.
##
## `work_providers` is the current world-scoped list of WorkProvider nodes
## (e.g. [RestProvider]). Order does not matter; the method sorts by priority.
## `pawn` is duck-typed: it must expose .pawn_name and .forced_job, and may
## optionally implement is_incapacitated() (probed via has_method).
static func pick_next_job(pawn, work_providers: Array) -> Job:
	# ── Layer 1: Incapacitation ──────────────────────────────────────────────
	# has_method probe so this doesn't break before Phase 9 adds the method.
	if pawn.has_method("is_incapacitated") and pawn.is_incapacitated():
		return null

	# ── Layer 2: Forced job ──────────────────────────────────────────────────
	if pawn.forced_job != null:
		var fj: Job = pawn.forced_job
		pawn.forced_job = null
		Audit.log("decision", "%s: forced '%s'" % [pawn.pawn_name, fj.label])
		return fj

	# ── Layer 3: Status interrupt ─────────────────────────────────────────────
	# Phase 9: status interrupt (Bleeding → seek bed/doctor) lands here.

	# ── Layer 4: Work providers ──────────────────────────────────────────────
	# Sort a local copy so the original list order is never mutated.
	var sorted: Array = work_providers.duplicate()
	sorted.sort_custom(func(a, b): return a.priority > b.priority)
	for wp in sorted:
		var j: Job = wp.find_best_for(pawn)
		if j != null:
			Audit.log("decision", "%s: %s → '%s'" % [pawn.pawn_name, String(wp.category), j.label])
			return j

	# ── Layer 5: Idle ────────────────────────────────────────────────────────
	# No log: it would fire every tick for every idle pawn (too chatty).
	return null
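The layer-4 ordering above can be tried in isolation. The following is a minimal sketch, not the project's code: `StubProvider`, `by_priority` and `categories` are hypothetical names introduced only for illustration, and the script assumes it is run standalone with something like `godot --headless --script <file>.gd`.

```gdscript
extends SceneTree

# Hypothetical stand-in for WorkProvider: only the two fields the sort reads.
class StubProvider:
	var category: StringName
	var priority: int

	func _init(c: StringName, p: int) -> void:
		category = c
		priority = p

# Returns a new array ordered highest priority first; `providers` is not mutated.
static func by_priority(providers: Array) -> Array:
	var sorted: Array = providers.duplicate()  # shallow copy: same objects, new array
	sorted.sort_custom(func(a, b): return a.priority > b.priority)
	return sorted

# Helper for printing: extracts the category of each provider, in order.
static func categories(providers: Array) -> Array:
	var out: Array = []
	for p in providers:
		out.append(p.category)
	return out

func _init() -> void:
	var providers: Array = [
		StubProvider.new(&"haul", 1),
		StubProvider.new(&"doctor", 5),
		StubProvider.new(&"rest", 0),
	]
	print(categories(by_priority(providers)))  # sorted view: doctor, haul, rest
	print(categories(providers))               # caller's list keeps its original order
	quit()
```

One thing to keep in mind if several providers ever share a priority: Godot documents its array sort as not stable, so the relative order of equal-priority providers should not be relied on.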

scenes/ai/decision.gd.uid Normal file

@ -0,0 +1 @@
uid://bbrqev1r5e5gh

59
scenes/ai/job.gd Normal file
View file

@ -0,0 +1,59 @@
class_name Job extends RefCounted
## A sequence of Toils that describes a pawn's current task (walk to, haul,
## build, rest, etc.). Job is plain data; JobRunner is the state machine that
## drives execution tick-by-tick.
##
## Save/load contract:
##   var j2 := Job.from_dict(j.to_dict())
##   assert(j2.label == j.label)
##   assert(j2.current_toil_index == j.current_toil_index)
##   assert(j2.toils.size() == j.toils.size())
##   # Each toil round-trips per Toil's own invariant.

var label: String = ""
var toils: Array[Toil] = []
var current_toil_index: int = 0

# ── queries ──────────────────────────────────────────────────────────────────

## Returns the currently-executing Toil, or null when the job is done.
func active_toil() -> Toil:
	if is_complete():
		return null
	return toils[current_toil_index]

## True once every toil has been completed.
func is_complete() -> bool:
	return current_toil_index >= toils.size()

# ── state mutation ───────────────────────────────────────────────────────────

## Called by JobRunner after the current toil finishes. Steps the index forward.
func advance() -> void:
	current_toil_index += 1

# ── save / load ──────────────────────────────────────────────────────────────

func to_dict() -> Dictionary:
	var toil_list: Array = []
	for toil in toils:
		toil_list.append(toil.to_dict())
	return {
		"label": label,
		"current_toil_index": current_toil_index,
		"toils": toil_list,
	}

static func from_dict(d: Dictionary) -> Job:
	var j := Job.new()
	j.label = d.get("label", "")
	j.current_toil_index = d.get("current_toil_index", 0)
	var raw_toils: Array = d.get("toils", [])
	for raw in raw_toils:
		j.toils.append(Toil.from_dict(raw))
	return j

1
scenes/ai/job.gd.uid Normal file
View file

@ -0,0 +1 @@
uid://d1mksv0d6qieu

186
scenes/ai/job_runner.gd Normal file
View file

@ -0,0 +1,186 @@
class_name JobRunner
extends Node
## Executes a Job's toils on behalf of a Pawn.
##
## Sits between the Decision layer and the Pawn's physical state. The
## Decision layer (or a WorkProvider) hands us a Job; we tick through its
## toils one-by-one and fire job_completed when the last toil is done.
##
## Design notes (docs/architecture.md — Pawn AI 5-layer pipeline):
## - JobRunner is layer 3 of 5. Don't add control-flow that belongs to
## Decision (layer 1) or WorkProvider (layer 2) here.
## - Pawn and Pathfinder are held as untyped vars to avoid class_name
## registration-order issues between autoloads and scene scripts.
## - tick() is called from Pawn._on_sim_tick each sim tick. Never spin
## render-frame work off this function.
##
## Save / load contract (NON-NEGOTIABLE, Phase 3 acceptance criterion):
## to_dict() / from_dict() round-trip mid-toil state exactly. A WALK
## toil with started=true restores correctly: on the first tick after load
## the runner sits in the "already started, waiting for walk_completed"
## branch, so pawn.walk_along_path() is NOT called again (which would
## reset the pawn's progress). The pawn finishes its own restored walk
## under its own steam, eventually fires walk_completed, and the toil is
## marked done. See _tick_walk() for the branch logic.
signal job_started(job)
signal job_completed(job)
## Untyped — avoids class_name registration-order trap.
var pawn = null
## Untyped — avoids class_name registration-order trap.
var pathfinder = null
## Current Job being executed; null when idle.
var job = null

# ── lifecycle ────────────────────────────────────────────────────────────────

## Wire refs. Must be called once before any other method.
## Connects pawn.walk_completed → _on_pawn_walk_completed.
func setup(pawn_ref, pathfinder_ref) -> void:
	pawn = pawn_ref
	pathfinder = pathfinder_ref
	pawn.walk_completed.connect(_on_pawn_walk_completed)

# ── public API ───────────────────────────────────────────────────────────────

## Replace the current job (if any) and begin executing the new one.
## Resets nothing on the new job: current_toil_index is used as-is so
## that a restored-from-save job continues from its saved toil position.
func start_job(j) -> void:
	job = j
	Audit.log(
		"job_runner",
		"%s start: %s (%d toils)" % [pawn.pawn_name, j.label, j.toils.size()]
	)
	emit_signal("job_started", j)

## Drop the current job without signalling completion.
## Any walk already in progress is left to finish naturally
## (Phase 3 simplicity; Phase 5+ may add a hard-abort path).
func cancel_job() -> void:
	job = null

## True when a job is currently assigned.
func has_job() -> bool:
	return job != null

# ── sim tick ────────────────────────────────────────────────────────────────

## Called from Pawn._on_sim_tick each sim tick.
## Executes the active toil; advances to the next when it is done;
## emits job_completed when the last toil completes.
func tick() -> void:
	if job == null:
		return
	var t = job.active_toil()
	if t == null:
		_emit_complete()
		return
	match t.kind:
		Toil.KIND_WALK:
			_tick_walk(t)
		Toil.KIND_WAIT:
			_tick_wait(t)
		Toil.KIND_IDLE:
			pass  # Never completes on its own; Decision or player overrides.
	if t.done:
		job.advance()
		if job.is_complete():
			_emit_complete()

# ── save / load ──────────────────────────────────────────────────────────────

## Serialise the runner's persistent state.
##   {"job": <dict or null>}
func to_dict() -> Dictionary:
	return {
		"job": job.to_dict() if job != null else null,
	}

## Restore from a dict produced by to_dict().
## If the "job" key holds a Dictionary, reconstructs a Job via Job.from_dict().
## Otherwise (null or missing) the job is cleared, so a runner restored from a
## save with no job does not keep a stale pre-load job.
func from_dict(d: Dictionary) -> void:
	var job_data = d.get("job", null)
	if job_data is Dictionary:
		job = Job.from_dict(job_data)
	else:
		job = null

# ── signal handlers ──────────────────────────────────────────────────────────

## Fired by the Pawn when it finishes walking its path.
## Marks the active WALK toil done so the next tick() advances past it.
## Does NOT call job.advance() directly; tick() handles that.
func _on_pawn_walk_completed() -> void:
	if job == null:
		return
	var t = job.active_toil()
	if t != null and t.kind == Toil.KIND_WALK:
		t.done = true

# ── toil executors ──────────────────────────────────────────────────────────

## Execute one tick of a WALK toil.
##
## On the FIRST tick (started=false):
##   - If the pawn is already at the destination, complete immediately.
##   - Otherwise ask the pathfinder for a route. If unreachable, log and
##     complete (skip-and-continue; the WorkProvider is responsible for
##     vetting reachability before issuing the job).
##   - Hand the path to the pawn and mark started=true. From now on this
##     function is a no-op: we just wait for the walk_completed signal.
##
## On SUBSEQUENT ticks (started=true):
##   - No-op. The pawn walks under its own steam.
##
## After LOAD (started=true from saved state):
##   - Same as subsequent ticks: the pawn restores its own path and fires
##     walk_completed when it arrives. We do NOT call walk_along_path again.
func _tick_walk(t) -> void:
	if not t.data.get("started", false):
		var dest: Vector2i = t.get_walk_destination()
		if pawn.tile == dest:
			t.done = true
			return
		var path: Array[Vector2i] = pathfinder.find_path(pawn.tile, dest)
		if path.is_empty():
			Audit.log(
				"job_runner",
				"%s unreachable: %s → %s" % [pawn.pawn_name, pawn.tile, dest]
			)
			t.done = true
			return
		pawn.walk_along_path(path)
		t.data["started"] = true

## Execute one tick of a WAIT toil.
## Decrements the counter; sets done when it reaches zero.
func _tick_wait(t) -> void:
	t.data["ticks_remaining"] -= 1
	if t.data["ticks_remaining"] <= 0:
		t.done = true

# ── helpers ──────────────────────────────────────────────────────────────────

## Emit job_completed, log, and clear the job reference.
func _emit_complete() -> void:
	var completed = job
	job = null
	Audit.log(
		"job_runner",
		"%s done: %s" % [pawn.pawn_name, completed.label]
	)
	emit_signal("job_completed", completed)
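The mid-toil save contract rests on the `started` flag guarding the call to `walk_along_path`. As a rough illustration only, the sketch below reduces that guard to a bare `data` dictionary and a counting stub. `StubPawn` and `tick_walk` are hypothetical names, not part of the commit, and the script assumes a standalone run via `godot --headless --script <file>.gd`.

```gdscript
extends SceneTree

# Hypothetical stand-in for Pawn: counts how often a walk is (re)started.
class StubPawn:
	var walk_calls: int = 0

	func walk_along_path(_path: Array) -> void:
		walk_calls += 1

# Mirrors only the started-guard of _tick_walk, on a bare toil `data` dictionary.
static func tick_walk(pawn: StubPawn, data: Dictionary, path: Array) -> void:
	if not data.get("started", false):
		pawn.walk_along_path(path)
		data["started"] = true

func _init() -> void:
	var path: Array = [Vector2i(1, 0)]

	# Fresh toil: the first tick starts the walk, later ticks are no-ops.
	var fresh_pawn := StubPawn.new()
	var fresh := {"to_x": 70, "to_y": 70, "started": false}
	for _i in 3:
		tick_walk(fresh_pawn, fresh, path)
	print("fresh toil, walk calls: ", fresh_pawn.walk_calls)

	# Toil restored from a save with started=true: the walk is never restarted.
	var restored_pawn := StubPawn.new()
	var restored := {"to_x": 70, "to_y": 70, "started": true}
	for _i in 3:
		tick_walk(restored_pawn, restored, path)
	print("restored toil, walk calls: ", restored_pawn.walk_calls)
	quit()
```

The fresh toil triggers exactly one walk and the restored toil triggers none, which is why the pawn's own restored `_path` has to carry the rest of the journey after a load.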

scenes/ai/job_runner.gd.uid Normal file

@ -0,0 +1 @@
uid://8v4lqcrhx1eu

scenes/ai/rest_provider.gd Normal file

@ -0,0 +1,31 @@
class_name RestProvider extends WorkProvider
## Phase 3 smoke-test WorkProvider: sends every pawn to a shared rest tile.
##
## If the pawn is already at rest_tile, returns a walk-less idle-forever job.
## Otherwise prepends a walk_to toil before the idle toil.
##
## No internal state beyond rest_tile; Decision's log line carries all
## the info needed for debugging (pawn name + provider category + job label).

## The tile pawns walk toward. Set by the world scene on instantiation.
@export var rest_tile: Vector2i = Vector2i(40, 40)

func _init() -> void:
	category = &"rest"
	priority = 0  # Only provider in Phase 3; no relative ordering needed yet.

## Returns a Job for `pawn`. Never returns null: Rest always has something
## to offer (walk there, or idle in place).
## `pawn` is duck-typed: must expose .tile (Vector2i).
func find_best_for(pawn) -> Job:
	var j := Job.new()
	j.label = "Rest at %s" % rest_tile
	if pawn.tile != rest_tile:
		j.toils.append(Toil.walk_to(rest_tile))
	j.toils.append(Toil.idle())
	return j

scenes/ai/rest_provider.gd.uid Normal file

@ -0,0 +1 @@
uid://dyacrro784lvo

76
scenes/ai/toil.gd Normal file
View file

@ -0,0 +1,76 @@
class_name Toil extends RefCounted
## A single atomic step within a Job: walk, wait, idle, etc.
##
## Save/load contract: every value in `data` MUST be JSON-safe.
## Vector2i is NOT JSON-safe in Godot 4, so tile coordinates are stored as
## "to_x"/"to_y" integer keys, never as Vector2i. get_walk_destination()
## reconstructs Vector2i on demand.
##
## Round-trip invariant:
##   var t2 := Toil.from_dict(t.to_dict())
##   assert(t2.kind == t.kind and t2.done == t.done and t2.data == t.data)

const KIND_WALK: StringName = &"walk"
const KIND_WAIT: StringName = &"wait"
const KIND_IDLE: StringName = &"idle"

var kind: StringName = KIND_IDLE
## Toil-specific params; all values must be int, float, bool, String, Dictionary, or Array.
var data: Dictionary = {}
## Set by JobRunner when this toil is complete.
var done: bool = false

# ── factories ────────────────────────────────────────────────────────────────

## Walk to the given tile. Stores coords as separate ints for JSON safety.
static func walk_to(tile: Vector2i) -> Toil:
	var t := Toil.new()
	t.kind = KIND_WALK
	t.data = {
		"to_x": tile.x,
		"to_y": tile.y,
		"started": false,
	}
	return t

## Pause for `n` sim ticks.
static func wait_ticks(n: int) -> Toil:
	var t := Toil.new()
	t.kind = KIND_WAIT
	t.data = {"ticks_remaining": n}
	return t

## Stand idle. Never completes on its own; JobRunner must cancel or replace.
static func idle() -> Toil:
	var t := Toil.new()
	t.kind = KIND_IDLE
	t.data = {}
	return t

# ── save / load ──────────────────────────────────────────────────────────────

func to_dict() -> Dictionary:
	return {
		"kind": str(kind),
		"data": data.duplicate(true),
		"done": done,
	}

static func from_dict(d: Dictionary) -> Toil:
	var t := Toil.new()
	t.kind = StringName(d.get("kind", str(KIND_IDLE)))
	t.data = (d.get("data", {}) as Dictionary).duplicate(true)
	t.done = d.get("done", false)
	return t

# ── convenience ──────────────────────────────────────────────────────────────

## Rebuild Vector2i from the JSON-safe int fields. Only valid for KIND_WALK.
func get_walk_destination() -> Vector2i:
	return Vector2i(data.get("to_x", 0), data.get("to_y", 0))
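The reason for the `to_x`/`to_y` encoding can be checked with a small standalone script. This is a sketch under assumptions: `tile_to_data`, `data_to_tile` and `json_round_trip` are hypothetical helpers written for illustration, and the explicit `int()` cast is an extra precaution in this sketch (JSON numbers are parsed back as floats), not something the commit's `get_walk_destination()` does.

```gdscript
extends SceneTree

# Encode a tile the way Toil.walk_to does: two plain ints.
static func tile_to_data(tile: Vector2i) -> Dictionary:
	return {"to_x": tile.x, "to_y": tile.y}

# JSON numbers are parsed back as float, so cast before rebuilding the Vector2i.
static func data_to_tile(data: Dictionary) -> Vector2i:
	return Vector2i(int(data.get("to_x", 0)), int(data.get("to_y", 0)))

# Push a value through JSON text and back, as a save file would.
static func json_round_trip(value: Variant) -> Variant:
	return JSON.parse_string(JSON.stringify(value))

func _init() -> void:
	var tile := Vector2i(70, 26)

	# Naive: store the Vector2i directly and see what type comes back.
	var naive = json_round_trip({"to": tile})
	print("naive value is still a Vector2i: ", typeof(naive["to"]) == TYPE_VECTOR2I)

	# JSON-safe: store two ints, rebuild the Vector2i on demand.
	var safe = json_round_trip(tile_to_data(tile))
	print("int-encoded tile survives: ", data_to_tile(safe) == tile)
	quit()
```

Running this on the project's Godot version is a quick way to confirm the behaviour the contract describes before relying on it for new toil kinds.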

1
scenes/ai/toil.gd.uid Normal file
View file

@ -0,0 +1 @@
uid://djmc0woq4u65m

scenes/ai/work_provider.gd Normal file

@ -0,0 +1,27 @@
class_name WorkProvider extends Node
## Abstract base for all work-category providers (Construction, Mining,
## Hauling, Cooking, …). Subclass this and override find_best_for().
##
## Pawn AI layer 2: the Decision layer iterates the provider list, highest
## priority first, and calls find_best_for(pawn) until one returns a
## non-null Job.
##
## `pawn` is intentionally untyped (duck-typed) to avoid class_name
## init-order issues. Concrete providers access pawn.tile, pawn.pawn_name,
## pawn.is_walking(), etc., the same public API exposed by Pawn.

## Work category key used to identify this provider. Must be unique per
## provider instance; used by the priority matrix and Decision layer.
@export var category: StringName = &"unspecified"

## Priority slot in the pawn's work-priority matrix.
## Higher values are scanned first by the Decision layer.
@export var priority: int = 0

# ── abstract interface ───────────────────────────────────────────────────────

## Concrete providers MUST override this.
## Return a Job for `pawn` to execute, or null if no suitable work exists.
func find_best_for(pawn) -> Job:
	push_error("WorkProvider.find_best_for: subclass '%s' must override this method" % name)
	return null

scenes/ai/work_provider.gd.uid Normal file

@ -0,0 +1 @@
uid://vi08by1dh0lb