Worker gate

The worker gate is the worker-side reliability check for long-running agent jobs. It exists because the end of an agent session does not mean the machine is idle: detached processes may still be running. Older field and endpoint names may still say wake_gate; treat that as compatibility naming for the same feature. For how the worker gate fits the current host topology, callback retry model, and project decision artifact contract, see the current runtime snapshot.

What it observes

The code models process snapshots, process info, telemetry samples, run records, and gate callbacks. Configuration supports sample_interval_sec, idle_sustain_sec, CPU and GPU idle thresholds, VRAM delta thresholds, workload profiles, and stale-project process reaper settings. Supported workload classes in the config model are unknown, cpu_only, gpu_required, inference_eval, training, control_plane, and agent_harness. The cpu_only and gpu_required classes map to machine-target routing through workload_machine_targets and worker_targets.
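As a rough illustration of how these settings might fit together, here is a minimal Python sketch. Only sample_interval_sec, idle_sustain_sec, workload_machine_targets, worker_targets, and the workload class names come from the text above. The threshold field names (cpu_idle_pct, gpu_idle_pct, vram_delta_mb), all default values, the shape of the two mapping fields, and the resolve_target helper are assumptions made for the example, not the project's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class WorkloadClass(str, Enum):
    # Class names as listed in the config model.
    UNKNOWN = "unknown"
    CPU_ONLY = "cpu_only"
    GPU_REQUIRED = "gpu_required"
    INFERENCE_EVAL = "inference_eval"
    TRAINING = "training"
    CONTROL_PLANE = "control_plane"
    AGENT_HARNESS = "agent_harness"


@dataclass
class GateConfig:
    sample_interval_sec: float = 5.0   # illustrative default
    idle_sustain_sec: float = 60.0     # illustrative default
    cpu_idle_pct: float = 10.0         # hypothetical field name
    gpu_idle_pct: float = 5.0          # hypothetical field name
    vram_delta_mb: float = 64.0        # hypothetical field name
    # Assumed shape: workload class name -> machine target name.
    workload_machine_targets: Dict[str, str] = field(default_factory=dict)
    # Assumed shape: machine target name -> worker base URL.
    worker_targets: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> None:
        if self.sample_interval_sec <= 0:
            raise ValueError("sample_interval_sec must be positive")
        if self.idle_sustain_sec < self.sample_interval_sec:
            raise ValueError("idle_sustain_sec must cover at least one sample")
        for workload, target in self.workload_machine_targets.items():
            WorkloadClass(workload)  # raises ValueError on an unknown class
            if target not in self.worker_targets:
                raise ValueError(f"no worker target named {target!r}")


def resolve_target(cfg: GateConfig, workload: WorkloadClass) -> Optional[str]:
    """Return the worker URL routed for a workload class, or None if unrouted."""
    target = cfg.workload_machine_targets.get(workload.value)
    return cfg.worker_targets.get(target) if target is not None else None
```

Validating the routing tables up front means a typo in a workload class or target name fails at load time rather than at dispatch time.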

Gate idea

A run should only advance when relevant processes are gone or stale-safe, and telemetry has remained under configured thresholds for the sustain window. This avoids advancing the queue while a detached worker process still consumes resources.
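The rule above can be sketched as a pure function over recent telemetry. This is an illustration under stated assumptions, not the project's implementation: the Sample type, its field names, the threshold parameter names, and the choice to require sample history that reaches back to the start of the sustain window are all hypothetical. VRAM delta checks are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    t: float        # sample time in seconds (hypothetical field)
    cpu_pct: float  # CPU utilization, 0-100 (hypothetical field)
    gpu_pct: float  # GPU utilization, 0-100 (hypothetical field)


def gate_satisfied(
    samples: Sequence[Sample],  # must be sorted by ascending t
    live_processes: int,        # relevant processes still running and not stale-safe
    now: float,
    idle_sustain_sec: float,
    cpu_idle_pct: float,
    gpu_idle_pct: float,
) -> bool:
    # Condition 1: no relevant process may still be alive.
    if live_processes > 0:
        return False
    window_start = now - idle_sustain_sec
    # Condition 2a: history must reach back to the start of the sustain window,
    # otherwise there is not enough evidence to call the machine idle.
    if not samples or samples[0].t > window_start:
        return False
    # Condition 2b: every sample inside the window must be under both thresholds.
    recent = [s for s in samples if s.t >= window_start]
    if not recent:
        return False
    return all(
        s.cpu_pct <= cpu_idle_pct and s.gpu_pct <= gpu_idle_pct for s in recent
    )
```

A single spike inside the window resets the decision, which is the point of a sustain window: a brief dip in utilization between two bursts of work should not advance the queue.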

Callback behavior

When the gate is satisfied, the app can send a completion callback to completion_callback_url using completion_callback_token. Deprecated n8n_* config aliases exist for older private prototypes, but new docs and configs should use completion_callback_*.
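For illustration, a completion callback could be assembled as below. Only completion_callback_url and completion_callback_token are taken from the text above. The POST method, the JSON payload fields (run_id, gate_satisfied), and sending the token as a Bearer Authorization header are assumptions for this sketch; check the receiving service for the actual contract.

```python
import json
import urllib.request


def build_completion_callback(
    completion_callback_url: str,
    completion_callback_token: str,
    run_id: str,
    gate_satisfied: bool,
) -> urllib.request.Request:
    """Build, but do not send, the callback request."""
    # Payload shape is hypothetical.
    body = json.dumps({"run_id": run_id, "gate_satisfied": gate_satisfied})
    return urllib.request.Request(
        completion_callback_url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the token travels as a Bearer credential.
            "Authorization": f"Bearer {completion_callback_token}",
        },
    )
```

Building the request separately from sending it keeps the function easy to test; a caller would pass the result to urllib.request.urlopen with a timeout and apply its own retry policy.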

Stale process reaper

The reaper can identify stale project processes by command markers such as llama-cli, llama-server, vllm, and sglang. Keep the marker list conservative. Do not add broad substrings that could terminate unrelated user processes.
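One way to keep matching conservative is to compare markers against the executable name, or the top-level module of a python -m invocation, instead of searching the whole command line for substrings. The sketch below assumes the process command line is available as an argv list; the function name and matching rules are hypothetical, and it covers only marker matching, not the staleness or project-ownership checks a reaper would also need.

```python
import os
from typing import Optional, Sequence

# Marker list taken from the text above.
STALE_MARKERS = ("llama-cli", "llama-server", "vllm", "sglang")


def command_marker(
    argv: Sequence[str], markers: Sequence[str] = STALE_MARKERS
) -> Optional[str]:
    """Return the marker this command matches, or None."""
    if not argv:
        return None
    args = list(argv)
    exe = os.path.basename(args[0])
    # Exact match on the executable name, never a substring search.
    if exe in markers:
        return exe
    # `python -m pkg.module ...`: compare only the top-level package name.
    if exe.startswith("python") and "-m" in args[1:]:
        i = args.index("-m", 1)
        if i + 1 < len(args):
            top_level = args[i + 1].split(".")[0]
            if top_level in markers:
                return top_level
    return None
```

With this rule, an editor opened on notes-about-vllm.txt or a tail of llama-server.log is not matched, because the marker appears only in an argument and not as the program being run.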

Operator guidance

Tune thresholds per workload, use dry-run dispatch and preflight endpoints first, keep worker tokens distinct where practical, and treat worker-gate evidence as operational evidence rather than scientific validation of generated results.