Ralph Orchestrator (,ralph go)
One entry point, ,ralph go, drives an opinionated planner -> executor -> reviewer -> re_reviewer loop with persistent state, self-healing (replans on RALPH_REPLAN), and full tmux observability. Both reviewer and re_reviewer run on every iteration, in sequence, so two different model families always cover each other's gaps. After a passing run, an optional reflector role distills durable lessons into the AI knowledge base so subsequent runs benefit from the learnings.
Use when spawning agents, kicking off a run, verifying a run, replanning, or interacting with the tmux Ralph control plane.
| Component | Source |
|---|---|
| CLI entry | home/exact_bin/executable_,ralph + scripts/ralph.py |
| Roles + diversity | home/dot_config/ralph/roles.json |
| Role prompts | home/dot_config/ralph/prompts/ |
| Dashboard (TUI) | tools/ralph-tui/ |
| Skill | home/exact_dot_agents/exact_skills/exact_ralph |
Roles and the diversity gate
Roles are configured in ~/.config/ralph/roles.json:
| Role | Default harness | Default model | Mode flag |
|---|---|---|---|
planner | cursor | claude-opus-4-8-thinking-xhigh | --mode plan |
executor | cursor | composer-2.5 | --force |
reviewer | cursor | claude-opus-4-8-thinking-xhigh | --mode ask |
re_reviewer | cursor | gpt-5.5-extra-high | --mode ask |
reflector | cursor | claude-opus-4-8-thinking-xhigh | --mode ask |
Anthropic-side roles pin Opus 4.8 at xhigh effort, exclusively — Fable 5 is listed by cursor-agent models but not usable on this account, and other effort tiers (max, high, …) are deliberately not used; the TUI picker and CURSOR_MODELS exclude Fable and non-xhigh Opus variants accordingly.
Model ids in the cursor harness are Cursor-internal aliases resolved by Cursor's own routing (list them with cursor-agent models), not entries in ai_models.yaml — the registry only covers gateway/openrouter ids used by pi. ralph.py validates cursor models against CURSOR_MODELS, a pinned snapshot of that live list; refresh the snapshot from cursor-agent models when Cursor rotates model generations.
Defaults are cursor-first because cursor's frontier models give the strongest output and judgement on this user's setup; pi stays fully supported and is required for non-cursor providers (anthropic/openai/google direct, openrouter, llama-cpp). The orchestrator enforces family_of(re_reviewer.model) != family_of(reviewer.model) (substring match on claude|gpt|gemini|llama|mistral|deepseek) so the mandatory second opinion never comes from the same family. --mode plan (planner) and --mode ask (reviewer/re_reviewer) are read-only — they prevent role hijacking (small models with full tools tend to skip JSON and just execute the goal) while still allowing read access for verification probes. --force on the executor auto-approves shell commands so the orchestrator can drive non-interactively. On pi the equivalent role-scoping is --no-tools for planner/reviewer/re_reviewer. Per-role prompts live at home/dot_config/ralph/prompts/.
Local models (llama-cpp / qwen3.6) are opt-in only; defaults never depend on ,llama-cpp serve being up. Swap them in by editing roles.json (e.g. point executor.harness=pi, executor.model=llama-cpp/local). See llama.cpp local inference.
Elastic-gated /review skill invocation
For elastic-belonging codebases — operator's day job — the reviewer and re-reviewer roles invoke the operator's review skill directly. The skill's verification disciplines become the primary instruction; Ralph's existing JSON output contract is preserved as the wire format. Non-elastic workspaces are unchanged.
- Detection:
is_elastic_workspace(path)parsesgit remote -vand matches(github\.com[:/])elastic/against any remote URL (HTTPS or SSH;originorupstream). Best-effort — non-git directories, missing paths, orgitfailures all yieldFalse. - Wiring:
elastic_review_preamble(role)reads~/.agents/skills/review/references/{judging_core,shared_rules,local_changes}.mdand renders a## REVIEW SKILL HEURISTICS (elastic)block containing (a) "you are running the/reviewskill in local_changes mode", (b) the skill'sjudging_core.mdcontent verbatim (the surface-agnostic verification disciplines: coverage checklist, severity, truth validation, the state-machine / deletion-safety / historical-rationale gates, and the post-review four-dimension lens + stage), (c) the skill'sshared_rules.mdcontent verbatim (PR/SCSI/GitHub-delivery rules), (d) the skill'slocal_changes.mdcontent verbatim, and (e) a format-normalization note translating the skill's "fix in working tree" guidance into Ralph'scriteria_unmet+next_taskJSON fields. The block is prepended to the dynamic context BEFORE## SPEC, so the model reads the skill instruction first then applies it to the inputs. - Override:
RALPH_REVIEW_SKILL_DIR=<path>swaps in a different skill source directory (useful for testing alternate review heuristics without touching~/.agents/). - Graceful degradation: when the skill files are missing the preamble silently degrades to empty (no crash) and the default review path runs unchanged. Operators without the skill installed see no behavior change.
- Output contract: unchanged. Reviewer still emits
{verdict, criteria_met, criteria_unmet, next_task, blocking_reason, notes}; re-reviewer still emits{agree_with_primary, final_verdict, ...}. The orchestrator's verdict parser is not aware that the elastic preamble was injected.
Runtime data
Runtime data stays outside chezmoi (per-session isolation, never under the project worktree):
| Data | Default path | Override |
|---|---|---|
| Knowledge capsules | ~/.local/share/ai-kb/ | AI_KB_HOME |
| Ralph run manifests | ~/.local/state/ralph/runs/ | RALPH_STATE_HOME |
| Ralph roles config | ~/.config/ralph/roles.json | RALPH_ROLES_CONFIG |
Core workflow
,ai-kb remember --title "Project rule" --body "Keep generated state out of git."
,ralph dry-run --goal "Memory rehearsal" # render the prompt only
,ralph go --goal "Build a tiny CLI tool" --workspace "$(mktemp -d)"
,ralph go --goal "Refactor module" --plan-only # stop after planner
,ralph go --goal "Refactor module" --workflow research # workflow hint
,ralph go --goal "Refactor module" \
--reviewer-model claude-sonnet-4-7 --re-reviewer-model gpt-5.5 # per-role overrides
,ralph answer <run-id> --json - <<< '{"q-1":"yes, the cache is ok"}' # post answers when parked at awaiting_human
,ralph runner <run-id> # internal: drive the state machine
,ralph resume <run-id> # re-launch runner if it died (no-op if alive/terminal)
,ralph replan <run-id> # queue replan; runner consumes it next loop
,ralph supervisor --json # resume dead non-terminal runners when safe
Tmux-native mode (default when $TMUX is set the runner detaches and your shell returns immediately; --foreground blocks inline; --subprocess skips tmux entirely):
,ralph go --goal "Build a tiny artifact" # detached runner; observe via dashboard
,ralph go --foreground --goal "Block until done" # inline state-machine drive
,ralph runs --json --session "$(tmux display-message -p '#S')"
,ralph preview <run-id> # rich summary; --mode tail for live tail
,ralph dashboard # alias for prefix+A: execs ~/.local/bin/ralph-tui
,ralph attach <run-id> --role executor-1
,ralph verify <run-id>
,ralph kill <run-id> [--role executor-1] # SIGINT pane(s), mark killed
,ralph kill --all # bulk-kill every non-terminal run
,ralph rm <run-id> # archive + drop ai-kb capsules
,ralph statusline # one-line tmux status segment + (^A) hint
Resumability: every iteration is a state machine driven by the manifest on disk. If the runner process dies (Ctrl-C, SSH drop, host reboot), ,ralph resume <run-id> re-launches it and the loop picks up at the earliest pending phase: per-iteration phases are pending -> exec -> review -> rereview -> decided. Role spawns are idempotent at the parent-manifest level: a role with status=completed in the manifest cache is reused without re-spawning; if a tmux role pane is still alive after a runner crash, the runner reuses that pane and waits for its exit marker instead of spawning a duplicate. Runner liveness is detected via an exclusive flock on <run_dir>/runner.pid. ,ralph supervisor can resume dead non-terminal runners and skips runs parked for manual control.
Multi-Ralph dashboard
The dashboard (tools/ralph-tui/, prefix + A opens the popup) is a Bubble Tea TUI installed at ~/.local/bin/ralph-tui by run_onchange_after_06-build-ralph-tui.sh.tmpl. It reads run state directly from the manifest tree under $RALPH_STATE_HOME via fsnotify, so updates are live without polling. All mutating actions still shell out to ,ralph so the orchestrator stays the only writer.
Layout:
- Left pane: scrollable list of every run (newest first).
*= runner alive,R= replan queued. Status colors track validation/phase. - Right top: selected run's header (id, goal, phase, status, runner) and a roles table with per-iteration history.
- Right bottom: live tail of the selected role's
output.log. - Modals: new-run form (
n), control menu (c), help overlay (?).
Keybindings:
- Fleet view rows: heartbeat dot (bright violet
●alive+fresh, amber●alive+stale, red●dead,Rreplan queued, blank never started), name, status badge, phase,n/Niterations vsmax_iterations, and a verdict-colored sparkline (green=pass, red=fail, amber=replan, blue=in-flight, dim=pending). AQ:Nbadge surfaces open clarifying questions. j/k(or arrows) move within the focused pane;tabcyclesruns -> roles -> tail;enter(or3) zooms the focused pane to fullscreen;1/2/3switch between detail / 2x2 role grid / zoom layouts.scycles the runs list sort (needparks-on-questions+live-work first,recentnewest first);Stoggles the cross-run activity drawer (recent decisions + verdicts across all parentkind=goruns).enter(on a run) attaches viatmux switch-client -t ralph-<short-rid>;enter(on a role) attaches to that role's window.popens a tmux capture-pane preview of the focused role's pane (read-only, refreshes every second + onr); inside the modal, capitalAlaunchestmux display-popup -E -- tmux attach-session -r -t TARGETfor a read-only popup attach without leaving the TUI.nopens the new-run form. Fields: goal, workspace, plan-only, workflow picker (auto/feature/bugfix/review/research), plus per-role harness AND model pickers for planner / executor / reviewer / re_reviewer. Pickers cycle withh/lor←/→;j/k(or↓/↑, ortab/shift+tab) move down/up between fields (gated so the goal/workspace text inputs still accept literal characters);enteradvances harness → matching model → next role. Harness cyclescursor → pi → command; model lists curated per harness intools/ralph-tui/internal/state/models.go;autoworkflow lets the planner pick, otherwise--workflow <name>is forwarded to,ralph go.A(capital) opens the answer modal when the selected run is parked atawaiting_human. One text input per open question;tab/j/k/shift+tabmove focus,enteradvances or submits,esc/ctrl+ccancels (qcancels only when focus is not on a text input). Submission pipes JSON answers through,ralph answer <rid> --json -so the orchestrator can resume. The status bar showsQ:Σaggregated across the fleet.K(capital) opens the AI KB browser modal. Type a query andenterto dispatch a hybrid search (lexical + dense, RRF + MMR, superseded capsules excluded);↑/↓move between hits to expand metadata (kind/scope/confidence/refs) on the right;esc/qcloses. The status bar showsKB:N(total non-superseded capsules).vverify;rmanual refresh;Rresume runner;Preplan;xkill;Xrm.xandXopen a confirmation modal before dispatching./filter the runs list;copens the control menu (verify, takeover, dirty, kill, rm, replan, resume);?toggles help;qquit.
Multi-Ralph isolation contract:
- Each
,ralph gorun owns a dedicated tmux session namedralph-<short-rid>. Multiple runs coexist without polluting the user's main session. - The dashboard never holds tmux state; quitting (
q) does not affect any running runners or sessions. kill <rid>andrm <rid>only touch their own dedicated session; concurrent runs are unaffected (covered byscripts/tests/test_scripts.py::TestRalphMultiRunIsolation).
Other tmux integrations:
prefix + Aopens the dashboard popup (only top-level prefix key Ralph claims). The popup runs~/.local/bin/ralph-tuidirectly; if the user picks a run +enter, the TUI exits cleanly andtmux switch-clientjumps to that run's session.prefix + Ris untouched (still reloads tmux config); start runs from the dashboard (n) or the palette.- The command palette (
prefix + r) lists Ralph entries (dashboard,ralph:start-goprompt,ralph:plan-onlyprompt, verify latest, attach prompt, doctor) that firetmux command-promptinstead of dumping text. - The session picker (
prefix + T) tags Ralph-owned tmux sessions with a colored badge (ralph✓ralph?ralph✗ralph●ralph⨯) and shows the matching,ralph runs --session …block in the preview. - The GitHub picker (
prefix + G)alt-Aon a PR/issue stages a Ralph handoff: closes the picker, resolves the matching worktree (or$PWD), and opens a,ralph gogoal prompt seeded with the PR/issue title. - A status-bar segment (appended after TPM/Catppuccin) shows
R:<running>andV:<needs_verification>counts via,ralph statusline.
Dashboard / control-plane invariants:
- Source of truth:
~/.local/state/ralph/runs/<run-id>/manifest.json. Mutate via the CLI only. - Each
gorun recordsphase(planning|executing|reviewing|rereviewing|replanning|done|failed|blocked),iterations[](each with its ownphase(pending|exec|review|rereview|decided),verdict,executor_id,reviewer_id,re_reviewer_id,task,next_task,spec_seq),roles{}(pane handles for planner-N / executor-N / reviewer-N / re_reviewer-N),spec,spec_seq,learned_ids, andrunner(pid + host + heartbeat + alive bit). spec.target_artifactis promoted to top-levelmanifest.artifact; on a passing verdict Ralph freezesartifact_sha256, andverifyrequires the artifact hash to match.executor_countmust be1until real multi-executor orchestration exists. Planner output with a higher value fails fast instead of being silently ignored.- Iteration records are appended at iteration START (phase=pending) and updated as phases progress; the runner is fully resumable from the manifest alone — see Resumability above.
- Human control: every role records pane handle, status, last output path, and
control_state(automated|manual_control|dirty_control|resume_requested). - Low-token observability: dashboards read manifests, logs, and tmux pane tails directly. No LLM is invoked for summarization unless the user explicitly requests a review/triage agent.
- Validation gates: a
gorun ispassedonly when the orchestrator loop exits withstatus=completed, the final verdict ispass, every role child isautomated, every role child passed validation, and the artifact hash gate passes whentarget_artifactis declared. - Manual takeover:
,ralph control <run-id> --role <role> --action takeover|dirty|resume|auto.resumeauto-runsverify. - Isolation: workspace and ralph state stay separated by run ID; worktrees remain source-code context, not scratch storage.
Verification
uvx pytest scripts/tests/ # python suite (ralph + isolation + resumability + artifact/control gates + workflows + answer + KB schema/hybrid retrieval/reflector/doc-ingest/curation)
( cd tools/ralph-tui && go test ./... ) # TUI suite (state, cmds, forms, runs, detail, answer, preview, activity, kb, app/view)
AI_KB_HOME="$(mktemp -d)" RALPH_STATE_HOME="$(mktemp -d)" \
,ralph go --goal "create $(mktemp -d)/hello.txt with content 'hi'" \
--workspace "$(mktemp -d)" --subprocess
--runtime local is deterministic and still exists for ,ralph dry-run smoke tests; ,ralph go itself reads runtimes from roles.json (Pi rich primary, Cursor Agent secondary).
Related
- Agent memory — the AI KB that Ralph reads from and writes to
- Review workflow — the skill reviewer/re-reviewer roles invoke on elastic repos
- llama.cpp local inference — opt-in local models for roles
- The Agentic Operating System — governance layer