HVE Cockpit: a host-agnostic web GUI for steering agentic coding sessions #2271

Draft
MichaelDanCurtis wants to merge 185 commits into microsoft:main from MichaelDanCurtis:design/rpi-cockpit

@MichaelDanCurtis

Draft / for discussion. This proposes a body of work developed on a fork
(MichaelDanCurtis/hve-core) back to the upstream it was forked from. It is
opened as a draft to start a conversation, not as a merge-ready change.

What this is

The HVE Cockpit (rpi-cockpit/) is a host-agnostic web GUI that makes an
agentic coding session legible and steerable. An agent narrates its work
over an MCP bridge; the cockpit renders one view per kind of work (an RPI build
loop, a reviewer's findings panel, a backlog kanban, a team board, a codebase
map, and more), and the user steers through a directive queue the agent drains.
The charter boundary is strict: the cockpit captures intent (steer,
decisions, interventions) and the agent performs; the cockpit never launches or
controls agents itself.

Architecture

A one-way data spine: a typed beat (zod union) -> a pure reducer ->
a pure view-model projection -> WebSocket broadcast -> an unbundled
vanilla-JS browser client. MCP tools emit beats via a bridge. No frontend
framework, no graph library. Tests are Vitest + happy-dom.
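
As a rough illustration of the spine described above, the following sketch runs two beats through a pure reducer and a pure projection. It is not the cockpit's code: the beat kinds, `State`, `ViewModel`, and `broadcastPayload` are hypothetical, plain TypeScript types stand in for the zod union so the block stays dependency-free, and a JSON string stands in for the WebSocket broadcast.

```typescript
// Hypothetical beat shapes; the real cockpit defines its beats as a zod union.
type Beat =
  | { kind: "phase_started"; phase: string }
  | { kind: "finding_added"; severity: "major" | "minor"; text: string };

interface State {
  phase: string | null;
  findings: { severity: "major" | "minor"; text: string }[];
}

interface ViewModel {
  title: string;
  majorCount: number;
}

const initialState: State = { phase: null, findings: [] };

// Pure reducer: returns a new state and never mutates its input.
function applyBeat(state: State, beat: Beat): State {
  switch (beat.kind) {
    case "phase_started":
      return { ...state, phase: beat.phase };
    case "finding_added":
      return {
        ...state,
        findings: [...state.findings, { severity: beat.severity, text: beat.text }],
      };
  }
}

// Pure projection: derives what the client renders from state alone.
function project(state: State): ViewModel {
  return {
    title: state.phase ?? "idle",
    majorCount: state.findings.filter((f) => f.severity === "major").length,
  };
}

// Stand-in for the WebSocket broadcast: serialize the view model for the client.
function broadcastPayload(state: State): string {
  return JSON.stringify(project(state));
}

const beats: Beat[] = [
  { kind: "phase_started", phase: "research" },
  { kind: "finding_added", severity: "major", text: "camera clips wide graphs" },
];
const finalState = beats.reduce(applyBeat, initialState);
console.log(broadcastPayload(finalState));
```

Because both `applyBeat` and `project` are pure, a test can replay any beat sequence and compare the resulting view model without a browser or a socket.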

What's in it

  • 11 loop-view domains (RPI, reviewer findings, guided interview, backlog
    kanban, team board, 3D codebase map, data profile, gallery, prompt workbench,
    memory, and an n8n-style flow canvas for the gh-aw agentic-workflow
    pipeline), plus a representation map of shared primitives (timeline, decision
    flow, list, question, screen, context badges, app frame).
  • 45 MCP tools the agent narrates with; a cross-process live pane (a
    producer MCP server writes a state snapshot the consumer pane renders).
  • A WCAG 2.2 / responsive hardening pass and a Fluent + motion visual layer.
  • Full test coverage (Vitest), tsc clean.

Scope of this PR

  • rpi-cockpit/ — the cockpit package (source, the browser client, tests, and
    the design/plan docs under rpi-cockpit/docs/).
  • Integration touchpoints in the repo root, produced by the cockpit's own
    init step: registering the MCP server (.mcp.json, .vscode/mcp.json), a
    marker-delimited narration-contract block in CLAUDE.md / AGENTS.md /
    .github/copilot-instructions.md, narration pointers in the
    .github/agents/hve-core/* agent files, .gitignore and a preview
    launch.json, plus the original design docs under docs/.

The root-level changes are the natural discussion point for maintainers: they
are how the cockpit wires into the repo, and are easy to pare back to a
cockpit-only contribution if preferred.

Status

Developed iteratively (brainstorm -> spec -> plan -> subagent-driven build with
per-task and whole-branch reviews), each surface live-verified in the preview
pane. Opened here as a draft for upstream feedback.

Host-agnostic web cockpit that makes hve-core RPI agent sessions legible and interactive via an MCP bridge, without owning orchestration. Includes the design spec and a themeable Fluent mockup (Fluent / VS Code / Mica).

- server: bind HTTP listener to 127.0.0.1 explicitly (was binding all
  interfaces); EADDRINUSE/ephemeral fallback preserved.
- handlers: present_options now passes a finite fallback timeout to
  bridge.presentOptions so a stuck decision can't block the agent forever.
  Configurable via RPI_COCKPIT_DECISION_TIMEOUT_MS (default 1800000 ms).
- package.json: add prepare/postbuild scripts and engines node>=20.
- gitignore: ignore rpi-cockpit/dist build output.
- tests: cover the finite-timeout fallback for present_options.
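
The finite-timeout fallback could look roughly like the sketch below. Only the environment variable name and the 1800000 ms default come from the commit message; `decisionTimeoutMs`, `withFallback`, and the rule that invalid or non-positive values revert to the default are assumptions made for illustration, not the actual handler code.

```typescript
const DEFAULT_DECISION_TIMEOUT_MS = 1_800_000; // 30 minutes, the default named above

// Reads the timeout from an env-like record.
// Assumption: missing, non-numeric, or non-positive values fall back to the default.
function decisionTimeoutMs(env: Record<string, string | undefined>): number {
  const raw = env["RPI_COCKPIT_DECISION_TIMEOUT_MS"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_DECISION_TIMEOUT_MS;
}

// Hypothetical helper: resolves with the user's pick, or with `fallbackId` after `ms`.
function withFallback(pick: Promise<string>, fallbackId: string, ms: number): Promise<string> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallbackId), ms);
    pick.then(
      (id) => {
        clearTimeout(timer); // the user answered in time; cancel the fallback
        resolve(id);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

A caller would pass something like `withFallback(userPick, recommendedId, decisionTimeoutMs(process.env))`, so a decision that never arrives resolves to the recommended option instead of blocking the agent.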

…into host surfaces

Add `rpi-cockpit init [--host claude|codex|vscode|all] [--codex-global]`:
- Prepend shebang to src/index.ts so tsc emits it as dist/index.js line 1.
- Dispatch on process.argv[2]==='init' before starting the MCP server;
  parse --host/--codex-global, derive defaults from import.meta.url, run
  runInit(), print summary, exit 0.
- src/init.ts: idempotent read-merge writers for .mcp.json (CLAUDE_PROJECT_DIR),
  .vscode/mcp.json (workspaceFolder + cwd), hand-rolled .codex/config.toml
  ([mcp_servers.rpi-cockpit] with absolute entryPath/cwd, startup_timeout_sec=20),
  and marker-delimited narration blocks inlined from the contract into CLAUDE.md,
  AGENTS.md, and .github/copilot-instructions.md (preserving existing content).
- Ignore /.codex/ (machine-specific absolute path; never commit).
- TDD: tests/init.test.ts covers config shapes, host filters, preservation,
  and idempotency.
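
One way to get an idempotent, content-preserving narration block is to replace only the region between two markers. The sketch below shows the idea; the marker strings and the `upsertBlock` name are hypothetical, and the real writers in src/init.ts may delimit and merge differently.

```typescript
// Hypothetical markers; the real init step may use different delimiters.
const BEGIN = "<!-- rpi-cockpit:begin -->";
const END = "<!-- rpi-cockpit:end -->";

function upsertBlock(existing: string, body: string): string {
  const block = `${BEGIN}\n${body.trim()}\n${END}`;
  const start = existing.indexOf(BEGIN);
  const end = existing.indexOf(END);
  if (start !== -1 && end > start) {
    // Replace only the delimited region; everything around it is preserved.
    return existing.slice(0, start) + block + existing.slice(end + END.length);
  }
  // No block yet: append it, separated from existing content by a blank line.
  const trimmed = existing.replace(/\s+$/, "");
  return (trimmed === "" ? "" : trimmed + "\n\n") + block + "\n";
}

const once = upsertBlock("# Notes\n", "Narrate each step.");
const twice = upsertBlock(once, "Narrate each step.");
console.log(once === twice);
```

Running the function a second time with the same body returns the same text, which is the property the idempotency tests would assert.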

Run `rpi-cockpit init --host all` to wire the cockpit into every host
surface and inline the narration contract:

- .mcp.json (Claude): mcpServers.rpi-cockpit, type stdio, ${CLAUDE_PROJECT_DIR}
- .vscode/mcp.json (VS Code): servers.rpi-cockpit, type stdio, ${workspaceFolder}
- CLAUDE.md / AGENTS.md / .github/copilot-instructions.md: narration block
- .gitignore: allowlist .vscode/mcp.json so the generated config is committable
- README: per-host setup, config locations, 7-tool/UI verification, Codex TOML

.codex/config.toml is machine-specific and stays git-ignored (not committed).
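
For the JSON configs, a read-merge writer can be reduced to a pure merge step like the one sketched here. The `mcpServers.rpi-cockpit` key, `type: "stdio"`, and `${CLAUDE_PROJECT_DIR}` come from the list above; the `command` and `args` values, the type names, and `mergeServer` are illustrative assumptions rather than the generated config.

```typescript
interface McpServerEntry {
  type: "stdio";
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Merges one server entry into an existing config without dropping other keys.
function mergeServer(existing: McpConfig, name: string, entry: McpServerEntry): McpConfig {
  return {
    ...existing,
    mcpServers: { ...(existing.mcpServers ?? {}), [name]: entry },
  };
}

// Hypothetical entry: the command and args are illustrative, not the generated values.
const entry: McpServerEntry = {
  type: "stdio",
  command: "node",
  args: ["${CLAUDE_PROJECT_DIR}/rpi-cockpit/dist/index.js"],
};

// An existing config with an unrelated server that must survive the merge.
const before: McpConfig = {
  mcpServers: { other: { type: "stdio", command: "other-server", args: [] } },
};
const after = mergeServer(before, "rpi-cockpit", entry);
console.log(Object.keys(after.mcpServers ?? {}));
```

Merging the same entry again produces an identical object, so repeated `init` runs would not churn the file.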

Remove .github/instructions/hve-core/rpi-cockpit-narration.instructions.md: its
applyTo glob targeted agent/prompt definition files, so VS Code Copilot never
auto-applied it during real work. Narration now lives in each host's always-on
instruction file (CLAUDE.md, AGENTS.md, .github/copilot-instructions.md) written
by init. Repoint the five RPI .agent.md references to the canonical
rpi-cockpit/agents/cockpit-instructions.md.

The old "Register with Claude Code" step told users to cp .mcp.json.example over
.mcp.json, which clobbers the correct init-generated config (type stdio +
${CLAUDE_PROJECT_DIR} + env). Remove that step (the per-host section already
covers Claude Code) and update .mcp.json.example to match the generated form.

The RPI Agent handoffs used send: true and targeted the RPI Agent itself
(/rpi continue=all, etc.). VS Code Copilot honors send: true as auto-submit
("the prompt automatically submits to start the next workflow step"), so after
every response a handoff auto-submitted back into the same agent -> infinite
self-invocation on any request. Switch the handoffs to send: false so they are
user-click buttons (their intended use). Native frontmatter issue, exposed by
running the agent in VS Code Copilot; not caused by the cockpit instrumentation.

VS Code Copilot auto-submits send: true handoffs ("the prompt automatically
submits to start the next workflow step"), which auto-advances the whole RPI
pipeline (planner -> implementor -> reviewer ...) with no pause for the user.
Flip the remaining seven hve-core agents' handoffs to send: false so each step
waits for an explicit click, matching the RPI Agent fix in d046b33.

The Phase 5 cockpit narration told the agent to call present_options for the
Suggested Next Work list "then act on the returned id", which, with the finite
present_options timeout falling back to the recommended option, could auto-start
a new RPI cycle with no human input. Reword so Discover surfaces the options and
acts only on an explicit user selection, otherwise stops and yields.

Also adds exhaustiveness arms in state.ts applyBeat/summarize (passthrough + label), required by strict tsc when extending the Beat union. Task 2 refines the applyBeat arm to set steerMenu.
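
The exhaustiveness requirement can be shown with the common `never` pattern. This is a generic sketch, not the code in state.ts: the two beat kinds and the label text are hypothetical, and `steer_menu` merely stands in for a newly added union member.

```typescript
// Hypothetical beats; `steer_menu` stands in for a newly added member of the union.
type Beat =
  | { kind: "note"; text: string }
  | { kind: "steer_menu"; items: string[] };

// Accepts only `never`, so passing an unhandled union member is a compile error.
function assertNever(value: never): never {
  throw new Error(`Unhandled beat: ${JSON.stringify(value)}`);
}

function summarize(beat: Beat): string {
  switch (beat.kind) {
    case "note":
      return beat.text;
    case "steer_menu":
      return `steer menu (${beat.items.length} items)`;
    default:
      // If a new Beat member lacks a case above, `beat` is not `never` here and tsc fails.
      return assertNever(beat);
  }
}

console.log(summarize({ kind: "steer_menu", items: ["pause", "redirect"] }));
```

Deleting the `steer_menu` case makes the `default` branch receive a non-`never` value, which is the kind of compile error that forces a new arm whenever the union grows.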

…rill-in

Addresses the whole-branch review's one Major: the spec required the camera
to frame the graph bounding box on a new node set, but it was hard-pinned at
{x:40,y:40,z:1} and wide pipelines clipped at the right edge. gwFitToView
centers + scales the bbox into the live canvas (zoom clamped to [0.3,2]),
gated by a scope+node-set signature so live-run status updates keep the
user's pan/zoom while a fresh pipeline or a drill-in re-frames.
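
The fit-to-view behaviour described above can be approximated as follows. Only the zoom clamp of [0.3, 2] and the scope+node-set gating come from the commit message; the `Box` and `Camera` shapes, the 40 px padding, the screen = world * z + pan convention, and both function names are assumptions, so this is a sketch of the technique rather than gwFitToView itself.

```typescript
interface Box { minX: number; minY: number; maxX: number; maxY: number }
interface Camera { x: number; y: number; z: number }

const MIN_ZOOM = 0.3;
const MAX_ZOOM = 2;

// Returns a camera (pan x/y in screen pixels, zoom z) that centers `box` in the canvas.
function fitToView(box: Box, canvasW: number, canvasH: number, padding = 40): Camera {
  const w = Math.max(box.maxX - box.minX, 1); // avoid dividing by zero for a single node
  const h = Math.max(box.maxY - box.minY, 1);
  const raw = Math.min((canvasW - 2 * padding) / w, (canvasH - 2 * padding) / h);
  const z = Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, raw));
  // Screen position = world * z + pan, so pan places the box center at the canvas center.
  const x = canvasW / 2 - ((box.minX + box.maxX) / 2) * z;
  const y = canvasH / 2 - ((box.minY + box.maxY) / 2) * z;
  return { x, y, z };
}

// Signature gate: re-frame only when the scope or the set of node ids changes.
function viewSignature(scope: string, nodeIds: string[]): string {
  return `${scope}|${[...nodeIds].sort().join(",")}`;
}

const camera = fitToView({ minX: 0, minY: 0, maxX: 1000, maxY: 500 }, 1080, 580);
console.log(camera);
```

A caller would keep the last signature and call `fitToView` only when `viewSignature` returns a different string, so status-only updates leave the user's pan and zoom untouched.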
@MichaelDanCurtis

@microsoft-github-policy-service agree

@MichaelDanCurtis

@jkim323 things were done, UX was made... take a look at the workflow UX, it's pretty cool

Comment thread rpi-cockpit/package.json Outdated
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.84%. Comparing base (c6d1ace) to head (a08236e).
⚠️ Report is 39 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #2271       +/-   ##
===========================================
- Coverage   81.24%   61.84%   -19.41%     
===========================================
  Files         127       10      -117     
  Lines       18829       76    -18753     
  Branches       12       12               
===========================================
- Hits        15298       47    -15251     
+ Misses       3528       26     -3502     
  Partials        3        3               
Flag Coverage Δ
docusaurus 61.84% <ø> (ø)
pester ?
pytest ?

Flags with carried forward coverage won't be shown.
see 127 files with indirect coverage changes


Addresses GitHub Advanced Security dependency-pinning-analyzer findings on
PR microsoft#2271: replace caret ranges with the exact versions already resolved in
package-lock.json (no install change). tsc + 357 tests green.
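
Pinning ranges to the versions already in the lockfile can be expressed as a small pure function. The sketch assumes the `packages` map with `node_modules/<name>` keys that package-lock.json uses in lockfileVersion 2 and 3; `pinToLock` and the sample versions are hypothetical and are not taken from this PR.

```typescript
type Deps = Record<string, string>;

// Subset of the package-lock.json (lockfileVersion 2/3) shape used here.
interface Lock {
  packages: Record<string, { version?: string }>;
}

function pinToLock(deps: Deps, lock: Lock): Deps {
  const pinned: Deps = {};
  for (const [name, range] of Object.entries(deps)) {
    const resolved = lock.packages[`node_modules/${name}`]?.version;
    // Keep the original range when the lockfile has no entry for this package.
    pinned[name] = resolved ?? range;
  }
  return pinned;
}

// Hypothetical versions, for illustration only.
const pinned = pinToLock(
  { zod: "^3.22.0", "left-out": "^1.0.0" },
  { packages: { "node_modules/zod": { version: "3.22.4" } } },
);
console.log(pinned);
```

Since the pinned versions are the ones the lockfile already resolves to, the installed tree should stay the same, which matches the "no install change" note above.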

A real Claude Code slash command (.claude/commands/gallery.md) launches the
65-agent gallery producer and shows it in the preview pane; the cockpit
narration contract (cockpit-instructions.md + CLAUDE.md) documents /gallery
as a convention mapping to gallery_open, alongside the existing /Nav.

…low DAG

The agent-gallery producer predated the flow canvas, so agent #58's tile
narrated the old gh-aw screen card. Add a flow() helper and rewire #58 to
narrate the orchestration pipeline (flow_open + workflow nodes + label/event
edges) so its tile shows the n8n-style DAG, matching the new flow surface.

…t#60/microsoft#61/microsoft#62) gallery tiles as their own surfaces

The producer predated the memory and promptlab surfaces, so these four
meta-utility agents narrated context badges / a findings panel. Add memory()
and promptlab() producer helpers and rewire #57 to the memory view and
#60/#61/#62 to the promptlab workbench, so every meta-utility tile reflects
the shipped surface.

3 participants