Build log

Shipping in the open

Generated from the engineering changelogs in the Skybolt repo — unedited, newest first.

2026-07-09

Agents Execution

Root cause. Codex was deliberately excluded from the hard ready_floor_seconds that gates the Claude and Antigravity seats, on the assumption its ^>\s*$ ready-prompt regex (SEAT_READY_PATTERNS["codex"]) was a reliable "ready" signal. It isn't — Codex paints a bare > on its loading screen before the input box is live, so the regex matched an early boot frame, readiness "confirmed", and only the general 2s post-ready grace stood between the match and the paste. That was not enough.
Fix. Added a codex entry to SEAT_READY_PROFILES (terminal.py) with ready_floor_seconds = _CODEX_READY_FLOOR_SECONDS (default 5s, env-tunable via SKYBOLT_CODEX_READY_FLOOR_SECONDS), mirroring how Claude/Antigravity already suppress every readiness signal — the regex match included — for the first N boot seconds. The floor is shorter than Claude's 10s because Codex comes up in ~4-5s. The fast > regex still fires the instant the real prompt appears once the floor elapses; the general 2s post-ready grace is unchanged.
Tests. tests/test_terminal.py::test_seat_ready_profile_codex_has_ready_floor (new); the old test_seat_ready_profile_default_has_no_settle_override no longer lists codex (it now has a profile) and covers aider/opencode/unknown instead.
Root cause. Review-ness of a seat was inferred solely from the edge's participation_role (plus the retired legacy review_agent role). The claim gate only admits a *review* session to a review-lane card (_claim_next / _is_review_session), and a reviewer routed as owner snapshots a plain repo-change contract (create_agent_session_record resolves the contract without the role_key), so _is_review_session returned False and the card wedged.
Fix — review-ness is now keyed off the ROLE, like merge roles. New REVIEW_ROLE_KEYS (constants.py) = {review_agent, code_review_agent, code_qa_reviewer}. resolve_work_contract resolves any of these to the read-only review contract regardless of the edge's participation, and the scheduler's _is_review_session role fallback recognises the same set — so a reviewer handed a card by ANY agent (not just the orchestrator, and not only via a reviewer-participation edge) claims its review-lane card and returns a pass/fail verdict that routes onward. An explicit participation_role="reviewer" edge still works for every other role.
Tests. tests/test_work_contracts.py (role precedence) and test_scheduler.py::test_review_role_claims_review_card_via_plain_workflow_edge (a code_review_agent seat with a default owner participation claims a review card).

Cloud Identity (skybolt.ai)

Bug. Popping a terminal into its own OS window failed with "Not authenticated" while docked terminals worked. _require_local_session (in store/project_store.py) only fell back to the middleware-validated per-launch engine bearer token when the skybolt_engine_session cookie was absent. A freshly-created pop-out webview attaches a lingering pre-restart cookie from the shared WebView cookie jar that no longer maps to a live local_sessions row, so the "cookie present" branch returned 401 without ever attempting the desktop-token recovery. Docked terminals reused the main window's live session, so they were unaffected.
Fix. _require_local_session now recovers the latest unexpired local session via the desktop engine token whenever the cookie is missing or present-but-invalid — same trust level (the per-launch token is loopback/host/origin-gated by require_engine_token middleware), so a valid cookie still binds to its exact session and an unauthenticated request still 401s.
Tests. test_desktop_engine_token_recovers_session_with_stale_cookie asserts a session-gated route succeeds with a stale skybolt_engine_session cookie + engine bearer token (regression guard); the existing dropped-cookie and logout tests continue to pass.

Dashboard

Removed the deterministic mock dashboard-stat fallback. A failed or still-loading local aggregate now produces an explicit loading or unavailable state, while project shortcuts continue to work.
The dashboard no longer shows invented activity, usage, cost, or build signals.

Settings & Storage

Documented destination preflight, no-merge/no-overwrite moves, rollback, and Engine recovery after a failed relocation.
Replaced obsolete git-crypt/blob-sync guidance with the active ADR-0069/0070 record-envelope boundary: the live database and raw artifacts remain device-local.

SSH Machines

Replaced silent first-connect acceptance with StrictHostKeyChecking=yes across SSH adapter and operation paths. A new host must be verified and added to the user's OpenSSH known_hosts before Skybolt can connect; a changed key remains a hard block.

2026-07-08

Agent Persona Catalog —

Catalog cleanup. Retired handyman ("Hank"), review_agent ("Rhett"), merge_agent ("Mac"), and the redundant code_qa_reviewer from the builtin catalog. Edited the canonical apps/api/app/data/builtin_personas.json (now 65 personas) and regenerated the engine fallback apps/engine/skybolt_engine/personas_data.py via scripts/generate-personas-data.py. Their role constants remain (dispatch machinery is removed in a follow-up wave); existing rosters keep working via persona_snapshot.
Default roster. Tagged chief_project_orchestrator, full_stack_developer, code_review_agent, git_merge_specialist with default_roster; added the DEFAULT_ROSTER_KEYS constant. New alembic migration 0036_default_roster_catalog prunes the retired rows and applies the tag on the cloud catalog (reversible).
Creation-time seeding. create_project (and the import routes when the imported roster is empty) now auto-add the default roster via seed_default_project_roster in personas_ops.py, emitting project_agents sync-outbox upserts. Seeding is creation-time only, so a removed default stays removed on reopen.
Role directives. Added overrides_json['directive'] support (AgentDirective schema + services/agent_directives.py), editable via ProjectAgentEditorModal and the rebranded Agent Builder wizard's new Directive step. See the agents-execution changelog for prompt injection and the pass_fail_reason review-contract routing.
Tests: tests/test_default_roster_seeding.py, tests/test_agent_directives.py (engine); customAgentWizardModel.test.ts, ProjectAgentEditorModal.test.tsx, CustomAgentWizardModal.test.tsx (web). Existing dispatcher/scheduler/orchestrator/app tests updated to provision the retired roles as custom personas and to start from a cleared roster.

Agent Terminal —

Claude Code seats still showed scattered/garbled glyphs while scrolling (other seats stayed clean). Root cause was two-layered: (1) the throttled clearTextureAtlas()-on-scroll fix from commit b3006d0 was reverted the next commit by e1faf46 "Safe Save" (a stale-checkout clobber), so the running build had no atlas repair; and (2) even the reverted fix only triggered on scroll — Claude repaints its input box/status region in the primary buffer with cursor-addressed rewrites that churn xterm's glyph texture atlas without moving the viewport, so an onScroll-only trigger missed those frames.
TerminalView now rebuilds the WebGL atlas (webglAddon.clearTextureAtlas()) from both terminal.onScroll and terminal.onWriteParsed, throttled (leading + trailing, 150ms min interval, O(1) timestamp compare on the hot path). The write-driven trigger is gated on terminal.buffer.active.type === "normal", so alt-screen TUIs (tmux, agy — the already-clean seats) pay nothing. The atlas is also rebuilt on handleDprChange (fractional Windows DPR shifts are a known atlas-corruption trigger). The DOM-renderer fallback keeps the translateZ(0) repaint nudge; the pending trailing timer is cleared on unmount. Why. A churned atlas leaves stale glyph rects baked into the canvas that terminal.refresh() and the compositor nudge can't repair — only rebuilding the atlas (what a manual resize does) fixes it, and only onWriteParsed covers Claude's in-place primary-buffer repaint pattern. Files. apps/web/src/components/TerminalView.tsx (atlas rebuild + triggers + cleanup); xterm mocks synced in TerminalView.test.tsx, SkyboltApp.test.tsx, TerminalPopoutApp.test.tsx. Tests. TerminalView.test.tsx — burst coalescing (leading + one trailing), scroll clears atlas (not the nudge) under WebGL, alternate-buffer gate, DOM-fallback never clears but still nudges, unmount clears the pending timer. Full suite: 66 passing. Follow-up. allowTransparency: true was left as-is: the theme's selectionBackground uses alpha and dropping it risks a visual regression that can't be verified without running the app. Re-verify on HEAD that the atlas fix is present (the "Safe Save" flow reverted it once before).
The pre-paste grace that only Antigravity had (a fixed wait after a seat reports ready, before the prompt is pasted, so the body doesn't drop into a not-yet-live input box) now applies to every seat. A new SKYBOLT_SEAT_POST_READY_GRACE_SECONDS (default 2.0) feeds the default seat_ready_profile and the claude profile; Antigravity keeps its longer 5.0 override.
The grace moved from Scheduler._deliver_seat_prompt into Scheduler._paste_then_submit (the single two-step paste chokepoint) as an optional pre_paste_seconds, so the launch path, the sync no-event-loop fallback, and the review-verdict nudge all get it once (no double-delay). The seat_prompt_startup_delay <= 0 escape hatch still bypasses it (instant paste for operators/tests).
The seat-ready route (POST …/terminals/{id}/seat-ready) now waits the profile's post-ready grace server-side after readiness confirms, so renderer-driven injection (planning sessions) inherits it with no client change. Bounded by the profile grace (≤ 5s), far within the client's 300s request timeout.
Web side: TerminalView.sendQueuedInput adds a TERMINAL_SUBMIT_PASTE_DELAY_MS (300ms) pre-paste delay for submit-type queued input (auto-fed card briefs/notes, inline chat, quick-launch — paths that never call seat-ready), re-checking socket readiness before the delayed body write so a terminal torn down mid-delay never throws. Why. "Add a delay everywhere we send prompts" — the Antigravity-only grace missed Claude/Codex and every renderer-driven paste path; a seat can report ready a beat before its input box accepts a paste, dropping the prompt. Files. apps/engine/skybolt_engine/terminal.py (_SEAT_POST_READY_GRACE_SECONDS, profiles), scheduler.py (_paste_then_submit pre_paste_seconds + _seat_pre_paste_grace, delivery/sync/ nudge call sites), routes/terminals.py (server-side grace); apps/web/src/components/TerminalView.tsx. Tests. test_scheduler.py (delivery-profile sleeps now [grace, settle] for every seat), test_terminal.py (profile grace coverage), test_app.py (route sleeps the grace), test_dispatcher.py (nudge carries the grace); TerminalView.test.tsx (pre-paste delay + drop on mid-delay socket close).

Agents Execution

SSH agent seats run ExitOnForwardFailure=no (services/ssh_ops.py). A seat connects through the user's SSH config, so it inherits any personal LocalForward/RemoteForward there. With the old ExitOnForwardFailure=yes, a personal forward that can't bind locally (classically a LocalForward 8000 that collides once the user — or a second concurrent seat — already holds 8000) killed the seat on launch (bind [127.0.0.1]:8000: Permission denied → the seat exits before the browser can attach). The failure has nothing to do with Skybolt: the seat's OWN reverse tunnel uses a per-session unique port (_acquire_reverse_port, 61000+) and rarely fails; if it genuinely can't come up (remote AllowTcpForwarding no, or port-span overflow), the seat still runs and the missing /complete callback surfaces at finalize (terminal_exited / no-verdict) instead of an opaque launch death. This surfaced now because SSH orchestration starting to work meant multiple seats run concurrently, each dragging the same inherited LocalForward and colliding after the first.
We still do NOT set ClearAllForwardings=yes on seats (it would discard the seat's own -R reverse tunnel along with the inherited forwards). The interactive-terminal path is unchanged.
Tests. test_seat_argv_reverse_tunnel_listen_and_target_ports_differ updated to assert ExitOnForwardFailure=no.
Follow-up (recommended): surface the real ssh stderr in the terminal UI instead of the generic "browser did not reach the local Execution Engine terminal socket" message, so a genuine reverse- tunnel failure (now non-fatal) is still visible.
Advisor participation → the no-worktree report contract (services/work_contracts.py). resolve_work_contract previously lumped advisor in with reviewer → review_only, which requires a git repo + a read-only worktree (correct for a code reviewer that inspects the diff, wrong for an advisor that only gives an opinion from context). advisor now resolves to report_only (no worktree, no git repo, structured report). Reviewers are unchanged. Note: the shipped default map's orchestrator→planning edges use participation_role="advisor", so planning now runs worktree-free and completes as a structured report rather than a pass/fail verdict.
sources is optional on a report completion (report_completion_error). summary + recommendation remain required (the consumable deliverable); sources are encouraged for research but optional (an advisory opinion often cites none), so their absence no longer 422s.
Report auto-hand-off to dependents (scheduler.py::_publish_report_handoff_notes). On a completed report_only card, the {summary, recommendation} is published as a handoff agent note targeted at every card that DEPENDS on it. Those notes ride the existing pipeline — injected into the blocked dev seats' prompts (active_notes_for_session) when the dependents_unblocked wake dispatches them, and surfaced in the orchestrator's board snapshot (notes_active). No new routing: research → devs is now automatic when the dev cards depend on the research card, and human gating stays opt-in (a human_gate node or a human-owned card). Notes are scoped to the dependents, not the just-completed card (which would clear them immediately).
Tests. test_advisor_participation_runs_the_no_worktree_report_contract, test_report_completion_requires_summary_and_recommendation_but_not_sources, test_report_completion_hands_off_finding_to_dependent_cards.
Deferred: workflow-map routing for advisory cards (route a report_only card onward on an edge / spawn a follow-up dev card carrying the report). Report_only cards are review_eligible=False and go straight to completed today; enabling map routing for them is a separate change.
Dead-on-arrival diagnosis for report-only cards (scheduler.py). A report-only seat (research / spike) that exited via terminal_exited almost instantly — the CLI never started (missing, unauthenticated, or it refused its launch flag) — was finalized against the report-only completion contract and reported the misleading "Report-only completion requires a summary", because that check (finalize_agent_session, ~5176) ran BEFORE the premature-exit guard (~5295) — and that guard skips report-only entirely (it is gated on requires_git_worktree, which a report-only contract lacks). Now the report-only rejection branch first runs the same premature-exit detection (terminal_exited + sub-SKYBOLT_SEAT_PREMATURE_EXIT_SECONDS elapsed) and routes to _finalize_premature_exit, so the card requeues/fails with the accurate "the CLI did not start … Last seat output: …" reason instead. worktree_path=None is passed for the report-only case (its path is the project checkout, not a managed worktree, so nothing is removed).
Tests. test_report_only_premature_exit_reports_launch_failure_not_missing_summary.
Still open (follow-up): a report-only seat that *ran* but exited/idled without POSTing a structured report still hard-fails ("requires a summary") — report-only has no transcript-recovery fallback or idle-nudge like review seats do. Recover-and-park is the planned fix (pending the recovered-report end-state choice: complete-with-report vs park-for-signoff).
spike and chore card types now have roster owners (services/personas_ops.py). Previously no role in ROLE_CARD_TYPE_AFFINITY owned spike or chore, so the orchestrator (correctly, per its "route by capability" rule) deferred those cards with "no active roster agent handles card_type". Now _DEV_CARD_TYPES/_GENERALIST_CARD_TYPES include chore + spike, spike is added to research_agent/solution_architect/adr_writer, and chore to devops_engineer. A spike stays a report_only contract regardless of who runs it (it completes as a recommendation, not a merge). security still has no default implementation owner (reviewers gate it via role_policy); assign one via a custom persona's card_type_affinity override if desired.
Roster cards show what each agent handles (web). AgentRosterCard renders a "Handles" chip row from capability_summary.handles_card_types (omitted for reviewers/advisors, which own no types), so it's visible which agents the orchestrator can route which card types to.
Orchestrator decisions are expandable (web). OrchestratorDecisionLog rows with a reason are click-to-expand (aria-expanded), revealing the full untruncated decision message in a wrapped panel instead of a single truncated line.
Tests. Engine test_spike_and_chore_card_types_have_roster_owners; web roster "Handles" row + decision-log expand suites.
New one-shot SSH argv builder (services/ssh_ops.py::ssh_headless_argv). A plain, piped ssh user@host -- "<remote cmd>" (via the existing ssh_base_args + _ssh_seat_remote_command) with no -tt/PTY, tmux persistence, or -R reverse tunnel — a headless decision turn only needs its prompt on stdin and its stream-json on stdout (it parses the result from stdout and never calls /complete). Reuses the same proven remote cd + env + login-shell wrapper as an interactive seat, so the remote PATH loads claude.
Orchestrator (orchestrator.py). _resolve_local_workspace_root → _resolve_headless_workspace returns (machine_kind, workspace_root): ("local", local_path) or ("ssh", remote_worktree_path) (the remote path is read from target["project_path"] directly, like the scheduler's SSH seat launch, not the local-validating _workspace_root). _decide_via_headless_seat branches on the kind — local builds the bare _orchestrator_headless_argv and runs in the local cwd; SSH builds _orchestrator_ssh_headless_argv (new _orchestrator_headless_command string + ssh_headless_argv) and launches with cwd=None (the local ssh client's cwd is irrelevant; the remote cd sets the working dir). open_headless and run_headless_seat are unchanged — the local ssh client pipes the prompt to the remote claude and streams its stdout back. No secret env is forwarded over SSH.
Messaging. The no-backend failure message no longer says "not a local machine"; it points at claude /login on "the local machine, or the SSH Machine for an SSH project".
Tests (tests/test_orchestrator_headless.py). New test_headless_ssh_target_runs_claude_on_remote (asserts the launched argv is ssh-wrapped: ssh, deploy@remote.example.com, remote cd -- /srv/app && … claude … -p --output-format stream-json --verbose, cwd=None, prompt on stdin), test_headless_unresolved_target_falls_through_to_hosted (renamed from the old SSH-falls-through test), and a _orchestrator_headless_command unit test. Not exercised here: a real remote claude run (needs a live SSH host + a signed-in remote CLI) — auth failure on the remote still falls back to the hosted model exactly as the local path does.
New "Move to Lane" control node (type:"lane", services/agent_workflows.py). Droppable like Fork / Decision / Human Gate: new WORKFLOW_NODE_TYPE_LANE in WORKFLOW_CONTROL_NODE_TYPES / WORKFLOW_CONTROL_ACTIONS (emits continue) / WORKFLOW_CONTROL_DESCRIPTIONS. It carries two config fields validated/persisted in _normalize_node: target_status (the destination lane) and park (bool, default false). next_workflow_step implements two modes — Continue (park=false carries the lane forward as pending_target_status through the routing BFS so the NEXT seated agent lands the card in it) and Park (park=true returns a lane_park step that route_next_workflow_session applies: move the card into the lane, hand it to a human assignee_type='human' with a pause_reason, stop routing, emit a workflow_lane_park event; re-entry is normal board interaction).
Widened lane allowlist. WORKFLOW_EDGE_TARGET_STATUSES (and the lane node's target) grew from {queue, review, ready_for_merge} to the full seven board columns, in board order — WORKFLOW_LANE_STATUSES = (long_term, planning, queue, working, review, ready_for_merge, completed). paused/needs_input are park STATES, not lanes, and stay excluded (parking is a lane node's park flag on a real lane). working/completed are selectable but risky (a working card with no live session stalls; completed skips the completion path); the editor warns about them.
Removed role-based auto-move (the semantic shift). _route_status_for_target no longer derives a lane from the agent's role/participation (reviewer→review, merge role→ready_for_merge, else queue) — it returns queue, the neutral dispatchable default. _resolve_route_status gains a pending_target_status param and applies one explicit precedence: pending lane → edge target_status → queue, with ready_for_merge still refused for a non-merge-eligible card (_card_merge_eligible, ADR-0067). Consequence: the deterministic merge gate now fires ONLY on an explicit ready_for_merge. The shipped software_delivery_default_workflow already sets target_status explicitly on the review edge (review) and both merge edges (ready_for_merge), so its implement → review → merge → completed pipeline is UNCHANGED; a custom map that omits the explicit merge lane parks instead of merging (intended).
Renderer. AgentWorkflowNode in api.ts gained target_status?/park? (edge target_status doc updated). AgentWorkflowMap.tsx adds the palette "Move to Lane" item, the controlActionsForType lane case, addControlNodeToMap defaults ({target_status:"queue", park:false}), the node inspector Target lane + After move (Continue/Park) selects, the widened edge "Board column on route" dropdown, and a workflowFromGraph serialization fix so lane fields persist through autosave.
No schema change — the lane node and widened target_status live on the settings-JSON workflow document; cards.workflow_state_json already exists. Older engines silently drop the unknown lane node / widened values.
Tests. tests/test_agent_workflows.py (test_move_to_lane_continue_carries_lane_to_next_agent, test_move_to_lane_park_holds_card_in_lane_for_human, and test_git_merge_specialist_target_routes_to_ready_for_merge updated to set target_status explicitly); tests/test_workflow_target_status.py (widened allowlist accept/reject params + lane-node normalization); web AgentWorkflowMap.test.tsx / AgentWorkflowMap.targetStatus.test.tsx.
Human merge trigger implemented (dead-lane fix). A human drag of an agent-owned card into Ready For Merge (or an owner flip to agent while it sits there) now seats the deterministic merge executor: routes/cards.py update_card calls the new agent_workflows.seat_merge_session_for_card (single-flight; no-op for human-owned cards, open sessions, or a roster with no active merge agent). Previously only the authored-edge trigger existed, so a dragged agent-owned ready_for_merge card wedged forever. The orchestrator has no direct merge verb — ADR-0078 Decision 2 now names exactly these two triggers. dispatch_diagnostics reports no_merge_agent / awaiting_merge for a seatless Ready-For-Merge card instead of the false "the Orchestrator will seat it"; the renderer badge map (CardDetailModal.dispatchReasonPresentation) covers both codes.
Merge-approval policy retired end-to-end (ADR-0078 Decision 4). The scheduler enforcement point was removed with the conveyor, leaving auto_merge silently dead config; the setting is now gone everywhere: services/merge_policy.py, the four merge-policy endpoints in routes/cards.py, the api-client functions, and the renderer toggles (ProjectMergePolicyControl, CardMergePolicyControl + their DevOpsSettingsModal / CardDetailModal mounts). "Auto-merge on review pass" is expressed only by the authored merge edge. Stale merge_policy* rows in project_settings_kv are ignored.
Recovered-verdict low-trust gate restored. finalize_agent_session threads verdict_from_transcript into _route_card_on_finalize; a review pass recovered from the seat's transcript never routes the authored merge edge — it parks in Review for human sign-off, matching the "a recovered pass never auto-merges" log line — and a recovered fail's blocker note carries its provenance line.
Deleted-with-test_dispatcher.py coverage re-homed (tests/test_scheduler.py unless noted): the 5 _review_verdict_from_transcript unit tests; transcript-recovery integrations (recovered pass parks with the merge edge routable, recovered fail carries provenance and parks with no rework route, ambiguous transcript parks); the one-shot review idle-seat nudge (REVIEW_VERDICT_NUDGE); review-fail blocker superseding (REVIEW_FAILED_NOTE_TITLE); review-missing-verdict park; the card-PATCH ownership/lane gates (Done forces human, Ready For Merge preserves a human owner, owner flips keep the lane); the card-diff route tests (empty without a branch; configured default branch as base); new seat_merge_session_for_card unit tests (tests/test_agent_workflows.py) + drag/owner-flip route triggers; and Ready-For-Merge diagnostics codes (tests/test_dispatch_diagnostics.py).
Removed the self-driving dispatcher runtime (dispatcher.py), its rule engine (services/dispatch_ops.py), and their tests (test_dispatcher.py, test_dispatch_lease_reclaim.py); de-registered it from runtime_hooks._BUILTIN_SERVICE_MODULES. DISPATCH_AUTHORITY_ROLE_KEYS is now {chief_project_orchestrator} — an active Orchestrator is what makes a project dispatch-capable. Initial dispatch of queued cards is the orchestrator's assign_card job.
Workflow (services/agent_workflows.py): removed _SELF_CONTAINED_WORKFLOW_ROLE_KEYS stripping (_strip_self_contained_roles) and the merge/self-contained short-circuit in route_next_workflow_session. Rebuilt software_delivery_default_workflow as the full implement → review → merge → completed pipeline: code_qa_reviewer folded into code_review_agent, a review edge with target_status="review", and merge edges into git_merge_specialist with target_status="ready_for_merge" (the retired trio is gone).
Kept the deterministic merge executor (scheduler._run_merge_session + safety gates), triggered only when a card reaches ready_for_merge via an authored edge or a human drag/owner-flip of an agent-owned card (see the 2026-07-08 fix-pass entry below), fulfilled for MERGE_AGENT_ROLE_KEYS.
Finalize park + wake (scheduler._route_card_on_finalize): replaced the auto-review, auto-complete, and review-fail-bounce fallbacks with a _park_card_for_orchestrator helper (needs_input + pause_reason + workflow_route_missing/orchestrator_kick + bus wake). Review fail no longer nulls workflow_state_json; the removed auto-merge conveyor leaves a passing review at the human sign-off gate. _release_session / _finalize_premature_exit permanent failures park the card with a blocker note instead of escalating to review.
Orchestrator feedback (orchestrator_ops.build_board_snapshot + build_orchestrator_prompt): per-card last_session (status/role/verdict/reason/signal/attempt_count/finished_at), pause_reason, workflow_state.last_reason, ready_for_merge in open lanes, and a new cards_blocked list (needs_input/paused). record_card_outcome now fires from finalize (completed), permanent-fail (failed), and budgets.pause_card (budget_paused). budget_tripped + workflow_agent_assigned added to the orchestrator WAKE_KINDS; finalize bus events carry card_id + verdict; new prompt rules teach the orchestrator to read last_session and parked cards.
Diagnostics (services/dispatch_diagnostics.py): reworked to orchestrator-required semantics (no_authority == no active Orchestrator; new awaiting_orchestrator; dispatch_lease always null). Renderer dispatch strings + the Agents-tab self-driving copy updated (ADR-0078); no agent is "self-driving" anymore.
Engine — headless decision backend (orchestrator.py). _execute_leased builds the decision prompt once, then _resolve_orchestrator_seat_backend checks the orchestrator agent's overrides.default_seat against seat_execution.is_headless_capable (claude_code only). When present on a local target, _decide_via_headless_seat runs one (or, on a validation miss, two) one-shot claude -p --output-format stream-json turns — open_headless + run_headless_seat + NOOP_SEAT_SINK + wait_for + terminate, modeled on routes/workspace.py::planning_cli_turn. The system+user messages are flattened to stdin; parse_decision_doc + the corrective retry are unchanged. auth_failed/non-completed → hosted fallback if one resolves, else a failed run with "run claude /login"; a completed-but-unparseable turn fails the run (no double-spend). Cost/tokens from the outcome meter into orchestrator_runs. The seat dangerous=false toggle is honored (no --dangerously-skip-permissions unless enabled). SSH targets fall through to the hosted model.
Engine — shared headless argv (services/seat_execution.py). Extracted headless_argv(command) (the -p --output-format stream-json --verbose suffix builder) so the scheduler seat launch, the planning one-shot, and the orchestrator decision turn share one implementation; scheduler._headless_argv now delegates to it.
Engine — per-agent seat model (services/model_defaults.py). resolve_agent_seat_command now reads an optional overrides.default_seat.model and resolves --model with precedence agent-override > seat pin > kind default (sanitized via sanitize_seat_model, so a malformed value can never inject a shell token).
Renderer — model selection (agents/agentModelSelection.ts). New buildAgentModelChoices wraps the shared buildChatChoices (untouched — five surfaces depend on it) and expands each CLI seat into "Seat default" plus curated per-model variants from settings/seatModels.ts. Combo choice ids encode the model as seat:<seatId>|<model> (the | is outside the seat-model sanitizer charset, so it cannot collide with :// in model values). agentUpdateForChoice/agentModelChoiceId round-trip the model and spread existing overrides so unrelated keys survive.
Renderer — roster card (agents/AgentRosterCard.tsx, AgentsTabBody.tsx). For the Orchestration agent the dropdown is filtered to headless-capable seats and shows "Decisions run via Claude Code headless"; every agent's CLI seats now offer per-model variants.
Tests. New engine tests/test_orchestrator_headless.py (argv helper, backend-resolution precedence, happy-path parse+meter, corrective retry, auth_failed fallback + login-hint failure, SSH fall-through) and tests/test_agent_seat_model_override.py (override precedence, sanitization, GLM default, quoting). Extended agentModelSelection.test.ts and AgentRosterCard.test.tsx (combo round-trips, separator safety, unrelated-key preservation, orchestrator filter + helper copy).
Engine (services/agent_workflows.py). _normalize_edge accepts and validates target_status against a safe allowlist WORKFLOW_EDGE_TARGET_STATUSES = {queue, review, ready_for_merge} (raises WorkflowValidationError otherwise); working and the terminal/park states are excluded so an edge cannot strand a card in a non-dispatchable column or skip terminal side-effects. _edge and _control_branch_edge emit target_status: null so default/canonical maps round-trip. New _resolve_route_status replaces the hard-coded _route_status_for_target call in route_next_workflow_session: an explicit target_status wins over the role-derived column, except ready_for_merge falls back to the derived column for a card whose work contract is not merge-eligible (report-only/review-only, ADR-0067) — mirroring the scheduler's auto_merge guard. The configured target_status and the resolved_status actually applied are recorded in the workflow_route session metadata and the workflow_agent_assigned coordination event.
Renderer. AgentWorkflowEdge in api.ts gained target_status?: string | null; workflowFromGraph threads it into the saved override (the file's known silent-drop footgun); and the edge inspector adds a "Board column on route" select (Automatic plus the allowlisted columns via CARD_STATUS_TITLES) wired through updateSelectedEdge. Persistence and validation flow for free through the existing autosave → putProjectSettings → write_project_settings → validate_workflow_setting path.
Tests. New tests/test_workflow_target_status.py (normalization accept/reject/default, routing honor + derived fallback, the ready_for_merge merge-eligibility fallback, and override round-trip through _merge_workflow); new AgentWorkflowMap.targetStatus.test.tsx (workflowFromGraph round-trips target_status, defaults to null, and via the workflow.edges fallback path).

Cloud Identity (skybolt.ai)

Recovery-code rekey. Account settings now has a password-gated Rekey action beside Recovery code -> Reveal. The engine route POST /auth/recovery-code/rekey rotates the local account key and recovery code, SQLCipher-rekeys the global database, reseals secret_vault values, updates custody, and queues current CLOUD sync rows so their zero-knowledge envelopes are republished under the new key. The old recovery code stops unlocking after rotation; the replacement is shown once for storage in the user's password manager.
Backup-code replenishment. POST /api/v1/auth/totp/backup-codes now accepts either a current authenticator code or one unused backup code. When a backup code is used, it is consumed before the existing backup-code set is deleted and replaced, so a user who burned most of their 10 one-time codes can recover a fresh set without waiting for an authenticator code path.
Tests. API coverage asserts regeneration with a backup code invalidates the old set and lets a new code log in. Engine coverage asserts rekey rotates the recovery code, rewraps vault values, preserves the active local session, rejects the old recovery code after lock, and rejects wrong-password rekey attempts. Web coverage asserts the Account settings rekey modal posts the password and displays the replacement code plus sync queue count.
Runtime hardening. Rekey now temporarily stops registered runtime services and restarts them afterward so background scheduler/sync/cloud-refresh threads release pooled database handles before SQLCipher runs PRAGMA rekey. The SQLCipher rekey also happens before vault-row updates so the connection is clean for the key rotation.
Bug. After "Start over on this device", restore failed with the stale-local-data 422 ("That recovery code doesn't match the encrypted data on this machine" — rendered as "Skybolt found local encrypted data that this recovery code cannot open"), and Delete local data and restart looped forever. With at-rest encryption on (ADR-0040) and the machine wiped (uninitialized — no custody, no DB file), the background services (scheduler/dispatcher/orchestrator call storage.schema.connect() directly, bypassing the require_unlocked HTTP middleware) re-created the global database as a PLAINTEXT file within seconds: connect() picked the plain sqlite3 driver whenever no key was loaded, and a connect on a missing path CREATES the file. The next /auth/cloud/restore then saw "existing data" it couldn't open under the recovery key; worse, reencrypt_plaintext_db raced the loops' live file handles on Windows (leaving a .enc-tmp artifact). Same symptom as the ADR-0073 cloud-synced-replica case, entirely different cause. Renderer untouched.
Fix — fail-fast lock gate (storage/schema.py). New GlobalDatabaseLockedError(sqlite3.OperationalError) plus a set_at_rest_required(bool) module gate: connect(encrypt=True) now raises — never opening, and never creating, the file — while at-rest is required and no account key is loaded. Because it subclasses sqlite3.OperationalError, the background loops' existing broad/sqlite3.Error handlers treat it as an ordinary "db unavailable" and stand down until unlock — no call-site changes.
app.py. create_app arms the gate from settings.at_rest_encryption (before _prepare_at_rest), and a new exception handler maps GlobalDatabaseLockedError to 423 Locked (mirroring ProjectDatabaseLockedError) for anything that slips past the require_unlocked middleware.
routes/cloud_auth.py. _restore_unlock_in_place closes this thread's pooled global connection right after the recovery unlock, before ensure_db_initialized, so the plaintext→SQLCipher re-encrypt can't fight our own stale Windows handle.
Self-heal. A machine already stranded with a stray plaintext DB recovers on the next restore: the in-place path opens the plaintext file, finds no profile for the account, and falls through to a fresh restore that replaces it with ciphertext.
Tests. New engine tests/conftest.py autouse fixture disarms the process-global gate between tests (create_app arms it; without the reset, later plaintext-mode tests would raise GlobalDatabaseLockedError). Dedicated regression tests for the gate and the wiped-machine restore loop are being added alongside this entry.
Bug. An account with TOTP 2FA could never finish "restore on this device" through the inline sign-in flow. Cloud login with a valid 2FA code succeeds — and the cloud's F.2 replay guard (2026-07-01 entry) consumes that code's 30 s step — then the engine answers needs_recovery because the machine has no local profile (e.g. right after "Start over on this device", 2026-07-07 entry). The renderer kept the consumed code in state, hid the Authentication code field (showTotp excluded needsRecovery), and re-sent the same code with the recovery-code submit to /auth/cloud/restore — which performs a second full cloud login → invalid_totp. The user saw "That authentication code didn't match. Try again." at the recovery-code prompt, deterministically, with a CORRECT recovery code — and misread it as a recovery-code failure.
Fix — AuthScreen.tsx (renderer only; engine and cloud unchanged). When login flips to needs_recovery, the stored 2FA code is cleared and the Authentication code field stays visible through the inline recovery step (showTotp no longer excludes it), so the restore submit carries a FRESH code alongside the recovery code; the notice says each code works only once.
Adjacent fix — silent needs_totp no-op on restore. cloudRestore can answer a 2xx {"needs_totp": true} (e.g. a 2FA user left the restore screen's optional auth-code field blank), but the renderer called finishAuth(undefined) — a silent no-op. Both restore submit paths (restore screen + inline needsRecovery) now reveal/require the code field with a notice instead of doing nothing; api.ts cloudRestore returns the new CloudRestoreResult union carrying the needs_totp variant.
Hardening — clear the code on failed restores too (review finding). A restore that fails AFTER its embedded cloud login (e.g. a mistyped recovery code → 422) has still consumed the submitted 2FA code. Both restore submit paths now clear the stored code on any non-503 failure, so correcting the recovery code and resubmitting can never replay a dead code into the same misleading 401. (503 keeps the code: offline means it was never spent.)
Tests. Web AuthScreen.test.tsx +3 (chained needs_totp → needs_recovery → fresh-code restore; restore-screen needs_totp re-prompt with a toBeRequired check; consumed-code cleared after a failed restore). Engine test_cloud_auth.py +4 contract tests: restore with a fresh / missing / wrong TOTP code, and test_cloud_restore_replayed_totp_code_401_and_fresh_code_succeeds — FakeCloud now models the F.2 replay guard, encoding why the recovery step must demand a fresh code.

In-App IDE

Fixed: the Files tab now remembers its inner mode (Workspace / Generated / Assets / Public Site) per project. The mode is persisted in localStorage under skybolt:app:project-files-mode:<projectId> and restored whenever the Files tab remounts, instead of always reopening on Workspace. Covered by a new FilesTabModePersistence.test.tsx.
Fixed: saved Files layouts are no longer clobbered on project open. When Files is the first tab shown, the IDE used to mount before the deferred (Wave 2) files_layout settings arrived, restore an empty layout, and let the debounced saver overwrite the stored one — losing the open editor files. The surface now passes initialLayoutState as undefined until settings load; CodeWorkspace waits for that signal before claiming the project, marking itself hydrated, or persisting, and restores once the real value (or null) arrives. Added regression coverage in CodeWorkspace.test.tsx asserting no save fires before hydration and that an undefined -> value transition restores correctly.
Added: the project menu image can be chosen from files already in the project. A "Choose from project files" picker (ProjectMenuIconPicker) lists PNG/JPEG/WEBP/GIF images from the asset store and the workspace source tree, fetches the bytes via getFileAssetRaw/getWorkspaceFileRaw, downscales anything over the 256 KB cap on a canvas, and feeds the existing menu_icon_data_url update flow (the upload handler now also accepts a Blob). No engine change.

Unified Files and AI Media Studio

Fixed "adding to Assets isn't working for images": after a successful project_private promotion, Files -> Generated now refreshes the Files -> Assets list (refreshFileAssets is threaded FilesTab -> GeneratedMediaMode) so the promoted image appears without a manual reload. Promotion success/error feedback now renders beside the acted-on gallery item (keyed by ${job?.id ?? "asset"}:${output.id}) instead of in the far-away form card, so a 422 like a missing public-site folder is visible where the user clicked.
Hardened the promote route: its body now runs off the event loop under run_project_write_with_retry (mirroring refresh_media_job), so a concurrent 12s hydrate holding the project write lock retries instead of surfacing "database is locked" on image-heavy projects.
Added a configurable per-project public-site directory (public_site_path, schema v28->v29 additive column on projects + project_execution_targets, mirrored in _ensure_default_execution_target, accepted on PATCH /projects). _promote_public_site, public_site_available, and _remove_public_site_copy resolve _public_site_root(target) (configured path, default apps/site/public) with the _path_inside guard anchored to the resolved root. A project that is not shaped like the Skybolt monorepo can now publish to a directory it chooses, editable from Files -> Public Site (project admins). The path is device-local (never synced), matching assets_path.
Added optional input reference files (ADR-0077 image-edit): input_asset_ids (existing generated assets) and input_paths (assets-folder files) on MediaGenerationRequest, strictly validated engine-side (existence, image-only, per-file + total byte caps, assets-root/_path_inside guards) and recorded as relative refs in the job's request_metadata. OpenRouter image models receive the references as base64 data URLs; MiniMax returns a clear unsupported failure. Support is advertised per model via supports_input_references, and Files -> Generated shows a "Reference files" picker only when supported.

2026-07-07

Cloud Identity (skybolt.ai)

New engine route POST /auth/clear-local-data. The in-process twin of the desktop shell's reset_corrupt_data command: wipes this machine's local setup (global database + at-rest custody) and leaves /auth/state reporting needs_setup without an engine restart. Per-project repos and projects/*.db are untouched; cloud data is unaffected (sign-in + recovery code restores it). With encryption on, at_rest.reset() locks the engine (background loops stand down) and the files are deleted with a brief Windows retry, falling back to truncating the DB to zero bytes (already treated as "no profile"). With encryption off, background loops keep live handles on the plaintext DB — a just-unlinked name lingers in Windows delete-pending and recreating it fails with "disk I/O error" — so the wipe runs through SQLite instead: every row deleted (PRAGMA defer_foreign_keys), schema kept.
resetCorruptData() (web api.ts) is now dual-path. Desktop (including dev.ps1 Tauri dev): unchanged — the shell stops the engine, deletes the files, relaunches. Browser dev (dev.ps1 -Web/-Engine, no Tauri): calls the new engine route and reloads the page. This also un-breaks the existing Delete local data and restart stale-data button and the DatabaseRecoveryScreen in browser dev, which previously threw "requires the desktop app".
AuthScreen footer action. Sign-in and unlock modes gain a Start over on this device link that reveals a danger-styled confirm panel ("Delete local data and start over") wired to the same reset. This is the escape hatch for stale-old-account machines where a correct recovery code is rejected against leftover local custody (see 2026-06-17 entry).
store/project_store.close_pooled_global(). Extracted from the _reset_open_caches test seam: closes the calling thread's pooled global connection so its OS file handle is released before the in-place delete; other threads' handles cycle via the key-epoch bump.
Tests. Engine test_at_rest.py covers both encryption modes end-to-end (wipe → needs_setup → immediate re-setup in the same process); web AuthScreen.test.tsx covers the footer reveal → confirm → reset call on both the login and unlock screens.

2026-07-06

Cloud Identity (skybolt.ai)

Password-gated reveal path. Added local engine route POST /auth/recovery-code/reveal. It requires a valid local session, an already-unlocked at-rest account key, and password confirmation against the local custody wrap before returning the current recovery code. skybolt.ai is not contacted and still has no key escrow.
Account settings UI. The skybolt.ai account card now includes a Recovery code section with a themed modal that asks for the local password, displays the code, and offers a copy action.
Tests. Engine coverage verifies success, wrong-password rejection, and missing-session rejection; web coverage verifies the Account modal reveal flow.

Machines

Fixed the Environment tab's Add/Edit Machine modal so background project refreshes do not overwrite the in-progress setup draft. Users can now finish Local or SSH Machine setup without racing the refresh cadence.

2026-07-01

Agent Terminal —

A read-only mirror of a scheduler seat (the Card Detail live terminal) could report its OWN fitted grid size to the engine, resizing the shared agent PTY out from under the seat and garbling/overlapping the TUI for every viewer.
TerminalView now locks a read-only terminal's xterm grid to the seat's true geometry (READONLY_SEAT_COLS/READONLY_SEAT_ROWS = 120×40, matching the engine's fixed seat spawn size) and CSS-scales it to the panel; every size report sends the seat geometry — never the local grid — while readOnlyRef is set, so a mirror can never resize the shared PTY. Why. Agent seats are a fixed 120×40 PTY; the read-only path was enforced for input but the resize channel was still a client write-path into the shared terminal. Files. apps/web/src/components/TerminalView.tsx (READONLY_SEAT_* grid lock + read-only size reporting).

Agents Execution

Busy specialists wait instead of being skipped. A role that is present-but-busy on the roster is no longer treated like a missing stage: routing parks and re-queues the card for it. Only a genuinely EMPTY roster for the role honours on_missing: "skip".
Human Gate is resumable. New resume endpoint in routes/questions.py (resume_human_gate in agent_workflows.py) approves/rejects a card parked at a workflow Human Gate node — previously nothing ever released it. Parked gates now also surface as human_gate items in the unified inbox (routes/inbox.py, driven by the workflow_human_gate / workflow_human_gate_resolved coordination events).
needs_clarification parks and raises a question. A needs_clarification completion is a hard stop for a human, not a route: the card parks (guarded by the parkable-status set) and an agent question is raised, instead of routing onward as if it were a failure edge.
Reviewer/advisor claim fix. The claim path resolves the work contract with the session's persisted participation_role, and _is_review_session now recognizes workflow-routed participation_role='reviewer' sessions (not just the legacy Review Agent role), so those sessions claim review-lane cards and return pass/fail verdicts under a review contract.
Invalid synced workflow override parks the card. A project override that cannot route the card (e.g. synced from a peer in a bad state) records a workflow_override_invalid coordination event and parks the card for a human instead of failing routing opaquely.
Follow-up dependency edges sync. Workflow-created card_dependencies rows now enqueue a sync_outbox.note_change upsert, so the dependency edge propagates to other machines instead of existing only locally.
Account default workflow applied. Workflow resolution honours the account-level default workflow (ACCOUNT_DEFAULT_WORKFLOW_KEY = "default") when computing the shipped base, instead of always falling back to the hard-coded Software Delivery Default.
Unreachable merge-fail block removed. route_next_workflow_session short-circuits for every role in MERGE_AGENT_ROLE_KEYS (merge roles finalize through the scheduler's deterministic merge path), so the dead merge-failure routing block below it was removed.
Question options normalization. _normalize_question_options (services/questions.py) tolerates malformed agent payloads (bare strings, nulls), dedupes ids, caps at 12 options, and clears a recommended_option_id that doesn't match a surviving option; free_text questions always persist with no options.
Tests: test_agent_workflows.py, test_questions.py, and test_scheduler.py cover the busy-wait, gate resume, clarification park, reviewer claim, invalid-override park, and normalization paths.
The Git Merge Specialist (git_merge_specialist, "Grant") is no longer a self-contained dispatcher persona. It is now a first-class, routable Agent Workflow Map node the orchestrator can send cards to. Only the three engine-driven dispatchers stay off the map: Handyman, legacy Review Agent, and legacy Merge Agent (Mac). This aligns the map with the persona library and roster model, which already treated Grant as orchestrator-led (SELF_DRIVING_ROLE_KEYS was already the same three).
Merge execution stays deterministic and gated. Grant remains in MERGE_AGENT_ROLE_KEYS, so routing a card to Grant sends it to ready_for_merge (_route_status_for_target) where the existing one-at-a-time, non-mergeable-blocked, conflict→human merge runs. route_next_workflow_session now short-circuits for every role in MERGE_AGENT_ROLE_KEYS (not just the self-contained set), so a failed Grant merge still falls back to the conflict→human gate instead of routing onward.
Removed git_merge_specialist from _SELF_CONTAINED_WORKFLOW_ROLE_KEYS (engine) and WORKFLOW_SELF_CONTAINED_ROLE_KEYS (renderer); added it to _PRIMARY_POSITIONS at the pipeline tail. The shipped Software Delivery Default does not wire Grant in — it ships as a disconnected, disabled palette node the project can place and connect. Grant shows under the "Review and QA" job-type filter.
Tests: test_agent_workflows.py now asserts Grant appears in the effective workflow, its override node/route survive the strip pass, and routing to Grant lands the card in ready_for_merge with a session; the Merge Agent (Mac) keeps the negative coverage. AgentWorkflowMap.test.tsx asserts Grant renders on the canvas while Mac stays hidden.
Fixed a stale Active/Skipped chip on the map: node availability now recomputes from the LIVE roster (rosteredRoleKeys in AgentWorkflowMap.tsx) instead of the /effective snapshot fetched when the map opened. A node placed/connected from the palette reads Active immediately when its role has a rostered agent — previously it kept showing "Skipped" until a full reload because autosave never refetches /effective. Availability is display-only (workflowFromGraph ignores it), so the resync never triggers an autosave or resets positions/edges. AgentWorkflowMap.test.tsx asserts a rostered node reads Active even when the snapshot marked it skipped.
OrchestratorService._acquire_lease now reclaims an EXPIRED held orchestrator lease before contending (new _reclaim_expired_lease, mirroring the dispatcher). An orphaned lease — engine hard-killed mid-wake, or a _release_lease commit that lost a "database is locked" race — no longer blocks every acquire until the scheduler's TTL reaper runs (up to lease_ttl_seconds, default 600s); the next wake steals it and runs.
Added a per-project contention backoff (_note_lease_backoff / _in_lease_backoff, SKYBOLT_ORCHESTRATOR_LEASE_BACKOFF_SECONDS, default 10s): when a wake loses the lease race, _tick stops re-running the doomed INSERT for that project until the cooldown lapses. This removes the recurring INSERT ... resource_leases + ROLLBACK write pressure that surfaced as UNIQUE constraint failed: resource_leases.resource_type, resource_leases.resource_key in the logs and contributed to "database is locked" errors elsewhere (scheduler claim path, cloud login _create_session). The IntegrityError was already caught (treated as lease_held); the fix is to stop retrying it in a tight loop.
Tests: tests/test_orchestrator.py::test_acquire_reclaims_an_expired_orchestrator_lease, ::test_acquire_preserves_a_fresh_orchestrator_lease, ::test_run_once_self_heals_past_a_stale_orchestrator_lease, and ::test_contended_lease_backs_off_instead_of_hammering.
route_next_workflow_session now returns control to the scheduler for self-contained roles such as handyman, review_agent, merge_agent, and git_merge_specialist.
This preserves the legacy dispatcher routing for those roles. A Review Agent fail returns the card to the queue with the current blocker note instead of being parked as an unresolved workflow routing issue.
Tests: tests/test_dispatcher.py::test_review_fail_returns_card_to_queue_with_note and tests/test_dispatcher.py::test_review_fail_supersedes_prior_review_blocker.

Cloud Control Plane —

TOTP encrypted at rest + replay guard — migration 0035_totp_encrypt_replay: users.totp_secret becomes Fernet ciphertext (idempotent in-place re-encrypt of plaintext rows) and a last-used guard blocks a TOTP code replayed within pyotp's ±1-step window.
Anti-enumeration: POST /api/v1/auth/signup answers a uniform 202 with no tokens for new AND existing emails (the engine chains a login after the 202); unknown and legacy (kdf_salt NULL) emails collapse into the same generic login 401. Details in cloud-identity/changelog.md.
Per-account plan-tiered sync storage quota (sync_v1.py + config.py sync_storage_quota_for_plan: base SYNC_STORAGE_QUOTA_BYTES 1 GiB; pro ×10, team ×25, enterprise ×100): an over-quota push is rejected whole; tombstones/shrinks always pass. Details in cross-machine-sync/changelog.md.
/api/v1/sync/push body cap raised to its batch budget: the ASGI body-limit override in main.py now derives from _MAX_PUSH_BYTES (24 MiB) instead of under-cutting it.
Push batch-duplicate insert regression fixed: an insert is recorded back into the batch's current_map, so a second item for the same (scope_key, entity, record_id) later in one push takes the update path instead of violating the unique index.
Retention pruning — new app/db/maintenance.py prunes dead rate_limit_counters windows and spent/expired email_tokens; run python -m app.db.maintenance from external cron (no in-process scheduler by design).

Cloud Identity (skybolt.ai)

Signup is non-enumerable — uniform 202 (cloud hardening F.3). POST /api/v1/auth/signup now answers the SAME 202 {"message": ...} with no tokens for both a genuinely new email and an already-registered one (previously the 201-with-tokens vs 409 split was a direct account-existence oracle). The engine (routes/cloud_auth.py, services/cloud_identity.py) adapts to the contract: /auth/cloud/signup and /auth/cloud/migrate chain a login after the 202 to obtain tokens — a 401 on the chained login means the email already exists under a different password and surfaces as the friendly duplicate-account error. Legacy pre-F.3 409s are still handled.
Legacy-login enumeration closed. In /api/v1/auth/login, unknown emails AND legacy (pre-split-derivation, kdf_salt NULL) accounts both collapse into the generic 401 — the previous distinct 400 for legacy accounts was an existence oracle.
TOTP secrets encrypted at rest + replay guard (migration 0035_totp_encrypt_replay, ADR-0042 F.1/F.2). users.totp_secret is widened and stored as Fernet ciphertext; the migration re-encrypts any still-plaintext secret in place (idempotent). A per-user last-used guard rejects a TOTP code replayed within pyotp's ±1-step skew window.
Retention pruning for disposable identity tables (F.5). New apps/api/app/db/maintenance.py prunes dead rate_limit_counters windows (floored to the longest limiter window so a live window is never deleted) and spent/expired email_tokens. Deliberately no in-process loop — run python -m app.db.maintenance from external cron.
Legacy refresh-token keyring slot migrates on read. Follow-up to the per-data-dir namespacing entry below: read_refresh_token now falls back once to the old bare refresh-token slot, moves the token into this instance's namespaced slot, and deletes the legacy slot (_read_legacy_refresh_token / _migrate_legacy_refresh_token in services/cloud_identity.py). Upgrading no longer drops the cloud session or strands the old live token in the keychain (the "signs in once after upgrading" cost noted below is gone); a keychain write failure leaves the legacy slot in place to retry, never signing the user out.
Tests. api tests/control_plane/ signup-202/enumeration/TOTP/pruning coverage; engine test_cloud_auth.py chained-login signup/migrate paths and test_cloud_refresh.py legacy-slot migration.
Bug. POST /auth/cloud/login authenticated against skybolt.ai and then wrote the local session + cloud_identity row on a single _open_global connection. Under the scheduler's concurrent global-DB writes, the INSERT INTO local_sessions (in _create_session) could raise sqlcipher3.OperationalError: database is locked, which bubbled up as an HTTP 500. Because the require_engine_token guard and the error response sit OUTSIDE the CORS middleware, the 500 came back without Access-Control-Allow-Origin, so the renderer reported it as a CORS policy error ("No 'Access-Control-Allow-Origin' header") rather than a server error — a confusing symptom for a transient lock.
Fix — routes/cloud_auth.py::cloud_login. The link-local-profile write (create session + _upsert_cloud_identity) now runs through store.project_store.run_global_write_with_retry (the same message-matched retry the scheduler claim path and other routes use). The work is idempotent — a rolled-back session INSERT leaves nothing and the identity write is an upsert — so a transient lock clears on retry. The cloud network call already happened, so no connection is held across it. _apply_sync_default runs afterward on its own connection.
Note. The sibling auth writes (signup / link / restore) share the same raw-_open_global pattern and remain a follow-up if they surface the same lock. Deeper still, the contention came from the scheduler/orchestrator retrying resource_leases UNIQUE conflicts in a hot loop — worth a separate look.
Tests. tests/test_cloud_auth.py::test_cloud_login_retries_transient_db_lock_instead_of_500 injects a one-shot lock into _create_session and asserts login returns 200 (retried), not 500.
Bug. The cloud refresh token lived in ONE fixed OS keychain slot (skybolt-cloud / refresh-token), regardless of data dir. Two Skybolt instances on one machine — a dev.ps1 build (isolated Skybolt-Dev data dir) and an installed build (Skybolt data dir) — therefore shared a single cloud session. Whichever signed in last owned the slot; the other's refresh against its (different) cloud then 401'd and wiped the slot, so the two instances perpetually clobbered each other's sessions. Pointing the dev build at production while the installed build was also on production surfaced this as a Working offline modal even though production was reachable.
Fix — services/cloud_identity.py. _keyring_account() now namespaces the keychain account as refresh-token:<sha256(normcased-abspath(resolve_data_dir()))[:16]>, so each data dir gets its own refresh-token family — the same multi-device model skybolt.ai already supports. store / read / clear use it; falls back to the bare account name if the data dir can't be resolved.
One-time re-sign-in. The old fixed slot is orphaned (left in place, harmless), so each install signs in once after upgrading. This is intentional: it is how a repointed build acquires a fresh session for its new cloud.
Tests. tests/test_cloud_refresh.py::test_refresh_token_slot_is_namespaced_per_data_dir asserts two data dirs get distinct slots, can't see each other's token, and that one clearing its session leaves the other's intact.

Cross-Machine Sync

Push body cap raised to the batch budget. The relay's per-push budget is _MAX_PUSH_BYTES = 24 MiB (routes/sync_v1.py); the ASGI body-limit override for POST /api/v1/sync/push (apps/api/app/main.py) is now derived from it (encoded_json_limit(_MAX_PUSH_BYTES) + header slack), so a legitimately full batch no longer 413s at the body guard before the route's own budget check.
Per-account plan-tiered storage quota. A push whose projected new total stored ciphertext (existing bytes + this push's net delta) exceeds the account quota is rejected whole, writing nothing; tombstones (byte_size 0) and shrinks are always allowed. Base SYNC_STORAGE_QUOTA_BYTES = 1 GiB, tiered by Entitlement.plan via sync_storage_quota_for_plan (pro ×10, team ×25, enterprise ×100); <= 0 disables the cap.
Agent seats no longer leak into cloud sync; tombstoned on finalize. sync_backfill's terminal_sessions clause now also excludes control_mode='agent' rows (a leaked seat would relaunch the agent's CLI as a HUMAN terminal on Restore, since control_mode is scrubbed from the wire). The scheduler's finalize/fail path deletes the seat row AND enqueues a terminal_sessions delete tombstone (inline note_change, op="delete"), so a previously-leaked seat can't persist as an undying requested placeholder on peers.
Cloud-binding stamp never downgrades a known account id to empty. An epoch-only reset (or any stamp write) firing while the cloud identity isn't cached yet could persist <url>|-with-empty-account over a known <url>|<account> binding; the later cached identity then looked like an account switch and forced a needless full re-sync. _binding_value_to_store (sync_engine.py) keeps the known account id in that case; empty→known remains a benign refinement, never a reset.
Foreign one-per-kind seats dedupe before re-owning. _ensure_seat_ownership first drops foreign-user seats that would collide with an existing LOCAL seat of the same kind under the one-per-kind partial unique index, then dedupes the remaining foreign duplicates of a kind (opencode/glm stay exempt — several allowed). A residual collision leaves the foreign seats under their source user with a WARNING instead of failing the whole re-own pass (which previously blocked the scope from ever syncing again until hand-fixed).
persistence added to the terminal-metadata portable allowlist (_SESSION_METADATA_PORTABLE, sync_serialize.py). The tmux persistence marker describes the terminal's nature (the same on every machine), and a surviving machine needs it to confirm-drain a retired machine's dead tmux placeholders (drain_gone_tmux_records keys on it) — scrubbing it left those tiles undrainable elsewhere.
Tests. api sync quota/cap coverage in tests/control_plane/; engine test_sync_engine.py (binding no-downgrade, seat dedupe), test_scheduler.py (seat tombstone on finalize), and serializer allowlist coverage.
Root cause of divergence. Synced MIRROR tables enforce secondary UNIQUE constraints the cloud never coordinates (project_registry(account_id, slug), model_providers(account_id, project_id, name), catalog_models(… , name)). The server stores opaque per-record blobs, so it happily holds two projects/providers/models with the same slug/name (created on different devices). On pull, applying the second one INSERT … ON CONFLICT(id) — which resolves the PK, not the secondary UNIQUE — raised UNIQUE constraint failed, the record was skipped, and after _SKIP_RETRY_CAP (5) passes its cursor was advanced past it permanently. Whole projects (and their boards), providers, and models silently never appeared on a machine — diagnosed live: the shared DB had 9 projects, one client had 4, and the missing ones were oversized project_registry records (giant menu_icon_data_url icons) whose slugs collided.
Invisible. The skip was logged at DEBUG, so real data loss left no visible trace.
Fixes — services/sync_engine.py:
- Observability: the pull-apply skip is now a WARNING naming the entity, record, and the concrete exception (e.g. UNIQUE constraint failed: …) — no row VALUES leaked. A future poison record is a one-line diagnosis instead of a multi-hour hunt.
- Disambiguate on clash (_dedupe_unique_column + _UNIQUE_DEDUPE): when a pulled record's renamable display column (slug/name) already belongs to a DIFFERENT local row, its LOCAL value is suffixed with a short id (slug-513cebdc) so the record still applies with BOTH rows kept. Identity (id) is unchanged and the pulled row is never re-enqueued, so it stays local and never cascades/ping-pongs. Covers project_registry, model_providers, catalog_models. seat_profiles (one-per-kind invariant), agent_personas (key is an identity), and model_routes (compound business key) are deliberately excluded — a clash there is surfaced by the WARNING for a targeted fix.
- Recovery (_resync_pull_cursors + _PULL_RESYNC_EPOCH): a record already skipped had its cursor advanced past it, so it never re-syncs even after the fix. Bumping _PULL_RESYNC_EPOCH clears the pull cursors + pull_floor markers ONCE (keeping sync_record_state + the outbox, so it re-PULLS with no re-PUSH storm), letting the next pass re-pull and now-successfully-apply the previously-dropped records. v1 recovers the projects/providers/models dropped by this bug.
Tests — tests/test_sync_engine.py: disambiguation for a colliding project slug and a colliding provider name (project-scope framing); the skip/give-up mechanism tests now force the failure via a patched apply (fix-independent); and the one-time cursor re-pull (cursors cleared once, record_state kept).
Bug (silent data loss). sync_record_state / sync_cursor are not namespaced by cloud. When the engine was repointed at a different cloud — e.g. SKYBOLT_CLOUD_URL flipped from the local dev API (scripts/dev.ps1 sets http://localhost:50002 when apps/api/.env exists) to production (https://api.skybolt.ai) — every already-synced row still looked "confirmed" per the OLD cloud's record-state. Backfill therefore pushed NOTHING, and the boot "match cloud head" delete-reconcile then trimmed the local data down to the new, sparse cloud's head. Observed live: the installed build (default prod) logged delete-reconcile removed 190 local row(s) absent from cloud head while the dev build (local API) kept everything.
Fix — services/sync_engine.py::_reset_stale_sync_state. The one-time reset now triggers on a cloud-binding change in addition to a _WIRE_EPOCH bump. _current_cloud_binding fingerprints the cloud as <normalized base URL>|<signed-in cloud_user_id> and stamps it in sync_meta. When it changes (_binding_requires_reset), the reset wipes sync_record_state + sync_cursor (global + every project DB) so everything re-pushes and re-pulls under the new binding — keeping the outbox and all domain data. Because a reset re-enqueues everything and push runs BEFORE delete-reconcile in the same pass, the local data seeds the new cloud instead of being trimmed to it.
No needless re-sync. A binding seen for the first time (existing installs upgrading) is recorded WITHOUT a reset; a same-URL cloud_user_id going empty→known is a refinement, not a switch. Only a different base URL, or two different known accounts on the same URL, forces the reset.
_WIRE_EPOCH 3 → 4 (one-time heal). Because "first sight" adopts without a reset, a machine whose record-state ALREADY predates the binding stamp (synced to one cloud, now pointed at another) wouldn't be caught by the binding check alone — it would push nothing and get trimmed to the new cloud's sparse head. Bumping the wire epoch forces one clean re-sync on the next pass so such a machine re-pushes everything to its CURRENT cloud (push precedes delete-reconcile, so data seeds instead of being trimmed) and establishes the binding baseline. Cost: every install re-syncs once on upgrade (idempotent, converges).
Tests. tests/test_sync_engine.py adds five cases: cloud-URL change resets + re-stamps, unchanged binding (incl. trailing-slash normalization) is a no-op, first-sight stamps without reset, account-id becoming known is not a switch, and a different account on the same URL resets.

SSH Machines

SSH git plumbing now runs through a LOGIN but non-interactive remote shell (services/ssh_ops.py). An interactive login shell runs the user's .bashrc, whose output lands on stdout AHEAD of git's — corrupting parsed output (git status/git log grew phantom rows). Interactive terminal/seat sites keep _remote_interactive_login_shell_command (they legitimately want the user's full shell setup); only parsed git exec paths switch.
The orphan tmux reaper keep-set now spans all projects and keys on row existence, not status. The reaper lists skybolt-* sessions machine-wide, and tmux session names are shared across every project on the same SSH box; terminal_sessions.status is machine-local (it doesn't sync). A keep-set scoped to one project — or one that excluded archived projects — would reap another project's (or an archived project's) still-live sessions. The keep-set now collects every terminal row across every project with a skybolt_path, so only a session that maps to NO row at all is reaped.
Restart during reattach returns a clean 503. Restarting (or attaching/deleting) a tmux-backed terminal while the reattach supervisor is mid-reconnect answers 503 {"detail": "Terminal reconnecting", "reconnecting": true} (routes/terminals.py) instead of mis-finalizing the terminal as exited and tearing down the still-running remote session.
Tests: apps/engine/tests/test_ssh_persistence.py, apps/engine/tests/test_terminal.py.

Unified Files and AI Media Studio

Made deterministic media-catalog registration idempotent under races (services/media_studio.py): two concurrent scans/hydrations of the same deterministic row both INSERT, and the bare INSERT raised IntegrityError on the loser — marking a SUCCESSFUL generation as failed. Registration now uses ON CONFLICT(id) DO NOTHING (and ON CONFLICT(project_id, relative_path) DO UPDATE where appropriate) so the loser is a no-op and an already-hydrated file costs no write transaction.
Renderer polls no longer hit the engine from hidden tabs (ADR-0074): interval polls — including the Files -> Generated media refresh — skip while document.visibilityState === "hidden", and the Dev Ops readiness sweep is min-interval-guarded against focus/visibility flurries (ProjectSurface.tsx, GeneratedMediaMode.tsx).
Changed native generated media storage so new API-backed media bytes are written under the selected project's outputs/skybolt-generated/<job_id>/ folder instead of the engine app-data media folder when the target is local.
Changed Files -> Generated asset listing to hydrate missing media catalog rows from project output folders (minimax-output/, outputs/, and model-output/) so generated files copied by Git or another project-file sync appear on another machine without cloud raw-byte sync.
Added stale-row cleanup for local-only project-output media when the backing file has disappeared.
Preserved the old engine-local storage path as a compatibility fallback for existing rows and non-local targets.
Added route regression coverage for hydrating Generated from a project output file and for writing new generated bytes to the project output folder.

2026-06-30

Agent Terminal —

Added open-quick-card to the shared project-shell shortcut registry. The default chord is Ctrl/Cmd+Shift+Q, and it toggles the project-level Quick Card popup, focusing it when opened. Tests.
keyboardShortcuts.test.ts covers the default chord mapping.
TerminalView now shows a History toggle on read-only zoomable mirrors, which are the Card Detail agent-seat terminal and its pop-out window.
The History view renders the terminal's captured transcript from terminal_sessions.scrollback and appends live output chunks while the socket is connected. It sanitizes terminal escape sequences for a readable text view.
When a read-only mirror is zoomed in, plain wheel still pans the enlarged terminal, while Shift+wheel forces xterm's local scrollback. Why.
Scheduler seat CLIs such as Codex, Claude Code fullscreen, and Antigravity often own the alternate screen. In that mode the terminal screen itself has little or no normal scrollback, so a card pop-out can show current output but still feel impossible to scroll back. The engine already keeps the last captured PTY bytes locally, so the viewer now exposes that as a separate read-only history panel. Tests.
TerminalView.test.tsx covers sanitized history rendering, live output appending into the history view, and Shift+wheel scrollback while zoomed.

Agents Execution

Added a Project Board Quick Card composer for fast typed or OS voice-to-text capture.
The first non-empty line becomes the card title, the remaining text becomes the visible summary, and the full captured text is saved as an initial human instruction note when the role can author agent instructions.
Quick Card now submits on Enter, while Shift+Enter keeps a line break.
Added the rebindable open-quick-card shortcut, defaulting to Ctrl/Cmd+Shift+Q, which toggles the project-level Quick Card popup from any project page and focuses it when opened.
Moved the Quick Card popup out of the board tab mount path so returning to the board cannot replay an old shortcut signal and reopen it unexpectedly.
Quick Card recognizes voice-command phrasing such as "create a project card to ..." and auto-creates the card after a short pause. It can also detect lane phrases such as "put it in ready".
Before creating the card, Quick Card asks a hosted chat model to polish the title, summary, and instructions. If no hosted model is ready or the AI response is unusable, the original dictated text is still saved.
The composer now shows whether AI review is ready or off, and the submit button changes to "Review + Create" only when the hosted-model polish path can actually run.
Quick Card now has a dedicated project-level AI review model setting. When unset, it auto-prefers tiny fast hosted models in the Gemini Flash-Lite family, then openai/gpt-4.1-nano, when those catalog models are configured, before falling back to the first hosted model.
Quick Card hosted-model polish now bypasses normal agent-chat context setup: no project brain build, file index, or tool schema is sent for chat_threads.kind = "quick_card". OpenRouter Quick Card requests also ask for low-latency provider routing and no/excluded reasoning output.
Tests: QuickCardComposer.test.tsx.
Added a left-side Workflow Map palette for agent role nodes that are not currently on the board, made the right properties panel always visible for selected nodes or edges, and showed the selected agent's description/purpose.
Upgraded the Workflow Map Agent Catalog with search, job-type filtering, a Not on board / All types switch, and placeable skipped/not-yet-rostered agent nodes that still route with on_missing: "skip" until the project adds that role.
Added Add Project Agent Types to place every roster role type that is not already on the board, deduped by role key so projects with several agents of the same type still get one workflow node.
Replaced raw route condition JSON editing with Add Rule controls for card type, work mode, changed path, risk flag, branch target, and release-required routing.
Moved workflow warning details behind a Workflow Issues modal so warnings do not consume canvas height in the map panel.
Made agent and control-node palette items draggable so users can drop them directly onto the workflow canvas at the intended position while keeping click-to-add as the quick path.
Added control node types: Fork, AND Join, OR Join, Decision, and Human Gate. Control nodes render in the map, save in project overrides, route without roster seats, and Human Gate parks cards for input.
Improved map readability with obstacle-aware orthogonal edges that route around agent boxes instead of drawing across the top of them, while keeping dark edge labels for action/stage/reason labels.
Added animated directional flow traces and target arrows to active workflow routes so users can see the direction work/data moves through the map.
Tightened routing so source and target agent boxes are also avoided after a short handle escape, and added green routes connected to the selected agent plus a distinct selected-route color.
Made agent node cards opaque over routed edges and froze route recomputation during node drag, then recalculated after drag stop to reduce map lag.
Made workflow routes reconnectable by dragging line endpoints, and added context-aware right-click menus for agents, routes, and the canvas.
Added click-through selection for stacked routes from the same action handle, with a matching Select Next Stacked Route action in the route properties panel.
Added Delete Route to the selected-route properties panel and made route labels pass pointer events through to the wider route hit target for easier click/right-click selection.
Added Delete From Board for workflow agent nodes. It removes the node and connected routes from the project workflow map without deleting the project roster agent or persona.
Added debounced autosave for workflow map edits, live filtering for stale skipped-node warnings, and project-local display-name overrides for workflow nodes.
Serialized workflow-map autosaves so older project override writes cannot land after newer edits and restore a stale graph.
Excluded self-contained dispatcher personas from the Workflow Map: Handyman, legacy Review Agent, legacy Merge Agent, and Git Merge Specialist are stripped from defaults, imports, the canvas, warnings, and Add Project Agent Types so they keep their engine-owned behavior outside the map.
Kept the shipped Software Delivery Default focused on orchestrator-led roles: implementation lanes are Full Stack, Backend, Frontend, and Mobile, with Release Manager remaining a map stage only when release work is required.
Kept disconnected catalog-only roles disabled in the shipped default so Restore Default redraws a connected workflow while leaving extra agent types available in the left palette.
Kept workflow node display-name typing local until blur or Enter so renaming a node does not rerender and autosave the map on every character.
Replaced the manual Save Now action with JSON Export and Import actions that open copy/paste JSON modals. Imports redraw the graph immediately and then flow through autosave.
Added Remove Unused Agents to clean any on-board agent node with no incoming or outgoing routes.
Added typed agent questions with fixed options and recommended choices, surfaced in the Card Detail Agents tab so users can choose an option, add custom context, answer a blocked agent, and requeue the card.
Added workflow edge metadata for feedback_type and failure_route_strategy, plus default orchestrator-triage routes for security, dependency-security, and data-safety failures before direct developer rework.
Added orchestrator-selected workflow branch activation: assign_card can persist workflow_activate_role_keys, and AND Join waits only for those activated inbound lanes when the card has them.
Added specialist-first orchestration hints and the split_parallel_work decision. The orchestrator snapshot now marks agents as available/busy and specialist/generalist, and the new decision creates assigned specialist child cards that the parent waits on so frontend, backend, database, mobile, or similar lanes can run in parallel instead of defaulting to Full Stack.
Persisted workflow_route metadata on workflow-created sessions so receiving agents see the source action, reason code, feedback type, and routing strategy.
Tests: test_agent_workflows.py, test_questions.py, AgentWorkflowMap.test.tsx, and CardDetailModal.test.tsx.
Added role-type workflow actions, fixed failure reason codes, source_action, reason_codes, and route_mode: "same_card" to workflow edge normalization while keeping saved v1 maps compatible.
Updated deterministic routing so pass/done/planned edges move forward, fail/blocked edges route the same card backward, downstream completed stages are cleared for rework, and unmatched failure routes kick the orchestrator then park the card in needs_input.
Updated the React Flow project map with action output ports, dark custom edge-label pills, Skybolt-styled zoom/fit controls, hidden attribution, and editor fields for action/reason routing.
Added an authoritative Agent Workflow Map with a shipped Software Delivery Default. Projects start from the shipped default and save their edits at agent_workflow_override.
Added services/agent_workflows.py, project schema v25 cards.workflow_state_json, and GET /projects/{id}/agent-workflow/effective so the UI can show active/skipped nodes and routing warnings.
Scheduler finalization now routes through the workflow before legacy fallback behavior, so Unit Test Writer, Code QA Reviewer, Code Review Agent, and Release Manager run predictably when present and are skipped when absent.
Added the React Flow based project Workflow Map as a pop-out project panel with fullscreen and canvas zoom controls.
Tests: test_agent_workflows.py and AgentWorkflowMap.test.tsx.
Fixed orchestrator-routed repo-change specialists so they no longer mark a card completed while their work remains only on an isolated agent branch.
When a Review Agent is active, any completed repo-change implementation session now records its branch on the card and routes the card to review, matching the Handyman path. That branch then survives human/review sign-off and can be landed by the Merge Agent.
Added regression coverage for an orchestrator-style backend specialist session routing to review with cards.branch_name preserved.
Tightened local agent worktree cleanup so the fallback recursive delete can remove only a direct child of the repo's managed .skybolt/worktrees root.
Cleanup now refuses the managed root itself, the primary checkout, user-created external worktrees, and sibling paths that merely contain a folder named worktrees.
Added regression coverage for unmanaged worktrees paths and the managed root, and documented the seat launch invariant that write-capable agents must run from the resolved managed worktree.
Removed the hard 10-pod cap from the Overview tab's Live Agent Floor. The overview now renders every live session and idle roster agent, so projects with larger rosters no longer hide agents after the first ten.
Updated the floor model regression test to lock in the uncapped behavior and cover a nineteen-agent idle roster staying inside the visible floor.
Widened the idle-agent dock from four pods per row to seven, so a nineteen-agent roster lays out as three rows instead of five.
Removed Skybolt-owned GitHub credential injection from agent seats. The scheduler no longer exports GH_TOKEN, GITHUB_TOKEN, or GIT_CONFIG_* credential-helper variables for local or SSH seats.
Seat handoff files now carry only completion/checkpoint helpers. Skybolt no longer writes .skybolt/git-credentials; cleanup still removes that legacy file when encountered.
Agent Git commands now rely on the selected Machine's configured Git auth, matching Dev Ops push behavior.
Problem. Adding Avery, the Chief Project Orchestrator, and selecting an Antigravity seat could still fail with "No model is routed for orchestration." The orchestrator was reading legacy model_routes, but the current UI saves either a project default chat model or an agent-level model/seat override. New projects have no visible route editor.
Fix. The orchestration runtime still honors legacy routes first, then falls back to Avery's catalog-model default, then the project default catalog model. CLI seats such as Antigravity remain excluded because orchestration is an in-engine structured model call, not a PTY-backed seat.
Tests. test_orchestrator.py covers Avery's catalog model default, the project catalog model default, and the clearer failure when the default is an Antigravity seat.

Changelog — minimax-integration

Documentation update: ADR-0077 makes MiniMax the first concrete AI Media Studio provider adapter. Generated media review now lives in Files -> Generated with the existing model-output watcher and planned explicit save/promote actions.

Cross-Machine Sync

Bug. The launch sync modal treated last_pass.status="no-token" as "skybolt.ai is unreachable." If a linked profile had sync enabled but the local cloud refresh token was missing or expired, every app launch could show the Working offline modal even though the failure was a sign-in/session state, not a network outage.
Fix. services/sync_service.py now reports token-refresh transport failures as deferred with the cloud-unreachable detail, while a truly missing token remains no-token. SyncStartupHost shows the offline modal only for transport-looking failures and dismisses for persistent non-network states (locked, no-identity, no-token).
Tests. tests/test_sync_service.py covers token-refresh offline vs no-token, and SyncStartupHost.test.tsx covers the no-token non-modal path.

Git Operation Surface —

Retired Skybolt-stored GitHub PATs from Dev Ops. The project credential API, renderer token card, and github_credentials service are removed.
Git push and commit-push now run only through the selected Machine's existing Git auth. The in-app SSH passphrase fields and ssh_passphrase request fields were removed from direct Git operations and approval resolution.
Dev Ops setup/settings now show a reusable GitHub Machine Auth Guide with Local Windows, Local Linux, and SSH Machine instructions. SSH Machine copy clarifies that local credentials do not apply to the remote host.
The reusable guide now renders as a compact GitHub Auth panel with an Open GitHub Auth Guide button. The full modal explains why auth belongs to the selected Machine, why SSH remotes may ask for a private-key passphrase, how to load that key with ssh-agent, how to verify auth, and how to switch to an HTTPS remote through gh auth setup-git when desired.
Git auth-looking push failures now tell the user to authenticate Git on the selected Machine with Git Credential Manager, gh auth login, or ssh-agent.
SSH public-key push failures now get a more specific message: the selected Machine is using an SSH remote, GitHub did not accept the key, and the user should check ssh-add -l, ssh -T git@github.com, the repo remote, the GitHub account that owns the public key, or switch the remote to HTTPS. SSH Machine guidance also explains that a one-terminal ssh-agent may not be visible to Skybolt's non-interactive SSH session and recommends a persistent agent such as keychain or a systemd user ssh-agent.
SSH Git operations now run through the remote login shell before invoking the strict Git argv command. This lets Machine-local shell setup such as keychain export SSH_AUTH_SOCK for Skybolt's Git fetch/pull/push path, matching the behavior users see in a normal remote terminal.
Fixed the ready Dev Ops panel's Default branch selector sometimes snapping back to the previous branch after a change. The select handler updated gitDefaultBranchDraft and immediately called the save path, so saveProjectGitSettings often read React's previous draft value instead of the branch the user just selected.
The selector now passes the chosen branch directly into saveProjectGitSettings, and ProjectSurface builds the outgoing git_settings.default_branch from that explicit value. Changing the branch is now a single write, like the other project settings.
Regression coverage: DevOpsTab.test.tsx now asserts that changing the selector calls the save path with the selected branch value.

In-App IDE

Documentation update: ADR-0077 keeps the in-app IDE as Files -> Workspace inside the unified Files tab. The workspace API and local-first source boundary are unchanged; Assets, Generated, and Public Site modes are documented under documents/features/ai-media-studio/.

Public Site

Documentation update: ADR-0077 adds future Files -> Public Site staging for media promoted from AI Media Studio. The mode may prepare explicit apps/site file changes, but normal Git, CI, build, and deployment flows remain separate and user-controlled.

Unified Files and AI Media Studio

Added /projects/{id}/media/providers, /media/jobs, /media/assets, asset content, promotion, refresh, and model-output import route coverage.
Added local-only media storage, project_private promotion into the configured assets folder with Brain Off by default, and confirmed public_site promotion into apps/site/public/generated.
Hardened media job execution so provider exceptions and output-import failures are recorded as failed jobs with sanitized messages instead of escaping the route or leaving jobs running.
Added generated-media deletion. Local-only items remove the engine-local stored file and catalog row; already-promoted project/private or public-site copies remain owned by their target surface.
Added MiniMax as the first local CLI adapter and OpenRouter as the first API-backed adapter that reads project model catalog rows instead of hard-coded model names.
Kept the Media Studio creation form model-first: users choose media type and model, while the provider adapter is derived from the selected Settings model.
Limited v1 generated media to image and video files, with local-target-only file imports and promotions until SSH media handling is implemented through the Machine abstraction.
Updated docs and tests for explicit public-site confirmation and no prompt leakage into repo-carried file_assets metadata.
Implemented the merged top-level Files tab in the renderer with Workspace, Assets, Generated, and Public Site modes.
Removed the standalone top-level Assets tab from project navigation; legacy assets and context defaults now resolve to Files.
Moved the existing asset manager behavior into Files -> Assets while preserving upload, delete, open, and Brain-toggle behavior.
Added Files -> Generated with a native Media Studio panel that calls the provider/model/job/promotion API contract and degrades cleanly when the routes are absent.
Changed generated media previews to fetch asset bytes through the authenticated API client and render object URLs, matching the workspace image viewer pattern used by the desktop app.
Moved generated and imported media cards into a full-width gallery below the Media Studio panel, with click-to-expand image previews that toggle closed on the enlarged image.
Added a Delete action to generated media cards with a themed confirmation dialog and immediate gallery cleanup.
Replaced the prominent model-output watcher panel with a right-side recent-errors panel capped at ten failed/cancelled jobs; successful outputs now live only in the gallery.
Added a frontend Public Site mode that reads media provider status and public-site media assets, remaining unavailable until backend detection or public-site outputs are available.
Added focused renderer tests for Files mode switching, asset compatibility, stored-tab fallback, and the removed top-level Assets navigation.
Added the Unified Files and AI Media Studio feature docs.
Defined the merged top-level Files tab with Workspace, Assets, Generated, and Public Site modes.
Defined media visibility tiers: local_only, project_private, and public_site.
Recorded MiniMax as the first concrete provider adapter and OpenAI Images as a future image adapter.
Excluded OpenAI video v1 from the adapter plan because official OpenAI docs currently mark the Videos API and Sora 2 family as deprecated and scheduled for shutdown on 2026-09-24.
Captured the local-first publication boundary: raw bytes stay local by default, public promotion is explicit, and generation never auto-edits, auto-commits, or auto-deploys.

2026-06-29

Agents Execution

Symptom. A budget-paused card moved back to Ready did nothing until the owner was toggled agent→human→agent, which then kicked it off.
Root cause. When the runaway break (F3) trips in the scheduler's claim loop, it paused the card and continued but left the declined queued session non-terminal. Its card is paused (excluded from the claim query), so the session lingered forever — it held the agent's single-flight slot (_agent_has_open_session), so the Handyman couldn't pick up ANY other card, and it blocked dispatch_ops._oldest_eligible_card (which skips a card that already has a non-terminal session) so the card could never be re-dispatched after resume. The agent→human toggle "fixed" it only because _cancel_card_agent_sessions failed that stranded session.
Fix. scheduler._claim_next now fails the queued session it declines when pausing for a budget breach (mirrors dispatch_ops._reap_orphaned_sessions for backlog-parked cards). On resume to Ready the card has no open session, so the card_queued wake lets the dispatcher seat a fresh session and the scheduler launches it — no toggle needed. Bonus: a paused card no longer freezes its agent's single-flight slot.
Note. A card paused BEFORE this fix still carries a pre-fix stranded session; clear it once with the owner toggle (or re-queue it). All pauses after the fix resume cleanly.
Tests. test_cost_and_breaks.py::test_requeue_resets_session_budget now asserts no non-terminal session remains for the card after the pause.
Why. A set of behavior-shaping constants were hardcoded with no user control: the Merge Agent commit lookback, the agent tool-round cap, the command-run timeout, the git output caps, and the orchestrator/dispatcher loop timing. Power users wanted to tune these per account.
New advanced account-settings category. store/settings_store.py adds advanced to ACCOUNT_SETTING_CATEGORIES, a flat clamped map (ADVANCED_TUNING_BOUNDS, 15 keys) and grouped resolvers: resolve_account_merge_history_lookback, resolve_account_max_tool_rounds, resolve_account_command_timeout, resolve_account_git_limits, resolve_account_orchestrator_timing, resolve_account_dispatcher_timing. Values clamp on read, MAX-across-accounts on a shared machine, and never raise (read on hot loops). Account-level only — no per-project tier.
Precedence. A SKYBOLT_* env override (orchestrator/dispatcher, and any existing per-call override like planning's tool-round budget or an explicit command timeout) still wins; otherwise the account value applies; otherwise the built-in default. Captured per-knob via _env_pinned_timing.
Threading.
- *Merge lookback* — _merge_prompt_text takes a history_lookback param; _prompt_text resolves it (conn in scope).
- *Tool rounds* — resolved in routes/chat.py::_prepare_chat_run; planning keeps its larger budget.
- *Command timeout* — resolved inside command_exec.execute_command_run when the caller passes no explicit timeout; gates.py and startup_commands.py resolve it for their runs too.
- *Git caps* (git_log_max_count, git_conflict_list_max, max_card_diff_chars) — resolved at the conn-holding callers (routes/git.py, routes/cards.py card-diff) and threaded into _execute_project_git_operation(..., git_limits=...), which injects them into payload under reserved _-keys that the git_ops consumers read (_payload_int_limit). Agent/approval git paths keep the built-in defaults.
- *Orchestrator/dispatcher timing* — a refresh_timing_knobs() runs at the top of each _tick(), re-resolving the loop knobs live (env-pinned ones untouched). The orchestrator's hourly-cost break already reads account budget caps; this adds the timing side.
UI. New Advanced tab in Account Settings, inputs grouped Git / Merge / Agent loop / Orchestrator / Dispatcher, each clamped client-side to the same ranges.
Tests. test_advanced_tuning.py (resolvers read/clamp/skip-malformed, route round-trip, git payload-cap threading, merge-lookback prompt, orchestrator/dispatcher refresh + env precedence); plus the web normalizer/payload/modal coverage for the Advanced tab.
Why. Only retry_budget (the default max_sessions_per_card) had an account-level default; the other F3 budget caps — max_attempts_per_card, card_max_cost_usd, orchestrator_max_cost_usd_per_hour, orchestrator_max_runs_per_hour — could only be set per-project (agent_budgets) or per-card (budget_json), with a hardcoded DEFAULT_BUDGETS fallback. Users wanted to set these once for their whole account.
Change. The account agents settings category now accepts those four keys (named exactly like their DEFAULT_BUDGETS keys, unlike the retry_budget alias). resolve_account_budget_caps (store/settings_store.py) resolves them — clamped to AGENT_BUDGET_CAP_BOUNDS (attempts 1–100, costs $1–$1000, orchestrator runs 1–10000), MAX across accounts on a shared machine — into a {budget_key: value} dict.
Cascade (unchanged precedence). A new account_budget_caps keyword threads through get_project_budgets / set_project_budgets / get_effective_budgets / card_budget_breach (services/budgets.py), setting the base default at the same layer as the retry budget: per-card budget_json > per-project agent_budgets > account caps > built-in DEFAULT_BUDGETS. The scheduler caches them on self.account_budget_caps (refreshed in refresh_default_retry_budget, passed into card_budget_breach); the orchestrator's _over_cost_budget resolves them so an account-level orchestrator_max_cost_usd_per_hour actually throttles wakes; the routes/cards.py /agent-budgets + /cards/{id}/usage endpoints resolve them so reported budgets reflect the account defaults.
Schema. No structural change — the agents category is an open dict clamped on read; the AccountSettingsUpdate doc comment lists the new keys.
UI. Account Settings → Agents tab gains four inputs beside Max concurrent / Retry budget (cost caps accept decimals; attempt/run caps are integers), wired through AccountAgentsSettings / accountAgentsSettingsFromUnknown / accountAgentsSettingsPayload.
Tests. test_cost_and_breaks.py::test_account_budget_caps_become_project_defaults, ::test_project_override_wins_over_account_budget_cap, ::test_account_budget_cap_is_clamped; test_scheduler.py::test_refresh_reads_account_budget_caps; plus the web normalizer/payload/modal tests.
Why. On a merge conflict the Merge Agent seat resolved the hunks from the conflict markers alone. The markers show *what* clashes but not *why* each side changed those lines, so the seat could guess wrong — dropping an intentional deletion or clobbering work one branch deliberately added.
Change. AgentScheduler._merge_prompt_text now adds an explicit "understand the work first" step: when the git merge <source> conflicts, the seat reads up to MERGE_HISTORY_LOOKBACK (15) recent commits on both branches — git log --stat -n 15 <source> (incoming work) and git log --stat -n 15 HEAD (work already on this branch) — plus the per-file patches behind each conflicted path (git log -p -n 15 -- <conflicted-path>). It resolves from that intent, keeping the additions/deletions each side meant to make so neither branch's work is dropped.
Still local-only. The new history step uses read-only git log only; the seat continues to never fetch, checkout, pull, or push — the engine lands the resolved branch onto the default branch afterwards.
Tests. test_merge_prompt_directs_history_inspection_on_conflict (asserts the 15-commit per-branch lookback, per-file patch inspection, and the keep-both-sides intent); the existing test_merge_prompt_is_local_only_no_push still guards no fetch/push/origin.
Why. Setting an agent to GLM launched claude (pointed at Z.AI), not opencode, and there was no opencode path in the agent seat system at all — only the terminal quick-launch menu ran opencode --model …. opencode-backed models (GLM, MiniMax, …) couldn't run as agents.
New opencode seat kind. Launches the opencode CLI; the seat's pinned model becomes --model <provider/model> (the sanitizer already allows /). It is agent-roster-only — a new AGENT_ONLY_SEAT_KINDS set + an includeAgentOnly option on buildChatChoices/isEnabledCliSeat keep it out of the chat composer, planning picker, and default-model picker.
GLM repointed. SEAT_KIND_COMMANDS["glm"] is now opencode (was claude); the Z.AI ANTHROPIC_BASE_URL env is gone (SEAT_KIND_ENV is now empty). SEAT_KIND_DEFAULT_MODEL gives an unpinned GLM seat the default model zai-coding-plan/glm-5.2, so existing GLM seats keep meaning "GLM" under opencode. Both glm and opencode are flagless, so unattended runs use the existing seat_autoaccept terminal monitor (same as mmx).
Model auto-list (execution-target-aware). New services/opencode_cli.py (resolve_opencode_cli, list_models) runs opencode models where the work runs: a local target shells out locally; an SSH target runs the probe over SSH through a login+interactive shell (PATH/config matching SSH-backed terminals, mirroring the capability probe). GET /projects/{id}/opencode/models resolves the project's default execution target and serves it. The seat form auto-populates the model picker (Refresh + free-text fallback + install hint when opencode is missing/unreachable).
Capability/detection follow-through. Local + SSH machine probes now report opencode and base glm on it; the Windows command resolver allow-list and localCliTerminalShell/ machineSupportsLocalCliTarget track opencode for GLM and the generic seat.
Not done: no dedicated MiniMax-via-opencode preset kind (use the generic OpenCode seat and pick minimax-coding-plan/MiniMax-M3); the existing mmx MiniMax-CLI kind is untouched.
What. The per-card "retry budget" (the default max_sessions_per_card — each agent session is one retry of a card before the F3 runaway break pauses it) is now editable in Account Settings → Agents tab, beside Max concurrent agents, and its built-in default rose from 6 to 10.
Account setting. New key retry_budget in the existing agents account-settings category (global DB account_settings, clamped to 1–100 via AGENT_RETRY_BUDGET_MIN/MAX), so it syncs across the user's machines like max_concurrent. resolve_account_retry_budget (store/settings_store.py) reads it (MAX across accounts on a shared machine; bad values skipped).
Budget wiring. DEFAULT_BUDGETS["max_sessions_per_card"] is now 10. get_project_budgets/get_effective_budgets/card_budget_breach (services/budgets.py) take a default_retry_budget keyword that sets the base default; precedence (later wins): built-in default → account retry_budget → project agent_budgets → per-card budget_json. The scheduler caches it on self.default_retry_budget, refreshed each loop by refresh_default_retry_budget() and passed into the breach check; routes/cards.py /agent-budgets + /cards/{id}/usage resolve it on their project connection (which sees account_settings via main).
Renderer. AccountAgentsSettings gains retryBudget (default 10); accountAgentsSettingsFromUnknown parses both agents keys (max_concurrent, retry_budget) independently with per-key fallback; accountAgentsSettingsPayload serializes both. The Agents tab adds a "Retry budget" number input (1–100) that preserves Max concurrent agents on change (AccountSettingsModal).
Tests. test_cost_and_breaks.py (default 10, account setting becomes the default, project override wins), test_scheduler.py::test_refresh_default_retry_budget_reads_account_setting, AccountSettingsModal.test.tsx + surfaceShared.test.ts (parse/clamp/round-trip both keys).
Problem. A freshly launched seat must look interactive before the engine pastes its prompt, or the trailing Enter is swallowed and the seat never starts. For seats with no ready-prompt regex (Antigravity's agy), the only signal was the CLI-agnostic output-settle: PTY grew >64 bytes then went quiet for ~1.5s. But agy re-runs its sign-in flow on every launch with multi-second pauses *between* boot phases, so a pause longer than the settle window tripped "ready" mid-sign-in — the prompt landed in a not-yet-live input box and was lost.
Per-CLI readiness profile. terminal.seat_ready_profile(command) returns a SeatReadyProfile (ready-prompt regex + optional settle override + settle floor). Every seat keeps its historical behaviour except those in SEAT_READY_PROFILES; Antigravity (agy) overrides the settle window (SKYBOLT_ANTIGRAVITY_SETTLE_SECONDS, default 4.0s) and adds a floor (SKYBOLT_ANTIGRAVITY_SETTLE_FLOOR_SECONDS, default 10.0s).
Settle floor. LocalTerminalBroker.wait_for_seat_ready gained settle_floor_seconds: the output-settle signal is suppressed until that long after the wait begins, so an early mid-sign-in pause can't be mistaken for the ready prompt. A real ready-prompt regex match is authoritative and is never gated by the floor (so a future agy regex fires immediately).
Both readiness paths. The scheduler's autonomous seat-prompt delivery (_deliver_seat_prompt) and the renderer-driven seat-ready route now resolve the profile. The route keeps caller-tuned timeout/settle and adds the engine-owned floor (0 for every non-profiled seat → no-op for them).
Next step. The agy profile still falls back to settle/floor because we have no ready-prompt regex for it yet. Capture one with python -m skybolt_engine.debug.capture_seat agy and drop it into SEAT_READY_PATTERNS["agy"] / the profile's pattern; the timers then become the fallback.
Tests. tests/test_terminal.py (profile resolution for agy/default seats; floor suppresses settle within the window; floor only delays settle, fires once elapsed). tests/test_scheduler.py (_deliver_seat_prompt passes the profile's settle + floor for agy, defaults for Claude). tests/test_app.py (seat-ready route applies the Antigravity floor; 0 for non-profiled seats).

Desktop App

The main window now reopens exactly where it was — same monitor, position, and size — and restores its maximized / full-screen state. Previously nothing was persisted; the OS picked the placement on every launch, so "it remembered the monitor" was incidental.
Implemented with tauri-plugin-window-state (Cargo.toml), registered in the main() builder with flags SIZE | POSITION | MAXIMIZED | FULLSCREEN (not VISIBLE). The main window is skip_initial_state'd and restored explicitly in create_main_window: it is created visible: false (tauri.conf.json), restore_state runs, then show() — so it paints once in place with no flash at the 1440×960 config default. The plugin still saves the main window on exit and auto-save/restores terminal pop-out windows (a free bonus).
Robustness handled by the plugin: a saved window whose monitor was unplugged (or that would land off-screen) is clamped back onto a visible monitor, so it can never reopen "lost." Geometry is stored in %APPDATA%/ai.skybolt.desktop/.window-state.json, which is outside the ADR-0073 cloud-sync concern (no data_dir path travels in it, and clamping makes it safe even if the file did sync).
No capability change: save/restore are driven from Rust, not renderer IPC.
Closed the "the app vanished and engine.log was empty" gap: engine.log only carries the Python sidecar, so a crash in the Rust shell or the WebView left no trace. Each process now has a durable log under <data-dir>/logs/.
Rust shell (main.rs): a std::panic::set_hook writes panics (with backtrace) to a new desktop.log, plus a startup + engine-spawn breadcrumb. Release builds have no console (windows_subsystem = "windows"), so the default panic printer went to the void; the hook is the fix. The sidecar is also spawned with PYTHONUNBUFFERED=1 so the skybolt-engine.log mirror keeps the last lines before a hard crash. Tests: format_panic_record unit test.
Renderer (crashLog.ts, wired in main.tsx): global error / unhandledrejection listeners ship failures to the engine via shipClientLog → POST /client-logs, so renderer crashes land in engine.log (tagged [renderer:…]). Capped per session; best-effort and never throws. Tests: crashLog.test.ts.
Engine (app.py): _install_engine_crash_diagnostics arms faulthandler → engine-faults.log (native segfault / access-violation tracebacks, e.g. SQLCipher) plus sys.excepthook and threading.excepthook routed through the durable skybolt.crash logger. New token-authenticated /client-logs route (in the unlock allowlist so lock-screen crashes are captured). Tests: test_client_logs.py, test_engine_logging.py (+crash-diagnostics).
Docs: troubleshooting.md gains an "App crashes / vanishes with nothing in engine.log" triage table mapping each symptom to the file that holds the answer (including the Windows Event Viewer path for a hard WebView2 crash).

Git Operation Surface —

The three git output safety caps — GIT_LOG_MAX_COUNT (100), GIT_CONFLICT_LIST_MAX (200), and MAX_CARD_DIFF_CHARS (400000) — are now overridable per account via the advanced account-settings category (git_log_max_count / git_conflict_list_max / max_card_diff_chars, clamped). The built-in constants remain the defaults.
Threading: conn-holding callers resolve resolve_account_git_limits(conn) and pass _execute_project_git_operation(..., git_limits=...). git_exec._augment_payload_with_limits injects the resolved caps into payload under reserved _-prefixed keys, and the consuming git_ops helpers (_git_log, _git_card_diff, _git_conflicts/_conflict_failure/_git_merge_continue via _conflict_paths) read them through _payload_int_limit, falling back to the constant. Wired at the Dev Ops git route (routes/git.py) and the card-diff endpoint (routes/cards.py); agent-tool and approval git paths pass no limits and keep the defaults. The reserved keys never reach git argv.
See documents/features/agents-execution/changelog.md (2026-06-29 "advanced engine tuning") for the full account-settings mechanism.

2026-06-28

Agent Persona Catalog —

merge_agent ("Mac") instructions + definition-of-done rewritten for the local-only automated merge (see agents-execution changelog): the seat resolves merge conflicts in an isolated worktree branched from the default branch and never pushes — dropped the old "pushed to origin" / "fetch, check out the default branch, push" wording. Edited in the canonical apps/api/app/data/builtin_personas.json and regenerated the engine fallback apps/engine/skybolt_engine/personas_data.py via scripts/generate-personas-data.py.

Agent Terminal —

The Card Detail modal's live agent seat terminal now appears (and updates) without closing/reopening the card. Previously a seat spawned while the modal was open never showed until a manual refresh. Why.
The card modal also shows the seat's holder/activity, and it's fed seatTerminals from ProjectSurface's runtime poll. That poll's scope is tab-driven, and the board tab (where cards are opened) used "metadata", which refreshes work/sessions but NOT terminals (includeTerminals is only true for "terminals"/"all"). So seatTerminals went stale while the modal was open and the live seat never arrived. Fix (ProjectSurface.tsx).
While a card detail modal is open (editingCard set), the runtime poll escalates to the "all" scope so it refreshes BOTH terminals and session/work state.
An open also triggers an immediate "terminals" fetch, so the live seat surfaces at once instead of after up to one poll interval. Reverts to the tab-driven scope on close.
Scheduler-launched agent seat terminals (the PTY a Kanban card's agent runs in) no longer appear in the Terminals grid. They are surfaced live, read-only, inside the card itself — the "Agents" tab of the Card Detail modal, under "Activity" — so a card's agent is watched where the work lives.
The Terminals grid now shows only human shells and user-opened Agent Terminals (T7, lease holder user:<id>). The discriminator is the new isSchedulerSeatTerminal predicate (surfaceShared.ts): control_mode == 'agent' with an agent_session_id / agent_lease_session_id that is not a user: lease.
TerminalView gained a readOnly prop: xterm disableStdin, no onData pump, and no broadcast-sink registration, so the card panel streams output but never writes to the PTY. It still reconnects/refreshes its attach token via onAttach.
Switching into the Terminals tab now fetches the terminal list immediately instead of waiting a full PROJECT_REFRESH_MS poll tick, so the grid is never briefly blank on switch-in. Why. Scheduler seats popped in and out of the Terminals grid as cards dispatched and completed — distracting, and the grid only refreshed the list while the Terminals tab was active (so the seats often appeared "all at once" only after the user opened a terminal). Moving them onto the card removes the churn and ties each live agent to the card it's working. Files. apps/web/src/app/project/surface/surfaceShared.ts (isSchedulerSeatTerminal) + surfaceShared.test.ts; apps/web/src/app/project/ProjectSurface.tsx (exclude seats from visibleTerminals, build schedulerSeatTerminals, pass to the modal, immediate terminals fetch on tab activate); apps/web/src/components/TerminalView.tsx (readOnly); apps/web/src/app/project/modals/CardDetailModal.tsx (seatTerminals prop + live read-only terminal in the Agents tab) + CardDetailModal.test.tsx. See agents-execution/changelog.md for the card-lifecycle framing.
The per-terminal "Task" assignment moved off the docked bar under every panel. The card selection box (TerminalTaskSelector) now lives inside a new TerminalTaskModal, opened from a clipboard button in the TerminalView header. The button is accented while a card is assigned.
The task button is wired only for regular terminals (control_mode != "agent"). Agent seat terminals — scheduler seats and T7 agent terminals — no longer show any task option; their work comes from the scheduler / inline chat, and only scheduler-launched seats carry the SKYBOLT_COMPLETE_* env needed to report a completion verdict, so feeding them a card here was misleading. Why. The docked "Task" bar appeared on every terminal, including agent seats that can't usefully act on a manually-assigned card. Consolidating it into a header-button modal declutters the panel and scopes the feature to the terminals it's meant for. Files. apps/web/src/app/project/tabs/terminals/TerminalTaskModal.tsx (new), TerminalTaskModal.test.tsx (new); TerminalView.tsx (new onAssignTask / taskAssigned props + header button + TaskIcon) and TerminalView.test.tsx (button visibility/dispatch tests); TerminalsTabBody.tsx (drop the inline selector, open the modal from the header button for regular terminals only). TerminalTaskSelector.tsx is unchanged and reused inside the modal.
TerminalView now renders through the @xterm/addon-webgl WebGL renderer (new dependency, @xterm/addon-webgl@^0.19.0, the v6-stable release published alongside @xterm/xterm@6.0.0) instead of xterm's default DOM renderer. The addon is loaded right after terminal.open(), inside a try/catch, and its onContextLoss disposes the addon so xterm reverts to the DOM renderer if a GPU context is lost. The addon is also disposed before terminal.dispose() on unmount so contexts are released deterministically across a large grid.
The existing translateZ(0) repaint nudge + refreshTerminal were kept — they are now the safety net for the DOM-renderer fallback state only. Why. The garbled/"random characters until I resize the window" corruption during Claude Code / Codex / agy sessions is the WebView2 (Tauri) compositor not re-rasterizing the DOM-rendered xterm screen tile under heavy streamed output. Every prior fix (settle-fits, DPR refit, surfaceActive refit, and escalating repaint nudges — see 2026-06-27) only chased the symptom. The WebGL renderer paints the whole screen to one canvas that re-rasterizes atomically, eliminating that corruption class outright. Canvas-as-fallback was evaluated and rejected: @xterm/addon-canvas has no v6-compatible release (latest 0.7.0 is the xterm-5 line, peer ^5), so DOM is the only viable fallback on xterm 6 — which also matches xterm.js's current direction (WebGL primary, DOM fallback). Dependency note. Run pnpm install after pulling (lockfile updated). The addon declares no peer constraint, so it installs cleanly against xterm 6.0.0. Tests. Added a vi.mock("@xterm/addon-webgl", …) double (jsdom has no WebGL) exposing onContextLoss/dispose; the existing 47 TerminalView tests pass, including the repaint-nudge test that still exercises the DOM-fallback path.

Agents Execution

Problem. A review seat routes its card only if it POSTs a pass/fail verdict to /complete (via the sh .skybolt/complete helper). On most cards the seat finished WITHOUT POSTing — typically a read-only reviewer that narrated its verdict in prose and ended its turn instead of running the helper — so the card parked as "Review inconclusive — review it manually." Systematic, not transient.
Idle nudge (primary, live seats). scheduler._check_running_sessions: when a review seat hits the idle timeout with no verdict, it is nudged ONCE (REVIEW_VERDICT_NUDGE, pasted via the existing _paste_then_submit) to run the completion helper, its idle clock is reset, and finalize is deferred one more idle window. A seat that ignores the nudge is finalized on the next expiry. Tracked per session in _review_nudged (cleared in forget_session).
Transcript recovery (fallback, dead seats). finalize_agent_session: before parking a verdictless review, _review_verdict_from_transcript scans the seat's scrollback for the verdict JSON it printed but never POSTed. Recovers ONLY an unambiguous single verdict — the prompt's own examples contain both values, so an echoed prompt or a waffling reviewer yields two and we decline (fail closed to the human park). ANSI-stripped; anchored on review_verdict: <value> so prose never matches.
Safety. A recovered verdict routes like a posted one EXCEPT a recovered pass never auto-merges — it always takes the human sign-off gate (_route_card_on_finalize(verdict_from_transcript=...)), since it is lower confidence than an explicit POST. A recovered fail's blocker note carries a provenance line; the outcome records verdict_source="transcript" for audit.
Tests. tests/test_dispatcher.py: pass/fail recovery routes the card (no inconclusive note), ambiguous transcript still parks, idle seat is nudged exactly once then parks, plus unit coverage of _review_verdict_from_transcript (single pass/fail, both-present declines, prose declines, ANSI).
Problem. When the board is idle because the ready cards are waiting on dependencies, the Agents tab showed only the one-line headline ("Agents are idle because the ready cards are waiting on dependencies.") with no way to see *what* must finish first. The blocking card titles existed only embedded in each card's free-text reason, truncated to the first three.
Structured blocker list. services/dispatch_diagnostics.py → _unmet_dependencies now returns {card_id, title, status} per incomplete dependency (was: titles only), and every card entry carries a blocking_dependencies list — populated only for the deps_unmet gate, [] everywhere else. The endpoint passthrough means GET /projects/{id}/dispatch-diagnostics surfaces it with no route change; the reason sentence is unchanged (still the truncated summary).
Renderer. The Agents tab's headline banner now lists, under the message, each blocked ready card and the unfinished dependency cards it needs, each tagged with the blocker's current board status (web, AgentsTabBody). Shown whenever a card is blocked on dependencies; hidden otherwise.
Tests. Engine: test_dispatch_diagnostics.py asserts the structured blocking_dependencies shape for a deps-unmet card and [] for a non-deps card. Web: AgentsTabBody.test.tsx asserts the breakdown lists the blocked card + dependency + status, and is omitted when nothing is dep-blocked.
Root cause cured (not just reported). The deterministic dispatcher takes one cross-process dispatch resource lease per project for the sub-second life of a pass (dispatcher._acquire_lease → run_dispatch → _release_lease). If the engine is hard-killed mid-pass (force-closed desktop app, crash, power loss), the lease is left held and — because _acquire_lease was a bare INSERT that returned lease_held on conflict — silently blocked every future dispatch pass for that project, with no error logged, until the scheduler's TTL reaper happened to release it (up to the 120s lease TTL, and only while the scheduler itself was healthy). This is the "agent-mode cards, no assigned agent, no activity, clean logs" stall.
Steal-on-acquire (_reclaim_expired_lease). Before contending, the dispatcher now reclaims a held dispatch lease whose heartbeat_at+ttl_seconds has elapsed. A normal pass holds the lease sub-second, so a lease older than its TTL is provably orphaned; a within-TTL lease is left alone so a live pass is never stolen from. Dispatch self-heals on the next pass (~poll interval) instead of waiting on the reaper. Logged at WARNING so the recovery is visible.
Boot reclaim (_reclaim_orphaned_dispatch_leases). On the engine's first unlocked loop the dispatcher releases every held dispatch lease across all projects — unconditionally, since no in-process pass can survive a restart. This covers the recent-crash case (lease age < TTL) that steal-on-acquire would otherwise delay by up to the TTL.
Safety. Both paths only ever touch resource_type='dispatch' leases; session/card/worktree/ file-scope leases are untouched. The boot reclaim runs once, before the first dispatch tick, so it cannot race a live pass.
Tests. tests/test_dispatch_lease_reclaim.py — steal-on-acquire reclaims an expired lease, a fresh (within-TTL) lease is preserved (live pass safe), boot reclaim releases a held lease regardless of age, and run_once over a stale lease returns ran (self-healed) instead of lease_held. Full test_dispatcher.py suite still green.
Problem. The scheduler and the deterministic dispatcher decide silently: an eligible-looking card nothing picks up, or a roster of agents all showing "Idle — watching the queue", is indistinguishable from "there is simply no work", with nothing in the logs. Recurring, hard to debug. Confirmed in the field: cards in agent mode with no assigned agent and no activity, agents idle for minutes, and the engine log clean (no errors, no claims — claims aren't logged).
New read-only service services/dispatch_diagnostics.py → explain_project_dispatch(conn, project). Side-effect-free; it replays the real eligibility gates and returns, per actionable card (queue/review/ready_for_merge), the first failing reason — human_owned, deps_unmet (with the blocking card titles), no_authority, no_role_agent, all_role_agents_busy, dispatch_locked, should_dispatch, awaiting_claim, or working — and, per active agent, whether it's busy or idle with a human reason. Mirrors dispatch_ops._oldest_eligible_card / run_dispatch and scheduler._claim_next.
Surfaces the silent dispatch-lease stall. When a card clears every gate yet still has no session, the service inspects the project's cross-process dispatch lease (resource_leases type dispatch) and reports it as dispatch_locked with age/TTL/staleness — the usual cause of a frozen-looking roster (a hard-killed engine never released the lease; it only self-heals when the scheduler's TTL reaper releases it, ≤120s).
Endpoint GET /projects/{project_id}/dispatch-diagnostics (routes/agents.py), read-only, any project member, safe to poll. Returns {cards, agents, dispatch_lease, summary}; summary carries a one-line headline when the board is idle for an explainable reason.
Renderer. The Agents tab now shows each idle agent's real reason in place of the generic "Idle — watching the queue", plus a headline banner when dispatch is stalled (web).
Precision (adversarial review). dispatch_locked is reported only for a *stale* lease, not a freshly-held one (a lease is held sub-second during every normal pass — a held-but-fresh lease just means a pass is in flight, so the card reads should_dispatch); the summary.dispatch_locked flag is likewise stale-gated. Idle-agent reasons are scoped to the agent's own lane (ROLE_LANE) so a Review agent is never told "other agents hold the eligible cards" about queue work only a Handyman could claim. The awaiting_claim message for a queued session now names the persistent scheduler-side gates (scope overlap, model cap, missing repo, seat budget) instead of implying a purely transient wait.
Tests. tests/test_dispatch_diagnostics.py covers human-owned, deps-unmet (with blocker titles), no-authority, no-role-agent, all-workers-busy, working state, the stale-dispatch-lease surfacing, the fresh-held-lease-is-not-locked case, lane-scoped idle attribution, and the endpoint contract.
Follow-up. This is diagnosis only — it does not change dispatch behavior. A separate hardening (clear held dispatch leases on engine boot + steal-on-expiry at acquire) would remove the silent ≤120s stall window entirely rather than just reporting it.
Bug. Moving a card out of an agent-actionable lane (Ready/In Progress/Review/Ready For Merge) back to a backlog lane (Planning / Long Term) left its agent session attached and non-terminal. The card kept its "Queued"/"Working" pill indefinitely (the pill is derived purely from the latest session's status, ignoring the column), and — worse — the orphaned session stranded the work: a queued session pinned the owning agent's single-flight slot (_agent_has_open_session) so that agent pulled no other Ready card, and an in-flight (claimed/running) session kept holding the card/worktree/file-scope leases the scheduler counts in _inflight_card_scopes, blocking scope-overlapping cards from being claimed. Review/merge sessions skip the scope check, so reviewers and mergers kept working while implementation agents sat idle. Only an assignee->human takeover (or landing in Done) cancelled sessions; a plain status move did not.
Route fix (routes/cards.py). update_card now cancels the card's open agent sessions on an actionable -> backlog (planning/long_term) status move, reusing the existing _cancel_card_agent_sessions path (releases leases, kills the seat, clears the pill). New AGENT_ACTIONABLE_LANES / AGENT_BACKLOG_LANES constants gate it; the journal records "Card moved back to a planning lane; active agent session cancelled." When the card returns to Ready the dispatcher seats a fresh session (resetting the attempt count).
Backstop (services/dispatch_ops.py). _reap_orphaned_sessions now also fails queued sessions whose card sits in a backlog lane (_BACKLOG_LANE_STATUSES), not just sessions whose card was deleted. This auto-heals already-stranded cards and covers status changes that bypass the route (e.g. a card move synced in from another device), unblocking the pinned agent on the next pass.
Scope note. This fixes the *stranded-session* stall. How many implementation seats run *simultaneously* once unstuck is still governed separately by file-scope serialization, per-model max_concurrency, and the global concurrency cap.
Tests. test_scheduler.py::test_move_to_planning_cancels_open_agent_session_and_terminal (route fix: seat torn down, leases released, session failed, journal note) and test_dispatcher.py::test_queued_session_on_parked_card_is_reaped (backstop: parked-card queued session reaped, agent re-seats a fresh session on return to Ready).
Supersedes the same-day "Automated merges land on the CHECKED-OUT branch" entry below. The checked-out-branch approach coupled agent behavior to transient UI state; the project's configured default branch is now the single integration branch for both forking and merging, independent of what the user has checked out.
Branch FROM the default (new). Agent worktrees previously branched from HEAD (whatever was checked out in the primary repo), which made the base unpredictable. _agent_worktree_base_ref now resolves the base ref to the configured default branch for normal sessions (a Merge Agent conflict seat still carries its own _merge_base_ref); both _launch and _launch_ssh pass it to add_worktree/add_worktree_ssh. Falls back to HEAD only when no default branch is configured or it isn't a local ref yet.
Merge INTO the default, checkout-independent (new). A new engine-internal git primitive _git_merge_into_branch (git_ops.py, registered in git_exec._GIT_ACTION_HANDLERS as merge_into_branch) lands a source branch on a target branch regardless of what's checked out: an in-place git merge when the target IS checked out (blocked if the working copy has uncommitted tracked changes), otherwise a throwaway managed worktree under .skybolt/worktrees/_merge-* where the merge runs via git -C and the worktree is always removed afterward. On conflict the merge is aborted so the target branch stays clean. _engine_merge_into_default and _land_seat_merge now call it with target_branch=<configured default> and close the source/seat branches afterward.
Security. merge_into_branch is deliberately absent from GIT_OPERATION_ACTIONS, so the user git-operations route and the agent action-tool surface both reject it (422 / denied) — only the engine's automated flow can invoke it via _execute_project_git_operation. It works on Local and SSH Machines through the shared handler map.
UI. The Default branch selector moved out of the Dev Ops settings modal to a prominent card at the top of the Dev Ops panel, with the note "Agents branch new work from this branch and merge it back here — regardless of what you have checked out." (DevOpsTab.tsx, DevOpsSettingsModal.tsx).
Problem. The scheduler's machine-wide cap on concurrently running agent seats (global_concurrency_cap) defaulted to 3. With a realistic roster (e.g. 5 Handymen) that throttled throughput so most agents idled after the first wave, since the cap is GLOBAL across all projects on a machine (shared admission control — over-cap sessions stay queued as backpressure, not failure).
Fix. DEFAULT_GLOBAL_CONCURRENCY_CAP (scheduler.py) is now 20, large enough to let a realistic roster run in parallel.
Per-machine override (new). The cap is now overridable per machine via a new account-level setting: category agents, key max_concurrent, an integer clamped to 1–24 (AGENT_CONCURRENCY_CAP_MIN/AGENT_CONCURRENCY_CAP_MAX). It is edited in the web app's Account Settings → new Agents tab (AccountSettingsModal), stored in the global DB account_settings table, and syncs across the user's machines like other account settings. Resolver: resolve_machine_concurrency_cap (store/settings_store.py); when more than one account is present on a machine it takes the MAX configured value.
Precedence (highest first). SKYBOLT_AGENT_MAX_CONCURRENT env var (unchanged power-user / CI override) → the account-level agents.max_concurrent setting → the built-in default (20).
Live refresh. AgentScheduler.refresh_concurrency_cap() re-reads the cap once per loop — called in _run just before tick_once (NOT in tick_once, which tests drive with a hand-set cap) — so an Account Settings change takes effect within ~1 tick without an engine restart. A transient DB read error leaves the current cap in place.
Schema. AccountSettingsUpdate (schemas.py) gained an optional agents field; the agents category was added to ACCOUNT_SETTING_CATEGORIES (store/settings_store.py).
Note. The cap is distinct from the separate machine PTY budget (SKYBOLT_AGENT_MAX_MACHINE_PTYS, default 24), which independently limits concurrent live PTYs.
Problem. On a full engine/app restart, recover() requeues each interrupted session (up to MAX_ATTEMPTS = 3) and force-removes its worktree, then rebuilds it on the same branch at its last commit. Committed work survives, but uncommitted work is lost and the relaunched seat starts with no memory of what it was doing — a long card re-does everything since its last commit. A naive progress file in .skybolt/ does not help: the seat's worktree (and any file in it) is destroyed on recovery, and .skybolt is in BLOCKED_WORKSPACE_PREFIXES.
Fix (engine, shipped). Step checkpoints make git the per-step checkpoint and hand a relaunched seat exactly where it left off:
- Commit-per-step prompt. scheduler._prompt_text's implement branch now tells the seat to plan the card as ordered steps, do one at a time, commit after each step, and run the checkpoint helper with its plan JSON. It also instructs verify, don't assume — re-read the files a step needs first, because the branch may have advanced (other merges, or a prior attempt).
- .skybolt/checkpoint helper (seat_files.checkpoint_helper_script, installed by install_seat_files_local / install_seat_files_ssh when a checkpoint_url is given) — a twin of .skybolt/complete, POSTing to POST /agent-sessions/{sid}/checkpoint with the same scoped completion token.
- checkpoint_agent_session route (routes/agents.py, body AgentCheckpointRequest) mirrors the normalized plan into agent_sessions.work_metadata_json["step_plan"] (the durable truth, preserved across requeue — no schema migration) and writes a visible local artifact <project_root>/.skybolt/agent-steps/<sid>.json (gitignored via agent-steps/ in git_sync.SKYBOLT_GITIGNORE_LINES). It does not finalize.
- Resume. On relaunch, _prompt_text injects a RESUME block from the saved plan (step_plans.format_step_plan_resume): completed steps with their commit shas, the next unfinished step, and the re-verify instruction.
- Cleanup. finalize_agent_session deletes the session's artifact; recover() runs a boot sweep (_sweep_step_artifacts) that drops artifacts for done/failed/absent sessions and keeps requeued ones. Result: a crash costs at most one step of rework.
- New module services/step_plans.py owns the canonical plan shape, the resume-block formatter, and the artifact lifecycle. SSH targets keep the plan only in the DB mirror (no on-disk artifact).
- Tests: tests/test_step_plans.py (helpers), tests/test_seat_files.py (checkpoint helper install), and tests/test_scheduler.py (endpoint mirror + artifact, finalize cleanup, resume injection, boot sweep). See documents/decisions/adr-0076-agent-step-checkpoints-and-crash-resume.md.
Problem. The automated merge always targeted the configured default branch (git_settings_json.default_branch, falling back to main) and required the primary working copy to be checked out on it. A user working on an integration branch like dev (with main still set as the default) saw every merge skipped — "the repository is on 'dev', not the default branch 'main'" — even though dev was exactly where they wanted the work merged.
Fix. The merge target is now the branch currently checked out in the primary working copy (the branch the Dev-Ops panel shows as current), resolved by _merge_primary_gate. Check out dev and the agent merges into dev; the configured default branch is now only the base agents create feature branches from, not the merge destination. _engine_merge_into_default returns the resolved target so the conflict seat branches from it (_merge_base_ref) and _land_seat_merge lands the resolved merge onto it. The block-to-human gate now trips on a detached HEAD (no branch to merge into) instead of "not on the default branch"; the dirty-tracked-files and missing-source-branch guards are unchanged. Still local-only — never pushes.
Problem. The Merge Agent ("Mac") ran in its own isolated worktree and was told to git checkout <default-branch> there — git refuses to check out a branch already checked out in the primary working copy, so the merge never landed locally and the seat instead ran git push origin <default-branch>, pushing to GitHub. The card's feature branch and worktree were also never deleted after a merge, and the user wanted no auto-push at all.
Fix. A card in ready_for_merge now merges into the local default branch configured in the Dev-Ops panel and never pushes. Two paths, both runner-agnostic (Local + SSH) via _execute_project_git_operation, both off the event loop with no DB transaction held:
- *Clean path (no LLM seat):* scheduler._run_merge_session (top of _launch/_launch_ssh) gates the primary working copy (on the default branch, no uncommitted tracked changes — untracked files like the engine's own .skybolt/ are ignored), then reuses the Dev-Ops "Merge & close" primitive _git_merge_close against the primary repo to merge the source branch in and delete it + its worktree. The card completes with no seat launched.
- *Conflict path (seat resolves, engine lands):* on a conflict the engine aborts (restoring a clean default branch) and seats Mac in a worktree branched from the default branch (add_worktree(base_ref=default_branch)). Mac runs a purely-local git merge <source> + resolve + commit (no fetch/checkout/pull/push); finalize_agent_session then fast-forwards the local default branch onto the seat's resolved branch (_land_seat_merge) and deletes the original source branch + worktree.
- *Block-to-human gate:* if the primary is not on the default branch, has uncommitted tracked changes, the source branch is missing, or the land cannot be applied, the card stays ready_for_merge, flips to assignee_type='human', and gets a blocker note — the working copy is never touched. _route_card_on_finalize takes a merge_landed flag so a seat pass whose land failed routes here instead of completed.
Persona. The merge_agent ("Mac") instructions + definition-of-done were rewritten (canonical apps/api/app/data/builtin_personas.json, regenerated personas_data.py): local-only conflict resolution on a default-based branch, never push.
Problem. A failed review records its reason as a blocker agent_note ("Review failed"), which IS injected into the rework agent's next prompt. But those blockers were only resolved when the card reached completed — a review-fail card returns to queue, so every failed cycle left its blocker active and stacked another. After a few rounds the rework prompt carried several "Review failed" notes, all marked "MUST follow", including ones the agent had already addressed. The agent looked like it was ignoring feedback (it was re-shown old, fixed objections) while the reviewer had moved on to new gaps — the reported "loop where the agent says it's done but nothing the reviewer said is fixed."
Fix (scheduler.py _route_card_on_finalize). When a review session finalizes with a real verdict (pass OR fail), the card's prior active "Review failed" blockers are resolved before the new verdict is applied — a fresh verdict supersedes the previous review's objections. On fail the new specific reason is then added as the single active blocker; on pass all prior review blockers clear. Inconclusive (no verdict) leaves them untouched (nothing was evaluated). The shared REVIEW_FAILED_NOTE_TITLE constant keeps the insert and the supersede-cleanup matched. Resolved notes are tracked for sync and retained as history (status resolved, not deleted).
Problem. When a review seat parked a card as "Review inconclusive", there was no way to tell *why*: the engine logged only to stdout, and the desktop shell's stdout mirror (<data-dir>/logs/skybolt-engine.log) stayed empty because Python block-buffers stdout when it is a pipe (not a TTY), so lines never reached the shell's reader. The only durable record (seat transcript, signal, verdict) lived in the SQLCipher-encrypted project DB.
Fix — durable logging. app._configure_engine_logging now (1) line-buffers stdout/stderr so the desktop mirror fills promptly, and (2) attaches an engine-owned rotating file handler at <data-dir>/logs/engine.log (5 MB × 3, UTF-8/errors=replace, flushed per record) so diagnostics are durable even in dev (no shell) and independent of the IPC pipe. Handlers are attached to BOTH logger trees in use (skybolt.* and the package-qualified skybolt_engine.*, e.g. the agents route) so no module is dropped. SKYBOLT_LOG_LEVEL (e.g. DEBUG) overrides the INFO default for triage.
Fix — seat-completion instrumentation. Added targeted log lines at every point the review-inconclusive incident was opaque: finalize_agent_session logs each session's signal/role/verdict and WARNs when a review seat yields no usable verdict (recording the signal, the raw verdict actually sent, and leftover payload keys — this disambiguates "seat never POSTed" from "seat POSTed a malformed verdict"); the terminal_exited / idle_timeout auto-finalize triggers log when a seat is torn down; the /complete route logs accepted POSTs and WARNs on token-rejected POSTs (a reverse-tunnel POST with a bad/expired token previously 403'd silently and looked identical to no POST at all); and SSH reverse-tunnel port acquisition logs, WARNing on span-exhaustion fallback (a collision there kills the seat on launch via ExitOnForwardFailure, surfacing as a no-verdict finalize — raise SKYBOLT_SSH_REVERSE_PORT_SPAN).
Problem. Seat secrets were delivered only as environment variables — the completion token/URL (SKYBOLT_COMPLETE_TOKEN/SKYBOLT_COMPLETE_URL) and the GitHub credential (GH_TOKEN + GIT_CONFIG_*). Some agent CLIs do not pass the seat process's environment to the shell commands they run (confirmed for Codex, whose app-server brokers command execution), so the agent's own curl/git saw empty vars: review seats couldn't POST a verdict ("Review inconclusive") and couldn't git fetch a private repo ("Repository not found"), even with a valid PAT configured.
Fix. Deliver the same two secrets as FILES in the seat worktree's git-ignored .skybolt/ dir (new services/seat_files.py; SSH mirror in ssh_ops.py):
- .skybolt/complete (mode 0700) — a POSIX-sh helper with the URL + scoped token embedded; the agent runs sh .skybolt/complete <body-file> (or pipes JSON on stdin) to POST its completion. No env needed.
- .skybolt/git-credentials (mode 0600) — the project PAT in git-credential format, fed to git by a github.com-scoped credential helper installed into the repo config. The helper line is identical for every worktree and contains NO token — it dynamically cats the requesting worktree's own .skybolt/git-credentials (git rev-parse --show-toplevel).
- Installed at every launch site (local + SSH, worktree and no-worktree branches) before the seat starts; the completion URL is now built by one _completion_url helper shared with _seat_environment. The 4 completion prompt blocks (implementer / report-only / review / merge) now instruct the helper instead of a raw curl $SKYBOLT_COMPLETE_URL, and COMPLETION_SHELL_HINT is no longer appended.
Cleanup. finalize_agent_session removes the two files (SSH + local) on if worktree_path, independently of — and in addition to — worktree removal, so a no-worktree/report-only seat (whose worktree_path is the main repo, never torn down) doesn't leak the secrets. Best-effort.
Safety. Secrets are kept out of git via .git/info/exclude (resolved with git rev-parse --git-path info/exclude so worktrees update the shared exclude) — NEVER .skybolt/.gitignore, whose contents keep the encrypted DB committable. The PAT never lands in .git/config (only the token-free helper) or in any logged command. Env-var delivery is kept in parallel (belt-and-suspenders for CLIs that do propagate env).
Tests. tests/test_seat_files.py (builders, single-quote/newline guards, local install/cleanup against a real git init repo incl. exclude + mode + gitignore-untouched assertions); test_scheduler.py SSH-launch-installs-handoff and local-install-then-cleanup-on- finalize tests. Affected + integration suites: 267 passed; full engine suite: 1136 passed.
Problem. A seat could linger indefinitely, holding a worktree and a remote tmux session. The teardown paths all have blind spots: /complete never fires if the agent CLI can't read the completion env (e.g. Codex doesn't pass the seat env to its command shell), terminal_exited never fires for a tmux-backed SSH seat (tmux keeps the remote process alive after the ssh channel detaches), and idle_timeout watches *local* PTY output which a detached seat no longer produces. The SSH orphan reaper deliberately keeps any tmux session whose terminal_sessions row still exists (so it never kills a peer's live session on a shared box), so a never-finalized seat's tmux survives. Observed: seats from the previous day still running, detached.
Fix. New _reap_overlong_seats runs in the maintenance pass next to _reap_stale_leases. For THIS machine's owned sessions in an active lease state, once age since claimed_at exceeds seat_max_runtime_seconds it tears the seat down exactly like the TTL reaper — terminate the local PTY, _kill_seat_tmux_session (kills the remote tmux, not just detach), _cleanup_worktree, then _release_session (requeue, or fail-to-review past the attempt limit). Scoped to owned machines so it can never touch a peer's live seat. It is a backstop independent of heartbeat freshness — the heartbeat is bumped by the engine every tick for any running row, so it is not a liveness signal for the seat itself.
Config. SKYBOLT_SEAT_MAX_RUNTIME_SECONDS (default 7200 = 2h; 0 disables). Generous by design: the 30-min idle_timeout handles genuinely-idle seats; this only catches seats that are both long-running AND evading idle/exit detection (i.e. detached/stuck).
Tests. test_reap_overlong_seats_tears_down_stale_seat_with_fresh_heartbeat (trips with a FRESH heartbeat + old claim, proving independence from the TTL reaper; asserts the remote tmux kill, lease release, requeue, and card returned to queue) and test_reap_overlong_seats_disabled_when_cap_is_zero.
Follow-up. This bounds the damage but doesn't fix the cause for Codex seats — the real fix is file-based completion/credential delivery (see the SSH GitHub-token entry below and the Codex env discussion) so seats complete normally instead of being reaped.
Bug. Agents running on an SSH Machine authenticated none of their own git: git fetch/git push against a private GitHub repo failed with "Repository not found" even when the project's GitHub PAT was correctly configured and valid. Most visibly, review seats couldn't fetch the branch under review, so they reviewed the wrong/empty local state, withheld a pass/fail, and the card parked as "Review inconclusive."
Root cause. The GitHub credential (github_seat_env → GH_TOKEN + a github.com credential helper) is merged in _seat_launch_env, but only the local seat launch used that builder. Both SSH launch sites (scheduler.py, the no-worktree and worktree branches of _launch_ssh) called _seat_environment directly, which carries only the completion vars — silently dropping the GitHub token (and the per-kind _seat_env_overrides, e.g. GLM's base URL). The github_credentials docstring claimed it "works identically for local and SSH seats," but the SSH path was never wired to it.
Fix. _seat_launch_env now accepts complete_url_port (the SSH reverse-tunnel port, the only reason the SSH sites bypassed it) and threads it to _seat_environment. Both SSH launch sites now build their env through _seat_launch_env, so local and SSH seats get an identical env: completion vars + per-kind overrides + GitHub credential.
Tests. test_ssh_seat_launch_injects_github_credential (test_scheduler.py) drives a full SSH claim+launch with a project token set and asserts the remote seat command exports GH_TOKEN + GIT_CONFIG_COUNT (it would not on the old _seat_environment path). Existing _seat_launch_env / SSH-argv / local-claim tests still pass.
Note. A user-set PAT is the whole contract — no per-machine git setup is needed. The interactive SSH login is a separate session that never sees the seat env, so testing with git ls-remote origin from your own shell does not exercise the injected credential.
What. The seat terminal a card's agent runs in is shown live and read-only on the card — the Card Detail modal's "Agents" tab, under "Activity" — instead of in the Terminals grid. Open a card to watch its agent; scheduler seats no longer clutter the Terminals page or pop in/out as cards dispatch and complete.
How. A card resolves its seat from the project's seat-terminal list by matching terminal.agent_session_id (or the session's engine_terminal_id) to one of the card's agent_sessions, preferring a live one, then attaches read-only via the normal GET .../terminals/{id}/attach. No engine change: the linkage (agent_sessions.card_id → engine_terminal_id → the control_mode='agent' seat row) and the attach route already existed; this is a renderer surfacing change.
Card status is unchanged. A card still moves to In Progress only when the scheduler *claims* the session (atomic with the seat going working and the worktree being created — scheduler.py:_claim_next). A card sitting in Ready with an assigned agent has a queued session (no worktree yet); that is correct, not a stuck board.
Files (renderer). surfaceShared.ts (isSchedulerSeatTerminal), ProjectSurface.tsx, TerminalView.tsx (readOnly), CardDetailModal.tsx. See agent-terminal/changelog.md for the full file list and the grid-filtering details.
Bug. Scheduler-seated review (and merge/report) agents finished but never posted a verdict, leaving cards stuck in review. The seat reported SKYBOLT_COMPLETE_URL and SKYBOLT_COMPLETE_TOKEN "missing from this shell environment."
Root cause. The vars are NOT missing — the scheduler injects them into the seat PTY environment (confirmed for the local Windows ConPTY/pywinpty spawn and the SSH export path). The completion prompt only showed POSIX $NAME syntax, which expands to empty in PowerShell ($env:NAME) or cmd (%NAME%), so a seat checking with the wrong shell read an empty string and gave up.
Fix. COMPLETION_SHELL_HINT (scheduler.py) is appended to the review/merge/report/ generic completion instructions: it names both env vars, gives per-shell syntax ($NAME / $env:NAME / %NAME%), and tells the seat that an empty-looking value means wrong-shell syntax, not a missing variable — never report it missing or skip the POST.
Tests. test_completion_prompts_spell_out_cross_shell_env_syntax asserts each static completion prompt names both vars and includes the PowerShell + cmd syntaxes.
Note. Only scheduler-launched seats receive the completion token; terminals given a card via the task feed (or user-opened agent terminals) cannot post a verdict by design.
Bug. A card's branch is derived from its title and reused across attempts. If that branch was already checked out in a worktree the engine must not touch — the main repo checkout (fatal: 'feature/…' is already used by worktree at '/workspace/repos/…'), or a worktree the user set up — git worktree add failed on every attempt, so the card requeued 3× and failed permanently with no recourse. The SSH path had no reclaim logic at all; the local path reclaims only stale .skybolt-managed worktrees (and correctly refuses to nuke an unmanaged one).
Fix. When the branch is held by a worktree we won't reclaim, add_worktree/add_worktree_ssh now branch the work onto a fresh, session-stable fallback (<branch>-<session8>) and proceed — never touching the held branch/worktree. The actual branch is returned and the scheduler honors it for the seat prompt, the commit, and the merge; a requeue of the same session resumes the same fallback (attach-if-exists). The SSH path also gained the managed-sibling reclaim the local path already had (a stale per-session worktree under the same root is force-removed and retried).
Tests. test_add_worktree_falls_back_when_branch_held_by_unmanaged_worktree (replaces the old "fails cleanly" assertion) and test_add_worktree_fallback_branch_is_stable_per_session.
Note. This makes a stuck branch non-fatal, but the root condition — the remote's main repo checkout sitting on a feature branch — is worth clearing (git -C <repo> switch <default>): the worktree model assumes the main checkout stays on the default branch so feature branches stay free.
Bug. Every SSH agent seat opened its /complete reverse tunnel with ssh -R {engine_port}:127.0.0.1:{engine_port} — the engine's single fixed port, identical for all seats — under ExitOnForwardFailure=yes. So the moment two seats ran on one remote machine (two Hanks, or a Hank + the reviewer — i.e. any real concurrency), the second's -R bind failed ("remote port forwarding failed for listen port N") and ssh exited *before the CLI started* → a ~6s terminal_exited death. The premature-exit guard surfaced it on the card as "Agent seat failed to start … remote port forwarding failed". It blocked running multiple agents concurrently on one SSH machine.
Fix. Each SSH seat now gets its OWN remote reverse-tunnel listen port, unique among the active SSH seats (AgentScheduler._acquire_reverse_port, released in forget_session), forwarding back to the engine's single HTTP port: -R {unique_listen}:127.0.0.1:{engine_port}. The seat's SKYBOLT_COMPLETE_URL rides that listen port (_seat_environment(complete_url_port=…)). Range is configurable (SKYBOLT_SSH_REVERSE_PORT_BASE/_SPAN, default 61000–61255, above the Linux ephemeral range so binds rarely race a remote outbound connection). Reattach re-spawns the stored argv verbatim, so the port is preserved on reconnect, and forget_session only releases on real teardown.
Tests. test_seat_argv_reverse_tunnel_listen_and_target_ports_differ (ssh_persistence), test_acquire_reverse_port_is_unique_and_released, test_seat_environment_complete_url_uses_tunnel_port_for_ssh; updated the SSH claim/launch argv test to assert the new <unique listen>:127.0.0.1:<engine port> shape.
Problem. Agent seats launched with only SKYBOLT_* env vars and no git credentials, and every git path sets GIT_TERMINAL_PROMPT=0, so an agent's own git fetch origin --prune over HTTPS failed immediately ("requires a password") — on whichever machine the seat runs (local or the SSH remote).
Feature. A project can now store a GitHub token, managed by project-admins on the DevOps settings page. services/github_credentials.py stores it in the OS keychain (runtime resolver) and mirrors it to the encrypted secret_vault for zero-knowledge sync. _seat_launch_env resolves it fresh at launch (never persisted in the session row) and injects GH_TOKEN + a GIT_CONFIG_* credential helper scoped to https://github.com, so the agent's git fetch/git push authenticate non-interactively. One builder covers both local and SSH seats (SSH already exports the env on the remote; the keys pass _is_safe_env_key).
API. GET/PUT/DELETE /projects/{id}/github-credential (admin for writes); the token is write-only — endpoints only ever return {is_set}. UI: a self-contained GithubCredentialPanel in the DevOps settings modal (masked input, set/unset status, Clear, and a note recommending a fine-grained repo-scoped PAT — agents run with full permissions).
Security. Token never persisted in agent_sessions, never returned, never logged (added a github_pat_ redaction pattern for fine-grained tokens; classic ghp_/gho_… were already covered). secret_vault.delete_secret is now wired (via clear_github_token) and tombstones the removal.
Tests. test_github_credentials.py (seat-env content, keychain round-trip, SSH-allowlist keys, redaction); test_scheduler.py seat-env injection (present with a token, absent without). Frontend typecheck + lint pass.
Limitations / follow-ups. HTTPS + github.com only (SSH-key remotes and GitHub Enterprise hosts are follow-ups); the credential is project-scoped.
Bug. A work-producing seat that exited almost immediately (the CLI failed to launch — missing/unauthenticated on the target, or it refused its launch flag and exited; common on remote SSH seats) was finalized via the terminal_exited fallback and its card marked completed with an empty diff. A launch failure looked like finished work, and review/merge seats that died the same way defaulted to "no verdict" — making a broken seat indistinguishable from a real completion.
Fix. finalize_agent_session now guards the terminal_exited path: a non-review/non-merge, work-producing seat that exits within SKYBOLT_SEAT_PREMATURE_EXIT_SECONDS (default 60s) having produced no file changes is treated as a failed launch — requeued while attempts remain (_finalize_premature_exit, mirroring _release_session), else failed with the card escalated to review. The seat's last output is captured in the journaled/coordination reason so the operator can see *why* it died. The maintenance loop buckets these as requeued, not completed. The time bound is the safety margin: model latency means a genuine agent cannot do and commit work this fast, and real completions arrive via explicit /complete after minutes. 0 disables the guard.
UI surfacing. When retries are exhausted and the card escalates to review, the failure reason (including the seat's captured output) is written as a blocker agent-note. The board already renders blocker notes as a warning box on the card face (ProjectBoardCard / cardFaceNotes), so a dead-on-arrival seat is now obvious at a glance instead of buried in the session journal — no new frontend needed.
Tests. test_premature_seat_exit_requeues_instead_of_completing (requeue path) and test_premature_seat_exit_at_attempt_limit_blocks_card_for_review (fail → review + blocker note); test_terminal_exit_is_detected_as_claimed_completion now disables the guard to isolate the raw terminal_exited completion path; _finalize_premature_exit added to the sync-delete-coverage allowlist (agent-control seat terminals are raw-inserted and never synced, so deleting one needs no tombstone — same as its sibling finalize/release functions).

Cross-Machine Sync

Goal. Startup should sync only ACTIVE projects. An archived project should travel just enough to appear in the list and be unarchived (its name), and its working data should stay local until the user unarchives it — at which point it syncs instantly.
Fix — services/sync_engine.py::reconcile. Both project enumerations are now gated on project_registry.status != 'archived': the per-project push/pull/backfill loop, and the local_projects list used to backfill project-scoped GLOBAL-db entities (model_providers, catalog_models, model_routes, seat_profiles). Archived projects therefore push/pull NOTHING from their per-project scope. Their account-scope project_registry record (name + status) still syncs via the unchanged account backfill, so they remain listed and unarchivable.
Unarchive is instant and works on any machine. _materialize_local_projects is left unchanged, so an archived project pulled onto a fresh machine still gets an empty per-project DB + skybolt_path and can be opened/unarchived there (the projects row self-heals from the registry via _ensure_project_settings_row on first open). Unarchiving (PATCH /projects/{id} with status: 'active') flips the registry status and nudges request_sync; the next pass re-enumerates and — because the account pull applies the 'active' status before the loop — backfills + pushes + pulls the project's full contents in one pass.
Scope note. Pull and per-project-DB push are fully gated (the loop is skipped). The shared GLOBAL-db outbox drain (push_db(g)) could still upload an archived project's project-scoped GLOBAL row if one were enqueued by a live edit, but archived projects are not editable through the UI and backfill no longer enqueues them, so this is not reachable in normal use.
Tests. tests/test_sync_engine.py::test_reconcile_skips_archived_project_contents_and_syncs_on_unarchive drives reconcile end-to-end through the FakeCloud: an archived project's projects/cards records never reach the cloud while its project_registry record does; flipping status to 'active' and re-running the pass uploads its contents.

Git Operation Surface —

Added _git_merge_into_branch (git_ops.py), registered in git_exec._GIT_ACTION_HANDLERS as merge_into_branch, so it runs runner-agnostically on Local and SSH Machines. It merges branch=<source> into target_branch=<target> and lands the result on the target branch no matter what the working copy has checked out: an in-place git merge when the target is the branch checked out in the primary (blocked if it has uncommitted tracked changes), otherwise a throwaway managed worktree under .skybolt/worktrees/_merge-* where the merge runs via git -C and the worktree is always removed afterward. On conflict the merge is aborted so the target branch stays clean, and the conflicted paths ride back as state="merge_conflict". Never pushes; the source branch is closed by the caller, not here.
Engine-only by design. merge_into_branch is deliberately absent from GIT_OPERATION_ACTIONS, so the user route (routes/git.py) and the agent action-tool surface (agent_action_tools.py) both reject it (422 / denied). Only the automated merge flow invokes it via _execute_project_git_operation. It powers scheduler._engine_merge_into_default and _land_seat_merge, which now merge into the configured default branch instead of requiring it to be checked out. See agents-execution/changelog.md for the end-to-end behavior change.
Fixed the card-detail Changes tab showing far more than the card's own work — it spilled mainline / other cards' commits instead of just what this card's worktree changed. The card_diff (and sibling compare_branch) op runs git diff base...head; on a local execution target the base was left unset, so _git_card_diff fell back to _git_default_branch (a git symbolic-ref refs/remotes/origin/HEAD probe). On the many local repos with no origin/HEAD set that probe degrades to *the branch the main repo happens to be checked out on* — and if that branch diverged before the card's fork point, the three-dot merge-base reaches back too far and the diff balloons.
The local branch of _execute_project_git_operation now seeds base = _configured_default_branch(_project_git_settings(project)) for both card_diff and compare_branch, mirroring what the SSH branch already did. This is the same DevOps-configured default branch worktrees fork from and merge back into, so the diff now isolates exactly the card's commits. When no default branch is configured the base stays unset and falls back to the probe as before (no regression).
Regression test: test_card_diff_uses_configured_default_branch_as_base puts the main repo on a stale branch that diverged before the fork point and asserts the diff base is the configured main and the patch carries only the card's file.
Fixed the Dev Ops merge-conflict resolver re-popping on every tab re-entry after a conflict had already been resolved and pushed. The resolver's liveness check (devOpsValues.mergeConflictView) previously only cleared a stale merge/merge_close op (whose result_metadata.state == "merge_conflict" lingers in the local, non-synced git_operations log) when a NEWER conflicts probe reported the merge over. But conflicts only runs as a follow-up to an in-app merge_continue/merge_abort — it is NOT part of the Dev Ops readiness sweep — so a merge completed any other way (terminal, external git) left the stale op driving the resolver open forever.
git_ops._git_status_metadata now reports merge_in_progress (a git rev-parse -q --verify MERGE_HEAD check), so the recurring local_changes op — and the read-only GET /git-status poll — carry the authoritative "merge still in progress?" bit. GitStatusResponse gained the matching merge_in_progress: bool field. mergeConflictView now also clears a stale merge-conflict op when the latest local_changes op (which the readiness sweep re-runs on every Dev Ops entry/focus) is at least as recent as the merge op and reports merge_in_progress: false (strict === false, so an older probe predating the field can't falsely dismiss a live conflict).
The resolver's auto-open dedup key (DevOpsTab) is now persisted per project (devops:merge-auto-opened:<projectId>), mirroring the operation-log auto-open, instead of an in-memory ref that reset on remount — so re-entering the tab can't re-pop or flash-open the resolver for an op that already opened.

2026-06-27

Agent Terminal —

Strengthened the xterm GPU repaint nudge in TerminalView so it now also toggles an imperceptible translateZ(0) on the terminal host container, not just the .xterm-screen child. WebView2 caches the stale glyph raster on the composited host tile; under the planning console's deep static nesting the child-only toggle was a no-op, so streamed TUI output rendered as "random letters all over." The host nudge re-rasterizes the actual cached tile (the automatic equivalent of the manual window resize). Cleared on unmount.
The normal terminal-open path now pre-trusts the Antigravity (agy) workspace via a shared prepare_seat_workspace_trust helper (extracted from the scheduler), for local targets only. Previously only scheduler-launched seats got this, so a planning-console agy terminal would block on agy's interactive "trust this directory?" prompt. Why. The planning console reuses the shared TerminalView, but the existing nudge fixed only the shallow grid layout; and agy's directory-trust gate was never applied to directly-opened terminals. Tests. Extended the TerminalView nudge test to assert the host container is nudged alongside the screen layer; the engine seat-trust suite already covers ensure_antigravity_workspace_trusted, now reused by the shared helper.

Agents Execution

Critical engine repair. Five Python-2-style except A, B: clauses (in routes/terminals.py, routes/chat.py, routes/browser.py, and services/model_client.py) made the entire engine fail to import on Python 3, so any rebuilt sidecar was dead on launch. Parenthesized all of them (except (A, B):); the full engine suite now collects and passes again.
Manual "Send prompt to console" in the AI Planning Session CLI console. Automatic seat-readiness detection can't see past a seat's unbounded interactive sign-in (Antigravity especially), so the planning prompt could be auto-sent before the CLI could accept it and be lost. The console now shows a Send prompt to console button: the user watches the live console and sends (or re-sends) the exact prompt once the CLI is ready, bypassing the readiness gate (the user is the signal) via a new skipReadinessWait option threaded through runPlanningSessionAiTurn → sendPlanningPromptToTerminal. Clicking it bumps the modal's session generation so it supersedes any in-flight (stuck-polling) turn.
Antigravity readiness heuristic hardened (planningTerminalOutputLooksReady): it no longer treats a bare > / › prompt char or the word "workspace" as ready (those appear on the sign-in and workspace-trust screens); readiness now requires a real post-login marker. Engine-side seat-ready pattern/settle changes are deferred until a real agy PTY transcript is captured.
Runtime actions now consistently require _project_can_run across terminal, command, Git, hosted chat, browser, startup-command, manual agent-session, prompt-package, and agent-terminal paths. Viewers retain read-only list/transcript access but receive 403 "Project run access required" before launches, writes, leases, model runs, or teardown actions.
Git runtime checks now use the canonical agent_operator role through _project_can_run instead of the legacy hard-coded operator role.
Terminal, agent-terminal, and hosted-chat attach responses use short-lived scoped ws_token query tokens in WebSocket URLs; the global engine session token is no longer returned in those URLs, and attach tokens are hashed in memory.
Local-CLI planning now calls POST /projects/{project_id}/terminals/{terminal_id}/seat-ready before sending the planning prompt. The engine watches the actual PTY for a known ready pattern or an output-settle window, matching the scheduler's safer seat handoff path.
Antigravity planning uses a more conservative readiness policy: 90s ceiling, 5s quiet window, and no instant-ready fallback. If readiness never confirms, the workspace surfaces an actionable error instead of pasting into a startup, sign-in, or trust prompt.
seat_ready_pattern now resolves commands with arguments such as codex --no-alt-screen and claude --model opus, so readiness detection does not fall back to the generic pattern just because the launched command has flags.
Replaced the Project Board's separate AI Planning modal entry with a board-integrated Planning Workspace. The board toggle switches between Kanban columns and the workspace; planning sessions remain chat_threads.kind = "planning" with planning_state snapshots.
Planning model selection still reuses ChatChoice, but now surfaces every active supported CLI seat alongside enabled API providers and catalog models. Supported local planning seats are codex, claude_code, antigravity, glm, kimi, and mmx.
Added planning choice presentation metadata: API/seat kind, repo/tool capability, readiness, and light cost hints. The workspace persists default_planning_choice separately from the normal chat default.
Extended staged planning cards with key, card_type, work_mode, estimate, file_scope, priority, and depends_on. The details editor can adjust these before Add Selected/Add All.
Materialization now creates typed Planning cards in generated order, preserves file scope and priority, then posts ordering dependency edges after resolving depends_on keys.
Engine card create/update accepts file_scope and priority. New dependency API: GET /projects/{project_id}/card-dependencies, POST /projects/{project_id}/card-dependencies, and DELETE /projects/{project_id}/card-dependencies/{dependency_id} with self-edge, duplicate, cross-project, missing-card, and cycle rejection.
Machine probes and terminal launch mapping now expose readiness for newer local seat kinds: glm, kimi, and mmx.
Tests updated across engine and renderer: card/dependency route tests, SSH probe tests, planning parser tests, choice presentation tests, staged-card editor tests, terminal/chat choice tests, and the planning-focused SkyboltApp flow tests.
The older planning-drafts client helpers remain as documented drift. The board workspace does not call those routes.

Browser Redesign

Added an Open in browser button to the active-session header in DesktopChromiumBrowserView that opens the current session URL in the user's system default browser. The Tauri webview blocks window.open, so on desktop it routes through a new engine-shell command open_external_url (main.rs) that validates the URL (parse_external_url: http/https only, no embedded credentials) and calls app.shell().open(...). The Rust call bypasses the JS ACL, so no capability/permission JSON change was needed; the renderer falls back to window.open in a plain browser (dev) build. Shell::open is deprecated upstream, so the command carries #[allow(deprecated)] to keep CI's cargo clippy -- -D warnings green. Covered by a Rust external_url_policy unit test.

Cloud Control Plane —

Removed the legacy runner/PWA control-plane route module and schemas from the mounted Cloud API surface. OpenAPI regression coverage now asserts retired blob sync routes and /runners paths are absent.
Added Alembic migration 0033_drop_legacy_runners.py, which drops runner FKs/columns plus the legacy runners, runner_enrollments, runner_pairing_sessions, and project_environments tables on upgrade and recreates the schema shape on downgrade.
Hardened /api/v1/sync/push payload validation: live records require ciphertext and tombstones must carry ciphertext=null.

Cloud Deployment

Added the CI security baseline to .github/workflows/ci.yml: CodeQL SAST, a stdlib secret scan, pnpm audit, pip-audit over the Python apps, and a static production container-policy scan. The container scan checks deploy/production/docker-compose.prod.yml, apps/api/Dockerfile, and apps/site/Dockerfile for privileged/host/raw-socket/device/socket/writeable-host-bind cases without starting Docker or containers. The repo-level scripts/check.ps1 and scripts/check.sh mirror the new gates.

Cross-Machine Sync

Cloud sync push validation now rejects live records without ciphertext and tombstones that include ciphertext. The server keeps the zero-knowledge contract explicit at the Pydantic schema boundary before route logic writes any rows.

Dashboard

Added the engine GET /accounts/{account_id}/dashboard/stats aggregate and wired the renderer dashboard to prefer it while preserving deterministic mock stats as an unavailable-endpoint fallback.
Updated tests to cover supplied DashboardStats rendering and the engine aggregate contract.

Execution Engine

Project-bound cwd validation. Command runs, startup command working directories, and terminal profile cwd settings now reject absolute paths and escaping relatives. Local targets resolve through canonical project-root checks; SSH targets use strict POSIX relative normalization. Runtime launch paths repeat the check for legacy rows and fail closed without opening a shell outside the selected project.
Scoped terminal/chat WebSocket attach URLs. Terminal, agent-terminal, and hosted-chat attach responses now return short-lived path-scoped ws_token query tokens instead of embedding the global engine session token in direct_wss_url. Terminal and chat attach/URL tokens are stored hashed in memory and reject expired or path-mismatched presentation.
Project-scoped Docker broker backend. Added routes/docker.py and services/docker_broker.py with mediated Local Machine Docker/Compose APIs: Compose validate/up/down under the project root, stable isolated Compose project names (skybolt-<project-hash>), generated local-only override files with Skybolt labels and NET_RAW dropped, project-filtered container listing, project-owned container exec, and container log capture. Raw Docker/Compose stdout/stderr and logs are written only to machine-local docker-output files beside the engine database; API responses return metadata only. Policy blocks privileged mode, host network/PID/IPC/user namespace, explicit high-risk capabilities, unconfined security profiles, host device mounts, Docker socket mounts, external networks/volumes, container_name, and outside-project bind mounts. SSH Docker remains blocked until remote policy/log custody is implemented. Tests: tests/test_docker_broker.py fakes the Docker CLI so no daemon, container, or Compose stack is started.

Git Operation Surface —

Added worktree_prepare to the accepted Git operation actions. It prepares a per-card worktree and branch, records branch_name and worktree_path on the card after success, and does not launch an agent session.
Manual prepare is non-destructive: existing occupied worktree paths return blocked instead of being force-removed.

Public Site

Wired the public waitlist form to the new POST /api/waitlist endpoint, with local email validation, duplicate-aware success copy, rate-limit handling, and service-unreachable copy.
Public-site CI is now covered by the repo security baseline: CodeQL scans the JS/TS surface, pnpm audit --prod --audit-level high audits workspace production dependencies, and scripts/security-container-scan.py checks apps/site/Dockerfile for dangerous container patterns without starting Docker.
Upgraded the public-site Astro dependency from 5.18.2 to 6.4.6 to clear the high-severity Astro advisories surfaced by the new audit gate. This raises the effective Node floor to 22.12+.

2026-06-26

Agent Terminal —

settle-fits ([120,360,800] ms), so the last forced fit — and the cols baked into the auth handshake — land on a transient width. With no OS window resize, none of TerminalView's existing triggers re-fire, so the PTY size stays stale. This is the exact follow-up the 2026-06-25 ADR-0074 entry below flagged ("wire surfaceActive if the same stale-geometry symptom shows up [on the inline consoles]"). Fix. Reuse the ADR-0074 surfaceActive false→true force-refit. ProjectSurface drives the planning console's surfaceActive from a new planningConsoleActive state: false when no terminal is bound, and flipped to true two rAFs after the console binds/rebinds (mount → post-layout), once the grid track has resolved. That transition fires TerminalView's existing activation effect, which force-refits and re-sends the true cols/rows (SIGWINCH → reflow). Regular-grid and popout terminals are untouched; Modal size tokens unchanged. Renderer-only. Tests. TerminalView.test.tsx adds a regression asserting a surfaceActive false→true transition re-sends a resize control frame. tsc/eslint clean; TerminalView.test.tsx (43) + planning SkyboltApp.test.tsx (8) pass.
format_notes_prompt_section (the scheduler's security-reviewed note path). _terminal_out now surfaces assigned_card_id, and it is a portable terminal_sessions metadata key in sync_serialize.py, so the assignment syncs across machines. New schema TerminalAssignCardRequest. Renderer. useTerminalTaskFeed owns the assignment action + the auto-feed effect (in-memory note-delta detection, per-terminal fed-note tracking, silent seed on first sighting so a reload doesn't re-brief). TerminalTaskSelector is rendered per panel in TerminalsTabBody; ProjectSurface wires cards + agentNotes + queueTerminalInput. Client fns assignTerminalCard / getTerminalAgentFeed in api.ts; TerminalSession.assigned_card_id added. Tests. apps/engine/tests/test_terminal_task_feed.py (assign set/clear/404, agent-feed empty / full-brief / note-delta / all-fed); TerminalTaskSelector.test.tsx + useTerminalTaskFeed.test.tsx. Note. Feeding is automatic, so a note can paste mid-turn; a per-terminal pause toggle is a possible follow-up.

Agents Execution

Real-world test surfaced a gap: a CLI seat did write its artifact file, but the model emitted slightly-invalid JSON (an unescaped nested {"status":"ok"} inside a string), so parsePlanningSessionPlanArtifact rejected it and the poll waited silently — the user saw "nothing happened".
Prompt hardened (planningSessionAiPrompt, CLI branch): the seat must write strictly valid JSON — escape inner double-quotes (\"), no trailing commas/comments, no unescaped newlines, escape quotes in embedded example JSON — and re-read the file to confirm it parses before finishing.
Feedback instead of silence: waitForPlanningPlanArtifact now distinguishes a *changing* file (mid-write → keep waiting) from a *stable* unparseable file (seat finished with bad JSON) and sets planningArtifactNotice, surfaced as a warning banner above the Mission Room Console (AIPlanningSessionModal artifactNotice prop). It keeps polling, so once the seat is asked to rewrite valid JSON the plan loads automatically; the notice clears on success / new turn / close.
Problem: in an AI Planning Session a hosted model streams its reply and the renderer parses a FINAL_PLANNING_JSON: block into draft cards. But for local-CLI seats (Claude Code / Codex) the flow was fire-and-forget — the renderer typed /plan + the planning prompt into the live terminal and returned a static "Continue in the live console" string, which is what got parsed — so the CLI's plan was never ingested into cards. User symptom: "claude gave the FINAL_PLANNING_JSON output but it isn't getting picked up."
Why the obvious fixes fail: the terminal onOutput stream is raw PTY bytes (ANSI/TUI redraws, hard wraps) so scraping a multi-KB JSON is unreliable, and the capture buffer is capped at the last 12 KB. A .skybolt/ handshake file can't be read via the workspace file routes because .skybolt is in BLOCKED_WORKSPACE_PREFIXES (apps/engine/skybolt_engine/constants.py).
Decision (ADR-0075, "fully hands-off, any seat"): drop the forced CLI /plan mode for planning and have the seat write its final plan JSON to a deterministic artifact the renderer polls and auto-ingests. Tradeoff: the interview is now read-only by prompt instruction (the only intended write is the one plan-artifact file), not hard-enforced by the CLI's plan mode; the modal's Add Selected / Add All stays the human review gate.
New flow: the renderer types and submits the planning prompt (no /plan); the prompt tells the seat to write the plan JSON to .skybolt/planning-sessions/<thread_id>.json (and still print FINAL_PLANNING_JSON: for the human). The renderer polls a new engine route every 4s (up to a 20-min timeout) and auto-ingests through the same normalizePlanningSessionResult path the hosted seats use; it clears any stale artifact before sending and deletes the artifact after a successful ingest. The hosted planning path is unchanged.
Engine: new GET / DELETE /projects/{project_id}/planning-sessions/{thread_id}/plan (routes/workspace.py) — strict id validation (^[A-Za-z0-9_-]{1,128}$), reads/deletes only .skybolt/planning-sessions/<thread_id>.json under the project source root (Local + SSH) with a _path_inside defense-in-depth check. GET → 200 {content, modified_at}, 404 "Plan not ready" until written, 413 over MAX_WORKSPACE_RAW_BYTES, 422 invalid id; DELETE → {deleted: bool}, idempotent. New PlanningPlanArtifactOut model (schemas.py); SSH helpers _ssh_read_infra_file / _ssh_delete_infra_file (services/ssh_workspace.py) operate on the pre-validated controlled path, deliberately bypassing the workspace .skybolt blocklist for this one path.
Renderer: planningSessionPlanArtifactRelPath / parsePlanningSessionPlanArtifact and the planArtifactPath prompt param (modals/planningSessionModel.ts); getPlanningSessionPlanArtifact / deletePlanningSessionPlanArtifact (api.ts); sendPlanningPromptToTerminal (renamed from sendPlanningPlanModeToTerminal), waitForPlanningPlanArtifact with a planningPlanPollGenRef supersede guard, and the rewired runPlanningSessionLocalCliTurn (clear-stale → send → poll → ingest → cleanup) in ProjectSurface.tsx; the now-dead waitForPlanningTerminalText and planningTerminalPlanModeInput import removed; AIPlanningSessionModal.tsx status badge shows "console — waiting for plan" / "plan ready" for CLI seats.
Tests: engine tests/test_app.py (4 tests / 11 cases for the route); renderer SkyboltApp.test.tsx (the "renders a live console for local CLI planning seats" test now asserts no /plan, prompt + artifact-path submitted, and auto-ingest of cards) plus a new modals/planningSessionModel.test.ts.

2026-06-25

Agent Terminal —

Web (modified): apps/web/src/components/TerminalView.tsx (surfaceActive prop
- activation-refit effect), apps/web/src/components/TerminalView.test.tsx (two regression tests: refit on hidden→foreground, no refit while staying foreground), apps/web/src/app/project/tabs/terminals/TerminalsTabBody.tsx (surfaceActive prop forwarded to each panel), apps/web/src/app/project/ProjectSurface.tsx (passes active as surfaceActive). Validation
pnpm --filter @skybolt/web exec vitest run src/components/TerminalView.test.tsx — green (41 tests).
pnpm --filter @skybolt/web typecheck — green.
pnpm --filter @skybolt/web exec eslint (4 changed files, --max-warnings 0) — green. No backend changes. Renderer-only; no engine routes, schemas, or apps/api code changed. Follow-up. Only the Terminals-tab grid is wired; the Chats/planning inline terminal consoles still rely on their own mount/focus fits. Wire surfaceActive to those too if the same stale-geometry symptom shows up there.

Browser Redesign

Fixed two native Browser child-webview navigation bugs.
- "The browser randomly pops up on top of another tab" (visibility race): show_browser_webview / hide_browser_webview are async invokes that React fired on mount/unmount without ordering, so a slow first-time show (window.add_child) could resolve *after* the unmount hide and leave the webview composited over the newly-selected tab. All show/hide calls now funnel through a module-level FIFO promise chain (nativeWebviewQueue.ts, enqueueNativeWebviewOp) so the last-requested visibility always wins. Bounds-sync stays unqueued (idempotent, only matters while shown).
- "Can't tab out of the Browser" (focus trap): the preview is a separate native webview that owns keyboard focus, so Ctrl/Cmd+Tab / Ctrl/Cmd+Shift+Tab never reached the main window's listener. Because the child webview deliberately has no IPC (capabilities/default.json), the shell now injects a capture-phase keydown bridge (BROWSER_NAV_CHORD_SCRIPT) that forwards the two chords via a dummy skybolt-nav:// navigation; a WebviewBuilder::on_navigation hook cancels that navigation, re-emits skybolt://browser-nav-chord (fwd = cycle terminals, back = cycle tabs) to the main webview, and returns keyboard focus to it — but only when the app is already foreground, so a background loopback page can't yank OS focus. ProjectSurface listens via onBrowserNavChord and runs the same actions the keydown branches do. event.repeat is dropped to kill auto-repeat spam. Known limits: default chords only (the child can't read the user's keymap), and the bridge swallows Ctrl/Cmd+Tab on previewed pages (escape-hatch tradeoff). No capability change — the child webview stays IPC-less.

Cloud Metadata Sync —

Symptom: after reopening + restoring 5 SSH terminals, ~10s later they were deleted on this machine and vanished on the other machine too — and the remote tmux work was at risk (the orphan reaper kills skybolt-* sessions with no owning row).
Cause: the earlier "deaths tombstone" change — machines._mark_terminal_process_exited / _mark_terminal_machine_restarted enqueued a delete tombstone whenever a live terminal's local PTY was detected dead. For an SSH+tmux terminal a dropped local ssh process is a DETACH, not a death — the remote session lives on — so tombstoning it (a) propagated a delete to every machine and (b) orphaned the rows so the reaper could kill the remote tmux.
Fix: removed the tombstone-on-exit entirely (and the _tombstone_terminal_sync helper + the threaded settings params). A terminal exiting/detaching now only updates the LOCAL row to exited; it is never removed from the synced set. Cleanup of genuinely-dead rows stays with _prune_dead_terminals (prior-session, tmux-protected) and explicit user delete (which intentionally tombstones + kills the remote tmux). The live-only backfill filter still prevents dead rows from being pushed UP.
Test: test_reconcile_does_not_tombstone_a_terminal_that_stops_being_live (was the inverse assertion).
Confirmed SSH+tmux terminals already reattach the LIVE remote session on another workstation via the existing Restart action (tmux session skybolt-<synced-id> is stable across machines). No code change needed for Tier A.
Tier B (local) fix: routes/terminals.py::_reanchor_local_cwd — restart_terminal now re-anchors a synced terminal's foreign absolute cwd to THIS machine's project root (keeps a same-machine subdir; SSH keeps its remote path). Test added.
Full design + remaining work (bulk "Restore terminals" endpoint, frontend Auto/Ask/Off, subdir-cwd refinement) in documents/features/cross-machine-sync/terminal-continuity.md.
agent_notes never propagated: agent_notes.insert_agent_note now enqueues the sync upsert INTERNALLY (threaded settings), so every note — orchestrator, questions flow, routes, scheduler — syncs. (The scheduler's existing _note_synced wraps now double-enqueue the same key, which the outbox UNIQUE coalesces; harmless.)
orchestrator AddDependency: the card_dependencies insert now tombstone-free upsert is enqueued (was a silent bypass).
browser_sessions downgraded CLOUD → LOCAL: it was registered CLOUD but no write site ever enqueued it, so it only reached the cloud via backfill and would re-materialize at the default status on another machine — the same flood terminal_sessions had. Made the registry honest (no behavior change today). "Open browsers follow you" is deferred until it gets the terminal_sessions live-only discipline.
_purge_archived_project_agent_name: the archived-project_agents purge now tombstones each removed agent and re-syncs the notes it resolves (threaded settings from the agent create/rename routes) — so reclaiming a name slot propagates instead of being re-pulled.
sync_outbox.tracked_delete: new delete-side funnel (DELETE + tombstone atomically), the analog of tracked_upsert.
CI guard tests/test_sync_delete_coverage.py: static AST test that FAILS on any raw DELETE FROM <cloud_table> in a non-sync-layer file without a tombstone (recognizes tracked_delete/note_delete/note_change(op="delete")/_note_custom_persona_delete). A short, justified allowlist covers the legitimately-raw sites: agent-control terminals (never synced), built-in personas (never synced), _delete_custom_persona (callers tombstone via the helper), and the dead secret_vault.delete_secret (documented: must tombstone if ever wired).
Goal: signing in / booting a machine (with internet) should make it match the cloud 100%, including deletions made on other machines.
How deletes propagate: as tombstones in the per-scope version stream; the boot pull (already complete-to-head) applies them. The gap was (a) some deletes never emitted a tombstone, and (b) nothing ever removed a local row just because the cloud lacked it.
Cascade-delete tombstone fixes (UP direction — make deletes reach the cloud): SQLite FK ON DELETE CASCADE removed child rows locally with NO tombstone, leaving them live in the cloud forever. Now tombstoned explicitly before the parent delete:
- routes/cards.py delete_card → tombstones the card's card_dependencies rows.
- routes/terminals.py delete_terminal_profile / delete_terminal_group → tombstone the affected terminal_group_members.
Full-state delete-reconcile (DOWN direction — the guarantee):
- Cloud API (apps/api): new read-only GET /api/v1/sync/manifest?scope_key=&since=&limit= returning the LIVE (non-tombstoned) {entity, record_id, server_version} set for a scope (routes/sync_v1.py, schemas/sync_v1.py). No schema/migration.
- Engine: cloud_client.sync_manifest; sync_engine.reconcile_deletions deletes every CONFIRMED-synced local row (has sync_record_state) whose (entity, record_id) is absent from the manifest — catching deletions missed by any tombstone/cascade/purge. Safe by construction: a row with a pending un-pushed edit is kept (pushes UP); a local-only row (no record_state) is never touched; an EMPTY manifest while we hold synced rows is treated as suspect and skips deletion (never wipe a scope on one bad read); a CloudClientApiError (e.g. cloud not yet redeployed with /manifest) skips gracefully.
- Trigger: sync_service runs it on the first *successful* pass after boot (full_reconcile), and the existing "Refresh this machine" (restore) path also runs it — making restore a true match-head. Incremental on-change passes do NOT run it.
Rollout note: the engine tolerates an un-redeployed cloud (manifest 404 → skip), but the guarantee only takes effect once apps/api is redeployed with the /manifest endpoint.
Known follow-ups still open: agent_notes.insert_agent_note and orchestrator_ops AddDependency bypass the outbox; browser_sessions is registered CLOUD but never enqueued (doesn't sync at all); deleting a project doesn't tombstone its project-scope rows (masked by the registry tombstone + the boot reconcile). A single tracked_delete funnel + a CI guard against raw DELETE FROM <cloud_table> would prevent the whole class.
Tests: cloud-API manifest tests (live-only, pagination, isolation); engine reconcile_deletions (removes absent, keeps live + pending-local) + empty-manifest safety; cascade-tombstone tests for card_dependencies and terminal_group_members.
Terminal flood symptom: signing into a second machine re-opened 58 / 19 terminal tiles per project.
Root cause: sync_backfill enqueued EVERY terminal_sessions row regardless of status, so the historical pile of exited/failed/machine-restarted rows (and rows pulled in from the other install) all pushed to the cloud; on apply status isn't a synced field, so each landed at the table default 'requested', which the renderer treats as an open terminal.
Fix (keep auto-open, stop accumulation):
- sync_backfill._scope_clause now only enqueues terminal_sessions that are status='live' AND engine_terminal_id IS NOT NULL (genuinely open + locally bound). engine_terminal_id is device-local, so a synced-in copy (NULL) is never re-pushed.
- machines._mark_terminal_process_exited / _mark_terminal_machine_restarted now tombstone the cloud copy when a live terminal's PTY dies (via _tombstone_terminal_sync), keeping the local row — so the synced open-set tracks live terminals.
- One-time guarded backlog sweep sync_engine._sweep_dead_terminal_sessions (marker terminal_open_set_cleanup_v2) — a clean slate: removes every local terminal row that is NOT live+bound (the synced-in requested tiles, exited/failed, restart tiles), KEEPING the user's actually-attached terminals. Owned-dead rows also tombstone so the removal propagates; a synced-in requested copy is delete-only (tombstoning it could delete a peer's live terminal). (v1 only tombstoned exited/failed; v2 clears the whole accumulated tile flood and re-runs on bump.)
Card writes that bypassed the sync outbox (so changes never propagated): wired through sync_outbox.note_change (the _note_scheduler_sync pattern, schema="proj"): orchestrator_ops.apply_decisions/_apply_one (CreatePrepTask/SplitCard card + card_dependencies inserts), budgets.pause_card, questions.raise_question / answer_question (card status + agent_questions). settings threaded from the callers (orchestrator._execute_leased, scheduler, routes/questions).
Known follow-ups (not yet done): agent_notes rows written via agent_notes.insert_agent_note still bypass the outbox; orchestrator_ops AddDependency also inserts a card_dependencies row directly. Same note_change treatment needed.
Tests: terminal backfill-filter, tombstone-on-exit, backlog-sweep, and >200-row push drain regressions added; full engine suite green.

Cross-Machine Sync

Symptom. Startup floods the engine and hits "database is locked" even for nearly-empty projects: the boot "match cloud head" pass made redundant cloud calls and re-ran the 4-scope manifest delete-sweep on every fast-retry, while identical settings re-writes enqueued and pushed for no reason.
Fix 1 — services/sync_engine.py::reconcile_deletions. Gather the scope's confirmed-synced sync_record_state set FIRST and early-return 0 BEFORE fetching the paginated cloud manifest when that set is empty. An empty/new scope has no confirmed-synced rows, so the manifest can never delete anything — this eliminates one GET /manifest per empty scope per boot pass. Pure reorder, zero behavior change.
Fix 2 — services/sync_service.py boot delete-sweep cap. The full 4-scope full_reconcile=True delete-sweep now runs at most ONCE per process boot. sync_once is split into sync_once_detailed returning (status, ran); the loop clears boot_reconcile_pending after the first pass that actually reached the network and pulled to head (ran is True), not on every transient fast-retry. A genuinely-offline boot that never pulled keeps the flag armed so it still reconciles once online. The "match cloud head on boot" guarantee is preserved.
Fix 3 — store/settings_store.py::write_project_settings value-equality short-circuit. A per-project setting whose stored value canonically equals the new value is now a TRUE no-op: no DB write, no updated_at bump, no sync_outbox row, no request_sync nudge. Comparison is order-independent (canonical JSON), so a re-PUT with reordered inner keys is correctly seen as unchanged. write_project_settings returns the keys that actually changed; PUT /projects/{id}/settings only note_changes those.
Tests. test_sync_engine.py::test_reconcile_deletions_skips_manifest_fetch_when_nothing_synced (no cloud manifest call when nothing is confirmed-synced), test_sync_service.py::test_boot_reconcile_sweeps_once_on_transient_then_synced + test_boot_reconcile_stays_armed_until_a_real_pull, and test_settings_api.py::test_project_settings_unchanged_value_is_a_true_noop.
Symptom. With an SSH-Machine project's tmux terminals live on two workstations, leaving one AFK long enough to drop the local ssh channels and then clicking back into the project showed "Establishing Connection" and then deleted every terminal tile on both machines and killed the remote tmux sessions — instead of reattaching the still-running remote work.
Root cause (the origin). services/machines.py::_reconcile_project_terminals, run on every GET /projects/{id}/terminals (project-open), gated only on broker.is_alive(...). A tmux-backed terminal whose ssh dropped is detached-not-closed (the SshReattachSupervisor is mid-grace), so is_alive is False but is_reattaching is True. The reconcile ignored is_reattaching (unlike the attach route's 503 "reconnecting" and the scheduler, which both honour it) and demoted the rows out of live — defeating the in-flight reattach and cascading into the reaper/sync teardown.
Fix 1. _reconcile_project_terminals now continues on broker.is_reattaching(...), leaving a reattaching tmux terminal live so the supervisor reattaches it (running work + scrollback intact).
Fix 2 (cross-machine tmux-kill hole). services/ssh_persistence.py::reap_project_sessions keyed its keep-set on locally-live rows only. Because terminal_sessions.status does NOT sync and the tmux session name is shared across machines (keyed on the synced row id), one machine's orphan reaper could kill a remote skybolt-<id> session another machine was still attached to. The keep-set now protects the session of every terminal row in the project regardless of status; only a session that maps to no row at all is reaped. A genuine close still kills the remote session via delete_terminal.
Tests. test_terminal.py::test_reconcile_keeps_a_reattaching_terminal_live (row stays live, no tombstone) and test_ssh_persistence.py::test_reap_kills_only_unknown_skybolt_sessions (a non-live/peer-live row's session is kept; only the true orphan is reaped).
Not yet addressed / follow-up. The exact source of the cross-machine row *deletion* (tombstone propagation) in the original incident is not fully pinned (candidates: a failed restore-relaunch, the one-shot dead-terminal sweep coinciding with a boot pass, or reconcile_deletions converging once a peer dropped the rows from cloud head). With Fix 1 the rows no longer leave live while reattachable, so that propagation is not reached in this scenario — but confirm from engine logs whether machine A was restarted on wake vs. merely idle, and the Restore terminals on open (Auto/Ask/Off) setting. See terminal-continuity.md.
Reused HTTP connection per reconcile pass (the speed-up). cloud_client.sync_session is a context manager that yields a CloudClient backed by ONE keep-alive httpx.Client (small connection pool, 60s timeout); sync_engine.reconcile now wraps the whole pass in it instead of calling get_client (which opened a fresh httpx.Client — a new TCP + TLS handshake — on every push/pull-page/manifest-page request). A boot pass makes many serial requests per scope across the account + every project, so the repeated handshake cost dominated launch-sync latency; reusing the connection means only the first request pays it. No behavior change — tests already inject their own client, so only the production transport changed. (services/cloud_client.py, services/sync_engine.py.)
Coarse sync progress surfaced for the UI. reconcile takes an optional on_progress callback and reports {phase, label, scopes_done, scopes_total, pushed, pulled} as it moves account → each project → done. sync_service feeds it into _report_progress and exposes sync_progress(); GET /sync/status now returns progress (plus launch_synced, true once the first post-boot pass has synced — the modal's dismiss cue). The callback is advisory: a raised callback never aborts the pass.
Session-scoped "Go offline" pause. sync_service.pause()/resume()/is_paused() hold an in-memory pause (resets on engine restart — it is NOT a persistent opt-out, which stays PUT /sync/consent). While paused, every pass is a strict no-op (sync_once → "paused", no cloud contact); local edits still queue and flush on resume. New routes POST /sync/pause and POST /sync/resume (session-gated) toggle it; GET /sync/status reports paused.
Launch sync modal (renderer). app/SyncStartupHost.tsx (mounted in SkyboltApp beside SyncNoticeHost) shows once per app open while the launch sync runs: a live progress bar of what the engine is doing, a Go offline button (calls pauseSync() and dismisses), and auto-dismiss the instant launch_synced flips. If skybolt.ai is unreachable it switches to an offline message + Continue rather than hanging; a 30s safety timeout never traps the user. SyncStatusBadge now renders a paused session with the off glyph and "Sync paused — click to resume" (calls resumeSync()). API: SyncProgress type, SyncStatus.paused/launch_synced/ progress, pauseSync()/resumeSync().
Tests: engine test_sync_service.py (pause no-op, pause/resume toggle, launch_synced flip, progress forwarded to status) and test_sync_routes.py (status exposes the new fields, pause/resume routes session-gated + toggle); renderer SyncStartupHost.test.tsx (syncing modal + progress bar, Go-offline pauses & dismisses, hidden when off / already synced, offline message).

Git Operation Surface —

Added GET /projects/{project_id}/git-status: a read-only working-tree status that returns the same payload a local_changes operation produces (mirrors git_ops._git_status_metadata field names) but writes NO git_operations row. Response is validated at the boundary by the new GitStatusResponse Pydantic model. Same project run-access gate as the other git reads. Auth path/method: GET /projects/{project_id}/git-status (no query params).
The renderer's recurring Dev Ops status poll (refreshDevOpsStatus, every 5s on the Dev Ops tab / 30s in the background) now calls this read-only endpoint via the new getGitStatus client instead of POST .../git-operations {action:"local_changes"} + re-listing — so a quiet project no longer appends a git_operations row (and takes the project-DB write lock) on every tick. The live status feeds the rail border and the "Changes to Commit" card through a new optional liveStatus argument to projectGitSyncStatus, which takes precedence over the operation log when present; the log/readiness path is unchanged as the pre-first-poll fallback. The mutating readiness sweep still POSTs operations as before.
POST /projects/{project_id}/git-operations no longer holds the attached project DB connection across the multi-second git subprocess. The non-gated branch now runs git with NO connection held, then opens a short, lock-retry-wrapped transaction to INSERT the result row. This stops every concurrent request from being forced onto a fresh connection during the git call (a writer-contention / "database is locked" amplifier). Approval-gate ordering and behavior for high-authority / protected-branch actions is unchanged. The short INSERT+commit (and the project-settings PUT path) now go through the new run_project_write_with_retry helper, so a transient "database is locked" becomes a brief backoff+retry instead of an HTTP 500.

SSH Machines

Fixed orphaned skybolt-* tmux sessions piling up on the remote (ADR-0061 cleanup gap). Only delete_terminal and finalize_agent_session killed a seat's remote session; every OTHER agent-seat teardown — requeue, fail, stale-lease reap, and engine-restart recovery — routes through scheduler._release_session, which deletes the terminal row and at most terminates the LOCAL ssh PTY. For a tmux-backed seat that only DETACHES the remote session, so each requeued/reaped/recovered SSH seat leaked a live skybolt-* session. Added AgentScheduler._kill_seat_tmux_session (keyed to the durable terminal_session_id, SSH-gated, best-effort off-thread) and call it from _reap_stale_leases and recover() so those paths kill the now-orphaned remote session. Local seats and gone/unreachable sessions are no-ops. Tests: apps/engine/tests/test_ssh_persistence.py (test_kill_seat_tmux_session_*), apps/engine/tests/test_scheduler.py (test_recover_kills_orphaned_seat_tmux_session).

2026-06-24

Cloud Metadata Sync —

Symptom: a second machine "synced some board items but not all" — cards trickled in and two installs never fully converged.
Root cause: sync_engine.push_db read only sync_outbox.pending(limit=_PUSH_BATCH=200), pushed one page, and returned; reconcile called it once per scope and server_has_more was captured but never acted on. So each reconcile pass uploaded at most 200 records per scope, relying on backfill's per-row request_sync to limp the rest out 200-at-a-time — badly amplified by the _WIRE_EPOCH=3 reset, which re-enqueues every card at once.
Fix: extracted _push_one_batch and made push_db loop it, draining across pages until a page is short (outbox empty) or makes no progress (only un-pushable/un-consented rows remain), bounded by _MAX_PUSH_PAGES. Each page commits, so a mid-drain rejection (handled by _safe_push) keeps earlier pages' progress. Mirrors pull_scope's loop.
Test: test_push_drains_outbox_beyond_one_batch enqueues 250 cards and asserts one push_db call uploads all 250 with an empty outbox afterward.
Known remaining (separate, not yet fixed): some card writes bypass the outbox entirely and so never enqueue — services/orchestrator_ops.py (CreatePrepTask/SplitCard card + dependency inserts), services/budgets.py (pause_card), services/questions.py (raise_question/answer_question status transitions). These need the _note_scheduler_sync treatment (thread settings, call note_change).
Symptom: a freshly set-up machine never pulled head down; /sync/status showed last_pass.status="error", detail="CloudClientApiError: cloud sync api error 422".
Root cause: the per-record sync API validates client_updated_at as Pydantic AwareDatetime (apps/api/app/schemas/sync_v1.py), but every domain table defaults updated_at TEXT NOT NULL DEFAULT (datetime('now')) — SQLite's datetime('now') is naive (no timezone). Any CLOUD row written via the column default or a plain insert (not sync_outbox.tracked_upsert, which stamps tz-aware datetime.now(UTC)) carried a naive updated_at. sync_backfill.backfill forwards it verbatim and _build_item put it on the wire, so the API rejected the entire push batch with 422.
Why it blocked the pull: sync_engine.reconcile runs push before pull and only caught CloudClientUnavailableError, so the CloudClientApiError(422) aborted the whole pass before any pull — one un-pushable row blocked all inbound sync indefinitely.
Fix (engine):
- sync_engine._wire_timestamp normalizes client_updated_at to tz-aware UTC at the wire chokepoint (_build_item); a naive stamp is treated as UTC (the engine's clock convention), an aware stamp is untouched, an unparseable value falls back to now.
- reconcile._safe_push isolates a payload rejection (HTTP 413/422) from the pull: it records ReconcileResult.push_error and continues to pull head down; auth/server errors and offline still fail/defer the pass as before.
- sync_service.sync_once now surfaces the server's CloudClientApiError.detail (names the offending record/field) instead of the opaque "cloud sync api error 422", and appends push_error to a synced pass's detail.
Tests: FakeCloud.sync_push now mirrors the AwareDatetime contract (rejects a naive client_updated_at) so convergence tests catch this class of bug; new test_backfilled_naive_updated_at_pushes_as_aware reproduces the default-updated_at card and asserts the wire stamp is tz-aware.

Desktop App

The data dir, at-rest.json custody, and bootstrap config.json now live only in the machine-local app data dir (%LOCALAPPDATA%/Skybolt, etc.). <Documents>/Skybolt is no longer a default, a config anchor, or an auto-adopt source — only a one-time migration source. Fixes a second-machine failure where OneDrive replicated <Documents>/Skybolt onto a PC that had never run Skybolt and the synced (usually corrupt) database produced *"That recovery code doesn't match the encrypted data on this machine"* on restore.
data_dir.rs (desktop) and settings.py (engine), kept in lockstep: default_data_dir is always local; bootstrap_config_path is <local>/config.json; resolve_data_dir adopts a <Documents>/Skybolt database only when it is not cloud-synced and local holds none (standalone safety net) — a synced copy is never adopted and resolution falls through to local.
New migrate_documents_install (called in the Tauri setup hook before the sidecar spawns) physically relocates a *non-synced* <Documents>/Skybolt tree to local via the existing move_data_dir; a synced copy is left untouched. migrate_legacy_config / migrate_legacy_config_files now move a previous version's config.json/at-rest.json from Documents (or the older %APPDATA%) into the local anchor.
Tests: data_dir.rs +3 (relocate_legacy_documents_install moves a non-synced install, never touches a synced one, and skips when overridden/already-local); test_data_dir.py and test_db_corruption.py updated for the local-only default, the local config anchor, the never-adopt-synced rule, and the in-place non-synced adopt, plus a Documents→local config-migration test.

Execution Engine

Cheaper project opens for fast project switching (ADR-0074). _open_project no longer re-opens the SQLCipher-encrypted global database on every request: it caches one global connection per worker thread, keyed by (db_path, at_rest.key_epoch()), and only ATTACH/DETACHes the per-project project.db per call. set_active_key now bumps a monotonic key_epoch, so any lock/unlock/rekey forces the pooled connection to re-open under the new key. The connection proxy commits/rolls back then DETACHes and returns the global connection to the pool instead of closing; a reentrant open (pooled connection already mid-use) falls back to a fresh self-closing connection, and a failed detach evicts+closes rather than poisoning the pool. Also: the idempotent on-open self-heal (_ensure_project_settings_row + the REPO-carrier empty-check hydration) now runs once per project.db per process (_HYDRATED_PROJECT_DBS) instead of on all ~25 concurrent project-switch requests; the archived-roster purge and the git-crypt/locked-DB header check still run on every open. The renderer keeps the last 5 projects warm, so the global connection stays hot and switching back pays no re-open cost. Note: the SQLCipher (at-rest enabled) path is not exercised by unit tests — validate lock/unlock/rekey in the running desktop app.

2026-06-22

Agent Persona Catalog —

Builtin catalog deletion now propagates locally. When the cached cloud catalog is the active source, _ensure_builtin_personas prunes local builtins whose keys are no longer present. Existing project agents are not deleted; they continue from their stored persona_snapshot.
Custom personas now enqueue ADR-0069/0070 sync outbox changes on create, clone, edit, delete, and delete-via-agent-removal. Custom personas are customer-owned account metadata, not staff-managed builtin catalog rows.
The cloud control plane now exposes direct custom persona deletion at DELETE /api/projects/{project_id}/agent-personas/{persona_id} with account-admin gating, builtin-delete rejection, and an in-use guard for personas still referenced by project agents.

Agent Terminal —

terminalInputFilter.ts: removed stripTerminalMouseModeEnables, sanitizeTerminalOutput, the MOUSE_MODE_PARAMS set, and DISABLE_TERMINAL_MOUSE_TRACKING. sanitizeTerminalInput no longer strips mouse reports (it still strips DA/CPR/DSR device-status responses), so wheel/click reach the PTY.
TerminalView.tsx: dropped the two DISABLE_TERMINAL_MOUSE_TRACKING writes (on mount and on reset) and the per-chunk output sanitizer; PTY output is now written to xterm verbatim. Trade-off (standard terminal behavior). While a CLI holds the mouse, plain click-drag goes to the app, so to select text for copy use Shift+drag (Option+drag on macOS — Skybolt already sets macOptionClickForcesSelection). Shift+wheel still scrolls xterm's own buffer, and the Ctrl/Cmd+C copy path is unchanged. Plain shells never enable mouse tracking, so their drag-select and native wheel scrollback are unaffected. Per-CLI notes. Codex uses alt-screen by default; its in-app scroll is the wheel (now forwarded) plus the Ctrl+T transcript overlay. Claude Code's default "classic" renderer already keeps output in native scrollback (wheel worked there before); its opt-in fullscreen mode is alt-screen and benefits from this fix. Stripping the alternate-screen enable (?1049h) was rejected — it leaves the native scrollback empty or garbled for full-screen redraw TUIs. Files changed
Web (modified): apps/web/src/components/terminalInputFilter.ts, apps/web/src/components/TerminalView.tsx, apps/web/src/components/TerminalView.test.tsx (replaced the mouse-enable-stripping test with one asserting mouse reports pass through). Validation
pnpm --filter @skybolt/web exec vitest run src/components/TerminalView.test.tsx — green (37 tests).
pnpm --filter @skybolt/web typecheck — green.
pnpm --filter @skybolt/web exec eslint (changed files, --max-warnings 0) — green. No backend changes. Renderer-only; no engine routes, schemas, or apps/api code changed.

Cross-Machine Sync

Per-user prefs (theme, last-project) now sync. PUT /me/prefs (routes/settings.py) enqueues a user_app_prefs upsert for each *syncable* key, resolving the user's primary account for the account-scope key. backfill gained a user_app_prefs clause so existing prefs converge on first enable. Device-local keys never sync — last-terminal-machine:* (a terminal's device binding) and settings-migrated-v1 (a per-device migration marker) — via the new settings_store.is_syncable_pref_key. Tests: test_user_prefs_enqueue_sync_for_syncable_keys_only, test_backfill_user_app_prefs_excludes_device_keys.
machines (SSH defs) — confirmed blocked on a prerequisite, NOT wired. Investigation found the machines table has no clean user-facing write point: revoke_machine is a stub (no DB write) and the only real write is a *derived registration* on the scheduler per-tick claim hot path (store/project_store.py, which is deliberately read-first to avoid the global writer lock / "database is locked"), handling local + SSH together with no EngineSettings in scope. Syncing SSH machine defs (which the list_account_machines docstring already anticipates) should be done by first building the real SSH-machine CRUD write point, then wiring note_change there — not by bolting an outbox write onto the hot path. Tracked as a follow-up.
Agent roster create/update now sync. create_project_agent / update_project_agent (routes/agents.py) enqueue a project_agents upsert (route-level, schema="proj", mirroring the already-wired delete), so a refreshed machine gets the user's actual roster instead of a stale one. Test: test_project_agent_create_and_update_enqueue_sync.
Machine-local-column audit (item 14). Among currently-wired CLOUD entities the only columns that nest machine-local data are projects.git_settings_json (merged on apply, preserving local repo paths) and terminal_sessions/browser_sessions metadata_json (scrubbed to portable config on push). auth_ref/credential_ref are stable vault references (the secret value is separately double-sealed), and is_local/local_only are portable flags — none need a per-column merge. The only column set still to review is on machines (host/credential_ref/metadata_json), part of the deferred machines wiring below.
Deferred (need hot-path care or a product decision — tracked): browser_sessions (its writes are runtime-heavy in browser.py; needs config-vs-runtime discrimination so navigation syncs without churn), card_dependencies (created in services/orchestrator_ops.py — threading settings into the orchestrator hot path), live agent-run state (card_outcomes/agent_questions/ orchestrator_runs — hot path), and the two product-decision entities user_app_prefs (theme/last-project — needs account-context resolution) and machines (SSH machine defs — needs a canonical write point + the column review above). See ADR-0072 open questions.
"Refresh this machine to my latest" — directional catch-up-to-head restore (Layer 2). A new POST /sync/restore flags a one-shot device restore (sync_meta restore_requested). The next reconcile, for every scope, skips backfill + push and pulls head with restore=True so the machine converges to head WITHOUT pushing its stale rows up. Before head overwrites a record that has a divergent pending local edit, that edit is moved to a new sync_quarantine table (engine schema v24 / project schema v22) — never silently dropped, never pushed. A local edit identical to head is just cleared (no divergence). The flag self-clears once the machine reaches head with nothing left un-applied (skipped == 0).
Quarantine review (keep / discard). GET /sync/quarantine lists set-aside edits across the global DB + every project DB; POST /sync/quarantine/keep re-applies the user's reviewed version (re-enqueued as a fresh edit so it becomes head, by explicit choice); POST /sync/quarantine/discard drops it. The outbox→quarantine move is race-safe (the (seq, updated_at) guard, like clear_flushed), so an edit made mid-restore is left pending, not lost.
Relay GET /api/v1/sync/head returns per-scope high-water server_version, so a device can measure how far behind head it is without a push (the staleness read for restore / future auto-detection). Read-only; reuses the existing SyncScopeVersion counter.
Tests: engine test_restore_adopts_head_and_quarantines_divergent_local_edit, test_restore_does_not_quarantine_a_local_edit_equal_to_head, the sync_outbox quarantine helpers (incl. the race-safe move), resolve_quarantine keep/discard, the /sync/restore + /sync/quarantine routes, and relay test_head_reports_per_scope_high_water.
Backfill no longer re-stamps stale rows to "now" (Layer 0). sync_backfill.backfill now enqueues each pre-existing row with its OWN updated_at (via a new updated_at override on sync_outbox.note_change) instead of a fresh device-monotonic stamp. A machine that has been offline for weeks therefore competes in the relay's conflict resolution as the stale edit it is, so it can no longer overwrite newer canonical state on another machine (the core "stale machine pollutes head" bug). Regression test: test_backfill_preserves_existing_updated_at; the old "resurrects a deleted card" test now asserts the frozen stamp loses to the newer tombstone.
server_version is the sole conflict authority — the wall-clock tiebreak is gone (Layer 1). The relay (apps/api routes/sync_v1.py) dropped _item_wins: a push now applies iff the record is new or base_version == current.server_version; any mismatch is a CONFLICT and the client adopts the server's record. The client's client_updated_at is retained only as routing metadata (and the >24h-future guard); a fast or skewed device clock can no longer win a conflict and overwrite head. Engine FakeCloud mirrors this. Conflict-loser semantics change from "newest clock wins" to "first-to-land wins; the loser adopts head" — divergent local edits are preserved for review by the Layer 2 quarantine (next). Tests updated: relay test_stale_edit_loses_even_with_newer_clock, engine test_concurrent_edit_first_pusher_wins_loser_adopts.
Decision record: ADR-0072. Layers 2 (catch-up-to-head refresh + quarantine) and 3 (coverage gaps) are pending.
First sign-in pulls promptly instead of after a ~10-minute delay. The background sync loop (sync_service._run) now retries a pass that hit a TRANSIENT failure (error/deferred — e.g. a slow first pull that timed out, a momentary network blip, or the account key not ready mid-pass) on a short, capped, exponential backoff (RETRY_BACKOFF_BASE_SECONDS=5 → …_CAP_SECONDS=120, up to MAX_FAST_RETRIES=8 attempts), instead of parking on the full POLL_INTERVAL_SECONDS (600s). A receive-only fresh machine makes no local edits to re-trigger the loop, so previously a single failed first pass meant data didn't arrive until the ~10-minute poll. Persistent states (disabled/locked/no-token/no-identity) deliberately still park on the long poll — they can stay that way indefinitely (signed out / engine locked) and fast-retrying would churn the global DB (which on Windows contends with at-rest re-encryption's os.replace). Decision logic extracted to the pure, tested _next_wait_seconds.
Honest "Synced" status on a fresh machine (renderer). SyncStatusBadge previously showed "Synced" whenever sync was enabled and nothing was pending — true on a brand-new machine before any cloud PULL had run. It now tracks everPulled (derived from scopes[].last_pull_at in /sync/status) and shows "Setting up sync…" with a spinner until the first pull lands, then flips to "Synced". Fast-polls (3s) while awaiting the initial pull.
Wire-epoch reset no longer resurrects deleted records. _reset_stale_sync_state (the one-time _WIRE_EPOCH reset that wipes sync_record_state + sync_cursor) now reports whether it ran, and reconcile then drains pending tombstones (_drain_reset_tombstones) for the account scope and each project scope BEFORE backfill. Without it, a row this machine still held but that was deleted on another machine would be re-enqueued by backfill as a fresh-stamped upsert that beats the server tombstone (LWW), resurrecting the record (e.g. a deleted Kanban card) on every machine. The drain is deletes-only, does not advance the cursor (the normal pull re-applies upserts), and SKIPS records with a pending local upsert so an un-pushed local edit is never clobbered. Steady state (no reset) is unchanged. Regression tests in tests/test_sync_engine.py reproduce the resurrection without the drain and confirm the drain prevents it without edit-loss.

Dashboard

Removed the inert dashboard AI insights panel and disabled Analyze projects action because dashboard AI generation is not planned for the current product direction.
Updated renderer tests and docs to describe only the real dashboard surfaces: first-run, quick launch, cross-project rollups, and project spotlights.
Earlier in the cleanup pass, the dashboard AI insights panel was documented as an inert placeholder. The later 2026-06-22 entry above supersedes that direction: the placeholder is removed instead of reserved.

Desktop App

Updated desktop packaging/update documentation to reflect the current release path: Windows builds prepare a bundled CPython runtime with apps/engine installed, map it into the bundle as engine-python, and use the Rust skybolt-engine binary as a launcher/fallback rather than as the production engine implementation.
Renamed local/CI validation labels from "sidecar stub" to "sidecar launcher" so check output no longer implies packaged builds ship only the health stub.

Execution Engine

Closing a terminal now propagates the close across machines. delete_terminal enqueues a cross-machine sync delete tombstone for the terminal_sessions row (mirroring delete_terminal_profile/delete_terminal_group). Previously the row was hard-deleted locally with no tombstone, so — because terminal_sessions sync as open-terminal config — the cloud copy survived and the next pull re-applied it, resurrecting the closed terminal on this and every other machine. Strict no-op when sync is off.
Dead terminals no longer accumulate and re-open on startup. _reconcile_project_terminals now prunes PRIOR-session human terminals that are exited, not restartable, and not tmux-backed (was_created_here is False), enqueuing a sync tombstone for each so the removal propagates and the row is not pulled back. Kept: live/active rows, machine-restarted Restart tiles (restore-on-restart), tmux-persistent rows (their remote session may still be alive), failed launch errors, agent-owned rows (scheduler/cards own those), and CURRENT-session exits (so a just-exited terminal still lists and attach 409s — pruned on a later session). Fixes "startup re-opens every terminal we've ever had."
Terminals grid hides dead/exited tiles (renderer). isVisibleTerminal now renders only active/in-capacity, restartable (machine-restarted), and failed rows; plain non-restartable exited rows are hidden, so the grid no longer shows every historical terminal.

Git Operation Surface —

Added an active-file decision summary to the merge-conflict resolver so users can see how many changed runs are taking current, incoming, both, or neither side.
Added batch controls for taking both sides or dropping all changed runs in the active file, alongside the existing take-all-current and take-all-incoming actions.
Expanded the conflicted-file rail with richer states for open, manually edited, resolved, file-level, loading, and load-error conflicts.
Improved the merge-conflict resolver with a visible resolved-file count, active file position, previous/next navigation, and auto-advance to the next unresolved file after marking a file resolved.
Added a Dev Ops Advanced Git Actions panel that wires existing engine actions for fetch, pull, stash save/list/pop, tag create/list/delete, rebase, cherry-pick, commit amend, and force push.
Added follow-up refresh rules so these actions re-probe local changes, history, conflicts, stash list, or tag list as appropriate after direct execution or approval resolution.
Added GitHub Actions run list/detail/rerun/cancel Git operation actions through the selected Machine's GitHub CLI (gh), including SSH Machines.
Expanded Azure pipeline support from read-only list/detail to list/detail/queue/rerun/cancel through Azure CLI (az) on the selected Machine.
The Pull Requests modal now includes a provider-aware CI runs panel for GitHub Actions or Azure Pipelines, with status, detail, open, rerun, and cancel controls. Azure Pipelines also exposes a queue form for pipeline id plus optional branch.
Accepted ADR-0071 and implemented native Azure DevOps controls through Azure CLI on the selected Machine (az locally or over SSH).
Added azure_auth_probe, PR list/detail/create/complete/abandon, and pipeline run list/detail Git operation actions. Queue/rerun/cancel controls were added in the later CI run controls entry above.
The PR modal now uses native Azure DevOps actions while retaining browser links for deep Azure views.
Added SSH-machine support for github_repositories, github_repository_branches, and repo_clone, so clone-based GitHub setup can run where the selected Machine lives.
repo_clone now mirrors local confirmation semantics over SSH: an existing target folder blocks with requires_confirmation, then allow_existing_files initializes/adopts the remote folder, configures origin, fetches the default branch, and checks it out.
Re-enabled clone-based GitHub/Azure setup choices for SSH Machines in the Dev Ops setup UI.
Added SSH-machine execution for GitHub PR lifecycle actions (github_prs, github_pr_detail, github_checks, github_create_pr, github_merge_pr, github_close_pr) by running gh inside the remote repo through the existing strict SSH command runner.
The initial SSH setup guard hid clone-based GitHub/Azure setup while remote clone was still blocked; the later remote-clone entry above supersedes that temporary UI posture.
Added ADR-0071 as the native Azure DevOps PR/pipeline auth direction: use Azure CLI on the selected Machine; keep browser handoff until that is implemented.
Updated Git Operations docs to distinguish implemented local GitHub PR lifecycle support (gh CLI on Local Machines) from then-remaining M7 work.
Updated the feature index and program status summary so future planning does not treat all GitHub PR lifecycle work as greenfield.

Project Members —

Clarified that add-member and role-change APIs are intentionally deferred until the account/team model, cloud/sync propagation, authorization semantics, and security tests are designed.

Public Site

Clarified that the waitlist and pricing surfaces are intentional placeholders. The waitlist must not transmit data until the API owns validation, retention/privacy, rate limiting, spam protection, and operator visibility. Pricing remains copy-only until a billing/licensing ADR exists.
Deferred the public-site waitlist cleanup into a larger waitlist/admin-portal wave. Any real waitlist endpoint now requires the admin/operator portal scope to be designed alongside the API and public form wiring.

2026-06-21

Agent Terminal —

Registering a CLI seat (Settings → Seats → New Seat) now pops a SeatTrustModal heads-up. A seat's CLI (Claude Code, Codex, mmx, Antigravity) asks the user to approve/trust the project directory the first time it runs there, and that prompt only appears inside a real terminal — so the modal warns the user before Skybolt opens one.
"Open terminal" launches an agent terminal for the just-registered seat via the existing createAgentTerminalSession action and jumps to the Terminals tab so the trust prompt is visible immediately. If the directory is already trusted, no prompt appears and the user can just close the terminal.
Applies to every seat kind. When the project role can't open terminals (canCreateTerminal is false) the open action is disabled with an explanation; "Not now" dismisses without launching anything.
Code: apps/web/src/app/project/modals/SeatTrustModal.tsx, ProjectSurface.handleSeatCreated / openSeatTrustTerminal.
Test coverage: SeatTrustModal.test.tsx (message copy, label fallback, disabled-without-terminal-permission, busy state).

Agents Execution

Problem: when a card tripped its per-card runaway break (max_sessions_per_card, attempts, or hosted cost), it went paused and there was no way to clear the meter. card_usage counts *all-time* agent_sessions/chat_runs for the card, so re-queuing a paused card immediately re-tripped card_budget_breach on the next scheduler tick and bounced it straight back to paused. The intended "resume by moving it back to the queue" path (per services/budgets.py) silently did nothing.
Engine: new local-only cards.budget_reset_at column (project-DB schema v20, storage/project_db.py, additive via the idempotent ladder + drift canary). When a paused card is PATCHed to queue (the "drag back to Ready" gesture), the cards route (routes/cards.py) stamps budget_reset_at = now and clears pause_reason. card_usage (services/budgets.py) gained a since= baseline that windows both the session/attempt query and the cost query to work created after the reset; card_budget_breach and the /usage endpoint pass the card's baseline. The result: one fresh budget window covers all three meters (sessions, attempts, cost), so the card actually becomes dispatchable again.
No renderer change: the board already PATCHes {status: "queue"} on a drag/move (ProjectSurface.moveCard), and queue is a valid move target from the Planning column where paused cards render. budget_reset_at is not synced (mirrors pause_reason/budget_json) since budgets meter machine-local work.
Tests: tests/test_cost_and_breaks.py::test_requeue_resets_session_budget — trip the session budget via the scheduler, assert paused + breach, PATCH back to queue, then assert budget_reset_at set, pause_reason cleared, and no breach. Schema-version assertions bumped to 20 in test_scheduler.py/test_storage_split.py.
Problem: the board only refreshed on the renderer's 5s PROJECT_REFRESH_MS poll, while the scheduler advances cards roughly every second (scheduler.py poll_interval = 1.0). A card that ran queue → working → review in under 5s was already at its final status by the next poll, so the board lagged and visibly "skipped" the intermediate columns.
Engine: new Server-Sent Events endpoint GET /projects/{id}/cards/events (routes/cards.py) subscribes to the process-global scheduler_bus, filters events to the requested project_id (the bus fans out to every subscriber, so per-project filtering is mandatory for isolation), and streams one data: frame per matching event with : ping heartbeats when idle. Authorization mirrors list_cards; the project DB is opened only for the auth check and released before the long-lived stream begins (the stream reads the in-process bus, never the DB). The subscription is always released on disconnect. The card PATCH route now also publishes card_status_changed on human-driven column moves — agent-driven transitions were already covered by the session_* bus events the stream forwards. (Realizes the "future WS fan-out" anticipated in services/events.py.)
Renderer: streamCardEvents (api.ts) opens the stream with a fetch-based reader — not the native EventSource, which cannot send the desktop shell's Authorization header — reusing the same auth context as every other request, and reconnects with capped linear backoff. ProjectSurface subscribes once per project and, on any frame, debounces a loadRuntimeState("metadata") refetch (CARD_EVENT_REFETCH_DEBOUNCE_MS = 300 in surfaceShared.ts) so a burst of events for one move collapses to a single refresh. The 5s poll stays as a backstop.
Tests: tests/test_card_event_stream.py (project filtering, forward + heartbeat, unsubscribe on disconnect, the auth gate, and an authorized request returning an SSE StreamingResponse) and api.cardStream.test.ts (SSE frame parsing / comment-frame handling).
Known gap: a few non-session card transitions still emit no bus event (e.g. budget pause → paused in services/budgets.py), so those surface on the next poll rather than instantly.
Root cause: the non-CLI planner runs each turn through the shared hosted-model loop (services/agent_chat.py::run_agent_chat). Its only "model is done" signal was "this streamed round produced zero executable tool calls" → return. A weaker tool-calling model (e.g. GLM 5 Turbo) could end a round with investigation preamble ("…Let me find where the scheduler actually sends prompts to seats:") but no usable structured tool call, so the loop returned that dangling line. With no FINAL_PLANNING_JSON:, the Mission Room rendered a half-finished message, produced no plan/cards, and asked no question — the "stuck" state. The run was marked done with no error.
run_agent_chat now detects an unfinished no-tool-call round (_looks_unfinished: finish_reason length/tool_calls, empty text, text ending on a colon, or a tool name written as prose) and, while rounds remain, appends the partial text + a continue/finish nudge and runs another round instead of returning the fragment. The final tools-disabled round is preceded by _FINAL_ANSWER_NUDGE so the budget-exhaustion path yields a real answer. Clean answers and clarifying questions (end in ?) are unaffected.
The loop's round budget is now the max_tool_rounds parameter (default MAX_TOOL_ROUNDS = 6). routes/chat.py::_prepare_chat_run detects a planning thread (metadata.kind == "planning") and passes _PLANNING_TOOL_ROUNDS = 12, so a first-turn file investigation isn't cut off mid-read.
Tests: tests/test_agent_file_tools.py::test_tool_loop_nudges_unfinished_turn_instead_of_stranding (regression for the exact colon-trailing shape) plus the updated round-cap backstop assertion.

Browser Redesign

Added drag-and-drop reordering to the Browser sessions rail. Dragging a session card onto another drops it before/after that card (based on the pointer's vertical position) and switches the list into a new "Custom order" sort. The hand-arranged order and the chosen sort persist per project in localStorage (skybolt:app:browser-session-order:<projectId> / skybolt:app:browser-session-sort:<projectId>), so a saved arrangement survives reloads and newly created sessions append at the end. Alt+ArrowUp / Alt+ArrowDown on a focused session is the accessible keyboard equivalent. Ordering stays client-side and presentational (applyManualOrder, reorderIds, moveIdByOffset in browserPresentation.ts); no engine schema change.
Added a Sort control to the Browser sessions rail. Users can order the session list by Newest first (default, matches the engine's created_at DESC), Oldest first, Title A–Z / Z–A, or Status. Sorting is client-side and presentational (sortBrowserSessions in browserPresentation.ts); the control only appears when there is more than one session. Surfaced created_at/updated_at (already returned by the engine serializer) on the BrowserSession API type.

Desktop App

Added a global UpdateDialog (apps/web/src/app/UpdateDialog.tsx), mounted once at the app root in SkyboltApp alongside EngineRecovery. On launch it runs one check_for_update (then re-probes every ~6 hours) and, when a newer signed version is available, pops up a modal asking the user to Install & Restart or Cancel for now. It reuses the existing check_for_update/install_update Tauri commands and the same pre-exit install semantics as the settings panel.
Replaced the never-mounted UpdateBanner (a passive corner card that was never wired into the app) with this modal. "Cancel for now" hides the dialog for the session; a newer version re-opens it and the same version re-appears on the next launch. Up-to-date / unconfigured / error results and any check failure stay silent so a failed probe never blocks the app.

Execution Engine

Mission Room Staged Cards collapsed to a clean step list. The planning-session staged-cards panel no longer renders every card's nine detail fields inline. Each card is now a compact row — Step N, assignee badge, title, and a two-line summary — with a View / Edit details button that opens a new nested PlanningCardDetailsModal for the full editable detail (title, assignee, summary, scope, and all list fields). Edits stay live (same draft-card state + snapshot persistence); the selection checkbox remains on the row so Add Selected/All is unchanged. New files: modals/PlanningCardDetailsModal.tsx (+ test). Test coverage: PlanningCardDetailsModal.test.tsx (renders details, live title/list edits via onUpdate, Done closes).
Staged cards default to unchecked. normalizePlanningSessionCard now seeds selected: false, so a fresh plan arrives with no boxes ticked — the user opts in per card (Add Selected) or skips the checkboxes and uses Add All. Snapshot resume still reapplies the saved per-card selection.

Git Operation Surface —

Bug: clicking Merge & close branch with another branch selected appeared to do nothing — the branch stayed in the list — until you clicked it a second time, which finally dropped it. Root cause: merge_close is a high-authority (gated) action, so the first click only *queues an approval*; the merge+delete runs later, when the approval is approved. requestDevOpsOperation fired the follow-up metadata probes (branches, commit_history, …) at *request* time — while the branch was still merely queued for deletion — and resolveApproval only reloaded existing operation rows, never re-running those probes after the operation actually executed. So the deleted branch lingered until the next git action re-probed branches. The second click's request-time probe was that "next action," which is why two clicks looked necessary.
Fix: resolveApproval in ProjectSurface.tsx now re-runs followUpGitActions(action) for the approved git operation before refreshing runtime state, mirroring the direct-action path. The Merge & Close list (and commit history / working tree) now updates the instant the approval is approved. This corrects the same lag for every gated git action (merge, merge_close, branch_delete, rebase, …), not just merge_close.

2026-06-20

Git Operation Surface —

Interval re-probe of the working tree: the Dev Ops "Changes to Commit" card used to refresh only once on tab entry (or on a manual Refresh) — the 5s runtime poll just reloaded existing operation rows, it never re-ran the local_changes probe, so newly-edited files didn't appear until you clicked Refresh. ProjectSurface now re-probes on an interval: every DEVOPS_ACTIVE_REFRESH_MS (5s) while the Dev Ops tab is open and every DEVOPS_BACKGROUND_REFRESH_MS (30s) elsewhere in the project (keeps the rail border below current while you work in other tabs). The poll is a new silent refreshDevOpsStatus() that fires only local_changes and reloads just the Git-operations list — it never sets busy (no toolbar/refresh-button flicker) and never raises the global error banner (a failed probe is recorded as an operation row, which drives the log + the red border). It only runs for a configured project with a repo path, and skips while a clone/connect is in flight. Both intervals live in surface/surfaceShared.ts.
Sync-state border on the project card: the active project's left accent line in the ProjectRail is now colored by its Git working-tree state instead of the fixed accent blue — green (#10b981) when the tree is clean and in sync, amber (#f59e0b) when there are uncommitted/unpushed changes (files need committing), red (#ef4444) when the latest working-tree probe is failing/blocked (sync erroring), and the original blue (#0077e4) when there's no Git data yet. The state is derived by the new projectGitSyncStatus(gitOperations, readiness) in tabs/devOpsValues.ts — the same Git-operation log + readiness the Dev Ops tab reads, so the card and the "Changes to Commit" view never disagree — computed in ProjectSurface and lifted to ProjectRail through IdeShell. Error keys off the continuous local_changes probe (re-run by the interval above) rather than one-off commit/push failures, so the border self-clears as soon as the next probe succeeds. Scope: only the active project's card carries the color (it's the only project whose Git status is loaded); other cards are unchanged.

2026-06-19

Machines

Scoped the project overview's Machine Dock (and its "machine online"/"browser preview" readiness signals) to the machines actually added to that project, instead of the whole account roster. GET /accounts/{id}/machines returns the local engine plus every active SSH machine — including ones registered for other projects, since SSH machine connection details sync across the account's devices (ADR-0035) — so an SSH machine like ai-dev-01 set up for a different project was appearing on every project's overview. The renderer now resolves the project's Execution Target(s) against the roster via the shared projectScopedMachines helper; the dock shows nothing (its "Add a machine" empty state) when no machine is configured yet. The account-wide list is unchanged elsewhere (machine pickers, DevOps fallback, browser tab).

2026-06-18

Account & Project Sync —

Bug: Startup and project-rail status reads call GET /account/project-sync/overview. That route could seal and upload enabled local_newer projects before returning, so a simple status read could turn app boot into cloud upload work.
Fix: The overview route now only classifies project state. local_newer stays visible until the background project-sync loop catches up or the user presses the per-project Sync now action.
Tests: Updated the overview regressions to verify GET /account/project-sync/overview does not call the push path.
Bug: A project could remain local_newer even though the user expected Skybolt to upload the newer local state automatically. The background project-blob loop only compared the project DB content signature to its last baseline, so if the file signature was already baselined while the local generation was still ahead of skybolt.ai, the loop returned unchanged and waited for manual action.
Fix: project blob sync now persists the last generation accepted by skybolt.ai per project. The background loop forces an upload whenever local generation is ahead of that cursor, even if the file signature is unchanged. If skybolt.ai rejects a PUT because it already has that same generation, the engine treats it as already current and baselines the cursor instead of bumping another generation. The project-sync overview endpoint remains read-only and reports local_newer while the background loop catches up.
UX: the Git mirror advanced disclosure now labels an older .skybolt/project.sealed artifact as Git Mirror Behind Local State / local-newer pending work instead of a red sync conflict. The destructive rollback path is still available as Take repo version.
Tests: Added test_project_push_if_changed_forces_upload_when_local_generation_is_ahead and overview coverage for read-only local_newer classification, and updated SyncResumePanel.test.tsx local-newer coverage.
Bug: After an agent changed project data, the rail could read local_newer before the project-blob background loop finished its push. The engine then caught up within seconds, but the rail had no follow-up read, so the blue Pending pill stayed until Account Settings > Sync forced a fresh overview.
Fix: useRailSyncStatus now schedules a bounded follow-up only while an enabled project is local_newer. It rechecks long enough for the background push loop to clear the transient state, then stops once the project is synced. Idle rails still do not poll skybolt.ai.
Tests: Added a rail regression for pending -> synced catch-up without idle polling.
Bug: StartupSyncIndicator existed but was not mounted in SkyboltApp, so app boot did not run the renderer's startup sync pass. The rail could stay on stale blue Pending project pills until Account Settings > Sync loaded a fresh project overview. Synced project menu images also appeared missing until a manual sync path refreshed me.projects.
Fix: SkyboltApp now mounts StartupSyncIndicator for signed-in users. After startup sync reaches skybolt.ai, the indicator publishes fresh project/account sync status for the rail. When the account bundle applies project display-field changes or restores cloud-only projects, it also refreshes me, so synced project names/colors/menu images show in the rail without opening Sync.
Tests: Added startup-sync coverage for publishing fresh rail status.
Bug: The project rail badges loaded their compact sync status once and did not update while the Sync hub was open. The hub could show every project as synced while the rail still showed stale blue Pending pills from the earlier local-newer state.
Fix: The Sync hub now publishes its fresh project/account sync status inside the renderer after loading or reloading, and the rail consumes that local event to update badges without adding idle cloud polling or duplicate status reads.
Tests: Added rail and hub regressions for event-driven badge refresh.
Bug: The cloud project-blob channel and the Git mirror share the same per-project generation counter. If the Git mirror sealed generation N, the cloud row could show local_newer; clicking Sync now or waiting for the changed-project loop resealed the same DB as generation N+1 instead of uploading generation N. That made the two sync lanes chase each other even when no source code was being synced.
Fix: Before allocating a new generation, the blob push path now checks whether the existing sealed artifact's generation and stored DB hash match the current project DB snapshot. If they do, it refreshes the cloud blob at that same generation (still including the encrypted portable-secret vault) rather than bumping again. Real project DB edits still allocate the next generation.
Tests: Added test_project_blob_push_reuses_current_git_mirror_generation.
Bug: Startup sync refreshed the app shell only when the account-settings pull returned status: applied. If the settings bundle was already current but sync-now materialized a cloud-only project through the project-blob restore pass, Account Settings > Sync showed the project while the left rail still used the stale me.projects snapshot.
Fix: StartupSyncIndicator now also refreshes me when sync-now.projects.restored contains a restored project, so newly materialized projects appear in the left rail without a full reload.
Tests: Added renderer coverage for a project-only restore result.

Account Settings Sync —

Bug: StartupSyncIndicator allowed two 75 second sync-now attempts plus backoff. A slow cloud call could leave app boot sitting in the sync state for minutes before the project became usable.
Fix: Startup sync now gets one 8 second attempt by default, then hands off with a non-blocking "Keep working" state. A few local app refreshes are queued so a project restored just after the visible startup budget can still appear without trapping the user in the spinner.
Tests: StartupSyncIndicator.test.tsx now covers the one-attempt default handoff.
UX: the Settings tab no longer shows a separate Models summary card. Providers are the management entry point because models are edited under the provider that serves them.
UX: synced providers now show a Needs key badge with a plain Edit action instead of a Set API key button in the provider manager.
UX: provider and model rows in the manager are clickable to open Edit, while the explicit Edit and Delete buttons remain separate actions.
Fix: saving a provider with a Test model now remembers that value on the provider metadata so reopening the provider can test it again without adding it to the catalog model list. The edit-provider modal uses a 50%-width layout with provider details on the left and provider-owned models on the right, and the old saved-provider Probe section was removed.
Fix: the provider list now uses the engine's needs_key flag instead of treating health_status='unknown' as missing credentials. A newly saved but untested provider no longer shows Needs key when its local key reference resolves.
Tests: test_app.py, ProviderModelsPanel.test.tsx, ProvidersPanel.test.tsx, and affected SkyboltApp.test.tsx settings flows.
Bug: the project-scoped provider/model migration added project_id to the schema and docs, but the account-settings bundle still omitted it. A restored or freshly synced machine could insert providers and catalog models with project_id = NULL, making them disappear from each project's Settings tab because the routes correctly filter by project.
Fix: provider and catalog model serializers now carry project_id, and apply writes it on insert/update while preserving an existing scoped assignment when reading an older bundle that lacks the field.
Tests: test_bundle_excludes_all_forbidden_material, test_seal_unseal_apply_round_trip, and test_provider_model_apply_preserves_existing_project_id_for_older_bundle.
Bug: StartupSyncIndicator was implemented but not mounted in SkyboltApp, so the renderer did not run the startup settings-sync pass after sign-in. Synced project display fields such as menu_icon_data_url, menu_color, and names could remain stale in me.projects until another manual sync surface refreshed the app.
Fix: The signed-in app shell now mounts StartupSyncIndicator. When startup sync applies a settings bundle or restores cloud-only projects, it refreshes me; after any successful startup reachability pass, it publishes fresh compact sync status for the project rail.
Tests: StartupSyncIndicator.test.tsx covers the fresh status publish path.
Change: new projects no longer inherit every provider/model/seat automatically. Providers, catalog models, model routes, and CLI seat profiles are scoped to the current project.
UX: the Settings tab now offers an explicit import action that copies providers, catalog models, model routes, and CLI seat config from another project in the same account. The import remaps ids, skips already-present target rows, and is safe to repeat.
Compatibility: engine schema v19 adds project_id to model_providers, catalog_models, and model_routes. Legacy account-global rows are backfilled onto the oldest project in the account so existing setups remain visible somewhere instead of being copied into every new project.
Tests: test_model_settings_are_project_scoped_and_importable.

Agents Execution

Fixed DispatchService._assigned_sessions so it recursively finds assigned sessions under the normal dispatch result keys (handyman, review, merge) as well as aggregate agents lists. The dispatcher was creating queued sessions but not publishing the immediate agent_session_queued scheduler wake for the single-agent result shape.
Hardened test_dispatcher_publishes_scheduler_wake_after_assignment with a bounded asyncio.wait_for, so a missing wake fails quickly instead of hanging the full engine pytest suite around the dispatcher tests.
Added four-worker pytest-xdist, per-test timeouts, --durations=20, and a three-minute GitHub step timeout to the Linux engine pytest CI step. The full engine suite now has a measured local wall time under three minutes instead of running serially for more than seven minutes.
Added card-level work_mode and session-level participation_role, resolved through services/work_contracts.py into a claim-time AgentWorkContract snapshot in agent_sessions.work_metadata_json.
research and spike now default to report_only; feature, bug, docs, test, security, and chore default to repo_change. chore and spike are active card types again under the contract model.
Scheduler prompts and finalization are contract-aware. report_only cards skip Git repo, worktree, branch, commit, review, and merge behavior, and complete only with structured report fields: summary, recommendation, sources, and optional artifacts.
Non-mergeable contracts are blocked from ready_for_merge. Orchestrator snapshots include work_mode/work_contract, and split cards inherit the parent work mode.
Hosted action tools linked to a card now honor the active contract, denying file writes, Git actions, or command runs when the contract disallows them.
Card Detail adds a Work Mode control/readout and hides Ready For Merge when the selected mode is not merge eligible.
The Project Board lane labels are now Backlog, Planning, Ready, In Progress, Review, Ready For Merge, and Done. paused and needs_input remain interruption states surfaced through Inbox/card badges instead of full board lanes, and board move targets now hide Ready For Merge for non-mergeable work modes.
Card Detail now shows user-facing card type labels and guidance: spike appears as Experiment and chore appears as Maintenance, while the stored type values remain unchanged.
Project schema v14 adds cards.work_mode.
Regression coverage: report-only scheduler completion on a non-Git folder, structured report validation, merge-state rejection, action-tool contract denial, board snapshot/split inheritance, and Card Detail work-mode tests.
The Agents page now opens selected session details in a modal instead of rendering the detail panel inline in the page flow.
The session-details modal is sized to at least 50% of viewport width and height, with the existing session detail panel reused inside it.
Regression coverage: AgentsTabBody.test.tsx asserts the selected session renders in a dialog with the required minimum size classes.
Deleting a card now removes agent sessions attached to that card, which also removes their session-backed journal entries from the project.
The delete path also releases held leases and removes agent terminal rows for those sessions so no cardless journal/runtime leftovers remain.
Regression coverage: test_delete_card_removes_attached_session_journal.
Card Detail now reads a card-level journal from GET /projects/{id}/cards/{card_id}/journal, aggregating every agent session tied to the card instead of showing only the most recent session's log. Older research answers and completion text remain visible after reassignment or a later agent run.
Switching a card's owner to Human now cancels open agent sessions for that card, releases held leases, removes the single-use agent terminal row, records a cancellation journal entry, and terminates the live agent terminal process when it is still running.
Regression coverage: test_card_journal_lists_entries_from_all_card_sessions, test_human_assignee_cancels_open_agent_session_and_terminal, and the Card Detail modal card-journal test.
Persona library adds are now additive. Clicking Add on an already-active persona creates another project-agent row instead of removing the existing one, and the engine suffixes duplicate default roster names such as Hank, Hank 2, and Hank 3.
The card/session journal now includes the agent-reported completion summary as an agent_report entry, so completed research or answer-style work remains visible after the single-use agent terminal is cleaned up.
Regression coverage: test_local_project_can_add_same_persona_multiple_times, the scheduler completion-journal assertion, and focused Agents tab persona-library tests.
The deterministic dispatcher now publishes an immediate scheduler wake after it creates a queued session, so the normal path no longer depends on the scheduler's next poll to move a card from agent queued to claimed work.
The deterministic dispatcher catch-up sweep now runs every 10 seconds by default (SKYBOLT_DISPATCHER_CATCHUP_SECONDS) instead of every 5 minutes. This sweep is SQL-only: it scans for eligible cards plus active self-driving roles and marks projects dirty; it does not call models or launch agents unless work is actually dispatchable.
The Orchestrator catch-up sweep now runs every 30 seconds by default (SKYBOLT_ORCHESTRATOR_CATCHUP_SECONDS) so missed durable board events recover quickly while keeping model-call fallback more conservative than the Handyman dispatcher.
Regression coverage: test_dispatcher_publishes_scheduler_wake_after_assignment, test_dispatcher_catchup_sweep_marks_dispatchable_work_dirty, and the Orchestrator catch-up default assertion in test_catchup_sweep_marks_unconsumed_projects_dirty.
The deterministic dispatcher now iterates all active agents for a self-driving role instead of selecting only the oldest handyman, review_agent, or merge_agent row. One busy Handyman no longer blocks other active Handymen from pulling queued cards.
The dispatch result keeps the historical single-agent shape, and returns an aggregate only when more than one agent with the same role is active.
Regression coverage: test_multiple_handymen_keep_dispatching_when_one_is_busy.
Sync side effects retired. ADR-0068 removed project blob sync. Project agent create, update, delete, and memory-update routes now stay local and no longer schedule background project blob pushes or write sync tombstones.
Tests. tests/test_app.py covers that roster mutations do not call the removed project blob sync helpers.

Changelog — minimax-integration

Regular Assets tab file rows gained an Open action backed by POST /projects/{id}/files/open/{path:path}.
The Assets tab output panel now lists model outputs from minimax-output/, outputs/, and model-output/ through GET /projects/{id}/model-outputs.
Output file opening now asks the engine to open local files with the system default app.
Asset deletion now uses a red trash button and a typed acknowledgement dialog before removing the file.

Desktop App

Project cards in the desktop rail can now be reordered by dragging the left-side grip handle. The drag interaction matches the project board: a lifted preview follows the pointer, the source card dims, and an insertion marker shows where the card will land. The order persists through the local Execution Engine registry and is reused by /projects, /auth/me, keyboard project cycling, and dashboard/project rail consumers.
Added PUT /projects/reorder with exact visible-project validation so a bad reorder request cannot drop or duplicate projects.

Encrypted Project Sync

Bug: the Git mirror treated a repo sealed artifact behind this machine's local generation as a hard conflict. Worse, if the live project DB had been restored outside <repo>/.skybolt, sealed push still required the DB to live in the repo and could keep comparing against stale mirror bytes. That made users repeatedly choose Keep my version even when only one machine was syncing.
Fix: local sealed push now blocks only when the repo mirror is ahead of this machine. When this machine is ahead, GET sync/status auto-refreshes the Git mirror by resealing and committing <repo>/.skybolt/project.sealed when the account key is unlocked and the Git index has no unrelated staged files. Sealed push also copies the freshly sealed artifact into <repo>/.skybolt/project.sealed before staging when the live DB is stored elsewhere. Status reads the repo mirror artifact when a repo is configured.
Tests: test_stale_remote_generation_is_updateable_without_manual_conflict and test_status_does_not_auto_refresh_git_mirror_with_unrelated_staged_changes, plus test_sealed_push_updates_repo_mirror_when_project_db_is_outside_repo.
Bug: Dev Ops -> Connect Existing Git Repo could detect a sealed-only checkout and open the Unlock Encrypted Project modal even when the sealed artifact's keyless header matched a project already imported on this machine. If that repo artifact was older than the local state, the backend correctly refused to roll back, but the modal did not render the error inline, so clicking Unlock looked like nothing happened.
Fix: /projects/detect now marks sealed locked results as already_imported when the header's project id is present in project_registry. The Dev Ops connect flow handles already-imported results before opening the unlock modal; for the current project it proceeds with the normal Git connection, and for another project it switches to that project. Unlock failures now render inside the modal.
Tests: test_detect_sealed_clone_reports_already_imported_from_header plus the existing connect/unlock renderer cases.
Bug: sealed sync push could refuse a new private GitHub repo with *"repository is not private (or visibility could not be confirmed)"* when gh repo view could not infer visibility from a not-yet-pushed/empty worktree.
Fix: the local private-repo gate still accepts only verified private, but now falls back to parsing the configured origin GitHub URL and querying gh api repos/{owner}/{repo} --jq .visibility. Explicit public/internal visibility still refuses, and unknown/non-GitHub remotes still fail closed.
Tests: test_github_repository_from_remote_url, test_repo_is_private_falls_back_to_origin_api_for_new_repo, and test_repo_is_private_rejects_public_origin_api_fallback.

Execution Engine

Project recency marking avoids project DB lock contention. POST /projects/{id}/opened now checks membership and writes last_opened_at through the global registry only, using the existing global write retry helper. This avoids a 500 when the renderer marks a project opened during the same burst of per-project polling that holds the attached project database busy. Test coverage: test_project_opened_uses_global_registry_without_project_db.

In-App IDE

Fixed: the Files tab explorer splitter now calculates width from the Files workspace's left edge instead of the viewport. Dragging the resize handle responds immediately even when the workspace is offset by the app shell/sidebar. Added a focused regression test in CodeWorkspace.test.tsx.
Added: Files-tab editor shortcuts. Ctrl/Cmd+W closes the active file; Ctrl/Cmd+N creates an untitled buffer; Ctrl/Cmd+S saves or opens Save As for untitled buffers; Ctrl/Cmd+Shift+S saves the active buffer to a chosen workspace path; Ctrl/Cmd+Shift+Tab cycles open files; Ctrl/Cmd+F opens current-file search; Ctrl/Cmd+Shift+F opens the global Files search; and Ctrl/Cmd+Shift+D selects the next occurrence of the current editor selection.
Added: Ctrl/Cmd+Shift+F now opens the global Files search from the Files page even when no file is open. Ctrl/Cmd+P opens a quick file picker backed by the cached workspace index, with a filename search fallback when the tree is on the lazy path.
Added: right-click actions on open file tabs: Close, Close all to the left, Close all to the right, and Close everything but this file. Batch close uses the existing unsaved-changes guard for dirty tabs.
Added: draggable open-file tabs with visible grabbed state, drag reordering, tab pinning, and split editor groups. Drag an open tab or file-tree file onto an editor group's left, right, top, or bottom edge to create a split; drop in the center to open in the current group. Pinned tabs stay at the front of their group.
Added: Ctrl/Cmd+B toggles the Files rail. Hiding the rail also removes the resize handle so the editor area expands immediately; Ctrl/Cmd+Shift+F reopens the rail when jumping to global search.
Fixed: drag-and-drop for file-tree rows and open tabs now accepts drops over the editor surface during capture and keeps a same-window drag payload fallback, which prevents the no-drop cursor in desktop webviews that swallow editor drag events or strip custom drag data.
Fixed: open-file tab reordering now uses pointer-driven tab movement instead of native browser drag/drop. The grabbed tab dims in place and a floating tab preview follows the pointer so the user can see what they are dragging.
Fixed: open-file tab dragging now listens for window-level pointer release, pointer cancel, app blur, and document leave events so the grabbed-tab preview cannot get stuck when the cursor travels too far away from the tab strip.
Fixed: pointer release over the empty area of the open-file tab bar now ends the drag instead of skipping cleanup while still inside the tab strip.
Fixed: split-pane creation from open tabs and file-tree rows now uses the same pointer-driven drag path as tab reordering. Dragging a tab or file row into an editor edge shows the split target immediately and avoids the browser/webview no-drop cursor from native drag/drop.
Changed: Monaco sticky scroll is disabled in the code editor so the function/bracket breadcrumb band does not consume vertical editing space while scrolling.
Fixed: dragging the only open tab in a pane onto that pane's own edge no longer duplicates the file; open-tab edge splits now move the tab out of its source pane.
Added: dragging to the far left or far right edge of the full editor surface creates a full-height side pane around the current split layout, so nested layouts can still gain a pane that spans the whole side.
Added: dragging to the far top or bottom edge of the full editor surface creates a full-width pane around the current split layout, matching the full-layout left/right behavior.
Added: split panes now have draggable divider handles. Horizontal and vertical split dividers store per-split flex weights so adjacent panes can be resized without changing the rest of the layout.
Added: Files layout now persists in portable project settings under files_layout. Skybolt restores open workspace-relative paths, pinned tabs, active file, focused editor group, split tree, split sizes, rail width/visibility, and Files/Search mode when the project DB or project blob is restored on another computer. The snapshot stores presentation metadata only, not file contents, diffs, absolute local paths, Machine credentials, or secrets.
Fixed: Files layout autosave no longer rehydrates the mounted editor when the saved settings echo back from the parent surface, preventing panes from briefly collapsing to one group after moving a tab or resizing layout state.

2026-06-17

Account & Project Sync —

Bug: After signing in on a new machine, the account bundle could restore the project list while leaving each project as a cloud-only manifest row (skybolt_path = NULL). The sidebar showed the project, but project-scoped screens returned Project not found until the user opened Account Settings > Sync hub and manually restored the project.
Fix: Added a project-blob auto-restore pass for enabled cloud-only projects. Cloud sign-in, recovery-code restore, and the startup /account/settings-sync/sync-now route now materialize any listed project that has a matching cloud blob before the app reloads /auth/me.
Tests: test_auto_restore_cloud_only_projects_materializes_manifest_rows covers the manifest-row-to-openable-project path.
Project agent roster edits push immediately when project blob sync is enabled. Creating, updating, deleting, or editing memory for a project agent now triggers a best-effort project_blob_sync.push_once after the local transaction commits. The periodic changed-file watcher remains the backstop, but a user who adds agents and then moves to another machine no longer depends on the next background tick.
Restored SSH projects rebuild their machine target from project settings. The target reconciliation path now copies portable SSH connection scalars from the project's isolation policy into the synthetic SSH machine row. This lets a restored project point back at its SSH machine before the user re-enters local credentials or accepts host trust on the new device.
Tests: engine route coverage proves agent create triggers an immediate blob push when the project is in sync scope.
Bug: The Sync hub's per-project Sync now button called pushProjectSync, the git mirror endpoint (POST /projects/{id}/sync/push). When the repo's .skybolt/project.sealed artifact was stale, the user saw a sealed-artifact conflict such as repo generation 3 vs local generation 243, even though they were trying to push the skybolt.ai project blob.
Fix: Added POST /account/project-sync/{id}/push to seal and push this machine's zero-knowledge project blob through project_blob_sync.push_once, added the renderer client pushCloudProjectSync, and wired the Sync hub's project Sync now action to that cloud route. The project overview now reports a local-ahead project as local_newer (uploadable) instead of conflict, so the rail no longer paints the project ACTIVE pill amber just because this machine is newer than the cloud. The per-row Git mirror advanced panel still uses the git endpoint for the separate repo-backed sealed artifact channel.
Tests: Added an engine route regression proving the cloud push route calls the blob service, an overview regression for local_newer, and renderer coverage that the project Sync now action calls pushCloudProjectSync and does not call the git mirror client.
Bug: A project blob conflict could show the row-level Resolve... spinner for a long time after choosing Keep my version. The blob resolver reseals the local project DB before pushing it to skybolt.ai, and that seal used PRAGMA wal_checkpoint(TRUNCATE) plus a direct main-file read. In the desktop app, normal renderer/project polling can keep SQLite read snapshots open, so the checkpoint could sit behind ordinary reads. The per-row Advanced disclosure then made the situation look stranger by showing the separate git-mirror sealed artifact as sync ready.
Fix: Sealed blobs now capture the project DB with SQLite's backup API into a temporary snapshot, then seal that snapshot. This includes committed WAL content without requiring a WAL truncate. Blob GET/PUT calls made from async resolver paths are also offloaded from the engine request loop. The per-row disclosure label changed from Details to Git mirror so its sealed artifact status is not mistaken for the skybolt.ai blob conflict state.
Tests: Added an engine regression proving a snapshot includes committed WAL content while an older reader remains open, plus renderer coverage that a conflict Keep my version action calls the blob-channel resolver.

Account Settings Sync —

Bug: agent model/seat assignments did not reliably survive a new-computer setup. The roster row is project content and may restore through the project blob, while providers/models/seats restore through the account-settings bundle. CLI-seat assignments also stored a source-machine default_seat.seat_id, but the receiving machine mints its own seat_profiles.id, leaving the per-agent dropdown dangling or falling back to the project default.
Fix: added a narrow agent_model_assignments section to account settings sync. It carries only project_id, agent_id, route_mode, default_model.catalog_model_id, or default_seat.kind. Apply runs after seats and maps default_seat.kind to the receiving machine's local seat id before updating project_agents.overrides_json. The roster row, memory, prompts, and project content still travel only through the project blob/sealed artifact.
Tests: test_agent_model_assignment_rebinds_local_seat_id_after_restore.
Settings tab provider/model setup now opens a 50%-width "Manage providers & models" modal from the Providers summary. The manager focuses on providers, and each provider row shows how many catalog models belong to it.
Editing a provider now shows provider details and that provider's catalog models in one modal. Creating a provider moves directly into that provider editor, so the user can add models without leaving the provider flow.
Files: ProviderModelsPanel.tsx, SettingsTabBody.tsx, ModelsPanel.tsx, ModelFormModal.tsx, ProvidersPanel.tsx, ProjectSurface.tsx. Tests: ProviderModelsPanel.test.tsx.
The Account Settings terminal shortcut editor now supports drag/drop reordering in addition to the existing move up/down buttons.
The shortcut array is still stored in the synced terminal.shortcuts setting, so the new order is the order that travels through account settings sync.
Test coverage: TerminalShortcutsEditor.test.tsx asserts drag/drop produces the persisted order.

Agent Persona Catalog —

Renamed the three deterministic dispatcher built-ins from the Handyman family labels to short agent names in the canonical catalog: handyman -> Hank, review_agent -> Rhett, and merge_agent -> Mac. Alembic 0030_self_driving_names updates both the global production catalog rows and existing built-in account persona rows.
Regenerated the engine offline fallback from the canonical JSON so local-first installs use the same names before cloud catalog refresh.

Agent Terminal —

Removed the stale visible "Agent-Controlled" and "AGENT" badges from agent terminal cards. control_mode='agent' remains a renderer data flag for showing the inline chat input.
Removed the old pause/resume controls from the Terminals card; those states are not part of the current agent terminal model.
Agent terminal cards now rely only on the standard TerminalView header for drag/drop, pop-out, reconnect, and kill actions, so leased agent seats can be reordered like regular terminals.
The Terminals tab and New Terminal modal now receive a project-filtered seat list from ProjectSurface, and AgentTerminalQuickPicker also filters by its projectId prop.
This keeps synced seat config from another project out of the active project's agent terminal menu. For example, an Antigravity seat restored for Project A no longer appears when Project B has not registered that seat.
Test coverage: AgentTerminalQuickPicker.test.tsx asserts a wrong-project Antigravity seat is absent from the picker.

Agents Execution

The Self-Driving panel now uses the renamed deterministic agents Hank, Rhett, and Mac instead of the old Handyman labels.
Persona library cards are wider and show more description text, reducing the maximum number of cards per row on wide screens.
The Orchestrator-Led panel keeps the new top-level split but restores job-area separators within that section so specialists are easier to scan.
Historical behavior. Project agent create, update, delete, and memory-update routes previously scheduled a best-effort project blob push after the local transaction committed when blob sync was enabled for that project.
Retired test. The old test_project_agent_create_schedules_enabled_project_blob_without_blocking coverage was removed with the sync routes and services.

Cloud Identity (skybolt.ai)

Focused sign-in challenges. The renderer now treats cloud sign-in as separate credential, two-factor, and recovery-code steps. After password success, the 2FA screen hides the email and password fields; after 2FA success, the recovery-code screen hides the 2FA field too. The typed values stay in memory for the final submit, but the user only sees the field needed for the current step.
Stale local encrypted data prompt. If /auth/cloud/restore returns the specific 422 detail That recovery code doesn't match the encrypted data on this machine., the auth screen now offers Delete local data and restart through the existing desktop reset_corrupt_data command instead of leaving the user at a raw dead-end error.
Tests: AuthScreen.test.tsx covers the focused 2FA/recovery screens and the stale local data reset prompt.
Installed-app auth fix. The packaged Tauri/WebView2 renderer can make loopback API calls from tauri.localhost to 127.0.0.1, and the browser can drop the HTTP-only skybolt_engine_session cookie on those cross-origin fetches. That made cloud signup succeed but immediately left account settings, sync, and 2FA routes returning 401 Not authenticated.
Fix: once the engine middleware validates the desktop shell's per-launch bearer token, the shared local-session gate can recover the latest unexpired local session server-side. Web/dev requests without that desktop token still require the cookie. Local logout also uses the same fallback so a cookie-missing desktop session can be deleted.
Tests: test_cloud_auth.py covers TOTP status after cookie loss with a valid desktop token and logout clearing the recovered session.

Desktop App

scripts/build.ps1 and scripts/build.sh now run pnpm install before generating the local Tauri build config and packaging the desktop app, so local installer exports pick up the current workspace dependency state instead of relying on a previous install.
The same scripts now fail when Tauri does not produce a fresh installer/package artifact instead of silently pointing a developer at an older setup executable.
Release sidecar preparation now creates src-tauri/engine-dist/python before compiling the Rust sidecar. Fresh machines no longer fail Tauri's resource validation with resource path engine-dist/python doesn't exist.
The desktop renderer no longer generates a PWA precache service worker, and now ships a tiny retiring /sw.js plus runtime unregister/cache cleanup so installed WebView2 builds cannot keep serving an old cached UI after a local reinstall.
The desktop shell now creates the main window manually with a fresh renderer-v2 WebView profile directory before the renderer loads, forcing installed apps off the stale WebView2 cache used by older builds.
Fixed Windows desktop shutdown leaking the Python Execution Engine child process. The skybolt-engine sidecar now assigns the Python process to a Windows Job Object with kill-on-close semantics, so app exit/restart/update/storage moves that stop the sidecar also terminate the Python child and release held database/source file handles.
Installed Windows builds can issue renderer fetches from tauri.localhost to the loopback engine at 127.0.0.1:<port>, and WebView2 can drop the HTTP-only skybolt_engine_session cookie on those cross-origin calls. The engine now treats a validated per-launch desktop bearer token as a server-side fallback for recovering the latest unexpired local session, so account settings, sync, and 2FA routes keep working after signup. Requests without the desktop token still require the normal cookie.

Execution Engine

Windows CI smoke trimmed. The engine-windows-smoke pytest step now runs only tests marked windows_smoke and prints slow-test durations, keeping the full engine pytest suite on Ubuntu while preserving Windows coverage for terminal, path/data-dir, corruption-recovery, and project-blob sync edge cases. Cross-platform real-PTY process tests are intentionally left to the Ubuntu engine suite unless they prove a Windows-only regression class.
Provider draft probes. Settings can now validate unsaved provider details through POST /projects/{id}/models/providers/probe-draft. The endpoint builds a transient provider/model, accepts either a pasted key value or an existing auth_ref, performs a strict chat-completions probe, and persists no provider, model, or key rows. Draft validation treats 401/403 as unhealthy so setup catches bad credentials before save, while the existing saved provider probe keeps its liveness semantics.
Antigravity seats. antigravity is a first-class local CLI seat kind that launches the agy command, including Windows CLI lookup support. Skybolt does not install or authenticate Antigravity; users must install the CLI, keep agy on PATH, and sign in before creating the seat.
Settings delete UX. Renderer settings deletes for models, providers, and seats keep the warning modal but no longer require typing the item name. Higher-risk deletes remain unchanged.

Git Operation Surface —

Recorded the GitHub provider test hardening pass so the next Git push has a fresh, meaningful diff to exercise the full CI rebuild.

2026-06-16

Account & Project Sync —

Bug: take_remote / "Apply latest" could hit sqlite3.OperationalError: disk I/O error on Linux after earlier route calls left _open_project connections open; the restore then tried to replace the same project DB while SQLite still had attached handles. A later execution-target refresh failure could also turn an already-applied blob restore into status: error.
Fix: _open_project now returns a close-on-exit connection proxy, so existing with _open_project(...) routes close the attached project DB when the block exits. _push_with_conflict_retry also explicitly closes the project connection it opens for sealing before returning to the caller. _apply_remote_blob_async treats execution-target reconciliation as best-effort after the blob install + registry generation update, logs any failure, and closes the project connection it opens for the refresh. Internal restore errors include exception type/message in the service result for regression diagnostics; the public route still maps only safe status/detail fields.
Tests: tests/test_project_blob_sync.py continues to cover take_remote, forced rollback, and keep-local conflict resolution.
Bug: take_remote / "Apply latest" could fail for a project already registered inside a repo because the blob restore path always unsealed into the engine-local default .skybolt location instead of the project's existing registered .skybolt directory.
Fix: _apply_remote_blob_async now preserves the existing project_registry.skybolt_path when local project data exists, and only uses the default engine-local project directory for cloud-only restores. The shared unseal/install primitive also removes SQLite -wal / -shm sidecars before and after replacing a project DB, so stale sidecars from the old copy cannot interfere with the restored main database.
Tests: test_resolve_take_remote_applies_newer_cloud now passes under the Linux full engine pytest job and continues to verify the local generation catches up to the newer cloud blob.
Bug: Windows CI could report an unchanged (size, mtime_ns) tuple after a real project DB write, especially when SQLite reused/checkpointed pages quickly. The change-triggered project blob loop then treated the edited project as unchanged and skipped the push.
Fix: _project_content_signature still avoids hashing the whole DB, but now includes SQLite's fixed header counters, a connection-local PRAGMA data_version observer, and a tiny content marker for non-empty WAL files. Header-only WAL files are still normalized to "no write," preserving the earlier idle-read fix.
Tests: test_project_push_if_changed_detects_write_when_file_stat_is_stale simulates stale main-file stats and a missing WAL after adding a card, and verifies the next changed-pass pushes.
Feedback: "syncing should only happen when there's something to sync — why every 30s when idle?" The every-30s activity was the rail badges' status polling (not data sync): the hook hit project-sync/overview + settings-sync/status on a 30s interval, and each of those does a LIVE cloud read (_remote_manifest / _remote_generation) — so the rail probed skybolt.ai every 30s even when nothing changed. (A stale comment had wrongly called those reads "cheap local".)
Fix: dropped the interval in useRailSyncStatus. The badges now refresh only on mount, on window focus, and after a sync action (ProjectRail's modal-close reload). Real data sync remains change-driven in the engine's background loop (push only when the content signature changed; a slow convergence pull to catch other machines). Tests: useRailSyncStatus.polling.test.tsx (asserts one read on mount, none after 2 idle minutes; a focus event triggers a refresh).
What: the project rail now shows each project's skybolt.ai cloud-sync state inline, and the account-settings button shows the account-wide settings-sync state — so the user can see sync health without opening the Sync tab. Four states drive the color: synced (green), syncing (blue, a spinner; the project's status tag reads "Syncing"), needs attention (amber — conflict, cloud-ahead, or a locked account key), and not syncing / not set up (a muted cloud-off glyph). Clicking the badge opens the Sync tab of Account settings and deep-links to the clicked item: account → the settings-sync card at the top; a project → its row, auto-expanding "Advanced" and briefly ringing the row.
How: a lightweight hook useRailSyncStatus (project-rail) reads the two status endpoints (GET /account/project-sync/overview, GET /account/settings-sync/status) — far lighter than useSyncHub, which still owns the modal. These endpoints make a LIVE cloud read (remote generation / project manifest), so the rail refreshes them only when there's a reason to — on mount, on window focus, and after a sync action (no idle interval — an early version polled every 30s and generated needless cloud traffic while the user was idle). Actual data sync stays change-driven in the engine's background loop (push-on-change + a slow convergence pull); the badges just reflect that state. The "syncing" pulse reflects an in-flight refresh (held visible ≥600ms). Reused state model: deriveProjectSyncState / deriveAccountSyncState.
Files (renderer): app/project/useRailSyncStatus.ts (new hook + derivations), app/project/components/CloudSyncIndicator.tsx (new clickable pill / icon), components/icons.tsx (CloudIcon, CloudOffIcon), app/project/ProjectRail.tsx (integration; project cards + account button restructured to an overlay shell so the badge is its own click target without nesting a <button> inside a <button>), app/project/modals/AccountSettingsModal.tsx (initialTab + initialSyncFocusProjectId), app/project/modals/sync/SyncHubBody.tsx (focusProjectId: auto-expand Advanced + scroll/highlight the row). No engine changes.
Tests: useRailSyncStatus.test.ts (state derivation), CloudSyncIndicator.test.tsx (label, icon-only variant, click-doesn't-bubble). Existing SyncHubBody.test.tsx (8) still pass.

Account Settings Sync —

Bug: after a sync, the desktop popped *"The local engine is no longer running."* The engine was fine (it answered sync-now with 200), but the sync routes were async def that call the synchronous cloud HTTP client (httpx.Client, _TIMEOUT_SECONDS = 15s) directly on the event loop: settings-sync/status → _remote_generation, project-sync/overview → _remote_manifest, and settings-sync/sync-now → sync_now (pull+apply+push). When a cloud round-trip was slow, the whole event loop blocked — including the unauthenticated /health probe the desktop watcher pings (4s timeout, 4 consecutive misses ⇒ "engine down"). The renderer polls status/overview every ~30s and on window focus, and the new project-rail sync badges added more polls, so a single slow cloud call reliably tripped the watcher. (Surfaced once the redaction fix above let sync actually reach the network.)
Fix: made the hot, cloud-touching handlers plain def (settings_sync_now, settings_sync_status in routes/account_settings_sync.py; project_sync_overview in routes/project_blob_sync.py). FastAPI runs def path operations in its threadpool, so a slow (or 15s-timeout) cloud call never blocks the event loop or /health. This matches the background sync loop, which already offloads the same work via asyncio.to_thread. The handler bodies are unchanged (they contained no await). Docstrings warn against converting back to async def without offloading first.
Files: routes/account_settings_sync.py, routes/project_blob_sync.py. Tests: test_cloud_touching_routes_are_sync_not_async, test_overview_route_is_sync_not_async (guard the handlers stay sync); existing route/service suites still pass (55/55 across both files).
Follow-up (not done): the user-action routes that also do cloud I/O on the loop (settings-sync/enable, project-sync/{id}/restore|resolve|set-enabled) are infrequent but block the loop while running; give them the same treatment for consistency.
Bug: a terminal shortcut command containing a secret-shaped substring (e.g. gh auth login --with-token ghp_…, or any sk-…/ghp_…/xoxb-… text) tripped the redaction value-scan in _redaction_safe, which dropped the entire account-settings bundle and refused to ship — silently halting ALL settings sync (push_once returns redaction-blocked; the engine log showed …redaction altered the outbound account bundle… shortcuts[3].command). The scan is a belt-and-suspenders backstop meant to catch *accidental* secret leaks from the allowlist serializers; it should never have hard-blocked deliberate user content.
Fix: shortcut commands are deliberate, user-authored command lines that must travel verbatim (a [REDACTED] command would break the shortcut on other machines), and the bundle is sealed/E2E-encrypted before it leaves (skybolt.ai only ever stores ciphertext), so such a value never ships in plaintext. _redaction_safe now classifies redaction-diff paths: a value matching only the exact, anchored path account_settings[<i>].value_json.shortcuts[<j>].command (_USER_FREE_TEXT_PATH_RE) is shipped verbatim (logged at INFO, paths-only — never the value); any OTHER offending path still raises AccountSettingsSyncRedactionError and blocks. This mirrors how provider_keys carry real secret VALUES sealed, attached after the scan, and how repair_secret_shaped_auth_refs already prevents an auth_ref leak from permanently wedging sync.
Scope/boundary: the exemption is intentionally narrow — providers, auth_ref, metadata_json, other account_settings keys, user prefs, SSH metadata, etc. are all still scanned and still hard-block. A secret-shaped value anywhere in the terminal blob *other than* a shortcut command still trips the backstop.
Files: services/account_settings_sync.py (_USER_FREE_TEXT_PATH_RE, _is_user_free_text_path, _redaction_safe). Tests: test_terminal_shortcut_command_with_secret_shape_still_ships, test_secret_outside_shortcut_command_still_blocks (exemption is narrow); existing redaction-trip
- round-trip tests still pass (34/34).
Follow-up (not done): when a *non-exempt* path blocks (a real allowlist bug), push_once returns redaction-blocked but the renderer's Sync tab does not yet surface it — sync just looks idle. Worth surfacing a "sync blocked" reason in the settings-sync status so it is never silent.

Agents Execution

Symptom. Adding a persona whose stored roster name was held by a *hidden* archived row failed with 409 "Project agent name already exists" even though the roster showed the agent absent. Reported against the Handyman Merger, which stores as the first name "Mac" (AGENT_FIRST_NAMES_BY_KEY["merge_agent"]): a legacy/synced project.db row left in the archived state (status='archived' or archived_at set) is filtered out of the roster list yet still owns the UNIQUE(project_id, name) slot.
Root cause. _project_agent_name_exists (services/personas_ops.py) checked the name with no archived filter, unlike the roster list/dispatch queries — so a hidden archived row read as a live duplicate. Even with that fixed, the insert would still trip the unique constraint while the physical row lingered.
Fix (ADR-0058 — agents are active or removed, never archived). _project_agent_name_exists now mirrors the archived_at IS NULL AND status != 'archived' filter. New _purge_archived_project_agent_name hard-deletes any stray archived row holding the target name (resolving its orphaned notes, as the old archive flow did); the create and rename paths call it before insert/update, so re-adding self-heals a legacy/synced archived row. The PATCH update path no longer writes the archived state (_project_agent_update_data drops a stray status='archived'; the archived_at write-mapping and the unused datetime import were removed). Defensive archived filters and the retained archived_at column (one-time v11 purge) are unchanged.
On-open purge backstop. The one-time v11 migration can't catch a row that *syncs in* after it ran, and the per-process upgrade cache (_UPGRADED_PROJECT_DBS) skips the schema ladder on later opens. The shared purge_archived_project_agents (factored out of the v11 migration) now also runs from _open_project on every project open — an indexed check that only writes/commits when something is actually archived — so a synced/legacy archived row is reclaimed by the next project-scoped request and can never strand the roster behind a 409.
Tests. test_local_project_readd_agent_purges_stray_archived_row injects an archived "Mac" row, confirms the roster hides it, and asserts the re-add returns 201 with the stale row reclaimed; test_local_project_open_purges_synced_archived_agent injects an archived row at the current schema version (no migration) and asserts a plain GET /agents physically deletes it.
Cloud control-plane parity (apps/api). The hosted API's archive/restore routes (POST /agents/{id}/archive and /restore) and the include_archived list flag were removed; AgentStatus dropped "archived". DELETE is now the single removal path: it pauses any in-flight sessions (the protective work the archive flow used to do), hard-deletes the membership row, and keeps built-in personas in the library (the old "built-in agents can be removed but not deleted" refusal is gone — that was only reachable via archive). The create/rename name checks filter to live rows and a new _purge_archived_project_agent_name reclaims the uq_project_agents_project_name slot from any stray archived row before insert. Alembic 0029_retire_agent_archiving purges archived rows (data-only; the archived_at column is retained as a defensive filter). Cloud tests rewritten to assert remove-and-re-add for both custom and built-in personas (39 control-plane tests pass).
Two-panel Persona Library. The Agents tab's Persona Library (the main-column browse-and-add area, not the right-hand roster) now groups personas into two panels instead of by job area: a Self-Driving panel on top — the engine-dispatched agents that run with no Orchestrator (handyman, review_agent, merge_agent; 3 today) — and an Orchestrator-Led panel below for everyone else. Each panel carries a one-line purpose blurb and a count. Search and the job-area filter still apply across both panels; the per-job-area sub-headers were dropped in favor of the new split.
isSelfDrivingPersona (rosterModel.ts). Persona-level counterpart to isSelfDrivingAgent, matching persona.key against the same self-driving role keys, so the library can partition the catalog. The Orchestrator persona itself is not self-driving, so it lands in the Orchestrator-Led panel.

Cloud Deployment

Fixed the production-proxy outage root cause. deploy/production/Caddyfile.prod served the /downloads/* handler with file_server browse off. Caddy has no off keyword for file_server: it read off as a browse-template filename, enabled directory browsing, and failed to load the config — crash-looping Caddy and taking skybolt.ai down for ~17 h. Directory browsing is off by default, so the line is now a bare file_server (with a comment warning against re-introducing browse off).
Added three guardrails so the same class of bug can't recur silently:
- CI gate — new proxy job in .github/workflows/ci.yml runs caddy validate (image caddy:2-alpine) against every Caddyfile on PRs and pushes, so an invalid reverse-proxy config can't merge or deploy.
- Deploy pre-flight — deploy.sh now caddy validates Caddyfile.prod before pinning any tag and aborts on failure, leaving the current proxy serving.
- Deploy health gate — added a caddy validate healthcheck to the caddy service in docker-compose.prod.yml, and deploy.sh now waits for the Caddy container to report healthy after up -d. Previously the deploy only polled /api/health, which probes the api container directly and bypasses Caddy — the blind spot that let the dead proxy go undetected.
Documented the remaining gap (operator follow-up): an external uptime probe on the apex https://skybolt.ai (not just /api/health) is the only thing that catches the proxy going dark at runtime for reasons the deploy-time gates can't see.
Validation: bash -n deploy/production/deploy.sh, YAML parse of ci.yml + docker-compose.prod.yml. caddy validate itself runs in CI and on the droplet at deploy (Docker was declined on this dev machine for a local run).

Cloud Identity (skybolt.ai)

Signup now ignores empty placeholder engine DBs. With at-rest encryption on and no custody file, _profile_exists now treats a zero-byte or plaintext SQLite DB with no users table as fresh setup instead of "already configured." This covers interrupted clears and the local state where skybolt-engine.db exists but contains no Skybolt schema.
The auth screen only shows duplicate-account guidance for the real cloud duplicate detail. A local engine 409 such as "Already configured" now surfaces as that exact error instead of the misleading "You already have a Skybolt account" notice.

Desktop App

Release installers now ship a relocatable CPython 3.14 with the engine pre-installed, so an installed app runs the REAL Execution Engine with no Python, repo, or venv on the user's machine — install → log in → set up projects/models → work. Replaces the deferred bundling stopgap; the loopback health-stub remains only as a last-resort fallback.
New build step apps/desktop/scripts/prepare-engine.mjs (run from prepare:sidecar:release): downloads a python-build-standalone CPython 3.14 (windows x86_64 install_only), pip installs apps/engine into it, and emits src-tauri/engine-dist/python/. Pin a build with SKYBOLT_PYBUILD_URL for reproducibility; engine-dist/ is gitignored.
The release build declares engine-dist/python as a bundle resource (engine-python) via the CI --config override — kept OUT of the base tauri.conf.json because Tauri's build script validates resource paths at compile time, which would break cargo check / tauri dev (they never create engine-dist/). main.rs resolves the bundled interpreter from the resource dir and passes it to the sidecar as SKYBOLT_ENGINE_PYTHON; the launcher (engine_stub.rs) runs python -m skybolt_engine serve directly in that mode (no source tree / PYTHONPATH needed). Debug builds omit the resource, so pnpm desktop:dev still uses the repo's apps/engine/.venv.
Trade-offs: installer grows ~80–150 MB (CPython + native wheels); the bundled site-packages contains readable engine .py (acceptable for the current private launch; revisit if obfuscation is needed). Windows x86_64 only until a macOS/Linux build matrix is added.

Encrypted Project Sync

Bug: on launch, StartupSyncIndicator made a single syncSettingsNow attempt bounded to 10s and, on the first timeout/miss, showed *"Working offline — couldn't reach skybolt.ai."* Over a cold link (a Tailscale tunnel to the control plane, or a just-started local API) the first attempt exceeds 10s while the connection warms up, so the toast fired even though the engine's background sync loop converges a minute or two later — leaving the user with a scary error contradicted shortly after by the rail's "synced" badge.
Fix: the indicator now retries with a fixed backoff (MAX_ATTEMPTS=6, RETRY_DELAY_MS=5s, each attempt bounded by TIMEOUT_MS=10s → ~1 minute), staying in the non-alarming "Syncing…" state across retries and only reporting offline after the link has had real time to come up. The first attempt that reaches the cloud flips straight to "Up to date." Timings are injectable props so the unit tests stay fast.
Not a config issue. This is unrelated to apps/api/.env / DATABASE_URL; the cloud and DB are reachable — the indicator was just impatient. apps/web/src/app/StartupSyncIndicator.tsx (+ StartupSyncIndicator.test.tsx: retries-before-offline, recovers-after-cold-start).
Bug: clicking Resolve on a sealed-sync conflict (and the same step on Sync & Resume's unlock) intermittently returned 500 Internal Server Error with sqlcipher3.dbapi2.OperationalError: database is locked from _set_registry_sync_generation (routes/sync.py → store/project_store.py). On take_remote the local project.db has already been replaced by the remote blob by the time that write runs, so a failed write strands the project mid-resolve: the on-disk state is the remote's but the registry's sync_generation mirror still shows the old value, so the conflict never clears (the UI "spins and does nothing"). The lock is transient writer contention — a single global-DB write can raise database is locked even with busy_timeout set when a WAL write follows a read snapshot invalidated by the renderer's continuous polling or the background blob-sync loop (the same hazard the scheduler already retries on its claim hot path).
Fix: added run_global_write_with_retry(settings, work) to store/project_store.py — opens a fresh global connection per attempt, commits on success, and retries the whole write a bounded number of times (4 × 50ms backoff) on a message-matched database is locked / database is busy (mirrors scheduler._is_db_locked_error); non-lock errors propagate immediately. Both post-unseal sites in routes/sync.py (sync/resolve take-remote and sync/unlock) now run their idempotent registry-generation mirror + keychain install through it.
Files: store/project_store.py, routes/sync.py. Test: test_conflict_resolution_take_remote_retries_on_locked_db (injects a one-shot lock on the first registry write and asserts the resolve still 200s and the generation lands).
Note (separate issue): the interleaved skybolt.cloud log line — *"settings-sync redaction altered the outbound account bundle; refusing to ship (offending key paths: account_settings[0].value_json.shortcuts[3].command)"* — is the account-settings-sync subsystem refusing to push a keyboard-shortcut whose command text trips the secret-redaction scanner. It is unrelated to this 500 (it only signals a concurrent global-DB writer), but means account-settings cloud sync is silently failing each cycle and should be triaged on its own.

Git Operation Surface —

Auto-open on error: DevOpsTab now pops the Operation Log modal whenever a failed/blocked git operation the user hasn't seen yet appears (tracked by latestFailureAt vs the persisted logReadAt), so errors surface without a manual click on the toolbar Log button. Opening marks those failures read (same as a manual open) so the modal doesn't re-pop once dismissed; a fresh failure re-opens it. The unread-failure dot on the Log button is unchanged.
Wider, more legible log: OperationLogModal widened to max-w-[50vw] (~50% of the viewport, was max-w-2xl) and bumped its text up a step — the action line to text-sm (was text-[12px]) and the status/error message to text-[13px] (was text-[11px]) so longer failure output is easier to read.

2026-06-15

Account & Project Sync —

Project tombstones (supersedes additive-only merge). Deleting a project records a project tombstone in the new global sync_tombstones table (engine schema v18, entity_id = project id; record_tombstone called in routes/projects.py). On peers a project tombstone removes the project_registry row (cascading to members/seats) but never deletes a git working tree or on-disk database — mirroring the interactive delete-project route. The origin also DELETEs the project's cloud blob (/api/v1/project-sync/{id}). Project tombstones have no timestamp guard (project ids are unique per creation) and GC after 90 days.
Auto-on + bidirectional. Per-project blob sync defaults ON when signed in (still gated on the unlocked account key + linked cloud identity); the engine loop pulls on a slow timer (PULL_INTERVAL_SECONDS=60) on top of boot/focus/explicit actions, and the hub refreshes on focus + interval, so a long-running machine converges in both directions. Conflict-correctness fixes shipped alongside (reload-on-error in the hub, a per-project lock held during restore/resolve, manifest no longer clobbering a local project's blob generation).
Raised free-tier quotas (ADR-0065). project_sync_free_max_projects=100, project_sync_free_max_total_bytes=10 GiB (env-overridable) in apps/api/app/config.py.
Local-only mode (ADR-0063). Unlinked engines never write blobs to skybolt.ai (hard cloud-write gate); the hub shows "Sign in to enable sync"; local→cloud upgrade prompts to upload local projects.
Bug: the unified Sync hub computes a row's conflict / remote_newer state from the blob channel (cloud-manifest generation vs the local registry mirror), but its "Resolve…" and "Apply latest" buttons called resolveProjectSync → POST /projects/{id}/sync/resolve, which is the git-remote sealed channel (ADR-0041). That endpoint only ever reads the repo's committed .skybolt/project.sealed and never downloads the cloud blob — so "Apply latest" on a remote_newer row was a silent no-op (the row stayed stuck on "Cloud is newer"), and the background blob push then 409'd because the local generation never actually advanced. Matches the reported POST …/sync/resolve 200 → PUT /api/v1/project-sync/{id} 409 loop.
Fix (engine): new blob-channel resolver. project_blob_sync.resolve(settings, pid, resolution) — take_remote GETs + applies the cloud blob via a shared _pull_and_apply (the restore path, refactored to share); keep_local reseals + pushes (push_once, which already bumps past a newer remote on 409). _apply_remote_blob_async gained a force flag so an explicit "take cloud version" can roll back past the anti-rollback guard (unsynced local changes discarded — the renderer warns first). New route POST /account/project-sync/{id}/resolve (Pydantic ProjectSyncResolveRequest, 423 when the account key is locked, actionable status→HTTP mapping for offline/quota/etc.).
Fix (renderer): new api.ts client resolveCloudProjectSync(projectId, resolution); useSyncHub's applyRemote / resolveConflict now call it instead of resolveProjectSync. The git-remote SyncResumePanel (per-row Details disclosure) still uses resolveProjectSync for its own channel — unchanged.
Tests: engine test_project_blob_sync.py — test_resolve_take_remote_applies_newer_cloud (apply-latest catches local up), test_resolve_take_remote_forces_rollback_to_older_cloud (plain restore 409s, forced take_remote rolls back), test_resolve_keep_local_adopts_current_local_generation (17 passed). Renderer SyncHubBody.test.tsx updated to assert the blob resolver (15 passed). Engine pyright + web tsc clean.
Problem: signing in on a NEW computer restores a project's data, but the project keeps targeting the SSH machine it was bound to — and SSH keys + host-key trust never sync (ADR-0054), so the machine isn't reachable on the new box. Nothing told the user *why* the project wouldn't run; it just looked broken ("couldn't sync because the SSH machine wasn't set up yet").
Fix (two surfaces, renderer-only — no engine change):
- Sync hub (ProjectSyncRow): SSH-bound local rows now show a muted note — "Runs on SSH machine <host>. If this is a new computer, set up SSH access to it (keys + host trust never sync) — open the project to connect and retry." SyncHubBody derives the host from the existing projects prop via projectMachineSettings, so no overview/API change was needed.
- Project view (ProjectSurface): a top banner appears when the open project targets SSH and the machine probe reports offline — "This project runs on an SSH machine that isn't set up on this computer," with the probe error, a "Machine settings" button (Environment tab) and a "Retry connection" button. The existing SSH probe was relaxed to run on project open (not only on the Environment tab) so the banner shows immediately; the Environment tab and the banner share the one probe result.
Tests: SyncHubBody.test.tsx — the SSH note renders for an SSH-bound project and is absent for a local one (17 passed). web tsc + eslint clean.

Account Settings Sync —

Tombstone hard-delete (supersedes additive-only merge). New global sync_tombstones table (engine schema v18) recording (kind, entity_id, account_id, deleted_at). record_tombstone is called at every delete site for synced entities — providers/models/seats (routes/models.py), personas (personas_ops.py + routes/agents.py). Tombstones serialize into the account bundle, union-merge on apply, and then delete matching rows by their STABLE handle (provider id, model id, persona key, seat "<project_id>:<kind>"), so a deletion converges on every peer instead of resurrecting from a stale bundle.
Timestamp-guarded apply. Synced entities now carry their real updated_at (preserved on apply, not re-stamped). A tombstone deletes a row only when updated_at <= deleted_at, so a re-created same-handle entity (newer timestamp) survives and a peer re-shipping a stale row can't beat a delete. Tombstones GC after 90 days. *Supersedes the "additive merge (v1, no-delete)" sections of ADR-0054.*
Auto-sync on by default + periodic bidirectional pull. Linking an account now auto-enables settings sync (enable_default_if_unset, respecting an explicit prior off) and keeps the secrets scope at all. The engine loop pulls on a slow timer (PULL_INTERVAL_SECONDS=60) on top of boot, app focus, and explicit Sync-tab actions; the web hub (useSyncHub) refreshes on focus + interval. This restores periodic pull (superseding the 2026-06-12 "pull only on load/manual" cadence).
Conflict-correctness fixes shipped alongside: reload-on-error in the hub, a per-project lock held during restore/resolve, and the account-bundle manifest no longer clobbering a local project's blob generation.
Local-only mode (ADR-0063). When unlinked, the hard cloud-write gate keeps the engine off skybolt.ai and the hub shows a single "Sign in to enable sync" panel.
Bug: a project's display fields (menu_icon_data_url, menu_color, name) reached the synced_projects manifest and were applied on pull, but only to the global project_registry. The project listing (_list_user_projects) and single-project read (_project_by_id) take their display fields from the per-project projects row whenever the project's project.db exists locally — so on any machine that already had the project, the freshly-synced icon/color/name was masked by the stale local row and appeared not to sync. (Git-remote sync was unaffected: the per-project DB itself travels there.)
Fix: after a settings pull applies the manifest, account_settings_sync._reconcile_synced_project_display copies the synced display fields from the registry into each locally-present project's per-project projects row — mirroring what PATCH /projects already does on the writing machine (it updates both copies). The per-project write is delegated to the new project_store.reconcile_local_project_display, keeping this service's serialize/apply paths global-only. Cloud-known-but-not-yet-restored projects (no local DB) and locked/sealed project.dbs are skipped quietly and reconcile on a later pull.
Touch points: engine services/account_settings_sync.py (_reconcile_synced_project_display called from _pull_once_locked; covers sync_now/restore_now/the 409-retry pull) and store/project_store.py (reconcile_local_project_display).
Tests (apps/engine/tests/test_project_blob_sync.py): test_pulled_display_fields_reconcile_into_local_project_db (stale local row → after reconcile the listing shows the synced icon/color/name) and test_reconcile_skips_cloud_known_project_without_local_db.
The terminal account-settings category now also carries the user's rocket-menu quick-launch shortcuts (shortcuts: [{ id, label, command }]) alongside defaultTerminalType / newTerminalSetupMode. No serializer change was needed — _serialize_account_settings already ships the category's value_json verbatim, so the shortcuts sync in the preferences bundle untouched.
Fixed a latent gap: the web app previously wrote terminal account settings to localStorage only (getAccountSettings/putAccountSettings existed but were never called), so they never actually synced. The renderer now seeds the terminal category from /auth/me and persists every change via PUT /accounts/{id}/settings (best-effort, localStorage kept as the offline cache), with a one-time localStorage→engine migration in IdeShell. So defaultTerminalType/newTerminalSetupMode now sync too, not just the new shortcuts.
Covered by apps/engine/tests/test_account_settings_sync.py::test_terminal_shortcuts_round_trip (serialize → seal → unseal → apply preserves the shortcuts array and does not trip redaction).
Seat CONFIG now travels. This reverses the original ADR-0054 decision that seats never sync. A new seats bundle category carries the portable seat config — project_id, kind (the stable handle), label, execution_profile, and the pinned --model (a scalar lifted out of metadata_json; nothing else in metadata_json travels). Keyed by the stable (project_id, kind) handle; the account-local seat id is NOT serialized. Scoped to the linked user (account_id + user_id). Rationale: seat config is portable; CLI availability is machine-bound.
Runtime state stays machine-local (unchanged exclusion). seat_profiles.status, last_used_at, and the implicit "is this CLI installed/authenticated here" never travel.
Apply (_apply_seats): upserts under the LOCAL user_id (like user_prefs) on the (project_id, user_id, kind) unique handle. FK-safe + additive — a seat whose project_id has no project_registry row on this machine is skipped (the project hasn't been restored here yet) and lands on a later pull once it is; seats apply AFTER synced_projects so a just-restored project's registry row already exists. A fresh insert gets status='active' + NULL last_used_at; an existing row keeps its local runtime state and only the config columns update.
Gating + UX: a new account-wide seats selection category (defaults ON, like the others; the selection travels so all machines converge). It appears automatically in the renderer "What syncs across machines" matrix via apps/web/src/app/api.ts SYNC_CATEGORY_META (label "Seats"; the hint notes the CLI must still be installed and signed in on each machine). Rehoming mirrors the existing "Set API key" pattern — config restores, but claude/codex/mmx must still be installed and authenticated locally.
Defense-in-depth unchanged: a new _serialize_seats allowlist serializer; the redaction scan runs over the whole bundle (seats included); the sealed opaque container. No cloud API (apps/api) change — it stores opaque ciphertext with no schema knowledge of the contents.
Code touch points: engine services/account_settings_sync.py (_serialize_seats, _seat_model_from_metadata, _apply_seats, SYNC_CATEGORY_KEYS now includes "seats", wired into _serialize_sections + _apply_bundle); routes/account_settings_sync.py (SyncSelectionUpdate.seats); renderer apps/web/src/app/api.ts (SyncSelection.seats, SYNC_CATEGORY_META entry).
Tests (apps/engine/tests/test_account_settings_sync.py): test_seat_config_travels_but_runtime_does_not, test_seats_gated_off_emit_empty, test_seats_round_trip_config_only_when_project_present; plus the orphan-skip assertion added to test_seal_unseal_apply_round_trip and the runtime-exclusion assertions in test_bundle_excludes_all_forbidden_material.

Agent Terminal —

Ctrl/Cmd+Tab → cycle forward through the project tabs (unchanged).
Ctrl/Cmd+Shift+Tab → NEW: cycle forward through the open terminals in the Terminals tab. Triggered from another tab, it first switches to the Terminals tab and then advances the focused terminal. (Previously this chord cycled tabs backward.)
Ctrl+Alt+Left / Ctrl+Alt+Right → NEW: cycle to the previous / next project in the rail. (Previously plain Shift+Tab cycled projects; that binding was removed, so Shift+Tab is normal focus traversal again.)
Ctrl/Cmd+1..9/0 → jump to the Nth project tab (unchanged). The chord predicates and wrap-around index were extracted into the pure, unit-tested apps/web/src/app/project/keyboardShortcuts.ts. Window handlers live in ProjectSurface.tsx (onTabKeyDown, tabs + terminals) and IdeShell.tsx (onProjectCycle). TerminalView's xterm custom key handler now returns false for these nav chords so the shell handles them and the PTY never receives a stray Tab / back-tab / arrow sequence. Known caveat: on Windows with Intel integrated graphics, Ctrl+Alt+Arrow is the OS-level screen-rotation hotkey and may be intercepted by the GPU driver before the app sees it. 2. Active-terminal outline. The focused terminal in the grid now shows a brighter, persistent inset ring (border-accent + a inset 0 0 0 2px accent-blue box-shadow) instead of the old dim outer 1px shadow. The outer ring was clipped by the card's overflow-hidden on the sides touching the grid edge and faded because it was tied to xterm DOM focus-within; the inset ring is tied to active-terminal state, so it stays put on all four sides. Defined in apps/web/src/app/project/tabs/terminals/TerminalsTabBody.tsx. 3. Broadcast mode (live keystroke mirroring). A tmux synchronize-panes / iTerm-broadcast affordance:
A broadcast toggle button (concentric-signal icon) appears in each terminal panel header when there are 2+ terminals. Clicking it toggles a screen-wide broadcast mode; a "Broadcasting to all" badge shows in the Terminals toolbar while active.
When ON, every keystroke typed in the focused (active/source) terminal is mirrored in real time to all OTHER terminals. The source is skipped when fanning out, so there is no echo/loop.
Auto-disables when fewer than 2 terminals remain; resets on project switch. Implementation: a framework-free bus in apps/web/src/app/project/surface/terminalBroadcastBus.ts (createTerminalBroadcastBus; register/unregister/emit; unit-tested in terminalBroadcastBus.test.ts). State + bus are owned by useTerminals.ts (broadcastEnabled, toggleBroadcast, and the bus methods exposed as registerBroadcastSink / unregisterBroadcastSink / emitBroadcastInput). Each TerminalView registers its raw socket-send as a sink and, when it is the source, emits its onData through the bus; wired via the broadcast prop passed from TerminalsTabBody.tsx. A shared StatusBadge "info" tone (accent) was added in apps/web/src/components/StatusBadge.tsx for the badge. No backend changes. All three are renderer-only; no engine routes, schemas, or apps/api code changed. Files changed
Web (new): apps/web/src/app/project/keyboardShortcuts.ts (+ test), apps/web/src/app/project/surface/terminalBroadcastBus.ts (+ test).
Web (modified): apps/web/src/app/project/ProjectSurface.tsx (onTabKeyDown tab + terminal cycling), apps/web/src/app/project/IdeShell.tsx (onProjectCycle), apps/web/src/app/project/surface/useTerminals.ts (broadcast state + bus methods), apps/web/src/app/project/tabs/terminals/TerminalsTabBody.tsx (inset active ring, broadcast badge, broadcast prop), apps/web/src/components/TerminalView.tsx (nav-chord passthrough, broadcast sink/source wiring, header toggle), apps/web/src/components/StatusBadge.tsx ("info" tone).

Agents Execution

Typed cards + seeded contracts (F0). Cards gain card_type (feature|bug|research|spike|docs|test|security|chore) and a nullable estimate (xs..xl), both code-defined in services/card_templates.py (CARD_TYPES/CARD_ESTIMATES; no DB CHECK). seed_contract fills a per-type contract scaffold (acceptance criteria / quality gates / definition of done / review/test requirements / stop conditions) only into empty fields — user input is preserved. _GATES declares which specialist role_keys gate each type (required vs advisory).
Roster-aware orchestrator (F1). services/orchestrator_ops.py build_board_snapshot now exposes per roster agent: role_policy (required/advisory/off), operator_mode, capability_summary, bundle, tags, definition_of_done, and a derived handles_card_types (services/personas_ops.py ROLE_CARD_TYPE_AFFINITY + handles_card_types_for_agent; custom personas may override via a card_type_affinity snapshot list). The prompt teaches routing by card_type → handles_card_types, required-gate enforcement, advisory consultation, right-sizing, and an outcome-history preference. New decision verbs: assign_review, split_card, answer_question. Net effect: adding a User Advocate (advisory) or Security Reviewer (required gate) reshapes routing/gates with no per-persona code.
Outcome memory (F2). services/outcomes.py record_card_outcome writes a card_outcomes row on card finalize; build_outcome_digest aggregates per (card_type, role_key) (success rate, avg attempts/review cycles/cost) and is injected into the orchestrator snapshot + prompt.
Cost tracking + runaway breaks (F3). The orchestrator records metered token/cost per wake (orchestrator_runs.prompt_tokens/completion_tokens/cost_usd); a cumulative cost budget throttles further wakes. Per-card budgets (services/budgets.py; DEFAULT_BUDGETS, the agent_budgets project setting, per-card budget_json override) pause a runaway card (pause_card: status paused + pause_reason + a budget_pause approval) instead of seating another session. Honest constraint: CLI seats are opaque to the meter, so per-card $ covers only hosted chat cost (chat_runs.cost_usd via the new chat_runs.card_id); the agent-loop guard is proxy-based (sessions / attempts). Endpoints: GET /cards/{id}/usage, GET/PUT /agent-budgets.
Agent Q&A channel (F5). services/questions.py + routes/questions.py: an agent/chat hitting ambiguity raises a question (agent_questions) that parks its card to needs_input; a human or the orchestrator (answer_question) answers, injecting the answer as a card-scoped agent_note and requeuing the card. Open questions surface in the orchestrator snapshot.
Unified inbox (F6). GET /projects/{id}/inbox (routes/inbox.py) unions pending approvals (incl. budget pauses), open questions, and paused cards into one ranked list with a badge count. New coordination events: budget_tripped, question_raised, question_answered, orchestrator_throttled.
Flow analytics + gate visibility (F4). GET /projects/{id}/board/flow (WIP, status spread, throughput, avg cycle time, outcome digest — services/outcomes.py build_flow_metrics). GET /cards/{id}/gates (services/gates.py card_gates) reports required/advisory gates scoped to the current roster (a gate applies only if such an agent is on the project).
Required-gate enforcement is structural, not a DB block. It rides the queue → working → review → ready_for_merge → completed flow + the orchestrator's assign_review, because the LLM orchestrator and the deterministic dispatcher coexist; services/gates.py makes the gate state observable. See ADR-0066 §8.
Schema (project v12 → v13, strictly additive) in storage/project_db.py (the from_version < 13 ladder step): cards.card_type/estimate/budget_json/pause_reason, chat_runs.card_id, orchestrator_runs.prompt_tokens/completion_tokens/cost_usd, and the agent_questions + card_outcomes tables. New card statuses: paused, needs_input.
Tests. New apps/engine/tests/test_roster_aware_orchestration.py, test_outcome_memory.py, test_cost_and_breaks.py, test_questions.py, test_inbox.py, test_flow_and_gates.py. Full engine suite: 863 passed.
Decision. Recorded as ADR-0066 (documents/decisions/ adr-0066-roster-aware-orchestration-budgets-and-the-human-loop.md), extending ADR-0058.
An agent seat running on an SSH Machine is no longer finalized as terminal_exited mid-work when the user's network drops. SSH seats now run inside a persistent tmux session (see the SSH Machines changelog / ADR-0061), and scheduler.py _check_running_sessions only finalizes a running session when not is_alive AND not is_reattaching — so while the reattach supervisor reconnects the dropped channel, the seat keeps its running lease and heartbeat instead of being torn down. finalize_agent_session kills the seat's tmux session on genuine completion/exit. The local seat path is unchanged. Tests: apps/engine/tests/test_ssh_persistence.py.
Roster "Remove" now hard-deletes the membership row instead of archiving it (it used to set status='archived' + archived_at). The backing persona stays in the library, so the same agent can be re-added afterwards by re-checking it in the Persona Library. This works for built-in persona agents too — including the Chief Project Orchestrator — which previously could only be archived. That was the bug being fixed: an archived Orchestrator could not be re-added because the archived row still held the UNIQUE(project_id, name) slot.
Roster "Delete" (custom agents only) is unchanged in spirit: it hard-deletes the agent AND removes the backing custom persona from the library when no longer referenced. Built-in personas can be removed from a project but never deleted from the library.
Backend API contract (apps/engine/skybolt_engine/routes/agents.py):
- Removed POST /projects/{id}/agents/{agent_id}/archive and .../restore.
- DELETE /projects/{id}/agents/{agent_id} now always hard-deletes the membership row (built-in and custom alike) and accepts a delete_persona boolean query param (default false). With delete_persona=true it also deletes a no-longer-referenced custom persona (400 for a built-in persona). It resolves active agent_notes targeting the removed agent, emits a new agent_removed coordination event, and publishes an agent_removed scheduler_bus signal.
- GET /projects/{id}/agents no longer takes include_archived; it always lists non-archived agents (the archived_at IS NULL AND status != 'archived' filter is kept defensively for any stray legacy rows).
Wake whitelists (orchestrator.py, dispatcher.py): now include agent_removed and no longer list agent_archived/agent_restored (agent_added stays).
Renderer (apps/web): the Agents tab loses the "Show archived"/"Hide archived" toggle, the archived agents list, and the "Restore" button. The roster panel header is now "Agents" (was "Active Agents"); the empty state reads "No agents yet." Roster card actions are Edit / Memory / Remove / Delete (Delete only for custom-persona agents). Remove and Delete are disabled while the agent has an active session.
Per-project SQLite migration: PROJECT_SCHEMA_VERSION bumped 10 → 11 (storage/project_db.py). The v10→v11 step (_migrate_project_to_v11) is a one-time idempotent purge that deletes any leftover archived project_agents rows (status='archived' or archived_at set) and resolves their orphaned active notes, applied automatically the next time each project's engine opens its project.db. The archived_at column is intentionally retained (dropping it is needless churn; the list/dispatch queries keep it as a defensive filter).
Symptom. Every agent worktree branch was named skybolt/agent-<id8> — it ignored the project's DevOps branch_prefix setting (so cards never landed on feature/…) and used an opaque session-id fragment instead of the card name, making branches unreadable.
Fix. _claim_next now derives the branch from the project Git settings and the card title via the new agent_branch_base_name helper: <branch_prefix><card-title-slug> (e.g. feature/fix-the-login-bug), with branch_prefix defaulting to feature/. A card-less session falls back to <branch_prefix>agent-<id8>. The name is deduped against other live sessions (-2, -3…) so two concurrent same-titled cards still get distinct branches, and it is persisted to session metadata so a requeue resumes the same branch and a later review/merge session re-attaches the card's feature branch.
Settings. Reads git_settings.branch_prefix (DevOps page / import wizard), which the engine previously stored but never consumed.
Tests. test_agent_branch_naming.py covers prefix normalization, slugging, and fallback; the end-to-end scheduler test now asserts feature/implement-the-feature.
Files. apps/engine/skybolt_engine/services/git_ops.py (helpers), apps/engine/skybolt_engine/scheduler.py (_claim_next, _unique_branch_name).

Browser Redesign

Removed file:// from the browser preview URL allowlist. Browser sessions now accept only http and https targets so local files cannot be previewed through the browser surface.

Cloud Control Plane —

TOTP 2FA (ADR-0064). New POST /api/v1/auth/totp/{enroll,activate,disable,backup-codes} and GET /api/v1/auth/totp/status (rate-limited, pyotp); users.totp_secret / users.totp_enabled + a hashed single-use totp_backup_codes table via migration 0028_totp_2fa. POST /api/v1/auth/login accepts an optional totp_code and returns 401 {detail:{detail:"totp_required"|"invalid_totp", needs_totp:true}} until a valid TOTP or unused backup code is supplied. refresh_token_max_age_seconds raised to 30 days to align the cloud session with the engine's local-unlock window. Alembic head count is now 28.
Raised free-tier project-sync quotas (ADR-0065). project_sync_free_max_projects=100 and project_sync_free_max_total_bytes=10 GiB (env-overridable) in app/config.py, so "sync everything" works on the self-hosted deployment.
Disabled FastAPI /docs, /redoc, and /openapi.json when APP_ENV is outside development/test. First-class /api/* routes remain registered.
Added an ASGI request body size guard for unsafe /api requests. Normal JSON API routes use API_REQUEST_BODY_MAX_BYTES (default 1 MiB); settings/project sync keep explicit blob-route allowances derived from their ciphertext limits. Regression tests cover v1 auth, legacy control-plane auth, and sync override behavior.

Cloud Deployment

Changed the production API image build in .github/workflows/deploy.yml to publish the runtime stage, which runs as the non-root skybolt user, instead of the root base stage.

Cloud Identity (skybolt.ai)

QR code on enrollment. The enable-two-factor flow (apps/web TwoFactorSection) now renders the otpauth:// enrollment URI as a scannable QR code (SVG, via the new qrcode.react dep) instead of showing the raw URI as selectable text. The base32 secret remains as a "Can't scan?" manual fallback. The secret never leaves the device — the QR is generated client-side; no third-party network calls.
TOTP 2FA (ADR-0064). Server-verified TOTP gates account access without weakening zero-knowledge of project data. Cloud (apps/api) gained users.totp_secret / users.totp_enabled, a hashed single-use totp_backup_codes table, migration 0028_totp_2fa, the pyotp dep, and endpoints POST /api/v1/auth/totp/{enroll,activate,disable,backup-codes} + GET /api/v1/auth/totp/status (rate-limited). POST /api/v1/auth/login takes an optional totp_code and answers 401 {detail:{detail:"totp_required"|"invalid_totp", needs_totp:true}} until a valid TOTP (tried first) or unused backup code is supplied. refresh_token_max_age_seconds raised to 30 days so the cloud session and the local-unlock window lapse together.
Engine relays; renderer never calls skybolt.ai. The engine threads totp_code through cloud login/restore, surfaces needs_totp, and proxies the management endpoints via /auth/cloud/totp/* with the Bearer token. Renderer added an enrollment + management UI and an inline TOTP code field in the sign-in/restore flows.
First-class local-only mode + cloud-write gate (ADR-0063). The auth screen offers "Use Skybolt on this computer only" (local /auth/setup, no skybolt.ai account). A hard cloud-write gate (_active_identity None and no keychain refresh token) means an unlinked engine provably never writes to skybolt.ai; the Sync hub shows "Sign in to enable sync". Upgrading via the existing /auth/cloud/migrate prompts to upload existing local projects.

Desktop App

Added a bounded desktop startup /health wait after sidecar spawn, so normal renderer startup waits for the engine to become reachable while still falling through to the recovery UI if health never answers.
Stopped forwarding the desktop engine session token as a Python process argument; the sidecar now relies on SKYBOLT_ENGINE_SESSION_TOKEN for the Python engine launch.
Added an explicit Tauri CSP for the main renderer, scoped the desktop capability to the main webview (so browser-preview child webviews do not inherit IPC), and restricted browser-preview navigation/new-window redirects to loopback http/https URLs without embedded credentials.
Hardened numeric inputs at the desktop boundary: browser-preview bounds now reject non-finite values, and the sidecar launcher rejects wildcard/non-loopback hosts plus port 0, out-of-range ports, and non-finite/non-integer port values before binding or launching the Python engine.
Replaced single .old engine log rotation with bounded numbered backups (skybolt-engine.log.1 and .2) and rotate during long-running noisy sessions as well as on spawn.

Execution Engine

anthropic_api providers now speak the real Anthropic protocol. The provider kind existed in the enums and got tools-on-by-default treatment, but services/model_client.py only ever spoke the OpenAI wire format — it POSTed /chat/completions with Authorization: Bearer to every provider regardless of kind. Against an Anthropic-style endpoint (e.g. MiniMax's https://api.minimax.io/anthropic gateway, or Anthropic itself) that path does not exist, so the probe drew a 404, recorded a real latency, and — since 404 is not 401/403 — marked the provider unhealthy (and real chats failed the same way). The client now translates at its boundary for _ANTHROPIC_KINDS = {"anthropic_api"}: requests go to /v1/messages with x-api-key + anthropic-version: 2023-06-01 headers and an Anthropic body (system pulled to the top-level field, required max_tokens defaulted to 4096, OpenAI tools/tool_choice and assistant tool_calls/tool history mapped to Anthropic tool_use/tool_result blocks); responses and the SSE stream are translated *back* to OpenAI shape (content→choices[].message, tool_use→tool_calls, stop_reason→finish_reason, usage→prompt_tokens/completion_tokens). Nothing downstream (agent_chat, routes/chat.py, the probe) branches on kind. The same secret-resolution path feeds both header styles, so the references-only credential invariant is unchanged. Tests: test_anthropic_chat_translates_request_and_response, test_anthropic_chat_translates_tools_both_directions, test_anthropic_streaming_translates_text_and_usage, test_anthropic_streaming_translates_tool_use, test_anthropic_probe_uses_messages_path_not_chat_completions (tests/test_model_client.py).
Terminal persistence. Terminals no longer die silently when the engine process restarts. Project schema v12 adds a strictly-additive nullable terminal_sessions.scrollback TEXT column (storage/project_db.py), and a new TerminalScrollbackFlusher service (services/terminal_scrollback.py, registered in services/runtime_hooks.py, started/stopped by the app lifespan) snapshots each live terminal's replay buffer into that column every ~4s and once on shutdown — interval configurable via SKYBOLT_TERMINAL_SCROLLBACK_FLUSH_SECONDS (default 4.0). The broker (terminal.py) gained a total_written/flushed_total watermark, a scrollback_prefix, and pending_scrollback() (capped to the last 1 MB). On restart, reconciliation (services/machines.py) now distinguishes an engine restart (broker has no record of the PTY → _mark_terminal_machine_restarted, status message marker machine-restarted, scrollback kept) from a genuine in-engine exit (_mark_terminal_process_exited). New endpoint POST /projects/{id}/terminals/{tid}/restart (routes/terminals.py) relaunches the stored shell/cwd, flips the row back to live, and carries the persisted transcript forward as the new PTY's prefix (with a [Terminal restarted] marker). The renderer's Restart UI was already wired against this contract; only the backend half was added. _terminal_out returns the new column, falling back to the legacy metadata_json.scrollback. Limitation: only the transcript and the ability to relaunch survive — the running process does not, and a machine reboot loses output since the last ~4s flush. Tests: test_engine_restart_marks_terminals_machine_restarted_and_restart_relaunches, test_scrollback_persists_and_is_carried_across_restart, test_pending_scrollback_snapshots_growth_and_carries_prefix (tests/test_app.py); the v12 column is asserted in tests/test_scheduler.py.
Scoped --reload to engine source. cli.py's reload path now passes reload_dirs=[<skybolt_engine package dir>] and reload_excludes=["test_*.py", "*_test.py", "*/tests/*", "*/__pycache__/*"] to uvicorn.run. Previously uvicorn watched the whole working directory, so editing the renderer, docs, or the engine's own tests bounced the worker and killed every live terminal/browser session. Now only engine source changes trigger a reload.
Hardened hosted-chat and terminal WebSocket handshakes so Host, Origin, and the per-launch engine session token are validated before accept(). Attach endpoints now include the engine token in the returned local WebSocket URL so browser clients can satisfy the pre-accept check before sending the one-shot attach token.
Added Phase 0 shell-command guardrails for command profiles and saved-terminal launch paths: shell composition metacharacters are rejected before local shell=True, SSH shell dispatch, or terminal startup injection.
Added Phase 1 sync race hardening: account-settings sync now serializes local pull/push/signature mutation, and project-blob sync serializes each project's push/signature window so background changed-passes and user-triggered actions cannot double-push unchanged content.

Git Operation Surface —

Responsiveness fix: routes/git.py and routes/approvals.py ran the blocking _execute_project_git_operation (which shells out to git) directly on the single-threaded async event loop, so while any git op ran, every other request the Dev Ops tab polls stalled — the whole page froze until git returned (worst on Windows / when a worktree held the target branch). Both now await asyncio.to_thread(...), matching the scheduler/dispatcher/command-exec git paths. The op still records its result identically (incl. failures in the log); only the threading changed.
Worktree-aware close: closing a branch a linked worktree has checked out failed with git's raw cannot delete branch '…' used by worktree at '…'. _git_branch_delete (services/git_ops.py) now detects the holding worktree via _worktree_for_branch (skips the primary worktree) and, only with an explicit remove_worktree: true in the payload, removes it (git worktree remove --force + prune) and retries the delete. The removal runs through the same runner as the delete (local or SSH), so a worktree on an SSH Machine is removed on the machine that owns it — the first cut wrongly used the local-only remove_worktree helper, which on an SSH target ran git on the engine host against a non-existent path and silently no-op'd, so confirming never worked. Without the flag it returns blocked + requires_worktree_removal
- worktree_path + a warning that an active session would be disrupted. Merge is unaffected — git merges a worktree-held branch fine.
UI: MergeBranchPanel shows a warning-toned confirm banner ("Worktree holds this branch", with the path + warning) when the latest branch_delete is blocked: requires_worktree_removal; its "Remove worktree & close" button re-issues branch_delete with remove_worktree: true. DevOpsTab derives the prompt from that operation's metadata.
Tests: test_git_surface.py gains the blocked→confirm→remove flow and a non-worktree-failure regression (still a plain failed); MergeBranchPanel.test.tsx gains the prompt-hidden and confirm-reissues cases.
Replaced the inline GitApprovalsCard (a panel at the top of the Dev Ops tab that was easy to scroll past / miss) with GitApprovalsModal, a modal that auto-pops over the tab whenever a pending git-operation gate appears. Same Approve/Reject rows and the same SSH-passphrase field for push/commit_push; the modal can't be dismissed while a decision is in flight (Modal busy).
Auto-open is tracked by approval id (DevOpsTab seenApprovalIdsRef): a new gate pops the modal, dismissing it won't re-pop for the same gate, and resolving the last pending approval clears the set so the next gate pops again.
DevOpsToolbar gained an "N Pending Approval(s)" primary button (shown only while gates are pending) so a dismissed modal is always one click from reopening — the gate can't get stranded.
Tests: GitApprovalsCard.test.tsx → GitApprovalsModal.test.tsx (6 tests, incl. closed / no-pending render-nothing); DevOpsToolbar.test.tsx gains the pending-approvals button case; DevOpsTab.test.tsx gains an auto-open / dismiss-and-reopen group over the ready view.
Added apps/web/src/app/project/tabs/MergeBranchPanel.tsx, a "Merge & Close Branches" card on the Dev Ops tab (left column, under BranchesPanel). It picks any branch other than the current one and exposes two ordered steps: Merge into <current> (merge) and Close Branch (branch_delete, with a "Force (delete even if unmerged)" checkbox that flips -d → -D).
These are the first admin-direct Dev Ops actions that intentionally stay gated: merge and branch_delete are in HIGH_AUTHORITY_GIT_ACTIONS and excluded from HUMAN_DIRECT_GIT_ACTIONS, so each click queues a request in the Pending Git Approvals surface and only runs on approval via routes/approvals.py. No backend changes were needed — the engine surface has supported both since 2026-06-09.
Wired "merge" and "branch_delete" into the renderer GitOperationAction union (apps/web/src/app/api.ts).
DevOpsTab now also reads the refreshed branch list from a branch_delete result (added to the branchOperation latestOperationOf lookup), so closing a branch updates the branch panels after the approval executes.
Added apps/web/src/app/project/tabs/MergeBranchPanel.test.tsx (6 tests): current branch excluded as a target, merge/branch_delete payload shapes, the force toggle, and the disabled states (no selection / git not runnable).

In-App IDE

Added: terminal quick-launch shortcuts are now user-managed in Settings → Terminals, instead of a hardcoded list. Each shortcut is a free-form { label, command } pair (e.g. claude, codex, or opencode --model zai-coding-plan/glm-5.2), so subscriptions that run their own CLI and ones that run through opencode --model … are both supported. The editor (TerminalShortcutsEditor) supports add / edit / delete / reorder and a "Restore defaults"; the rocket menu renders the configured flat list (no more provider dividers), falling back to the built-in defaults (apps/web/src/components/terminalLaunchCommands.ts) when nothing is configured.
Shortcuts are account-scoped and sync across machines. They live in the existing terminal account-settings category and now round-trip through the engine (PUT /accounts/{id}/settings, seeded from /auth/me), so they travel in the zero-knowledge account-settings sync bundle (ADR-0054/0055). This also fixes a latent gap: the existing defaultTerminalType / newTerminalSetupMode settings were previously written to localStorage only and never synced — the whole terminal category now persists to the engine, with localStorage kept as the offline cache and a one-time migration in IdeShell for machines that already had a local value.
New helpers: resolveTerminalLaunchCommands / seedShortcutsFromDefaults / parseAccountTerminalSettings / accountTerminalSettingsPayload (surfaceShared.ts) and terminalShortcutsFromUnknown (modalShared.ts). TerminalView takes a launchCommands prop; it is threaded from terminalSettings via ProjectSurface → TerminalsTabBody and the popout.
Added: a quick-launch (rocket) button in the terminal panel header. Clicking it opens a small menu of common agent CLIs — Start Claude / Claude Yolo, Codex / Codex Yolo, Kimi / Kimi Yolo, and OpenCode / OpenCode Yolo — and picking one types the exact command into that terminal and presses Enter for you. It reuses the existing queued-input submit pipeline (queueTerminalInput(id, command, { submit:true })), so a queued command flushes as soon as the terminal connects. When broadcast mode is on, picking a command fans it out to every visible terminal (matching how live keystrokes broadcast) via the new launchTerminalCommand action / terminalLaunchTargetIds helper; otherwise it goes only to the clicked terminal. The menu shows on both the grid terminals (TerminalsTabBody) and popped-out terminals (ProjectSurface); it is hidden on embedded consoles that pass showActions={false} (e.g. the planning session view). The command list lives in apps/web/src/components/terminalLaunchCommands.ts; the button reuses the shared OverflowMenu (now accepting a custom trigger icon) and the new RocketIcon.

opencode Dangerous Mode —

The card moved from the bottom (above the Danger zone, full width) up to the top Members/Access row — the slot the standalone Access card vacated when it became the member-access modal (see the project-members feature). The card is now half-width (xl:col-span-2 dropped).
New Settings-tab control for opencode CLI users: a toggle + Apply settings button (project-admin only) that grants opencode full tool permissions for one project, gated by a themed confirmation modal.
Project-scoped, not global. Writes a project-level opencode.json in the project root (which opencode merges over global config), so the full-permission grant is confined to that one directory — unlike editing ~/.config/opencode/opencode.json, which would affect opencode everywhere.
Additive + reversible. Reads any existing opencode.json, sets only $schema + the permission block, preserves the user's other keys, and writes it back. Invalid existing JSON aborts rather than clobbering; a comment-bearing opencode.jsonc is never rewritten. Disable removes the managed block (or the file, if nothing else remains).
Local + SSH execution targets. Local writes happen directly in Python (cross-platform); SSH writes go through the engine's strict-argv SSH helpers (_run_ssh_shell, _ssh_upload_file) — no generated shell scripts.
doom_loop stays "ask" so the runaway-loop guard remains.
Code: engine services/opencode_config.py + routes/opencode.py (GET/POST /projects/{id}/opencode/dangerous-mode), OpencodeDangerousModeRequest in schemas.py, router registered in routes/__init__.py; renderer OpencodeDangerousModeCard (in ProjectSettingsPanel), OpencodeDangerousModeConfirmModal, and get/setOpencodeDangerousMode in api.ts. No cloud API (apps/api) change. No ADR.

Project Members —

Click a member → access modal. The Settings tab Members row is now clickable and opens a MemberAccessModal showing the member's identity and per-capability access readout. The standalone Access card was removed (its readout moved into the modal); the opencode dangerous-mode card took the freed top-row slot.
Guarded removal. New DELETE /projects/{id}/members/{user_id} engine endpoint (project-admin only) removes a membership, but blocks removing the last remaining project_admin (409) so a project can't be orphaned — 404 for unknown members, 403 for non-admins. The UI gates the action behind an in-modal confirm step and surfaces the guard message inline.
Code: engine routes/projects.py (remove_project_member); renderer MemberAccessModal + ProjectSettingsPanel (clickable row, Access card removed, opencode card relocated); removeProjectMember in api.ts. Members stay in the global DB; no schema change; no ADR.
Known gap: no add-member/role API yet (account-controlled). Self-removal with another admin present is unreachable today (single-member + last-admin guard); when multi-member lands it should refresh the project list.

SSH Machines

SSH-Machine terminals and agent seats now survive a transient network drop between the local engine and the remote (ADR-0061). The remote shell runs inside a persistent tmux session (skybolt-<terminal-id>) via a self-detecting wrapper in services/ssh_ops.py (_tmux_wrap), so losing the local ssh channel DETACHES rather than kills it; a new reattach supervisor (services/ssh_persistence.py) re-spawns the ssh process in place (new-session -A reattaches), classifying drop-vs-exit-vs-unreachable by the tmux has-session ssh return code. On by default when the remote has tmux (>= 1.8), with byte-identical fallback when it doesn't. Linux-only (Windows users connect to a Linux/WSL2 remote). The agent scheduler no longer finalizes a seat as terminal_exited while it is reattaching (_check_running_sessions gate), closing the mid-work loss for SSH seats. Delete and seat completion run tmux kill-session; a prefix-bounded reaper clears orphaned skybolt-* sessions; Restart reattaches a still-live session instead of starting fresh. New env knobs: SKYBOLT_SSH_REATTACH_INTERVAL_SECONDS, SKYBOLT_SSH_REATTACH_GRACE_SECONDS, SKYBOLT_SSH_REAP_SECONDS. Tests: apps/engine/tests/test_ssh_persistence.py.

2026-06-14

Account Settings Sync —

Engine — surface them. GET /accounts/{id}/machines (routes/machines.py) now returns the local engine PLUS the synced kind='ssh' rows (serialized via _ssh_account_machine: isolation_policy.machine.ssh.{host,port}, host, status). Previously it returned only the local machine, so synced SSH machines were invisible. Test: test_account_machines_include_synced_ssh_machines.
Renderer — connect a cloud-known ("ghost") project to an SSH machine. The Sync hub gained a "Connect on SSH machine" action on ghost rows (ProjectSyncRow/useSyncHub/ConnectSshMachineModal): pick a synced SSH machine or type a new host/user/port, give the remote repo path, and it calls the existing POST /projects/import/ssh — which pulls only the project's .skybolt/project.db metadata, materializes the ghost (reconciling its id), and points its execution target at that machine. The code stays on the remote host. New importProjectSsh in api.ts. The action shows for any ghost (no pre-existing machine required — that's how the first one gets added).
Propagate the binding. A default-on "Sync this project across my devices" checkbox enables blob sync after connecting, so the project.db (with its SSH execution target) travels as an opaque encrypted blob; other devices see the ghost and "Restore to this machine" (ADR-0060 ghost-restore fix) pulls it already bound. The account-level machines row for a freshly-typed host is seeded by the engine when the imported project's SSH target is first used, and then syncs.
Tests: web ConnectSshMachineModal.test.tsx (7) + SyncHubBody.test.tsx connect cases (4); engine test_app.py machine-list test. The 2026-06-17 project-blob auto-restore pass closed the remaining login zero-click gap.

Agent Terminal —

AgentTerminalQuickPicker now renders the unified seat+model choice list (the same buildChatChoices list the Chats composer uses), with catalog models grouped by provider. Its onConfirm returns a discriminated union: { kind: "seat", ... } or { kind: "model", ... }.
New InlineModelChatPanel component: a fully self-contained, streaming chat panel for one hosted-model thread. It loads the persisted transcript on mount, resumes a reply still streaming server-side (reload-resume), sends turns via createChatRun + streamChatRun, renders tool steps + inline approvals, and reuses AssistantMessageBody. Several panels can run at once (unlike the single-selection useChats).
New useInlineModelChats hook tracks which chat threads are pinned as inline panels on the Terminals tab (localStorage-persisted per project) and owns openInlineModelChat / closeInlineModelChat.
Shared surface/chatTranscript.ts module: the pure transcript/tool helpers (toolActivityFromFrame, upsertToolActivity, mapReloadedTranscript, formatToolActivityLabel, CHAT_HISTORY_LIMIT, …) moved out of useChats.ts so the Chats tab and the inline panel render the same shape from one source. useChats re-exports them, so no external import sites changed.
TerminalView header gained an inline X kill button (shown whenever the terminal is killable) next to the ⋯ overflow menu; clicking it calls the same onKill the menu's "Kill" item does. No backend changes. The model-chat path reuses the existing chat engine endpoints/WebSocket verbatim (POST /chat-runs, the chat attach + /chat/{project}/{thread}/ws socket). No engine routes or schemas were added or modified for this change. Non-goals
CLI seat agent terminals (PTY) are unchanged.
The model-chat path creates a real chat thread, so the conversation also appears in the Chats tab; closing the inline panel only unpins it (it does not delete the thread).
Inline panels render in their own responsive grid section above the PTY terminal grid; they do not participate in the terminal layout presets, drag-reorder, or pop-out. Files changed
Web (new): apps/web/src/app/project/surface/chatTranscript.ts, apps/web/src/app/project/surface/useInlineModelChats.ts, apps/web/src/app/project/tabs/terminals/InlineModelChatPanel.tsx (+ test).
Web (modified): apps/web/src/app/project/surface/useChats.ts (helpers extracted + re-exported), apps/web/src/app/project/tabs/terminals/AgentTerminalQuickPicker.tsx (+ test) (unified picker), apps/web/src/app/project/tabs/terminals/TerminalsTabBody.tsx (render panels + route picker), apps/web/src/app/project/ProjectSurface.tsx (instantiate useInlineModelChats, pass props), apps/web/src/components/TerminalView.tsx (+ test) (inline X kill button). Validation
pnpm --filter @skybolt/web typecheck — green.
pnpm --filter @skybolt/web lint (changed files) — green.
Targeted vitest (InlineModelChatPanel, AgentTerminalQuickPicker, useChatsToolActivity, TerminalView) — green (29 tests).
Full vitest run src — the 8 SkyboltApp.test.tsx failures are pre-existing (verified by running the file on the pre-change baseline: same 8 fail) and unrelated to this change. Limitations / follow-up
Inline panels live outside the terminal layout engine (own grid section). Unifying them into the layout presets / drag / pop-out is a follow-up.
Closing a panel unpins it but leaves the chat thread; a "delete thread" affordance could be added to the panel header.
Tool-approval resolution in the panel reuses the shared approvals endpoint; brain-context priority sources (the Chats tab's source-material selection) are not surfaced in the inline panel yet.
SSH-backed Codex/Claude command terminals now launch through an interactive login shell (-l -i -c) and do not prefix the seat command with exec, so profile PATH entries, aliases, functions, and wrappers resolve the same way they do in a normal SSH terminal.
The inline Agent Terminal chat input now queues raw text with a submit flag. TerminalView sends the body first and sends Enter as a separate carriage-return write so the TUI submits the message instead of only filling its input box.
Chat input pointer and click events stop propagation so clicking the chat box does not refocus the terminal card.

Cloud Identity (skybolt.ai)

cloud_restore no longer dead-ends with "local profile for a different account." When the recovery code opens an existing database that has no profile for this account — an empty/placeholder file, or a stale profile under a different email (leftover test data, or a folder a cloud-sync client re-created) — _restore_unlock_in_place now returns a fall-through signal and the route does a fresh restore (reset at-rest, recreate the profile under the recovery key, pull from the cloud) instead of returning 409. A genuinely different account still can't reach this path: its key fails the custody verifier or won't decrypt the DB → 422. The matching-profile in-place unlock is unchanged.
Storage resolution no longer flip-flops back onto OneDrive (ADR-0060). resolve_data_dir (engine settings.py + desktop data_dir.rs) now prefers the local data dir over a *cloud-synced* legacy <Documents>/Skybolt once the local dir already holds a database — so a sync client re-creating the deleted OneDrive folder can't pull the engine back onto the corrupting copy.
Why: a user who deleted their OneDrive Skybolt folder hit 409 on restore — OneDrive had re-synced the old DB, and a stale local DB (no custody, mismatched email) made the in-place path reject them. Tests: test_cloud_auth.py +2 (empty-DB and different-email → fresh restore); test_db_corruption.py +1 (local preferred over re-synced OneDrive).
No more "Could not unlock with those credentials" dead-end. When the cloud accepts the password but this machine's encrypted store can't be opened — at-rest.json custody missing (an encrypted skybolt-engine.db present without its custody, e.g. dropped/not-synced by a file-sync client like OneDrive) or the master password drifted after a reset — cloud_login now returns needs_recovery instead of a 401. The renderer's inline recovery prompt then asks for the recovery code.
cloud_restore unlocks an existing encrypted database in place (new _restore_unlock_in_place): instead of refusing with 409 when a DB file exists, it uses the recovery code (the account key) to open the data without wiping it. It verifies the recovered key actually decrypts the DB before creating a session, re-wraps the password custody with the current cloud password so a plain password sign-in works next time (self-heal), and rolls back to the prior orphaned state on a wrong code (no data loss, no bogus custody left behind). The fresh-new-machine restore path (no DB on disk) is unchanged.
Why: a returning user with a valid password and recovery code could be locked out of an existing encrypted database with no in-product path forward — the only fix was hand-deleting the DB. The recovery code is the account key; it should open the data. Unlocking still requires BOTH the cloud password and the recovery code, so this is strictly stronger than either alone — no escrow weakening.
Files: apps/engine/skybolt_engine/routes/cloud_auth.py. Tests: test_cloud_auth.py +3 — orphaned-custody login → needs_recovery, in-place recovery unlock + password self-heal, wrong-code rollback keeps the DB.
Sign-in folds restore inline. When cloud login returns needs_recovery (the account exists but this machine holds no encrypted data), the sign-in screen now reveals the recovery-code field in place and the next submit restores via cloudRestore, reusing the email/password already typed — instead of bouncing to the separate restore mode and forcing re-entry of email/display-name/password. The display name defaults from the email local-part since the inline prompt asks only for the recovery code.
Removed the manual "I already have an account — restore it" link from the sign-in screen. A returning user on a new device just signs in; the inline prompt appears only if the machine needs the recovery code. Sign-in remains cloud-first with the silent local fallback on a 503. The restore mode and its create-account entry point are unchanged.
Files: apps/web/src/app/auth/AuthScreen.tsx. Tests updated: AuthScreen.test.tsx (needs_recovery now asserts the inline prompt, no display-name field, cloudRestore called with the email-derived display name) and SkyboltApp.test.tsx (the restore flow drives cloud-login → inline recovery).

Desktop App

EngineRecovery raised the health-ping failure threshold from 2 to 4 (~40 s) so a normal restart gap — a uvicorn --reload worker swap in dev, or a slow sidecar respawn after a desktop rebuild — no longer flashes the "engine is no longer running" recovery dialog. A genuine crash still surfaces instantly via the engine-exited process event (unchanged); the health backstop only needs to catch a hung-but-alive engine, so it can afford to be lenient. Test updated (EngineRecovery.test.tsx).
data_dir.rs: a fresh install now defaults its data dir to a local, never-synced folder (local_app_data_dir, e.g. %LOCALAPPDATA%/Skybolt) when the computed Documents path is cloud-synced; otherwise <Documents>/Skybolt as before. resolve_data_dir gained a legacy fallback (an existing <Documents>/Skybolt/skybolt-engine.db is preserved) so the default change never orphans or silently moves existing installs. The bootstrap config.json anchor moved to a dedicated documents_skybolt_dir() (the historical Documents path) so it stays findable.
New Tauri commands: reset_corrupt_data (stop engine → delete skybolt-engine.db/-wal/-shm + at-rest.json with the same handle-release retry as change_data_dir → relaunch) and quit_app (app.exit(0); the engine is stopped by the existing Exit handler). Both registered in the invoke handler and consumed by the renderer's DatabaseRecoveryScreen.
Mirrors the engine's Python resolution (settings.py); the desktop still passes the resolved path via SKYBOLT_ENGINE_DB, so the Rust resolution is authoritative for the app. Validated with cargo check.

Encrypted Project Sync

Bug: a project known only from the synced_projects manifest (registry row with skybolt_path NULL) gets the cloud's last_generation copied into its local sync_generation. The blob-restore anti-rollback check (_apply_remote_blob_async → sync_ops._generation_conflict) compared that phantom generation against the cloud blob and returned 409 "the cloud copy is older than this machine's state" whenever the phantom was higher — so a project whose real work lives on another machine (e.g. an SSH dev box) could never be pulled onto a fresh machine. The same ghost also blocked everything project-scoped: PATCH /projects/{id} (where the SSH execution target is configured) 404s until the project is materialized, which read as "can't add machines."
Fix: anti-rollback now applies only when the project actually has local data (skybolt_path set). A ghost has nothing on disk to protect, so restore proceeds and the account-key unseal is the authority. The overview's per-project conflict flag likewise requires has_local. Restoring materializes the project (installs the DB, sets skybolt_path, reconciles the execution target), which then unblocks configuring its machine. Anti-rollback for projects WITH local data is unchanged (test_restore_blocks_rollback still passes).
Files: services/project_blob_sync.py, routes/project_blob_sync.py. Test: test_restore_ghost_project_ignores_phantom_generation (ADR-0060 follow-up).

SSH Machines

SSH command-backed terminals, agent seats, and Codex/Claude capability probes now use an interactive login shell (-l -i -c). Seat commands are no longer prefixed with exec, so aliases, functions, wrapper scripts, and profile-managed PATH entries resolve the same way they do in a normal SSH terminal.

2026-06-13

Agent Terminal —

New engine endpoint POST /projects/{id}/agent-terminals that creates a terminal_sessions row with control_mode='agent' and agent_lease_state='leased', spawns the requested seat on the project's default execution target, and returns the new terminal plus the attach record inline.
Companion DELETE /projects/{id}/terminals/{tid}/agent-lease that releases only the opener's lease (holder is user:<user_id>, not an agent session id).
Two UI affordances: a new + Agent Terminal button on the Terminals toolbar, and an "Agent Terminal" preset tab in the existing NewTerminalSetupModal.
An inline chat input docked at the bottom of every terminal card whose control_mode === "agent". Submitting the input sends the message as raw terminal input via the existing queueTerminalInput path; the encoding matches the engine's format_seat_submit so a multi-line paste still reaches the seat intact.
New formatQueuedTerminalInput helper in terminalsPresentation.ts that mirrors the engine's format_seat_submit (single-line + CR, multi-line + bracketed paste + CR). Both the renderer and the engine now agree on every byte that goes to the seat. Non-goals
The agent scheduling / Kanban card path is untouched.
apps/api is untouched.
No new tooling upgrade.
The user-opened terminal does not create a chat session — the chat input is terminal input only. Files changed
Engine: new apps/engine/skybolt_engine/routes/agent_terminals.py; additions to apps/engine/skybolt_engine/schemas.py and apps/engine/skybolt_engine/routes/__init__.py; additions to apps/engine/tests/test_app.py (6 tests).
Web: new apps/web/src/app/project/tabs/terminals/AgentTerminalQuickPicker.tsx (+ test), AgentTerminalChatPanel.tsx (+ test); additions to apps/web/src/app/api.ts, apps/web/src/app/project/modals/NewTerminalSetupModal.tsx, apps/web/src/app/project/project.tsx's useTerminals hook, apps/web/src/app/project/tabs/terminals/TerminalsTabBody.tsx, apps/web/src/app/project/tabs/terminals/terminalsPresentation.ts, and the shell ProjectSurface.tsx (props + busy wiring). Validation
cd apps/engine && python -m pytest -q — green (full suite).
pnpm --filter @skybolt/web test -- --run — green (528 tests total; 8 pre-existing SkyboltApp.test.tsx failures are present on origin/main and unrelated to T7).
pnpm --filter @skybolt/web typecheck — green.
pnpm --filter @skybolt/web lint — green. Limitations / follow-up
The persona dropdown is a placeholder — the engine stores the persona_id on the row but no behavior keys off it yet. A v2 prompt package can read it to seed the seat.
The inline chat is a single-line input. A multi-line message can be pasted (the bracketed-paste encoding handles it correctly), but there is no Shift+Enter newline in the input itself. A future textarea upgrade would surface the same encoding.
Releasing the agent lease flips the row to control_mode='human' but does not auto-pause or auto-kill the seat. The user can keep watching after releasing; closing the terminal kills the seat.
The endpoint uses the project's default execution target. A future iteration could let the user pick a specific machine from the picker (the underlying project_execution_targets plumbing already supports it).

Agents Execution

Symptom. A prompt-injected model with chat-driven delete_path could call the tool on a top-level path that passes the write policy (e.g. apps/web/src) and have shutil.rmtree wipe the entire subtree — including .env, *.key, *.pem, anything under secrets/, .git/ internals, .skybolt/ internals. Pre-fix, the per-file policy gate was only applied to the *top* path; the walker and the rmtree did the rest.
Invariant being enforced. agent_file_tools.py and agent_action_tools.py both state: "the built-in always-on denies cannot be reached even with can_write." That invariant is now true for the delete path as well as the read/write paths.
Fix. _local_delete_workspace_entry (and its SSH counterpart _ssh_delete_workspace_entry) now accept a policy kwarg; when recursive=True and a policy is supplied, the helper walks the subtree (pruning .git / .skybolt / node_modules / __pycache__ at every depth) and refuses the whole delete if any leaf is denied. The chat tool's _delete_path passes its policy through; the IDE route is unchanged (operators running deletes through the route get the same protection for free as soon as a future call passes a policy).
SSH caveat. The SSH path uses the existing remote workspace index to enumerate leaves; if the index is cold and the walker fails, the recursive delete is refused with a 409. The conservative tradeoff is prefer-refusal over silent policy bypass.
Tests. test_delete_path_refuses_to_wipe_protected_leaves exercises the recursive-protected-leaf case. Existing top-level .env test (test_file_policy den...) still passes; the new gate is a second layer below the top-level check.
Files. apps/engine/skybolt_engine/services/workspace_ops.py (the new helper + gate), apps/engine/skybolt_engine/services/ssh_workspace.py (SSH counterpart with index-based walker), apps/engine/skybolt_engine/services/agent_action_tools.py (passes policy through on delete), apps/engine/tests/test_agent_action_tools.py (regression test).
Symptom. Seating an agent (Claude Code / Codex) via the scheduler sometimes landed the prompt in the seat's PTY without pressing Enter: the bracketed-paste body appeared in the agent's input but the trailing CR was swallowed, so the seat never started developing.
Cause. Two compounding races: 1. `_deliver_seat_prompt slept a fixed seat_prompt_startup_delay before writing. On slow machines the seat's TUI was still in its splash screen when we wrote, so the input box was not yet live and the Enter was eaten by the splash sequence. 2. winpty's write() goes through a conpty buffer; without an explicit flush()` the trailing CR of a paste-and-submit sequence can sit in the buffer past the moment the seat is interactive, and the TUI never sees the Enter.
Fix. New `LocalTerminalBroker.wait_for_seat_ready watches the PTY output stream for a per-CLI readiness pattern with a hard 25s timeout. The Claude pattern is the strict ❯ symbol + whitespace + a cursor-positioning escape (CSI `<row>;<col>H) — the cursor-positioning sequence is the strongest "the input box is now the active editing surface" signal in Ink / raw-mode TUIs, because the TUI moves the cursor into the input region right before it starts accepting keys. Matching the bare symbol alone races against the TUI drawing the prompt before its input loop is up. The scheduler calls the wait first; on success the prompt is written immediately, on timeout a logger.warning is emitted and the prompt is still delivered (best-effort). _write_terminal_input now calls flush() after every winpty write so the trailing CR is not stranded in the conpty buffer. The legacy fixed-sleep override is preserved as a fallback for operators who set seat_prompt_startup_delay` explicitly.
Files. `skybolt_engine/terminal.py (new wait_for_seat_ready, seat_ready_pattern, SEAT_READY_PATTERNS, DEFAULT_SEAT_READY_PATTERN, read_event / read_offset on LocalTerminalProcess; _write_terminal_input flushes winpty); skybolt_engine/scheduler.py (_schedule_seat_prompt / _deliver_seat_prompt` use the readiness wait).
Tests. Four new tests in `tests/test_terminal.py: test_seat_ready_pattern_picks_per_cli_regex (per-CLI + path resolution + default fallback), test_broker_wait_for_seat_ready_returns_false_on_timeout (no output → False), test_broker_wait_for_seat_ready_returns_true_when_pattern_matches (sentinel in stdout → True, read_offset advances), test_write_terminal_input_flushes_winpty_process (winpty flush()` is called after every write). Full engine suite: 663 passed, 0 failed.
Read path (engine). Coordination events were durable-only — written to coordination_events but never exposed to the renderer. Added services/events.py::coordination_event_out + list_recent_coordination_events (parses payload_json to an object; newest-first with a rowid tiebreaker) and a read-only route GET /projects/{project_id}/coordination-events?limit= in routes/agents.py (mirrors list_orchestrator_runs: _project_role authz, 404 for non-members, capped at 100).
Web client. api.ts gained the CoordinationEvent type and listCoordinationEvents. ProjectSurface fetches it in both the full load and the runtime poll (via optionalList, so a 404 degrades to empty) and threads it to the Agents tab.
First-class renderer. dispatch_stalled (and its sibling dispatch_blocked_no_orchestrator) now render in the Agents tab as a "Dispatch alerts" feed (icon + label + warning tone + reason chip + relative time + message), reusing the journal/StatusBadge styling. A transient "Dispatch stalling — retrying" banner sits above the roster while a stall is active, mirroring the "Machine offline" banner. New AlertTriangleIcon; coordinationPresentation.ts owns the kind→label/tone/icon map and the activeStallEvent signal (recent stall with no successful claim after it; bounded by a 2-minute recency window because the engine throttles re-emits and emits no explicit "cleared" event — the durable feed entry remains either way).
Tests. test_coordination_events_route.py (ordering, parsed payload, project scoping, limit, authz); web coordinationPresentation.test.ts (alert filter, presentation map, stall-reason labels, activeStallEvent recency/recovery cases) and three new AgentsTabBody cases (banner + feed shown on a recent stall, banner hidden but feed kept after a later claim, nothing rendered when empty). Engine ruff/pyright and web tsc/eslint clean.
Symptom. A deterministic Handyman would create a queued agent session (the dispatch_handyman_assigned event fired) but the card never moved — the scheduler claimed nothing and surfaced nothing. Root cause was a sqlcipher3 ... database is locked raised in the claim pass (_claim_next → _ensure_default_execution_target) and swallowed by tick_once's per-project except, so the board looked identical to "idle". Aggravated by uvicorn --reload churn and the renderer's continuous polling contending for the single global writer lock.
Read-first machine ensure (store/project_store.py::_ensure_default_execution_target). The synthetic machines row lives in the GLOBAL database and was UPSERTed on every claim attempt, taking the global writer lock each tick. It now reads the row first and writes only when missing or a tracked field (kind/display_name/host/status/capabilities_json/metadata_json) actually changed. Steady state is a pure read. Write semantics on change are unchanged.
Bounded claim retry (scheduler.py). _claim_next_with_retry retries a claim a bounded number of times (SKYBOLT_AGENT_DB_LOCKED_RETRIES, default 4; SKYBOLT_AGENT_DB_LOCKED_BACKOFF_SECONDS, default 0.05) on a transient database is locked (detected by message via _is_db_locked_error, so it catches both plain sqlite3 and SQLCipher's unrelated OperationalError class). Non-lock errors re-raise immediately; immediate_transaction rolls back so a retry never leaves a partial lease.
Surfaced stalls (new dispatch_stalled coordination event). When a claim or maintenance pass ultimately fails, _record_pass_failure logs it AND emits one throttled dispatch_stalled coordination event (payload.reason ∈ {database_locked, error}, payload.phase), best-effort and cleared on the next clean claim pass — mirroring dispatch_blocked_no_orchestrator. The board can now explain why nothing is moving instead of going silent.
Logged global-registry failures (scheduler._project_ids). Previously caught sqlite3.Error and returned [] silently — a locked/unreadable encrypted global DB made the scheduler service zero projects with no trace. Now broadened to except Exception (SQLCipher errors are not sqlite3.Error subclasses) and logged.
Tests. tests/test_scheduler.py adds: read-first machines write, locked-claim retry + dispatch_stalled surfacing + throttle, non-lock error fails fast but still surfaces, and _project_ids logging on global-DB error. Full scheduler/dispatcher/orchestrator suites pass; ruff and pyright clean on the edited source.
Follow-up. ~~The renderer should render the dispatch_stalled event kind (and consider a roster status hint) so the stall is visible without reading the raw coordination-event stream~~ — done (see the renderer entry above). Still open: the identical silent-swallow in dispatcher.py::_project_ids (catch-up sweep) could get the same logging treatment.
Recorded the M1–M5 work below (hosted-model chats gaining write/git/command tools, gated to the existing approval model) as ADR-0059: Hosted-Model Agent Write/Git/Command Tools (documents/decisions/adr-0059-hosted-model-agent-write-git-command-tools.md). The ADR captures the rationale that earlier passes implemented: the "write"-not-"edit" file_policy action (the .git/.skybolt protected-infra deny is gated on "write"), the can_write capability carried on the chat attach token, reusing the existing approval_requests flow instead of a second approval path, the enriched tool-frame + role="tool" persistence contract, the per-model supports_tools tri-state, the MAX_TOOL_ROUNDS = 6 bound, and the live-working-tree isolation caveat versus the orchestrator's per-card worktrees. References ADR-0057 (read-only predecessor), ADR-0039 (IDE write path + secret boundary), and the git-operations approval gates. Docs-only; no code changed.
WS tool-frame enrichment (M4). The {type:"tool"} frame and the AgentEventSink.tool(...) protocol gained kind ("read" | "write" | "git" | "command", derived in the loop from a name→kind map TOOL_KIND_BY_NAME, so the read file-tools path needs no change), plus, on the end frame for an approval-pending action, approval_request_id (the approval_requests.id the inline approval affordance acts on) and optional title / risk_level / summary, alongside the existing name / path / phase / ok and a monotonic seq. The new fields are additive and default to None, so ADR-0057-era clients ignore them.
Structured tool results. agent_chat.py added a ToolResult dataclass (text, ok, approval_request_id?, title?, risk_level?, summary?). run_agent_chat's executor may now return EITHER a plain str (read file tools + planning sessions — kind inferred from the name, ok from the Error/Access denied/Approval required prefix) OR a ToolResult (action tools). execute_agent_file_tool's plain-string contract is unchanged, so planning is unaffected. services/agent_action_tools.py::execute_agent_action_tool now returns a ToolResult; the git / run_command approval paths carry the approval_request_id + title + risk_level.
Tool persistence (M4). A reloaded thread no longer loses tool activity. Each completed tool step is persisted as a DISPLAY-ONLY role = "tool" row in chat_messages, with the structured detail in the existing metadata_json column (NO schema change). _persist_chat_message gained a metadata param; _RegistrySink persists the step on the end phase; _chat_message_out now returns a metadata dict. Tool rows stay out of _CHAT_ROLES, so _sanitize_chat_history never replays them to the model.
Per-model "supports tools" toggle (M5). Catalog models can now store an explicit tool-support override. CatalogModelCreateRequest / CatalogModelUpdateRequest gained supports_tools: bool | None; routes/models.py writes it into capabilities_json["tools"] (the key model_supports_tools already reads), so it flows through resolve_model → resolved.capabilities → the tool-eligibility decision. _catalog_model_out promotes it to a top-level supports_tools field (null = unset/provider-kind default, true/false = explicit override).
Validation: full pytest tests/ green (610 passed); ruff + pyright clean on changed files.
Backend (Execution Engine). Hosted-model chats (the WS path in routes/chat.py driven by services/agent_chat.py::run_agent_chat) previously had only the read-only file tools (ADR-0057). They now also get write_file, delete_path, rename_path, git_action, and run_command, in a new services/agent_action_tools.py that mirrors agent_file_tools.py (stateless, never raises, opens its own short-lived project connection). Works on Local and SSH targets.
Authorization (M1). chat_attach.py::ChatAttachBroker tokens now carry user_id and a can_write capability bit; create_attach_token gained user_id/can_write params and a new attach_context(...) reader. routes/chat.py::attach_chat_thread computes can_write = _project_can_run(role) and bakes it into the token (the WS is token-only, no cookie); chat_ws reads it back and threads user_id/can_write through _prepare_chat_run → _run_generation → a bound executor closure → run_agent_chat. Viewers mint a read-only token and only ever get the read-only file tools.
File safety. Every write/delete/rename routes through file_policy.evaluate_file_access(..., "write"), so .env, *.key, *.pem, secrets/*, .git, .skybolt stay denied even with can_write. The local write/create/rename/delete bodies were extracted from routes/workspace.py into shared workspace_ops.py helpers (_local_write_workspace_file, _local_create_workspace_entry, _local_rename_workspace_entry, _local_delete_workspace_entry) so the route and the chat executor share one implementation (no duplication).
Approval gating (M3). Reuses the existing machinery — no second approval path. High-authority git (HIGH_AUTHORITY_GIT_ACTIONS), protected-branch writes (_protected_branch_gate), destructive git (reset / clean / abort / force_push / branch_rename / stash_pop / tag_create), and every run_command create the same approval_requests row (+ a git_operations row awaiting_approval for subject_type="git_operation", or a command_runs row for subject_type="command_run") that routes/git.py and the command-run route create, attributed via requested_by_user_id with origin:"chat_agent", and return "Approval required — request <id> is pending; ..." WITHOUT executing. The existing routes/approvals.py resolve flow runs them on approve. Low-authority git reads (status / log / branch-create / stash / fetch / diff) and file writes run inline. run_command requires a saved command-profile id (no arbitrary shell).
Prompt. agent_chat.py::build_system_prompt gained action_tools_enabled to describe the write/git/command tools and the approval behavior only when those tools are enabled.
Validation: ruff + pyright clean on changed files; full engine pytest suite 610 passed.
Click the owner icon to flip human/agent: the assignee glyph on each Kanban card (apps/web/src/app/project/tabs/board/ProjectBoardCard.tsx) is now a button that toggles assignee_type directly via updateCard, reusing the existing handler (renamed toggleReviewOwner → toggleCardAssignee, busy key card-review-owner: → card-assignee:, and the card props onReviewOwnerToggle/reviewOwnerBusy → onAssigneeToggle/assigneeBusy). The review-only "Send to human/agent" button uses the same handler. When the user lacks edit rights the glyph renders as a static, non-interactive badge.
Card delete is a plain confirm: ConfirmDeleteDialog (apps/web/src/components/ConfirmDeleteDialog.tsx) gains a requireNameMatch prop (default true). The board card delete in ProjectSurface.tsx passes requireNameMatch={false}, so it shows the card preview/impact and enables Delete immediately — no type-the-name step. It is still a themed modal (no native confirm); other deletes keep type-to-confirm.
Card Detail modal is now tabbed: CardDetailModal.tsx splits into a Details tab (the full contract edit form, now laid out across 2–3 responsive columns) and an Agents tab (instructions composer + prior notes + live activity/journal). The modal widened (~72vw, fixed-height flex column) to host the columns and a role="tablist" header; the active tab resets to Details when a different card opens.
Tests: ProjectBoardCard.test.tsx (icon click toggles to the opposite owner; static badge when not editable), ConfirmDeleteDialog.test.tsx (requireNameMatch={false} enables Delete with no field + surfaces errors), and CardDetailModal.test.tsx updated to switch to the Agents tab before asserting on instructions/notes/activity.
Layout: the Active Agents panel (apps/web/src/app/project/tabs/agents/AgentsTabBody.tsx) now renders one roster card per line (grid grid-cols-1) instead of the prior two-up grid at the 2xl breakpoint — each agent gets the full column width.
Recent work / notes collapse to icons: AgentRosterCard (apps/web/src/app/project/tabs/agents/AgentRosterCard.tsx) replaces the inline "Recent work" and "Notes" lists with two compact icon buttons (briefcase + speech-bubble, each with a count). Clicking opens a per-kind detail Modal: the work modal lists each finalized session (goal, lane, when, an Open session link), and the notes modal lists each authored note (type chip, title, body). Both modals expose a View card link that calls the existing onOpenCard, which switches to the Task Board and opens that card's detail modal.
Prop change: the card now takes cardById (full WorkCard lookup, for resolving the card links) in place of cardTitleById, plus the onOpenCard callback threaded down from AgentsTabBody. Two new shared glyphs BriefcaseIcon/MessageSquareIcon live in apps/web/src/components/icons.tsx.
Tests: AgentRosterCard.test.tsx (work icon → modal → Open session/View card; notes icon → modal), AgentsTabBody.test.tsx (notes open from the card icon), and the SkyboltApp.test.tsx roster flow updated to find the Recent work icon button by accessible name.

Changelog — minimax-integration

mmx provider + Seats panel: new mmx value in the provider enum, a thin wrapper over mmx auth login / mmx auth status / mmx quota, a Seats panel card for the auth flow, and cli_resolution extended so a GUI-launched engine resolves mmx on PATH the same way it resolves claude / codex.
Token / cost tracker: a small USD-per-Mtok table (model_costs.py) covers Claude, GPT, o1/o3/o4, and mmx families. The agent chat loop now surfaces the per-chunk usage object and persists prompt_tokens, completion_tokens, and cost_usd on every chat_runs row (v9 → v10 migration, additive). New GET /projects/{id}/chat-runs/usage returns today / this week / this month / all-time rollups and a 30-day per-day series. The Usage tab in the agent session detail panel renders the rollup.
minimax-output/ watcher: new GET /projects/{id}/mmx/output lists every file under the project's mmx output directory; the new MmxAssetsPanel polls it every 30 seconds and surfaces generated media in the Assets tab. The new GET /projects/{id}/mmx/output/content/{path:path} streams the raw bytes for an in-IDE blob: preview.
Tests: 5 new engine test files (test_model_costs.py, test_mmx_cli.py, test_chat_run_usage.py, test_mmx_output.py, scheduler patch) plus web tests for MmxAssetsPanel and MmxSeatCard. All pass.
Feature docs: documents/features/minimax-integration/ (overview, technical-design, test-plan, changelog).
API key handling: stored via the existing keyring-backed secret store. The key never leaves the engine process — mmx invocations pass the key via env var, not argv.
Mmx-preset terminal: the New Terminal modal gets an "Agent Terminal" tab (see agent-terminal feature doc) that defaults to a seat bound to the active mmx provider when one is connected.

Cloud Control Plane —

apps/api is the production skybolt.ai control plane. The "migration-era / extraction source / optional future" language in README.md and AGENTS.md was wrong and has been replaced.
Added apps/api to the pnpm workspace (pnpm-workspace.yaml) and gave it a marker apps/api/package.json (@skybolt/api) so the root scripts can wire it up the same way as the other workspace targets.
New root scripts:
- pnpm api:dev → python -m app (host-mode API)
- pnpm api:lint → ruff check .
- pnpm api:typecheck → pyright
- pnpm api:test → pytest -q
CI (.github/workflows/ci.yml → api job) was already first-class (ruff + pyright + pytest against SQLite on every push/PR); the inline comment was updated to make it explicit that this is the production control-plane job, not a "legacy" job.
Created this cloud-control-plane feature doc package (overview.md / technical-design.md / test-plan.md / changelog.md) so future agents have a single landing page that describes the service end-to-end, points at the ADRs that govern it (ADR-0042 identity, ADR-0043 deploy, ADR-0052 catalog, ADR-0054/0055 zero-knowledge sync), and is honest that documents/features/control-plane/ is the legacy PWA/runner control plane and not the active product path.
Both scripts/setup.ps1 and scripts/setup.sh already mention apps/api and provision its venv by default (scripts/setup.sh: SETUP_OPTIONAL_API=1 opts in; scripts/setup.ps1: -SkipApi opts out). No changes were required to the setup scripts — they were already correct.

2026-06-12

Account & Project Sync —

Bug: the per-project content signature (below) statted the -wal/-shm siblings with mtimes, but SQLite creates and deletes those files around every read connection — and the engine's runtime polls open read connections constantly. The 8s change-check racing a poll saw the transient files as an edit and re-sealed + re-uploaded an untouched project, producing steady "sync every ~30s with no changes" server traffic (and pointless generation bumps).
Fix: _project_content_signature is now main-file size+mtime plus the WAL's normalized frame size only — a header-only (≤ _WAL_HEADER_BYTES = 32) or missing WAL counts as "no buffered writes", the WAL mtime is ignored, and -shm is ignored entirely. Pure reads no longer change the signature; every committed write still does (WAL grows past its header, or the close-time checkpoint rewrites the main file).
Tests: new test_project_signature_stable_across_read_connections — a raw read connection creating/deleting -wal/-shm leaves the signature identical and sync_changed_once reports unchanged. Full engine suite: 532 passed.
The per-project blob-sync background loop now pushes on change instead of on a 30-min timer. Because a seal always bumps the generation, a cheap per-project content signature (size + mtime of the project DB and its WAL/SHM siblings — _project_content_signature) is checked every CHANGE_CHECK_INTERVAL_SECONDS (~8s); a project is sealed + pushed only when that signature changed. So creating a project or editing one reaches the cloud within seconds, and idle projects are never re-sealed. The post-seal signature is baselined in sync_cursors (scope project_sig:<project_id>) on a successful push — or on a persistent rejection (quota/oversize) so the loop stops re-trying identical content; transient failures (offline/conflict) retry next tick. New push_if_changed / sync_changed_once drive the loop (sync_once kept for the existing manual/test paths).
New projects already auto-sync when account-settings sync is on: read_project_enabled defaults ON absent an explicit per-project toggle (ADR-0055), so a newly created project starts syncing within seconds with no extra step. Over-quota projects surface the quota push status (graceful — no crash, no spam); a dedicated "limit reached" badge in the Sync hub is a documented follow-up.
Tests: test_project_blob_sync.py — a change-detection pass pushes on first run, no-ops when unchanged, and pushes again after an edit (add a card). 144 engine tests pass.
The "Server-readable board metadata (legacy)" disclosure and the CloudSyncPanel component (+ its test) were removed from the unified Sync hub; the four api.ts clients runProjectCloudSync, getProjectCloudSyncStatus, getProjectCloudSyncPrefs, putProjectCloudSyncPrefs (and their types) were deleted. The hub now presents only the two zero-knowledge channels (ADR-0054 account settings + ADR-0055 project blobs).
Engine/API side of the removal is recorded in documents/features/cloud-sync/changelog.md (the retired metadata-mirror feature). Net result: skybolt.ai stores no server-readable project metadata; every remaining sync channel is zero-knowledge.

Account Settings Sync —

Sync direction now sets the cadence. Pushes stay near-immediate: the loop checks the local content signature every CHANGE_CHECK_INTERVAL_SECONDS (~8s) and uploads only when content actually changed — the active machine is the source of truth and its edits land in the cloud within seconds. Pulls no longer poll: PULL_INTERVAL_SECONDS (~60s, below) is gone; remote changes arrive only at engine startup (_pull_if_enabled, once, ~5 s after boot), on web app load (StartupSyncIndicator → sync-now), and via the manual "Sync now"/restore buttons on the Sync tab. An idle engine makes zero settings-sync network calls.
Removed sync_once (the old loop's unconditional pull+push cycle — it bumped the generation even when nothing changed); renamed _loop_pull_if_enabled → _pull_if_enabled.
Also fixed the other source of idle traffic, in project blob sync: the per-project stat signature included the -wal/-shm siblings with mtimes, but SQLite creates and deletes those around every *read* connection — so the runtime polls' reads raced the 8s change-check and re-uploaded untouched projects ("syncing every ~30s with no changes"). The signature is now main-file size+mtime plus normalized WAL frame size (header-only/missing ⇒ 0, mtime ignored, -shm ignored) — stable across reads, still catches every committed write (_project_content_signature, _WAL_HEADER_BYTES).
Tests: test_startup_pass_disabled_is_quiet_noop (both halves quiet when the toggle is off); new test_project_signature_stable_across_read_connections (a read connection creating and deleting -wal/-shm is not a change and triggers no re-upload). Full engine suite: 532 passed.
Bug: the pre-seal redaction matcher (security/redaction.py::_is_secret_key) matched the secret fragments token/secret as bare substrings, so a benign model-capability key like max_tokens (or output_tokens, token_limit, …) inside an expanded capabilities_json/metadata_json column was treated as a secret and "redacted". That altered the scan view, tripped _redaction_safe, and silently blocked every settings-sync for the account — surfaced only as the opaque log line *"settings-sync redaction altered the outbound account bundle; refusing to ship"*.
Fix 1 — matcher: _is_secret_key now short-circuits to *not-secret* for token COUNT/LIMIT/USAGE field names via BENIGN_KEY_PATTERNS (the plural tokens count word and token_(count|limit|usage| budget)), mirroring the existing _ref/REFERENCE_KEYS exclusion. Singular credential keys (access_token, refresh_token, session_token, client_secret, …) and all secret-shaped *values* still redact — detection strength is unchanged.
Fix 2 — diagnosability: _redaction_safe now reports the offending key paths (e.g. models[0].capabilities_json.max_tokens) in both the warning log and the AccountSettingsSyncRedactionError message, via a new _redaction_diff_paths helper. Key names only are surfaced — the secret value is never logged. Still fails closed.
Tests: new apps/engine/tests/test_redaction.py (benign token-count keys pass, real credential keys + secret values redact); test_account_settings_sync.py +2 (max_tokens capability no longer trips build_bundle; a real leak's error names the key path but not the value). Affected suites: 24 passed.
The settings-sync background loop now pushes on change instead of on a 12h timer. A cheap local content signature (_content_signature — hash of the gated sections + selection + resolved key values) is checked every CHANGE_CHECK_INTERVAL_SECONDS (~8s); it pushes only when the signature changed, so an edit (provider/persona/setting/pref/matrix/key) reaches the cloud within seconds with no network when idle. Remote changes are pulled every PULL_INTERVAL_SECONDS (~60s) + on app open. The signature is baselined after every push AND pull (persisted in sync_cursors scope settings_sig:<account_id>), so a machine never re-pushes what it just pulled (no peer ping-pong), and a failed push retries on the next tick. New push_if_changed / _loop_pull_if_enabled; the loop (_run) interleaves a fast change-check with a slower pull. build_bundle refactored to share _serialize_sections with the signature.
Tests: test_account_settings_sync.py — push only on change (add provider → pushes; no change → unchanged; disabled → never), and pull baselines the signature so there's no immediate re-push. 143 engine tests pass.
Scrollbar gap: the account-settings modal's scroll area gained pr-3 so the scrollbar no longer sits flush against the panels (AccountSettingsModal.tsx).
Equal-height account row: the skybolt.ai account and recovery-code panels now match heights — the grid is lg:items-stretch and both panels' <section>/<Card> fill the cell (flex h-full flex-col + flex-1).
Projects gated on settings sync: the Projects and Restore sections are now hidden when "Sync settings across machines" is off (project sync depends on the account syncing). The settings-sync enabled state is lifted into useSyncHub (settingsEnabled + markSettingsEnabled); AccountSettingsSyncSection reports it via a new onEnabledChange callback so toggling the switch shows/hides the project sections live. Tests updated: SyncHubBody.test.tsx runs with sync on (so project rows render) plus a new case asserting the sections hide when it's off.
Sync hub layout (renderer): in the account-settings Sync tab the skybolt.ai account panel and the recovery-code panel now sit side by side (top row), with the settings-sync panel full width below; Projects and Restore sit side by side. Both grids stack below lg (where the modal narrows). The thin AccountSyncCard wrapper was removed (SyncHubBody.tsx now composes the sections directly).
"What syncs" matrix (new, account-wide): under the "Sync settings across machines" toggle, an enabled machine now shows a per-category matrix — Providers & models, Custom personas, Preferences, SSH machines, Project list, API keys — with All on / All off and per-item checkboxes. The standalone API keys card at the bottom is gone; API keys is now a matrix row, with the existing all / pick-per-key scope tucked behind an Advanced disclosure on that row (SecretsScopeCard gained an embedded mode).
Account-wide semantics (engine): the selection is account-wide, not per-machine: it lives in account_settings under a reserved __sync_selection category and always travels in the bundle (attached after the redaction scan — its api_keys key name would otherwise false-trip the secret-key rule), so every machine converges on the same choice and no machine can silently drop a category from the shared blob. build_bundle gates each section by the selection (off → emitted empty / provider_keys omitted); apply is additive, so turning a category off stops it travelling but never deletes a machine's existing local rows. Defaults all-on (existing accounts unchanged). New engine routes GET/PUT /account/settings-sync/selection (PUT reseals + pushes when sync is on); api-client getSyncSelection/setSyncSelection + SYNC_CATEGORY_META.
Tests: engine test_account_settings_sync.py — defaults all-on, build_bundle gates sections, and the selection converges through seal→apply (off SSH machine never lands on the peer). Renderer AccountSettingsSyncSection.test.tsx (matrix shows when enabled, toggling a category calls setSyncSelection, hidden when off) and SyncHubBody.test.tsx (secrets scope now reached via the API-keys Advanced disclosure). Full suites green: engine 131, web 47.
New: when the app opens on a cloud-linked machine, it now pulls the latest config from the cloud immediately (instead of waiting for the 60s-delayed background pass / 12h loop) and shows a bottom-center status toast: *"Syncing from cloud…"* → *"Up to date"* (auto-dismiss) or *"Working offline — couldn't reach skybolt.ai"* (dismissible). New component apps/web/src/app/StartupSyncIndicator.tsx, mounted in SkyboltApp.tsx for the authenticated view; on an applied pull it triggers a lightweight me reload so restored providers/settings show at once.
Pull whenever cloud-linked (owner decision): the on-open pull runs regardless of the per-machine auto-sync toggle (the engine sync_now pulls always, pushes only when enabled), so a second machine reflects the latest even if it never opted into background sync. A machine with sync off still *receives* cloud updates on open but never *uploads*.
Timeout: the pull is bounded by a 10s client-side AbortSignal.timeout (the engine HTTP client caps at 15s), so no connectivity / a down site flips the toast to the offline state instead of hanging; the app is never blocked on it. api.ts::syncSettingsNow gained an optional signal arg.
Tests: StartupSyncIndicator.test.tsx — not-linked renders nothing + never syncs; applied pull shows "Up to date" and reloads me; engine failure/timeout and an offline pull both show "Working offline". (4 tests; auth + indicator suites: 36 passed.)
The pre-seal redaction scan now runs over a view with embedded JSON strings expanded (account_settings_sync._expand_json_strings), so a secret hidden under a secret-named key *inside* a *_json/value_json column (e.g. metadata_json = '{"client_secret": "..."}') is caught by the key-name rule too — not just secrets matching a value regex at the top level. The bundle still ships with the strings intact; only the scan view is expanded. Fails closed as before.
Tests: test_account_settings_sync.py — a secret nested in metadata_json whose value matches no value pattern now trips AccountSettingsSyncRedactionError.
Fixed the orchestration gap where, after /auth/cloud/restore (sign in + recovery code on a new machine), nothing pulled the settings bundle: the user landed in an empty IDE with no providers/models/personas/API keys/project manifest, and the obvious "Restore settings from cloud" button silently no-op'd because the enable toggle was OFF by default and sync-now was gated on it.
Engine service services/account_settings_sync.py:
- New restore_now(settings) — enables settings-sync and pulls the cloud bundle (pull is NOT gated on the toggle), so a fresh machine restores immediately. Called by cloud_restore.
- New sync_now(settings) — the explicit, user-triggered path: always pulls (so a restore lands even before the toggle is on) and pushes only when enabled. sync_once stays the toggle-gated background loop.
Engine routes routes/account_settings_sync.py:
- POST /account/settings-sync/sync-now now calls sync_now (pull-regardless), not sync_once.
- POST /account/settings-sync/enable now pulls before pushing on enable, so a restoring machine with an emptier local config can't clobber a newer cloud bundle (it restores it instead).
Engine route routes/cloud_auth.py: cloud_restore calls restore_now after the account key is unlocked + the cloud identity is linked, and rebuilds me afterward so the response reflects the restored config. Adds settings_restored: bool to the response. Best-effort + offline-tolerant — a failure defers to the background loop and never blocks reaching the IDE.
Renderer apps/web/src/app/auth/AuthScreen.tsx: master-password strength helper text under the Password field in setup mode (see cloud-identity changelog for the policy).
Tests: test_account_settings_sync.py +2 (sync_now pulls when disabled; restore_now enables and pulls); test_cloud_auth.py doubles the settings-sync endpoint so restore stays network-free.

Agent Persona Catalog —

Canonical source is apps/api/app/data/builtin_personas.json; apps/api's BUILTIN_PERSONAS loads it (_load_builtin_personas). The engine's offline fallback apps/engine/skybolt_engine/personas_data.py is generated from it by scripts/generate-personas-data.py, with a CI personas job that fails on drift.
Startup reconcile: ensure_catalog_seeded (in control_plane.py, wired into main.py _lifespan, non-test) inserts only MISSING catalog keys from the JSON on boot — adding a built-in needs a deploy, not a migration; existing rows are never clobbered.
Staff admin API: GET/POST/PATCH /api/v1/admin/builtin-personas[/{key}] (routes/admin_personas.py), gated by _require_staff against the ADMIN_EMAILS allowlist. A PATCH bumps version → catalog ETag rotates → engines re-pull and re-seed. Editing an existing built-in needs no deploy/migration. The staff API still has no DELETE route; catalog absence now propagates local builtin-library pruning, per the 2026-06-22 entry above.
Tests in tests/control_plane/test_sync_catalog.py cover staff/non-staff gating, PATCH version+ETag rotation, create + duplicate-key 409, and the insert-missing reconcile preserving admin edits.

Agents Execution

A CLI seat can now pin a model, and that model rides along to the terminal it launches. The seat create/edit form (apps/web/src/app/project/tabs/settings/SeatFormModal.tsx) gains a Model dropdown populated from a code-owned curated list per CLI kind (apps/web/src/app/project/tabs/settings/seatModels.ts): Claude Code offers stable aliases (opus/sonnet/haiku/opusplan) and pinned ids (claude-opus-4-8, …); Codex offers its model slugs (gpt-5-codex, …). Leaving it on Default keeps the CLI's own default.
Storage: the model is persisted in seat_profiles.metadata_json.model (no schema change) and surfaced as a top-level model field on the seat API contract (_seat_profile_out). Create/update accept an optional model; a blank value clears it. The value is charset-validated (model_defaults.sanitize_seat_model) on the way in — it is later concatenated into a shell command, so anything outside the CLI-alias/id charset is rejected (422) rather than stored.
Launch wiring (two paths): for autonomous sessions, resolve_agent_seat_command (apps/engine/skybolt_engine/services/model_defaults.py) now looks up the seat named by the agent's default_seat.seat_id, reads its model, and appends --model <model> via the new seat_command_for_kind helper before storing seat_command on the session. For the interactive chat-with-seat terminal, localCliTerminalShell(target, model) (apps/web/src/app/project/tabs/terminals/terminalsPresentation.ts) appends the same flag, with the model looked up from the matching seat by kind in useChats/ProjectSurface.
Tests: tests/test_app.py::test_seat_profile_model_round_trips_and_validates (create/update/clear + 422 on unsafe values), tests/test_dispatcher.py::test_seat_model_appends_model_flag_to_command (model reaches the resolved seat_command) and ::test_seat_command_for_kind_appends_and_sanitizes_model (pure helper), plus terminalsPresentation.test.ts (client-side --model appending + charset guard).
Bug: the Project Command Overview's Live Agent Floor rendered one pod per *every* agent session in LiveAgentFloor (apps/web/src/app/project/overview/LiveAgentFloor.tsx) — including finalized (completed/failed) ones. Each card a self-driving agent finished left a stale pod, so a busy Handyman accumulated ghost copies of itself and, because the floor previously capped at 10 pods, filled every slot ("10 of the same agent"). An agent whose sessions were all finished never fell through to its single idle pod.
Fix: pod derivation moved out of the component into a pure, tested deriveAgentFloorPods(agentSessions, projectAgents) in overview/agentFloorModel.ts. It builds session pods only from non-finalized sessions (!isFinalizedAgentSession), then one idle pod per roster agent with no live session — so a finished agent shows once, as idle/standby, and an actively working agent shows once, as working.
Stale status vocabulary: isActiveAgentSession matched only the legacy ["queued","running"], so an M1 working session read as idle/sleeping on the floor and was missed by the overview's "active agents" badge. It now also matches working. The Project Board agent metrics in ProjectSurface.tsx had the same defect — "Active" missed working and "Done" matched done instead of M1's completed, leaving both stuck at 0; both now include the M1 names.
Tests: apps/web/src/app/project/overview/agentFloorModel.test.ts (finalized sessions never pile up, live session → working pod, finished agent → single idle pod).
The scheduler now seats agents on SSH Machines, not just local. Previously AgentScheduler._claim_next (apps/engine/skybolt_engine/scheduler.py) hard-returned for any non-local target, so a Handyman/agent session on an SSH project stayed queued forever. It now handles kind in ("local", "ssh"): for SSH it skips local-repo validation, stamps the claimed session and its resource leases with the target's real machine_id (ssh-machine, not the hardcoded local-machine), and carries kind/machine_id on the claim.
Remote worktrees — services/worktree_ops.py gains add_worktree_ssh / remove_worktree_ssh (+ ssh_worktree_path_for), which run the same git worktree add/remove --force/prune operations on the remote via _run_ssh_git. The remote worktree lives beside the remote repo at <repo>-worktrees/<session_id[:8]> (POSIX), or under git_settings_json.worktree_root when set. A failed remote git worktree add returns a failed status so _launch releases the session with the error instead of crashing the tick.
SSH seat launch reuses the interactive-terminal SSH path. New _launch_ssh mirrors _launch but builds the seat via services/ssh_ops.py::_ssh_seat_argv — an ssh -tt PTY whose remote command (a) cds into the remote worktree, (b) emits the seat env (SKYBOLT_COMPLETE_TOKEN, SKYBOLT_COMPLETE_URL, ids) as export K=Q statements with every value shlex.quote-d, and (c) execs the seat (claude/codex) through a login shell so the remote PATH is loaded. The argv adds a reverse tunnel -R {port}:127.0.0.1:{port} (and does not set ClearAllForwardings=yes) so the seat's existing SKYBOLT_COMPLETE_URL (http://127.0.0.1:{port}) reaches the engine over SSH unchanged. The local ssh process is opened with cwd=None / process_cwd=None (the remote cd does the work; a remote POSIX path would fail the local cwd existence check). The post-open bookkeeping is shared with the local path via _finish_seat_launch.
Completion over SSH flows through the same three signals as local: explicit POST .../complete (token, over the reverse tunnel), terminal_exited (the ssh seat PTY dies), and idle_timeout. _check_running_sessions and recover now reconcile sessions on the engine's owned machine ids ({local-machine, <project default target machine_id>} via _owned_machine_ids) so SSH seats get heartbeats and fallback completion; a foreign machine id remains the TTL reaper's job (unchanged).
Cleanup is target-aware and best-effort. _cleanup_worktree (recover/reaper) and finalize_agent_session call remove_worktree_ssh for SSH targets, guarded so a remote failure is logged/journaled but never crashes finalize. For SSH, finalize skips local-filesystem git status/HEAD inspection for v1 (the worktree is remote) so completion is never blocked on a remote round-trip; files_changed/commit_sha are recorded empty. The local path is unchanged.
Security — every value interpolated into a remote command is shlex.quote-d; env keys are validated as POSIX-shell-safe identifiers (_is_safe_env_key) before being exported, so a hostile key can't inject shell words; SSH host/user/port still flow through the existing SshMachineSpec validation. Tests: apps/engine/tests/test_scheduler.py (seat-argv builder for claude+codex, unsafe-key drop, SSH claim/launch with no local repo, local-path regression guard).
v1 limitations: remote diff inspection is deferred (no per-file change list / commit sha on SSH completions); the engine-local brain-context artifact path is still passed into the remote prompt (it points at an engine-local file the remote seat can't read — same as existing SSH chat); the remote worktree assumes the remote repo at target['project_path'] is a real git repo (a bad path surfaces as a release with the remote git error, not a crash).
Card Detail modal (apps/web/src/app/project/modals/CardDetailModal.tsx, wired in ProjectSurface.tsx) replaces the old basic EditCardModal (title/summary/assignee). A single click on a Kanban card now opens it (~50% wide × 75% tall). It edits the full card contract — title, summary, status, assignee, scope, plus the list fields (acceptance criteria, definition of done, quality gates, review/docs/test/handoff requirements, stop conditions) — through the existing PATCH /projects/{id}/cards/{cardId} (updateCard).
Human instructions to agents — new POST /projects/{project_id}/agent-notes (201) creates a human-authored agent_note (author_kind="human", note_type default instruction) scoped to a card (and optionally a target_agent_id). Route create_agent_note in routes/orchestrator.py, schema AgentNoteCreateRequest in schemas.py (card_id, body, title?, note_type?, target_agent_id?), service insert_agent_note in services/agent_notes.py. Gated on the project run capability (viewers 403'd). It is not an orchestration trigger (no wake event); like an orchestrator note, it is injected into the agent's prompt on its next session for that card — the engine cannot interrupt a running agent (ADR-0058). Client helper createAgentNote in apps/web/src/app/api.ts.
Agent activity in the modal — shows the card's most recent agent session (lease/heartbeat/current step/outcome) and its live journal, refreshed by polling (the app already polls the board ~5s; the modal polls the journal ~4s while open via listWorkJournal → GET /agent-sessions/{sid}/journal). No new WebSocket. The shared journal feed is extracted to apps/web/src/app/project/tabs/agents/SessionJournalFeed.tsx.
Security fix — card mutation routes are run-gated. PATCH and DELETE /projects/{id}/cards/{cardId} (update_card/delete_card in routes/cards.py) previously checked project membership but not the run capability, so a viewer could edit a card or flip its status to queue — firing a card_queued coordination event that wakes the orchestrator and spends model tokens. Both now return 403 "Project run access required" for members without run capability, matching create_card/list_cards and the agent-notes route. Regression coverage in apps/engine/tests/test_card_route_authz.py.
Sessions now originate from the orchestrator. Session creation moved to services/agent_sessions.py::create_agent_session_record(..., created_by user|orchestrator); the manual POST /projects/{id}/agent-sessions route delegates to it but returns 403 unless SKYBOLT_ALLOW_MANUAL_AGENT_SESSIONS is set (test/debug escape hatch only).
New orchestration runtime (skybolt_engine/orchestrator.py OrchestratorService + services/orchestrator_ops.py, registered via runtime_hooks): event-triggered off the scheduler_bus with a wake whitelist (card_queued, session_completed/session_failed, dependents_unblocked, agent_added/agent_archived/agent_restored, orchestrator_kick — orchestrator-emitted kinds are structurally ignored, no self-trigger loops), debounced (~3s quiet / 15s max holdoff), rate-capped (default 6 runs/min, throttled runs recorded), single-flight per project via an 'orchestrator' resource lease, with durable catch-up through a coordination_events rowid cursor in orchestrator_runs.consumed_through. One wake = bounded board snapshot → one non-streaming model call (resolve_model with the orchestrator project agent, so per-project/per-agent routing applies) → tolerant JSON parse (fenced blocks, first balanced object) + one corrective retry → Pydantic-validated decisions applied individually.
Decision vocabulary (the only effects model output can have): assign_card (queued session), create_prep_task (synthetic card with cards.origin='orchestrator' + ordering edges blocking dependents, then a session), add_dependency (self-edge + DFS cycle rejection), leave_note, defer. Per-decision verdicts (applied / rejected+reason) are audited in orchestrator_runs.decisions_json and as coordination_events kind='orchestrator_decision' (slim id-only payloads); a rejected decision never aborts the run.
Roster gate: AgentScheduler._claim_next dispatches nothing for a project without an active agent with role_key='chief_project_orchestrator'; queued sessions stay queued (backpressure) and one dispatch_blocked_no_orchestrator event per blocked stretch explains the stall.
Blackboard (per-project schema v8 → v9, additive: agent_notes, orchestrator_runs, cards.origin): notes carry note_type ∈ {instruction, sync_note, blocker, risk, handoff}, card_id (NULL = project-wide), target_agent_id (NULL = any agent), and author_kind ∈ {orchestrator, agent, human}. Written by orchestrator leave_note and by completing agents (AgentSessionCompleteRequest.notes; scope ids validated against the project or dropped). Matching active notes are injected into seat prompts (bounded 20 notes / 8k chars, orchestrator/human-authored prioritized, whitespace-collapsed to prevent label forgery) with injected ids recorded in prompt_package["notes"]. Lifecycle: card completion resolves card-scoped notes; agent archive resolves agent-targeted notes.
New routes (routes/orchestrator.py): GET /projects/{pid}/orchestrator/runs (limit ≤ 50), GET /projects/{pid}/agent-notes (?include_resolved), and POST /projects/{pid}/orchestrator/kick (202; requires a run-capability role).
Renderer (same slice, parallel agent in apps/web): the Agents tab becomes a capability roster — the start-session form and session kanban are removed; OrchestratorSpotlight shows the decision log from orchestrator runs (CTA "Add an Orchestrator to activate your agents" when absent); AgentRosterCard per specialist shows live activity/dormant states and authored notes; AgentSessionDetailPanel opens from roster cards only (apps/web/src/app/project/tabs/agents/).
Tests: new tests/test_orchestrator.py (21: roster gate, assign_card end-to-end through scheduler dispatch, prep-task dependency blocking, rejection verdicts, cycle rejection, malformed-output retry → failed run, fenced JSON, note injection scoping, complete-with-notes, forged-scope drop, label-forgery sanitization, lease single-flight, rate-cap throttling, debounce whitelist, catch-up sweep, kick + agent lifecycle events). Engine suite: 550 passed. HTTP-creating tests now set SKYBOLT_ALLOW_MANUAL_AGENT_SESSIONS=1 and seed an orchestrator on the roster.

Cloud Identity (skybolt.ai)

Closed the /kdf salt-length enumeration channel: kdf_salt is pinned to exactly 64 hex chars (schemas/identity_v1.py::KdfSaltField) — the length the engine always emits and the length of the deterministic fake for unknown emails — so a real user's /kdf response can never be distinguished from a fake by salt size. (The kdf_params value remains the ADR-0042-accepted account-existence disclosure: a real user must receive their actual params, which can't be faked.)
Per-account /refresh rate limit added alongside the per-IP one (keyed on the resolved user_id, applied only after the token resolves so it adds no enumeration signal).
Atomic single-use for email tokens: verify-email and password-reset/confirm now claim the token with UPDATE … WHERE used_at IS NULL + rowcount check (_claim_email_token), matching the /refresh pattern — two concurrent confirms with one token can no longer both succeed.
Cross-worker rate limiter (P2 #3): the identity limiter gained a Postgres-backed fixed-window backend (core/ratelimit.py::DbRateLimiter) so its ceilings hold across all workers/replicas instead of per-process. Each check atomically UPSERT-increments a shared rate_limit_counters row in its own committed transaction (so a request that later rolls back — e.g. a failed login — still counts), and a DB write failure fails open (logs + allows) so a database hiccup never becomes an auth outage. New table + additive migration 0023_rate_limit_counters (down_revision 0022_drop_metadata_sync); model app/db/models/ratelimit.py. enforce() is now async; selected via RATE_LIMIT_PERSISTENT (default true) and only active outside dev/test — dev/test keep the in-process SlidingWindowRateLimiter (and tests' rate_limiter.reset()), which now also gained a per-account /refresh limit.
Tests: test_identity_v1.py salt-length rejection + DbRateLimiter cross-session counter sharing; existing reset/verify reuse + login-429 tests preserved; the four api test suites' salt constants bumped to 64 hex. Validated against SQLite (full control-plane suite, 109 passed).
Not changed (deliberate): per-project sealed-artifact data-key rotation (GCM nonce-reuse bound, ~2³² theoretical) is left to its own ADR rather than altering the deliberate persistent-data-key design here.
Raised the Argon2id KDF defaults from 64 MiB / t=3 / p=1 to 256 MiB / t=3 / p=4 — the master password is the linchpin of the zero-knowledge design, and this Argon2id runs client-side only (skybolt.ai never executes it), so the higher cost is borne by the user's machine, not the server. security/cloud_kdf.py::DEFAULT_KDF_PARAMS and apps/api schemas/identity_v1.py::KdfParams defaults updated in lockstep (the wire contract of record). Existing accounts keep the params they signed up with (published per-user by the server); only new accounts get the heavier defaults.
Raised the param floor to 64 MiB (_PARAM_BOUNDS/KdfParams.ge) so a compromised or spoofed server can never downgrade a client below previously-shipped strength on login.
Pinned the server-side argon2 hasher params explicitly (apps/api/app/core/security.py: t=3 / 64 MiB / p=4 / 16-byte salt) instead of inheriting argon2-cffi library defaults, so a dependency upgrade can't silently weaken the at-rest hash of the auth_verifier.
Master-password strength gate (NIST SP 800-63B-aligned: length floor + blocklist, no composition rules). New security/password_policy.py::validate_master_password (min 12 chars, reject common weak values, reject low-variety runs) enforced in routes/cloud_auth.py::cloud_signup — the engine is authoritative (the raw password never reaches skybolt.ai), returning a friendly 422 before any derivation or cloud call. Renderer mirror apps/web/src/app/auth/passwordPolicy.ts drives inline feedback + a disabled submit in setup mode (AuthScreen.tsx); the engine remains the source of truth.
Tests: engine test_password_policy.py (new), test_cloud_auth.py (weak-password 422 before the cloud is touched; updated KDF-contract + light-param constants to the new floor); web passwordPolicy.test.ts (new) + AuthScreen.test.tsx (strength-gate gating); api test_identity_v1.py KDF constant tracks the new schema defaults.

Execution Engine

Extracted the streaming tool-calling loop into a single reusable component, services/agent_chat.py, so every place an agent reasons over a project shares one implementation (no more fixing the same logic in multiple routes). It owns model_supports_tools, build_system_prompt, read_brain_context_text, the SSE parse/assemble helpers, and run_agent_chat (the loop). Transport is abstracted behind an AgentEventSink protocol; routes/chat.py is now a thin layer (_WebSocketChatSink + _stream_chat_reply) that resolves the model, builds the prompt/brain, persists turns, and delegates the loop. run_agent_chat accepts a custom tool_schemas/execute_tool so the same loop can drive other tool families later. Note: there is only one engine-side hosted-model path today, and it already serves both chats and planning (planning uses the same chat WebSocket); agent *sessions* run CLI seats with their own shell access.
Fixed: hosted-model tools were never actually sent. model_client._chat_payload re-gated the tools param on model.capability("tools",...), which is always false for user-added catalog models (empty capabilities_json) — so OpenRouter never received the tools and the model imitated a tool call as TEXT JSON ({"name":"list_files","arguments":{...}}) in its content. _chat_payload now forwards tools/tool_choice whenever the caller includes them (the caller owns eligibility via _model_supports_tools + degradation); stream/response_format stay capability-gated. This was the real reason file tools never engaged on any provider, even tool-capable models.
Made skybolt.* loggers emit INFO to stdout (app._configure_engine_logging) — uvicorn only configures its own loggers, so engine logger.info(...) was being swallowed. Added a per-round tool-loop diagnostic and an explicit warning when a model narrates a tool call as text.
Gave hosted-model agents (chats and planning sessions on OpenRouter/Ollama/ vLLM/etc.) on-demand project file access (ADR-0057). New services/agent_file_tools.py exposes three OpenAI-style tools — list_files, read_file, search_files — that reuse the in-app IDE workspace helpers (Local + SSH). routes/chat.py now runs a single streaming agentic loop (_stream_with_optional_tools): text streams to the client token-by-token while the model can interleave file-read rounds. Every access is filtered through file_policy.evaluate_file_access, so .env, *.key, *.pem, secrets/*, .git, and .skybolt are never listed, read, or searchable — the secret exclusion is enforced, not advisory.
Tools default ON for OpenAI-compatible provider kinds (openrouter, openai_compatible, litellm, ollama, vllm, anthropic_api, local) via _model_supports_tools, because user-added catalog models always have empty capabilities_json — gating on an explicit flag left every hosted model (incl. OpenRouter) with no file access. An explicit capability flag overrides the default; a provider that rejects the tools param is retried once without tools.
Hosted chats now prepend a system message with the project identity, tool guidance, and the project brain context (built via _build_brain_context_package). CLI-agent .env handling is unchanged (deferred).
Visibility + robustness pass. Each file read now emits a {type:"tool", name, path, phase:"start"|"end", ok} WebSocket frame, and the Chats panel (useChats + ChatsTabBody) renders a live "Reading/Listing/Searching …" row with a spinner → ✓/✕ so a tool-using reply no longer looks like it stalled. The tools now also accept an absolute path under the project root (rebased to relative) and treat the project root / . / empty as the whole repo, so a model that passes G:\…\repo\… no longer errors. The system prompt instructs the model to actually CALL the tools (not narrate) and that paths are repo-relative.
list_files now reuses the Files page's flat-index cache (_read/_store_workspace_index_cache, same project/target/max_entries key and TTL), so repeated tool calls are instant and share a warm cache with the IDE in both directions. Read-only, so no invalidation is added; existing writes already drop the cache. read_file/search_files stay uncached (single-file reads and ripgrep are already fast and avoid staleness).
All three tools work on Local and SSH Machines (they branch on target["kind"] and reuse the same workspace/_ssh_* helpers as the Files tab); the index cache, .env/secret denial, and absolute-path rebasing apply on both. SSH absolute paths are rebased against the configured remote root for parity with local.

Project Import —

Bug: creating a project with a Local path connection set only git_settings.repo_path and never configured the project's execution target, so the very next step (the repo scan) and all later file/terminal operations failed with the engine 422 Set up a Machine with a project folder before using project files (raised by workspace_ops._workspace_root when the target project_path is empty/.). The SSH branch already configured its target; the local branch did not.
Fix: ProjectImportWizard.submitIdentityAndConnect now, for a new project with a non-empty path that is not a Git URL, persists the Machine settings via updateProject — local sets isolation_policy.machine.local.project_path (+ terminal_startup_path), SSH keeps its existing remote settings. The engine PATCH /projects/{id} path runs _ensure_local_project_directory (mkdir parents=True, exist_ok=True), so an existing browsed folder is a no-op and a typed-but-missing folder is created. The scan then resolves the target root.
Git URL connections are intentionally excluded (no local folder to map yet) — unchanged behavior; mapping a cloned URL to a Machine remains future work.
Added a one-line hint under the local path field explaining that the folder becomes the project's Local Machine folder, so the wizard reads as setting the Machine up rather than silently requiring one.
Tests: new ProjectImportWizard.test.tsx case asserts the local create issues the updateProject with environment_kind: "local" and the chosen project_path; the two existing local-path cases now mock updateProject since the local flow performs the extra call.

2026-06-11

Account & Project Sync —

Cloud API: new project_blob table + /api/v1/project-sync (apps/api/app/api/routes/ project_sync.py): GET manifest, PUT/GET/DELETE by id, storing opaque ciphertext only (no name column, no schema knowledge of contents). Model app/db/models/project_sync.py (ProjectBlob), schemas app/schemas/project_sync.py, migration alembic/versions/0021_project_blob.py (down_revision 0020_account_settings_blob). Config project_sync_max_bytes (64 MiB) + quota knobs (project_sync_free_max_projects=3, project_sync_free_max_total_bytes=256 MiB; premium 100 / 10 GiB). 409 stale-generation, 413 oversize, 402 quota. Tests tests/control_plane/test_project_sync.py.
Engine: new skybolt_engine/services/project_blob_sync.py (push/restore, runtime-hooks loop, ProjectBlobSyncClient) and control routes routes/project_blob_sync.py. The ADR-0054 account bundle (services/account_settings_sync.py) gains the synced_projects manifest section, an encrypted provider_keys section, the project_sync_secrets_mode pref, and a recovery-backed-up flag. New shared store/project_store.py::_upsert_project_registry_row lets restore register a project with no repo on disk. Reuses sync_ops _seal_project/_unseal_project/ _build_portable_vault/_install_vault_secrets and _ensure_default_execution_target. Tests tests/test_project_blob_sync.py.
Renderer: unified Sync hub in apps/web/src/app/project/modals/sync/ (useSyncHub.ts, SyncHubBody.tsx, AccountSyncCard, ProjectSyncRow, ProjectSyncAdvanced, RestoreFromCloudCard, RecoveryCodeCard, SecretsScopeCard) behind a new "Sync" tab in AccountSettingsModal.tsx. New api.ts clients: getAccountProjectSyncOverview, restoreCloudProject, setProjectSyncEnabled, get/setSecretsSyncScope, getRecoveryCodeStatus, revealRecoveryCode. CloudSyncPanel demoted to an off-by-default legacy disclosure; SyncResumePanel reused behind a per-row Details disclosure; DevOps Sync mounts removed; LiveAgentFloor route_mode fallback removed.
Behavior: automatic background push of enabled projects; explicit one-click restore of cloud-only projects (no auto-download); shared per-project generation counter across the skybolt.ai blob and the ADR-0041 git-remote sealed path; secrets always encrypted with an all vs per_key scope (machine-scoped creds never travel); password-gated recovery-code reveal; restore reconciles machine-local execution targets.
Boundary shift: ADR-0055 amends ADR-0044 §4 — project content MAY now live on skybolt.ai, but only as ciphertext under the account key. It extends ADR-0054.

Account Settings Sync —

The three previously-separate renderer sync surfaces collapse into one Sync hub, a new "Sync" tab in AccountSettingsModal.tsx (after "General"). Sections top→bottom: a plain-language explainer (three channels, all end-to-end encrypted, recovery code is the only key) → Account (settings sync + cloud account + recovery-code backup) → Projects (one row per project) → Restore (cloud-only projects) → API keys (secret-portability scope).
New module apps/web/src/app/project/modals/sync/:
- useSyncHub.ts — all hub data + mutations, with a per-project busy map so one syncing project never disables the others.
- SyncHubBody.tsx (+ colocated SyncHubBody.test.tsx) — layout + state→section mapping.
- AccountSyncCard.tsx folds CloudAccountSection + AccountSettingsSyncSection (moved, not rewritten; their tests stay green) and demotes the ADR-0044 readable board-metadata mirror (CloudSyncPanel) behind an off-by-default "server-readable (legacy)" disclosure — kept, not deleted, because ADR-0055 retains the /api/v1/sync mirror for future collaboration (ADR-0045).
- ProjectSyncRow.tsx — one badge + one primary action per server state (synced→"Sync now", not_set_up→"Turn on sync", available_to_restore→"Restore to this machine", conflict→themed "Resolve…" modal, remote_newer→"Apply latest").
- ProjectSyncAdvanced.tsx — per-row "Details" disclosure that reuses the validated SyncResumePanel (conflict resolve, git-remote mirror push, portable-secret list, git-crypt legacy unlock/convert/install) rather than duplicating its logic.
- RestoreFromCloudCard.tsx, RecoveryCodeCard.tsx (password-gated reveal, 403/423 handling), SecretsScopeCard.tsx (segmented "Sync all (encrypted)" vs "Pick per key"; per-key list reuses listProjectSecrets/setProjectSecretPortable; SSH keys shown machine-scoped; no insecure option).
New api.ts clients (snake_case, session-gated): getAccountProjectSyncOverview(), restoreCloudProject(projectId), setProjectSyncEnabled(projectId, enabled), getSecretsSyncScope()/setSecretsSyncScope(mode), getRecoveryCodeStatus()/ revealRecoveryCode(password), plus the overview row + ProjectSyncHubState types.
Retired from primary UI: the DevOps SyncResumePanel mounts + the toolbar Sync button / syncActionable / syncModalOpen wiring (DevOpsTab.tsx, DevOpsToolbar.tsx); the CloudSyncPanel mount in SettingsTabBody.tsx. SyncResumePanel.tsx and CloudSyncPanel.tsx are kept (consumed by the hub's advanced disclosure / demoted legacy disclosure), so their existing tests remain green.
Cleanups: removed the General-tab placeholder content that hosted the now-folded cards; dropped the dead ?? agent.route_mode fallback in LiveAgentFloor.tsx (ADR-0053).
Tests: SyncHubBody.test.tsx (hub render, per-row state→action incl. restore + conflict, secrets segmented control + per-key reveal, recovery reveal happy + 403). Re-verified green: SyncResumePanel.test.tsx, DevOpsTab.test.tsx, DevOpsToolbar.test.tsx, CloudSyncPanel.test.tsx, AccountSettingsSyncSection.test.tsx, CloudAccountSection.test.tsx, ProvidersPanel.test.tsx.
New third sync channel: account-global config now follows the user across machines as an opaque, end-to-end-encrypted blob on skybolt.ai. Distinct from the per-project sealed artifact (ADR-0041) and the server-readable per-project metadata mirror (ADR-0044).
Engine crypto container security/sealed_settings.py (SBSETv1 | u32 header_len | header JSON {format_version, account_id, generation, sealed_at} | 12-byte nonce | AES-256-GCM ciphertext, header bound as AAD). Encrypts the whole bundle directly under the account key (ADR-0040) — no per-project data-key indirection.
Engine service services/account_settings_sync.py: allowlist serializers (providers, models, custom personas, account settings, user prefs, SSH connection metadata), a redaction scan that fails loud if it alters the bundle, generation persistence in sync_cursors (settings_sync:<account_id>), pull/push with a generation guard + single 409-retry, additive (no-delete) upsert apply, and a runtime-hook background loop. Off by default per account.
Engine routes routes/account_settings_sync.py (session-gated): GET /account/settings-sync/status, POST /account/settings-sync/enable, POST /account/settings-sync/sync-now.
Cloud API /api/v1/settings-sync (routes/settings_sync.py, model app/db/models/settings_sync.py, schemas app/schemas/settings_sync.py): PUT (generation-guarded upsert, 409 on stale, 413 over SETTINGS_SYNC_MAX_BYTES=1048576), GET (404 when empty), DELETE (204). Stores ciphertext verbatim with no schema knowledge of the contents. Migration 0020_account_settings_blob (additive; down_revision 0019_builtin_persona_catalog).
Renderer: AccountSettingsSyncSection.tsx ("Sync settings across machines" toggle + status + conflict + restore-on-new-machine CTA), mounted in the General tab of AccountSettingsModal.tsx; api.ts clients (getSettingsSyncStatus / setSettingsSyncEnabled / syncSettingsNow); a "Set API key" / health-unknown state for synced providers in ProvidersPanel.tsx; the per-project "Sync & Resume" button relabeled "Sync project" in SyncResumePanel.tsx.
Security gate (negative assertions): no secret value, no seat row, no SSH credential_ref, no kind='local' machine, no host-key trust material, no local filesystem path, and no last_machine pref can enter the bundle; the redaction scan runs over the pre-encryption JSON; the stored server blob is not readable SQLite/JSON. SSH machines re-verify host key on first connect (credential_ref left NULL on apply); imported providers are marked health_status='unknown'.
Tests: apps/engine/tests/test_sealed_settings.py (8), apps/engine/tests/ test_account_settings_sync.py (the ADR-0054 gate + round-trip + generation guard + redaction trip), apps/api/tests/control_plane/test_settings_sync.py (put/get/409/413/404/delete + opacity
- isolation), apps/web/src/app/auth/AccountSettingsSyncSection.test.tsx (panel states).

Agent Persona Catalog —

The catalog stays at the SAME URL GET /api/v1/sync/catalog but was relocated to a new apps/api/app/api/routes/catalog_v1.py (router prefix /v1/sync retained for deployed-engine compatibility) when ADR-0056 removed the ADR-0044 metadata mirror that had shared the sync_v1.py module.
The engine fetch moved from CloudSyncClient.fetch_catalog in the deleted services/cloud_sync.py to CloudClient.fetch_catalog in the new services/cloud_client.py (errors renamed CloudClientUnavailableError / CloudClientApiError). cloud_refresh.py persona seeding is otherwise unchanged.
Four routes in routes/agents.py (auth: local session → project ownership → account-admin; responses via _persona_out):
- POST /projects/{project_id}/agent-personas → 201, creates a custom persona (is_builtin=0, version=1). Also fixes the previously backend-less "Create custom agent" wizard.
- PATCH /projects/{project_id}/agent-personas/{persona_id} → 200, edits a custom persona (version bump + revision). 403 if the persona is builtin; 404 if not found / wrong account.
- POST /projects/{project_id}/agent-personas/{persona_id}/clone → 201, body { name? }. Clones ANY persona into a new is_builtin=0, version=1, key=NULL copy named "<source> Copy" by default with a "cloned from <id>" revision; source unchanged.
- DELETE /projects/{project_id}/agent-personas/{persona_id} → 200 {deleted_persona_id}. 403 if builtin; 400 if still referenced by a project agent (project_agents.persona_id).
New helpers + typed guards in services/personas_ops.py: _create_custom_persona, _clone_persona, _update_custom_persona, _delete_custom_persona, PersonaBuiltinError (→403), PersonaInUseError (→400); reuse _persona_values_from_spec / _snapshot_persona_revision.
New request schemas in schemas.py: AgentPersonaCreateRequest, AgentPersonaUpdateRequest, AgentPersonaCloneRequest. Frontend (apps/web):
api.ts: cloneAgentPersona, updateAgentPersona, deleteAgentPersona (createAgentPersona already existed).
New app/project/modals/PersonaEditorModal.tsx (name/description/bundle/instructions/tags/ permissions/definition_of_done + JSON-textarea policy/access objects, validated before save).
PersonaLibraryPanel.tsx: Clone on every card; Edit/Delete only on custom personas. PersonaDetailsModal.tsx: Clone (Edit when custom). New hook app/project/surface/personaLibraryActions.ts (usePersonaLibrary) + PersonaLibraryModals.tsx, wired through AgentsTabBody.tsx / ProjectSurface.tsx.
Clone-then-edit UX: Clone creates the copy, adds it to state, and immediately opens the editor on the new persona. Delete uses the existing type-to-confirm dialog. Safety:
Cloned customs are is_builtin=0, so the catalog re-seed (_ensure_builtin_personas, which only touches is_builtin=1 rows) never clobbers a customization.
Custom personas were initially local/account-scoped only. Superseded on 2026-06-22: custom personas now sync as account-scoped customer metadata through the ADR-0069/0070 record-envelope path.
Builtins are cloneable but not editable in place (edit/delete → 403). Tests:
apps/engine/tests/test_persona_crud.py (10) — create/clone/edit/delete, builtin 403 guard, in-use 400 guard, re-seed-leaves-custom-untouched regression.
apps/web/src/app/project/modals/PersonaEditorModal.test.tsx and apps/web/src/app/project/tabs/agents/PersonaLibraryAffordances.test.tsx — Clone on every card, Edit/Delete only on custom, clone-opens-editor, save-calls-updateAgentPersona. Superseded on 2026-06-22: the cloud control plane now exposes direct custom persona deletion at DELETE /api/projects/{project_id}/agent-personas/{persona_id} with the same builtin and in-use guards.
New GLOBAL table builtin_persona_catalog (BuiltinPersonaCatalog in apps/api/app/db/models/control_plane.py) mirroring AgentPersona minus account_id/is_builtin.
Migration 0019_builtin_persona_catalog (down_revision 0017_metadata_sync) creates the table and seeds ~50 rows from the existing BUILTIN_PERSONAS constant — day-one behavior unchanged.
New read-only endpoint GET /api/v1/sync/catalog (apps/api/app/api/routes/sync_v1.py): Bearer-authenticated via the existing _bearer_session, returns {etag, version, personas[]}, honors If-None-Match → 304. Schemas BuiltinPersonaCatalogOut / BuiltinPersonaCatalogPersona in apps/api/app/schemas/control_plane.py.
The API's per-account _ensure_builtin_personas now reads specs from the catalog table, falling back to BUILTIN_PERSONAS if it is empty. Desktop (apps/engine):
New singleton cache table persona_catalog_cache (etag, version, personas_json, fetched_at) in storage/schema.py.
CloudSyncClient.fetch_catalog(access_token, etag) in services/cloud_sync.py: conditional GET, 200/304/offline handling (CloudSyncUnavailableError on network failure).
_refresh_persona_catalog in services/cloud_refresh.py: after the identity refresh (boot
- every ~12 h), fetches the catalog, upserts the cache, and re-seeds every local account. Wrapped in its own try/except; 304/offline/error are no-ops.
_resolve_builtin_specs + _ensure_builtin_personas in services/personas_ops.py: resolve specs from the cached cloud catalog if present and non-empty, else bundled BUILTIN_PERSONAS. Existing insert/version-bump+snapshot/skip logic unchanged.
personas_data.py unchanged — still the offline fallback. Behavior:
Original behavior was deprecate-not-delete: a persona removed from the cloud catalog was not deleted locally. Superseded on 2026-06-22 by local builtin-library pruning when the cloud catalog is the active source; existing project agents keep their persona_snapshot.
Local-only users (no Bearer token) never fetch from cloud; they use the fallback. Tests:
apps/api/tests/control_plane/test_sync_catalog.py (5, sqlite) — endpoint, ETag/304, auth.
apps/engine/tests/test_persona_catalog.py (15) — resolve-specs fallback, fetch_catalog 200/304/offline, refresh wiring, version-bump revision. Decision: ADR-0052 (sanctioned exception to the ADR-0044 sync allowlist; references ADR-0042 offline tolerance).

Agents Execution

The Custom Agent Wizard's "Route mode" dropdown is replaced by a "Default model" picker (catalog models, plus "Use project default"). The choice is stored on the agent in overrides_json as overrides.default_model = { catalog_model_id, target, model } — no schema migration; the legacy route_mode column is retained but the wizard no longer writes it.
The engine now resolves an agent's default model (per-agent → project default → none) and includes it in the agent session's prompt package as model_recommendation / model_reason, so the session detail panel shows a real model instead of "no route resolved". See services/model_defaults.py::resolve_agent_model. Display-only; the CLI seat still runs its own model.

Cloud Metadata Sync —

The ADR-0044 server-readable metadata mirror is removed end to end. Cloud API: deleted apps/api/app/api/routes/sync_v1.py (/push, /pull), apps/api/app/schemas/sync_v1.py, and apps/api/app/db/models/sync.py (SyncChange, SyncedProject); de-registered from app/db/base.py; removed the six SYNC_* / RATE_LIMIT_SYNC_* fields from app/config.py. New migration alembic/versions/0022_drop_metadata_sync.py (down_revision 0021_project_blob) drops sync_changes and synced_projects (downgrade recreates them).
Engine: deleted services/cloud_sync.py and routes/cloud_sync.py; removed the runtime-hook registration; dropped the project_sync_prefs table from storage/project_db.py; removed routes POST /projects/{id}/sync-now, GET /sync-status, GET/PUT /sync-prefs. The reusable cloud HTTP transport (persona-catalog fetch + bearer acquisition) was extracted to new services/cloud_client.py (CloudClient, get_client, _acquire_access_token, CloudClientUnavailableError/CloudClientApiError), still used by cloud_refresh.py and account_settings_sync.py.
Renderer: removed the "Server-readable board metadata (legacy)" disclosure and CloudSyncPanel (+ its test) from the Sync hub; removed runProjectCloudSync, getProjectCloudSyncStatus, getProjectCloudSyncPrefs, putProjectCloudSyncPrefs and their types from apps/web/src/app/api.ts.
Preserved: the ADR-0052 persona catalog stays at GET /api/v1/sync/catalog, relocated to apps/api/app/api/routes/catalog_v1.py (router prefix /v1/sync kept for deployed-engine compatibility), fetched by the engine via cloud_client.CloudClient.fetch_catalog.
Result: skybolt.ai no longer stores any server-readable project metadata. The only remaining sync channels are the zero-knowledge ADR-0054 settings bundle and ADR-0055 project blobs. ADR-0045 collaboration (unbuilt) loses its mirror substrate — see ADR-0056.

Execution Engine

Removed the user-facing model Routes concept from Settings (ADR-0053). Routes were inert in the engine — model_routing.resolve_model had no callers and the agent prompt package never resolved a model. The Models & Routes panel is now a Models-only panel; catalog models remain selectable anywhere in the project. The model_routes table/endpoints/model_routing.py are retained but unused (no migration).
Added a project default model for new chats: a "Default model for new chats" picker in the Model Settings card, persisted in project-settings KV under default_chat_model = { target, model } and pre-selected by the chat composer (useChats).
Wired model defaults into the agent prompt package. New services/model_defaults.py::resolve_agent_model resolves a per-agent default (overrides.default_model.catalog_model_id) then the project default, and the create-session, build-prompt-package, and scheduler paths now set model_recommendation / model_reason so the agent session detail shows a real model instead of "no route resolved". Display-only — the CLI seat still runs its own model.
Reorganized the Settings tab Model Settings card into a scannable highlights view. Providers, Models, and Seats each show a count and a short preview of what is configured, with Manage providers / Manage models / Manage seats buttons that open the full add-and-delete forms in dedicated modals.
Added clear labels and one-line hints to every Model Settings form field (e.g. the provider name field now reads "Display name" with guidance that it is just a label, and the auth reference field explains it stores the name of a secret, not the key itself) so users understand what to enter.

Git Operation Surface —

BranchesPanel now lays the checkout select + Checkout button and the new- branch input + Create Branch button into a single four-column row (sm:grid-cols-[minmax(0,1fr)_auto_minmax(0,1fr)_auto]), collapsing to a stack on narrow screens. The "checking out branch" spinner moved below the combined row.
ChangesToCommitCard: Commit & Push now floats to the right of the action row (ml-auto) and carries a tiny (CTRL + Enter) hint under its label. Pressing Ctrl/Cmd + Enter in the commit-message box runs Commit & Push, mirroring the button's enable rules and skipping while a push is in flight.
ChangesToCommitCard now surfaces an explicit scan state: a "Scanning…" chip (spinner) in the header badge cluster whenever busy is devops:refresh or devops:complete_setup, so a re-scan over an existing list is visible, and the empty file list shows a spinner + "Scanning the repository for new and updated files…" instead of a bare blank. Previously the scan ran silently (only the toolbar Refresh button spun), so the list looked blank and then files popped in with no in-between feedback.
Added apps/web/src/app/project/tabs/ChangesToCommitCard.test.tsx (9 tests) covering the Ctrl/Cmd+Enter shortcut, the empty-message and in-flight guards, the hint text, and the scanning indicator (chip, placeholder, idle, and the setup-completion scan).
DevOpsTab now renders a dedicated "Checking Project Folder…" loader while the opening readiness/git probe runs (gated on busy === "devops:refresh" before a provider is chosen). Previously the tab fell straight through to the "Setup Dev Ops" screen for the several seconds the probe takes, so an existing repo looked like "nothing found" until the probe landed and flipped the tab to the full view. State order is now: Machine Required → Checking → Setup (found nothing) / full tab (found a repo).
runDevOpsReadinessChecks does not touch busy, so devops:refresh holds for the whole probe — making it a reliable signal for the checking state.
Added apps/web/src/app/project/tabs/DevOpsTab.test.tsx (4 tests) covering the checking state, the setup fallback, the Machine-Required precedence, and the guard that a mid-setup refresh does not hijack the chosen provider form.

Project Import —

The wizard's repo-connection step now shows a Browse… button next to the "Repository path" field when the connection kind is Local path and the app is running in the Tauri desktop runtime. It opens the native OS folder picker (pickLocalFolder → the engine-host select_folder Tauri command) and drops the chosen absolute path into the field; cancelling leaves the field untouched.
Gated on isTauriDesktopRuntime() so the web build — where pickLocalFolder is a no-op — never renders a dead control. Hidden for SSH (remote path, not reachable by the local picker) and Git URL connection kinds. Reuses the same picker plumbing as StorageLocationSettings.
Tests: two added in ProjectImportWizard.test.tsx — the desktop Browse flow fills the path from the picker; the button is absent outside the desktop runtime.
ProjectImportWizard now switches to the "scan" step *before* awaiting scanProjectRepo, so the prominent "Scanning repository — detecting frameworks, commands, Docker, and Git setup…" loader is shown during the multi-second detection instead of leaving the user on the connection form with only a small button spinner (which read as "nothing happened / found nothing").
On any failure before the scan completes, the wizard reverts to the connection step so the repo path can be corrected and retried; the existing error detail is still surfaced (covered by the existing scan-failure test).

2026-06-10

Cloud Deployment

Added the public site (@skybolt/site) as a fourth container in the production stack (ADR-0049 follow-up): the edge Caddy now terminates TLS for skybolt.ai + www.skybolt.ai (www → apex) and reverse-proxies to a site container serving the static Astro build. The site ships as its own GHCR image (ghcr.io/txmerc/skybolt/site), built/pushed by deploy.yml and tag-pinned / rolled-back beside the api (SKYBOLT_SITE_TAG).
Validation: bash -n on deploy/rollback, YAML parse of compose + workflow, and a local pnpm --filter @skybolt/site build (4 pages incl. the build-time changelog). The image build runs in CI; docker compose config runs implicitly on the droplet at deploy.
Operator action for go-live: apex + www A records (Namecheap) → droplet, alongside api.

Desktop App

Moved the manual update check out of the native Help → Check for Updates… menu and into an Updates section in the renderer's Account settings (desktop only). The native app menu and its rfd update dialogs are removed; the new section calls the same check_for_update/install_update Tauri commands and shows status, release notes, and an "Install & Restart" action in the themed UI. The check_for_update/install_update commands and the pre-exit contract are unchanged.
Marketing-site consolidation: deleted the React apps/web/src/marketing/MarketingSite.tsx (the public marketing site now lives solely in the Astro apps/site). apps/web/src/App.tsx no longer branches on path/runtime to serve a marketing page — it always renders SkyboltApp. Updated AppRouting.test.tsx accordingly and dropped the now-unused three, @types/three, and gsap dependencies from apps/web (smaller renderer bundle).
scripts/dev.ps1 / scripts/dev.sh default runs now also start the public Astro site (apps/site) on http://localhost:50030 alongside the desktop app, and stop it when the desktop process exits (incl. Ctrl-C). New -Site (--site) switch runs only the site.
Made tauri dev restartable on Windows after an orphaned sidecar. prepare-sidecar.mjs (the first step of beforeDevCommand, before cargo's build script) now stops any lingering skybolt-engine process on debug builds. Previously, when tauri dev force-killed the app on a Rust rebuild — or the dev process was Ctrl+C'd — the sidecar orphaned and held the image lock on target/debug/skybolt-engine.exe, so the next build's tauri-build copy_binaries panicked with Access is denied (OS error 5). Release builds intentionally skip the kill so they never stop an installed Skybolt's engine. See documents/troubleshooting.md.

Encrypted Project Sync

Locked/ciphertext project.db → clean 423 on every per-project route. The A4 convert route added an auth-first pre-flight that sniffs the header and returns 422 before opening, but that guard protected only sync/convert; every other per-project route (cards, terminals, agents, workspace, project detail, …) still opened the project via _open_project, whose per-project ATTACH raised a raw sqlite3.DatabaseError → opaque HTTP 500 when the on-disk project.db was git-crypt ciphertext (\x00GITCRYPT), a sealed artifact (SBSEALv1), or otherwise not SQLite.
- Engine: store/project_store.py adds ProjectDatabaseLockedError + a header classifier _project_db_locked_detail (mirrors services/sync_ops.inspect_project_db). _ensure_project_db sniffs the header on EVERY open — not gated by the _UPGRADED_PROJECT_DBS upgrade cache, so a re-locked checkout (a pull/checkout brings ciphertext after a prior successful open) is still caught — and raises the typed error; _open_project keeps a tightly-scoped ATTACH safety net for a residual non-SQLite body and always closes the global connection on failure. app.py registers an exception handler mapping it to 423 Locked with an actionable detail (git-crypt → "Unlock Encrypted Data"; sealed → "Unseal (Sync & Resume)"; generic → "Unlock or unseal"), mirroring the engine-lock require_unlocked middleware. Additive only: the convert pre-flight (still 422) and the graceful /projects listing fallback (_read_project_settings_row) are unchanged; background callers (scheduler) already wrap per-project opens in except Exception.
- Tests (tests/test_project_db_lock.py, 5): detail + listing routes return 423 (never 500) for all three ciphertext kinds with actionable details; the lock still fires after a prior successful open (cache bypass); the global /projects index still lists a locked project.

Public Site

Marketing consolidation: migrated the richer single-page design/content of the former React marketing site (apps/web/src/marketing/MarketingSite.tsx) into this Astro landing, keeping the HeroSwarm agent animation as the hero. New / sections: a command-center hero with a three-step strip, Platform (#platform, six product cards), Machines (#machines, capability chips + a command-window mock), and Privacy (#privacy, three trust groups) — replacing the previous "Control the chaos" grid, principles terminal, and pricing teaser. Nav now targets the section anchors plus Pricing. The React site's Tailwind was translated to the Astro site's CSS-variable system; the mobile nav was resized so all four links fit a phone width. The old React MarketingSite.tsx was deleted, apps/web/src/App.tsx now always renders the Skybolt app (its AppRouting test updated), and the unused three/gsap deps were removed from apps/web. scripts/dev.ps1 / dev.sh default runs now start the desktop app + this site on :50030 together (new -Site / --site flag runs the site alone).
Animated hero swarm (src/components/HeroSwarm.astro): a live "agents working a board" layer — an ORCH node routes five named agents (claude-code, codex-01, vllm-local, expert:docs, expert:security, reference colors) to task cards with progress bars; agents fly over, dock, pulse and fill the bar, pair up over a green collaboration link, and completed cards flash green with a ✓ before respawning with the next task title; packets travel the work links. Dependency-free (no three.js/gsap): runtime DOM + one SVG line pool driven by a single rAF loop, paused offscreen via IntersectionObserver. The build-time constellation remains as the no-JS fallback; prefers-reduced-motion gets a static composed scene. Mobile runs a smaller 3-agent/3-card config. The technical-design hero section was rewritten accordingly (the three.js island is no longer the planned upgrade path).
Hero visibility fix: the upper hero rendered as a near-black void — the readability scrim (0.35–0.55 black) sat over the whole starfield and flattened the Atmosphere gradient and constellation below perception. The scrim is now light at the top (0.05 → 0.22 @ 42%) and dark only in the lower text zone; the hero background gained a faint #0077e4 radial halo; stars are brighter/larger (opacity 0.35–0.9, r 0.6–1.9); the constellation stroke went 0.25 → 0.4 and agent nodes gained translucent halo rings. The starfield is also 28px taller than the hero so the drift animation never exposes an unpainted strip.
Content refresh to match the shipped product (Waves A+B): landing panels now feature the gated full Git surface (M4) and repo import/readiness audits (P-M2) — "Kanban cards" merged into the orchestration panel and "Local by default" ceded to the principles terminal directly below. /features grew matching "Git operations, gated" and "Project import & readiness" cards (eight total), the orchestration card swapped the unshipped overlap-radar/artifact-registry claim for lease-backed scheduling + file-scope conflict prevention and the live agent board, and the cloud identity card now mentions device-session list/revoke. Pricing Free tier lists repo import with readiness audits.
Mobile nav: page links (Features/Changelog/Pricing) stay visible under 720px (smaller type); the waitlist CTA pill hides instead — previously the links vanished entirely, leaving subpages unreachable except via the footer.
Font loading: preload <link>s for the five above-the-fold latin woff2 faces (Space Grotesk 500/600, Inter 400/500/600) to suppress the fallback-font flash on first paint.
Containerized production deployment (ADR-0049 follow-up, rides ADR-0043): apps/site/Dockerfile (multi-stage — pnpm workspace build mirroring the site CI job → caddy:2-alpine static file-server on :8080), apps/site/Dockerfile.dockerignore (keeps documents/ in the build context so the build-time changelog resolves — the root .dockerignore drops it), and apps/site/site.Caddyfile (internal static server).
Added the site service to deploy/production/docker-compose.prod.yml (own GHCR image ghcr.io/txmerc/skybolt/site, edge network, healthcheck) and skybolt.ai + www.skybolt.ai blocks to Caddyfile.prod (edge Caddy terminates TLS; www → apex redirect).
Extended the deploy pipeline to two images: deploy.yml builds/pushes the site image beside the api; deploy.sh/rollback.sh pin and roll back both tags (SKYBOLT_SITE_TAG).

2026-06-09

Agents Execution

Per-project schema v3 → v4 → v5 (storage/project_db.py, PROJECT_SCHEMA_VERSION 3 → 5, same strictly-additive ladder):
- v4 (M2): card_dependencies (UNIQUE(card_id, depends_on_card_id), kind ∈ {ordering, interface}), additive cards.file_scope_json (JSON glob list; ** is the exclusive solo scope) + cards.priority (ascending, default 100), and real terminal_sessions lease columns (control_mode, agent_lease_session_id, agent_lease_state, agent_lease_claimed_at, agent_lease_released_at, orchestrator_session_id) — the columns _terminal_out used to hardcode.
- v5 (M3): command_profiles execution columns (execution_target_id, working_path, allowed_actors_json, agent_callable, audit_policy ∈ {none, metadata, full}) and command_runs result columns (profile_id, terminal_session_id, output_ref, exit_code).
Scheduler multi-claim (M2) (scheduler.py): each tick now runs per-project maintenance (heartbeats, fallback completion, the TTL reaper) first, then claims up to admission headroom sessions. Eligibility = dependencies all completed (SQL NOT EXISTS over card_dependencies) ∧ declared file-scope disjoint from every in-flight card's scope (glob/prefix comparison; ** waits for an empty board and runs solo) ∧ admission: global cap (SKYBOLT_AGENT_MAX_CONCURRENT, default 3), live machine PTY budget (SKYBOLT_AGENT_MAX_MACHINE_PTYS vs LocalTerminalBroker.live_count()), and catalog_models.max_concurrency when the session's declared model resolves. Over-cap sessions stay queued — backpressure, not failure. Claims also take file_scope leases (radar visibility + exact-pattern race belt-and-suspenders).
TTL reaper (M2): each tick, held leases whose heartbeat_at (fallback acquired_at/ created_at) is older than ttl_seconds (SKYBOLT_AGENT_LEASE_TTL_SECONDS, default 600) are released; holder sessions are requeued through the existing release path (or failed past the attempt limit with the card escalated to review); orphan leases release directly.
Dependents unblock through the events bus (M2): completion writes a durable dependents_unblocked coordination event (payload card_ids) and publishes it on scheduler_bus, so the next tick fires immediately instead of waiting out the poll interval.
Terminal lease endpoints (M2) (routes/terminals.py): POST /projects/{id}/terminals/{tid}/lease (one active holder; resource_leases under BEGIN IMMEDIATE — a concurrent second claim 409s) and POST .../release (returns the terminal to human control; 409 when not held). _terminal_out reads the real v4 columns, falling back to M1 metadata_json fields for not-yet-upgraded rows.
Command execution on targets (M3) (new services/command_exec.py): a saved command profile runs on its execution target in three shapes — local captured (shell subprocess; stdout/stderr written to a machine-local log under <engine db dir>/projects/<pid>/command-output/<run_id>.log, recorded as output_ref; raw output never lands in the syncable project.db), local visible (in_terminal=true opens a broker terminal linked via terminal_session_id), and SSH (_run_ssh_shell, cd -- <cwd> && <command>). execute_command_run accepts an explicit cwd override — the M8 quality gates will point it at a card's worktree. audit_policy controls retention: full keeps the log, metadata keeps byte counts only, none keeps neither.
Run endpoint + approval gating (M3) (routes/command_profiles.py, routes/approvals.py): the 409 stub is replaced by POST /projects/{id}/command-profiles/{pid}/runs (writes command_runs rows; body: actor, agent_session_id, cwd, in_terminal, timeout_seconds). approval_policy != 'none' parks the run awaiting_approval behind an approval_requests row (subject_type='command_run'); the approvals resolve flow executes it on approve (new branch, mirroring the git-operation branch) or marks it rejected/blocked. Actor gates: agent_callable=false profiles 403 agent actors; a non-empty allowed_actors list restricts all actors. PATCH /projects/{id}/command-profiles/{pid} updates the M3 metadata; GET /projects/{id}/command-runs returns real history (was an empty stub).
M1 handoffs closed: LocalTerminalBroker.open gained an env kwarg (per-spawn overrides merged over the engine environment; None = historical inherit) so the scheduler injects SKYBOLT_COMPLETE_TOKEN/SKYBOLT_AGENT_SESSION_ID/SKYBOLT_PROJECT_ID/SKYBOLT_COMPLETE_URL per seat without touching os.environ; _terminal_out no longer hardcodes lease fields.
Serializers: _card_out exposes file_scope/priority; _command_profile_out the M3 execution metadata; new _command_run_out (profile/terminal links, output_ref, exit_code); _approval_request_out exposes command_run_id. All tolerant of not-yet-upgraded rows.
Tests: tests/test_scheduler.py grew to 20 (M2 adds: two concurrent thread claimers → exactly one wins; two schedulers → one claim; dependency order + bus unblock; overlapping scopes never co-run; ** runs solo; global cap under a 4-session burst; PTY cap; provider max_concurrency; reaper requeue + cap-to-review; terminal lease/release endpoint contract; the ladder test now asserts v5). New tests/test_command_exec.py (13: cwd precedence, actor gates, captured run + output log, failure exit codes, cwd override, missing cwd, audit none, unsupported action blocked + disabled 409, agent-callable PATCH flow, approve-executes, reject-without-execute, in-terminal link, SSH routing). tests/test_terminal.py grew to 5 (env merge for the child only; broker env kwarg reaches the spawned PTY).
Per-project schema v2 → v3 (storage/project_db.py): new _upgrade_project_schema version ladder (strictly additive; older synced project.db files upgrade in place), resource_leases with the ux_active_lease partial unique index (WHERE state='held' — the IntegrityError IS the mutex), coordination_events, and additive agent_sessions runtime columns (card_id, lease_state, attempt_count, worktree_path, branch_name, engine_terminal_id, heartbeat_at, claimed_at, outcome_json).
New skybolt_engine/scheduler.py: AgentScheduler self-registers via services/runtime_hooks (start/stop in the app lifespan, quiesce hook for sync push/pull, defers while the engine is locked). Claim path opens the project DB writer with BEGIN IMMEDIATE and writes only leases + status; after commit it git worktree adds <repo>/.skybolt/worktrees/<session_id>, launches the CLI seat on the in-process LocalTerminalBroker (cwd=worktree), injects the brain prompt package into the PTY, and persists a terminal_sessions row (control_mode='agent' in metadata). recover() reconciles this machine's rows on start: dead terminals requeue (cards back to queue) or fail past the attempt limit (cards escalated to review).
New services/worktree_ops.py (worktree add/remove/prune with the strict-argv _run_git discipline; re-applies the .skybolt/.gitignore that ignores worktrees/) and services/events.py (durable coordination-event insert + minimal in-process async pub/sub).
Completion contract: POST /projects/{id}/agent-sessions/{sid}/complete accepts a per-session scoped token (injected into the seat PTY as SKYBOLT_COMPLETE_TOKEN, stored only as a SHA-256 hash, never the engine session token) via the X-Skybolt-Complete-Token header, or a normal renderer session. On completion: read-only git status/HEAD summary, journal + coordination event, leases released, card working → completed, worktree removed + pruned. Fallback signals wired minimally: terminal liveness false, then output-idle timeout; the fired signal is journaled. Per ADR-0036 every completion is recorded as claimed-complete, not quality-verified.
_agent_session_out now serializes the scheduler state (lease_state, attempt_count, branch_name, engine_terminal_id, heartbeat_at, claimed_at, outcome) tolerantly for not-yet-upgraded rows; agent_sessions.card_id is populated at create.
Tests: apps/engine/tests/test_scheduler.py (9 tests) drive tick_once()/recover() deterministically with a real python -c seat through the real broker: end-to-end claim → worktree → seat → scoped-token completion; wrong/missing token 403/401; terminal-exit fallback; recovery requeue + attempt-limit failure; quiesce blocks claims; v2 → v3 ladder upgrade; the partial-index lease mutex; event-bus pub/sub.
Known follow-ups: tests/test_storage_split.py:84 still asserts project schema version 2 (one line, owned by the storage wave); app.py's engine-token middleware must exempt the completion path before headless seats can signal completion on token-configured engines; LocalTerminalBroker.open needs an env parameter so the scoped token no longer rides a guarded os.environ override around the spawn.

Cloud Deployment

Authored the production deploy stack (ADR-0043 milestone C1):
- deploy/production/docker-compose.prod.yml — api (GHCR tag-pinned, no host port, healthcheck, log caps) + Caddy on a pinned 172.28.0.0/24 network so TRUSTED_PROXY_CIDRS can name the proxy hop; Caddy state in named volumes.
- deploy/production/Caddyfile.prod — api.skybolt.ai with automatic Let's Encrypt, the shared security-headers snippet, /api/* proxy, and static landing pages.
- deploy/production/site/ — verify-email.html (completes verification in-browser via POST /api/v1/auth/verify-email) and reset-password.html (copyable token + desktop-app instructions; split derivation forbids web-side password handling).
- deploy/production/{provision,deploy,rollback,backup}.sh — idempotent droplet bootstrap (Docker, UFW 22/80/443, fail2ban, unattended-upgrades, skybolt deploy user), tag-pinned deploy with previous-tag rollback state and /api/health polling, one-command rollback (no downgrades — migrations stay backward-compatible one release), weekly pg_dump with keep-4 local pruning and optional DO Spaces upload.
- .github/workflows/deploy.yml — api-v* tags or confirmed manual dispatch → build apps/api/Dockerfile → push ghcr.io/txmerc/skybolt/api:<tag> → SSH deploy with pinned host key; production environment gate; GHA build cache.
- Validation run: bash -n on all four scripts, YAML parse of compose + workflow. docker compose config is not runnable on this dev machine (no Docker) — it runs implicitly on the droplet at first deploy.sh.
Go-live (C2) is operator-executed; the runbook is this feature's test-plan.md.

Cloud Identity (skybolt.ai)

Sign-out, background refresh, and targeted session revocation (Pillar 2 milestones B2/B3/B4, ADR-0042).
- API (B4): auth_sessions.refresh_family_id (model + migration 0018_identity_sessions — the documented follow-up column; numbered 0018 to leave 0017 free for a parallel workstream) links each bearer access session to the refresh-token family it was minted with. New GET /api/v1/auth/sessions (one entry per signed-in device: family id, created/last-used, last ip/user-agent, current flag; never token material) and POST /api/v1/auth/sessions/revoke {family_id} (revokes ONE family's refresh tokens + its access sessions only; 404 for unknown/foreign families; returns current so the client knows it revoked itself). Refresh-reuse revocation now also targets only the compromised family's sessions instead of all of a user's sessions. Rate limits reuse the existing refresh (list) and token-confirm (revoke) ceilings.
- Engine (B2): POST /auth/cloud/logout — requires the local cookie session, best-effort cloud /logout (mints an access token from the keychain refresh token after an engine restart; offline or upstream errors never block), then clears the keychain refresh token, the in-memory access token, and deletes the cached cloud_identity row (state reads unlinked; the email is not retained). Local users/sessions and at-rest custody (account key + argon2id wrap) are untouched — offline login/unlock keep working.
- Engine (B4): GET /auth/cloud/sessions + POST /auth/cloud/sessions/revoke passthroughs (the renderer never talks to the cloud — binding rule), with a shared _authed_cloud_call helper that retries once through a refresh-token rotation when the in-memory access token is missing/stale. Revoking this machine's own family performs the same local cleanup as logout and returns the unlinked state.
- Engine (B3): new services/cloud_refresh.py registers cloud-refresh with the runtime_hooks seam at import time (app.py untouched). First pass ~60 s after startup, then every ~12 h: rotate the keychain refresh token, re-fetch /me, update the cached row. Every failure is swallowed to debug logs; a locked engine (at_rest.is_enabled() false with encryption on) skips the pass; a 401 clears the dead token so the renderer can prompt. refresh_once() is the synchronous, directly-testable core.
- last_synced_at is now written as ISO-8601 UTC with timezone offset everywhere the engine writes it (the documented datetime('now') follow-up); the renderer formatter still compensates for pre-existing suffix-less rows.
- Renderer: CloudAccountSection gained Sign out (themed confirm modal — never a native dialog) and an on-demand "Devices" list (render stays offline-safe) with per-device "Sign out device" confirm modals; the current family is labelled "This device", and revoking it is treated as a local cloud sign-out. api.ts: cloudLogout, cloudSessions, cloudRevokeSession (+ CloudSession type).
- Tests: API +5 (test_identity_v1.py, 17 total), engine +18 (test_cloud_auth.py 35 + new test_cloud_refresh.py 7), web +6 (CloudAccountSection.test.tsx, 11 total).
Production mailer: SMTP2GO backend (Pillar 2 milestone B1, ADR-0043; provider chosen by the owner — the team already uses SMTP2GO).
- The Mailer protocol is now async; routes await mailer.send(...) so a slow SMTP/API call can never block the single-worker event loop.
- Smtp2goMailer registered in _BACKENDS: POSTs to the SMTP2GO HTTP API (/v3/email/send, X-Smtp2go-Api-Key header) via httpx.AsyncClient (10 s timeout, injectable transport for tests). Failures — including SMTP2GO's failures-inside-a-200 (data.failed > 0) — are logged with recipient + subject only, never the token-bearing body, and swallowed (a failed send must not fail signup; the rate-limited resend/reset endpoints are the retry path).
- Config: SMTP2GO_API_KEY + MAIL_FROM settings; validation rejects console outside development/test (the documented launch blocker) and requires key+from for smtp2go. httpx promoted from dev-only to a runtime dependency; apps/api/.env.example added.
- Tests: tests/control_plane/test_mailer.py (9 — MockTransport payload/auth assertions, API-error, failures-inside-200, and network-error swallowing with no-body-logged checks, prod config guards).
Proxy IP trust hardening (Pillar 2 milestone B5, ADR-0043).
- client_ip honors X-Forwarded-For only when the socket peer is inside TRUSTED_PROXY_CIDRS (new setting, validated CIDR list, empty default = ignore XFF), and then only the rightmost entry — the hop our proxy appended. Previously any client could spoof per-IP rate-limit buckets and login audit IPs with a forged header.
- Tests: tests/control_plane/test_client_ip.py (7 — trust matrix incl. spoof-ignored, rightmost-entry, non-IP peers, invalid-CIDR settings rejection).
Wired the cloud API into CI and setup (commit 9af8b4c).
- .github/workflows/ci.yml gained the api job: ruff, pyright, and the full apps/api pytest suite against SQLite (TEST_DATABASE_URL=sqlite+aiosqlite:…) on every push/PR, on ubuntu-latest to match the deploy target.
- Both setup scripts (scripts/setup.sh, scripts/setup.ps1) now provision apps/api/.venv by default (SKYBOLT_SETUP_OPTIONAL_API=0 opts out) — apps/api is the active identity service, no longer optional-only infrastructure.
- Fixed the stale joke_teller roster test (the persona was deliberately removed from BUILTIN_PERSONAS; the test had drifted).
Phase B renderer: cloud sign-in flows and the skybolt.ai account card (commit b334097).
- AuthScreen.tsx setup/login/restore modes now drive the cloud flows: setup → cloudSignup + one-time recovery-code screen; login → cloudLogin with silent fallback to local sign-in on 503 and a needs_recovery hand-off into restore mode (email/password kept, amber guidance shown); restore → cloudRestore. Non-blocking warnings (keychain misses) render once on a "Signed in with a warning" screen. Local unlock unchanged.
- Account settings gained the "skybolt.ai account" card (src/app/auth/CloudAccountSection.tsx): linked state from the engine's cached getCloudState (offline-safe render), "Sync now" via cloudRefresh (503 → offline notice; 401/409 → re-sign-in notice), and the local-profile link flow via cloudMigrate.
- API client functions in api.ts: getCloudState, cloudSignup, cloudLogin, cloudRestore, cloudMigrate, cloudRefresh.
- Tests: AuthScreen.test.tsx + CloudAccountSection.test.tsx (11 new).
Phase B engine: cloud auth client + split derivation (commit 594f37f).
- security/cloud_kdf.py — Argon2id + HKDF-SHA256 split derivation (auth_verifier sent; wrap_key local-only, repr=False); server-published params bounds-checked client-side; 64-char hex salts indistinguishable from the API's fakes.
- security/at_rest.py — the custody password record now dispatches between legacy PBKDF2 and the argon2id cloud wrap (set_cloud_password_wrap); same account key, nothing re-encrypted, legacy wraps still unlock.
- services/cloud_identity.py — httpx client for /api/v1/auth (CloudUnavailableError/CloudApiError); refresh token in the OS keychain (skybolt-cloud/refresh-token), access token in process memory; keychain failures degrade to a response warning, never a crash.
- routes/cloud_auth.py + the cloud_identity single-row cache table (global schema v13) — /auth/cloud/{state,signup,login,restore,migrate,refresh}. Every cloud call degrades to 503 offline; state never touches the network; local /auth/login + /auth/unlock stay fully offline. Login with no local profile returns needs_recovery (no key escrow — the recovery code is the only cross-machine path).
- Tests: tests/test_cloud_auth.py (24) + tests/test_at_rest.py cloud-wrap additions (3).
Phase A identity service v1 in apps/api (commit ad70ff4).
- app/api/routes/identity_v1.py (mounted /api/v1/auth): signup/login/kdf/refresh/logout/verify-email/resend-verification/password-reset/me. Split-derivation verifiers — the server stores only argon2id(auth_verifier); bearer access tokens (opaque, SHA-256-hashed, 1 h) + rotating refresh-token families (60 days) with atomic claim and reuse-revocation (family + all access sessions); timing-equalized unknown-email login and deterministic fake KDF salts; password reset revokes all credentials and does not decrypt data.
- app/schemas/identity_v1.py (contract of record for the engine client), app/core/mailer.py (console backend behind a Mailer protocol; Resend/Postmark slot in via _BACKENDS — real backend is a launch blocker), app/core/ratelimit.py (in-process sliding window, per-IP/per-email on every endpoint).
- Migration alembic/versions/0016_identity_v1.py: users.email_verified/kdf_salt/ kdf_params, refresh_tokens, email_tokens, entitlements; merges the two 0015 heads.
- Tests: tests/control_plane/test_identity_v1.py (12).
ADR-0042 (skybolt.ai cloud identity with zero-knowledge encryption) accepted, alongside ADR-0041 (sealed project artifact) (commit cd32231). Decided the split-derivation password model, engine-as-only-cloud-client, offline-tolerance guarantees, the accepted enumeration oracles, and that the recovery code remains the only cross-machine decryption path.

Cloud Metadata Sync —

New /api/v1/sync router (apps/api/app/api/routes/sync_v1.py): POST /push (batched allowlisted changes, server-assigned monotonic cursor, rollup upserts) and GET /pull (cursor/limit paging with has_more), Bearer-authenticated via the identity v1 session helper.
Strict allowlist schemas (apps/api/app/schemas/sync_v1.py, extra="forbid", discriminated union on entity_kind): project_meta, card, board_column, agent_session_meta. Unknown fields/kinds and forbidden classes => 422.
New tables synced_projects + sync_changes (apps/api/app/db/models/sync.py, migration 0017_metadata_sync — chained after the concurrently created 0018_identity_sessions; single head verified).
Per-plan quotas from Entitlement (free: 3 projects, 10k changes/24h; premium: 100 / 200k) and per-user rate limits, all configurable in apps/api/app/config.py.
Tests: apps/api/tests/control_plane/test_sync_v1.py (11 tests: round-trip, cursor monotonicity, allowlist negatives, isolation, quotas, idempotency, auth).

Desktop App

Added engine-crash recovery: the shell watches the sidecar process and emits skybolt://engine-exited on unexpected exits; the renderer's new EngineRecovery surface (event listener plus /health ping backstop) offers Restart Engine, which respawns the sidecar with a fresh port/token and reconnects the renderer cleanly.
Mirrored engine sidecar stdout/stderr to <data-dir>/logs/skybolt-engine.log (rotated at 5 MB) and surfaced the path in the recovery UI; the logs/ folder moves with the storage location.
Wired tauri-plugin-updater with manual-only checks (check_for_update/install_update commands and a Help -> Check for Updates... menu) and the documented pre-exit contract: the engine is stopped before any binary swap, and respawned if the install fails. Signing keys are not generated yet, so checks report "unconfigured" until an operator runs tauri signer generate and fills plugins.updater.pubkey (see technical-design.md).
Packaging pass: bundle now uses the standard src-tauri/icons set (including icon.ico for Windows), version 0.1.0, publisher/category/description metadata, and explicit NSIS currentUser install mode.

Encrypted Project Sync

A3 — portable-secret vault + per-secret opt-in. Secrets now travel with sealed projects, per-secret and opt-in (ADR-0041 tier 2). Vault JSON shape + opt-in contract documented in technical-design.md.
- Engine: global portable_secrets(ref TEXT PRIMARY KEY, opted_in_at TEXT) table (schema v14 → v15, additive; helpers _portable_secret_refs/_set_secret_portable in store/project_store.py). _build_portable_vault (services/sync_ops.py) seals only refs that are project-referenced AND opted in; machine-scoped refs (machines.credential_ref) are excluded UNCONDITIONALLY; unresolvable opted-in refs are reported skipped. include_secrets on a sealed push now means "seal WITH the vault" (the A2-era 422s in _push_sealed_local/_sync_push_ssh_sealed are gone); the push response's secrets carries {included, skipped}. _install_vault_secrets restores the vault on every unseal flow — sealed sync/unlock, resolve take_remote, sealed SSH import — installing values into the keychain (env refs → manual) and re-marking each ref portable locally so it keeps travelling. New project-admin routes: GET /projects/{id}/secrets (per-ref portability listing; values never returned) and POST /projects/{id}/secrets/portable (toggle; 422 for machine-scoped/unparseable refs). The legacy sealed_artifact._unwrap_data_key reference in tests moved to the public unwrap_data_key (landed with the wave-B checkpoint).
- Renderer: SyncResumePanel.tsx gains a sealed-only "Portable Secrets" section — on-demand listing (loading/error/empty states), per-ref Make portable / Stop syncing, machine-scoped rows explained and untoggleable, opt-in behind a themed confirm stating the value "will be stored encrypted in the repo". The hidden include-secrets checkbox is replaced by "Include portable secrets (sealed vault)" with inline consequence copy; push results render the vault summary. api.ts: ProjectSecretEntry, listProjectSecrets, setProjectSecretPortable, PortableSecretsInstallSummary, widened push/unlock/resolve result types.
A4 — one-click git-crypt → sealed conversion. POST /projects/{id}/sync/convert (project-admin; local + SSH; idempotent — already-sealed → 200 no-op; 423 when the account key is locked). Verifies the working tree is unlocked before the per-project DB is opened (auth-first on the global DB; a \x00GITCRYPT project.db → clean 422 instead of a raw database error). Steps: import secrets.enc into the keychain + mark refs portable (_install_project_secrets now reports refs); strip Skybolt's git-crypt rules from .gitattributes (git_sync.strip_git_crypt_attribute_lines/strip_git_crypt_attributes — user rules preserved, file removed when empty); write the sealed gitignore; delete + untrack .skybolt/keys/account.enc, .skybolt/secrets/, and the plaintext project.db; seal at max(existing, 0) + 1; single commit. SSH variant _sync_convert_ssh_sealed (services/sync_ops.py) fetches secrets/.gitattributes over SSH, retires remote artifacts with --ignore-unmatch, uploads only ciphertext. Renderer: "Convert To Sealed Sync" banner on every git-crypt project + themed confirm modal; success shows generation/secrets and flips the panel to the sealed surface.
A5 — git-crypt retired from setup. scripts/setup.ps1 Ensure-GitCrypt and scripts/setup.sh ensure_git_crypt are detection-only (no Scoop bootstrap, no package installs); an installed git-crypt is reported as legacy-only. sync/remote-git-crypt-install
- panel install guidance demoted to legacy paths that recommend conversion. documents/setup.md rewritten: sealed is the sync format, git-crypt is legacy-read-only, conversion is the migration.
Also: /projects/import/ssh now authenticates + checks account access before any SSH fetch or local write (was after the fetch); test_ssh_import_rejects_option_injection_host updated to authenticate first.
Tests (tests/test_sealed_sync.py, 40): include-secrets vault push (committed blob's vault = opted-in refs only, response summary, no plaintext anywhere) + opt-in-without-include shipping no vault; secrets listing/toggle route coverage (machine-scoped 422, values never in responses); vault build rules unit (forced machine-scoped row still excluded; skipped refs); cross-machine vault round trip (unlock installs + re-marks portable); local conversion end-to-end (repo retirement, untracking, sealed status, idempotent re-run, user .gitattributes rules preserved) + locked-tree 422 / missing-key 423 / nothing-to-convert 422 guards; SSH conversion with recorded primitives (ciphertext-only upload, remote retirement commands, secrets installed + marked portable).
Sealed SSH import (closes the milestone-A2 deferral): POST /projects/import/ssh now resumes a sealed-sync remote whose fresh checkout holds no plaintext project.db. When the project.db fetch fails — or the fetched bytes are the SBSEALv1 container itself — the route fetches <remote>/.skybolt/project.sealed, unseals it locally with the account key into the project's engine-local .skybolt dir, then proceeds with the unchanged registration/rehome flow and mirrors the artifact header's generation into project_registry.sync_generation.
- Engine, routes/projects.py: new _import_ssh_sealed_fallback (fetch blob → keyless read_header for the project id → _unseal_project → identity/header consistency check). 423 when the account key is locked (shared _ACCOUNT_KEY_LOCKED_DETAIL from routes/sync.py); 422 for a missing/unopenable artifact or a foreign account key; temp fetch files always cleaned up. git-crypt remotes (readable project.db) take the existing path unchanged.
- Tests, tests/test_sealed_sync.py (+5, _ssh_fetch faked from a local "remote" dir per the existing patterns): blob-only remote imports end-to-end (cards readable, travelled sync meta intact, registry mirror = header generation, installed DB is real SQLite); container at the project.db path falls back via the magic sniff; locked key → 423 with nothing registered; foreign key → 422; neither file readable → one clean 422. The account_key fixture now also patches routes/projects.
- Known gap (also noted in overview follow-ups): re-importing a stale sealed remote over newer already-registered local state is not generation-guarded the way sync/unlock is — matches the plaintext import paths; candidate follow-up.
Sealed-artifact sync integration (ADR-0041 milestone A2): .skybolt/project.sealed is now the sync format for new projects; git-crypt stays as the legacy path (explicit format:"git-crypt" on init, or auto-detected on an already-initialized git-crypt repo).
- Engine, services/sync_ops.py: async _seal_project (snapshot refresh → generation bump → meta persistence → registry mirror → quiesce → WAL checkpoint → seal → write blob) and _unseal_project (verify/decrypt → quiesced DB replace). Generation + the account-key-wrapped data key persist in per-project __project_meta (sync_generation / sync_format / sealed_data_key_wrap — raw key never persisted) and travel inside the sealed payload; the generation is mirrored to project_registry.sync_generation (global schema v13 → v14, additive; helpers in store/project_store.py). _checkpoint_project_db now opens via connect_project (plaintext driver) so it works with at-rest encryption active. Both seal and unseal wrap the file-touching section in runtime_hooks.quiesced_project (ADR-0038 H1/H2 seam; no-op until the P1 scheduler registers). SSH: _sync_init_ssh_sealed / _sync_push_ssh_sealed seal locally and upload only ciphertext (_ssh_upload_file) — no git-crypt on the remote; _ssh_read_remote_sealed_header reads the remote blob header keylessly (first 64 KiB, base64). inspect_project_db reports a blob-only checkout as locked with the keyless header project id.
- Engine, routes/sync.py: sync/status adds format (sealed|git-crypt|none), local_generation, remote_generation, conflict (remote < local, per ADR-0041 anti-rollback). sync/init defaults new projects to sealed (423 when the account key is locked; response carries format/generation, key_path:null). sync/push for sealed projects seals → stages only .skybolt/.gitignore + project.sealed → commits/pushes (private-repo gate plus Machine-owned Git auth); 409 + conflict:true when the working-tree blob's generation diverges; include_secrets → 422 (vault is A3). sync/unlock sniffs the sealed magic before any git-crypt requirement and unseals in place with the account key (no key file; 409 on rollback attempts; 423 locked). New POST /projects/{id}/sync/resolve {resolution: keep_local|take_remote, message?, push?, confirm_private_repo?} — keep_local reseals at max(local, remote)+1 and commits/pushes; take_remote unseals the repo blob over the local DB (route connection closed first). Request models SyncInitFormatRequest/SyncResolveRequest live in routes/sync.py for now (schemas.py owned by a parallel workstream — fold in later).
- Engine, git_sync.py: sealed .skybolt/.gitignore writer (project.db + WAL/SHM ignored, !project.sealed re-included, NO filter lines) + sealed_artifact_path helper.
- Engine, services/git_ops.py: new stage guard _guard_staged_paths wired into _git_stage_or_unstage and both sync push paths — refuses the plaintext .skybolt/project.db (unless an active git-crypt clean filter covers it), always refuses WAL/SHM, and refuses staged text content matching security/redaction.SECRET_VALUE_PATTERNS (bounded directory expansion; .skybolt/secrets/** and project.sealed exempt; refusals name the path, never the content).
- Renderer: SyncResumePanel.tsx is format-aware — Sealed/git-crypt badge, generation counters, sealed init copy (recovery-code confirm; no git-crypt availability gate), conflict resolution UI (keep-local / take-remote behind a themed Modal confirmation), an "apply newer remote state" offer when the repo is ahead, include-secrets hidden for sealed; all legacy git-crypt UI (install guidance, key-file unlock, secrets export) preserved for legacy projects. api.ts: ProjectSyncFormat, generation/conflict fields on ProjectSyncStatus, key_path: string | null + generation on ProjectSyncInitResult, resolveProjectSync + ProjectSyncResolveResult, optional format on initProjectSync.
- Tests: new apps/engine/tests/test_sealed_sync.py (20 — init/push/round-trip on real tmp repos with real crypto, conflict + both resolutions, detect, stage guard, SSH sealed with mocked primitives, persistence); test_git_sync.py sealed gitignore tests; legacy test_sync.py init calls now pass format:"git-crypt" (paths otherwise unchanged, still green including the real-git-crypt E2E); renderer SyncResumePanel.test.tsx rewritten format-aware (22 tests).
- Deferred to follow-ups: sealed SSH import (/projects/import/ssh fetching the blob), A3 vault + per-secret opt-in, A4 conversion flow, A5 setup-script cleanup.
Sealed artifact primitives (ADR-0041 milestone A1): new skybolt_engine/security/sealed_artifact.py implements the SBSEALv1 container — AES-256-GCM payload keyed by a persistent per-project data key, the data key wrapped by the account key in the header (key_id-tagged for rotation/teammate wraps), AAD = the exact header bytes (binds project id + generation + version; the anti-rollback/anti-swap guarantee), keyless read_header() for locked-machine status/conflict checks, strict unknown-version rejection, tolerated unknown wrap types, and length-prefixed project.db + vault payload with a manifest hash double-check. Byte spec in technical-design.md (new). Tests: tests/test_sealed_artifact.py (13 — round-trips, header/ciphertext tamper, wrong-key, version-reject, truncation, non-deterministic ciphertext). Sync integration (seal/unseal in sync_ops, generation persistence, conflict UI, stage guard) is milestone A2 — git-crypt remains the active sync path until then.
Extended in-app git-crypt unlock to SSH-machine projects (was local-only). POST /projects/{id}/sync/unlock now branches on the project's execution-target type — same endpoint and same unlockProjectSync client call.
- Engine: SSH branch in project_sync_unlock (routes/sync.py) delegates to _ssh_remote_git_crypt_unlock (services/ssh_ops.py), which streams the user's local git-crypt key over the SSH-encrypted channel into a unique remote mktemp file (mode 0600), runs git-crypt unlock in the remote repo, and always deletes the temp key afterward (in a finally, even on failure). git-crypt is auto-detected on the remote with a best-effort install and manual guidance fallback; an SSH target with no resolvable remote repo_path is rejected. Key path/contents are never logged or returned. Supersedes the prior "SSH targets rejected — unlock is local-only for now" behavior.
- Renderer: added a native Browse button to both unlock UIs — the "Unlock Encrypted Data" section in the Sync & Resume panel (SyncResumePanel.tsx) and the "Unlock Encrypted Project" connect-existing modal (SkyboltApp.tsx) — to pick the key file from the local machine via a new desktop select_file Tauri command + pickLocalFile helper. The key path stays transient; Browse shows only in the desktop runtime.
- Tests: apps/engine/tests/test_sync.py test_sync_unlock_ssh_* (remote success + cleanup, wrong-key failure + cleanup, git-crypt missing → manual guidance, missing local key, no remote path configured).
Made the local git-crypt install guidance package-manager-aware and copy-pasteable. _git_crypt_install_command (services/git_ops.py) now detects the local package manager via shutil.which (Windows: Scoop → winget → Chocolatey; macOS: Homebrew; Linux: apt/dnf/yum/pacman/ zypper) so the suggested command works on the user's actual setup. GET /projects/{id}/sync/status now returns git_crypt_install_command, and the Sync & Resume panel renders it as a copyable command with a Copy button when git-crypt is missing locally. For SSH projects the panel notes the engine auto-installs git-crypt on the remote during unlock/sync, so nothing needs installing locally. Note: git-crypt unlock is run once per checkout per machine — the key is stored in .git/git-crypt/ and the checkout stays unlocked (Skybolt never re-locks); re-unlock only on a fresh clone or a new machine.

Git Operation Surface —

Added engine helpers for merge (--no-ff default) / merge abort, rebase / rebase abort, cherry-pick / cherry-pick abort, stash save/pop/list, tag create/delete/list, branch delete/rename, fetch, pull (--ff-only default), force push (--force-with-lease, its own action), bounded log, bounded blame attribution, commit amend, and conflict listing (services/git_ops.py), all in the (status, metadata) shape over strict argv runners.
Failed merge/rebase/cherry-pick/stash-pop/pull now return structured conflict metadata: state, bounded conflicts[], conflict_count, and a resolution hint naming the matching abort action.
Extended local dispatch with a shared per-action handler map and lifted the SSH mutating-Git block: the same handlers run over _run_ssh_git (services/git_exec.py). GitHub gh actions over SSH remain pending (M7).
New gates in constants.py: HIGH_AUTHORITY_GIT_ACTIONS now also covers merge, rebase, force_push, branch_delete, tag_delete, cherry_pick, and commit_amend (any history rewrite is high authority in v1); PROTECTED_BRANCH_WRITE_ACTIONS + DEFAULT_PROTECTED_BRANCH_PATTERNS (main, master) force approval for ANY write touching a protected branch, with per-project fnmatch patterns from git_settings_json.protected_branches.
routes/git.py resolves both gates before insert, stamps protected_branch into the approval metadata, and extends the approval titles/bodies per action; unknown actions stay 422.
New tests/test_git_surface.py (18 tests): every new action on tmp repos, conflict structures, approve-executes / reject-discards, protected-branch defaults + custom globs, and SSH routing asserted on exact argv.
Docs: this feature folder created; supersedes the execution-engine doc statement that mutating SSH Git actions stay blocked.

In-App IDE

Fixed: the workspace API now works for SSH-machine projects, not only Local machines. Previously every /projects/{id}/workspace/* endpoint ran local filesystem operations against project_path, which for an SSH target is a remote path — the tree came back empty and reads/writes failed. Each endpoint now branches on target["kind"] ("local" vs "ssh"); the local code paths are unchanged.
For SSH targets the engine drives portable remote shell commands over the existing SSH helpers (_run_ssh_shell, _run_ssh_git, _ssh_fetch, _ssh_upload_file): ls -Ap for the tree, wc -c + cat (fetch to a tempfile) for reads/raw, upload-from-tempfile for writes/create, test/mkdir/mv/rm/rmdir for create/rename/delete, git status --porcelain for git-status, and remote rg → git grep → grep -rn for content search. Remote .gitignore is fetched, appended idempotently, and uploaded back.
Remote path safety: with no local FS to resolve against, SSH paths are sanitized with the existing _safe_relative_path + .git/.skybolt blocklist and every remote path is shlex.quoted before it reaches a shell. The remote search keeps the same ReDoS rule as local — a user regex only runs under a timeout-bounded engine (remote rg/git grep); otherwise it is refused with 422.
Added focused SSH tests in apps/engine/tests/test_app.py (fake-SSH against a temp "remote" dir, mirroring test_sync.py): tree listing + protected-dir exclusion, read content, raw bytes, save via upload, create/rename/delete round-trip, traversal/.git write rejection, idempotent gitignore, git-status (repo + non-repo), and git-grep search + regex refusal.
Fixed: .env and other dotfiles/extensionless files (.gitignore, Dockerfile, etc.) were misreported as binary and could not be opened. Path(".env").suffix is "", so the suffix-only text check never matched. _is_text_like now also matches known text filenames and .env/.env.*. More importantly, the read path no longer refuses on an unknown extension: it sniffs the actual bytes and opens anything without a NUL byte as text (_looks_binary), so unknown-but-textual files still open in the editor. Only genuinely binary content (or images) is withheld. Applies to both local and SSH reads; covered by test_workspace_reads_dotfiles_and_unknown_text_as_text.
Performance: added a cached bulk file index (GET /projects/{id}/workspace/index) so the IDE loads the whole tree in ONE round-trip instead of one tree call per folder — the big win over SSH, where every call is a fresh ssh connection. The engine builds the flat listing with os.walk (local) or a single portable find (SSH, both -type passes in one connection), excludes .git/.skybolt/ soft-excludes, caps at max_entries (truncated → client falls back to lazy per-folder loading), and caches it for 30s keyed by (project, target), invalidated on create/rename/delete. The frontend builds the tree client-side from the index (folders expand with no fetch) and serves filename search instantly from the in-memory index; content search stays a live scan. SSH connection reuse via ControlMaster was not used (unsupported by the Windows OpenSSH client) — the one-shot index sidesteps it by collapsing N folder round-trips into one.
UX: opening a file now shows immediate feedback. The tab appears instantly in a loading state with a spinner (and the editor area shows a spinner) while the content is fetched, instead of a silent lag with no tab; the placeholder fills in when content arrives, or is removed if the read fails. This is most noticeable over SSH where the read is a remote round-trip.

Machines

Retired the runner-era request/response contract in favor of execution_target_id. The engine no longer accepts runner_id in any request schema (terminals, chat threads/runs, browser sessions, source materials, agent sessions, command profiles, Git operations) and no longer writes or echoes runner_id in metadata or API responses; the renderer stopped sending and reading it. Existing DB columns are untouched (additive-only schema). Removed the unused runner enrollment/pairing endpoints. GET /accounts/{id}/runners and POST /accounts/{id}/runners/{id}/revoke remain as documented legacy for the renderer's machine status and delete-machine flows.

Project Import —

Added services/repo_scan.py: bounded, read-only machine-side repo scan (languages, frameworks, package managers, inferred test/lint/typecheck/build commands, Docker/Compose files, docs presence, git metadata) for Local and SSH targets, reusing the workspace index and SSH helpers.
Added services/readiness.py: 22-item readiness audit (present/partial/missing, fixable flags) plus the static project-name-interpolated scaffold template registry (AGENTS.md, /documents skeleton, setup/dev scripts, .env.example).
Added routes/readiness.py: POST /projects/{id}/scan, GET /projects/{id}/readiness, POST /projects/{id}/readiness/fixes/preview (no writes, unified diffs), POST /projects/{id}/readiness/fixes/apply (explicit files only, per-file overwrite gate, run-capable role gate, workspace path-safety, approval_requests audit row per apply). Registered in routes/__init__.py.
Scan responses carry suggested_command_profiles shaped for the existing command-profiles create endpoint; seeding is the wizard's job.
Tests: apps/engine/tests/test_readiness.py (12 tests — detection fixtures covering all six manifest ecosystems, audit fixtures, preview/apply safety, role gating, SSH branch with a fake remote).

Public Site

Created the public site (Pillar 3 milestone P-M9, ADR-0049): apps/site, a new pnpm workspace package @skybolt/site — Astro, TypeScript (astro check via @astrojs/check), static output only (output: 'static'), fonts self-hosted via @fontsource/space-grotesk + @fontsource/inter (no Google Fonts CDN).
Pages v1: landing (static SVG/CSS interpretation of the reference three.js Atmosphere → Void hero — build-time seeded starfield + accent agent constellation; island upgrade path documented), /features (six cards sourced from the feature overviews), /changelog (generated at build time from documents/features/*/changelog.md by src/lib/changelog.ts — drift-tolerant ## YYYY-MM-DD parser, grouped by date then feature, newest first), /pricing (Free vs Premium placeholder, "coming soon", no billing), and a waitlist section (stub submit handler; documented note pointing at the future apps/api waitlist endpoint — nothing is transmitted).
Visual identity per documents/design-system.md: void base, #0077e4 accent, 2px radii, glass panels, tracked uppercase micro-labels, dense technical layout; responsive and prefers-reduced-motion-gated animation.
Workspace/CI wiring: added apps/site to pnpm-workspace.yaml (and approved sharp builds for Astro), added root site:build script, appended the site CI job (astro check + pnpm --filter @skybolt/site build) to .github/workflows/ci.yml.
Follow-ups: Caddy serving block for skybolt.ai on the ADR-0043 droplet (documents/features/cloud-deployment/), real waitlist endpoint in apps/api + form wiring, optional three.js hero island, documents/features/index.md entry (deferred — concurrent-wave file ownership).

SSH Machines

Hardened SSH endpoint handling: host/user values are now validated where the connection spec is built (SshMachineSpec.__post_init__ in apps/engine/skybolt_engine/adapters/ssh.py), so every ssh_ops path (probe, import, shell, git, file transfer, remote git-crypt unlock) automatically rejects a host/user that ssh could misread as an option flag (e.g. -oProxyCommand=...) instead of only the probe adapter. /projects/import/ssh now returns a clean 422 for such values rather than a 500. No shell-injection was possible (argv lists, no shell=True); this closes the lower-severity flag-injection gap flagged by the remote git-crypt unlock security review. Tests: apps/engine/tests/test_adapters.py and apps/engine/tests/test_sync.py.
Added a Test connection button to the saved SSH Machine card (ProjectEnvironmentPanel in apps/web/src/app/SkyboltApp.tsx) that re-runs the live GET /projects/{id}/machine/probe on demand, with visible probe status (Checking / Online / Offline) and error text. Previously the renderer only probed on entering the Environment tab or on a settings change, so a Machine that was offline at setup stayed visually offline after the user fixed SSH and had to be deleted and re-added. Frontend-only; the backend probe route is unchanged. Test: apps/web/src/app/SkyboltApp.test.tsx "re-probes the saved ssh machine when Test connection is clicked".

2026-06-08

Agents Execution

Designed the multi-agent orchestration and conflict-prevention architecture and recorded it as ADR-0036 and the agents-execution feature docs (overview.md, technical-design.md, storage-and-portability.md, test-plan.md). No code changed this round.
Chose SQLite (WAL + busy timeout) + an in-process asyncio event bus + the existing renderer WebSocket as the coordination substrate, and rejected the earlier Redis proposal as incompatible with local-first/offline-after-install and the "engine owns runtime / no external services" rules.
Defined the executor (an in-engine asyncio scheduler) as the missing piece that advances queued agent sessions: claim → prepare worktree → launch CLI seat with a prompt package → heartbeat → complete → advance card → unblock dependents, with TTL-based lease expiry and restart/cross-machine recovery.
Defined the four-tier conflict-prevention model: per-card worktree isolation, scope-disjoint scheduling over a dependency graph, an advisory overlap radar (escalate, do not deadlock), and an interface-first Shared Artifact Registry — the primary defense against duplicate helpers/types.
Set concurrency as N-agnostic with a default of 3-5 and admission control over global/machine/ provider/disk capacity; documented 30 concurrent agents on one host as unrealistic.
Recorded the per-project storage and cross-machine portability design as ADR-0037 and storage-and-portability.md: a global engine database plus a per-project .skybolt/project.db, split into machine-agnostic (portable) and machine-keyed (reconciled-on-open) rows, designed to be encrypted before being committed.
Updated implementation-plan.md to replace the Redis-era assumptions with the SQLite direction, the per-project storage split, and the phased build (P0-P4).

Desktop App

Made .\scripts\setup.ps1 install missing Windows prerequisites by default, with -NoInstallPrereqs available for check-only environments.
Made .\scripts\dev.ps1 the documented Windows desktop dev command, with -Desktop retained only as a compatibility alias.

Encrypted Project Sync

Added read-only project detection so connecting an existing folder auto-offers to import an existing Skybolt project instead of forcing a re-set-up on a second computer.
- Engine: POST /projects/detect (routes/projects.py detect_project), request model ProjectDetectRequest (schemas.py), and filesystem helper inspect_project_db (services/sync_ops.py). Session-gated like POST /projects/import, no account_id; HTTP 200 for all detection outcomes; reports none/locked/readable/unreadable from the project.db header plus already_imported against this machine's project_registry. Never registers or mutates anything.
- Renderer: completeProjectGitSetup (apps/web/src/app/SkyboltApp.tsx) calls detectProject before the bare existing-repo connect. readable + not imported → "Existing Skybolt Project Found" modal (Import vs Connect Without Importing); readable + already imported → switch to it; locked → show the git-crypt unlock hint and stop; none/unreadable (or a failed detect) → proceed with the normal connect. Client wrapper detectProject + ProjectDetectResult in apps/web/src/app/api.ts. Clone / new-repo providers unchanged.
- Tests: apps/engine/tests/test_sync.py test_detect_* (4 tests).
Added in-app git-crypt unlock so an encrypted checkout can be decrypted without a terminal.
- Engine: POST /projects/{id}/sync/unlock (routes/sync.py project_sync_unlock), request model SyncUnlockRequest (schemas.py). Session-gated, project_admin only. Request { key_path, repo_path? }; repo_path optional (resolved from the project's default execution target when omitted; SSH targets with no explicit repo_path were rejected at the time — unlock was local-only, since extended to remote SSH unlock on 2026-06-09 above). Runs git-crypt unlock <key_path> in the repo (argument list, no shell; key contents never logged). Response { unlocked, repo_path, detail }.
- Renderer: unlockProjectSync (apps/web/src/app/api.ts) wired into (a) the connect-existing detect flow — a locked detection now opens an "Unlock Encrypted Project" modal (SkyboltApp.tsx): enter key file path → unlock → auto re-detect → import prompt; and (b) the "Sync & Resume" panel as an "Unlock Encrypted Data" section (apps/web/src/app/project/tabs/SyncResumePanel.tsx), shown when in_repo and git_crypt_initialized.
- Tests: apps/engine/tests/test_sync.py test_sync_unlock_* (happy path, explicit repo_path, missing key file, requires git repository); renderer tests in SkyboltApp.test.tsx and SyncResumePanel.test.tsx.
Made project delete non-destructive for repo-hosted databases.
- Engine: DELETE /projects/{id} (routes/projects.py delete_project) now removes the engine-local project.db only for non-repo-hosted projects. When .skybolt lives inside a git repo (detected via <.skybolt parent>/.git), it de-registers the project but leaves the git-tracked .skybolt/project.db on disk so the project can be re-imported via the connect-existing detect flow. Docstring note added to _remove_project_db (store/project_store.py).
- Tests: apps/engine/tests/test_sync.py test_delete_preserves_repo_hosted_db, test_delete_removes_non_repo_hosted_db.

Execution Engine

Fixed terminal deletion so subprocess-backed terminals are spawned in their own process group/session and Kill/Kill All terminates the group before deleting the terminal row. This prevents web servers launched from a terminal shell from surviving after Skybolt clears the panel.
Changed renderer terminal capacity handling to ignore legacy machine-local runner capacity and show/enforce only project terminal capacity.
Disabled config-defined SSH port forwarding on normal Execution Engine SSH terminal, probe, command, and Git calls so connecting to a saved SSH config entry does not automatically bind local preview ports.
Added a project-scoped Machine probe route so the renderer can display SSH Machine connectivity status without opening a terminal.
Fixed desktop-engine terminal creation for SSH Execution Targets. SSH terminals now spawn a structured OpenSSH argv behind the existing terminal WebSocket bridge and use the remote project path as the SSH startup directory command instead of passing it through local filesystem validation.
Fixed Dev Ops setup so the saved Project Execution Target appears as the selectable Machine, including SSH Machines that do not have a legacy runner row.
Added Connect Existing GitHub Repo and Connect Existing Azure DevOps Repo setup paths that save the selected Machine project path and provider without cloning, fetching, checking out, or changing files.
Made existing-checkout setup use a single repo status readiness probe and ignore stale clone failures so it does not briefly fail and then swap to ready after background metadata catches up.
Kept SSH existing-checkout readiness lightweight by skipping optional origin/HEAD and remote enumeration during the initial status probe, while preserving the selected Git provider from saved setup metadata.
Added read-only SSH Git metadata probes for Dev Ops status, branches, commit history, compare, and GitHub auth checks through quoted OpenSSH commands.
Raised SSH Git read command timeout to two minutes so slower SSH connections do not fail basic existing-repo metadata probes such as git remote -v.
Detached stdin for SSH Git subprocesses so OpenSSH cannot inherit terminal input while the engine is collecting read-only metadata.

In-App IDE

Added an in-app IDE in the Files tab: browse, open, edit, create, rename, and delete files and folders across the project's git working tree (project_path), with an open-files tab bar, dirty tracking, and an unsaved-changes guard. Replaces the old assets/-only Files page scope.
Chose Monaco (the VS Code editor core) as the editor, with self-hosted offline workers: the editor and language workers are bundled by Vite and served from the app — nothing from a CDN — so the IDE works fully offline and behaves identically across Linux/macOS/Windows in the desktop webview. Monaco was chosen over a lighter editor for room to expand (multi-language intelligence, diffs) without re-platforming.
Scoped the IDE to the whole repo working tree, with .git/ and .skybolt/ hidden from the tree and write-protected on every read/write/rename/delete/gitignore path. Source stays local; no new sync or exfiltration (ADR-0034/ADR-0035 privacy posture preserved).
Added git-status badges (from git status --porcelain; non-repo trees report no statuses rather than erroring) and a tiered filename + content search (ripgrep → git grep → a bounded os.walk floor that always works, no external indexer).
Added an auth-aware image viewer (Blob object-URL with the bearer credential, revoked on unmount) and a right-click "Add to .gitignore" that appends idempotently as a plain file edit, not a high-authority Git action.
Added the backend workspace API (/projects/{id}/workspace/*: tree, git-status, file, raw, entries[create], rename, entries[delete], search, gitignore) in apps/engine/skybolt_engine/app.py, with reads gated to project membership and writes gated to run-capable roles (_project_can_run). Path safety reuses _safe_relative_path + _path_inside plus a .git/.skybolt blocklist.
Added the frontend apps/web/src/app/project/ide/ module (CodeWorkspace, useCodeWorkspace, file-tree, context menu, open-files bar, editor pane, lazy Monaco editor, image viewer, search panel, new-entry dialog, fileLanguage, monacoSetup), wired through FilesTab/SkyboltApp, with new workspace API client functions/types in api.ts and new icons. Raised the Vite workbox maximumFileSizeToCacheInBytes to 8 MB so service-worker precache generation tolerates Monaco's code-split local worker chunks.
Setup scripts (scripts/setup.ps1, scripts/setup.sh) now ensure git (required) and install ripgrep (optional, for fast/regex IDE search) so search uses the ripgrep engine on a freshly set-up machine; the engine resolves them off the OS PATH via cli_resolution.resolve_local_cli, falling back to git-grep/os.walk when rg is absent. documents/setup.md updated to match.
Recorded the decision as ADR-0039 (in-app IDE workspace file access).

Machines

Fixed Machine and Dev Ops settings appearing unconfigured after every app restart. The project listing (/auth/me, /projects) read display-only rows from the global project_registry, which no longer carries per-project settings after the ADR-0037 storage split, so isolation_policy, git_settings, and the project paths came back null on launch and the app prompted re-setup. The saved values were never lost — they live in the per-project project.db — so listings now overlay the authoritative settings row from each per-project database (_list_user_projects).
Fixed Dev Ops Machine labels so a saved SSH Project Execution Target without a legacy runner row appears as available instead of stale.
Removed the user-facing legacy machine-local terminal capacity from the renderer. Terminal badges, Machine rows, overview readiness, and new-terminal gating now use only project terminal capacity when the Execution Engine reports it.

SSH Machines

Disabled SSH config-defined port forwarding for normal SSH terminal, probe, command, and Git operations so saved config entries do not accidentally bind preview ports such as 127.0.0.1:8000.
Added a project Machine probe endpoint and wired the Environment tab SSH tag to show green when the saved SSH Machine can connect and red when the probe fails.
Fixed SSH Machine terminal creation so the Execution Engine launches an OpenSSH PTY with the saved SSH config entry or host/user/port and starts in the remote project path instead of validating the remote path as a local folder.

2026-06-07

Browser Redesign

Removed the sessions rail New button and made Kill All full width.
Moved New and Kill All into the sessions rail and removed the visible top Browser toolbar.
Removed the visible target/save-target controls from the desktop Browser UI, added inline session rename, changed session close to an X, and removed the success status-message copy.
Removed internal session status and browser-kind tags from the desktop Browser UI.
Replaced the desktop Browser console panel with a left-side vertical sessions rail and DevTools inspection.
Routed embedded Browser new-window link requests back into the Skybolt Browser pane instead of Tauri shell.open.
Added a DevTools action for the embedded Tauri Browser webview.
Switched the visible desktop Browser pane from renderer iframe embedding to a native Tauri child webview so external sites with frame restrictions can load inside Skybolt.
Added Execution Engine browser target/session endpoints for local Chromium launch.
Updated the Browser tab to use desktop Chromium sessions with a local console panel.
Created Browser Redesign feature docs.
Documented local browser automation and SSH-forwarded URLs as the active direction.
Demoted runner WebRTC streaming to fallback.

Desktop App

Fixed desktop sidecar preparation so it creates Tauri's expected externalBin before Tauri's Cargo build validates bundled resources.
Added the standard Tauri desktop icon set generated from the existing Skybolt web app icon so Windows resource generation can complete.
Aligned the Tauri desktop dev URL with Vite's localhost bind so Tauri stops polling 127.0.0.1:50000 while Vite is listening on localhost:50000.
Set the desktop Cargo default binary and removed the extra literal -- from the Vite dev command so Tauri launches skybolt-desktop and Vite honors --strictPort.
Moved Skybolt-owned local dev ports to 50000 and above so lower ports remain available for SSH development sites and forwarded project previews.
Fixed the desktop sidecar reload flag so the Rust sidecar accepts --reload and forwards it to the Python Execution Engine.
Routed the desktop runtime directly into the app shell so launch goes to login, first-run setup, or the dashboard based on the normal auth session check instead of showing the marketing site.
Wired desktop renderer API calls to the Tauri-managed engine session and added local engine auth/session endpoints so desktop development does not require the optional cloud API on localhost:50002.
Added local engine project CRUD, health, runner listing, and empty runtime collection endpoints so creating a project in the desktop app no longer returns {"detail":"Not Found"}.
Added Windows PowerShell setup/dev scripts for the desktop flow and documented .\scripts\setup.ps1 plus .\scripts\dev.ps1.
Wired desktop development to set the Python engine reload flag while keeping Tauri/Vite hot reload as the default renderer path.
Created the desktop-first feature doc set.
Documented Tauri as the active shell direction.
Documented the expected bash scripts/dev.sh, pnpm desktop:dev, and pnpm desktop:build startup paths.
Added the apps/desktop Tauri scaffold with engine sidecar launch, per-launch token creation, desktop app-data database wiring, and shutdown cleanup.
Documented the temporary sidecar health-stub fallback; this was later superseded by the self-contained bundled Python engine release path above.

Execution Engine

Fixed Codex/Claude Code Project Board planning sessions failing to enter plan mode. Root cause: the programmatic terminal-input queue advanced inside an impure setState updater that mutated a backlog ref. Because the app runs under React.StrictMode, which double-invokes updaters, the second invocation saw the already-popped backlog and committed a state with the item dropped, so queued inputs (/plan, Enter, the planning prompt) were lost intermittently — /plan typed but never submitted, or not typed at all. The queue is now a single per-terminal FIFO array in state with pure updaters and a monotonic id counter (no Date.now() id collisions), so every queued input is delivered in order. Manual typing was unaffected because it bypasses the queue. This also fixes the same drop for any other programmatic terminal input (e.g. send-to-console). Regression guard: the planning CLI test now renders under StrictMode (it times out against the old queue).
Planning also waits (best-effort) for the CLI to finish booting — recognized banner or settled output, with a fixed-delay fallback so /plan is always typed even when the raw-ANSI banner isn't recognized — captures the /plan echo marker right before typing, lets the slash-command autocomplete menu render, then submits with Enter plus a second spaced Enter (a redundant empty Enter is ignored by the CLI). These harden the typed sequence on top of the queue fix.
Raised the Execution Engine to Python 3.14 and added CLI reload support for desktop development.
Created the Execution Engine feature docs.
Documented apps/runner as migration source for apps/engine.
Documented neutral engine naming and local-first privacy boundaries.
Added the apps/engine Python package scaffold with FastAPI loopback API, token-protected session/probe endpoints, fresh SQLite schema, metadata redaction helpers, and Local/SSH adapter probes.
Added local project board card persistence and CRUD endpoints to the Execution Engine.
Removed the renderer Project Board planning drafts strip; the board now loads direct card data without planning-draft reads.
Added local built-in agent persona persistence plus the seed/list endpoints used by the Agents tab.
Expanded the built-in persona library to 50 business and software delivery roles, with reseeding support that refreshes existing built-in rows.
Added desktop-engine chat thread persistence and terminal request persistence for the Chats page.
Added a compatibility Local Machine response for the current renderer while the UI still consumes legacy runner-shaped API fields.
Added local model provider, catalog model, and CLI seat persistence so Chats can offer only Settings-enabled chat targets.
Added an engine-local terminal broker, terminal attach endpoint, and WebSocket bridge so Codex/Claude Code chat terminals move from requested to live without the legacy runner adapter.
Added non-destructive local schema repair for older terminal_sessions tables so terminal creation does not fail on existing desktop databases.
Reconciled local terminal rows against the engine terminal broker so CLIs that exit before attach are marked exited with an actionable install/environment message instead of retrying stale WebSocket attachments.
Added local project-agent persistence so personas can be added to Active Agents in desktop mode.
Added an Agents tab persona detail view and first-name active-agent display names for built-ins.
Added desktop-engine Dev Ops Git operation and approval routes so the Dev Ops page can run local Git status/history/branch/stage operations and route commit/push/PR creation through approvals instead of hitting placeholder routes that returned 405.
Refined the Dev Ops Git workflow with branch checkout, setup-only GitHub repo clone controls, combined commit/push approval, and broader Windows GitHub CLI discovery for the desktop engine.
Added Dev Ops branch creation and adjusted the commit controls so stage, unstage, commit, push, and commit/push share one action row.
Moved Dev Ops branch checkout and branch creation controls into a dedicated top-right Branches panel.
Added select-all for changed files, made Push a primary action, and surfaced GitHub CLI install plus auth script blocks in the GitHub auth warning.
Changed GitHub CLI install links to explicit "GitHub CLI install instructions" buttons that open the install page directly.
Clarified Windows GitHub CLI auth scripts by labeling PowerShell and Git Bash separately and avoiding Bash $PATH syntax that PowerShell parses badly.
Removed the redundant GitHub Auth button, changed unready Dev Ops status to red "Not Ready", and added explicit GitHub vs Azure DevOps setup choices.
Changed new/unready projects to show only the Dev Ops setup flow until the repo verifies as ready.
Gated fresh Dev Ops setup behind an Environments Machine, made setup require Machine, provider, repo, default branch, and branch prefix, and added the Complete Setup action.
Changed fresh GitHub setup to use the repository default branch from GitHub CLI metadata when available, while keeping feature/ as the default branch prefix.
Changed fresh GitHub setup to load available branches for the selected repo and show Default branch as a dropdown that starts on GitHub's marked default branch.
Removed manual Load Repos and Load Branches buttons from Dev Ops setup; repositories and branches now refresh automatically as the setup choices change.
Added a modern native Windows/Linux folder browser for local Machine setup and kept Complete Setup blocked when the selected project folder already exists or may contain files.
Changed Dev Ops setup to use the selected Machine's saved project path as the clone target and the default sibling worktree root without showing project folder or worktree root inputs. The Git settings card also hides worktree root and derives the same default on save.
Changed ready Dev Ops Git Setup to hide repo path, show branch defaults as dropdown data, and refresh branch metadata after branch creation or checkout. The Git Setup card no longer repeats ready/provider tags.
Added explicit inline loaders for setup clone and branch checkout operations.
Removed Verify Setup, changed existing project folders to a Yes/No confirmation modal, and kept blocked or failed clone messages from duplicating as page-level errors.
Blocked all Dev Ops actions when a configured project no longer has a Machine, without forcing a full Dev Ops re-setup, and added automatic repo readiness checks when the Dev Ops tab opens.
Added the renderer AI Planning Session Mission Room for the Project Board, using durable planning chat threads and staged card creation without adding backend planning-draft schema.
Added Planning Session model selection across enabled provider models and Codex/Claude Code seats.
Changed Codex and Claude Code CLI seats from account/user scope to project/user scope so new projects start with independent local CLI settings.
Blocked Dev Ops Git status operations until the project has an explicit repo path so fresh projects cannot inspect the Skybolt checkout through execution-target or engine-cwd fallback.
Fixed local terminal output streaming so Windows shells and short-running commands render prompt bytes before the process exits instead of waiting for a full pipe buffer.
Improved planning terminal startup errors so failed or non-live terminal sessions surface the engine status message instead of continuing into a generic attach failure.
Added Windows PTY support through pywinpty so local interactive CLIs such as Codex run with a real terminal instead of exiting with stdin is not a terminal.
Added planning-seat availability checks for Codex and Claude Code so missing local CLIs are blocked before Skybolt opens a terminal session.
Fixed Local Machine CLI capability probing so the Project Board sees Codex and Claude Code when they are installed in common Windows locations even if the desktop sidecar starts with a narrower PATH than the user's terminal.
Sanitized terminal-control output before showing Codex/Claude Code planning responses in the Mission Room transcript, and launch Codex with --no-alt-screen to reduce full-screen TUI redraws.
Changed Codex/Claude Code planning sessions to show an embedded live console in the Mission Room instead of relying only on sanitized terminal text in chat bubbles.
Removed the guided answer controls from CLI-backed planning sessions so the embedded terminal can fill the Mission Room console panel.
Restored automatic Mission Room prompt send for Codex/Claude Code planning sessions by queuing /plan, waiting until it is visible in the live console, sending Enter, then sending the generated planning prompt in order.
Added a Mission Room planning session manager with Start New, Resume, Start Over, and Delete actions. Planning sessions remain hidden from normal Chats and resume from local-only planning_state snapshots stored in existing chat thread metadata.
Changed Mission Room Start New so the model selected on the manager screen is used immediately, with no duplicate model dropdown or second start button in the session view, and added Back from the active session to the manager.
Added Planning Session cleanup for attached CLI terminals so deleting or starting over a Codex or Claude Code planning session stops stale planning terminals from piling up.
Wrapped multiline planning terminal prompts in bracketed paste markers so Codex and Claude Code TUIs receive one pasted prompt instead of scrambled raw keystrokes.
Fixed terminal Ctrl+V paste in the desktop renderer by reading clipboard text during the key gesture while keeping browser paste events as a fallback.
Changed blank Windows terminal launches to prefer detected PowerShell over %COMSPEC%, changed Linux/macOS blank terminal launches to default to Bash, and wired terminal creation to use the selected Machine's default shell override.
Applied terminal attach/resize dimensions to the Windows PTY and refreshed the xterm viewport after output writes so new terminal cursors render at the live prompt before first input.
Removed the synthetic metadata-only notice from the xterm buffer so Windows shells that use absolute cursor positioning, such as PowerShell with PSReadLine, do not render first input on the wrong row.
Filtered xterm-generated terminal response sequences, including device-attribute replies such as ESC[?1;2c, before forwarding renderer input to the PTY so PowerShell startup probes do not appear as typed continuation text.
Added project-local Brain context packages that include Brain-enabled source materials, general chats, and Files assets. The Files page manages project assets under the selected assets folder, defaults new uploads to Brain Off, lets users mark individual files in or out of Brain, and stays blocked until the project has a saved Machine project folder.

Legacy Control Plane

Marked the control-plane feature docs as superseded by ADR-0035.
Demoted runner-first setup, pairing, enrollment, and direct runner WSS instructions to historical context.
Pointed future work at Desktop App, Machines, Execution Engine, SSH Machines, and Browser Redesign docs.

Machines

Added a Windows-only Local Machine default-shell dropdown on the Environment tab using detected PowerShell and Command Prompt locations.
Changed Environment saved actions from Compose-specific presets to named command actions with a target Machine and an auto-run-on-new-terminal option.
Folded underlying runner status into saved Machine rows so refreshing with an online local runner does not show two Local Machine entries.
Changed Add Machine to open a setup modal and made saved Machine settings render as rows with Edit and Delete actions.
Moved Machine environment choice and source-material path into the Environment tab.
Moved Git repo/worktree settings to Dev Ops so provider-specific Git, GitHub, and Azure DevOps workflows share one surface.
Made desktop-engine terminal requests use the saved Local or SSH Machine project path as the startup directory.
Added user-facing Machine delete flow that removes the Machine from the active list while preserving runtime history.
Created the Machines feature docs.
Established Machine and Project Execution Target as the active product language.
Documented the replacement of user-facing Runner concepts.

SSH Machines

Created SSH Machines feature docs.
Documented Linux/macOS remotes as v1 scope.
Documented host-key and credential-storage requirements.