Experiment Persistence dakka

Playwright specs covering spawn/dismiss, avatars, MQI, and responsiveness can be written against the Mork dashboard once pre-existing warband state is isolated per test.

Tags: dakka, e2e, playwright, mork
Result: confirmed
Key Findings

16/16 pass (14 + 2 conditional skips). Two product bugs found and fixed: destroy-rebuild in MascotBar races Playwright locators, and avatar canvases never mount for boys spawned before the avatar manifest fetches. Three pitfalls documented: release-binary drift, boy:krump vs boy:dismiss semantics, destroy-rebuild-races-Playwright.

Hypothesis

Playwright can cover the Mork dashboard’s five critical surfaces (spawn/dismiss, avatar rendering, MQI hover, MQI accuracy, responsiveness) once pre-existing warband state is isolated per test via a WebSocket-based beforeEach. The spec work is primarily a documentation-of-expected-behavior task; product bugs should be rare.

Method

  1. Inventory the five domains (S1/S4/S5/S6, A1/A2, M1/M3/M4/M5, D2, R1/R2/R3/R4/R5).
  2. Write e2e/helpers.ts::dismissAllBoys(baseURL): opens /ws, reads warband:init, sends boy:dismiss for each live boy, waits for matching boy:removed acks, filters binary PTY frames.
  3. Wire dismissAllBoys as beforeEach in every spec.
  4. Run against ./target/debug/rusty-dakka spawn --port 7331 & (background; cargo run blocks the caller).
  5. Iterate on real failures: fix spec bugs against real product behavior, fix product bugs when the spec reveals them.
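The cleanup helper from step 2 can be sketched as below. This is a minimal sketch, not the code in e2e/helpers.ts: the message field names (type, boys, id), the SocketLike interface, and the name dismissAllBoysOn are assumptions made for illustration, since the source names only the message kinds. It takes an already-open socket rather than a baseURL so that the logic can run without a server.

```typescript
// Assumed message shapes: the source names the message kinds
// (warband:init, boy:dismiss, boy:removed) but not their fields.
type ServerMsg =
  | { type: "warband:init"; boys: { id: string }[] }
  | { type: "boy:removed"; id: string };

// Minimal socket surface (hypothetical) so the logic runs without a server.
interface SocketLike {
  send(data: string): void;
  onMessage(handler: (data: string | ArrayBuffer) => void): void;
  close(): void;
}

// Returns null for anything that is not parseable JSON.
function parseMsg(data: string): ServerMsg | null {
  try {
    return JSON.parse(data) as ServerMsg;
  } catch {
    return null;
  }
}

// Dismisses every boy listed in warband:init and resolves with the number of
// boy:removed acks received; rejects if acks are still missing at the timeout.
function dismissAllBoysOn(socket: SocketLike, timeoutMs = 5000): Promise<number> {
  return new Promise((resolve, reject) => {
    const pending = new Set<string>();
    let sawInit = false;
    let removed = 0;

    const timer = setTimeout(() => {
      socket.close();
      const waiting = Array.from(pending).join(", ");
      reject(new Error(`timed out waiting for warband:init or boy:removed (pending: ${waiting})`));
    }, timeoutMs);

    const finish = () => {
      clearTimeout(timer);
      socket.close();
      resolve(removed);
    };

    socket.onMessage((data) => {
      if (typeof data !== "string") return; // binary PTY frame: ignore
      const msg = parseMsg(data);
      if (msg === null) return; // not JSON: ignore

      if (msg.type === "warband:init" && !sawInit) {
        sawInit = true;
        // Register every id before sending, so an early ack cannot end the wait.
        msg.boys.forEach((b) => pending.add(b.id));
        msg.boys.forEach((b) => socket.send(JSON.stringify({ type: "boy:dismiss", id: b.id })));
        if (pending.size === 0) finish();
      } else if (msg.type === "boy:removed" && pending.delete(msg.id)) {
        removed += 1;
        if (pending.size === 0) finish();
      }
    });
  });
}
```

In a spec, a thin adapter around a real WebSocket connected to /ws would supply the SocketLike object, and the call would sit in beforeEach as step 3 describes.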

Results

Progression

| Iteration | State | Why |
| --- | --- | --- |
| Start | 0 / 16 pass | #btn-spawn-a disabled because 2 pre-existing boys put multiclaude mode into the boysA.length >= 2 disabled state. No tests had a way to clear state. |
| + dismissAllBoys helper (first cut) | Still 0 / 16 | Helper sent boy:krump expecting boy:removed. Krump doesn’t unregister. See [topics/pitfalls/boy-krump-does-not-unregister](/topics/pitfalls/boy-krump-does-not-unregister). |
| + boy:dismiss fix | 5 / 16 | Setup works. Tests still fail on selector bugs and spec-vs-reality mismatches. |
| + selector fixes (.agent-card-dismiss, state: 'attached', 6s timeouts) | 10 / 16 | Real failures start surfacing: .agent-card “resolved to visible” then timeouts, .mqi-indicator detached during hover. |
| + M4 JSON-parse-before-skip guard | 11 / 16 | Empty-body 404 from stale release binary → SyntaxError → test cancel. |
| + switch target/release (2026-04-17) → target/debug (2026-04-23) | 11 / 16 (M3/M5 flicker) | Release binary predated the Task 6 MQI fix; debug matched source. See [topics/pitfalls/rusty-dakka-release-binary-drift](/topics/pitfalls/rusty-dakka-release-binary-drift). |
| + MascotBar reconcile refactor | 10 / 16 | The reconciler stopped re-calling renderIntoIconBox on every update; boys that spawned before the avatar manifest fetched never got a canvas. Regression from A1/R4. |
| + AvatarManager.init().then(updateMascotBar) re-paint | 14 / 16 (+ 2 skip) | All real failures cleared. Remaining 2 are conditional skips (D2, M4 both guard if (mqi == null) test.skip() against bloomnet.db which has no session rows). |
| + workers: 1 in config | 16 / 16 (14 + 2 skip) | Previously-flaky failures at workers > 1 were contention over the single dakka server’s warband; serializing tests eliminated the race. |
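The release-to-debug switch in the progression above suggests a pre-flight staleness check. The sketch below is one way to do it, assuming modification times are a good enough proxy for "built from this source"; the function names are hypothetical and the paths are left to the caller, because the source does not say which handler files matter.

```typescript
import { statSync } from "node:fs";

// Pure comparison: the binary counts as stale if any source file
// was modified after the binary was built.
function isBinaryStale(binaryMtimeMs: number, sourceMtimesMs: number[]): boolean {
  return sourceMtimesMs.some((m) => m > binaryMtimeMs);
}

// Filesystem wrapper. Both arguments are supplied by the caller;
// no project paths are assumed here.
function checkBinaryDrift(binaryPath: string, sourcePaths: string[]): boolean {
  const binaryMtime = statSync(binaryPath).mtimeMs;
  const sourceMtimes = sourcePaths.map((p) => statSync(p).mtimeMs);
  return isBinaryStale(binaryMtime, sourceMtimes);
}
```

Modification times are only a heuristic: a fresh checkout or a touched file can make a correct binary look stale, so a positive result is a prompt to rebuild, not proof of drift.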

Product fixes shipped

  1. crates/ui/static/js/sidebar.js: MascotBar.update() now reconciles by data-boy-id instead of destroying both clusters on every snapshot. New _upsertAgentCard + _applyCardVisuals methods mutate cards in place; avatar canvas stays mounted across state transitions. See [topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright](/topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright).
  2. crates/ui/static/js/app.js: after AvatarManager.init() resolves, call updateMascotBar() if any boys are already registered. This is the retry path for the manifest race.
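The first fix replaces destroy-and-rebuild with a keyed reconcile. The sketch below shows that idea on plain objects, not on the DOM: Boy, Card, and reconcileCards are hypothetical names, and a Map keyed by boy id stands in for the cards keyed by data-boy-id. The point it illustrates is that an existing card is mutated in place, so whatever is attached to it (here a canvasMounted flag) survives the update.

```typescript
// Hypothetical shapes; the real MascotBar works on DOM nodes keyed by data-boy-id.
interface Boy {
  id: string;
  state: string;
}
interface Card {
  boyId: string;
  state: string;
  canvasMounted: boolean;
}

// Reconcile cards against a snapshot without recreating cards that already exist.
function reconcileCards(cards: Map<string, Card>, snapshot: Boy[]): void {
  const live = new Set(snapshot.map((b) => b.id));

  // Remove cards whose boy is no longer in the snapshot.
  for (const id of Array.from(cards.keys())) {
    if (!live.has(id)) cards.delete(id);
  }

  // Upsert: mutate existing cards in place, create only the missing ones.
  for (const boy of snapshot) {
    const existing = cards.get(boy.id);
    if (existing) {
      existing.state = boy.state; // same object, so attached state survives
    } else {
      cards.set(boy.id, { boyId: boy.id, state: boy.state, canvasMounted: false });
    }
  }
}
```

A test locator that resolved a card before the update still points at the same node afterwards, which is what removes the race.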

Spec-only fixes

  • .agent-card-dismiss is the real class (spec drafts used .dismiss-btn|.krump-btn|[aria-label="Dismiss"], none existed).
  • M4: resp.text() + JSON.parse try/catch before skip check.
  • A1: .count() instead of .textContent() (the latter’s 30s default action timeout burned the entire test budget when the element was absent).
  • R4: assertion fix. Each boy owns two canvases (mascot + terminal), so dismissing the only boy drops the count to zero, not to before - 1.
  • workers: 1 in e2e/playwright.config.ts.
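The M4 guard in the list above can be sketched as a small helper. The name parseJsonOrNull is an assumption for illustration; the pattern is to read the body as text and parse defensively, so that an empty or non-JSON response leads to a skip and not a SyntaxError.

```typescript
// Returns the parsed body, or null when the body is empty or not valid JSON.
function parseJsonOrNull(body: string): unknown {
  if (body.trim() === "") return null;
  try {
    return JSON.parse(body);
  } catch {
    return null;
  }
}

// Intended use inside a spec (Playwright calls shown as comments so the
// sketch runs standalone):
//   const mqi = parseJsonOrNull(await resp.text());
//   test.skip(mqi == null, "no MQI data available");
```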

Findings

  1. Write once, run iteratively. The 2 h session split roughly: 20 min diagnosing why the button was disabled (initial misread as “cargo run blocks”), 40 min on helper + selector iteration, 40 min on the MascotBar refactor + retry kick, 20 min on documentation. The diagnose-before-iterate discipline is what kept the scope bounded.

  2. Pre-existing server state is a first-class test-isolation concern. Dakka’s server keeps boys across client reconnects. A spec that calls page.goto('/') and expects an empty dashboard is making an undocumented assumption. The cheap fix is a beforeEach WS cleanup; the cleaner path is to stand up a fresh server per run, but that costs ~3s per test suite in release-binary startup.

  3. Binary drift is invisible until it isn’t. The 2026-04-17 release binary had served happily for six days. Nothing about its behavior was wrong in isolation; it was wrong only relative to the spec written against the 2026-04-23 source contract. Always diff target/release/<bin> mtime against the handler source mtime before trusting a “known good” server.

  4. Destroy-rebuild is a React-to-vanilla-JS porting hazard. React’s reconciler makes “wipe the cluster, redraw” cheap. The DOM makes it expensive: visually (flicker), semantically (listener re-bind), and test-wise (locator races). Any component ported from a reactive framework should preserve node identity across updates.

  5. Parallel workers + shared server = undefined behavior. Playwright defaults to autodetected workers, but a single-server E2E suite must serialize unless the setup creates per-worker isolation (separate server, separate DB, separate port). Encoding workers: 1 in the config is cheaper than the debugging spiral it prevents.
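The serialization from finding 5 comes down to a few config fields. The sketch below writes them as a plain object so it runs without Playwright installed; in e2e/playwright.config.ts the object would be the default export, typically wrapped in defineConfig from @playwright/test. The port comes from the run command in the Method section; the localhost host and the name serialConfig are assumptions.

```typescript
// Plain-object form of a Playwright config for a single shared server.
const serialConfig = {
  workers: 1, // one worker: tests never contend for the shared warband
  fullyParallel: false, // Playwright's default, stated explicitly: tests within a file run in order
  use: {
    baseURL: "http://localhost:7331", // port from the run command; host is assumed
  },
};
```

Raising workers above 1 is only safe once each worker gets its own server, database, and port.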

Next Steps

  • Extend dismissAllBoys into a fuller test-setup module that also seeds a mock MQI session so D2 and M4 exit the skip path.
  • Consider starting a per-suite server via Playwright’s webServer config, which would allow workers > 1 and cut round-trip time.
  • Revisit the 2026-04-17 release binary: either rebuild and promote it, or document that the release channel has moved to the desktop binary instead.

Related

  • [topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright](/topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright)
  • [topics/pitfalls/rusty-dakka-release-binary-drift](/topics/pitfalls/rusty-dakka-release-binary-drift)
  • [topics/pitfalls/boy-krump-does-not-unregister](/topics/pitfalls/boy-krump-does-not-unregister)
  • [projects/dakka/_index](/projects/dakka/_index)