Experiment Persistence dakka

Playwright specs covering spawn/dismiss, avatars, MQI, and responsiveness can be written against the Mork dashboard once pre-existing warband state is isolated per test.


April 22, 2026

dakka, e2e, playwright, mork

Hypothesis

Playwright specs covering spawn/dismiss, avatars, MQI, and responsiveness can be written against the Mork dashboard once pre-existing warband state is isolated per test.

Result: confirmed

Key Findings

16/16 green: 14 pass and 2 skip conditionally. Two product bugs found and fixed: the destroy-rebuild in MascotBar races Playwright locators, and avatar canvases never mount for boys spawned before the avatar manifest has been fetched. Three pitfalls documented: release-binary drift, boy:krump vs boy:dismiss semantics, and destroy-rebuild racing Playwright.

Hypothesis

Playwright can cover the Mork dashboard’s five critical surfaces (spawn/dismiss, avatar rendering, MQI hover, MQI accuracy, responsiveness) once pre-existing warband state is isolated per test via a WebSocket-based beforeEach. The spec work is primarily a documentation-of-expected-behavior task; product bugs should be rare.

Method

Inventory the five domains (S1/S4/S5/S6, A1/A2, M1/M3/M4/M5, D2, R1/R2/R3/R4/R5).
Write e2e/helpers.ts::dismissAllBoys(baseURL): opens /ws, reads warband:init, sends boy:dismiss for each live boy, waits for matching boy:removed acks, filters binary PTY frames.
Wire dismissAllBoys as beforeEach in every spec.
Run the specs against a backgrounded server, ./target/debug/rusty-dakka spawn --port 7331 &, because cargo run blocks the caller.
Iterate on real failures: fix spec bugs against real product behavior, fix product bugs when the spec reveals them.
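
The dismissAllBoys step above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's e2e/helpers.ts: only the event names warband:init, boy:dismiss, and boy:removed come from this write-up, while the JSON field names (type, boys, id) are guesses. The hypothetical dismissAllBoysOn also takes an already-constructed socket rather than a baseURL, so it can be driven by a fake as well as by a real /ws connection.

```typescript
// Minimal socket surface: a real WebSocket or a test fake both fit.
interface SocketLike {
  send(data: string): void;
  close(): void;
  onmessage: ((ev: { data: unknown }) => void) | null;
}

// ASSUMPTION: these field names are illustrative, not the real wire format.
interface WireMsg {
  type?: string;
  id?: string;
  boys?: { id: string }[];
}

// Resolves with the number of boys dismissed once every boy:removed ack arrives.
function dismissAllBoysOn(ws: SocketLike, timeoutMs = 5000): Promise<number> {
  return new Promise<number>((resolve, reject) => {
    const pending = new Set<string>();
    let dismissed = 0;
    let sawInit = false;
    let done = false;

    const timer = setTimeout(() => {
      done = true;
      ws.close();
      reject(new Error(`timed out with ${pending.size} boys still registered`));
    }, timeoutMs);

    const finishIfDone = () => {
      if (done || pending.size > 0) return;
      done = true;
      clearTimeout(timer);
      ws.close();
      resolve(dismissed);
    };

    ws.onmessage = (ev) => {
      if (typeof ev.data !== "string") return; // binary PTY frame: skip it
      let msg: WireMsg | null;
      try {
        msg = JSON.parse(ev.data);
      } catch {
        return; // text frame that is not JSON
      }
      if (msg === null || typeof msg !== "object") return;

      if (msg.type === "warband:init" && !sawInit) {
        sawInit = true;
        // Register every live boy before sending, so an early ack cannot
        // make the pending set look empty while boys are still unsent.
        for (const boy of msg.boys ?? []) pending.add(boy.id);
        pending.forEach((id) => ws.send(JSON.stringify({ type: "boy:dismiss", id })));
        finishIfDone(); // covers the already-empty warband
      } else if (msg.type === "boy:removed" && msg.id !== undefined && pending.delete(msg.id)) {
        dismissed += 1;
        finishIfDone();
      }
    };
  });
}
```

Waiting for the acks rather than returning right after the sends is the point of the design: a spec that starts while a boy is still registered sees the same stale state the helper was meant to clear.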

Results

Progression

Iteration	State	Why
Start	0 / 16 pass	`#btn-spawn-a` disabled because 2 pre-existing boys put multiclaude mode into the `boysA.length >= 2` disabled state. No tests had a way to clear state.
+ `dismissAllBoys` helper (first cut)	Still 0 / 16	Helper sent `boy:krump` expecting `boy:removed`. Krump doesn’t unregister. See [topics/pitfalls/boy-krump-does-not-unregister](/topics/pitfalls/boy-krump-does-not-unregister).
+ `boy:dismiss` fix	5 / 16	Setup works. Tests still fail on selector bugs and spec-vs-reality mismatches.
+ selector fixes (`.agent-card-dismiss`, `state: 'attached'`, 6s timeouts)	10 / 16	Real failures start surfacing: `.agent-card` “resolved to visible” then timeouts, `.mqi-indicator` detached during hover.
+ M4 JSON-parse-before-skip guard	11 / 16	Empty-body 404 from stale release binary → SyntaxError → test cancel.
+ switch `target/release` (2026-04-17) → `target/debug` (2026-04-23)	11 / 16 (M3/M5 flicker)	Release binary predated the Task 6 MQI fix; debug matched source. See [topics/pitfalls/rusty-dakka-release-binary-drift](/topics/pitfalls/rusty-dakka-release-binary-drift).
+ MascotBar reconcile refactor	10 / 16	The reconciler stopped re-calling `renderIntoIconBox` on every update, so boys that spawned before the avatar manifest was fetched never got a canvas. A1/R4 regressed as a result.
+ `AvatarManager.init().then(updateMascotBar)` re-paint	14 / 16 (+ 2 skip)	All real failures cleared. Remaining 2 are conditional skips (D2, M4 both guard `if (mqi == null) test.skip()` against bloomnet.db which has no session rows).
+ `workers: 1` in config	16 / 16 (14 + 2 skip)	Previously-flaky failures at `workers > 1` were contention over the single dakka server’s warband; serializing tests eliminated the race.

Product fixes shipped

crates/ui/static/js/sidebar.js: MascotBar.update() now reconciles by data-boy-id instead of destroying both clusters on every snapshot. New _upsertAgentCard + _applyCardVisuals methods mutate cards in place; avatar canvas stays mounted across state transitions. See [topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright](/topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright).
crates/ui/static/js/app.js: after AvatarManager.init() resolves, call updateMascotBar() if any boys are already registered. This is the retry path for the manifest race.
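
The reconcile-by-id idea behind the MascotBar fix can be illustrated in isolation. The sketch below is not the code in sidebar.js: reconcileCards, CardHost, and Boy are hypothetical names, and the DOM is hidden behind callbacks so the example runs without a browser. What it shows is that a card created for a given id is reused on later snapshots, which is what keeps a mounted canvas (and any Playwright locator pointing at it) stable.

```typescript
// Hypothetical snapshot entry; the real payload shape is not shown in this note.
interface Boy {
  id: string;
  state: string;
}

// Callbacks standing in for DOM work (createElement, class toggles, remove()).
interface CardHost<C> {
  create(boy: Boy): C;
  apply(card: C, boy: Boy): void;
  remove(card: C): void;
}

// Mutates `cards` in place so existing cards keep their identity across updates.
function reconcileCards<C>(cards: Map<string, C>, snapshot: Boy[], host: CardHost<C>): void {
  const live = new Set(snapshot.map((boy) => boy.id));

  // 1. Drop cards whose boy is gone from the snapshot.
  cards.forEach((card, id) => {
    if (!live.has(id)) {
      host.remove(card);
      cards.delete(id);
    }
  });

  // 2. Create cards for new boys, then refresh visuals on every card.
  snapshot.forEach((boy) => {
    let card = cards.get(boy.id);
    if (card === undefined) {
      card = host.create(boy);
      cards.set(boy.id, card);
    }
    host.apply(card, boy);
  });
}
```

In a DOM setting, create would build the card element once (including its avatar canvas) and apply would only touch classes and text, roughly the split the write-up describes between _upsertAgentCard and _applyCardVisuals.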

Spec-only fixes

.agent-card-dismiss is the real class (spec drafts used .dismiss-btn|.krump-btn|[aria-label="Dismiss"], none existed).
M4: resp.text() + JSON.parse try/catch before skip check.
A1: .count() instead of .textContent() (the latter’s 30s default action timeout burned the entire test budget when the element was absent).
R4: assertion fix: each boy owns two canvases (mascot + terminal), so dismissing the only boy drops the count to zero, not to before - 1.
workers: 1 in e2e/playwright.config.ts.
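
The M4 guard above amounts to reading the body as text and parsing it defensively before deciding whether to skip. A minimal sketch, assuming a hypothetical helper named parseJsonOrNull and a payload with an mqi field (the real response shape is not given in this note):

```typescript
// Returns the parsed body, or null when the body is empty or not valid JSON.
// An empty-body 404 therefore yields null instead of throwing a SyntaxError.
function parseJsonOrNull<T>(body: string): T | null {
  if (body.trim() === "") return null;
  try {
    return JSON.parse(body) as T;
  } catch {
    return null;
  }
}

// Intended use inside a spec (sketch; `resp` is a Playwright APIResponse):
//   const payload = parseJsonOrNull<{ mqi: number | null }>(await resp.text());
//   if (payload?.mqi == null) test.skip();
```

The same reasoning applies to the A1 fix: asking a question that cannot hang (a count of matching elements, or a parse that cannot throw) before acting keeps an absent element or empty body from consuming the whole test budget.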

Findings

Write once, run iteratively. The 2 h session split roughly as follows: 20 min diagnosing why the button was disabled (initially misread as “cargo run blocks”), 40 min on helper + selector iteration, 40 min on the MascotBar refactor + retry kick, and 20 min on documentation. The diagnose-before-iterate discipline is what kept the scope bounded.
Pre-existing server state is a first-class test-isolation concern. Dakka’s server keeps boys across client reconnects, so a spec that calls page.goto('/') and expects an empty dashboard is making an undocumented assumption. The cheap fix is a beforeEach WS cleanup; the cleaner path is to stand up a fresh server per run, but that costs ~3s per test suite in release-binary startup.
Binary drift is invisible until it isn’t. The 2026-04-17 release binary had served happily for six days. Nothing about its behavior was wrong in isolation; it was wrong only relative to the spec written against the 2026-04-23 source contract. Always compare the mtime of target/release/<bin> against the mtime of the handler source before trusting a “known good” server.
Destroy-rebuild is a React-to-vanilla-JS porting hazard. React’s reconciler makes “wipe the cluster, redraw” cheap. The DOM makes it expensive: visually (flicker), semantically (listener re-bind), and test-wise (locator races). Any component ported from a reactive framework should preserve node identity across updates.
Parallel workers + shared server = undefined behavior. Playwright defaults to multiple workers (half the logical CPU cores), but a single-server E2E suite must serialize unless the setup creates per-worker isolation (separate server, separate DB, separate port). Encoding workers: 1 in the config is cheaper than the debugging spiral it prevents.
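
For reference, the serialization described in the last finding is a one-line setting. The fragment below is a sketch of what e2e/playwright.config.ts might contain, not a copy of the real file; the baseURL host is an assumption, with the port taken from the server command in the Method section.

```typescript
// e2e/playwright.config.ts (sketch)
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // One worker: every spec shares the single dakka server's warband,
  // so tests must not run concurrently.
  workers: 1,
  use: {
    // ASSUMPTION: host is illustrative; port 7331 matches the spawn command.
    baseURL: "http://localhost:7331",
  },
});
```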

Next Steps

Extend dismissAllBoys into a fuller test-setup module that also seeds a mock MQI session so D2 and M4 exit the skip path.
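Consider starting the server via Playwright’s webServer config so that each run gets a fresh instance. Note that webServer alone still shares one server across workers; allowing workers > 1, and the shorter round-trip time that comes with it, would also need per-worker isolation (separate server, DB, and port).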
Revisit the 2026-04-17 release binary: either rebuild and promote it, or document that the release channel has moved to the desktop binary instead.
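
Whichever way the release binary question is resolved, the mtime comparison recommended in the binary-drift finding is cheap to automate. A minimal sketch, assuming a hypothetical isBinaryStale helper; the caller supplies the binary path and the list of source files to compare against, since neither list is spelled out in this note.

```typescript
import { statSync } from "node:fs";

// Reads a file's modification time in milliseconds; injectable for testing.
type MtimeReader = (path: string) => number;
const fsMtime: MtimeReader = (path) => statSync(path).mtimeMs;

// True when any source file was modified after the binary was built.
function isBinaryStale(
  binary: string,
  sources: string[],
  mtimeMs: MtimeReader = fsMtime,
): boolean {
  const builtAt = mtimeMs(binary);
  return sources.some((src) => mtimeMs(src) > builtAt);
}
```

A check like this could run in a global setup step and fail fast with a "rebuild first" message. It only detects drift by timestamp, so a binary rebuilt from an old checkout would still pass.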

[topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright](/topics/pitfalls/mascot-bar-destroy-rebuild-races-playwright)
[topics/pitfalls/rusty-dakka-release-binary-drift](/topics/pitfalls/rusty-dakka-release-binary-drift)
[topics/pitfalls/boy-krump-does-not-unregister](/topics/pitfalls/boy-krump-does-not-unregister)
[projects/dakka/_index](/projects/dakka/_index)