Self-healing Claude Code hook install: symlink + wrapper preflight + binary fail-open
Before: ~8-12 stderr lines per user turn from 17 hook wrappers defaulting to bare `peon`/`stella` names that did not resolve on PATH. The errors were non-blocking, but they fired on every tool call and drowned real signal (including a `master -> master` git-push confirmation). After: 0 stderr lines on sample Bash/Write/Edit invocations. The stripped-PATH smoke test (`env -i PATH=/usr/bin:/bin bash …`) returns exit 0 with zero stderr. Future `cargo clean` + rebuild cycles auto-heal: wrappers silently skip while the binary is missing and resume normal operation once the next release build lands.
Context
The Stella + Peon Rust migration (see breakthroughs/2026-04-04-stella-peon-rust-migration) left Claude Code with 17 hook wrappers invoking the peon/stella binaries on every tool use. The wrappers assumed operators would export PEON_BIN/STELLA_BIN per session. Nobody did, so every pre- and post-tool-use hook emitted command not found. The errors were non-blocking, but high-volume enough to push real signal (git-push confirmations, tool successes) off-screen.
A fail-open fix had just shipped in the peon Rust binary (commit 9b9a96d). It didn’t help: the shell wrapper exec-failed before reaching any Rust code.
What Changed
Three layers of robustness, each required for genuine self-healing:
1. Install-time: stable PATH entry. Symlinked target/release/peon and target/release/stella into ~/.local/bin/ (first on PATH, shadows any other lookup). Symlink (not copy) means cargo build --release auto-propagates without re-running installers.
2. Wrapper preflight: refuse to exec a missing binary. Every wrapper now carries `command -v "$BIN" >/dev/null 2>&1 || exit 0` immediately after the binary-name assignment. Missing binary → exit 0 silently → Claude Code sees success.
3. Binary fail-open: tolerate broken inputs. Already shipped in 9b9a96d. Now actually reachable because the shell layer no longer dies first.
All three live in source-of-truth templates at ~/Documents/rusty-bloomnet/hooks/*.sh.template; scripts/install-hooks.sh copies verbatim into ~/.claude/hooks/ so reinstalls preserve the hardening.
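The wrapper preflight from layer 2 can be sketched roughly as follows. This is a minimal illustration rather than the contents of the real templates: the `pre-tool-use.sh` file name, the `PEON_BIN` override with a `peon` default, and the stub binary are all assumptions made for the demo. The script writes a wrapper into a temporary directory, then runs it once with the binary missing and once with a stub in place.

```shell
#!/usr/bin/env bash
# Sketch only: file names, the PEON_BIN override, and the stub are hypothetical.
set -u

workdir="$(mktemp -d)"
wrapper="$workdir/pre-tool-use.sh"

# A wrapper with the preflight line placed right after the binary-name assignment.
cat > "$wrapper" <<'EOF'
#!/usr/bin/env bash
BIN="${PEON_BIN:-peon}"
# Preflight: if the binary does not resolve, exit 0 silently (fail open).
command -v "$BIN" >/dev/null 2>&1 || exit 0
exec "$BIN" "$@"
EOF
chmod +x "$wrapper"

# Case 1: binary missing. Capture stderr only; stdout is discarded.
missing_err="$(PEON_BIN="$workdir/no-such-binary" "$wrapper" pre-tool-use 2>&1 >/dev/null)"
missing_status=$?

# Case 2: binary present (a stub standing in for the real peon).
stub="$workdir/peon"
printf '#!/bin/sh\necho "stub peon: $*"\n' > "$stub"
chmod +x "$stub"
present_out="$(PEON_BIN="$stub" "$wrapper" pre-tool-use)"
present_status=$?

echo "missing: status=$missing_status stderr='$missing_err'"
echo "present: status=$present_status stdout='$present_out'"
rm -rf "$workdir"
```

If the preflight behaves as described, both cases should report status 0, and the missing-binary case should leave stderr empty. Without the `command -v` line, the first case would instead surface an exec failure from the shell.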
Why it works
- Self-healing across `cargo clean`. Symlink target vanishes → preflight catches it → wrappers exit 0 silently. Next `cargo build --release` restores the binary → symlink resolves again → wrappers resume normal operation. No manual reinstall needed.
- Fail-open is a layered property, not a single knob. Shell exec can fail before the binary’s fail-open runs. Covering one layer leaves the others exposed. Layered fail-open is strictly additive; no layer makes the others redundant.
- Placeholder sentinels are an anti-pattern. The prior `__STELLA_BIN__` default encoded a substitution contract the installer never honored. Replacing it with `stella` (a real PATH name) plus preflight is simpler: one fewer invariant to maintain.
- Symlink, not copy. Copy forces a manual install step per rebuild. Symlink to `target/release/` means the next release build is live the moment it links.
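The clean-and-rebuild cycle can be simulated without cargo. The sketch below is an assumption-laden stand-in: a temporary `target/release` directory plays the build output, a temporary `local-bin` directory plays `~/.local/bin`, and `build_stub` is a hypothetical replacement for `cargo build --release` that writes a tiny shell script instead of a real binary.

```shell
#!/usr/bin/env bash
# Sketch only: the sandbox layout and build_stub are hypothetical stand-ins.
set -u

sandbox="$(mktemp -d)"
release_dir="$sandbox/target/release"   # stands in for the cargo output dir
bin_dir="$sandbox/local-bin"            # stands in for ~/.local/bin
mkdir -p "$bin_dir"

build_stub() {
  # Stands in for `cargo build --release`; $1 is a fake version number.
  mkdir -p "$release_dir"
  printf '#!/bin/sh\necho "stella v%s"\n' "$1" > "$release_dir/stella"
  chmod +x "$release_dir/stella"
}

resolves() {
  # Same check the wrapper preflight uses, run in a subshell whose PATH
  # contains only the sandboxed bin directory.
  ( PATH="$bin_dir"; command -v stella >/dev/null 2>&1 ) && echo found || echo missing
}

build_stub 1
ln -s "$release_dir/stella" "$bin_dir/stella"   # one-time install step

state_before="$(resolves)"
rm -rf "$sandbox/target"                        # simulates `cargo clean`
state_cleaned="$(resolves)"                     # symlink now dangles
build_stub 2                                    # simulates the next release build
state_rebuilt="$(resolves)"
version_after="$("$bin_dir/stella")"            # same symlink, new build

echo "before=$state_before cleaned=$state_cleaned rebuilt=$state_rebuilt ($version_after)"
rm -rf "$sandbox"
```

The symlink is created once and never touched again. While it dangles, `command -v` treats the name as missing, which is what lets the preflight skip silently; after the rebuild the same link resolves to the new file.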
Impact
- 0 hook error blocks on representative tool calls (from ~8-12 stderr lines per user turn).
- Stripped-PATH smoke test passes: binaries unavailable → wrappers exit 0, no stderr.
- Rust-level fail-open fix (commit 9b9a96d) is now actually exercised by hooks.
- Future contributors who `cargo clean` or rebuild only one binary don’t reintroduce noise.
- Re-running `scripts/install-hooks.sh` after a template edit is idempotent: preflight lands everywhere, jq-merge deduplicates settings entries.
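A stripped-PATH smoke test along these lines can be sketched as below. The two wrapper files, the `STELLA_BIN` override, and the `smoke` helper are hypothetical and exist only for the demo. The sketch also assumes that no `stella` executable lives in `/usr/bin` or `/bin`, and that `bash` does. It contrasts a guarded wrapper with an unguarded one so that a regression would be visible.

```shell
#!/usr/bin/env bash
# Sketch only: wrapper names and the smoke helper are hypothetical.
# Assumes no `stella` executable exists in /usr/bin or /bin.
set -u

workdir="$(mktemp -d)"

# Guarded wrapper: preflight before exec.
cat > "$workdir/guarded.sh" <<'EOF'
#!/usr/bin/env bash
BIN="${STELLA_BIN:-stella}"
command -v "$BIN" >/dev/null 2>&1 || exit 0
exec "$BIN" "$@"
EOF

# Unguarded wrapper: the pre-fix shape, for contrast.
cat > "$workdir/unguarded.sh" <<'EOF'
#!/usr/bin/env bash
BIN="${STELLA_BIN:-stella}"
"$BIN" "$@"
EOF

smoke() {
  # Runs one wrapper under an empty environment with a minimal PATH and
  # prints "<exit status> <stderr length in characters>".
  local err status
  err="$(env -i PATH=/usr/bin:/bin bash "$1" 2>&1 >/dev/null)"
  status=$?
  echo "$status ${#err}"
}

guarded_result="$(smoke "$workdir/guarded.sh")"
unguarded_result="$(smoke "$workdir/unguarded.sh")"

echo "guarded:   $guarded_result"
echo "unguarded: $unguarded_result"
rm -rf "$workdir"
```

Under those assumptions the guarded wrapper should report status 0 with zero stderr, while the unguarded one should report status 127 with a non-empty `command not found` message.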
Applies to
- Any shell wrapper that invokes a companion binary (Claude Code hooks, cron jobs, systemd units, CI scripts).
- Any toolchain where iterative rebuilds should reach deployment without a manual install step.
- Any layered system where fail-open is a design goal: check each layer independently, never assume one layer’s fail-open covers another’s exec failure.
Anti-pattern
- Single-layer fail-open (“the binary handles bad input, so the wrapper doesn’t need to check anything”).
- Placeholder sentinels that rely on an unstated `export VAR=...` contract from the operator.
- Copying binaries to `~/.local/bin` instead of symlinking: forces a manual install step per rebuild, and the gap between build and install is a failure window.
- Conflating “hook errored” with “tool blocked.” Non-blocking errors are a silent tax that drowns real signal.
Source
Debug session on 2026-04-20 after every tool call in a rusty-bloomnet docs push surfaced peon: command not found. Root-cause trace revealed shell wrapper → missing PATH entry → bash exec failure → non-blocking but visible error → binary fail-open never reached. Fix landed in templates + installer run + binary symlink, verified with stripped-PATH test.
Related
- topics/pitfalls/hook-wrapper-needs-binary-preflight: the pitfall framing of the same incident.
- topics/pitfalls/hook-paths-must-be-absolute: earlier hook-installer pitfall (same project).
- breakthroughs/2026-04-04-stella-peon-rust-migration: the migration that introduced these wrappers.
- breakthroughs/2026-04-20-layered-hook-policy: complementary layering at the policy level.
- projects/bloomnet/_index