fix: prevent nested skill invocations from orphaning skill-active-state#2025
Conversation
When a skill (e.g. omc-setup) invokes another skill (e.g. mcp-setup) via the Skill tool, writeSkillActiveState() overwrites the parent's skill-active-state.json with the child's state. When both skills complete, the child's active state persists with no completion signal, causing the stop hook to block up to max_reinforcements times with "[SKILL ACTIVE: mcp-setup] still executing" even though all work is done. Two-part fix: 1. Nesting guard in writeSkillActiveState(): if another skill is already active in the same session, skip writing. The parent's protection already covers the session. 2. Explicit cleanup in setup-progress.sh complete: delete all skill-active-state.json files under .omc/state/ when setup finishes, so the stop hook has no stale state to block on. Reproduced during omc-setup update from 4.9.1 to 4.9.3, where Phase 3 invokes mcp-setup inside omc-setup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Yeachan-Heo
left a comment
There was a problem hiding this comment.
Good direction — the nesting guard is the right idea. A few concerns:
1. Over-cleaning in setup-progress.sh
The completion cleanup deletes all skill-active-state.json files under .omc/state/, which could interfere with concurrent active sessions. Should scope to only the current skill's state file.
2. Non-atomic nesting guard
The read-check-write in writeSkillActiveState() has a TOCTOU race — two skills starting simultaneously could both pass the "no active skill" check. For a single-threaded Claude session this is unlikely but worth noting.
3. Test coverage gaps
- No regression test for the nested invocation scenario (parent + child skill)
- Existing tests may still expect the old overwrite behavior — need to verify/update those
- Need a test for the cleanup path in
setup-progress.sh complete
Suggestion
- Scope the cleanup to only the invoking skill's state file (or use skill name matching)
- Add a test that simulates: parent skill writes state → child skill attempts write → verify parent state preserved
- Update any existing tests that rely on unconditional overwrite
The core fix (nesting guard) is correct in concept, just needs tightening.
—
[repo owner's gaebal-gajae (clawdbot) 🦞]
Addresses review feedback from Yeachan-Heo#2025: 1. Scope cleanup in setup-progress.sh to current session only (uses $CLAUDE_SESSION_ID with fallback to broad cleanup) 2. Add nesting guard to TypeScript source (src/hooks/skill-state/index.ts) matching the scripts/pre-tool-enforcer.mjs fix 3. Document TOCTOU race condition (acceptable for single-threaded sessions) 4. Update/add tests: - "does not overwrite when a different skill is already active" - "allows re-invocation of the same skill" - "allows write when no prior state exists" All 56 tests pass (skill-state + stop-hook integration). Constraint: Claude Code sessions are single-threaded so TOCTOU is not a real risk Rejected: Atomic file locking | unnecessary complexity for single-threaded runtime Confidence: high Scope-risk: narrow Not-tested: Concurrent multi-session cleanup race on shared .omc/state/ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review Feedback AddressedAll 3 concerns from the review have been addressed: 1. Over-cleaning in setup-progress.sh → Scoped to current session
2. TOCTOU race → Documented
3. Test coverage → Added
Files changed
|
|
Working on additional improvements — will push an update shortly. |
The nesting guard in writeSkillActiveState() correctly prevents child skills from overwriting parent state, but the PostToolUse handler unconditionally calls clearSkillActiveState() on any Skill completion, deleting the parent's state when the child finishes. Three additional fixes: - Sync SKILL_PROTECTION_MAP in pre-tool-enforcer.mjs with canonical TS (ralplan: none, add missing entries, remove extras) - Add _meta envelope to MJS state writes (matching writeModeState) - Validate session ID in setup-progress.sh before path interpolation - Scope fallback find to stale files only (-mmin +30) Tests: 4 added (canonical nesting, triple nesting, reinforcement reset, stop-hook integration), 1 redundant removed. 50/50 pass. Constraint: PostToolUse fires for child before parent returns Rejected: Stack-based approach | over-engineered for current single nesting Rejected: Amend prior commit | preserves review history Confidence: high Scope-risk: narrow Not-tested: Non-OMC plugin with colon-namespaced skill name Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Additional Improvements (d7a61e4, 80bacb2)Review uncovered several issues beyond the original nesting guard fix. All addressed: CRITICAL fixes
HIGH fixes
MEDIUM fixes
Tests
Key architectural insightThe nesting guard prevents overwrite (write path) but the original fix missed the clear path: Test plan verificationThe e2e lifecycle test (80bacb2) simulates the bridge.ts PostToolUse clear guard logic and validates: parent state survives child completion → stop hook still blocks → parent completion clears state → stop hook no longer blocks. All 3 test plan items now checked. 🤖 Generated with Claude Code |
|
@Yeachan-Heo Ready for re-review. All previous feedback addressed, plus additional fixes in d7a61e4 + 80bacb2:
See previous comment for full details. |
Simulates the full omc-setup → mcp-setup lifecycle including the PostToolUse nesting-aware clear logic from bridge.ts. Verifies: 1. Child completion preserves parent state (clear skipped) 2. Stop hook still blocks while parent is active 3. Parent completion clears state (clear executed) 4. Stop hook no longer blocks after parent completes This directly validates the PR test plan item: "Verify stop hook no longer blocks after omc-setup completes" 51/51 pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Reviewed and merged to dev. Clean fix — nesting guard in One note for future PRs: the diff includes a large amount of Thanks for the contribution! 🦞 — |
Summary
writeSkillActiveState(): if another skill is already active in the same session, skip writing — the parent's protection already covers the sessionskill-active-state.jsoncleanup insetup-progress.sh completeto ensure no stale skill state survives after setup finishesralplan)_metaenvelope to MJS state writes, session ID validation, stale-only cleanupFixes #2024
Root Cause
During
omc-setup, Phase 3 invokesmcp-setupvia the Skill tool.writeSkillActiveState()unconditionally overwrites the parent (omc-setup) state with the child (mcp-setup) state. When both skills complete, the child'sactive: truestate persists with no completion signal, causing the stop hook to block 5 times with "[SKILL ACTIVE: mcp-setup] still executing".Additionally,
processPostToolUseunconditionally calledclearSkillActiveState()on any Skill completion, meaning even with the nesting guard protecting writes, the child's PostToolUse would delete the parent's state.Reproduced during omc-setup update from 4.9.1 to 4.9.3.
Changes
src/hooks/bridge.tssrc/hooks/skill-state/index.tswriteSkillActiveState(): skip write if a different skill is already activescripts/pre-tool-enforcer.mjsSKILL_PROTECTION_MAPwith canonical TS version_metaenvelope to state writessafeSessionIdforsession_idfieldscripts/setup-progress.shfindscoped to stale files only (-mmin +30)src/hooks/skill-state/__tests__/skill-state.test.tsTest plan
setup-progress.sh completeclears skill-active-state.json from session dirs🤖 Generated with Claude Code