…t loop handler, OAuth non-blocking

Four fixes for MCP server stability issues reported by community member (terminal lockup, zombie processes, escape sequence pollution, startup hang):

1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously, a hung MCP server could block the process_loop thread indefinitely, freezing the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work).

2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned by stdio_client via before/after snapshots of /proc children. On shutdown, _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from accumulating across sessions.

3. MCP event loop exception handler (mcp_tool.py): Installs _mcp_loop_exception_handler on the MCP background event loop, the same pattern as the existing _suppress_closed_loop_errors on prompt_toolkit's loop. Suppresses benign 'Event loop is closed' RuntimeError from httpx transport __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen).

4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive() TTY detection. In non-interactive environments, build_oauth_auth() still returns a provider (cached tokens + refresh work), but the callback handler raises immediately instead of blocking the MCP event loop for 120s. Re-raises OAuth setup failures in _run_http so failed servers are reported cleanly without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465 (heathley).

Closes #2537, closes #4462
Related: #4128, #3436
This was referenced Apr 3, 2026
jooray added a commit to jooray/hermes-agent that referenced this pull request on Apr 3, 2026
* upstream/main: (38 commits)
  fix(memory): Fix ByteRover plugin - run brv query synchronously before LLM call
  chore: release v0.7.0 (2026.4.3) (NousResearch#4812)
  fix: route memory provider tools in sequential execution path (NousResearch#4803)
  fix: persist API server sessions to shared SessionDB (state.db) (NousResearch#4802)
  fix(discord): register /approve and /deny slash commands, wire up button-based approval UI (NousResearch#4800)
  fix: respect per-platform disabled skills in Telegram menu and gateway dispatch (NousResearch#4799)
  fix(gateway): route /approve and /deny through running-agent guard (NousResearch#4798)
  docs: add community FAQ entries - multi-model workflows, WhatsApp binding, verbose control, skills config, thread sessions, migration, install troubleshooting (NousResearch#4797)
  fix: handle None mcp_servers in _get_platform_tools()
  fix(mcp): stability fix pack - reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (NousResearch#4757)
  fix: prevent compression death spiral from API disconnects (NousResearch#2153) (NousResearch#4750)
  fix: handle Anthropic Sonnet long-context tier 429 by reducing to 200k (NousResearch#4747)
  fix: correct qwen3.6-plus model slug
  fix: handle Anthropic long-context tier 429 by reducing to 200k
  docs(acp): fix zed config
  fix: use get_hermes_home(), consolidate git_cmd, update tests
  Add fork detection and upstream sync to hermes update
  fix(update): handle conflicted git index during hermes update (NousResearch#4735)
  fix: remove redundant restart message from update launchd path
  fix(update): avoid launchd restart race on macOS
  ...
Summary
Four fixes for MCP server stability issues reported by community member the77helios: terminal lockup with the Obsidian MCP server, accumulating zombie processes, escape sequence pollution, and a startup hang.
What this PR does
Fix 1: MCP reload timeout guard (cli.py)
_check_config_mcp_changes now runs _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously, a hung MCP server could block the process_loop thread indefinitely, freezing the entire TUI: the user can type but nothing happens, and only Ctrl+D/Ctrl+\ work.
Fix 2: MCP stdio subprocess PID tracking (mcp_tool.py)
Tracks child PIDs spawned by stdio_client via before/after snapshots of /proc children. On shutdown, _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from accumulating across sessions.
Fix 3: MCP event loop exception handler (mcp_tool.py)
Installs _mcp_loop_exception_handler on the MCP background event loop, the same pattern as the existing _suppress_closed_loop_errors on prompt_toolkit's loop. Suppresses the benign 'Event loop is closed' RuntimeError from httpx transport __del__ during MCP shutdown.
Fix 4: MCP OAuth non-blocking (mcp_oauth.py + mcp_tool.py)
Replaces the blocking input() call in _wait_for_callback with an OAuthNonInteractiveError raise. Adds _is_interactive() TTY detection. In non-interactive environments, build_oauth_auth() still returns a provider (cached tokens + refresh work), but the callback handler raises immediately instead of blocking the MCP event loop for 120s. Re-raises OAuth setup failures in _run_http so failed servers are reported cleanly without blocking others.
Attribution
Fix 3 is salvaged from PR #2538 (acsezen). Fix 4 is salvaged from PRs #4521 (voidborne-d) and #4465 (heathley).
Test plan
test_mcp_stability.py and test_mcp_oauth.py: all passing
Issues
Closes #2537, closes #4462
Related: #4128, #3436
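Illustration for Fix 1 (reload timeout guard). This is a minimal sketch of the general pattern described above, not the code in cli.py: the helper name run_with_timeout and the two stand-in reload functions are hypothetical. It shows how running a call in a daemon thread and waiting on it with a timeout keeps the caller from blocking indefinitely.

```python
import threading
import time


def run_with_timeout(func, timeout_s=30.0):
    """Run func in a daemon thread; return True if it finished within timeout_s.

    Hypothetical helper for illustration. A daemon thread is used so that a
    call that never returns cannot keep the interpreter alive at exit.
    """
    done = threading.Event()

    def worker():
        try:
            func()
        finally:
            done.set()

    thread = threading.Thread(target=worker, name="mcp-reload", daemon=True)
    thread.start()
    # Block the caller for at most timeout_s instead of indefinitely.
    return done.wait(timeout_s)


def fast_reload():
    # Stand-in for a reload that completes promptly.
    time.sleep(0.01)


def hung_reload():
    # Stand-in for a reload blocked on an unresponsive server.
    time.sleep(2)


if __name__ == "__main__":
    print(run_with_timeout(fast_reload, timeout_s=1.0))
    print(run_with_timeout(hung_reload, timeout_s=0.1))
```

Note that the timed-out thread is abandoned rather than cancelled; Python has no safe way to kill a thread, so the guard only protects the caller.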
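Illustration for Fix 2 (stdio subprocess PID tracking). This is a rough, Linux-only sketch of the before/after snapshot idea, assuming children are discovered by reading the parent PID field of /proc/<pid>/stat; the PR text does not say which /proc file mcp_tool.py actually reads. The names child_pids, spawn_and_track, and force_kill are hypothetical.

```python
import os
import signal
import subprocess
import sys


def child_pids(parent_pid):
    """Return the set of direct child PIDs of parent_pid by scanning /proc."""
    children = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", "rb") as fh:
                stat = fh.read()
        except OSError:
            continue  # the process exited while we were scanning
        # The command name may contain spaces or parentheses, so split after
        # the last ')'. The fields that follow are: state, ppid, ...
        fields = stat[stat.rfind(b")") + 2:].split()
        if len(fields) > 1 and int(fields[1]) == parent_pid:
            children.add(int(entry))
    return children


def spawn_and_track(spawn):
    """Call spawn() and return (result, PIDs that newly appeared as children)."""
    me = os.getpid()
    before = child_pids(me)
    result = spawn()
    after = child_pids(me)
    return result, after - before


def force_kill(pids):
    """Send SIGKILL to each tracked PID that still exists; return those signalled."""
    killed = []
    for pid in sorted(pids):
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            pass  # already gone, nothing to clean up
    return killed


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
    proc, tracked = spawn_and_track(lambda: subprocess.Popen(cmd))
    print(proc.pid in tracked)
    force_kill(tracked)
    proc.wait()  # reap the child so it does not linger as a zombie
```

A caveat with any PID-based cleanup: PIDs can be reused after a process exits, so a real implementation would want to kill tracked PIDs promptly at shutdown rather than hold them for long.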
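Illustration for Fix 3 (event loop exception handler). This is a minimal sketch of an asyncio exception handler that drops one known-benign error and defers everything else to the default handler. The function names are hypothetical, and the matching rule (a RuntimeError whose message contains 'Event loop is closed') is an assumption based on the description above, not a copy of _mcp_loop_exception_handler.

```python
import asyncio


def is_benign_shutdown_error(context):
    """True for the 'Event loop is closed' RuntimeError seen during teardown."""
    exc = context.get("exception")
    return isinstance(exc, RuntimeError) and "Event loop is closed" in str(exc)


def loop_exception_handler(loop, context):
    """Drop the benign shutdown error; defer everything else to asyncio."""
    if is_benign_shutdown_error(context):
        return
    loop.default_exception_handler(context)


if __name__ == "__main__":
    loop = asyncio.new_event_loop()
    loop.set_exception_handler(loop_exception_handler)
    # This context is swallowed by the handler, so nothing is logged for it.
    loop.call_exception_handler(
        {"message": "teardown", "exception": RuntimeError("Event loop is closed")}
    )
    loop.close()
```

Keeping the match narrow (exception type plus message) means unrelated RuntimeErrors still reach the default handler and get logged.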
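Illustration for Fix 4 (OAuth non-blocking). This is one plausible shape for fail-fast behaviour when no terminal is attached, not the code in mcp_oauth.py. The exception name comes from the description above; is_interactive, wait_for_callback, and the receive callable are hypothetical stand-ins, and checking sys.stdin.isatty() is an assumption about how TTY detection is done.

```python
import io
import sys


class OAuthNonInteractiveError(RuntimeError):
    """Raised when an OAuth flow needs a user but no terminal is attached."""


def is_interactive(stream=None):
    """Return True if stream (default: sys.stdin) is attached to a TTY."""
    stream = sys.stdin if stream is None else stream
    try:
        return bool(stream.isatty())
    except (AttributeError, ValueError):
        # stdin may be None, replaced by a non-file object, or already closed.
        return False


def wait_for_callback(receive, stream=None):
    """Fail fast when no user can complete the flow; otherwise call receive().

    `receive` is a hypothetical callable standing in for whatever obtains the
    authorization callback in an interactive session.
    """
    if not is_interactive(stream):
        raise OAuthNonInteractiveError(
            "OAuth authorization requires an interactive terminal"
        )
    return receive()


if __name__ == "__main__":
    try:
        # io.StringIO is not a TTY, so this raises immediately.
        wait_for_callback(lambda: "auth-code", stream=io.StringIO())
    except OAuthNonInteractiveError as err:
        print(f"skipped: {err}")
```

Raising a dedicated exception type lets the caller report that one server as failed and continue with the others, which matches the re-raise behaviour described for _run_http.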