fix: increase API timeout default from 900s to 1800s for slow-thinking models #3431

Merged
teknium1 merged 1 commit into main from hermes/hermes-86f614ec
Mar 27, 2026

Conversation

@teknium1 (Contributor) commented Mar 27, 2026

Problem

Models like GLM-5/5.1 can think for 15+ minutes. The previous HERMES_API_TIMEOUT default of 900s (15 min) killed legitimate requests mid-thinking.

Fix

Raise HERMES_API_TIMEOUT default from 900s to 1800s (30 min) in both places that read the env var:

  • _build_api_kwargs() — non-streaming total timeout
  • _call_chat_completions() — streaming connection write timeout

Still configurable via HERMES_API_TIMEOUT env var.
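
A minimal sketch of how a default like this can be resolved from the environment. Only the `HERMES_API_TIMEOUT` name and the 1800s default come from this PR; the helper `resolve_api_timeout`, the constant name, and the fallback on invalid values are assumptions for illustration, not the actual Hermes code.

```python
import os
from typing import Mapping, Optional

# New default described in this PR: 1800 seconds (30 minutes).
DEFAULT_API_TIMEOUT_S = 1800.0


def resolve_api_timeout(env: Optional[Mapping[str, str]] = None) -> float:
    """Return the API timeout in seconds (hypothetical helper).

    Reads HERMES_API_TIMEOUT and falls back to the default when the
    variable is unset, empty, non-numeric, or not positive. The fallback
    behaviour is an assumption, not confirmed by the PR.
    """
    env = os.environ if env is None else env
    raw = env.get("HERMES_API_TIMEOUT")
    if raw is None or not raw.strip():
        return DEFAULT_API_TIMEOUT_S
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_API_TIMEOUT_S
    return value if value > 0 else DEFAULT_API_TIMEOUT_S
```

Routing both call sites through one helper of this kind would keep the two defaults from drifting apart, which is the situation this PR had to fix in two places.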

Unchanged

  • Stream per-chunk read timeout (60s) — appropriate for inter-chunk timing
  • Stale stream detector (180-300s) — already scales for large contexts
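
To illustrate why a 60s per-chunk limit can stay as it is while the overall budget grows, here is a small sketch. The function `first_stalled_chunk` and the arrival-time model are hypothetical and not taken from the Hermes code; the sketch only shows that an inter-chunk limit bounds the gap between consecutive chunks, not the total duration of a stream.

```python
from typing import Optional, Sequence

# Per-chunk read timeout quoted in this PR (unchanged).
PER_CHUNK_READ_TIMEOUT_S = 60.0


def first_stalled_chunk(
    arrivals_s: Sequence[float],
    read_timeout_s: float = PER_CHUNK_READ_TIMEOUT_S,
) -> Optional[int]:
    """Return the index of the first chunk that arrived more than
    read_timeout_s after the previous chunk, or None if none did.

    arrivals_s holds chunk arrival times in seconds, in increasing order.
    """
    for index in range(1, len(arrivals_s)):
        if arrivals_s[index] - arrivals_s[index - 1] > read_timeout_s:
            return index
    return None


# A stream that emits one chunk every 30s for 20 minutes (1200s in total).
steady_stream = [30.0 * i for i in range(1, 41)]
```

Under these assumptions `first_stalled_chunk(steady_stream)` returns `None`: the stream runs for 1200s, longer than the old 900s default, yet no gap between chunks exceeds 60s.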

Test results

200 passed, 0 failures.

@teknium1 teknium1 force-pushed the hermes/hermes-86f614ec branch from b4391bc to 9ca086c March 27, 2026 18:31
…g models

Models like GLM-5/5.1 can think for 15+ minutes. The previous 900s
(15 min) default for HERMES_API_TIMEOUT killed legitimate requests.

Raised to 1800s (30 min) in both places that read the env var:
- _build_api_kwargs() timeout (non-streaming total timeout)
- _call_chat_completions() write timeout (streaming connection)

The streaming per-chunk read timeout (60s) and stale stream detector
(180-300s) are unchanged — those are appropriate for inter-chunk timing.
@teknium1 teknium1 force-pushed the hermes/hermes-86f614ec branch from 9ca086c to 20c2aeb March 27, 2026 19:58
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@teknium1 teknium1 changed the title from "fix(streaming): increase read timeout and skip retries for thinking models" to "fix: increase API timeout default from 900s to 1800s for slow-thinking models" Mar 27, 2026
@teknium1 teknium1 merged commit fb46a90 into main Mar 27, 2026
4 checks passed