fix(matrix): add backoff for SyncError in sync loop #2937

Closed
ticketclosed-wontfix wants to merge 1 commit into NousResearch:main from ticketclosed-wontfix:fix/matrix-sync-loop-backoff
Conversation

@ticketclosed-wontfix
Contributor

Summary

When the Matrix homeserver returns an error response, matrix-nio parses it into a SyncError and returns it from sync() instead of raising an exception. The sync loop only had backoff logic in its except handler, so a SyncError response triggered an immediate, tight retry loop (~489 req/s) that flooded the logs with hundreds of 'next_batch' is a required property warnings per second and hammered the homeserver.

Fix

Check the sync() return value for nio.SyncError and sleep 5s before retrying, matching the existing exception-handling backoff.
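
The shape of the change can be sketched as follows. This is a minimal, self-contained illustration and not the actual Hermes gateway code: `SyncLoop`, `handled`, and `backoff_seconds` are hypothetical names, and `SyncResponse` / `SyncError` are local stand-ins for the matrix-nio classes so the snippet runs without matrix-nio installed. Only the `_closing` guard and the 5s delay are taken from the description above.

```python
import asyncio
import logging

logger = logging.getLogger("matrix_sync_sketch")


class SyncResponse:
    """Stand-in for nio.SyncResponse (illustration only)."""


class SyncError:
    """Stand-in for nio.SyncError: returned by sync(), never raised."""

    def __init__(self, message: str) -> None:
        self.message = message


class SyncLoop:
    """Hypothetical sync loop with backoff on both failure paths."""

    def __init__(self, client, backoff_seconds: float = 5.0) -> None:
        self._client = client
        self._closing = False
        self._backoff = backoff_seconds
        self.handled = 0  # count of successful sync responses

    async def run(self) -> None:
        while not self._closing:
            try:
                resp = await self._client.sync(timeout=30000)
            except asyncio.CancelledError:
                raise
            except Exception as exc:
                # Pre-existing path: exceptions already backed off.
                if self._closing:
                    return
                logger.warning("sync raised %r; retrying in %ss", exc, self._backoff)
                await asyncio.sleep(self._backoff)
                continue

            if isinstance(resp, SyncError):
                # New path: an error *return value* must also back off,
                # otherwise the loop retries immediately.
                if self._closing:
                    return  # do not delay shutdown by the backoff interval
                logger.warning(
                    "sync returned error: %s; retrying in %ss",
                    resp.message,
                    self._backoff,
                )
                await asyncio.sleep(self._backoff)
                continue

            self.handled += 1
```

With real matrix-nio, the `isinstance` check would presumably target `nio.SyncError` as the PR describes; the stand-in classes here only mirror that return-value behaviour.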

Reproduction

  1. Run Hermes gateway with a Conduwuit homeserver
  2. Trigger a transient sync error (e.g., restart Conduwuit mid-sync)
  3. Observe logs filling at ~489 lines/second with WARNING nio.responses: Error validating response: 'next_batch' is a required property

Testing

  • Verified the sync loop processes normal SyncResponse objects unchanged
  • Verified SyncError responses now trigger a 5s backoff with a descriptive warning
  • The self._closing guard prevents a 5s delay on shutdown, matching the existing pattern at line 561

When the homeserver returns an error response that matrix-nio parses
as a SyncError (rather than raising an exception), the sync loop
previously retried immediately with no delay. This created a tight
retry loop (~489 req/s) flooding logs with 'next_batch is a required
property' warnings and hammering the homeserver.

Check the sync() return value for SyncError and sleep 5s before
retrying, matching the existing exception-handling backoff.
teknium1 pushed a commit that referenced this pull request Mar 26, 2026
When the homeserver returns an error response, matrix-nio parses it
as a SyncError return value rather than raising an exception. The sync
loop only had backoff in the except handler, so SyncError caused a
tight retry loop (~489 req/s) flooding logs and hammering the
homeserver. Check the return value and sleep 5s before retry.

Cherry-picked from PR #2937 by ticketclosed-wontfix.
teknium1 added a commit that referenced this pull request Mar 26, 2026
When the homeserver returns an error response, matrix-nio parses it
as a SyncError return value rather than raising an exception. The sync
loop only had backoff in the except handler, so SyncError caused a
tight retry loop (~489 req/s) flooding logs and hammering the
homeserver. Check the return value and sleep 5s before retry.

Cherry-picked from PR #2937 by ticketclosed-wontfix.

Co-authored-by: ticketclosed-wontfix <ticketclosed-wontfix@users.noreply.github.com>
@teknium1
Contributor

Merged via PR #3280. Your commit was cherry-picked onto current main with authorship preserved. Clean fix, thanks!

StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026
When the homeserver returns an error response, matrix-nio parses it
as a SyncError return value rather than raising an exception. The sync
loop only had backoff in the except handler, so SyncError caused a
tight retry loop (~489 req/s) flooding logs and hammering the
homeserver. Check the return value and sleep 5s before retry.

Cherry-picked from PR NousResearch#2937 by ticketclosed-wontfix.

Co-authored-by: ticketclosed-wontfix <ticketclosed-wontfix@users.noreply.github.com>