fix: gateway token double-counting with cached agents #3306

Merged
teknium1 merged 1 commit into main from hermes/hermes-140430f8 on Mar 27, 2026

@teknium1
Contributor

Summary

Fixes #3222 (reported by @zaycruz).

Gateway was double/triple-counting token usage because the cached agent accumulates session_input_tokens across messages (cumulative totals), but update_session() used += (increment) in both the in-memory entry and the SQLite DB.

Example of the bug

| Message | Agent returns | Entry had | Entry becomes (bug) | Should be |
|---|---|---|---|---|
| 1 | 100 | 0 | 0 + 100 = 100 ✓ | 100 |
| 2 | 250 | 100 | 100 + 250 = 350 ✗ | 250 |
| 3 | 300 | 350 | 350 + 300 = 650 ✗ | 300 |

This caused inflated /usage reports and could trigger premature context compression.
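
The arithmetic in the table above can be replayed in a few lines. This is only an illustrative sketch: `replay` and `agent_totals` are hypothetical names made up for the example, not code from session.py. It contrasts adding each cumulative total (the old behaviour) with assigning it (the fixed behaviour).

```python
# Cumulative totals reported by the cached agent after each message
# (values taken from the table above).
agent_totals = [100, 250, 300]


def replay(totals, absolute):
    """Replay the updates and return the entry value after each message."""
    entry = 0
    history = []
    for total in totals:
        if absolute:
            entry = total   # fixed behaviour: store the cumulative total as-is
        else:
            entry += total  # old behaviour: treats a cumulative total as a delta
        history.append(entry)
    return history


buggy = replay(agent_totals, absolute=False)
fixed = replay(agent_totals, absolute=True)
print(buggy)  # [100, 350, 650]
print(fixed)  # [100, 250, 300]
```

The buggy series diverges from the second message onward, which matches the ✗ rows in the table.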

Fix

  • session.py: change in-memory += to = (direct assignment for cumulative values)
  • hermes_state.py: add absolute=True flag to update_token_counts() — uses SET col = ? instead of SET col = col + ?
  • session.py: pass absolute=True when calling the DB

The CLI path is unchanged — it passes per-API-call deltas directly with the default absolute=False (increment).
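
A minimal sketch of how the two modes might look against SQLite follows. The `sessions` table, its column names, the parameter list, and the `read_counts` helper are assumptions made for illustration. Only the `update_token_counts()` name, the `absolute` flag, and the `SET col = ?` versus `SET col = col + ?` distinction come from this PR. The output-token values are invented sample data.

```python
import sqlite3


def update_token_counts(conn, session_id, input_tokens, output_tokens, absolute=False):
    """Hypothetical sketch: increment by default, overwrite when absolute=True."""
    if absolute:
        # Caller supplies cumulative totals, so store them directly.
        sql = "UPDATE sessions SET input_tokens = ?, output_tokens = ? WHERE id = ?"
    else:
        # Caller supplies per-call deltas, so add them to the stored values.
        sql = (
            "UPDATE sessions SET input_tokens = input_tokens + ?, "
            "output_tokens = output_tokens + ? WHERE id = ?"
        )
    conn.execute(sql, (input_tokens, output_tokens, session_id))
    conn.commit()


def read_counts(conn, session_id):
    """Return (input_tokens, output_tokens) for one session."""
    return conn.execute(
        "SELECT input_tokens, output_tokens FROM sessions WHERE id = ?",
        (session_id,),
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "id TEXT PRIMARY KEY, "
    "input_tokens INTEGER DEFAULT 0, "
    "output_tokens INTEGER DEFAULT 0)"
)
conn.execute("INSERT INTO sessions (id) VALUES ('gateway'), ('cli')")

# Gateway-style caller: cumulative totals, written with absolute=True.
for total_in, total_out in [(100, 10), (250, 25), (300, 40)]:
    update_token_counts(conn, "gateway", total_in, total_out, absolute=True)

# CLI-style caller: per-call deltas, written with the default increment.
for delta_in, delta_out in [(100, 10), (150, 15), (50, 15)]:
    update_token_counts(conn, "cli", delta_in, delta_out)

gateway_counts = read_counts(conn, "gateway")
cli_counts = read_counts(conn, "cli")
print(gateway_counts, cli_counts)  # (300, 40) (300, 40)
```

Both callers end at the same stored totals, which is the point of keeping the increment as the default: existing delta-based callers need no change, and only the gateway opts in to overwriting.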

Why not cherry-pick #3222

The original PR is stale (+225/-123 with heavy formatting noise) and bundles an unrelated platform toolset refactor that no longer applies. The actual fix is the += to = change plus the DB flag.

…lative totals

The cached agent accumulates session_input_tokens across messages, so
run_conversation() returns cumulative totals. But update_session() used
+= (increment), double-counting on every message after the first.

- session.py: change in-memory entry updates from += to = (direct
  assignment for cumulative values)
- hermes_state.py: add absolute=True flag to update_token_counts()
  that uses SET column = ? instead of SET column = column + ?
- session.py: pass absolute=True to the DB call

CLI path is unchanged — it passes per-API-call deltas directly to
update_token_counts() with the default absolute=False (increment).

Reported by @zaycruz in #3222. Closes #3222.
teknium1 merged commit a8df7f9 into main on Mar 27, 2026
1 of 2 checks passed
StreamOfRon pushed a commit to StreamOfRon/hermes-agent that referenced this pull request Mar 29, 2026