VS Code: Persist and display token usage by turn/session/history #10567
Replies: 6 comments 1 reply
---
Great proposal! Token visibility is crucial for cost management and debugging.

We build token-aware AI tools at Revolution AI — budget visibility is one of the most requested features from enterprise users. Looking forward to the PR!
---
+1 for persistent token tracking! Essential for cost management. Great design decisions!

Additional suggestions:

1. Cost estimation

   ```ts
   interface TokenUsage {
     promptTokens: number;
     completionTokens: number;
     estimatedCost?: number; // Based on model pricing
   }
   ```

2. Export capability

   ```jsonc
   // Allow export for billing/analysis
   {
     "sessions": [
       {
         "id": "abc",
         "totalTokens": 15000,
         "turns": [
           { "prompt": 500, "completion": 200 }
         ]
       }
     ]
   }
   ```

3. Budget alerts

   ```yaml
   continue:
     tokenBudget:
       daily: 100000
       alertAt: 80% # Warn at 80% usage
   ```

4. Per-model breakdown

We track AI costs at Revolution AI — this feature would be huge for enterprise users managing budgets.
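The cost-estimation suggestion above could be sketched roughly as follows. The model names, pricing table, and `withEstimatedCost` helper are all illustrative assumptions, not an actual Continue API or real provider rates:

```typescript
// Illustrative per-model pricing in dollars per million tokens.
// These figures are placeholders, not authoritative rates.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  estimatedCost?: number; // Based on model pricing
}

const PRICING: Record<string, ModelPricing> = {
  "example-large": { inputPerMTok: 3.0, outputPerMTok: 15.0 },
  "example-small": { inputPerMTok: 0.25, outputPerMTok: 1.25 },
};

// Fill in estimatedCost from the pricing table; leave it undefined
// for unknown models rather than guessing at a rate.
function withEstimatedCost(model: string, usage: TokenUsage): TokenUsage {
  const p = PRICING[model];
  if (!p) return usage;
  const cost =
    (usage.promptTokens / 1_000_000) * p.inputPerMTok +
    (usage.completionTokens / 1_000_000) * p.outputPerMTok;
  return { ...usage, estimatedCost: cost };
}
```

Keeping `estimatedCost` optional means the UI can degrade gracefully to raw token counts when the model's pricing is unknown.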
---
This would be a great addition. I'm actually thinking of installing LM Studio instead of Ollama, simply for the detailed token usage info it provides.
---
Good proposal. One angle worth adding to the data model: token breakdown by prompt section, not just by turn. The instruction layer (system prompt) is often a fixed cost per turn that doesn't get surfaced separately from the dynamic context. Knowing whether tokens are going to retrieval, examples, instructions, or conversation history changes the optimization decision. You'd want different strategies for each.

A structured prompt format (typed blocks with known boundaries) makes that breakdown feasible — you can measure instruction tokens separately from content tokens because the structure says where one ends and the other begins. I've been building flompt for this, a visual prompt builder that decomposes prompts into 12 semantic blocks and compiles to Claude-optimized XML. Open-source: github.com/Nyrok/flompt

A token usage display that can attribute usage to instruction vs. context vs. output would be much more actionable than per-turn totals.
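The per-section attribution described above could look something like this. The block kinds and the `countTokens` helper are assumptions for illustration; a real implementation would use the target model's tokenizer rather than a whitespace split:

```typescript
type BlockKind = "instructions" | "examples" | "retrieval" | "history";

interface PromptBlock {
  kind: BlockKind;
  text: string;
}

// Placeholder tokenizer: a rough whitespace split, for illustration only.
// Real code would call the model's actual tokenizer.
function countTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Attribute token usage to each semantic section of the prompt,
// so fixed instruction cost is visible separately from dynamic context.
function tokensByKind(blocks: PromptBlock[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const b of blocks) {
    totals[b.kind] = (totals[b.kind] ?? 0) + countTokens(b.text);
  }
  return totals;
}
```

Because blocks are typed, the same prompt rendered on every turn yields a stable "instructions" line in the breakdown, which is exactly the fixed cost the comment says is currently invisible.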
---
The escalating display levels are a great design choice.

On the cost estimation piece that multiple people have mentioned: the tricky part with showing dollar amounts alongside tokens is that pricing varies significantly depending on context — cached vs. uncached input tokens can differ by 10x (e.g., Anthropic charges $3/MTok for input but $0.30/MTok for cached reads). So a naive flat-rate estimate that ignores caching can be off by a large margin.

A more accurate approach: extract the actual breakdown from the API response headers/body. Most providers return this in the response metadata — OpenAI reports prompt and completion token counts (including cached-token details) in the `usage` object of each response, and Anthropic exposes cache read/write counts similarly.

For the budget alerts idea — one pattern that works well is per-session cost accumulation with a configurable threshold. Rather than daily limits (which are hard to track across IDE restarts), session-scoped budgets are more practical.

+1 on this feature. Token visibility is becoming table stakes for AI-assisted coding tools — it changes how developers think about which model to use per task.
---
Hi, any timeline on this feature?
---
Summary
Token usage is currently visible in some runtime views but not consistently persisted or shown across chat/session/history contexts. We want token usage to be reliable, local-first, and user-configurable in one place.
Problem
Proposal
`continue.showTokenUsage`: `"never" | "history" | "session" | "turn"`

- `never`: show nowhere
- `history`: history only
- `session`: history + session
- `turn`: history + session + turn

Setting changes apply after Developer: Reload Window.

PR incoming :)
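The escalating levels above form a strict subset chain (each level shows everything the previous one does, plus one more surface). A minimal sketch of how the setting could map to display surfaces; the function and table names are hypothetical, not the extension's actual code:

```typescript
type TokenUsageLevel = "never" | "history" | "session" | "turn";
type Surface = "history" | "session" | "turn";

// Each level includes all surfaces of the levels below it,
// matching the escalating design in the proposal.
const LEVEL_SURFACES: Record<TokenUsageLevel, Surface[]> = {
  never: [],
  history: ["history"],
  session: ["history", "session"],
  turn: ["history", "session", "turn"],
};

// Decide whether a given view should render token usage
// under the current continue.showTokenUsage setting.
function shouldShow(setting: TokenUsageLevel, surface: Surface): boolean {
  return LEVEL_SURFACES[setting].includes(surface);
}
```

Encoding the chain as an explicit table keeps the "history ⊂ session ⊂ turn" invariant in one place instead of scattering comparisons across each view.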