VS Code: Persist and display token usage by turn/session/history #10567
Replies: 6 comments 1 reply
---
Great proposal! Token visibility is crucial for cost management and debugging.

We build token-aware AI tools at Revolution AI — budget visibility is one of the most requested features from enterprise users. Looking forward to the PR!
---
+1 for persistent token tracking! Essential for cost management. Great design decisions!

Additional suggestions:

1. Cost estimation

   ```ts
   interface TokenUsage {
     promptTokens: number;
     completionTokens: number;
     estimatedCost?: number; // Based on model pricing
   }
   ```

2. Export capability

   ```jsonc
   // Allow export for billing/analysis
   {
     "sessions": [
       {
         "id": "abc",
         "totalTokens": 15000,
         "turns": [
           { "prompt": 500, "completion": 200 }
         ]
       }
     ]
   }
   ```

3. Budget alerts

   ```yaml
   continue:
     tokenBudget:
       daily: 100000
       alertAt: 80% # Warn at 80% usage
   ```

4. Per-model breakdown

We track AI costs at Revolution AI — this feature would be huge for enterprise users managing budgets.
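The cost-estimation suggestion above could be sketched roughly as follows. The model names, pricing table, and `withEstimatedCost` helper are all illustrative assumptions, not an actual Continue API or real provider rates:

```typescript
// Illustrative per-model pricing in dollars per million tokens.
// These figures are placeholders, not authoritative rates.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  estimatedCost?: number; // Based on model pricing
}

const PRICING: Record<string, ModelPricing> = {
  "example-large": { inputPerMTok: 3.0, outputPerMTok: 15.0 },
  "example-small": { inputPerMTok: 0.25, outputPerMTok: 1.25 },
};

// Fill in estimatedCost from the pricing table; leave it undefined
// for unknown models rather than guessing at a rate.
function withEstimatedCost(model: string, usage: TokenUsage): TokenUsage {
  const p = PRICING[model];
  if (!p) return usage;
  const cost =
    (usage.promptTokens / 1_000_000) * p.inputPerMTok +
    (usage.completionTokens / 1_000_000) * p.outputPerMTok;
  return { ...usage, estimatedCost: cost };
}
```

Keeping `estimatedCost` optional means the UI can degrade gracefully to raw token counts when the model's pricing is unknown.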
---
This would be a great addition. I'm actually thinking of installing LM Studio instead of Ollama, simply for the detailed token usage info it provides.
---
Good proposal. One angle worth adding to the data model: token breakdown by prompt section, not just by turn. The instruction layer (system prompt) is often a fixed cost per turn that doesn't get surfaced separately from the dynamic context. Knowing whether tokens are going to retrieval, examples, instructions, or conversation history changes the optimization decision. You'd want different strategies for each.

A structured prompt format (typed blocks with known boundaries) makes that breakdown feasible — you can measure instruction tokens separately from content tokens because the structure says where one ends and the other begins. I've been building flompt for this, a visual prompt builder that decomposes prompts into 12 semantic blocks and compiles to Claude-optimized XML. Open-source: github.com/Nyrok/flompt

A token usage display that can attribute usage to instruction vs. context vs. output would be much more actionable than per-turn totals.
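The per-section attribution described above could look something like this. The block kinds and the `countTokens` helper are assumptions for illustration; a real implementation would use the target model's tokenizer rather than a whitespace split:

```typescript
type BlockKind = "instructions" | "examples" | "retrieval" | "history";

interface PromptBlock {
  kind: BlockKind;
  text: string;
}

// Placeholder tokenizer: a rough whitespace split, for illustration only.
// Real code would call the model's actual tokenizer.
function countTokens(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Attribute token usage to each semantic section of the prompt,
// so fixed instruction cost is visible separately from dynamic context.
function tokensByKind(blocks: PromptBlock[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const b of blocks) {
    totals[b.kind] = (totals[b.kind] ?? 0) + countTokens(b.text);
  }
  return totals;
}
```

Because blocks are typed, the same prompt rendered on every turn yields a stable "instructions" line in the breakdown, which is exactly the fixed cost the comment says is currently invisible.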
---
The escalating display levels are a great design choice.

On the cost estimation piece that multiple people have mentioned: the tricky part with showing dollar amounts alongside tokens is that pricing varies significantly depending on context — cached vs. uncached input tokens can differ by 10x (e.g., Anthropic charges $3/MTok for input but $0.30/MTok for cached reads). So a naive flat-rate estimate that ignores caching can be off by a large margin.

A more accurate approach: extract the actual breakdown from the API response headers/body. Most providers return this in the response metadata — OpenAI reports prompt and completion token counts (including cached-token details) in the `usage` object of each response, and Anthropic exposes cache read/write counts similarly.

For the budget alerts idea — one pattern that works well is per-session cost accumulation with a configurable threshold. Rather than daily limits (which are hard to track across IDE restarts), session-scoped budgets are more practical.

+1 on this feature. Token visibility is becoming table stakes for AI-assisted coding tools — it changes how developers think about which model to use per task.
---
Hi, any timeline on this feature?
---
Summary
Token usage is currently visible in some runtime views but not consistently persisted or shown across chat/session/history contexts. We want token usage to be reliable, local-first, and user-configurable in one place.
Problem
Proposal
`continue.showTokenUsage`: `"never" | "history" | "session" | "turn"`

- `never`: show nowhere
- `history`: history only
- `session`: history + session
- `turn`: history + session + turn

Setting changes apply after Developer: Reload Window.

PR incoming :)
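The escalating levels above form a strict subset chain (each level shows everything the previous one does, plus one more surface). A minimal sketch of how the setting could map to display surfaces; the function and table names are hypothetical, not the extension's actual code:

```typescript
type TokenUsageLevel = "never" | "history" | "session" | "turn";
type Surface = "history" | "session" | "turn";

// Each level includes all surfaces of the levels below it,
// matching the escalating design in the proposal.
const LEVEL_SURFACES: Record<TokenUsageLevel, Surface[]> = {
  never: [],
  history: ["history"],
  session: ["history", "session"],
  turn: ["history", "session", "turn"],
};

// Decide whether a given view should render token usage
// under the current continue.showTokenUsage setting.
function shouldShow(setting: TokenUsageLevel, surface: Surface): boolean {
  return LEVEL_SURFACES[setting].includes(surface);
}
```

Encoding the chain as an explicit table keeps the "history ⊂ session ⊂ turn" invariant in one place instead of scattering comparisons across each view.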