BYOK Costs
Cost model for AI conversations with bring-your-own-key providers.
Ticket0 tracks estimated AI cost per task in ai_usage and surfaces breakdowns in AI Agent → Spend.
Typical cost profile per conversation
Most conversations include multiple small calls:
- classification/routing/language tasks (Flash or Haiku-class)
- draft generation (Sonnet-quality)
- optional retrieval and evaluation tasks
In practice, a low-complexity conversation usually costs a few tenths of a cent. Long, complex threads cost more because they consume more tokens.
How the estimate is computed
Each call's cost is computed from its token usage and the provider's published per-million-token input and output rates:
(input_tokens × input_rate + output_tokens × output_rate) / 1_000_000
A conversation's total cost is the sum of every call made on its behalf (classification, retrieval, drafting, evaluation). The exact per-model rates live in the AI service and are updated alongside the model catalogue — you see the resulting per-call and per-conversation numbers in AI Agent → Spend rather than doing the arithmetic yourself.
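As a minimal sketch of the arithmetic above, the following Python computes a per-call estimate and sums the calls for one conversation. The model names, the numbers in `RATES_PER_MILLION`, and the `Call` record are placeholder assumptions for illustration; they are not Ticket0's rate table or data model.

```python
from dataclasses import dataclass

# Hypothetical per-million-token rates in USD: (input_rate, output_rate).
# Placeholder values for illustration only, not real provider pricing.
RATES_PER_MILLION = {
    "small-model": (0.10, 0.40),
    "large-model": (3.00, 15.00),
}


@dataclass
class Call:
    """One AI call made on behalf of a conversation (hypothetical shape)."""
    task: str
    model: str
    input_tokens: int
    output_tokens: int


def call_cost(call: Call) -> float:
    """Apply the per-million-token formula to a single call."""
    input_rate, output_rate = RATES_PER_MILLION[call.model]
    return (
        call.input_tokens * input_rate + call.output_tokens * output_rate
    ) / 1_000_000


def conversation_cost(calls: list[Call]) -> float:
    """A conversation's cost is the sum of all of its calls."""
    return sum(call_cost(c) for c in calls)


calls = [
    Call("classification", "small-model", 1_200, 20),
    Call("drafting", "large-model", 800, 150),
]
print(f"Estimated conversation cost: ${conversation_cost(calls):.6f}")
```

With these made-up rates the drafting call dominates the total, which is the usual reason to look at per-task breakdowns rather than only the conversation total.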
Optimization tips
- Keep KB content focused and deduplicated — irrelevant articles bloat the prompt
- Trim very long historical context (resolve or close old tickets rather than keeping them open as reply targets)
- Raise auto-send thresholds in high-risk categories so expensive drafts aren't wasted on tickets that humans would rewrite anyway
- Use AI Agent → Spend to spot high-cost task types and iterate
Setting a monthly cap
A more reliable lever than chasing pennies per call is the workspace monthly spend cap (AI Agent → Spend → Spend cap). When the cap is hit, Ticket0 pauses AI features for the rest of the month; alert notifications go out at the warning threshold (default 80%) and again at the cap.
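The cap behaviour can be pictured with a small sketch. The function name `cap_status` and its return values are hypothetical, and the comparison logic is an assumption based on the description here, not Ticket0's actual enforcement code.

```python
def cap_status(month_spend: float, cap: float, warning_fraction: float = 0.80) -> str:
    """Classify month-to-date spend against a monthly cap.

    Returns "paused" at or above the cap, "warning" at or above the
    warning threshold (assumed default 80% of the cap), otherwise "ok".
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    if month_spend >= cap:
        return "paused"
    if month_spend >= cap * warning_fraction:
        return "warning"
    return "ok"


# Example with a hypothetical $50 cap.
for spend in (10.0, 45.0, 50.0):
    print(spend, cap_status(spend, cap=50.0))
```

In this sketch a workspace with a $50 cap would see the warning state once spend passes $40 and the paused state at $50.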
For provider-key setup, see Setting up your OpenRouter key. Remember that OpenRouter adds ~5% on top of direct rates; layering direct provider keys bypasses that markup on the calls that can use them.
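To put the markup in concrete terms, here is a rough sketch that assumes a flat 5% surcharge on everything routed through OpenRouter. The function name, the flat-rate assumption, and the example figures are illustrative only.

```python
OPENROUTER_MARKUP = 0.05  # approximate figure quoted above; assumed to be flat


def monthly_markup(
    direct_rate_spend: float,
    share_via_direct_keys: float,
    markup: float = OPENROUTER_MARKUP,
) -> float:
    """Estimate the markup paid in a month.

    direct_rate_spend: total spend priced at direct provider rates.
    share_via_direct_keys: fraction (0.0 to 1.0) of that spend served by
        direct provider keys, which skip the markup.
    """
    if not 0.0 <= share_via_direct_keys <= 1.0:
        raise ValueError("share_via_direct_keys must be between 0 and 1")
    routed_spend = direct_rate_spend * (1.0 - share_via_direct_keys)
    return routed_spend * markup


# Hypothetical workspace: $100/month at direct rates.
print(f"All via OpenRouter: ${monthly_markup(100.0, 0.0):.2f} markup")
print(f"60% via direct keys: ${monthly_markup(100.0, 0.6):.2f} markup")
```

Under these assumptions the markup scales linearly with the share of traffic that still goes through OpenRouter, so direct keys matter most for the highest-volume task types.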