Monitoring AI Quality

How to track AI quality with evaluator scores, feedback signals, and review filters.

Use both operator feedback and automated evaluator signals to monitor AI quality.

Main quality views

  • AI quality report API (/api/reports/ai-quality)
    • acceptance rate
    • edit rate
    • rejection rate
  • Live chat quality filters
    • low quality sessions
    • hallucination flag
    • circular response flag
    • negative feedback count
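The three headline rates can be illustrated with a minimal sketch. The field names and the assumed definition (every AI response is either accepted as-is, edited before sending, or rejected) are assumptions for illustration, not the actual API schema of `/api/reports/ai-quality`:

```python
def quality_rates(accepted: int, edited: int, rejected: int) -> dict:
    """Compute acceptance/edit/rejection rates from raw response counts.

    Assumption: each AI response falls into exactly one bucket.
    """
    total = accepted + edited + rejected
    if total == 0:
        # No responses yet: report all rates as zero rather than divide by zero.
        return {"acceptance_rate": 0.0, "edit_rate": 0.0, "rejection_rate": 0.0}
    return {
        "acceptance_rate": accepted / total,
        "edit_rate": edited / total,
        "rejection_rate": rejected / total,
    }

# Example: 70 accepted, 20 edited, 10 rejected out of 100 responses.
print(quality_rates(70, 20, 10))
```

Because the three buckets are exhaustive and exclusive, the rates always sum to 1, which makes week-over-week comparisons straightforward.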

Conversation evaluator

Completed widget sessions can be evaluated automatically with structured metrics:

  • accuracy
  • completeness
  • resolution
  • hallucination flag
  • circular flag
  • question type key

A composite quality score is computed from these metrics, stored with the session, and surfaced as a badge in the live chat session list.
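One way a composite score can blend structured metrics with the boolean flags is sketched below. The weights and flat penalties are illustrative assumptions, not the product's actual formula:

```python
def composite_score(accuracy: int, completeness: int, resolution: int,
                    hallucination: bool, circular: bool) -> int:
    """Blend evaluator metrics (0-100 each) into one 0-100 quality score.

    Illustrative formula: average the three structured metrics, then
    subtract a flat penalty for each quality flag that fired.
    """
    base = (accuracy + completeness + resolution) / 3
    penalty = (20 if hallucination else 0) + (10 if circular else 0)
    return max(0, round(base - penalty))

# A mostly-good answer that looped: (90 + 80 + 85) / 3 = 85, minus 10.
print(composite_score(90, 80, 85, hallucination=False, circular=True))
```

Clamping at zero keeps badge rendering simple even when multiple penalties stack on a weak answer.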

What to watch weekly

  • rising rejection or edit rates in one category
  • repeated hallucination flags for the same question type
  • low-quality clusters after KB or policy changes
  • high negative-feedback sessions that were not escalated
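The weekly review above can be partly automated. A minimal sketch, assuming per-category rate snapshots from two consecutive weeks (the dictionary shape and the 5-point threshold are assumptions, not a documented format):

```python
def weekly_alerts(prev: dict, curr: dict, jump: float = 0.05) -> list:
    """Flag categories whose rejection or edit rate rose by more than
    `jump` week over week.

    `prev` and `curr` map category -> {"rejection_rate": x, "edit_rate": y}.
    """
    alerts = []
    for category, rates in curr.items():
        before = prev.get(category, {})
        for metric in ("rejection_rate", "edit_rate"):
            delta = rates.get(metric, 0.0) - before.get(metric, 0.0)
            if delta > jump:
                alerts.append((category, metric, round(delta, 3)))
    return alerts

# Rejection rate in "billing" tripled; edit rate barely moved.
prev = {"billing": {"rejection_rate": 0.04, "edit_rate": 0.10}}
curr = {"billing": {"rejection_rate": 0.12, "edit_rate": 0.11}}
print(weekly_alerts(prev, curr))
```

Comparing per-category deltas rather than global averages keeps a regression in one flow from being masked by healthy traffic elsewhere.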

Closing the loop

When you identify a pattern of bad responses:

  1. Add/refresh KB coverage for that scenario.
  2. Tighten instructions for risky behavior.
  3. Increase confidence threshold for that flow.
  4. Monitor acceptance/edit/rejection deltas over the next week.

Nightly aggregation jobs keep quality insights fresh, but immediate operator feedback signals are still the fastest indicator of drift.
