Traces & Evaluations

View agent runs, trace spans, and evaluation results in the dashboard.

The traces and evaluations views let you inspect your agent's execution history and quality assessments.

Viewing Traces

Run List

Each agent shows its recent runs with:

Status - running, completed, failed
Trigger - manual, automatic, test
Overall score - evaluation result (if complete)
Timestamp - when the run started and completed

Span Detail

Clicking a run shows its trace spans - the individual function calls recorded by @projectkate.trace():

Node name - the function or span name
Span kind - LLM call, tool use, or custom
Input - what was passed to the function
Output - what was returned
Duration - execution time in milliseconds
Token count - LLM tokens used (if applicable)

Spans are displayed in a waterfall view showing the execution timeline.
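To make the waterfall idea concrete, here is a minimal sketch of how span records could be laid out on a timeline. It is not the dashboard's implementation: the `Span` class, the kind labels, and the `start_ms` field are assumptions introduced for illustration (the field list above mentions duration but not a start offset, which a timeline needs).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical span record mirroring the fields listed above."""
    node_name: str
    kind: str                      # assumed labels: "llm", "tool", "custom"
    start_ms: float                # assumed: offset from the start of the run
    duration_ms: float
    token_count: Optional[int] = None


def render_waterfall(spans: list[Span], width: int = 40) -> list[str]:
    """Return one text row per span; bars are offset by start time and sized by duration."""
    if not spans:
        return []
    total = max(s.start_ms + s.duration_ms for s in spans)
    if total <= 0:
        total = 1.0  # avoid division by zero when every span is instantaneous
    rows = []
    for s in sorted(spans, key=lambda s: s.start_ms):
        offset = int(s.start_ms * width / total)
        length = max(1, int(s.duration_ms * width / total))
        rows.append(f"{s.node_name:<12}|{' ' * offset}{'#' * length} {s.duration_ms:.0f} ms")
    return rows


spans = [
    Span("plan", "llm", 0, 400, token_count=512),
    Span("search", "tool", 400, 1200),
    Span("answer", "llm", 1600, 400, token_count=890),
]
for row in render_waterfall(spans):
    print(row)
```

Each row starts where the span began and is as long as the span ran, so sequential calls form a staircase and long-running calls stand out.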

Evaluations

Intelligence Summary

Each evaluated run produces an intelligence summary:

Overall score - composite quality score (0.0 to 1.0)
Natural language summary - what went well and what didn't
Recommendations - specific suggestions for improvement
Regression detection - alerts if scores dropped from previous runs
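The platform's regression rule is not documented here, so the following is only a sketch of one plausible approach: compare the latest overall score against the mean of the preceding runs and flag a drop larger than a tolerance. The function name, the `window` size, and the `tolerance` value are all assumptions.

```python
from statistics import mean


def detect_regression(scores: list[float], window: int = 5, tolerance: float = 0.05) -> bool:
    """Flag a regression when the latest score falls below the mean of the
    preceding `window` scores by more than `tolerance`.

    Illustrative rule only; the dashboard may use a different method.
    """
    if len(scores) < 2:
        return False  # nothing to compare against
    latest = scores[-1]
    baseline = mean(scores[-(window + 1):-1])
    return baseline - latest > tolerance


history = [0.82, 0.85, 0.84, 0.86, 0.71]  # overall scores, oldest first
print(detect_regression(history))
```

A tolerance keeps small run-to-run fluctuations from triggering alerts; a sensible value depends on how noisy your agent's scores are.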

Score Trends

The trends chart shows scores over time, letting you visualize:

Whether your agent is improving after knowledge acquisition
Whether recent changes caused regressions
Score stability across different types of requests

Per-Node Breakdown

The per-node breakdown shows which parts of your agent perform well and which are weak:

Individual metrics for each traced function
Bottlenecks (slow spans) and quality issues (low-scoring spans)
Node performance compared across runs
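If you export span data and want to reproduce this kind of breakdown offline, a grouping step along these lines would do it. This is a sketch under assumptions: the `(node_name, duration_ms, score)` tuple shape and the metric names are hypothetical, not the platform's export format.

```python
from collections import defaultdict
from statistics import mean


def per_node_breakdown(spans):
    """Group (node_name, duration_ms, score) tuples by node and average each metric."""
    grouped = defaultdict(list)
    for node, duration_ms, score in spans:
        grouped[node].append((duration_ms, score))
    return {
        node: {
            "avg_duration_ms": mean(d for d, _ in rows),
            "avg_score": mean(s for _, s in rows),
            "calls": len(rows),
        }
        for node, rows in grouped.items()
    }


# Hypothetical sample data: two runs, two traced functions each.
records = [
    ("retrieve", 1200, 0.9),
    ("retrieve", 1000, 0.8),
    ("summarize", 300, 0.5),
    ("summarize", 500, 0.75),
]
breakdown = per_node_breakdown(records)
slowest = max(breakdown, key=lambda n: breakdown[n]["avg_duration_ms"])
weakest = min(breakdown, key=lambda n: breakdown[n]["avg_score"])
print(f"bottleneck: {slowest}, lowest-scoring: {weakest}")
```

Separating the slowest node from the lowest-scoring one matters because they often differ: a fast node can still be the main source of quality problems.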

Triggering Evaluations

Evaluations run automatically when a run completes. You can also trigger one manually from the dashboard:

1. Navigate to your agent's detail page
2. Click "Trigger Evaluation"
3. Wait for the evaluation to complete (typically 30-60 seconds)

Via SDK:

# Call from inside an async function; `client` is an already-initialized SDK client.
result = await client.evals.trigger(agent_id="your-agent-id")
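Because `await` only works inside a coroutine, the call needs an async wrapper when run from a script. The sketch below shows one way to do that and adds a timeout. Only `client.evals.trigger(agent_id=...)` comes from this page; the wrapper name, the 120-second timeout, the stand-in client, and the shape of the returned value are assumptions, and constructing the real client is not shown here.

```python
import asyncio
from types import SimpleNamespace


async def trigger_evaluation(client, agent_id: str, timeout_s: float = 120.0):
    """Await client.evals.trigger(...) and raise asyncio.TimeoutError after timeout_s seconds."""
    return await asyncio.wait_for(
        client.evals.trigger(agent_id=agent_id),
        timeout=timeout_s,
    )


# Stand-in client for local experimentation only; NOT the real SDK client.
# The returned dict is a made-up shape, not the SDK's actual result type.
async def _fake_trigger(agent_id: str):
    await asyncio.sleep(0)
    return {"agent_id": agent_id, "status": "completed"}


fake_client = SimpleNamespace(evals=SimpleNamespace(trigger=_fake_trigger))

result = asyncio.run(trigger_evaluation(fake_client, "your-agent-id"))
print(result)
```

With the real SDK, you would pass your initialized client in place of `fake_client`. The timeout is chosen to sit comfortably above the typical 30-60 second evaluation time.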

Next Steps

Tracing (SDK) - instrument your agent
Runs (SDK) - manage runs programmatically
Evals Client - evaluation API
