Evals Client
View evaluation results, trends, and intelligence summaries with client.evals.
The evals client provides access to your agent's automated evaluation results.
Methods
summary(agent_id)
Get the latest evaluation intelligence summary.
summary = await client.evals.summary(agent_id="your-agent-id")
print(f"Overall score: {summary.overall_score}")
print(f"Regression: {summary.regression_detected}")
print(f"Summary: {summary.intelligence_summary}")
for rec in summary.recommendations:
    print(f" - {rec}")

Returns: EvalSummary
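Because overall_score is typed float | None in the data model, printing it directly shows "None" when no score is available. The following is a minimal sketch of a more defensive formatter. It uses a local stand-in for EvalSummary with the documented fields so it runs without the SDK, and format_summary is a hypothetical helper, not part of the client.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EvalSummary:
    """Local stand-in mirroring the fields documented under Data Model."""
    intelligence_summary: str
    updated_at: str
    overall_score: float | None
    recommendations: list[str]
    regression_detected: bool


def format_summary(summary: EvalSummary) -> str:
    """Hypothetical helper: render a summary, tolerating a missing score."""
    if summary.overall_score is None:
        score = "n/a"
    else:
        score = f"{summary.overall_score:.2f}"
    lines = [
        f"Overall score: {score}",
        f"Regression: {summary.regression_detected}",
        f"Summary: {summary.intelligence_summary}",
    ]
    lines.extend(f" - {rec}" for rec in summary.recommendations)
    return "\n".join(lines)


# Example values are made up for illustration only.
example = EvalSummary(
    intelligence_summary="No scored runs yet.",
    updated_at="2024-01-01T00:00:00Z",
    overall_score=None,
    recommendations=["Add more test cases"],
    regression_detected=False,
)
report = format_summary(example)
print(report)
```

In real use, the object returned by client.evals.summary would presumably be passed to the helper in place of the stand-in.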
history(agent_id)
Get the history of evaluation runs.
history = await client.evals.history(agent_id="your-agent-id")
for run in history:
    print(f"[{run.created_at}] Score: {run.overall_score}")

Returns: List of evaluation runs
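One common use of the run history is comparing the two most recent scores. The sketch below assumes only what the example above shows, namely that each run exposes created_at and overall_score. It further assumes that created_at is an ISO-8601 string (so lexicographic sorting is chronological) and that a run's score may be missing; neither is confirmed by this page. score_delta is a hypothetical helper, and SimpleNamespace objects stand in for real runs.

```python
from types import SimpleNamespace


def score_delta(runs):
    """Hypothetical helper: score change between the two most recent scored runs.

    Returns None when fewer than two runs carry a score.
    """
    # Assumption: overall_score may be None on a run, so filter those out.
    scored = [r for r in runs if r.overall_score is not None]
    # Assumption: created_at is an ISO-8601 string, so string order is time order.
    scored.sort(key=lambda r: r.created_at)
    if len(scored) < 2:
        return None
    return scored[-1].overall_score - scored[-2].overall_score


# Stand-in runs with made-up values, deliberately out of order.
runs = [
    SimpleNamespace(created_at="2024-05-03T10:00:00Z", overall_score=0.75),
    SimpleNamespace(created_at="2024-05-01T10:00:00Z", overall_score=0.5),
    SimpleNamespace(created_at="2024-05-02T10:00:00Z", overall_score=None),
]
delta = score_delta(runs)
print(delta)
```

Sorting inside the helper avoids depending on the order in which history returns runs, which this page does not specify.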
trends(agent_id, days)
Get score trends over time.
trends = await client.evals.trends(
    agent_id="your-agent-id",
    days=30,
)

Returns: Trend data for charting
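The snippets above use await at the top level, which works in a notebook or async REPL but not in a plain script. As a rough sketch, the three calls can be wrapped in a coroutine and run concurrently with asyncio.gather. FakeClient and FakeEvals are stand-ins invented here so the example runs without the SDK; their return values are placeholders and say nothing about the real response shapes. fetch_eval_overview is likewise a hypothetical helper.

```python
import asyncio


class FakeEvals:
    """Stand-in for client.evals; return values are placeholders only."""

    async def summary(self, agent_id):
        return {"kind": "summary", "agent_id": agent_id}

    async def history(self, agent_id):
        return [{"kind": "run", "agent_id": agent_id}]

    async def trends(self, agent_id, days):
        return {"kind": "trends", "agent_id": agent_id, "days": days}


class FakeClient:
    """Stand-in for the real client object."""

    def __init__(self):
        self.evals = FakeEvals()


async def fetch_eval_overview(client, agent_id, days=30):
    """Hypothetical helper: fetch summary, history, and trends concurrently."""
    summary, history, trends = await asyncio.gather(
        client.evals.summary(agent_id=agent_id),
        client.evals.history(agent_id=agent_id),
        client.evals.trends(agent_id=agent_id, days=days),
    )
    return summary, history, trends


# asyncio.run provides the event loop that top-level await would otherwise need.
overview = asyncio.run(fetch_eval_overview(FakeClient(), "your-agent-id"))
print(overview[2])
```

With the real SDK, the configured client would presumably be passed in place of FakeClient().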
Data Model
EvalSummary
@dataclass
class EvalSummary:
    intelligence_summary: str  # Natural language summary
    updated_at: str
    overall_score: float | None  # 0.0 to 1.0
    recommendations: list[str]  # Improvement suggestions
    regression_detected: bool  # True if scores dropped

Next Steps
- Traces & Evals (Dashboard) - visual evaluation management
- Runs Client - manage the runs being evaluated