Rate Limits
API rate limits by endpoint.
Rate limits prevent abuse and ensure fair access. When you exceed a limit, the API returns 429 Too Many Requests.
Default Limits
Most endpoints share a default rate limit per API key:
| Tier | Requests per Minute |
|---|---|
| Standard | 60 |
| Elevated (on request) | 300 |
Endpoint-Specific Limits
Some endpoints have stricter limits:
| Endpoint | Limit | Reason |
|---|---|---|
| POST /agents/{id}/eval/trigger | 30/minute | Evaluation is compute-intensive |
| POST /agents/{id}/knowledge | 10/minute | Knowledge ingestion is resource-heavy |
| POST /agents/{id}/runs/{id}/spans | 60/minute | Span upload rate limiting |
| GET /marketplace/listings/{id}/preview | 20/minute | Prevents preview abuse |
| POST /tools/{id}/execute | 30/minute | Tool execution is compute-intensive |
| PATCH /tools/{id}/{name}/toggle | 30/minute | Tool management |
| POST /agents/{id}/env-vars | 30/minute | Credential management |
| POST /agents/{id}/env-vars/bulk | 10/minute | Bulk credential operations |
| PATCH /agents/{id}/tool-settings | 10/minute | Tool settings changes |
| POST /artifacts/validate-tool/start | 5/minute | Tool validation is resource-heavy |
| POST /auth/otp/send | 3/minute | Prevents OTP spam |
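
To stay under these limits from the client side, one option is to work out how far apart requests to a given endpoint need to be. The sketch below is illustrative only: `ENDPOINT_LIMITS` copies a few rows from the table above, while `DEFAULT_LIMIT`, `limit_for`, and `min_interval_seconds` are hypothetical helper names, not part of the API. It assumes the Standard tier default of 60 requests per minute and that an endpoint-specific limit takes precedence over the default.

```python
# Per-minute limits copied from the tables above (a subset, for illustration).
DEFAULT_LIMIT = 60  # Standard tier; the Elevated tier would be 300.

# Keys are (HTTP method, route template) pairs; the names are hypothetical.
ENDPOINT_LIMITS = {
    ("POST", "/agents/{id}/eval/trigger"): 30,
    ("POST", "/agents/{id}/knowledge"): 10,
    ("POST", "/artifacts/validate-tool/start"): 5,
    ("POST", "/auth/otp/send"): 3,
}


def limit_for(method: str, route: str) -> int:
    """Return the per-minute limit for a route template, or the default."""
    return ENDPOINT_LIMITS.get((method.upper(), route), DEFAULT_LIMIT)


def min_interval_seconds(method: str, route: str) -> float:
    """Seconds to leave between requests so they spread evenly over a minute."""
    return 60.0 / limit_for(method, route)
```

For example, `min_interval_seconds("POST", "/auth/otp/send")` returns `20.0`, and any route not listed in `ENDPOINT_LIMITS` falls back to the default and returns `1.0`.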
Rate Limit Headers
Rate-limited responses include headers indicating your current status:

    X-RateLimit-Limit: 60
    X-RateLimit-Remaining: 0
    X-RateLimit-Reset: 1711814400
    Retry-After: 30

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying |
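
As a rough sketch of how a client might read these headers, the following parses them into a small structure and derives a wait time. `RateLimitStatus`, `parse_rate_limit_headers`, and `seconds_to_wait` are hypothetical names introduced here for illustration. The sketch assumes `Retry-After` is an integer number of seconds, as described above, and treats any missing or non-integer value as absent.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: Optional[int]        # X-RateLimit-Limit
    remaining: Optional[int]    # X-RateLimit-Remaining
    reset_at: Optional[int]     # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[int]  # Retry-After (seconds)


def _to_int(value: Optional[str]) -> Optional[int]:
    """Parse a header value as an int; return None if absent or malformed."""
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Build a RateLimitStatus from a header mapping.

    A plain dict is case-sensitive; HTTP client header objects are
    usually case-insensitive, so either works with these exact names.
    """
    return RateLimitStatus(
        limit=_to_int(headers.get("X-RateLimit-Limit")),
        remaining=_to_int(headers.get("X-RateLimit-Remaining")),
        reset_at=_to_int(headers.get("X-RateLimit-Reset")),
        retry_after=_to_int(headers.get("Retry-After")),
    )


def seconds_to_wait(status: RateLimitStatus, now: float) -> float:
    """Prefer Retry-After; otherwise wait until the reset time; never negative."""
    if status.retry_after is not None:
        return float(max(status.retry_after, 0))
    if status.reset_at is not None:
        return max(status.reset_at - now, 0.0)
    return 0.0
```

Passing `now` explicitly (for example `time.time()`) keeps the function easy to test; with the example headers above, `seconds_to_wait` returns `30.0` because `Retry-After` is present.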
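Handling Rate Limits

    import asyncio

    import httpx


    async def call_with_retry(client: httpx.AsyncClient, url: str, max_retries: int = 3) -> httpx.Response:
        """GET url, waiting out 429 responses for up to max_retries attempts."""
        for attempt in range(max_retries):
            response = await client.get(url)
            if response.status_code != 429:
                return response
            if attempt < max_retries - 1:  # no point sleeping after the last attempt
                # Retry-After is in seconds; fall back to 5 if the header is missing.
                retry_after = int(response.headers.get("Retry-After", 5))
                await asyncio.sleep(retry_after)
        raise RuntimeError(f"Rate limit exceeded after {max_retries} attempts: {url}")

Tips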
- Batch operations where possible instead of many individual requests
- Cache responses for data that doesn't change frequently
- Use webhooks (when available) instead of polling
- Spread requests evenly rather than bursting
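
One way to spread requests evenly is a small client-side pacer that enforces a minimum gap between calls. The sketch below is an assumption-laden illustration, not an official client feature: `Pacer` is a hypothetical class, and the clock and sleep functions are injectable only so the behavior can be tested without real waiting. It is synchronous; an async variant would await `asyncio.sleep` instead.

```python
import time
from typing import Callable


class Pacer:
    """Spaces calls at least 60 / per_minute seconds apart."""

    def __init__(
        self,
        per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()  # the first call may proceed immediately

    def wait(self) -> float:
        """Block until the next slot is free; return the seconds slept."""
        now = self._clock()
        delay = max(self._next_allowed - now, 0.0)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following slot relative to when this call proceeds.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

A caller would create one pacer per rate-limited endpoint, for example `Pacer(per_minute=10)` for an endpoint limited to 10/minute, and call `wait()` before each request. Pacing does not replace handling 429 responses, since other clients sharing the same API key also consume the limit.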