Rate Limits
API rate limits by endpoint.
Rate limits prevent abuse and ensure fair access. When you exceed a limit, the API returns 429 Too Many Requests.
Default Limits
Most endpoints share a default rate limit per API key:
| Tier | Requests per Minute |
|---|---|
| Standard | 60 |
| Elevated (on request) | 300 |
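
To avoid running into the per-key limit at all, a client can keep its own count of recent requests. The following is a minimal sketch, not part of the API: `RequestBudget` and `TIER_LIMITS` are hypothetical names, and the 60-second sliding window is an assumption, since this page does not state how the server measures its window.

```python
import time
from collections import deque
from typing import Callable, Deque

# Requests per minute for each tier, copied from the table above.
TIER_LIMITS = {"standard": 60, "elevated": 300}


class RequestBudget:
    """Client-side sliding-window counter (hypothetical helper)."""

    def __init__(
        self,
        tier: str = "standard",
        window: float = 60.0,  # assumed window length in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = TIER_LIMITS[tier]
        self.window = window
        self._clock = clock
        self._sent: Deque[float] = deque()

    def _evict(self, now: float) -> None:
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()

    def try_acquire(self) -> bool:
        """Record a request and return True if the budget allows it."""
        now = self._clock()
        self._evict(now)
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True

    def remaining(self) -> int:
        """Requests still available in the current window."""
        self._evict(self._clock())
        return self.limit - len(self._sent)
```

Call `try_acquire()` before each request and hold the request back when it returns `False`. The server's own count remains authoritative, so a 429 can still occur if several clients share one API key.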
Endpoint-Specific Limits
Some endpoints have stricter limits:
| Endpoint | Limit | Reason |
|---|---|---|
| POST /agents/{id}/eval/trigger | 30/minute | Evaluation is compute-intensive |
| POST /agents/{id}/knowledge | 10/minute | Knowledge ingestion is resource-heavy |
| POST /agents/{id}/runs/{id}/spans | 60/minute | Span upload rate limiting |
| GET /marketplace/listings/{id}/preview | 20/minute | Prevents preview abuse |
| POST /auth/otp/send | 3/minute | Prevents OTP spam |
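
A client that calls several of these endpoints may want to look up the applicable limit before sending a request. The sketch below encodes the table as data and matches a concrete path against it. `limit_for` and `min_interval_seconds` are hypothetical helpers, and falling back to the Standard default for unlisted routes is an assumption.

```python
import re

DEFAULT_LIMIT = 60  # Standard tier; pass default=300 for an Elevated key.

# (method, path template, requests per minute), copied from the table above.
ENDPOINT_LIMITS = [
    ("POST", "/agents/{id}/eval/trigger", 30),
    ("POST", "/agents/{id}/knowledge", 10),
    ("POST", "/agents/{id}/runs/{id}/spans", 60),
    ("GET", "/marketplace/listings/{id}/preview", 20),
    ("POST", "/auth/otp/send", 3),
]


def _template_to_regex(template: str) -> re.Pattern:
    # Escape the literal parts and let each {placeholder} match one path segment.
    parts = re.split(r"\{[^}]+\}", template)
    return re.compile("^" + "[^/]+".join(re.escape(p) for p in parts) + "$")


_COMPILED = [(m, _template_to_regex(t), rpm) for m, t, rpm in ENDPOINT_LIMITS]


def limit_for(method: str, path: str, default: int = DEFAULT_LIMIT) -> int:
    """Return the requests-per-minute limit for a concrete method and path."""
    for m, pattern, rpm in _COMPILED:
        if m == method.upper() and pattern.match(path):
            return rpm
    return default


def min_interval_seconds(method: str, path: str) -> float:
    """Spacing between calls that keeps an evenly paced client under the limit."""
    return 60.0 / limit_for(method, path)
```

For example, `min_interval_seconds("POST", "/auth/otp/send")` gives 20 seconds between OTP requests. This page does not say whether calls to these endpoints also count toward the default per-key limit, so treat the result as an upper bound on throughput.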
Rate Limit Headers
Rate-limited responses include headers indicating your current status:

```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1711814400
Retry-After: 30
```

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying |
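
These headers can be read into a small structure so the rest of a client does not deal with raw strings. The following is a sketch under stated assumptions: `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical names, `Retry-After` is assumed to be an integer number of seconds as in the example, and the three `X-RateLimit-*` headers are assumed to be present (a `KeyError` is raised otherwise).

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset (Unix timestamp)
    retry_after: Optional[int]  # Retry-After in seconds; None if absent

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitStatus(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=int(h["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

With an `httpx` response, `parse_rate_limit_headers(response.headers)` works directly because the headers object behaves like a mapping.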
Handling Rate Limits
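When a request returns 429, wait for the number of seconds given in Retry-After, then try again:

```python
import asyncio

import httpx


async def call_with_retry(
    client: httpx.AsyncClient, url: str, max_retries: int = 3
) -> httpx.Response:
    """GET `url`, waiting and retrying each time the API answers 429."""
    for _ in range(max_retries):
        response = await client.get(url)
        if response.status_code == 429:
            # Honor Retry-After; fall back to 5 seconds if the header is missing.
            retry_after = int(response.headers.get("Retry-After", 5))
            await asyncio.sleep(retry_after)
            continue
        return response
    raise RuntimeError("Rate limit exceeded after retries")
```

Tips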
- Batch operations where possible instead of many individual requests
- Cache responses for data that doesn't change frequently
- Use webhooks (when available) instead of polling
- Spread requests evenly rather than bursting
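
The last tip, spreading requests evenly, can be sketched as a small pacer that hands out evenly spaced time slots. `Pacer` is a hypothetical helper, not part of any SDK, and the 60-second basis assumes the per-minute limits listed on this page.

```python
import asyncio
import time
from typing import Awaitable, Callable


class Pacer:
    """Spaces calls at least 60 / requests_per_minute seconds apart."""

    def __init__(
        self,
        requests_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
    ) -> None:
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_slot = float("-inf")  # no slot reserved yet

    async def wait(self) -> None:
        """Reserve the next free slot and sleep until it arrives."""
        # No lock is needed: there is no await between reading and updating
        # the slot, so concurrent tasks on one event loop cannot interleave here.
        now = self._clock()
        slot = max(now, self._next_slot)
        self._next_slot = slot + self.interval
        delay = slot - now
        if delay > 0:
            await self._sleep(delay)
```

Awaiting `pacer.wait()` before each request, for instance before each `call_with_retry(...)` call, turns a burst of tasks into a steady stream. A pacer built with `Pacer(60)` releases one request per second.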