AgentsRate Limits & Caching

Rate Limits & Caching

What gates an agent’s calls, and how to spend the budget well.

Request rate

Your API key’s tier sets a per-minute request budget, enforced per key across both MCP and REST. Exceed it and calls return 429 (REST) or a rate-limit error (MCP) — back off and retry.

TierRequests / min
Free5
Layer 1100
Layer 2300
Layer 3600
Enterprise3,000+

Full plan details on Tiers & Rate Limits.

Spend your budget on the composite tools. wallet_overview, token_deep_dive, token_coordination_report, and wallet_network_report each fan out many internal readers but count as one request — far cheaper against your rate limit (and fewer model round-trips) than calling a dozen query_* tools yourself. Use the targeted tools only for follow-ups.

Caching & freshness

Every read is fronted by an in-process cache with a short, data-appropriate TTL (2s for price up to 300s for token metadata), so repeated identical calls within the window are effectively free and consistent. A response can be up to its TTL stale. See Caching & Freshness for the per-endpoint TTLs.

For zero-latency, uncached data, subscribe to the live WebSocket streams instead of polling.

Streaming limits

If you stream directly over WebSocket, the per-tier connection and subscription caps (and the silent-drop backpressure behavior) are on the WebSocket Protocol page — Layer 1+ only, 50 subscriptions per connection below Enterprise.

Pagination

List-returning tools and endpoints page with limit + offset (no cursors); each has its own default and max, documented on the relevant API Reference page.