Vorel enforces rate limits at multiple layers: API key, tenant aggregate, per-tool, dashboard user, webhook ingress, and per-customer-number. The most restrictive applicable layer wins. Hitting a limit returns `429 rate_limited` with standard `Retry-After` + `X-RateLimit-*` headers.
## The full stack
| Layer | Limit | Window | Bucket key | Where it fires |
|---|---|---|---|---|
| Per-API-key | 200 req/min | 60s | `apk:<key_id>` | Inside `v1Handler` wrapper, after auth verify |
| Per-Clerk-user dashboard | 200 req/min | 60s | `dashboard:<userId>` | Clerk middleware, after auth resolve |
| Per-tenant aggregate | 5000 req/min | 60s | `tenant:<tenantId>:total` | Across every authenticated surface in your tenant |
| Per-(tenant, agent-router) | 1000 req/min | 60s | `jwt:<tenantId>:<agentRouterSub>` | Agent dispatch chokepoint |
| Per-(tenant, tool) | 50 req/min | 60s | `jwt:<tenantId>:<toolSub>` | Each internal tool endpoint |
| Per-IP webhook | 500 req/min | 60s | `webhook:<source>:<ip>` | `/api/webhooks/{whatsapp,vapi,clerk}`, before signature verify |
| Per-(tenant, customer-phone) | 30 req/min | 60s | `customer:<tenantId>:<phone>` | WhatsApp + voice inbound, post-payload-parse |
Every layer goes through the same fixed-window counter (`checkRateLimit` in `apps/web/src/lib/rate-limit.ts`). Windows roll automatically: the bucket key embeds `floor(now / window)`, so a fresh window allocates a fresh bucket.
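The bucket-key trick can be sketched in a few lines. This is an in-memory illustration, not the real implementation: the actual `checkRateLimit` counts against Redis, and every name here other than `checkRateLimit` itself is hypothetical.

```typescript
// In-memory sketch of a fixed-window limiter. The real checkRateLimit in
// apps/web/src/lib/rate-limit.ts backs this with Redis; all other names
// here are illustrative.
type LimitResult = { allowed: boolean; remaining: number; resetAt: number };

const counters = new Map<string, number>();

// The key embeds floor(now / window): a new window maps to a brand-new
// key, so the old bucket simply stops being read — no reset step needed.
function bucketKey(prefix: string, id: string, nowMs: number, windowMs: number): string {
  return `${prefix}:${id}:${Math.floor(nowMs / windowMs)}`;
}

function checkRateLimit(
  prefix: string,
  id: string,
  limit: number,
  nowMs: number = Date.now(),
  windowMs: number = 60_000,
): LimitResult {
  const key = bucketKey(prefix, id, nowMs, windowMs);
  const count = (counters.get(key) ?? 0) + 1;
  counters.set(key, count);
  return {
    allowed: count <= limit,
    remaining: Math.max(0, limit - count),
    // UNIX epoch seconds at which the current window rolls (X-RateLimit-Reset).
    resetAt: ((Math.floor(nowMs / windowMs) + 1) * windowMs) / 1000,
  };
}
```

Note the edge this design accepts: a request at second `:59` and one at `:00` of the next minute land in different buckets, so short bursts across a window boundary are admitted.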
## Which layer your traffic hits
### Public API calls (Bearer-authed)
A call to `/api/v1/*` with a valid API key passes through:

- Per-API-key (200 req/min) — gated inside the `v1Handler` wrapper after auth.
- Per-tenant aggregate (5000 req/min) — applied via the same path, so a runaway script in one tenant can't accumulate quota by issuing N keys.
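The layering above amounts to a conjunction: every applicable layer must admit the request, which is exactly why the most restrictive one wins. A minimal hypothetical composition (the checker callbacks stand in for real limiter calls):

```typescript
// Hypothetical composition of the v1 layers; the checkers stand in for
// real limiter calls keyed on apk:<key_id> and tenant:<tenantId>:total.
function allowV1Request(
  keyId: string,
  tenantId: string,
  checkKey: (keyId: string) => boolean,       // 200 req/min per API key
  checkTenant: (tenantId: string) => boolean, // 5000 req/min per tenant
): boolean {
  // Any failing layer blocks the request, so the tightest limit governs.
  return checkKey(keyId) && checkTenant(tenantId);
}
```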
### Dashboard pages (Clerk-authed)
A page render or server action under `/(dashboard)/*` passes through:

- Per-Clerk-user dashboard (200 req/min) — applied in the Clerk middleware after the session resolves.
### Inbound webhooks (signed)
WhatsApp / Vapi / Clerk webhooks pass through:
- Per-IP webhook (500 req/min) — applied before signature verification, so a flood can’t make us pay for the verification cost on every request.
- HMAC signature verification — a mismatch is rejected with `401`.
- Per-(tenant, customer-phone) (30 req/min) — applied after payload parse, so a single compromised customer number can’t burn through tenant quota.
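The ordering of the three steps is the point: the cheap per-IP check shields the HMAC work, and the per-phone check only runs on authenticated, parsed payloads. A sketch of that pipeline, where the checker callbacks, header handling, and payload shape are all assumptions standing in for the real handlers:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Constant-time HMAC-SHA256 comparison; the secret and hex encoding are
// illustrative, not the documented wire format of any specific provider.
function verifySignature(secret: string, rawBody: string, signatureHex: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, "hex");
  return given.length === expected.length && timingSafeEqual(given, expected);
}

function handleWebhook(opts: {
  ip: string;
  rawBody: string;
  signature: string;
  secret: string;
  checkIpLimit: (ip: string) => boolean;                          // 500 req/min per (source, IP)
  checkPhoneLimit: (tenantId: string, phone: string) => boolean;  // 30 req/min per (tenant, phone)
  parse: (raw: string) => { tenantId: string; phone: string };
}): number {
  // 1. Per-IP limit BEFORE signature verify: a flood never pays HMAC cost.
  if (!opts.checkIpLimit(opts.ip)) return 429;
  // 2. HMAC verify: forgeries are rejected with 401.
  if (!verifySignature(opts.secret, opts.rawBody, opts.signature)) return 401;
  // 3. Per-(tenant, phone) limit AFTER parse: one noisy number can't drain tenant quota.
  const { tenantId, phone } = opts.parse(opts.rawBody);
  if (!opts.checkPhoneLimit(tenantId, phone)) return 429;
  return 200;
}
```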
### Agent dispatch + tool routes (JWT-authed)
The router → sub-agent → tool flow uses short-TTL JWTs (5-min for tool calls, 2-min for worker calls):
- Per-(tenant, agent-router) (1000 req/min) — gates the dispatch chokepoint.
- Per-(tenant, tool) (50 req/min) — gates each internal tool endpoint after the agent router fans out.
- Per-tenant aggregate (5000 req/min) — same global ceiling that public API + tool traffic both share.
## Hitting a limit (the response)
A limited request returns the standard headers:

- `Retry-After` — seconds until the current window rolls. Honour this; the limiter clamps it to ≥ 1 so you don't hot-loop.
- `X-RateLimit-Limit` — the bucket's limit (e.g. `200`).
- `X-RateLimit-Remaining` — requests remaining in the current window (`0` when limited).
- `X-RateLimit-Reset` — UNIX epoch seconds when the current window rolls.
The body's `code` field tells you which layer you hit:

- `rate_limited` for the public-API surfaces.
- The `message` carries the surface-specific detail (`per-key rate limit exceeded`, `webhook ingress rate limit exceeded`, etc.).
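Putting the headers and body together, a 429 can be modelled like this. The header names and the `code`/`message` fields come from this page; any body field beyond those, and the function itself, are illustrative:

```typescript
// Illustrative shape of a 429 from the limiter. Header names match the
// doc; the builder function is a sketch, not the real implementation.
function rateLimited(
  limit: number,
  resetAtEpochSec: number,
  nowEpochSec: number,
  message: string,
) {
  // Clamped to >= 1 so clients honouring Retry-After never hot-loop.
  const retryAfter = Math.max(1, resetAtEpochSec - nowEpochSec);
  return {
    status: 429,
    headers: {
      "Retry-After": String(retryAfter),
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": "0", // always 0 when limited
      "X-RateLimit-Reset": String(resetAtEpochSec),
    },
    body: { code: "rate_limited", message },
  };
}
```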
## Fail-open posture (and why)
The limiter fails OPEN on Redis blips. A Redis outage admits the request rather than 429-ing every customer — better to serve traffic than to falsely block it. The trade-off is explicit: a sustained Redis outage means rate limits stop functioning. Mitigations:

- Alarm on the `rate_limit.redis_failed` log spike. A blip is fine; a sustained spike pages the operator.
- Defence in depth. The application-tier rate limit is one of three layers — Cloudflare WAF (per-IP global) and per-channel BSP-level limits (360dialog throttling, Vapi quotas) also apply. A Redis failure removes the application-tier gate but not the others.
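Fail-open is a one-branch decision in code. A sketch, where `redisIncr` stands in for the real Redis call and the logging call is illustrative (only the `rate_limit.redis_failed` event name comes from this page):

```typescript
// Fail-open sketch: on a Redis error, log and admit the request.
// `redisIncr` stands in for the real Redis increment.
async function checkRateLimitFailOpen(
  redisIncr: (key: string) => Promise<number>,
  key: string,
  limit: number,
): Promise<boolean> {
  try {
    const count = await redisIncr(key);
    return count <= limit;
  } catch (err) {
    // A blip here is tolerable; a sustained spike of this line pages the operator.
    console.error("rate_limit.redis_failed", err);
    return true; // fail OPEN: serve traffic rather than falsely 429 every customer
  }
}
```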
## Honouring rate limits (client side)
When you get a `429`:

- Read `Retry-After` — honour it as the minimum back-off.
- Don't loop tighter than `Retry-After`. Add jitter so multiple clients don't all retry at the same window-roll millisecond.
- Keep separate buckets per resource. If you're hammering `/v1/conversations` and getting 429'd, that doesn't mean `/v1/leads` is rate-limited too — they share the per-API-key bucket but only count failures against you under that bucket.
- For around-the-brain workflows (n8n nurture loops, post-booking nudges), `Retry-After` handling is built into the stock n8n HTTP node.
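A minimal client-side retry loop following the rules above — honour `Retry-After` as the floor, add jitter, cap the attempts. The function name, retry cap, and jitter range are all assumptions, not part of any Vorel SDK:

```typescript
// Client sketch: retry on 429, honouring Retry-After as the minimum
// back-off, with jitter so clients don't stampede the window roll.
async function fetchWithRetry(
  url: string,
  init?: RequestInit,
  maxRetries = 3,
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    const res = await fetch(url, init);
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Retry-After is seconds until the window rolls; never sleep less.
    const retryAfterSec = Number(res.headers.get("Retry-After") ?? "1");
    const jitterMs = Math.random() * 250; // desynchronise concurrent clients
    await new Promise((resolve) => setTimeout(resolve, retryAfterSec * 1000 + jitterMs));
  }
}
```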
## Plan-aware limits (planned)

The current numbers are platform-wide. Per-customer, plan-based ceilings (e.g. raising the per-tenant aggregate to 20,000 req/min on Pro plans) are implemented as a parameter on the limit helpers but are not customer-bound today, since customer plans don't exist yet. When the billing model launches, these ceilings flex per plan at the call site without code changes elsewhere.

## What's NOT enforced today
- Per-resource-class quotas. No daily/monthly cap on how many leads your tenant can create; the per-key + tenant aggregate is the only ceiling.
- Per-key burst credits. Fixed window only; no token-bucket burst allowance. A request at second `:00` and one at `:59` both count toward the same window's `200`.
- Custom plans. No way to bump a single tenant's per-API-key limit without a code change today.
## Related docs
- API introduction — surface overview
- Authentication — per-API-key issuance + scopes
- Webhooks — inbound rate-limit specifics
- Security overview — full rate-limit table