HTTP 429 Too Many Requests: what triggers it and how to handle it

A 429 Too Many Requests response means the server understood your request perfectly — and is refusing to handle it because you've sent too many in a given window. Unlike a 503 (server-side failure), a 429 is the server protecting itself from you specifically.

The definition ¶

429 Too Many Requests indicates the user has sent too many requests in a given amount of time ("rate limiting").

Per RFC 6585, 429 is a deliberate signal from the server. The response may include a Retry-After header telling you when to try again, either as a number of seconds or as an HTTP date. The RFC makes the header optional, so clients cannot rely on it being present.
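Because Retry-After can arrive in either form, a client needs to handle both. A minimal sketch using only the standard library; the function name `retry_after_seconds` and its `now` parameter are illustrative, not part of any real client library:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value into seconds to wait.

    Returns None when the value is neither a non-negative integer
    nor a parseable HTTP date.
    """
    value = value.strip()
    # Form 1: delay in seconds, e.g. "30"
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC (an assumption).
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())
```

Passing `now` explicitly is only there to make the function testable; in normal use it defaults to the current UTC time.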

429 vs 503 vs Cloudflare 1015 ¶

Code	Layer	Cause	Fix
429	Application	You exceeded an explicit per-client rate limit	Back off; respect `Retry-After`
503	Server	Server is overloaded or in maintenance	Wait; the issue isn't yours specifically
Cloudflare 1015	Edge / CDN	Rate limit applied by the site's CDN before reaching origin	Same as 429 — back off; see our 1015 explainer

The practical difference: 429 means "you, specifically, are being throttled." 503 means "everyone is having a bad time right now."

What triggers a 429 ¶

Per-IP rate limits. An API allows N requests per minute per source IP; you exceeded N.
Per-API-key limits. Most paid APIs (Stripe, Twilio, OpenAI) cap request rates per key or per account. Free tiers usually have the strictest limits.
Burst limits. Even within a per-minute budget, sending all N requests in the same second can trip a burst detector.
Global limits. Some endpoints have quotas shared across a whole tenant or service; one caller's burst can cause 429s for everyone else drawing on the same quota.
Bot-detection rate limits. Cloudflare, AWS WAF, and similar services emit 429 (or sometimes 403) when traffic patterns look automated.

Reading the response ¶

A well-behaved server tells you exactly when to retry. The headers worth checking:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1714000000

Retry-After: 30 — wait 30 seconds. May also be an HTTP date.
X-RateLimit-Remaining: 0 — confirms you exhausted your quota.
X-RateLimit-Reset: 1714000000 is when the quota refills, here as a Unix timestamp. The header is not standardized and varies by vendor; some send the number of seconds until the reset instead.

If neither Retry-After nor an X-RateLimit-* header is present, default to exponential backoff starting at 1 second.
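The X-RateLimit-* headers can be turned into a wait time as well. The sketch below assumes X-RateLimit-Reset is a Unix timestamp, as in the example response above; for a vendor that sends seconds-until-reset, the subtraction would have to go. The function name is made up for illustration.

```python
import time
from typing import Mapping, Optional


def wait_from_rate_limit_headers(
    headers: Mapping[str, str], now: Optional[float] = None
) -> Optional[float]:
    """Return seconds to pause before the next request.

    Returns None when the headers give no usable hint, in which case
    the caller should fall back to exponential backoff.
    """
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        remaining = int(lowered["x-ratelimit-remaining"])
        reset_at = float(lowered["x-ratelimit-reset"])  # assumed Unix timestamp
    except (KeyError, ValueError):
        return None  # headers missing or malformed
    if remaining > 0:
        return 0.0  # quota left, no need to wait
    now = time.time() if now is None else now
    return max(0.0, reset_at - now)
```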

The right way to back off ¶

Naïve retry on 429 ("try again in 1 second") often makes the problem worse — the server's quota window may not have moved yet, and your retry just deepens the deficit. The pattern that works:

Honor Retry-After if present. The server told you when to retry; trust it.
Otherwise: exponential backoff with jitter. Start at 1 second, double each retry (1s, 2s, 4s, 8s), and add 0–500 ms of random jitter so concurrent clients don't synchronize their retries.
Cap retries. 3–5 attempts max. If the request still fails, surface the error rather than looping forever.
Cap total wait. Stop retrying once cumulative wait exceeds 30 seconds — at that point the user is staring at a spinner and a clean error message is better.
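The four rules above can be sketched as one retry loop. This is an illustration under stated assumptions, not a drop-in client: `send` is a hypothetical callable standing in for your HTTP call and returning a `(status, headers)` pair, headers are looked up under the exact name `Retry-After`, and only the seconds form of Retry-After is handled (an HTTP date falls through to backoff).

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers): a stand-in for a real HTTP response
Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,          # total attempts, including the first
    base_delay: float = 1.0,        # first backoff step, in seconds
    max_jitter: float = 0.5,        # up to 500 ms of random jitter
    max_total_wait: float = 30.0,   # give up once waits would exceed this
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or a cap is reached."""
    total_wait = 0.0
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep before giving up
        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)  # the server said when to retry
        else:
            # 1s, 2s, 4s, 8s... plus jitter so clients do not synchronize
            delay = base_delay * 2 ** attempt + random.uniform(0.0, max_jitter)
        if total_wait + delay > max_total_wait:
            break  # waiting longer than the cap; surface the error instead
        sleep(delay)
        total_wait += delay
    raise RuntimeError("still rate-limited after retries; giving up")
```

The `sleep` parameter exists so the loop can be exercised in tests without actually waiting; production code would leave it at the default.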

Triggering 429 yourself without realizing ¶

Aggressive polling. Refreshing a status endpoint every second from a tab quickly trips per-minute caps.
Parallel batch jobs without throttling. Spinning up 100 workers that each call an API once can trip a per-IP burst limit even if the per-minute limit is fine.
Shared NAT or VPN. If your office or VPN routes everyone through one outbound IP, the API sees aggregate traffic from your team as one source.
Webhook retries. A webhook receiver that doesn't return a 2xx response quickly will get retried by the sender, multiplying the load on your receiver and on any API your handler calls for each delivery.
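For the polling and parallel-worker cases, one common remedy is to pace requests on the client so they never arrive as a burst. A minimal sketch of a shared pacer for threaded workers; the `Pacer` class and its parameters are hypothetical names for illustration, and the target rate is something you would pick to stay under the provider's documented limit.

```python
import threading
import time
from typing import Callable


class Pacer:
    """Space out calls from many threads to at most `rate_per_second`."""

    def __init__(
        self,
        rate_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def wait(self) -> float:
        """Block until this caller's slot arrives; return the delay used."""
        with self._lock:
            now = self._clock()
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval  # reserve the slot
        delay = slot - now
        if delay > 0:
            self._sleep(delay)  # sleep outside the lock so others can queue
        return delay
```

Each worker would call `pacer.wait()` on one shared `Pacer` instance immediately before its API request, so 100 workers starting together are spread across time instead of landing in the same second.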

Designing APIs that emit 429 well ¶

If you're on the server side:

Always include Retry-After. Without it, naïve clients hammer you.
Pick a sensible window. Per-minute is the most common; per-second is too granular for most use cases.
Return rate-limit headers on every response, not just 429s — clients can self-throttle if they see X-RateLimit-Remaining: 5.
Keep 429s cheap. The whole point is to shed load; don't burn a database query verifying the quota.
Consider higher limits for authenticated traffic: limit per IP for anonymous callers and per key for authenticated ones.
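Several of these points (a cheap in-memory check, Retry-After on every 429, rate-limit headers on every response) can be illustrated with a token bucket. The sketch below is one possible approach, not the only one: the class name is made up, it is not thread-safe, and a real service would keep one bucket per IP or API key and would likely need a shared store across processes.

```python
import math
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """In-memory token bucket: `capacity` burst size, steady refill rate."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)
        self._updated = clock()

    def check(self) -> Tuple[int, Dict[str, str]]:
        """Return (status, headers) for one incoming request."""
        now = self._clock()
        elapsed = now - self._updated
        # Refill for the time that has passed, never above capacity.
        self._tokens = min(
            self.capacity, self._tokens + elapsed * self.refill_per_second
        )
        self._updated = now
        headers = {"X-RateLimit-Limit": str(self.capacity)}
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            headers["X-RateLimit-Remaining"] = str(int(self._tokens))
            return 200, headers  # 200 here just means "let the request through"
        # Seconds until one whole token is available, rounded up.
        wait = math.ceil((1.0 - self._tokens) / self.refill_per_second)
        headers["X-RateLimit-Remaining"] = "0"
        headers["Retry-After"] = str(wait)
        return 429, headers
```

The check costs a few arithmetic operations and no I/O, which is what keeps rejected requests cheap.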

FAQ ¶

Is 429 the server's fault or mine?

Usually yours. 429 is the server saying it received too many requests from your IP, key, or account, although on a shared IP or a shared quota someone else's traffic can count against you. If you legitimately need higher throughput, authenticate (most APIs offer higher limits to authenticated callers) or contact the provider for a quota increase. Retrying harder will not help.

Will switching IPs help?

Sometimes — if the limit is per-IP. But many APIs apply per-account or per-API-key limits, in which case a new IP changes nothing. And cycling IPs to evade rate limits is usually a Terms-of-Service violation.

How do I know if I'm being rate-limited or actually rejected?

Rate-limited responses come back fast (the server short-circuits without doing real work) and carry a 429 status, usually with Retry-After. Other client errors (400, 403, 404) normally don't include retry headers, and retrying the same request will not change the outcome. Connection timeouts (no response at all) usually mean a different problem entirely.