Application & API · Medium

Missing API Rate Limiting

Missing API rate limiting means a public endpoint accepts an unlimited number of requests from a single client without throttling, letting attackers run credential stuffing, data scraping, account enumeration, and resource-exhaustion denial-of-service at machine speed.

Rate limiting is the unglamorous guardrail that decides whether an attacker's automated script runs into a wall after a few dozen attempts or quietly grinds through millions. When it is absent on an internet-facing API, especially a login, password-reset, or search endpoint, every weakness behind that endpoint becomes cheaper to exploit and harder to notice. The exposure rarely announces itself: the API works perfectly for normal users, returns clean 200 responses, and looks healthy on a dashboard right up until a botnet turns a stolen credential dump into hijacked accounts or a single curl loop turns your egress bill into a four-figure surprise.

Reviewed by Ameya Lambat

Security Research Contributor, Legba

Reviewed 2026-05-28 · Updated 2026-05-28

What it is

API rate limiting is a server-side control that caps how many requests a given client (identified by IP, API key, token, or account) may make within a defined time window, and rejects or delays excess requests, typically with an HTTP 429 'Too Many Requests' response. 'Missing API rate limiting' is the condition where one or more reachable API endpoints enforce no such cap and no equivalent control (no throttling, no quota, no anti-automation step, no account lockout). This maps directly to OWASP API4:2023 'Unrestricted Resource Consumption,' which broadened the older 2019 'Lack of Resources & Rate Limiting' category to cover both request-frequency limits and the size and cost of the resources each request consumes. The underlying software weaknesses are CWE-770 (Allocation of Resources Without Limits or Throttling) and CWE-799 (Improper Control of Interaction Frequency).
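As a rough illustration of the control described above, the sketch below caps requests per client with a token bucket and answers excess requests with 429 plus a Retry-After value. The class name `RateLimiter`, its parameters, and the choice of a token bucket are assumptions made for this example, not a prescribed design; a production deployment would more likely use the gateway's or framework's built-in limiter with shared storage.

```python
import math
import time
from typing import Dict, Optional, Tuple


class RateLimiter:
    """Hypothetical per-client token bucket.

    `capacity` is the allowed burst size; `refill_per_sec` is the sustained rate.
    """

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        # client key (token, API key, or IP) -> (tokens left, time of last request)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def check(self, client_key: str, now: Optional[float] = None):
        """Return (status_code, headers) for one request from `client_key`."""
        now = time.monotonic() if now is None else now
        tokens, last = self._buckets.get(client_key, (float(self.capacity), now))
        # Refill for the time elapsed since the last request, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1.0:
            self._buckets[client_key] = (tokens - 1.0, now)
            return 200, {}
        self._buckets[client_key] = (tokens, now)
        # Whole seconds until one full token is available again.
        retry_after = math.ceil((1.0 - tokens) / self.refill_per_sec)
        return 429, {"Retry-After": str(retry_after)}
```

Passing `now` explicitly keeps the sketch easy to test; in a real handler the default monotonic clock would be used and the bucket state would need to live somewhere shared across workers.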

An unthrottled endpoint turns other people's mistakes (leaked password lists, weak validation, expensive backend calls) into your incident. Credential stuffing is the clearest stake: OWASP's Credential Stuffing Prevention Cheat Sheet documents that attackers replay billions of breached username and password pairs against login forms, and without rate limiting the only thing standing between a stuffing campaign and a wave of account takeovers is the reuse habits of your users. The OWASP API4:2023 page records concrete financial damage too, including a cached 18GB file that drove a cloud bill from roughly $13 to about $8,000 a month, and password-reset SMS abuse that ran up thousands of dollars in per-message charges. What an organization stands to lose is therefore not abstract: hijacked customer accounts and the fraud, support load, and regulatory exposure that follow; scraped proprietary data; outages for legitimate users during a resource-exhaustion flood; and direct, attacker-controlled spend on SMS, compute, and bandwidth.

At a glance

Type: missing-api-rate-limiting
Ports: 80, 443, 8080, 8443
Protocols: HTTP, HTTPS
Seen on: REST APIs, GraphQL APIs, Login and authentication endpoints, Password reset and OTP flows, API gateways (Kong, AWS API Gateway, Apigee), NGINX, Express.js, Spring Boot, Django REST Framework
Severity: Medium
Updated: 2026-05-28

How attackers find and exploit it

  • Enumerate the target's API surface through passive recon: crawl JavaScript bundles, mobile-app traffic, public OpenAPI or Swagger documents, and DNS records to build a list of reachable endpoints such as /api/v1/login, /graphql, /password-reset, and /search.
  • Probe each endpoint with a short burst of identical requests and watch for the absence of HTTP 429 responses, Retry-After headers, or X-RateLimit-* headers, which signals that no throttling is enforced.
  • Identify the highest-value unthrottled endpoint, prioritizing authentication and OTP/reset flows for account takeover, and search or pagination endpoints for scraping.
  • Run credential stuffing or password spraying by replaying breached credential lists at high volume, often distributed across a proxy or residential botnet to defeat any per-IP heuristics, until valid sessions are obtained.
  • Abuse expensive operations to exhaust resources or drive cost: send large or batched payloads (for example GraphQL query batching or deeply nested queries), request oversized pagination like ?limit=9999999, or loop SMS/email triggers to overwhelm CPU, memory, third-party quotas, or billing.
  • Automate enumeration of objects and accounts by iterating identifiers or usernames to harvest data and confirm which accounts exist, using error and timing differences as oracles.

How to detect it on your surface

  • Inventory every internet-reachable API endpoint, including undocumented, legacy, and versioned paths (v1, v2, internal/), since rate limiting is frequently applied to the primary login form but forgotten on a parallel mobile or partner endpoint.
  • For each endpoint, send a controlled burst of authenticated and unauthenticated requests from a single source and record whether the server ever returns 429 or begins delaying responses.
  • Inspect responses for rate-limit signaling headers (X-RateLimit-Limit, X-RateLimit-Remaining, RateLimit, Retry-After); their total absence across an endpoint is a strong indicator that no limit is enforced.
  • Test authentication-specific protections separately: submit repeated failed logins for a single account and from a single IP to confirm whether account lockout, exponential backoff, or anti-automation (CAPTCHA, MFA step-up) ever engages.
  • Review API gateway, WAF, and load-balancer configuration to confirm a rate-limiting or throttling policy is actually bound to each route rather than merely available but unattached.
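The burst and header checks above can be sketched as a small summarizer over responses you have already collected from a controlled test. The function name `summarize_burst` and the input shape (a list of status code and header pairs, in the order sent) are assumptions for illustration; how the burst is sent is left to whatever HTTP client you already use.

```python
from typing import Dict, Iterable, Mapping, Tuple

# Header names the checks above look for, compared case-insensitively.
RATE_LIMIT_HEADERS = {
    "retry-after",
    "ratelimit",
    "ratelimit-limit",
    "ratelimit-remaining",
}


def summarize_burst(responses: Iterable[Tuple[int, Mapping[str, str]]]) -> Dict:
    """Summarize one controlled burst: responses are (status, headers) in send order."""
    total = 0
    first_429 = None
    seen_headers = set()
    for index, (status, headers) in enumerate(responses, start=1):
        total += 1
        if status == 429 and first_429 is None:
            first_429 = index  # position of the first throttled request
        for name in headers:
            lowered = name.lower()
            if lowered in RATE_LIMIT_HEADERS or lowered.startswith("x-ratelimit-"):
                seen_headers.add(lowered)
    return {
        "requests": total,
        "first_429_at": first_429,
        "rate_limit_headers": sorted(seen_headers),
        "limit_signalled": first_429 is not None or bool(seen_headers),
    }
```

A result with `limit_signalled` set to false is only a candidate finding; it still needs the validation steps later in this guide, since a high threshold or an upstream control may simply not have been reached.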

Detection signals

  • Hundreds of consecutive requests from one client receive 200/401/403 responses with no 429 'Too Many Requests' ever returned.
  • Responses lack any rate-limit headers such as X-RateLimit-Remaining, RateLimit-Limit, or Retry-After even under sustained load.
  • Login or reset endpoints accept unlimited failed attempts against the same account with no lockout, no increasing delay, and no CAPTCHA or MFA challenge.
  • Pagination, search, or GraphQL endpoints honor arbitrarily large or batched requests (for example very large limit/page-size values or multi-operation batched mutations) without rejection.
  • Server-side latency, CPU, or error rates climb in proportion to request volume from a single source, indicating no upstream throttle is shedding load.
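The first signal above can also be checked from access logs. The sketch below assumes the logs have already been parsed into (client, status) pairs; the function name `unthrottled_talkers` and the 100-request threshold are illustrative choices, not values taken from any standard.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def unthrottled_talkers(
    records: Iterable[Tuple[str, int]], min_requests: int = 100
) -> List[str]:
    """Return clients with at least `min_requests` requests and no 429, busiest first."""
    totals = Counter()
    throttled = Counter()
    for client, status in records:
        totals[client] += 1
        if status == 429:
            throttled[client] += 1
    candidates = [
        client
        for client, count in totals.items()
        if count >= min_requests and throttled[client] == 0
    ]
    # Sort by request volume so the heaviest unthrottled client is reviewed first.
    return sorted(candidates, key=lambda client: -totals[client])
```

For example, a client that sent 300 login attempts and only ever received 401 responses would be listed, while a client that was throttled even once would not.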

Validate before you report

  • Reproduce the missing limit safely: from a single controlled source, send a bounded sequence of requests (for example 100 over 60 seconds) to the candidate endpoint and confirm no 429 or throttling delay appears, capturing full request/response pairs as evidence.
  • Distinguish a true unthrottled endpoint from an upstream block by confirming responses remain legitimate application responses (valid 200/401/403 bodies) rather than WAF or CDN interstitials.
  • Test the specific abuse path that defines the finding, for example repeated failed logins for one account to prove no lockout, or an oversized/batched request to prove no payload cap, rather than only measuring raw request count.
  • Verify the control is genuinely absent and not just permissive by comparing behavior against a sibling endpoint or documented policy, ruling out a high-but-present threshold.
  • Record the evidence set (endpoint, method, timestamps, request volume, and the absent 429/lockout) so the finding is reproducible and severity can be assigned to the actual reachable path.
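One way to keep that evidence consistent between findings is a small structured record. The sketch below is only an assumed shape: the class name `RateLimitEvidence`, its fields, and the example hostname are hypothetical and should be adapted to whatever reporting format you already use.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class RateLimitEvidence:
    """Hypothetical evidence record for one bounded validation run."""

    endpoint: str
    method: str
    started_at: str        # ISO 8601 timestamp of the first request
    finished_at: str       # ISO 8601 timestamp of the last request
    statuses: List[int]    # status code of every request, in the order sent
    lockout_observed: bool

    def summary(self) -> dict:
        record = asdict(self)
        record["request_count"] = len(self.statuses)
        record["saw_429"] = 429 in self.statuses
        # The finding holds only if neither control engaged during the run.
        record["limit_absent"] = not record["saw_429"] and not self.lockout_observed
        return record


evidence = RateLimitEvidence(
    endpoint="https://api.example.test/api/v1/login",  # placeholder host
    method="POST",
    started_at="2026-05-28T10:00:00Z",
    finished_at="2026-05-28T10:01:00Z",
    statuses=[401] * 100,
    lockout_observed=False,
)
print(json.dumps(evidence.summary()["limit_absent"]))
```

The full request/response pairs would still be stored alongside this record; the summary only makes the conclusion and its inputs easy to compare across endpoints.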

What looks like this but isn't

  • A high threshold is not the same as no limit: an endpoint that tolerates 100 requests but returns 429 at 500 is rate limited, so confirm the cap is truly absent before flagging.
  • Throttling may live upstream at a CDN, WAF, or API gateway rather than on the application, so a clean origin response does not prove the public-facing path is unprotected; test through the real internet-facing hostname.
  • Authentication endpoints may enforce per-account lockout or CAPTCHA instead of per-IP rate limiting, which still mitigates credential stuffing; verify that no anti-automation control engages before concluding the endpoint is exposed.
  • A non-production, sandbox, or intentionally public read-only endpoint may have rate limiting deliberately relaxed; confirm the endpoint serves real users or sensitive operations before assigning impact.

Remediation

  • Enforce rate limiting on every internet-facing endpoint at the API gateway, reverse proxy, or framework layer, keying limits on the strongest available client identity (authenticated token or API key first, then IP) and returning HTTP 429 with a Retry-After header.
  • Apply stricter, dedicated anti-automation limits to authentication, password-reset, OTP, and other sensitive endpoints, going beyond ordinary API throttling as OWASP recommends for these high-value flows.
  • Add credential-stuffing defenses that do not rely on per-IP limits alone: enforce or step up multi-factor authentication, implement progressive delays and account lockout on repeated failures, and check submitted passwords against known-breached lists.
  • Constrain resource cost per request by validating and capping payload size, pagination limits, query depth and complexity (for GraphQL), and the number of batched operations, addressing the resource-consumption side of OWASP API4:2023.
  • Set spending limits and billing alerts on metered third-party services (SMS, email, cloud egress and compute) so an abuse spike cannot translate into unbounded cost.
  • Monitor and alert on 429 rates, per-client request volume, and authentication failure spikes, then feed this telemetry back into tuning thresholds and triggering graduated responses.
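The progressive delay and account lockout mentioned in the remediation list can be sketched per account as follows. The class name `FailedLoginGuard` and its default values (1 second base delay, lockout after 5 failures, 15 minute lockout) are assumptions chosen for illustration, not recommended thresholds.

```python
import time
from typing import Dict, Optional, Tuple


class FailedLoginGuard:
    """Hypothetical per-account guard: doubling delay, then a fixed lockout."""

    def __init__(
        self,
        base_delay: float = 1.0,
        max_failures: int = 5,
        lockout_secs: float = 900.0,
    ):
        self.base_delay = base_delay
        self.max_failures = max_failures
        self.lockout_secs = lockout_secs
        # account -> (consecutive failures, time before which logins are refused)
        self._state: Dict[str, Tuple[int, float]] = {}

    def seconds_until_allowed(self, account: str, now: Optional[float] = None) -> float:
        """Return how long the account must wait before the next attempt."""
        now = time.monotonic() if now is None else now
        _, blocked_until = self._state.get(account, (0, 0.0))
        return max(0.0, blocked_until - now)

    def record_failure(self, account: str, now: Optional[float] = None) -> float:
        """Register a failed login and return the wait imposed on the account."""
        now = time.monotonic() if now is None else now
        failures, _ = self._state.get(account, (0, 0.0))
        failures += 1
        if failures >= self.max_failures:
            wait = self.lockout_secs
        else:
            # Delay doubles with each consecutive failure: 1x, 2x, 4x, ...
            wait = self.base_delay * 2 ** (failures - 1)
        self._state[account] = (failures, now + wait)
        return wait

    def record_success(self, account: str) -> None:
        """A successful login clears the failure history."""
        self._state.pop(account, None)
```

Because the guard is keyed on the account rather than the source IP, it still engages when attempts are spread across many addresses, which is the case per-IP limits alone do not cover.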

Operational checklist

  • Maintain a living inventory of all public API endpoints and require that each new or changed route ships with an explicit, attached rate-limiting policy before release.
  • Default-deny on rate limits: new endpoints inherit a sensible global throttle unless a route-specific policy is defined, so nothing reaches production unthrottled by omission.
  • Hold authentication, reset, and OTP endpoints to a stricter standard with lockout/backoff plus MFA, and review these controls whenever the auth flow changes.
  • Continuously scan the external surface for endpoints that return no 429 or rate-limit headers under controlled bursts, including legacy and versioned paths.
  • Configure billing alerts and hard spending caps on all metered downstream services consumed by APIs.
  • Track 429 volume, top talkers, and login-failure ratios on a dashboard and alert on anomalous spikes that indicate stuffing, scraping, or DoS in progress.
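The "inherit a global throttle unless a route-specific policy is defined" item can be expressed as a lookup with a default. Everything in the sketch below is illustrative: the `Policy` type, the route table, and the numeric limits are assumptions, and a real deployment would express the same idea in its gateway or framework configuration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Policy:
    requests: int      # allowed requests...
    per_seconds: int   # ...within this many seconds


# Illustrative values only; tune to real traffic.
DEFAULT_POLICY = Policy(requests=60, per_seconds=60)

ROUTE_POLICIES: Dict[str, Policy] = {
    "/api/v1/login": Policy(requests=5, per_seconds=60),
    "/password-reset": Policy(requests=3, per_seconds=300),
}


def policy_for(route: str) -> Policy:
    """Every route gets a policy: its own if defined, otherwise the global default."""
    return ROUTE_POLICIES.get(route, DEFAULT_POLICY)


def routes_on_default(inventory: Iterable[str]) -> List[str]:
    """List inventoried routes that have no explicit policy, for release review."""
    return sorted(route for route in inventory if route not in ROUTE_POLICIES)
```

Running `routes_on_default` against the endpoint inventory gives a review list of routes that are throttled only by the global default, which is a reasonable prompt to decide whether each one needs a stricter, explicit policy.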

What to do next

Missing API rate limiting is cheap for an attacker to find and cheap to abuse, which is exactly why it deserves attention before a stuffing campaign or a runaway cloud bill forces it. The fix is well understood and documented by OWASP: bind explicit throttles to every public route, harden authentication flows with stricter anti-automation controls, and cap the cost of each request. The immediate next step is concrete: enumerate your internet-facing endpoints and confirm that each one returns a 429 under load and enforces a lockout on repeated failed logins. Any endpoint that does neither is the one to remediate first.

Methodology

Each finding-type guide is built from Legba Recon's real detection and validation logic, reviewed by a named security contributor, and cited against primary sources such as OWASP, CISA, NIST, and MITRE. We update pages when the underlying guidance changes. See our contributors and company.
