How can I handle HTTP 429 Too Many Requests errors?

HTTP 429 "Too Many Requests" is a client error status code that indicates the user has sent too many requests in a given amount of time. This error is commonly encountered in web scraping, API interactions, and automated data collection scenarios. Understanding how to properly handle 429 errors is crucial for building robust and respectful web scraping applications.

Understanding HTTP 429 Errors

The HTTP 429 status code is part of the rate limiting mechanism that web servers use to control the flow of incoming requests. When a server returns a 429 error, it's essentially saying "slow down" to prevent resource exhaustion and maintain service quality for all users.

Common Scenarios

  • API Rate Limiting: Most APIs implement rate limits (e.g., 100 requests per minute)
  • Web Scraping: Websites protect themselves from aggressive scraping
  • DDoS Protection: Security systems may trigger 429 responses for suspicious activity
  • Resource Protection: Servers limit requests to prevent overload

Key Response Headers

When a server returns a 429 error, it often includes helpful headers (a parsing sketch follows this list):

  • Retry-After: Indicates how long to wait before making another request
  • X-RateLimit-Limit: The request limit for the current window
  • X-RateLimit-Remaining: Number of requests remaining in the current window
  • X-RateLimit-Reset: When the rate limit window resets
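
Note that the Retry-After value can be either a number of seconds or an HTTP-date, and the X-RateLimit-* names are a common convention rather than a standard (some APIs use other variants). Here is a minimal parsing sketch; parse_retry_after is a hypothetical helper and the commented usage assumes a requests.Response object:

from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def parse_retry_after(value: str) -> float:
    """Convert a Retry-After header value into a delay in seconds."""
    try:
        # Most servers send delay-seconds, e.g. "120"
        return float(value)
    except ValueError:
        # Otherwise it is an HTTP-date, e.g. "Wed, 21 Oct 2025 07:28:00 GMT"
        target = parsedate_to_datetime(value)
        return max(0.0, (target - datetime.now(timezone.utc)).total_seconds())

# Usage (assuming `response` is a requests.Response for a 429):
# retry_after = response.headers.get('Retry-After')
# delay = parse_retry_after(retry_after) if retry_after else None
# remaining = response.headers.get('X-RateLimit-Remaining')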

Implementation Strategies

1. Basic Retry Logic with Exponential Backoff

Python Example using requests:

import requests
import time
import random
from typing import Optional

def make_request_with_retry(url: str, max_retries: int = 3, base_delay: float = 1.0) -> Optional[requests.Response]:
    """
    Make HTTP request with exponential backoff retry logic for 429 errors.
    """
    for attempt in range(max_retries + 1):
        try:
            response = requests.get(url, timeout=30)

            if response.status_code == 429:
                if attempt == max_retries:
                    print(f"Max retries ({max_retries}) exceeded for {url}")
                    return None

                # Check for Retry-After header (may be delay-seconds or an HTTP-date)
                retry_after = response.headers.get('Retry-After')
                if retry_after and retry_after.isdigit():
                    delay = int(retry_after)
                    print(f"Rate limited. Waiting {delay} seconds (from Retry-After header)")
                else:
                    # Exponential backoff with jitter
                    delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
                    print(f"Rate limited. Waiting {delay:.2f} seconds (exponential backoff)")

                time.sleep(delay)
                continue

            # Return successful response or other error codes
            return response

        except requests.RequestException as e:
            print(f"Request failed: {e}")
            if attempt == max_retries:
                return None

            delay = base_delay * (2 ** attempt)
            time.sleep(delay)

    return None

# Usage example
url = "https://api.example.com/data"
response = make_request_with_retry(url, max_retries=5, base_delay=2.0)

if response and response.status_code == 200:
    data = response.json()
    print("Request successful!")
else:
    print("Request failed after all retries")

JavaScript Example using fetch:

async function makeRequestWithRetry(url, maxRetries = 3, baseDelay = 1000) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
        try {
            const response = await fetch(url);

            if (response.status === 429) {
                if (attempt === maxRetries) {
                    console.log(`Max retries (${maxRetries}) exceeded for ${url}`);
                    return null;
                }

                // Check for Retry-After header
                const retryAfter = response.headers.get('Retry-After');
                let delay;

                if (retryAfter && !isNaN(parseInt(retryAfter, 10))) {
                    delay = parseInt(retryAfter, 10) * 1000; // Convert seconds to milliseconds
                    console.log(`Rate limited. Waiting ${delay / 1000} seconds (from Retry-After header)`);
                } else {
                    // Exponential backoff with jitter
                    delay = baseDelay * Math.pow(2, attempt) + Math.random() * 1000;
                    console.log(`Rate limited. Waiting ${delay/1000} seconds (exponential backoff)`);
                }

                await new Promise(resolve => setTimeout(resolve, delay));
                continue;
            }

            return response;

        } catch (error) {
            console.error(`Request failed: ${error.message}`);
            if (attempt === maxRetries) {
                return null;
            }

            const delay = baseDelay * Math.pow(2, attempt);
            await new Promise(resolve => setTimeout(resolve, delay));
        }
    }

    return null;
}

// Usage example
async function fetchData() {
    const response = await makeRequestWithRetry('https://api.example.com/data', 5, 2000);

    if (response && response.ok) {
        const data = await response.json();
        console.log('Request successful!', data);
    } else {
        console.log('Request failed after all retries');
    }
}

fetchData();

2. Advanced Rate Limiting with Token Bucket

Python Implementation:

import requests
import time
import threading
from typing import Optional

class TokenBucket:
    """
    Token bucket algorithm for rate limiting requests.
    """
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.tokens = capacity
        self.refill_rate = refill_rate
        self.last_refill = time.time()
        self.lock = threading.Lock()

    def consume(self, tokens: int = 1) -> bool:
        """
        Try to consume tokens from the bucket.
        Returns True if successful, False if not enough tokens.
        """
        with self.lock:
            now = time.time()
            elapsed = now - self.last_refill

            # Add tokens based on elapsed time
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
            self.last_refill = now

            if self.tokens >= tokens:
                self.tokens -= tokens
                return True

            return False

    def wait_for_token(self, tokens: int = 1) -> None:
        """
        Wait until enough tokens are available.
        """
        while not self.consume(tokens):
            time.sleep(0.1)

class RateLimitedClient:
    def __init__(self, requests_per_second: float = 1.0, burst_capacity: int = 5):
        self.bucket = TokenBucket(burst_capacity, requests_per_second)

    def make_request(self, url: str, _is_retry: bool = False) -> Optional[requests.Response]:
        """
        Make a rate-limited request.
        """
        self.bucket.wait_for_token()

        try:
            response = requests.get(url, timeout=30)

            if response.status_code == 429 and not _is_retry:
                # If we still get 429, respect the Retry-After header and retry once
                retry_after = response.headers.get('Retry-After')
                if retry_after and retry_after.isdigit():
                    time.sleep(int(retry_after))
                    return self.make_request(url, _is_retry=True)

            return response

        except requests.RequestException as e:
            print(f"Request failed: {e}")
            return None

# Usage example
client = RateLimitedClient(requests_per_second=0.5, burst_capacity=3)
response = client.make_request("https://api.example.com/data")
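
With requests_per_second=0.5 and burst_capacity=3, the client can send up to three requests back-to-back and then sustains roughly one request every two seconds, so short bursts are allowed without exceeding the long-term rate.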

3. Handling 429 Errors in Web Scraping

When scraping websites, 429 errors often require more sophisticated handling. Here's how to combine retry logic with timeouts, session reuse, and error handling in a reusable scraper class:

Python with BeautifulSoup:

import requests
from bs4 import BeautifulSoup
import time
import random
from typing import Optional
from urllib.parse import urljoin, urlparse

class WebScraper:
    def __init__(self, base_delay: float = 1.0, max_delay: float = 60.0):
        self.session = requests.Session()
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.request_count = 0

        # Set realistic headers
        self.session.headers.update({
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
        })

    def scrape_with_backoff(self, url: str, max_retries: int = 5) -> Optional[BeautifulSoup]:
        """
        Scrape a URL with exponential backoff for 429 errors.
        """
        for attempt in range(max_retries + 1):
            try:
                response = self.session.get(url, timeout=30)
                self.request_count += 1

                if response.status_code == 200:
                    return BeautifulSoup(response.content, 'html.parser')

                elif response.status_code == 429:
                    if attempt == max_retries:
                        print(f"Max retries exceeded for {url}")
                        return None

                    delay = self._calculate_delay(response, attempt)
                    print(f"Rate limited on attempt {attempt + 1}. Waiting {delay} seconds...")
                    time.sleep(delay)
                    continue

                else:
                    print(f"HTTP {response.status_code} error for {url}")
                    return None

            except requests.RequestException as e:
                print(f"Request exception: {e}")
                if attempt < max_retries:
                    time.sleep(self.base_delay * (2 ** attempt))
                    continue
                return None

        return None

    def _calculate_delay(self, response: requests.Response, attempt: int) -> float:
        """
        Calculate delay based on Retry-After header or exponential backoff.
        """
        retry_after = response.headers.get('Retry-After')

        if retry_after:
            try:
                return min(int(retry_after), self.max_delay)
            except ValueError:
                pass

        # Exponential backoff with jitter
        delay = min(self.base_delay * (2 ** attempt), self.max_delay)
        jitter = random.uniform(0.1, 0.5) * delay
        return delay + jitter

    def scrape_multiple_urls(self, urls: list, delay_between_requests: float = 1.0):
        """
        Scrape multiple URLs with built-in delays.
        """
        results = []

        for i, url in enumerate(urls):
            print(f"Scraping {i+1}/{len(urls)}: {url}")

            soup = self.scrape_with_backoff(url)
            results.append({
                'url': url,
                'content': soup,
                'success': soup is not None
            })

            # Add delay between requests
            if i < len(urls) - 1:  # Don't delay after the last request
                time.sleep(delay_between_requests)

        return results

# Usage example
scraper = WebScraper(base_delay=2.0, max_delay=30.0)
urls = [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
]

results = scraper.scrape_multiple_urls(urls, delay_between_requests=2.0)
successful_scrapes = [r for r in results if r['success']]
print(f"Successfully scraped {len(successful_scrapes)}/{len(urls)} URLs")

Best Practices for Handling 429 Errors

1. Respect Server Resources

# Use appropriate delays between requests
curl -w "Time: %{time_total}s\n" -H "User-Agent: MyBot/1.0" https://api.example.com/data
sleep 2  # Wait 2 seconds between requests

2. Implement Circuit Breaker Pattern

import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.failure_count = 0
        self.last_failure_time = None
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == 'OPEN':
            if time.time() - self.last_failure_time > self.recovery_timeout:
                self.state = 'HALF_OPEN'
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func(*args, **kwargs)
            if hasattr(result, 'status_code') and result.status_code == 429:
                self._record_failure()
                return result

            self._record_success()
            return result

        except Exception as e:
            self._record_failure()
            raise

    def _record_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()

        if self.failure_count >= self.failure_threshold:
            self.state = 'OPEN'

    def _record_success(self):
        self.failure_count = 0
        self.state = 'CLOSED'
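
A minimal usage sketch, routing a plain requests.get call through the breaker (the URL is a placeholder):

import requests

breaker = CircuitBreaker(failure_threshold=3, recovery_timeout=30.0)

try:
    # Repeated 429s or exceptions trip the breaker open; further calls then raise immediately
    response = breaker.call(requests.get, "https://api.example.com/data", timeout=30)
    if response.status_code == 429:
        print("Rate limited; failure recorded by the breaker")
    else:
        print(f"Got HTTP {response.status_code}")
except Exception as e:
    print(f"Circuit breaker open or request failed: {e}")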

3. Monitor and Log Rate Limiting

import logging
import time
import requests

# Configure logging for rate limiting events
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def log_rate_limit_info(response: requests.Response, url: str):
    """
    Log detailed rate limiting information.
    """
    headers = response.headers

    info = {
        'url': url,
        'status_code': response.status_code,
        'retry_after': headers.get('Retry-After'),
        'rate_limit': headers.get('X-RateLimit-Limit'),
        'rate_remaining': headers.get('X-RateLimit-Remaining'),
        'rate_reset': headers.get('X-RateLimit-Reset'),
        'timestamp': time.time()
    }

    if response.status_code == 429:
        logger.warning(f"Rate limited: {info}")
    else:
        logger.info(f"Request info: {info}")
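
A short usage sketch (the URL is a placeholder):

url = "https://api.example.com/data"
response = requests.get(url, timeout=30)
log_rate_limit_info(response, url)  # Logs a warning on 429, an info entry otherwise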

Testing Rate Limiting Behavior

Load Testing Script

import concurrent.futures
import requests
import time
from collections import Counter

def test_rate_limiting(url: str, num_requests: int = 50, max_workers: int = 5):
    """
    Test how a server responds to rapid requests.
    """
    results = Counter()

    def make_single_request(request_id):
        try:
            response = requests.get(url, timeout=10)
            return response.status_code
        except Exception as e:
            return f"Error: {str(e)}"

    start_time = time.time()

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(make_single_request, i) for i in range(num_requests)]

        for future in concurrent.futures.as_completed(futures):
            result = future.result()
            results[result] += 1

    end_time = time.time()

    print(f"Test completed in {end_time - start_time:.2f} seconds")
    print("Response status codes:")
    for status, count in results.items():
        print(f"  {status}: {count}")

    if 429 in results:
        print(f"Rate limiting triggered {results[429]} times")

# Run the test
test_rate_limiting("https://api.example.com/test", num_requests=100, max_workers=10)

Integration with Popular Libraries

Using with aiohttp (Async Python)

import asyncio
import aiohttp
import random

async def async_request_with_retry(session: aiohttp.ClientSession, url: str, max_retries: int = 3):
    """
    Async HTTP request with 429 handling.
    """
    for attempt in range(max_retries + 1):
        try:
            async with session.get(url) as response:
                if response.status == 429:
                    if attempt == max_retries:
                        return None

                    retry_after = response.headers.get('Retry-After')
                    delay = int(retry_after) if retry_after and retry_after.isdigit() else (2 ** attempt) + random.random()

                    await asyncio.sleep(delay)
                    continue

                return await response.text()

        except Exception as e:
            if attempt == max_retries:
                print(f"Max retries exceeded: {e}")
                return None

            await asyncio.sleep(2 ** attempt)

    return None
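
A minimal usage sketch, assuming the helper above (the URLs are placeholders):

async def main():
    urls = ["https://api.example.com/data/1", "https://api.example.com/data/2"]
    async with aiohttp.ClientSession() as session:
        # Fetch concurrently; each request handles its own 429 retries
        results = await asyncio.gather(*(async_request_with_retry(session, url) for url in urls))
    for url, body in zip(urls, results):
        print(url, "ok" if body is not None else "failed")

asyncio.run(main())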

Conclusion

Handling HTTP 429 errors effectively requires a multi-faceted approach combining retry logic, exponential backoff, proper delay mechanisms, and respectful rate limiting. The key is to build systems that can gracefully handle rate limiting while maintaining good citizenship on the web.

Remember to always respect the Retry-After header when provided, implement exponential backoff with jitter to avoid thundering herd problems, and consider the impact of your requests on the target server. When building web scrapers or API clients, these practices will help you create more robust and reliable applications while maintaining good relationships with the services you're accessing.

For complex scenarios involving browser automation, consider implementing similar strategies when monitoring network requests to detect and handle rate limiting at the browser level.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
