How do I set a timeout for a request in urllib3?

In urllib3, you can control request timeouts using the timeout parameter to prevent your application from hanging indefinitely when making HTTP requests.

Types of Timeouts

urllib3 supports two types of timeouts:

  1. Connection Timeout: Maximum time to wait for establishing a connection to the server
  2. Read Timeout: Maximum time to wait to receive a response after the connection is established
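
To see the difference in practice, here is a minimal sketch that forces each kind of timeout; the non-routable 10.255.255.1 address and the httpbin.org delay endpoint are only illustrative, and the exact error you see can vary with your network environment:

import urllib3
from urllib3.exceptions import ConnectTimeoutError, ReadTimeoutError

http = urllib3.PoolManager()

# Connection timeout: the server never accepts the TCP connection
try:
    # A non-routable address, used here only to force a connect timeout
    http.request('GET', 'http://10.255.255.1/', timeout=urllib3.Timeout(connect=1.0), retries=False)
except ConnectTimeoutError:
    print("Connect timeout: no connection within 1 second")

# Read timeout: the connection succeeds but the response is slow
try:
    # httpbin delays the response by 5 seconds, longer than the 2-second read timeout
    http.request('GET', 'https://httpbin.org/delay/5', timeout=urllib3.Timeout(read=2.0), retries=False)
except ReadTimeoutError:
    print("Read timeout: no response within 2 seconds")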

Setting a Simple Timeout

For basic use cases, pass a single number to set both connection and read timeouts to the same value:

import urllib3

# Create a PoolManager instance
http = urllib3.PoolManager()

# Set timeout to 5 seconds for both connection and read
response = http.request('GET', 'https://httpbin.org/delay/2', timeout=5)

print(f"Status: {response.status}")
print(f"Data: {response.data.decode()}")

Setting Separate Connection and Read Timeouts

For more control, use the Timeout object to specify different values:

import urllib3
from urllib3.util.timeout import Timeout

http = urllib3.PoolManager()

# Create a Timeout object with separate values
timeout = Timeout(connect=2.0, read=10.0)

# Make request with custom timeout configuration
response = http.request('GET', 'https://httpbin.org/delay/3', timeout=timeout)

print(f"Status: {response.status}")
print(f"Response received in time!")

Using Default Timeouts

You can set default timeouts when creating a PoolManager:

import urllib3
from urllib3.util.timeout import Timeout

# Set default timeout for all requests
default_timeout = Timeout(connect=5.0, read=30.0)
http = urllib3.PoolManager(timeout=default_timeout)

# This request will use the default timeout
response = http.request('GET', 'https://httpbin.org/get')

# Override default timeout for specific requests
response = http.request('GET', 'https://httpbin.org/delay/1', timeout=2.0)

Handling Timeout Exceptions

Always handle timeout exceptions to create robust applications:

import urllib3
from urllib3.exceptions import ReadTimeoutError, ConnectTimeoutError, TimeoutError

http = urllib3.PoolManager()

try:
    # retries=False raises the timeout exception immediately instead of
    # retrying and wrapping it in a MaxRetryError
    response = http.request('GET', 'https://httpbin.org/delay/10', timeout=3.0, retries=False)
    print(f"Success: {response.status}")

except ConnectTimeoutError:
    print("Connection timeout: Could not establish connection within timeout period")

except ReadTimeoutError:
    print("Read timeout: Server did not respond within timeout period")

except TimeoutError as e:
    print(f"General timeout error: {e}")

Advanced Timeout Configuration

The Timeout object supports additional parameters:

import urllib3
from urllib3.util.timeout import Timeout

# Complete timeout configuration
timeout = Timeout(
    connect=5.0,     # Connection timeout
    read=30.0,       # Read timeout
    total=35.0       # Total timeout (connection + read combined)
)

http = urllib3.PoolManager()
response = http.request('GET', 'https://httpbin.org/get', timeout=timeout)
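
As a variation, you can also pass only total and let urllib3 budget both phases from the same cap; this sketch assumes a 10-second overall limit is appropriate:

import urllib3
from urllib3.util.timeout import Timeout

# A total-only timeout: connect and read share a single 10-second budget
http = urllib3.PoolManager(timeout=Timeout(total=10.0))
response = http.request('GET', 'https://httpbin.org/get')
print(response.status)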

Best Practices

  1. Always set timeouts to prevent hanging requests
  2. Use appropriate values based on your use case:
    • Connection timeout: 3-10 seconds
    • Read timeout: 10-60 seconds for web scraping
  3. Handle exceptions gracefully in production code
  4. Consider retry logic for timeout scenarios
  5. Set default timeouts at the PoolManager level for consistency

Combining Timeouts with Retries

Pairing a timeout configuration with a retry strategy, as recommended above, keeps requests both responsive and resilient:

import urllib3
from urllib3.util.timeout import Timeout
from urllib3.util.retry import Retry

# Configure retry strategy with timeout handling
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=["HEAD", "GET", "OPTIONS"]
)

# Create PoolManager with timeout and retry
http = urllib3.PoolManager(
    timeout=Timeout(connect=5.0, read=30.0),
    retries=retry_strategy
)

try:
    response = http.request('GET', 'https://httpbin.org/status/500')
    print(f"Success: {response.status}")
except Exception as e:
    print(f"Request failed after retries: {e}")

This approach ensures your web scraping applications are both responsive and resilient to network issues.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
