How do I set a timeout for a request in urllib3?

In urllib3, you set a timeout for a request with the timeout parameter. It controls two separate limits:

  1. Connection Timeout: The maximum time to wait for a connection to the HTTP server to be established, measured from the moment the connection attempt starts.
  2. Read Timeout: The maximum time to wait for the server to send data once the connection is established. It applies to each read from the socket rather than to the response as a whole, so a slow but steady response can take longer overall than this value.

If you pass a single number (in seconds), urllib3 uses it for both the connection timeout and the read timeout. To use different values for each, pass a Timeout object instead.

Here's an example of how to set a timeout in urllib3:

import urllib3

# Create a PoolManager instance
http = urllib3.PoolManager()

# Specify a timeout duration in seconds
timeout_duration = 5  # 5 seconds for both connection and read timeout

# Make a request with the specified timeout
response = http.request('GET', 'http://example.com', timeout=timeout_duration)

print(response.status)
print(response.data)

If you need to specify different connection and read timeouts, you can create a Timeout object, like so:

import urllib3
from urllib3.util.timeout import Timeout

# Create a PoolManager instance
http = urllib3.PoolManager()

# Specify connection timeout and read timeout separately
connect_timeout = 2  # 2 seconds to establish a connection
read_timeout = 10  # 10 seconds to wait for a response

# Create a Timeout object with separate values
timeout = Timeout(connect=connect_timeout, read=read_timeout)

# Make a request with the specified timeouts
response = http.request('GET', 'http://example.com', timeout=timeout)

print(response.status)
print(response.data)
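A timeout can also be set once on the PoolManager, so that every request made through it uses the same limits unless a request passes its own timeout. The following is a minimal sketch of that pattern; it only builds the objects and sends no request, and the specific values are illustrative:

```python
import urllib3
from urllib3.util.timeout import Timeout

# Separate limits: 2 seconds to connect, 10 seconds to wait on each read
timeout = Timeout(connect=2.0, read=10.0)

# Passing the Timeout to the constructor makes it the default
# for requests sent through this PoolManager
http = urllib3.PoolManager(timeout=timeout)

# The configured values can be inspected on the Timeout object
print(timeout.connect_timeout)
print(timeout.read_timeout)
```

A timeout passed to an individual http.request call, as in the examples above, takes precedence over this default for that call.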

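If a connection cannot be established within the connection timeout, urllib3 raises ConnectTimeoutError; if the server does not send data within the read timeout, it raises ReadTimeoutError. Be aware that PoolManager retries failed requests by default, and once the retries are used up the error you see is a MaxRetryError that wraps the timeout exception. The example below passes retries=False so that the timeout exceptions are raised directly (this also turns off automatic redirect handling, so redirect responses are returned as-is):

import urllib3
from urllib3.exceptions import ReadTimeoutError, ConnectTimeoutError

http = urllib3.PoolManager()
timeout = 5  # seconds

try:
    # retries=False disables urllib3's default retries, so a timeout is
    # raised as-is instead of being wrapped in MaxRetryError
    response = http.request('GET', 'http://example.com', timeout=timeout, retries=False)
    print(response.status)
    print(response.data)
except ConnectTimeoutError:
    print("The connection timed out!")
except ReadTimeoutError:
    print("The server took too long to send the data!")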
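When urllib3's default retry behavior is left enabled, a timed-out request is retried, and the final failure arrives as a MaxRetryError whose reason attribute holds the underlying ConnectTimeoutError or ReadTimeoutError. The sketch below handles both the wrapped and the direct form. The names classify_timeout and fetch_with_timeout are hypothetical helpers introduced for illustration, not part of urllib3:

```python
import urllib3
from urllib3.exceptions import (
    ConnectTimeoutError,
    MaxRetryError,
    ReadTimeoutError,
)


def classify_timeout(exc):
    """Return 'connect', 'read', or None for a urllib3 exception."""
    # With retries enabled, the timeout is wrapped in MaxRetryError.reason
    cause = exc.reason if isinstance(exc, MaxRetryError) else exc
    if isinstance(cause, ConnectTimeoutError):
        return "connect"
    if isinstance(cause, ReadTimeoutError):
        return "read"
    return None


def fetch_with_timeout(url, timeout=5.0):
    """Return the response status, or None if the request timed out."""
    http = urllib3.PoolManager()
    try:
        response = http.request("GET", url, timeout=timeout)
        return response.status
    except (MaxRetryError, ConnectTimeoutError, ReadTimeoutError) as exc:
        kind = classify_timeout(exc)
        if kind is None:
            # Not a timeout (for example, connection refused): re-raise
            raise
        print(f"The {kind} timeout was exceeded for {url}")
        return None
```

Calling fetch_with_timeout('http://example.com', timeout=5) would then return the status code on success and None when either timeout is exceeded, while other failures still propagate to the caller.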

Handling these exceptions keeps a web scraping or data retrieval job from failing outright when a server is slow or unreachable.
