In Python, when you're using the requests library to make HTTP requests, you may occasionally encounter connection errors caused by network issues, server unresponsiveness, or timeouts. By default, requests does not retry failed requests, but you can enable retries by using a Session object together with the HTTPAdapter class from the requests.adapters module.

Here's how you can set up a maximum number of retries on connection errors with requests:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry  # requests.packages.urllib3 is a deprecated alias

# Define the maximum number of retries
max_retries = 3

# Create a session
session = requests.Session()

# Define a Retry object with your retry parameters
retries = Retry(
    total=max_retries,
    backoff_factor=1,
    status_forcelist=[500, 502, 503, 504],
)

# Mount an HTTPAdapter to the session's HTTP and HTTPS endpoints
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)

# Now you can make requests through the session, and it will retry automatically
url = 'http://example.com'
try:
    response = session.get(url)
    # Handle successful response
except requests.exceptions.RetryError:
    # Raised when retries are exhausted on a status code from status_forcelist
    pass
except requests.exceptions.ConnectionError:
    # Raised when retries are exhausted on connection errors
    pass
except requests.exceptions.RequestException:
    # Handle other request-related errors
    pass
The Retry class has several parameters that you can adjust to control the retry behavior:

- total: Total number of retries to allow. Set to None for infinite retries.
- read: How many times to retry on read errors.
- connect: How many times to retry on connection-related errors.
- status_forcelist: A set of integer HTTP status codes that we should force a retry on, e.g. [500, 502, 503, 504].
- backoff_factor: A backoff factor to apply between attempts after the first try (most transient errors are resolved by an immediate second attempt without a delay).
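As a sketch of how these parameters combine, you can cap connection and read retries separately while keeping an overall budget (the specific numbers below are illustrative, not recommendations):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Illustrative configuration: an overall cap of 5 retries, but at most
# 2 of them may be spent on connection errors and 2 on read errors.
retries = Retry(
    total=5,
    connect=2,
    read=2,
    status_forcelist=[500, 502, 503, 504],
    backoff_factor=0.5,
)

session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=retries))
```

Whichever limit is hit first (the per-category one or total) stops the retrying.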
The backoff_factor introduces a sleep between retries to give the server time to recover. The delay before retry number n is {backoff factor} * (2 ** (n - 1)) seconds. In urllib3 1.x there is no sleep before the first retry, so with backoff_factor=1 the session waits 2 seconds before the second retry, 4 seconds before the third, and so on. (In urllib3 2.x the first retry also sleeps, starting at backoff_factor seconds.)
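The schedule above can be sketched as a small helper (a hypothetical function modeling the urllib3 1.x behavior, where the first retry is immediate):

```python
def backoff_delays(backoff_factor, retries):
    """Model the urllib3 1.x sleep schedule: no delay before the first
    retry, then backoff_factor * 2**(n - 1) seconds before retry n."""
    delays = []
    for n in range(1, retries + 1):
        delays.append(0 if n == 1 else backoff_factor * (2 ** (n - 1)))
    return delays

print(backoff_delays(1, 4))    # [0, 2, 4, 8]
print(backoff_delays(0.5, 3))  # [0, 1.0, 2.0]
```

Doubling the factor doubles every delay, so even a small backoff_factor grows quickly over many retries.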
If a request exhausts the maximum number of retries on a status code from status_forcelist, a RetryError is raised; exhausted retries on connection errors surface as a ConnectionError instead. You can catch these exceptions and handle them accordingly.
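One way to see the RetryError case end to end is to point the session at a throwaway local server that always answers 503 (the AlwaysBusy handler below is a made-up name for this sketch; backoff_factor=0 keeps the demo fast):

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Throwaway local server that always answers 503, so we can watch the
# retries get exhausted without hammering a real host.
class AlwaysBusy(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(503)
        self.end_headers()

    def log_message(self, *args):
        pass  # silence per-request logging

server = HTTPServer(('127.0.0.1', 0), AlwaysBusy)
threading.Thread(target=server.serve_forever, daemon=True).start()

retries = Retry(total=2, backoff_factor=0, status_forcelist=[503])
session = requests.Session()
session.mount('http://', HTTPAdapter(max_retries=retries))

exhausted = False
try:
    session.get(f'http://127.0.0.1:{server.server_port}/')
except requests.exceptions.RetryError:
    exhausted = True  # the 1 initial attempt + 2 retries all got 503
finally:
    server.shutdown()

print('gave up after exhausting retries on 503' if exhausted else 'unexpected success')
```

Swapping status_forcelist for an unreachable address would produce a ConnectionError instead, exercising the other exception branch.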
Remember that it's important to use retries responsibly and not to overload the server with too many quick, repeated requests. Always follow the website's terms of service and respect the robots.txt
file when scraping.