When using the Requests library in Python to perform web scraping or any HTTP request, it's important to handle exceptions properly to ensure your program can deal with network-related issues gracefully. The Requests library can raise several exceptions based on the issues encountered during the HTTP request process.
Here are some of the most common exceptions you might encounter:
requests.exceptions.RequestException: The base class for all exceptions raised by requests. Catching it last works as a catch-all for any Requests error that a more specific except block did not handle.
requests.exceptions.HTTPError: Raised by response.raise_for_status() when the response has a status code that indicates an error (4XX client error or 5XX server error). Requests does not raise it on its own for such responses.
requests.exceptions.ConnectionError: Raised when there is a network-level problem, such as a DNS failure or a refused connection.
requests.exceptions.Timeout: Raised when a request times out.
requests.exceptions.TooManyRedirects: Raised when a request exceeds the configured maximum number of redirects.
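The relationship between these classes can be checked directly, which helps when deciding the order of except blocks. The snippet below is a small sketch that only inspects the class hierarchy and makes no network calls; ConnectTimeout and ReadTimeout are two more specific timeout exceptions from requests.exceptions that are not in the list above.

```python
from requests.exceptions import (
    ConnectionError,
    ConnectTimeout,
    HTTPError,
    ReadTimeout,
    RequestException,
    Timeout,
    TooManyRedirects,
)

# Every exception in the list above derives from RequestException.
listed = [HTTPError, ConnectionError, Timeout, TooManyRedirects]
all_derive_from_base = all(issubclass(exc, RequestException) for exc in listed)

# ConnectTimeout inherits from both ConnectionError and Timeout, so an
# "except ConnectionError" placed before "except Timeout" will catch it first.
connect_timeout_is_both = issubclass(ConnectTimeout, ConnectionError) and issubclass(
    ConnectTimeout, Timeout
)

# ReadTimeout is a Timeout, but not a ConnectionError.
read_timeout_is_timeout = issubclass(ReadTimeout, Timeout)
read_timeout_is_connection_error = issubclass(ReadTimeout, ConnectionError)

print(all_derive_from_base, connect_timeout_is_both)
print(read_timeout_is_timeout, read_timeout_is_connection_error)
```

Because a connection timeout matches both ConnectionError and Timeout, the order of the except blocks decides which handler runs for it.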
Here's an example of how you can handle these exceptions using the Requests library in Python:
import requests
from requests.exceptions import HTTPError, ConnectionError, Timeout, RequestException

url = "http://example.com"

try:
    # Without a timeout, a request can wait indefinitely for the server
    response = requests.get(url, timeout=10)
    # Raise an HTTPError if the HTTP request returned an unsuccessful status code
    response.raise_for_status()
except HTTPError as http_err:
    print(f"HTTP error occurred: {http_err}")
except Timeout as timeout_err:
    # Listed before ConnectionError because a connection timeout matches both
    print(f"Timeout error occurred: {timeout_err}")
except ConnectionError as conn_err:
    print(f"Connection error occurred: {conn_err}")
except RequestException as req_err:
    print(f"An error occurred: {req_err}")
else:
    # The request was successful, and no exceptions were raised
    print(response.text)
In the example above, the try block wraps the code that may raise an exception during the HTTP request. Each except block handles a specific type of exception, and Python runs the first block that matches, which is why the specific exceptions are listed before the RequestException catch-all. The else block executes only if no exception was raised, meaning the request was successful.
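When a scraper makes many requests, the same pattern can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: fetch_text is a hypothetical name, the timeout values are illustrative assumptions, and collapsing every failure into None is one design choice among several.

```python
import requests
from requests.exceptions import RequestException


def fetch_text(url, timeout=(3.05, 10)):
    """Return the response body as text, or None if the request fails.

    fetch_text is a hypothetical helper; the default timeout is illustrative.
    """
    try:
        # timeout is a (connect, read) tuple, in seconds
        response = requests.get(url, timeout=timeout)
        # Turn 4XX/5XX responses into HTTPError
        response.raise_for_status()
    except RequestException as err:
        # RequestException covers HTTPError, ConnectionError, Timeout, and
        # the other exceptions raised by requests
        print(f"Request to {url!r} failed: {type(err).__name__}: {err}")
        return None
    return response.text


# A URL without a scheme is rejected before any network traffic happens,
# so this call returns None even when offline.
result = fetch_text("not-a-valid-url")
```

Returning None keeps the calling code simple, but it hides which error occurred. If the caller needs to react differently to timeouts and HTTP errors, separate except blocks like those in the example above are the better fit.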
It's worth noting that the response.raise_for_status() method raises an HTTPError if the response has an HTTP status code that indicates an error. Without this line, a response with an error status code will not raise an exception, and you would have to check the status code manually using response.status_code.
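The difference between the two approaches can be seen without making a request. The sketch below builds a Response object by hand and sets its status code to 404, purely so that it runs offline; in real code the object would come from requests.get().

```python
import requests
from requests.exceptions import HTTPError

# Build a Response by hand so the example runs without network access.
# In real code this object would be returned by requests.get(...).
response = requests.Response()
response.status_code = 404

# Manual check: nothing is raised, the caller inspects the status code.
manual_check_failed = response.status_code >= 400

# raise_for_status(): the same condition surfaces as an HTTPError.
try:
    response.raise_for_status()
    raised = False
except HTTPError as err:
    raised = True
    print(f"raise_for_status() raised: {err}")

# response.ok is a shorthand that is False for 4XX and 5XX status codes.
is_ok = response.ok
```

The manual check suits cases where an error status is an expected outcome, such as probing whether a page exists. raise_for_status() suits cases where any error status should interrupt the normal flow.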
Always ensure that you have proper error handling in place when performing web scraping or any network-related tasks to handle unexpected situations and make your application more robust.