How do I handle HTTP exceptions using the Requests library?

When using the Requests library in Python to perform web scraping or any HTTP request, it's important to handle exceptions properly to ensure your program can deal with network-related issues gracefully. The Requests library can raise several exceptions based on the issues encountered during the HTTP request process.

Here are some of the most common exceptions you might encounter:

  • requests.exceptions.RequestException: This is the base class for all other exceptions raised by Requests. Catching it last, after the more specific classes, works as a catch-all for any Requests error you have not handled explicitly.
  • requests.exceptions.HTTPError: Raised by response.raise_for_status() when the response has a status code that indicates an error (4XX client error or 5XX server error). Requests does not raise it on its own for such responses.
  • requests.exceptions.ConnectionError: Raised when there is a problem with the network, such as a DNS failure, refused connection, etc.
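  • requests.exceptions.Timeout: Raised when a request times out. Requests applies no timeout by default, so pass the timeout argument if you want a slow request to fail instead of waiting indefinitely.
  • requests.exceptions.TooManyRedirects: Raised when a request exceeds the maximum number of redirects (30 by default).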
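
The relationships among these classes can be checked directly. The following is a minimal sketch, assuming the requests package is installed; it only inspects the exception classes and makes no network call. The variable names are illustrative.

```python
from requests.exceptions import (
    ConnectionError,
    ConnectTimeout,
    HTTPError,
    ReadTimeout,
    RequestException,
    Timeout,
    TooManyRedirects,
)

# Every exception listed above derives from RequestException,
# so a single `except RequestException` clause catches them all.
listed = [HTTPError, ConnectionError, Timeout, TooManyRedirects]
all_derive_from_base = all(issubclass(exc, RequestException) for exc in listed)

# Timeout has two more specific subclasses. ConnectTimeout also derives
# from ConnectionError, so the order of `except` clauses decides which
# handler sees a connection timeout.
connect_timeout_is_timeout = issubclass(ConnectTimeout, Timeout)
connect_timeout_is_connection_error = issubclass(ConnectTimeout, ConnectionError)
read_timeout_is_connection_error = issubclass(ReadTimeout, ConnectionError)

print(all_derive_from_base)
print(connect_timeout_is_timeout, connect_timeout_is_connection_error)
print(read_timeout_is_connection_error)
```

Because ConnectTimeout belongs to both families, an except clause for ConnectionError placed before one for Timeout will receive connection timeouts.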

Here's an example of how you can handle these exceptions using the Requests library in Python:

import requests
from requests.exceptions import HTTPError, ConnectionError, Timeout, RequestException

url = "http://example.com"

try:
    # Requests applies no timeout by default; without one the call can wait indefinitely
    response = requests.get(url, timeout=10)
    # Raise an HTTPError if the HTTP request returned an unsuccessful status code
    response.raise_for_status()
except HTTPError as http_err:
    print(f"HTTP error occurred: {http_err}")
except Timeout as timeout_err:
    # Checked before ConnectionError because ConnectTimeout subclasses both
    print(f"Timeout error occurred: {timeout_err}")
except ConnectionError as conn_err:
    print(f"Connection error occurred: {conn_err}")
except RequestException as req_err:
    print(f"An error occurred: {req_err}")
else:
    # The request was successful, and no exceptions were raised
    print(response.text)

In the example above, the try block wraps the code that may fail during the HTTP request, and each except block handles one specific type of exception. Python checks the except blocks in order and runs the first one that matches, so the more specific classes come first and RequestException comes last. The else block only executes if no exception was raised, meaning the request was successful.

Note that response.raise_for_status() raises an HTTPError only if the response has a status code that indicates an error. Without this call, a response with an error status code does not raise an exception, and you have to check the status code yourself using response.status_code.
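
The two approaches can be compared side by side. This is a sketch under stated assumptions: describe_status, describe_status_manually, and make_response are hypothetical names introduced for illustration, and make_response builds a Response object by hand so the example runs without a network call.

```python
import requests
from requests.exceptions import HTTPError


def describe_status(response):
    """Describe the response using raise_for_status()."""
    try:
        response.raise_for_status()
    except HTTPError as http_err:
        # The failing response is attached to the exception.
        return f"error {http_err.response.status_code}"
    return f"ok {response.status_code}"


def describe_status_manually(response):
    """Describe the response by inspecting status_code directly."""
    if 400 <= response.status_code < 600:
        return f"error {response.status_code}"
    return f"ok {response.status_code}"


def make_response(status_code):
    # Hypothetical helper for illustration only: a hand-built Response
    # stands in for the result of a real request.
    response = requests.Response()
    response.status_code = status_code
    response.url = "http://example.com"
    return response


print(describe_status(make_response(404)))
print(describe_status_manually(make_response(404)))
print(describe_status(make_response(200)))
```

Both functions classify 4XX and 5XX codes as errors; raise_for_status() has the advantage that the error can be handled alongside the other Requests exceptions in the same try statement.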

Always ensure that you have proper error handling in place when performing web scraping or any network-related tasks to handle unexpected situations and make your application more robust.
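
One way to keep this handling in a single place is a small wrapper function. The sketch below is one possible shape, not a definitive implementation: fetch_text is a hypothetical helper name, and both the 10-second default timeout and the choice to return None on failure are assumptions made for illustration.

```python
import requests
from requests.exceptions import RequestException


def fetch_text(url, timeout=10):
    """Return the response body as text, or None if the request failed."""
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4XX and 5XX responses into HTTPError so they are handled below.
        response.raise_for_status()
    except RequestException as req_err:
        # RequestException covers HTTPError, ConnectionError, Timeout,
        # TooManyRedirects, and invalid-URL errors.
        print(f"Request to {url!r} failed: {req_err}")
        return None
    return response.text
```

A caller can then write body = fetch_text(url) and check for None. Code that needs to react differently to timeouts and HTTP errors would keep the separate except blocks instead.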
