Yes, with the Python requests library, you can manually handle redirects instead of allowing requests to follow them automatically. By default, requests will follow redirects for all HTTP methods except HEAD. However, you can disable this behavior and handle redirects manually by setting the allow_redirects parameter to False.
Here's how you can do it:
import requests
from urllib.parse import urljoin

# Make a request without automatically following redirects
response = requests.get('http://example.com', allow_redirects=False)

# is_redirect is True for 301, 302, 303, 307, and 308 responses that carry a
# Location header, so it also covers the permanent-redirect case
if response.is_redirect:
    # Location may be relative, so resolve it against the current URL
    redirect_url = urljoin(response.url, response.headers['Location'])
    print(f"Redirect to: {redirect_url}")
    # Manually perform the redirect (if desired); this call follows any
    # further redirects automatically
    response = requests.get(redirect_url)
By manually following redirects, you'll have more control over the process and can, for example, track the URL chain, set different headers for the redirected request, or limit the number of redirects to prevent endless loops.
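As one way to picture that extra control, here is a minimal sketch that records every hop and sends different headers on the redirected requests. The helper name trace_redirects, its redirect_headers parameter, and the max_hops cap are illustrative assumptions, not part of requests; the session argument is expected to be a requests.Session.

```python
from urllib.parse import urljoin


def trace_redirects(session, url, redirect_headers=None, max_hops=10):
    """Follow redirects by hand and return (final_response, chain).

    `session` is assumed to be a requests.Session. `chain` lists a
    (status_code, url) tuple for every response seen, in order.
    `redirect_headers` (hypothetical parameter) is sent only on the
    follow-up requests, not on the initial one.
    """
    chain = []
    response = session.get(url, allow_redirects=False)
    chain.append((response.status_code, response.url))

    for _ in range(max_hops):
        if not response.is_redirect:
            break
        # Location may be relative, so resolve it against the current URL.
        next_url = urljoin(response.url, response.headers["Location"])
        response = session.get(
            next_url, headers=redirect_headers, allow_redirects=False
        )
        chain.append((response.status_code, response.url))

    return response, chain


# Example usage (needs network access and the requests package):
#   import requests
#   with requests.Session() as session:
#       final, chain = trace_redirects(
#           session, "http://example.com", redirect_headers={"X-Trace": "1"}
#       )
#       for status, hop_url in chain:
#           print(status, hop_url)
```

Using a Session here rather than bare requests.get calls keeps cookies and connection pooling across hops, which is usually what you want when a redirect chain sets cookies along the way.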
To walk the whole redirection chain manually, keep requesting the Location target until the response is no longer a redirect (response.history is not populated when allow_redirects is False):
import requests
from urllib.parse import urljoin

response = requests.get('http://example.com', allow_redirects=False)
while response.is_redirect:
    # Resolve the Location header against the current URL, since it may be relative
    redirect_url = urljoin(response.url, response.headers['Location'])
    print(f"Redirecting to: {redirect_url}")
    # You could add conditions here to stop at certain points, inspect headers, etc.
    response = requests.get(redirect_url, allow_redirects=False)

# Final response
print(response.url)
print(response.status_code)
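Instead of rebuilding each request from the Location header, you can also lean on response.next: when a redirect is not followed, requests stores the prepared follow-up request there, and a Session can send it directly. The following is a minimal sketch under the assumption that a requests.Session is passed in; the function name walk_with_next and the max_hops parameter are illustrative, not part of the library.

```python
def walk_with_next(session, url, max_hops=10):
    """Follow redirects via Response.next and return every response seen.

    `session` is assumed to be a requests.Session. At most `max_hops`
    follow-up requests are sent after the initial one.
    """
    response = session.get(url, allow_redirects=False)
    responses = [response]

    # response.next holds the prepared follow-up request, or None when
    # the response is not a redirect.
    while response.next is not None and len(responses) <= max_hops:
        response = session.send(response.next, allow_redirects=False)
        responses.append(response)

    return responses


# Example usage (needs network access and the requests package):
#   import requests
#   with requests.Session() as session:
#       for r in walk_with_next(session, "http://example.com"):
#           print(r.status_code, r.url)
```

The appeal of this variant is that requests itself builds the follow-up request, including resolving a relative Location, so the loop does not need its own URL handling.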
When following redirects manually, guard against redirect loops by capping the number of redirects you follow; otherwise a misconfigured server can keep you looping forever. Here's how you could implement such a limit:
import requests
from urllib.parse import urljoin

max_redirects = 10
num_redirects = 0
response = requests.get('http://example.com', allow_redirects=False)
while response.is_redirect:
    if num_redirects >= max_redirects:
        print("Reached maximum number of redirects.")
        break
    # Resolve the Location header against the current URL, since it may be relative
    redirect_url = urljoin(response.url, response.headers['Location'])
    print(f"Redirecting to: {redirect_url}")
    response = requests.get(redirect_url, allow_redirects=False)
    num_redirects += 1

# Process the final response (if the limit was hit, it is still a redirect)
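If you would rather fail loudly than print a message and carry on, the same limit can be wrapped in a function that raises. This is only a sketch: follow_with_limit and the RedirectLimitExceeded exception are hypothetical names introduced here, the session argument is assumed to be a requests.Session, and treating any repeated URL as a loop is a simplification.

```python
from urllib.parse import urljoin


class RedirectLimitExceeded(Exception):
    """Hypothetical error raised when the redirect chain is too long or loops."""


def follow_with_limit(session, url, max_redirects=10):
    """Follow redirects manually, raising instead of looping forever.

    `session` is assumed to be a requests.Session.
    """
    seen = {url}
    response = session.get(url, allow_redirects=False)

    for _ in range(max_redirects):
        if not response.is_redirect:
            return response
        # Location may be relative, so resolve it against the current URL.
        next_url = urljoin(response.url, response.headers["Location"])
        if next_url in seen:
            raise RedirectLimitExceeded(f"Redirect loop detected at {next_url}")
        seen.add(next_url)
        response = session.get(next_url, allow_redirects=False)

    # The budget is spent; accept the response only if the chain has ended.
    if response.is_redirect:
        raise RedirectLimitExceeded(f"More than {max_redirects} redirects")
    return response


# Example usage (needs network access and the requests package):
#   import requests
#   with requests.Session() as session:
#       final = follow_with_limit(session, "http://example.com", max_redirects=5)
#       print(final.status_code, final.url)
```

Some sites legitimately redirect back to the same URL after setting a cookie, so you may want to drop the seen-set check and rely on the counter alone in that situation.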
Remember to handle redirects with care, and always respect the terms of service of the websites you're interacting with when web scraping.