Is there a way to follow redirects manually using Requests?

Yes, the Python requests library allows you to manually handle redirects by disabling automatic redirection. This gives you complete control over the redirect process.

Disabling Automatic Redirects

Set allow_redirects=False to prevent requests from automatically following redirects:

import requests

# Disable automatic redirect following
response = requests.get('http://httpbin.org/redirect/1', allow_redirects=False)

print(f"Status Code: {response.status_code}")  # 302
print(f"Location: {response.headers.get('Location')}")  # Redirect URL

Basic Manual Redirect Handling

Check for redirects and follow them manually:

import requests
from urllib.parse import urljoin

response = requests.get('http://httpbin.org/redirect/3', allow_redirects=False)

# is_redirect is True for 301, 302, 303, 307 and 308 responses that carry a Location header
if response.is_redirect or response.is_permanent_redirect:
    # The Location header may be relative, so resolve it against the request URL
    redirect_url = urljoin(response.url, response.headers.get('Location'))
    print(f"Redirecting to: {redirect_url}")

    # Follow the redirect manually
    response = requests.get(redirect_url)
    print(f"Final status: {response.status_code}")

Complete Redirect Chain Following

Handle multiple redirects with proper loop protection:

import requests
from urllib.parse import urljoin

def follow_redirects_manually(url, max_redirects=10):
    """
    Follow redirects manually with loop protection
    """
    redirects_followed = 0
    redirect_chain = []

    while redirects_followed < max_redirects:
        response = requests.get(url, allow_redirects=False)
        redirect_chain.append(url)

        # Check if response is a redirect
        if not (response.is_redirect or response.is_permanent_redirect):
            # Not a redirect, we're done
            break

        # Get redirect location
        location = response.headers.get('Location')
        if not location:
            break

        # Handle relative URLs
        url = urljoin(url, location)
        redirects_followed += 1

        print(f"Redirect {redirects_followed}: {response.status_code} -> {url}")

    if redirects_followed >= max_redirects:
        print(f"Stopped after {max_redirects} redirects")

    return response, redirect_chain

# Usage example
final_response, chain = follow_redirects_manually('http://httpbin.org/redirect/5')
print(f"Final URL: {final_response.url}")
print(f"Redirect chain: {' -> '.join(chain)}")

Advanced Use Cases

Conditional Redirect Following

import requests
from urllib.parse import urljoin, urlparse

def selective_redirect_handler(url, max_redirects=10):
    """
    Follow redirects only to trusted domains
    """
    trusted_domains = ['example.com', 'httpbin.org']

    response = requests.get(url, allow_redirects=False)

    while (response.is_redirect or response.is_permanent_redirect) and max_redirects > 0:
        # Resolve relative Location values against the current response URL
        redirect_url = urljoin(response.url, response.headers.get('Location'))

        # Check if the redirect points to a trusted domain
        domain = urlparse(redirect_url).netloc

        if domain not in trusted_domains:
            print(f"Refusing to redirect to untrusted domain: {domain}")
            break

        print(f"Following redirect to trusted domain: {redirect_url}")
        response = requests.get(redirect_url, allow_redirects=False)
        max_redirects -= 1

    return response
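
As a quick usage check (assuming httpbin.org stays in the trusted list above):

# httpbin.org is in the trusted list, so the chain is followed to the end
response = selective_redirect_handler('http://httpbin.org/redirect/2')
print(f"Final status: {response.status_code}")  # 200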

Preserving Headers Across Redirects

import requests
from urllib.parse import urljoin

def redirect_with_custom_headers(url, headers=None, max_redirects=10):
    """
    Manually follow redirects while preserving custom headers
    """
    if not headers:
        headers = {'User-Agent': 'My Custom Bot 1.0'}

    response = requests.get(url, headers=headers, allow_redirects=False)

    while (response.is_redirect or response.is_permanent_redirect) and max_redirects > 0:
        redirect_url = urljoin(response.url, response.headers.get('Location'))
        print(f"Following redirect with custom headers: {redirect_url}")

        # Re-send the same headers on every hop; requests' automatic handling
        # would drop sensitive headers such as Authorization on cross-host redirects
        response = requests.get(redirect_url, headers=headers, allow_redirects=False)
        max_redirects -= 1

    return response
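
As a quick check, httpbin's /redirect/2 chain ends at /get, which echoes the request headers back as JSON, so you can verify the custom User-Agent survived every hop:

response = redirect_with_custom_headers('http://httpbin.org/redirect/2')
print(response.json()['headers'].get('User-Agent'))  # 'My Custom Bot 1.0'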

HTTP Status Code Reference

Common redirect status codes you'll encounter:

  • 301: Moved Permanently
  • 302: Found (temporary redirect)
  • 303: See Other
  • 307: Temporary Redirect (method preserved)
  • 308: Permanent Redirect (method preserved)

You can branch on these codes directly:

import requests

response = requests.get('http://httpbin.org/status/301', allow_redirects=False)

# Check specific redirect types
if response.status_code in (301, 308):
    print("Permanent redirect")
elif response.status_code in (302, 303, 307):
    print("Temporary redirect")

Best Practices

  1. Always set a maximum redirect limit to prevent infinite loops
  2. Handle relative URLs using urllib.parse.urljoin()
  3. Validate redirect destinations before following them
  4. Preserve necessary headers when making redirect requests
  5. Log redirect chains for debugging purposes (a combined sketch follows this list)
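
A minimal sketch combining these practices (fetch_with_redirect_policy is an illustrative helper name, not part of requests):

import requests
from urllib.parse import urljoin, urlparse

def fetch_with_redirect_policy(url, max_redirects=10, headers=None):
    """Apply the practices above: a hop limit, urljoin, validation, headers, logging."""
    session = requests.Session()
    chain = [url]
    for _ in range(max_redirects):
        response = session.get(url, headers=headers, allow_redirects=False)
        if not (response.is_redirect or response.is_permanent_redirect):
            return response, chain  # done: not a redirect
        location = response.headers.get('Location')
        if not location:
            return response, chain  # malformed redirect without a Location header
        url = urljoin(url, location)  # practice 2: resolve relative Locations
        if urlparse(url).scheme not in ('http', 'https'):  # practice 3: validate target
            raise ValueError(f"Refusing non-HTTP redirect: {url}")
        print(f"{response.status_code} -> {url}")  # practice 5: log the chain
        chain.append(url)
    raise RuntimeError(f"Gave up after {max_redirects} redirects")  # practice 1

final, chain = fetch_with_redirect_policy('http://httpbin.org/redirect/3')
print(' -> '.join(chain))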

Manual redirect handling is particularly useful for web scraping when you need to:

  • Track all URLs in a redirect chain
  • Apply different logic based on redirect destinations
  • Preserve authentication or custom headers across redirects
  • Implement custom redirect policies

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
