How do I add headers to my requests in urllib3?

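Adding headers to HTTP requests in urllib3 is straightforward: pass a dictionary of header names and values as the headers argument of request(). Headers tell the server who the client is (User-Agent), carry credentials (Authorization), and declare which formats the client accepts (Accept), so they matter for web scraping and API access alike.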

Quick Example

import urllib3

http = urllib3.PoolManager()
headers = {'User-Agent': 'Mozilla/5.0 (compatible; Python urllib3)'}
response = http.request('GET', 'https://example.com', headers=headers)

Basic Setup

Installation and Import

# Install urllib3 if needed
# pip install urllib3

import urllib3

Creating Headers Dictionary

Headers are passed as a Python dictionary where keys are header names and values are header values:

headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/json',
    'Content-Type': 'application/json'
}

Common Header Examples

Web Scraping Headers

import urllib3

http = urllib3.PoolManager()

# Common web scraping headers
scraping_headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'Referer': 'https://google.com'
}

response = http.request('GET', 'https://example.com', headers=scraping_headers)
print(response.status)

API Authentication

# Bearer token authentication
api_headers = {
    'Authorization': 'Bearer your-api-token-here',
    'Content-Type': 'application/json',
    'Accept': 'application/json'
}

# API key authentication
api_key_headers = {
    'X-API-Key': 'your-api-key-here',
    'User-Agent': 'MyApp/1.0'
}

response = http.request('GET', 'https://api.example.com/data', headers=api_headers)
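
For endpoints that use HTTP Basic authentication instead of a bearer token, urllib3 includes a helper, urllib3.util.make_headers, that builds the Authorization header for you. The sketch below is only an illustration: the credentials, the application name, and the commented-out URL are placeholders.

```python
import urllib3
from urllib3.util import make_headers

# Placeholder credentials; in real code, read these from the environment.
auth_headers = make_headers(basic_auth='user:pass', user_agent='MyApp/1.0')

# make_headers returns a plain dict with lowercase keys,
# so it can be extended like any other headers dictionary.
auth_headers['Accept'] = 'application/json'

print(sorted(auth_headers))

# To send the request (placeholder endpoint):
# http = urllib3.PoolManager()
# response = http.request('GET', 'https://api.example.com/data', headers=auth_headers)
```

Header names are case-insensitive in HTTP, so the lowercase keys produced by make_headers are sent just like their capitalized equivalents.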

Custom Headers for Different Methods

import urllib3
import json

http = urllib3.PoolManager()

# GET request with headers
get_headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/json'
}
get_response = http.request('GET', 'https://api.example.com/users', headers=get_headers)

# POST request with JSON data
post_headers = {
    'Content-Type': 'application/json',
    'Accept': 'application/json',
    'User-Agent': 'MyApp/1.0'
}
# Encode the JSON payload to bytes so the body is always sent as UTF-8
post_data = json.dumps({'name': 'John', 'email': 'john@example.com'}).encode('utf-8')
post_response = http.request('POST', 'https://api.example.com/users',
                             headers=post_headers, body=post_data)
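
When several methods (POST, PUT, PATCH) send JSON, it can help to build the headers and the encoded body in one place. The following is a minimal sketch; build_json_request is a hypothetical helper, not part of urllib3, and the URL in the usage comment is a placeholder.

```python
import json

def build_json_request(payload, extra_headers=None):
    """Return (headers, body) for a JSON request. Hypothetical helper."""
    # Encode explicitly so non-ASCII characters are sent as UTF-8 bytes.
    body = json.dumps(payload).encode('utf-8')
    headers = {
        'Content-Type': 'application/json',
        'Accept': 'application/json',
    }
    if extra_headers:
        # Caller-supplied headers win over the JSON defaults.
        headers.update(extra_headers)
    return headers, body

json_headers, json_body = build_json_request(
    {'name': 'John'}, {'User-Agent': 'MyApp/1.0'}
)
print(json_headers)
print(json_body)

# Usage with urllib3 (placeholder URL):
# http.request('PUT', 'https://api.example.com/users/1',
#              headers=json_headers, body=json_body)
```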

Advanced Usage

Multiple Requests with Same Headers

import urllib3

http = urllib3.PoolManager()

# Define headers once for multiple requests
common_headers = {
    'User-Agent': 'MyBot/1.0',
    'Accept': 'application/json',
    'Authorization': 'Bearer your-token'
}

urls = ['https://api.example.com/users', 'https://api.example.com/posts']

for url in urls:
    response = http.request('GET', url, headers=common_headers)
    print(f"Status: {response.status}, URL: {url}")
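
An alternative to passing the same dictionary on every call is to set default headers on the PoolManager itself. One thing to keep in mind: a headers argument passed to request() replaces the pool defaults rather than being merged with them. The sketch below shows one way to handle that; with_defaults is a hypothetical helper and the URLs are placeholders.

```python
import urllib3

default_headers = {
    'User-Agent': 'MyBot/1.0',
    'Accept': 'application/json',
}

# Headers given to PoolManager are used for every request
# that does not pass its own headers argument.
http = urllib3.PoolManager(headers=default_headers)

def with_defaults(extra):
    """Merge per-request headers over the pool defaults (hypothetical helper)."""
    merged = dict(http.headers)
    merged.update(extra)
    return merged

request_headers = with_defaults({'Accept': 'text/csv'})
print(request_headers)

# Uses the pool defaults (placeholder URL):
# http.request('GET', 'https://api.example.com/users')
# Overrides Accept but keeps User-Agent (placeholder URL):
# http.request('GET', 'https://api.example.com/export', headers=request_headers)
```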

Dynamic Headers

import urllib3
import os

http = urllib3.PoolManager()

# Headers with environment variables
headers = {
    'User-Agent': 'MyApp/1.0',
    'Authorization': f"Bearer {os.getenv('API_TOKEN')}",
    'Accept': 'application/json'
}

# Add conditional headers
if os.getenv('DEBUG'):
    headers['X-Debug'] = 'true'

response = http.request('GET', 'https://api.example.com/data', headers=headers)
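
One caveat with os.getenv: it returns None when the variable is unset, which would produce the header value "Bearer None". A small builder function can fail early instead. This is a sketch under that assumption; build_headers is a hypothetical helper, and it accepts any mapping so it can be exercised without touching the real environment.

```python
import os

def build_headers(env=None):
    """Build request headers from an environment mapping (hypothetical helper)."""
    if env is None:
        env = os.environ
    token = env.get('API_TOKEN')
    if not token:
        # Fail early instead of sending the literal string "Bearer None".
        raise RuntimeError('API_TOKEN is not set')
    headers = {
        'User-Agent': 'MyApp/1.0',
        'Authorization': f'Bearer {token}',
        'Accept': 'application/json',
    }
    # Conditional header, added only when DEBUG is set to a non-empty value.
    if env.get('DEBUG'):
        headers['X-Debug'] = 'true'
    return headers

# A plain dict stands in for os.environ so the example is reproducible.
print(build_headers({'API_TOKEN': 'example-token', 'DEBUG': '1'}))
```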

Error Handling and Security

Proper Exception Handling

import urllib3
from urllib3.exceptions import MaxRetryError, TimeoutError

http = urllib3.PoolManager()
headers = {'User-Agent': 'MyApp/1.0'}

try:
    response = http.request('GET', 'https://example.com', 
                          headers=headers, timeout=10)
    print(f"Success: {response.status}")
    print(response.data.decode('utf-8'))
except MaxRetryError as e:
    print(f"Connection failed: {e}")
except TimeoutError as e:
    print(f"Request timed out: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
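
A single number such as timeout=10 applies the same limit to both the connect and the read phase, and MaxRetryError is raised only after urllib3's retries are exhausted. If finer control is useful, both behaviors can be configured once on the PoolManager alongside the default headers. The values below are illustrative assumptions, not recommendations.

```python
import urllib3

# Separate connect and read limits instead of a single number.
timeout = urllib3.Timeout(connect=5.0, read=10.0)

# Retry up to 3 times with backoff between attempts, including on the
# listed status codes (by default only for idempotent methods such as GET).
retries = urllib3.Retry(
    total=3,
    backoff_factor=0.5,
    status_forcelist=[429, 500, 502, 503, 504],
)

http = urllib3.PoolManager(
    headers={'User-Agent': 'MyApp/1.0'},
    timeout=timeout,
    retries=retries,
)

# Placeholder URL; every request inherits the headers, timeout and retries:
# response = http.request('GET', 'https://example.com')
```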

Secure Header Management

import urllib3
import os

# Use environment variables for sensitive data
headers = {
    'User-Agent': 'MyApp/1.0',
    'Authorization': f"Bearer {os.getenv('API_TOKEN')}",  # From environment
    'Accept': 'application/json'
}

# Don't hardcode sensitive information
# BAD: 'Authorization': 'Bearer abc123token456'
# GOOD: 'Authorization': f"Bearer {os.getenv('API_TOKEN')}"
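
Keeping tokens out of source code is only half of the job; they can still leak through logs when headers are printed for debugging. A minimal sketch of masking sensitive values before logging follows. redact_headers and the SENSITIVE_HEADERS set are hypothetical, and the list of names is an assumption you would adapt to your API.

```python
# Header names (lowercase) whose values should never appear in logs.
SENSITIVE_HEADERS = {'authorization', 'proxy-authorization', 'x-api-key', 'cookie'}

def redact_headers(headers):
    """Return a copy of headers that is safe to log (hypothetical helper)."""
    return {
        name: '***' if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }

request_headers = {
    'User-Agent': 'MyApp/1.0',
    'Authorization': 'Bearer example-token',
}

# The original dictionary is left untouched; only the logged copy is masked.
print(redact_headers(request_headers))
```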

Best Practices

Set a descriptive User-Agent: Some servers reject requests that carry no User-Agent or only a generic library default.
Use environment variables: Keep API keys and tokens out of source code and version control.
Handle exceptions: Wrap requests in try/except blocks.
Verify TLS certificates: Keep certificate verification enabled in production.
Rate limiting: Respect server rate limits and add delays between requests if needed.

import urllib3
import os

# Recommended production setup: require certificate verification.
# When ca_certs is not given, urllib3 uses the system's default CA certificates.
http = urllib3.PoolManager(
    cert_reqs='CERT_REQUIRED'
)

headers = {
    'User-Agent': 'MyApp/1.0 (contact@example.com)',
    'Accept': 'application/json',
    'Authorization': f"Bearer {os.getenv('API_TOKEN')}"
}

try:
    response = http.request('GET', 'https://api.example.com/data', 
                          headers=headers, timeout=30)
    if response.status == 200:
        data = response.data.decode('utf-8')
        print(data)
    else:
        print(f"Request failed with status: {response.status}")
except Exception as e:
    print(f"Error: {e}")
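
For the rate-limiting point, servers that throttle clients often say how long to wait in a Retry-After response header. The sketch below reads that header and sleeps accordingly. Both functions are hypothetical helpers, the one-second default is an assumption, and only the numeric form of Retry-After is handled (an HTTP date falls back to the default).

```python
import time

def retry_after_seconds(headers, default=1.0):
    """Read a numeric Retry-After value from response headers (hypothetical helper).

    Falls back to `default` when the header is missing or is not a number.
    """
    value = headers.get('Retry-After')
    if value is None:
        return default
    try:
        return float(max(0, int(value)))
    except ValueError:
        return default

def polite_sleep(headers, default=1.0):
    """Pause between requests, honouring Retry-After when present."""
    time.sleep(retry_after_seconds(headers, default))

print(retry_after_seconds({'Retry-After': '5'}))
print(retry_after_seconds({}))

# Usage between urllib3 requests:
# polite_sleep(response.headers)
```

Plain dictionaries are used here for illustration and are case-sensitive; the headers attribute of a urllib3 response looks names up case-insensitively.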

With these patterns, your urllib3 requests carry the headers that scraping targets and APIs expect, while credentials stay out of your source code.
