How do I handle persistent connections with the Requests library?

Understanding Connection Management in Requests

The Python requests library is designed for simplicity. Its module-level helpers such as requests.get() and requests.post() each create a temporary session, send the request, and close that session afterward, so a new TCP connection is established for every call and nothing is reused between calls.

For better performance when making multiple requests to the same server, use a Session object, which enables connection pooling and persistent (keep-alive) connections.

Using Session Objects for Persistent Connections

A Session object allows you to persist parameters across multiple requests and automatically handles connection pooling through the underlying urllib3 library.
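
As a minimal offline sketch of what persisting parameters means, the snippet below builds a request through a session without sending it. prepare_request() merges the session-level settings into the request. The URL, header values, and query parameters are placeholders chosen for illustration.

```python
import requests

session = requests.Session()
# Session-level defaults (placeholder values)
session.headers.update({"User-Agent": "MyApp/1.0"})
session.params = {"api_key": "demo"}

# Build, but do not send, a request so the merged result can be inspected
request = requests.Request(
    "GET",
    "https://api.example.com/users",
    headers={"Accept": "application/json"},
    params={"page": 2},
)
prepared = session.prepare_request(request)

print(prepared.headers["User-Agent"])  # header set on the session
print(prepared.headers["Accept"])      # header set on this request
print(prepared.url)                    # query string carries both sets of params
session.close()
```

Per-request values take precedence over session values when the same key appears in both.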

Basic Session Usage

import requests

# Create and use a session
with requests.Session() as session:
    # Set common headers for all requests
    session.headers.update({
        'User-Agent': 'MyApp/1.0',
        'Accept': 'application/json'
    })

    # First request establishes connection
    response1 = session.get('https://api.example.com/users')
    print(f"Status: {response1.status_code}")

    # Subsequent requests to the same host reuse the pooled connection
    response2 = session.get('https://api.example.com/posts')
    response3 = session.post('https://api.example.com/data', json={'key': 'value'})

Manual Session Management

If you need more control over when the session is closed:

import requests

# Create session manually
session = requests.Session()

try:
    # Configure session
    session.headers.update({'Authorization': 'Bearer token123'})

    # Make multiple requests
    for i in range(10):
        # Session has no timeout setting that requests honors,
        # so the timeout is passed on each call
        response = session.get(f'https://api.example.com/item/{i}', timeout=30)
        print(f"Item {i}: {response.json()}")

finally:
    # Always close the session to release connections
    session.close()
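
requests does not read a timeout attribute from the session, so a session-wide default has to be applied some other way. One common pattern is a small Session subclass. TimeoutSession below is a hypothetical helper written for this example, not part of requests, and the 30-second default is an arbitrary choice.

```python
import requests


class TimeoutSession(requests.Session):
    """Hypothetical helper: a Session that applies a default timeout."""

    def __init__(self, timeout=30):
        super().__init__()
        self.default_timeout = timeout

    def request(self, method, url, **kwargs):
        # Fill in the timeout only when the caller did not pass one
        kwargs.setdefault("timeout", self.default_timeout)
        return super().request(method, url, **kwargs)
```

Because get(), post(), and the other verb methods all go through request(), a call such as TimeoutSession(timeout=10).get(url) would use the default, while an explicit timeout= argument still overrides it.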

Advanced Configuration

Connection Pool Settings

You can customize the connection pool behavior:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Create session with custom adapter
session = requests.Session()

# Configure retry strategy
retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504]
)

# Create adapter with connection pooling settings
adapter = HTTPAdapter(
    pool_connections=10,  # Number of per-host connection pools to cache
    pool_maxsize=20,      # Maximum connections kept in each pool
    max_retries=retry_strategy
)

# Mount adapter for HTTP and HTTPS
session.mount("http://", adapter)
session.mount("https://", adapter)

# Use the configured session
response = session.get('https://api.example.com/data')
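
Adapters are matched by URL prefix, so pool settings can differ per host. The sketch below assumes a hypothetical busy host, api.example.com, that warrants a larger pool than everything else. The pool sizes are arbitrary, and get_adapter() is used only to show which adapter a URL would be routed to, without sending anything.

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()

# Default pool for all HTTPS hosts
default_adapter = HTTPAdapter(pool_connections=10, pool_maxsize=10)
# Larger pool for one busy host (hypothetical hostname)
busy_adapter = HTTPAdapter(pool_connections=1, pool_maxsize=50)

session.mount("https://", default_adapter)
session.mount("https://api.example.com/", busy_adapter)

# The longest matching prefix decides which adapter handles a URL
chosen = session.get_adapter("https://api.example.com/data")
other = session.get_adapter("https://other.example.org/")
print(chosen is busy_adapter, other is default_adapter)
session.close()
```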

Session with Authentication

Sessions are particularly useful for maintaining authentication state:

import requests

with requests.Session() as session:
    # Login and store session cookies
    login_data = {'username': 'user', 'password': 'pass'}
    session.post('https://example.com/login', data=login_data)

    # Subsequent requests automatically include authentication cookies
    profile = session.get('https://example.com/profile')
    dashboard = session.get('https://example.com/dashboard')

    # Make API calls with persistent authentication
    api_response = session.get('https://example.com/api/data')
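
To see the cookie handling without a real login endpoint, the following sketch places a cookie in the session's jar by hand, as a stand-in for what a login response would store, and then inspects a prepared request. The cookie name, value, and domains are placeholders.

```python
import requests

session = requests.Session()

# Stand-in for the cookie a login response would set (placeholder values)
session.cookies.set("sessionid", "abc123", domain="example.com", path="/")

# Prepare, but do not send, a follow-up request to the same site
prepared = session.prepare_request(
    requests.Request("GET", "https://example.com/profile")
)
print(prepared.headers.get("Cookie"))

# A request to an unrelated domain does not receive the cookie
elsewhere = session.prepare_request(
    requests.Request("GET", "https://other.example.org/")
)
print(elsewhere.headers.get("Cookie"))
session.close()
```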

Performance Benefits

Using sessions provides several advantages:

  • Connection Reuse: TCP connections are reused for multiple requests to the same host
  • Cookie Persistence: Cookies are automatically maintained across requests
  • Header Persistence: Common headers are set once and used for all requests
  • Better Performance: Avoids repeating the TCP handshake (and, for HTTPS, the TLS handshake) on every request

Performance Comparison

import time
import requests

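def without_session():
    # perf_counter() is a monotonic clock suited to measuring intervals
    start = time.perf_counter()
    for i in range(10):
        # Each call opens and closes its own connection.
        # Note: /delay/1 itself adds about 1 s of server-side wait per request.
        requests.get('https://httpbin.org/delay/1')
    return time.perf_counter() - start

def with_session():
    start = time.perf_counter()
    with requests.Session() as session:
        for i in range(10):
            # Calls after the first reuse the pooled connection
            session.get('https://httpbin.org/delay/1')
    return time.perf_counter() - start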

print(f"Without session: {without_session():.2f}s")
print(f"With session: {with_session():.2f}s")
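
Timings against a remote service vary with network conditions. As an alternative sketch, the script below counts TCP connections directly using a throwaway server on 127.0.0.1 built from the standard library. CountingHandler and count_connections are names invented for this example, and the sketch assumes no proxy is configured for local addresses.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class CountingHandler(BaseHTTPRequestHandler):
    """Local test handler that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # allow keep-alive
    connections = 0
    lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request
        with CountingHandler.lock:
            CountingHandler.connections += 1
        super().setup()

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def count_connections(fetch, n=5):
    """Run fetch(url, n) against a local server; return connections used."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        fetch(url, n)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


def fetch_without_session(url, n):
    for _ in range(n):
        requests.get(url, timeout=5)


def fetch_with_session(url, n):
    with requests.Session() as session:
        for _ in range(n):
            session.get(url, timeout=5)


print("Connections without session:", count_connections(fetch_without_session))
print("Connections with session:", count_connections(fetch_with_session))
```

With five requests each, the first count should equal the number of requests, while the session variant should need only a single connection.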

Important Considerations

  • Connection Pooling: Sessions use connection pooling but don't guarantee a single persistent connection
  • Server Support: The server must support HTTP keep-alive for connection reuse to be effective
  • Session State: Sessions maintain cookies and authentication but don't persist application state on the server
  • Thread Safety: Session objects are not documented as thread-safe; a common precaution is to use a separate session per thread for concurrent requests
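
One way to follow the thread-safety advice is to keep one session per thread with threading.local. This is a minimal sketch: get_session, fetch_status, and fetch_all are hypothetical helper names, and the sessions are left open for the life of their threads rather than closed explicitly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import requests

_thread_local = threading.local()


def get_session():
    """Return a Session owned by the calling thread, creating it on first use."""
    if not hasattr(_thread_local, "session"):
        _thread_local.session = requests.Session()
    return _thread_local.session


def fetch_status(url):
    # Each worker thread reuses its own session and connection pool
    response = get_session().get(url, timeout=10)
    return response.status_code


def fetch_all(urls, workers=4):
    """Fetch every URL concurrently and return the status codes in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_status, urls))
```

Calling fetch_all with a list of URLs would spread the requests over the worker threads, with each thread still getting connection reuse from its own session.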

Alternative Libraries

For use cases that need lower-level connection control, asynchronous I/O, or protocols other than plain HTTP request/response:

  • http.client: Lower-level HTTP client in Python's standard library
  • aiohttp: Asynchronous HTTP client with persistent connection support
  • httpx: Modern HTTP client with async support and connection pooling
  • websockets: For WebSocket connections requiring persistent bidirectional communication
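
# Example with httpx for comparison
import asyncio
import httpx

async def with_httpx():
    # AsyncClient keeps a connection pool for as long as it is open
    async with httpx.AsyncClient() as client:
        response = await client.get('https://api.example.com/data')
        return response.json()

# Run the coroutine from synchronous code
data = asyncio.run(with_httpx())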
