What is the difference between Session and regular requests in Requests?
When working with Python's Requests library for web scraping or API interactions, understanding the difference between Session objects and regular requests is crucial for building efficient and robust applications. This guide explores the fundamental differences, use cases, and best practices for both approaches.
Understanding Regular Requests
Regular requests in the Requests library are stateless, one-off HTTP requests. Each request is independent and doesn't maintain any connection or state information between calls.
Basic Regular Request Example
import requests
# Each request is independent
response1 = requests.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
response2 = requests.get('https://httpbin.org/cookies')
print(response2.json()) # Will not show the cookie from response1
Understanding Session Objects
A Session object allows you to persist certain parameters across requests. It maintains cookies, headers, and other request parameters throughout its lifetime, making it ideal for scenarios where you need to maintain state.
Basic Session Example
import requests
# Create a session object
session = requests.Session()
# Set a cookie and maintain it across requests
response1 = session.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
response2 = session.get('https://httpbin.org/cookies')
print(response2.json()) # Will show the cookie from response1
Key Differences
1. Cookie Persistence
Regular Requests:
import requests
# Cookies are not maintained between requests
requests.post('https://example.com/login', data={'username': 'user', 'password': 'pass'})
response = requests.get('https://example.com/protected') # May fail because the login cookie was not kept
Session Requests:
import requests
session = requests.Session()
# Login and maintain authentication cookie
session.post('https://example.com/login', data={'username': 'user', 'password': 'pass'})
response = session.get('https://example.com/protected') # Succeeds with maintained session
2. Connection Pooling
Sessions provide connection pooling and reuse, which significantly improves performance when making multiple requests to the same host.
import requests
import time
# Regular requests - creates new connection each time
start_time = time.time()
for i in range(10):
    requests.get('https://httpbin.org/delay/1')
regular_time = time.time() - start_time
# Session requests - reuses connections
session = requests.Session()
start_time = time.time()
for i in range(10):
    session.get('https://httpbin.org/delay/1')
session_time = time.time() - start_time
print(f"Regular requests: {regular_time:.2f}s")
print(f"Session requests: {session_time:.2f}s")
3. Header Persistence
Sessions allow you to set default headers that apply to all requests within the session.
import requests
session = requests.Session()
session.headers.update({
    'User-Agent': 'My Web Scraper 1.0',
    'Accept': 'application/json',
    'Authorization': 'Bearer your-token-here'
})
# All requests will include these headers
response1 = session.get('https://api.example.com/users')
response2 = session.get('https://api.example.com/posts')
4. Configuration Persistence
Sessions persist configuration such as proxies, SSL verification, and default headers or parameters. Note that timeouts are not included: a requests.Session has no timeout attribute that is applied automatically, so timeouts still need to be passed per request.
import requests
session = requests.Session()
session.proxies = {'http': 'http://proxy.example.com:8080'}
session.verify = False  # Disable SSL verification (avoid in production)
# Proxy and verification settings are inherited; the timeout is passed per request
response = session.get('https://example.com/api/data', timeout=30)
Performance Comparison
The performance benefits of using Sessions become apparent when making multiple requests:
import requests
import time
from contextlib import contextmanager
@contextmanager
def timer():
    start = time.time()
    yield
    end = time.time()
    print(f"Execution time: {end - start:.2f} seconds")
urls = ['https://httpbin.org/get'] * 20
# Regular requests
with timer():
    for url in urls:
        response = requests.get(url)
# Session requests
session = requests.Session()
with timer():
    for url in urls:
        response = session.get(url)
session.close() # Clean up
Advanced Session Features
Custom Adapters
Sessions support custom adapters for enhanced functionality:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retries():
    session = requests.Session()
    retry_strategy = Retry(
        total=3,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["HEAD", "GET", "OPTIONS"]  # 'method_whitelist' in older urllib3 versions
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
session = create_session_with_retries()
response = session.get('https://unreliable-api.example.com/data')
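The same mount() mechanism can also be used to tune the connection pool behind a session. The sketch below is a minimal illustration rather than a recommended configuration; the pool sizes are assumptions to adjust for your own workload.
import requests
from requests.adapters import HTTPAdapter
session = requests.Session()
# pool_connections: how many host pools to cache; pool_maxsize: connections kept alive per host
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=20)
session.mount('https://', adapter)
response = session.get('https://httpbin.org/get')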
Authentication with Sessions
Sessions excel at handling authentication workflows:
import requests
class APIClient:
    def __init__(self, base_url):
        self.base_url = base_url
        self.session = requests.Session()

    def login(self, username, password):
        response = self.session.post(
            f"{self.base_url}/auth/login",
            json={'username': username, 'password': password}
        )
        if response.status_code == 200:
            token = response.json()['token']
            self.session.headers.update({'Authorization': f'Bearer {token}'})

    def get_user_data(self, user_id):
        return self.session.get(f"{self.base_url}/users/{user_id}")

    def close(self):
        self.session.close()
# Usage
client = APIClient('https://api.example.com')
client.login('username', 'password')
user_data = client.get_user_data(123)
client.close()
Web Scraping Use Cases
For web scraping scenarios, Sessions are particularly valuable when dealing with authenticated content or when you need to maintain browsing state, much like handling browser sessions in Puppeteer.
Scraping with Login
import requests
from bs4 import BeautifulSoup
def scrape_protected_content(username, password):
    session = requests.Session()
    # Get login page to extract CSRF token
    login_page = session.get('https://example.com/login')
    soup = BeautifulSoup(login_page.content, 'html.parser')
    csrf_token = soup.find('input', {'name': 'csrf_token'})['value']
    # Login with credentials
    login_data = {
        'username': username,
        'password': password,
        'csrf_token': csrf_token
    }
    session.post('https://example.com/login', data=login_data)
    # Access protected content
    protected_page = session.get('https://example.com/dashboard')
    return protected_page.content
content = scrape_protected_content('myuser', 'mypass')
Best Practices
1. Always Close Sessions
import requests
# Method 1: Explicit closing
session = requests.Session()
try:
    response = session.get('https://example.com')
    # Process response
finally:
    session.close()
# Method 2: Context manager (recommended)
with requests.Session() as session:
response = session.get('https://example.com')
# Session automatically closed when exiting context
2. Set Appropriate Timeouts
session = requests.Session()
# Session objects have no timeout attribute that Requests applies automatically,
# so pass a (connect timeout, read timeout) tuple on each call
response = session.get('https://example.com', timeout=(3.05, 27))
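If you want a session-wide default anyway, a common workaround is a small Session subclass that fills in a timeout whenever the caller omits one. This is a minimal sketch, not a built-in Requests feature; the class name and default values are illustrative.
import requests
class TimeoutSession(requests.Session):
    DEFAULT_TIMEOUT = (3.05, 27)  # illustrative (connect, read) defaults
    def request(self, method, url, **kwargs):
        # Only apply the default if the caller did not pass a timeout explicitly
        kwargs.setdefault('timeout', self.DEFAULT_TIMEOUT)
        return super().request(method, url, **kwargs)
session = TimeoutSession()
response = session.get('https://example.com')  # uses the default timeout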
3. Handle Exceptions Properly
import requests
from requests.exceptions import RequestException, ConnectionError, Timeout
session = requests.Session()
try:
    response = session.get('https://example.com', timeout=10)
    response.raise_for_status()
except ConnectionError:
    print("Failed to connect to the server")
except Timeout:
    print("Request timed out")
except RequestException as e:
    print(f"Request failed: {e}")
When to Use Each Approach
Use Regular Requests When:
- Making a single, one-off request
- Requests are completely independent
- No need to maintain state between requests
- Simple API calls or file downloads
Use Sessions When:
- Making multiple requests to the same host
- Need to maintain cookies or authentication
- Require persistent headers or configuration
- Performance is important (connection reuse)
- Building complex web scraping workflows
Conclusion
Understanding the differences between Session and regular requests in Python's Requests library is essential for efficient web development and scraping. Sessions provide significant advantages in terms of performance, state management, and configuration persistence, making them the preferred choice for most multi-request scenarios. While regular requests are suitable for simple, one-off operations, Sessions offer the flexibility and efficiency needed for complex applications.
For more advanced scenarios involving JavaScript-heavy websites, consider exploring how to handle authentication in Puppeteer or how to monitor network requests in Puppeteer for comprehensive web scraping solutions.