What are HTTP Conditional Requests and When Should I Use Them?
HTTP conditional requests are a powerful mechanism that allows clients to make requests that are only processed if certain conditions are met. These requests help optimize network traffic, reduce server load, and improve application performance by avoiding unnecessary data transfers when resources haven't changed.
Understanding HTTP Conditional Requests
Conditional requests use special HTTP headers to specify conditions that must be satisfied before the server processes the request. If the conditions aren't met, the server responds with a status code indicating that the request wasn't processed, typically without transferring the requested resource.
The most common conditional request headers include:
- If-Modified-Since: Only process if the resource has been modified since the specified date
- If-None-Match: Only process if the resource's ETag doesn't match the provided value
- If-Match: Only process if the resource's ETag matches the provided value
- If-Unmodified-Since: Only process if the resource hasn't been modified since the specified date
- If-Range: Used with range requests to ensure partial content is still valid
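To make the header semantics concrete, here is a minimal sketch of how a server might evaluate these preconditions, following the evaluation order described in RFC 9110. The function name `evaluate_preconditions`, its parameters, and the plain-dict header representation are assumptions made for illustration, not part of any real framework, and the comma split for tag lists is a simplification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def _opaque(tag):
    """Drop the weak marker (W/) so two tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def _tag_list(value):
    """Split a comma-separated list of entity tags (simplified parsing)."""
    return [t.strip() for t in value.split(",") if t.strip()]


def _http_date(value):
    """Parse an HTTP date; return None when it is invalid or has no zone."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    return parsed if parsed.tzinfo is not None else None


def evaluate_preconditions(method, headers, current_etag, last_modified):
    """Return 304 or 412 if a precondition stops the request, else None.

    Assumptions of this sketch: `headers` is a dict of request headers,
    `current_etag` is the quoted tag of the stored resource, and
    `last_modified` is a timezone-aware datetime.
    """
    is_read = method in ("GET", "HEAD")

    if "If-Match" in headers:
        value = headers["If-Match"].strip()
        # If-Match uses strong comparison: weak tags never match.
        matched = value == "*" or (
            not current_etag.startswith("W/") and current_etag in _tag_list(value)
        )
        if not matched:
            return 412
    elif "If-Unmodified-Since" in headers:
        since = _http_date(headers["If-Unmodified-Since"])
        if since is not None and last_modified > since:
            return 412

    if "If-None-Match" in headers:
        value = headers["If-None-Match"].strip()
        # If-None-Match uses weak comparison and takes precedence
        # over If-Modified-Since when both are present.
        tags = [_opaque(t) for t in _tag_list(value)]
        if value == "*" or _opaque(current_etag) in tags:
            return 304 if is_read else 412
    elif is_read and "If-Modified-Since" in headers:
        since = _http_date(headers["If-Modified-Since"])
        if since is not None and last_modified <= since:
            return 304

    return None


resource_time = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
status = evaluate_preconditions("GET", {"If-None-Match": '"v1"'}, '"v1"', resource_time)
print(status)  # 304
```

A return value of `None` means no precondition blocked the request, so the server would go on to process it normally.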
Common HTTP Status Codes for Conditional Requests
When using conditional requests, you'll encounter these specific status codes:
- 304 Not Modified: The resource hasn't changed, so the cached version can be used
- 412 Precondition Failed: The condition specified in the request headers wasn't met
- 416 Range Not Satisfiable: Used with range requests when the requested range lies outside the current size of the resource
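The interaction between `If-Range` and the 416 status can be sketched as a small server-side helper. This is an illustrative simplification: `resolve_range` and its parameters are hypothetical names, it handles only a single `bytes=` range, it supports only the ETag form of `If-Range` (not the date form), and it assumes `size` is greater than zero.

```python
import re

# Matches a single byte range such as "bytes=0-99", "bytes=500-" or "bytes=-200".
_SINGLE_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(range_header, size, if_range=None, current_etag=None):
    """Return (status, start, end) for a single-range request.

    200 means "send the whole resource", 206 means "send bytes start..end",
    and 416 means the range cannot be satisfied.
    """
    full = (200, 0, size - 1)

    # If-Range must match the current strong ETag exactly; otherwise the
    # Range header is ignored and the full representation is sent.
    if if_range is not None:
        if if_range.startswith("W/") or if_range != current_etag:
            return full

    match = _SINGLE_RANGE.match(range_header.strip())
    if match is None or match.groups() == ("", ""):
        # Unparseable Range headers are ignored rather than rejected.
        return full

    first, last = match.groups()
    if first == "":
        # Suffix form: the final `last` bytes of the resource.
        suffix_length = int(last)
        if suffix_length == 0:
            return 416, None, None
        return 206, max(size - suffix_length, 0), size - 1

    start = int(first)
    if last and int(last) < start:
        # An inverted range is syntactically invalid, so it is ignored.
        return full
    if start >= size:
        return 416, None, None
    end = min(int(last), size - 1) if last else size - 1
    return 206, start, end


print(resolve_range("bytes=0-99", 1000))    # (206, 0, 99)
print(resolve_range("bytes=1000-", 1000))   # (416, None, None)
```

When the `If-Range` validator no longer matches, falling back to a full 200 response is what keeps a client from stitching together pieces of two different versions.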
Implementing Conditional Requests in Python
Here's how to implement conditional requests using Python's requests library:
import requests

# Example 1: Using If-Modified-Since
def fetch_with_last_modified(url, last_modified=None):
    headers = {}
    if last_modified:
        headers['If-Modified-Since'] = last_modified
    response = requests.get(url, headers=headers)
    if response.status_code == 304:
        print("Resource not modified, using cached version")
        # Always return a (content, last_modified) pair so callers can unpack it
        return None, last_modified
    elif response.status_code == 200:
        print("Resource updated, processing new content")
        # Store the Last-Modified header for future requests
        last_modified = response.headers.get('Last-Modified')
        return response.text, last_modified
    # Any other status is unexpected here; raise for 4xx/5xx responses
    response.raise_for_status()
    return response.text, None
# Example 2: Using ETags for cache validation
def fetch_with_etag(url, etag=None):
    headers = {}
    if etag:
        headers['If-None-Match'] = etag
    response = requests.get(url, headers=headers)
    if response.status_code == 304:
        print("ETag matches, content unchanged")
        # Always return a (content, etag) pair; the existing ETag is still valid
        return None, etag
    elif response.status_code == 200:
        new_etag = response.headers.get('ETag')
        print(f"Content updated, new ETag: {new_etag}")
        return response.text, new_etag
    # Any other status is unexpected here; raise for 4xx/5xx responses
    response.raise_for_status()
    return response.text, None
# Example 3: Safe updates with If-Match
def update_resource_safely(url, data, etag):
    headers = {
        'If-Match': etag,
        'Content-Type': 'application/json'
    }
    response = requests.put(url, json=data, headers=headers)
    if response.status_code == 412:
        print("Precondition failed - resource was modified by another client")
        return False
    elif response.status_code in (200, 204):
        # A successful PUT may answer with 200 (with a body) or 204 (no body)
        print("Resource updated successfully")
        return True
    return False
# Usage example
url = "https://api.example.com/data"
content, last_modified = fetch_with_last_modified(url)
# Later, check if the resource has been updated
updated_content, new_last_modified = fetch_with_last_modified(url, last_modified)
Implementing Conditional Requests in JavaScript
Here's how to use conditional requests in JavaScript with the Fetch API:
// Example 1: Using If-Modified-Since with fetch
async function fetchWithLastModified(url, lastModified = null) {
  const headers = {};
  if (lastModified) {
    headers['If-Modified-Since'] = lastModified;
  }
  try {
    const response = await fetch(url, { headers });
    if (response.status === 304) {
      console.log('Resource not modified, using cached version');
      return { cached: true };
    }
    if (response.ok) {
      const content = await response.text();
      const newLastModified = response.headers.get('Last-Modified');
      console.log('Resource updated, processing new content');
      return { content, lastModified: newLastModified };
    }
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  } catch (error) {
    console.error('Request failed:', error);
    throw error;
  }
}
// Example 2: ETag-based conditional requests
async function fetchWithETag(url, etag = null) {
  const headers = {};
  if (etag) {
    headers['If-None-Match'] = etag;
  }
  const response = await fetch(url, { headers });
  if (response.status === 304) {
    return { cached: true, etag };
  }
  if (response.ok) {
    const content = await response.text();
    const newETag = response.headers.get('ETag');
    return { content, etag: newETag };
  }
  throw new Error(`HTTP ${response.status}: ${response.statusText}`);
}
// Example 3: Safe updates with conditional requests
async function updateResourceSafely(url, data, etag) {
  const response = await fetch(url, {
    method: 'PUT',
    headers: {
      'If-Match': etag,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(data)
  });
  if (response.status === 412) {
    throw new Error('Precondition failed - resource was modified by another client');
  }
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  }
  return await response.json();
}
// Usage with async/await
(async () => {
  try {
    const url = 'https://api.example.com/data';
    let result = await fetchWithLastModified(url);
    if (!result.cached) {
      console.log('Processing new content:', result.content);
      // Later, check for updates
      const updateResult = await fetchWithLastModified(url, result.lastModified);
      if (updateResult.cached) {
        console.log('No updates available');
      }
    }
  } catch (error) {
    console.error('Error:', error);
  }
})();
Practical Use Cases for Conditional Requests
1. Web Scraping Optimization
When scraping websites regularly, conditional requests can significantly reduce bandwidth and improve efficiency:
import requests
from datetime import datetime

class ConditionalScraper:
    def __init__(self):
        self.cache = {}

    def scrape_with_cache(self, url):
        cache_key = url
        headers = {}
        # Check if we have cached data
        cached_data = self.cache.get(cache_key)
        if cached_data:
            # Only send validators the server actually provided
            if cached_data.get('last_modified'):
                headers['If-Modified-Since'] = cached_data['last_modified']
            if cached_data.get('etag'):
                headers['If-None-Match'] = cached_data['etag']
        response = requests.get(url, headers=headers)
        if response.status_code == 304:
            print(f"Using cached data for {url}")
            return self.cache[cache_key]['content']
        if response.status_code == 200:
            # Update cache
            self.cache[cache_key] = {
                'content': response.text,
                'last_modified': response.headers.get('Last-Modified'),
                'etag': response.headers.get('ETag'),
                'cached_at': datetime.now().isoformat()
            }
            print(f"Updated cache for {url}")
            return response.text
        # Raise for 4xx/5xx responses
        response.raise_for_status()

# Usage
scraper = ConditionalScraper()
content = scraper.scrape_with_cache('https://example.com/api/data')
2. API Rate Limiting and Efficiency
Conditional requests help you stay within API rate limits while ensuring you get fresh data when needed:
class APIClient {
  constructor(baseURL, apiKey) {
    this.baseURL = baseURL;
    this.apiKey = apiKey;
    this.cache = new Map();
  }

  async get(endpoint) {
    const url = `${this.baseURL}${endpoint}`;
    const cacheKey = url;
    const headers = {
      'Authorization': `Bearer ${this.apiKey}`,
      'Accept': 'application/json'
    };
    // Add conditional headers if we have cached data
    const cached = this.cache.get(cacheKey);
    if (cached) {
      if (cached.etag) {
        headers['If-None-Match'] = cached.etag;
      }
      if (cached.lastModified) {
        headers['If-Modified-Since'] = cached.lastModified;
      }
    }
    const response = await fetch(url, { headers });
    if (response.status === 304) {
      console.log(`Using cached data for ${endpoint}`);
      return cached.data;
    }
    if (response.ok) {
      const data = await response.json();
      // Update cache
      this.cache.set(cacheKey, {
        data,
        etag: response.headers.get('ETag'),
        lastModified: response.headers.get('Last-Modified'),
        cachedAt: new Date().toISOString()
      });
      return data;
    }
    throw new Error(`API request failed: ${response.status} ${response.statusText}`);
  }
}
3. Preventing Lost Updates
When multiple clients might modify the same resource, conditional requests prevent data corruption:
import requests

def safe_update_user_profile(user_id, updates, current_etag):
    """
    Safely update a user profile using conditional requests
    to prevent lost updates in concurrent scenarios.
    """
    url = f"https://api.example.com/users/{user_id}"
    headers = {
        'If-Match': current_etag,
        'Content-Type': 'application/json'
    }
    response = requests.put(url, json=updates, headers=headers)
    if response.status_code == 412:
        # Precondition failed - someone else modified the resource
        print("Conflict detected! Resource was modified by another user.")
        # Fetch the latest version so the caller can merge and retry
        latest_response = requests.get(url)
        # Raise instead of silently returning None if the re-fetch fails
        latest_response.raise_for_status()
        return {
            'success': False,
            'conflict': True,
            'latest_data': latest_response.json(),
            'latest_etag': latest_response.headers.get('ETag'),
            'message': 'Please resolve conflicts and try again'
        }
    if response.status_code == 200:
        return {
            'success': True,
            'data': response.json(),
            'etag': response.headers.get('ETag')
        }
    # Raise for 4xx/5xx; report any other status as a non-conflict failure
    response.raise_for_status()
    return {
        'success': False,
        'conflict': False,
        'message': f'Unexpected status {response.status_code}'
    }
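The read, modify, write cycle behind lost-update protection can also be exercised without a network. The sketch below uses a hypothetical in-memory stand-in (`InMemoryResource`) for a server that enforces `If-Match`, plus a hypothetical `update_with_retry` helper that re-fetches and re-applies the change after each 412. Neither name comes from a real library, and automatically re-applying a change is only appropriate when the modification is safe to repeat against newer data.

```python
import hashlib
import json


class InMemoryResource:
    """Hypothetical stand-in for a server-side resource guarded by If-Match."""

    def __init__(self, data):
        self.data = dict(data)

    @property
    def etag(self):
        # Derive a strong ETag from the current content.
        body = json.dumps(self.data, sort_keys=True).encode()
        return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

    def get(self):
        return 200, dict(self.data), self.etag

    def put(self, data, if_match):
        # Reject the write when the caller's ETag is stale.
        if if_match != self.etag:
            return 412, None, self.etag
        self.data = dict(data)
        return 200, dict(self.data), self.etag


def update_with_retry(resource, modify, max_attempts=3):
    """Re-fetch and re-apply `modify` until the conditional PUT succeeds."""
    for _ in range(max_attempts):
        _, current, etag = resource.get()
        status, updated, _ = resource.put(modify(current), if_match=etag)
        if status == 200:
            return updated
    raise RuntimeError("gave up after repeated 412 responses")


profile = InMemoryResource({"name": "Ada", "visits": 1})


def add_visit(doc):
    doc["visits"] += 1
    return doc


print(update_with_retry(profile, add_visit))  # {'name': 'Ada', 'visits': 2}
```

Because every attempt starts from a fresh GET, a concurrent write is never overwritten; the change is applied on top of it instead.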
Best Practices for Conditional Requests
1. Always Handle 304 Responses
When implementing conditional requests, always check for 304 responses and handle them appropriately:
def handle_conditional_response(response, cached_content=None):
    if response.status_code == 304:
        if cached_content is None:
            raise ValueError("Received 304 but no cached content available")
        return cached_content
    if response.ok:
        return response.content
    response.raise_for_status()
2. Combine Multiple Validation Methods
Use both ETag and Last-Modified headers when available for more robust caching:
function buildConditionalHeaders(cachedData) {
  const headers = {};
  if (cachedData.etag) {
    headers['If-None-Match'] = cachedData.etag;
  }
  if (cachedData.lastModified) {
    headers['If-Modified-Since'] = cachedData.lastModified;
  }
  return headers;
}
3. Implement Proper Cache Storage
Store validation data persistently for long-term efficiency:
import sqlite3

class PersistentCache:
    def __init__(self, db_path):
        self.db_path = db_path
        self.init_db()

    def init_db(self):
        conn = sqlite3.connect(self.db_path)
        conn.execute('''
            CREATE TABLE IF NOT EXISTS cache (
                url TEXT PRIMARY KEY,
                content TEXT,
                etag TEXT,
                last_modified TEXT,
                cached_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
            )
        ''')
        conn.commit()
        conn.close()

    def get_cached(self, url):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('SELECT content, etag, last_modified FROM cache WHERE url = ?', (url,))
        result = cursor.fetchone()
        conn.close()
        if result:
            return {
                'content': result[0],
                'etag': result[1],
                'last_modified': result[2]
            }
        return None

    def update_cache(self, url, content, etag=None, last_modified=None):
        conn = sqlite3.connect(self.db_path)
        conn.execute('''
            INSERT OR REPLACE INTO cache (url, content, etag, last_modified)
            VALUES (?, ?, ?, ?)
        ''', (url, content, etag, last_modified))
        conn.commit()
        conn.close()
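One way to connect such a cache to the request flow is a small revalidation helper. The sketch below is an assumption-laden illustration: `fetch_cached` is a hypothetical function, `http_get(url, headers)` is a hypothetical callable returning `(status, response_headers, body)` so the logic can run without a network, and `DictCache` is an in-memory stand-in exposing the same `get_cached` and `update_cache` methods as the class above.

```python
class DictCache:
    """In-memory stand-in with the same two methods as PersistentCache."""

    def __init__(self):
        self._rows = {}

    def get_cached(self, url):
        return self._rows.get(url)

    def update_cache(self, url, content, etag=None, last_modified=None):
        self._rows[url] = {
            "content": content,
            "etag": etag,
            "last_modified": last_modified,
        }


def fetch_cached(url, cache, http_get):
    """Revalidate `url` against `cache` and return the current body."""
    entry = cache.get_cached(url)
    headers = {}
    if entry:
        # Send only the validators that were actually stored.
        if entry.get("etag"):
            headers["If-None-Match"] = entry["etag"]
        if entry.get("last_modified"):
            headers["If-Modified-Since"] = entry["last_modified"]

    status, response_headers, body = http_get(url, headers)

    if status == 304 and entry:
        # Nothing changed: serve the stored copy.
        return entry["content"]
    if status == 200:
        cache.update_cache(
            url,
            body,
            etag=response_headers.get("ETag"),
            last_modified=response_headers.get("Last-Modified"),
        )
        return body
    raise RuntimeError(f"unexpected status {status} for {url}")


def fake_get(url, headers):
    """Toy transport: answers 304 when the client already holds version v1."""
    if headers.get("If-None-Match") == '"v1"':
        return 304, {}, ""
    return 200, {"ETag": '"v1"'}, "hello"


cache = DictCache()
print(fetch_cached("https://example.com/a", cache, fake_get))  # hello
print(fetch_cached("https://example.com/a", cache, fake_get))  # hello
```

The first call stores the body and its ETag; the second call sends `If-None-Match` and is answered from the cache after a 304.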
Common Pitfalls and Solutions
1. Clock Skew Issues
When using If-Modified-Since, be aware of clock differences between client and server:
from datetime import datetime, timezone

def safe_last_modified_header(last_modified_str):
    """
    Safely handle Last-Modified headers with potential clock skew
    """
    try:
        # Parse the server's Last-Modified header
        server_time = datetime.strptime(last_modified_str, '%a, %d %b %Y %H:%M:%S GMT')
        server_time = server_time.replace(tzinfo=timezone.utc)
        # Compare with the local clock to detect a timestamp in the future
        current_time = datetime.now(timezone.utc)
        if server_time > current_time:
            # Server clock is ahead, use current time instead
            return current_time.strftime('%a, %d %b %Y %H:%M:%S GMT')
        return last_modified_str
    except (TypeError, ValueError):
        # Missing or invalid date, don't use conditional request
        return None
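One caveat with the approach above is that `strptime` and `strftime` interpret `%a` and `%b` according to the process locale. A locale-independent alternative, sketched here with the standard library's `email.utils` helpers, parses and formats HTTP dates directly. The function names `parse_http_date` and `format_http_date` are illustrative, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def parse_http_date(value):
    """Parse an HTTP date into an aware UTC datetime, or None if invalid."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None
    if parsed.tzinfo is None:
        # No usable zone information: treat the value as invalid.
        return None
    return parsed.astimezone(timezone.utc)


def format_http_date(moment):
    """Format an aware datetime as an HTTP date such as '... GMT'."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


stamp = parse_http_date("Wed, 21 Oct 2015 07:28:00 GMT")
print(stamp.isoformat())        # 2015-10-21T07:28:00+00:00
print(format_http_date(stamp))  # Wed, 21 Oct 2015 07:28:00 GMT
```

Returning `None` for unparseable input lets the caller skip the conditional header entirely, which matches the fallback used in the function above.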
2. Weak vs Strong ETags
Understand the difference between weak and strong ETags:
function parseETag(etag) {
  if (!etag) return null;
  const isWeak = etag.startsWith('W/');
  const value = isWeak ? etag.substring(2) : etag;
  return {
    value: value.replace(/"/g, ''), // Remove quotes
    isWeak,
    original: etag
  };
}

function shouldUseETag(etag, operation) {
  const parsed = parseETag(etag);
  if (!parsed) return false;
  // For range requests or PUT operations, only use strong ETags
  if (operation === 'range' || operation === 'put') {
    return !parsed.isWeak;
  }
  // For GET requests, both weak and strong ETags are fine
  return true;
}
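The practical difference between the two kinds of tag shows up in how they are compared. As a minimal sketch of the strong and weak comparison functions defined in RFC 9110 (the helper names `split_etag`, `strong_match`, and `weak_match` are illustrative): strong comparison requires both tags to be strong and identical, while weak comparison ignores the `W/` prefix.

```python
def split_etag(etag):
    """Return (is_weak, opaque_tag) for an entity tag such as W/"abc"."""
    etag = etag.strip()
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_match(a, b):
    """Strong comparison: both tags must be strong and identical."""
    weak_a, tag_a = split_etag(a)
    weak_b, tag_b = split_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """Weak comparison: the opaque parts must match, W/ prefixes are ignored."""
    return split_etag(a)[1] == split_etag(b)[1]


print(strong_match('W/"1"', '"1"'))  # False
print(weak_match('W/"1"', '"1"'))    # True
```

`If-Match` and `If-Range` rely on strong comparison, whereas `If-None-Match` uses weak comparison, which is why weak ETags remain useful for cache validation on GET requests.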
Monitoring and Debugging
When implementing conditional requests, proper monitoring helps identify issues:
import logging
from functools import wraps

def log_conditional_requests(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        url = args[0] if args else kwargs.get('url', 'unknown')
        # Log the request
        logging.info(f"Making conditional request to {url}")
        try:
            result = func(*args, **kwargs)
            # Log the outcome
            if hasattr(result, 'get') and result.get('cached'):
                logging.info(f"Cache hit for {url}")
            else:
                logging.info(f"Cache miss for {url}")
            return result
        except Exception as e:
            logging.error(f"Conditional request failed for {url}: {e}")
            raise
    return wrapper

@log_conditional_requests
def fetch_with_conditions(url, **kwargs):
    # Your conditional request implementation
    pass
Conclusion
HTTP conditional requests are essential for building efficient, scalable web applications and scraping systems. They reduce bandwidth usage, minimize server load, and help prevent data conflicts in concurrent scenarios. When you use browser automation tools like Puppeteer to handle AJAX requests, understanding conditional requests becomes even more valuable for optimizing network interactions.
By implementing proper caching strategies with ETags and Last-Modified headers, you can significantly improve your application's performance while respecting server resources. Remember to handle edge cases like clock skew and weak ETags, and always implement proper monitoring to ensure your conditional requests work as expected.
Whether you're building a web scraper, API client, or complex web application, mastering conditional requests will make your HTTP interactions more efficient and reliable. The examples and patterns shown here provide a solid foundation for implementing these powerful HTTP features in your projects.