The `stream` parameter in Python's requests library controls how response content is downloaded and handled. When set to `True`, it enables chunked, memory-efficient data processing instead of loading the entire response into memory at once.
## How the Stream Parameter Works

**Default behavior (`stream=False`):**

- Downloads the entire response body immediately
- Stores the complete data in memory before returning the Response object
- Simple, but memory-intensive for large files

**Streaming behavior (`stream=True`):**

- Returns the Response object as soon as the headers arrive, without downloading the body
- Downloads data on demand as you iterate over it
- Memory-efficient for large files and real-time data
## Basic File Download Example

```python
import requests

# Download a large file with streaming
response = requests.get('https://example.com/largefile.zip', stream=True)

if response.status_code == 200:
    with open('largefile.zip', 'wb') as file:
        for chunk in response.iter_content(chunk_size=8192):
            file.write(chunk)
```
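An alternative to the chunk loop is copying `response.raw` (the underlying urllib3 stream) straight into the file with `shutil.copyfileobj`. One caveat: `response.raw` does not apply `Content-Encoding` decoding (e.g. gzip), unlike `iter_content()`. The sketch below substitutes an `io.BytesIO` buffer for `response.raw` so it runs without a network:

```python
import io
import shutil

# Stand-in for response.raw: any already-open binary stream works the same way
source = io.BytesIO(b"payload " * 1000)
destination = io.BytesIO()  # stand-in for a file opened with open(..., 'wb')

# copyfileobj copies in fixed-size chunks internally, so the whole
# payload is never held in memory at once
shutil.copyfileobj(source, destination)

print(destination.getvalue() == b"payload " * 1000)  # True
```

With a real streamed response, the same call is `shutil.copyfileobj(response.raw, file)`; prefer `iter_content()` whenever the server may compress the body.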
## Advanced Examples

### Download with Progress Tracking

```python
import requests
from tqdm import tqdm

def download_with_progress(url, filename):
    response = requests.get(url, stream=True)
    total_size = int(response.headers.get('content-length', 0))

    with open(filename, 'wb') as file, tqdm(
        desc=filename,
        total=total_size,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as progress_bar:
        for chunk in response.iter_content(chunk_size=8192):
            size = file.write(chunk)
            progress_bar.update(size)

download_with_progress('https://example.com/file.zip', 'file.zip')
```
### Line-by-Line Text Processing

```python
import requests

response = requests.get('https://example.com/logfile.txt', stream=True)

# Process a large text file line by line
for line in response.iter_lines(decode_unicode=True):
    if line:  # Filter out empty lines
        process_log_line(line)
```
### JSON Streaming API

```python
import requests
import json

def stream_json_api(url):
    response = requests.get(url, stream=True)
    for line in response.iter_lines():
        if line:
            try:
                data = json.loads(line.decode('utf-8'))
                yield data
            except json.JSONDecodeError:
                continue

# Process streaming JSON data
for item in stream_json_api('https://api.example.com/stream'):
    handle_data(item)
```
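The per-line decoding above can be exercised without a network by feeding `iter_lines()`-style byte strings into the same parsing logic. A minimal sketch (`parse_ndjson_lines` is a hypothetical helper name, not part of requests):

```python
import json

def parse_ndjson_lines(lines):
    """Decode newline-delimited JSON records, skipping blank and malformed
    lines, mirroring the loop body used with response.iter_lines() above."""
    for line in lines:
        if not line:
            continue
        try:
            yield json.loads(line.decode('utf-8'))
        except json.JSONDecodeError:
            continue

# Simulated iter_lines() output: one bytes object per line, including noise
raw_lines = [b'{"id": 1}', b'', b'not json', b'{"id": 2}']
records = list(parse_ndjson_lines(raw_lines))
print(records)  # [{'id': 1}, {'id': 2}]
```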
## When to Use the Stream Parameter

✅ Use `stream=True` when:

- **Large File Downloads** - Files that exceed available memory
- **Progress Tracking** - You need to show download progress to users
- **Real-time Data** - Streaming APIs or live data feeds
- **Memory Constraints** - Limited-memory environments
- **Processing on the Fly** - Data processing during download

❌ Avoid `stream=True` when:

- **Small Responses** - Files under a few MB, where memory isn't a concern
- **Simple API Calls** - JSON responses that fit comfortably in memory
- **Whole-Response Processing** - You need the complete body in hand before parsing it
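The rules of thumb above can be folded into a tiny decision helper. Everything here is illustrative: `should_stream` and the 10 MB threshold are arbitrary choices for the sketch, not anything requests provides; in practice the `Content-Length` would come from a HEAD request or prior knowledge of the endpoint.

```python
def should_stream(content_length, threshold=10 * 1024 * 1024):
    """Heuristic: stream when the body is large or its size is unknown
    (a missing Content-Length often means chunked transfer encoding)."""
    return content_length is None or content_length > threshold

print(should_stream(4096))         # False - small JSON response
print(should_stream(None))         # True  - unknown size, e.g. a live feed
print(should_stream(500_000_000))  # True  - large file download
```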
## Important Considerations

### Connection Management

```python
import requests

# Always close the connection: use a context manager or call close() explicitly
response = requests.get('https://example.com/file', stream=True)
try:
    for chunk in response.iter_content(chunk_size=8192):
        process_chunk(chunk)
finally:
    response.close()  # Important: release the connection back to the pool

# Or use a with statement (recommended)
with requests.get('https://example.com/file', stream=True) as response:
    for chunk in response.iter_content(chunk_size=8192):
        process_chunk(chunk)
```
### Optimal Chunk Sizes

```python
import requests

# Different chunk sizes suit different use cases.
# Note: a streamed body can only be consumed once, so each loop
# below needs its own response.

# Small chunks for low-latency, real-time processing
response = requests.get(url, stream=True)
for chunk in response.iter_content(chunk_size=1024):  # 1 KB
    process_immediately(chunk)

# Larger chunks for faster file downloads
response = requests.get(url, stream=True)
for chunk in response.iter_content(chunk_size=65536):  # 64 KB
    write_to_file(chunk)
```
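The trade-off is iteration count versus per-chunk memory. A small sketch makes the arithmetic concrete; `iter_chunks` is a hypothetical stand-in that reads a local binary stream with the same access pattern `iter_content()` gives over a streamed body:

```python
import io

def iter_chunks(stream, chunk_size):
    """Yield fixed-size chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

data = b"x" * 200_000  # ~200 KB payload stand-in
small = list(iter_chunks(io.BytesIO(data), 1024))    # many small reads
large = list(iter_chunks(io.BytesIO(data), 65536))   # few large reads
print(len(small), len(large))  # 196 4
```

Smaller chunks mean more loop iterations (and more per-chunk overhead) but lower latency per chunk; larger chunks amortize that overhead at the cost of holding more data in memory at a time.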
### Error Handling

```python
import requests
from requests.exceptions import RequestException

def safe_download(url, filename):
    try:
        with requests.get(url, stream=True, timeout=30) as response:
            response.raise_for_status()
            with open(filename, 'wb') as file:
                for chunk in response.iter_content(chunk_size=8192):
                    if chunk:  # Filter out keep-alive chunks
                        file.write(chunk)
    except RequestException as e:
        print(f"Download failed: {e}")
        return False
    return True
```
## Key Methods for Streaming

- `iter_content(chunk_size=1)` - Iterate over the response body in chunks
- `iter_lines(chunk_size=512)` - Iterate over the response line by line
- `raw.read(amt=None)` - Read raw bytes directly from the underlying urllib3 response
The `stream` parameter is essential for building memory-efficient, scalable applications that handle large files or real-time data streams.