
What is the 'stream' parameter in Requests, and when should I use it?

The stream parameter in Python's requests library controls how response content is downloaded and handled. When set to True, it enables chunked, memory-efficient data processing instead of loading entire responses into memory at once.

How the stream Parameter Works

Default Behavior (stream=False):

  • Downloads the entire response body immediately
  • Stores the complete data in memory before returning the Response object
  • Simple, but memory-intensive for large files

Streaming Behavior (stream=True):

  • Returns the Response object as soon as the headers arrive, without downloading the body
  • Downloads data on demand as you iterate over it
  • Memory-efficient for large files and real-time data
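A minimal sketch of the difference (the URL is a placeholder):

import requests

# stream=False (default): the whole body is downloaded before get() returns
response = requests.get('https://example.com/data.bin')
print(len(response.content))  # full body is already in memory

# stream=True: get() returns once the headers arrive; the body is only
# downloaded as you iterate over it (or access response.content)
response = requests.get('https://example.com/data.bin', stream=True)
print(response.headers.get('Content-Length'))  # known from headers alone
response.close()  # nothing was read, so release the connection explicitly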

Basic File Download Example

import requests

# Download a large file with streaming: chunks are written to disk as they
# arrive instead of being buffered in memory first
response = requests.get('https://example.com/largefile.zip', stream=True)

if response.status_code == 200:
    with open('largefile.zip', 'wb') as file:
        for chunk in response.iter_content(chunk_size=8192):
            file.write(chunk)
response.close()  # release the connection once the download is done

Advanced Examples

Download with Progress Tracking

import requests
from tqdm import tqdm

def download_with_progress(url, filename):
    response = requests.get(url, stream=True)
    total_size = int(response.headers.get('content-length', 0))

    with open(filename, 'wb') as file, tqdm(
        desc=filename,
        total=total_size,
        unit='B',
        unit_scale=True,
        unit_divisor=1024,
    ) as progress_bar:
        for chunk in response.iter_content(chunk_size=8192):
            size = file.write(chunk)
            progress_bar.update(size)

download_with_progress('https://example.com/file.zip', 'file.zip')

Line-by-Line Text Processing

import requests

response = requests.get('https://example.com/logfile.txt', stream=True)

# Process large text files line by line; decode_unicode=True yields str
# objects instead of raw bytes
for line in response.iter_lines(decode_unicode=True):
    if line:  # iter_lines can yield empty lines; skip them
        process_log_line(line)  # placeholder for your own handler

JSON Streaming API

import requests
import json

def stream_json_api(url):
    # Assumes a newline-delimited JSON (NDJSON) stream: one object per line
    response = requests.get(url, stream=True)

    for line in response.iter_lines():
        if line:
            try:
                data = json.loads(line.decode('utf-8'))
                yield data
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of aborting the stream

# Process streaming JSON data
for item in stream_json_api('https://api.example.com/stream'):
    handle_data(item)

When to Use the stream Parameter

✅ Use stream=True when:

  1. Large File Downloads - Files that exceed available memory
  2. Progress Tracking - Need to show download progress to users
  3. Real-time Data - Streaming APIs or live data feeds
  4. Memory Constraints - Limited memory environments
  5. Processing on the Fly - Data processing during download

❌ Avoid stream=True when:

  1. Small Responses - Files under a few MB, where memory isn't a concern
  2. Simple API Calls - JSON responses that fit comfortably in memory (see the sketch below)
  3. Whole-Response Parsing - When you need the complete body in memory anyway, e.g. to call response.json()
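In those cases the default behavior is both simpler and safer, since the body is downloaded eagerly and the connection is released for you. A minimal sketch (the URL is a placeholder):

import requests

# For small responses, plain stream=False is the right tool: the body is
# fetched in one go and the connection is handled automatically
response = requests.get('https://api.example.com/config')
response.raise_for_status()
data = response.json()  # the complete body is already in memory
print(data)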

Important Considerations

Connection Management

import requests

# Always use context manager or explicitly close
response = requests.get('https://example.com/file', stream=True)
try:
    for chunk in response.iter_content(chunk_size=8192):
        process_chunk(chunk)
finally:
    response.close()  # Important: close the connection

# Or use with statement (recommended)
with requests.get('https://example.com/file', stream=True) as response:
    for chunk in response.iter_content(chunk_size=8192):
        process_chunk(chunk)

Optimal Chunk Sizes

# Different chunk sizes suit different use cases.
# Note: a streamed response body can only be iterated once, so each loop
# below assumes a fresh request.

# Small chunks for low-latency, real-time processing
response = requests.get(url, stream=True)
for chunk in response.iter_content(chunk_size=1024):  # 1 KB
    process_immediately(chunk)

# Larger chunks for file downloads (fewer iterations, less per-chunk overhead)
response = requests.get(url, stream=True)
for chunk in response.iter_content(chunk_size=65536):  # 64 KB
    write_to_file(chunk)

Error Handling

import requests
from requests.exceptions import RequestException

def safe_download(url, filename):
    try:
        with requests.get(url, stream=True, timeout=30) as response:
            response.raise_for_status()

            with open(filename, 'wb') as file:
                for chunk in response.iter_content(chunk_size=8192):
                    if chunk:  # Filter out keep-alive chunks
                        file.write(chunk)

    except RequestException as e:
        print(f"Download failed: {e}")
        return False
    return True
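
Usage mirrors the earlier download example (the URL is a placeholder):

if safe_download('https://example.com/file.zip', 'file.zip'):
    print("Download complete")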

Key Methods for Streaming

  • iter_content(chunk_size=1) - Iterate over the response body in chunks; the default chunk_size of 1 byte is rarely what you want, so pass a larger value
  • iter_lines(chunk_size=512) - Iterate over the response body line by line
  • raw.read(amt=None) - Read raw, undecoded bytes directly from the underlying urllib3 response
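
The first two are demonstrated above. As a minimal sketch of the third (the URL is a placeholder), note that raw.read bypasses requests' content decoding, so gzip- or deflate-compressed bodies arrive still compressed:

import requests

response = requests.get('https://example.com/file', stream=True)

# Read the first 1024 bytes straight from the underlying urllib3 response.
# Unlike iter_content(), this does NOT decompress gzip/deflate content.
first_kb = response.raw.read(1024)
print(len(first_kb))

response.close()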

The stream parameter is essential for building memory-efficient, scalable applications that handle large files or real-time data streams.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"


📖 Related Blog Guides

Expand your knowledge with these comprehensive tutorials:

  • Web Scraping with Python - Master HTTP requests for web scraping
  • Python Web Scraping Libraries - Requests library comprehensive guide
