How do I access the raw response content using Requests?

The Python requests library provides multiple ways to access response content depending on your needs. Here's a comprehensive guide to accessing different types of response data:

1. Text Content (.text)

Use .text to get the response body as a decoded string, suitable for HTML, JSON, or plain-text responses:

import requests

response = requests.get('https://httpbin.org/get')
text_content = response.text
print(text_content)  # Decoded string content

# Check encoding used
print(f"Encoding: {response.encoding}")

The .text attribute decodes the response body using the charset declared in the Content-Type header. For text responses that declare no charset, requests assumes ISO-8859-1 (the HTTP default); if no encoding can be determined from the headers at all, it falls back to the encoding guessed from the body, exposed as response.apparent_encoding.
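
If the declared encoding looks wrong, you can compare it against the encoding requests guesses from the body and override it before reading .text. A minimal sketch (the httpbin endpoint is just a convenient test URL):

import requests

response = requests.get('https://httpbin.org/encoding/utf8')

print(response.encoding)           # Encoding taken from the Content-Type header
print(response.apparent_encoding)  # Encoding guessed from the body by charset detection

# Override the header-declared encoding before accessing .text
response.encoding = response.apparent_encoding
print(response.text[:100])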

2. Binary Content (.content)

Use .content to get the response body as raw bytes, ideal for images, file downloads, or any case where you need the exact bytes the server sent:

import requests

# Download an image
response = requests.get('https://httpbin.org/image/png')
binary_content = response.content

# Save binary content to file
with open('downloaded_image.png', 'wb') as file:
    file.write(binary_content)

# Or work with the bytes directly
print(f"Content length: {len(binary_content)} bytes")
print(f"First 10 bytes: {binary_content[:10]}")

3. Streaming Raw Content (.raw)

Use streaming for large files, or when you want to consume the response before requests reads it into memory. Both approaches below require passing stream=True to the request; otherwise the full body is downloaded immediately:

Basic Streaming with iter_content()

import requests

# Stream large file efficiently
response = requests.get('https://httpbin.org/stream-bytes/1024', stream=True)

with open('streamed_file.bin', 'wb') as file:
    for chunk in response.iter_content(chunk_size=8192):
        if chunk:  # Filter out keep-alive chunks
            file.write(chunk)
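
For line-oriented streams (for example newline-delimited JSON), iter_lines() works like iter_content() but yields one line at a time. A minimal sketch using httpbin's streaming endpoint, which emits one JSON object per line:

import requests

response = requests.get('https://httpbin.org/stream/5', stream=True)

for line in response.iter_lines(decode_unicode=True):
    if line:  # Skip keep-alive blank lines
        print(line)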

Raw Socket Access

import requests

response = requests.get('https://httpbin.org/get', stream=True)

# Access the underlying urllib3 response object
raw_response = response.raw

# Read everything at once (memory intensive for large responses);
# decode_content=True asks urllib3 to undo gzip/deflate encoding
# raw_data = raw_response.read(decode_content=True)

# Or read in chunks (recommended). The raw stream can only be consumed once,
# so use either the single read above or this loop, not both.
chunk_size = 1024
with open('raw_output.txt', 'wb') as file:
    while True:
        chunk = raw_response.read(chunk_size, decode_content=True)
        if not chunk:
            break
        file.write(chunk)

4. JSON Content (.json())

For JSON responses, use the built-in JSON decoder:

import requests

response = requests.get('https://httpbin.org/json')
json_data = response.json()  # Automatically parses JSON
print(json_data)

# Equivalent to:
# import json
# json_data = json.loads(response.text)
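
If the body is not valid JSON, response.json() raises an exception (requests.exceptions.JSONDecodeError in recent versions of requests; older versions raise a plain ValueError), so it is worth guarding the call when the content type is not guaranteed. A minimal sketch:

import requests

response = requests.get('https://httpbin.org/html')  # Returns HTML, not JSON

try:
    data = response.json()
except ValueError:  # JSONDecodeError subclasses ValueError, so this covers both cases
    data = None
    print("Response body is not valid JSON")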

5. Advanced Content Handling

Custom Encoding

import requests

response = requests.get('https://example.com')

# Force specific encoding if auto-detection fails
response.encoding = 'utf-8'
text_content = response.text

# Or decode manually from bytes
manual_decode = response.content.decode('utf-8')

Content Type Detection

import requests

response = requests.get('https://httpbin.org/html')

# Check content type before processing
content_type = response.headers.get('content-type', '')
print(f"Content-Type: {content_type}")

if 'application/json' in content_type:
    data = response.json()
elif 'text/' in content_type:
    data = response.text
else:
    data = response.content  # Handle as binary

6. Memory-Efficient Large File Handling

import requests
from pathlib import Path

def download_large_file(url, filename):
    """Download large files without loading entire content into memory"""
    with requests.get(url, stream=True) as response:
        response.raise_for_status()

        total_size = int(response.headers.get('content-length', 0))
        downloaded = 0

        with open(filename, 'wb') as file:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    file.write(chunk)
                    downloaded += len(chunk)
                    # Optional: show progress
                    if total_size > 0:
                        percent = (downloaded / total_size) * 100
                        print(f"\rDownloaded: {percent:.1f}%", end="")

# Usage
download_large_file('https://httpbin.org/stream-bytes/1048576', 'large_file.bin')
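
An alternative pattern is to copy response.raw straight into a file with shutil.copyfileobj. The raw stream is not decompressed by default, so decode_content should be enabled if the server may send gzip- or deflate-encoded bodies. A minimal sketch using the same test URL:

import shutil
import requests

with requests.get('https://httpbin.org/stream-bytes/1048576', stream=True) as response:
    response.raise_for_status()
    response.raw.decode_content = True  # Let urllib3 undo gzip/deflate encoding
    with open('large_file.bin', 'wb') as file:
        shutil.copyfileobj(response.raw, file)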

Key Differences Summary

| Method | Use Case | Memory Usage | Returns |
|--------|----------|--------------|---------|
| .text | Text content (HTML, JSON, etc.) | Full content in memory | Decoded string |
| .content | Binary data, exact bytes | Full content in memory | Raw bytes |
| .raw | Large files, streaming | Minimal (with streaming) | urllib3.HTTPResponse object |
| .json() | JSON responses | Full content in memory | Parsed Python object |

Best Practices

  • Use .text for small text responses
  • Use .content for binary files that fit in memory
  • Use streaming (.raw or iter_content()) for large files
  • Always use stream=True for large downloads
  • Check content type headers when content format is unknown
  • Handle encoding explicitly when automatic detection fails

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
