
How do I handle gzip or deflate compressed responses in Guzzle?

Guzzle automatically handles gzip and deflate compressed HTTP responses, making content compression transparent to developers. This feature reduces bandwidth usage and improves performance when making HTTP requests.

How Guzzle Handles Compression

By default, Guzzle:

  • Automatically adds an Accept-Encoding: gzip, deflate header to requests
  • Detects compressed responses via the Content-Encoding header
  • Decompresses gzip and deflate content transparently
  • Returns the uncompressed content, ready for immediate use
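
Conceptually, the decode step amounts to the following (an illustration using PHP's zlib functions, not Guzzle's actual internals):

```php
// Simulate a compressed response: a Content-Encoding header plus a
// gzip-compressed body, then decode based on the header value.
$headers = ['Content-Encoding' => 'gzip'];
$rawBody = gzencode('{"ok":true}');   // what would travel over the wire

$body = ($headers['Content-Encoding'] ?? '') === 'gzip'
    ? gzdecode($rawBody)
    : $rawBody;

echo $body . "\n";  // {"ok":true}
```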

Basic Usage (Automatic Decompression)

The simplest approach is to let Guzzle handle compression automatically:

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

$client = new Client();

try {
    // Guzzle automatically handles compression by default
    $response = $client->request('GET', 'https://httpbin.org/gzip');

    // Content is automatically decompressed
    $body = $response->getBody()->getContents();

    // Check whether the response was compressed. Note: when the default
    // cURL handler decodes the body, it removes Content-Encoding and stores
    // the original value in the X-Encoded-Content-Encoding header instead.
    $encoding = $response->getHeader('Content-Encoding')
        ?: $response->getHeader('X-Encoded-Content-Encoding');
    if (!empty($encoding)) {
        echo "Response was compressed with: " . implode(', ', $encoding) . "\n";
    }

    echo "Decompressed content: " . $body;
} catch (RequestException $e) {
    echo "Error: " . $e->getMessage();
}

Explicit Control Over Decompression

You can explicitly control the decompression behavior:

// Explicitly enable automatic decompression (default behavior)
$response = $client->request('GET', 'https://httpbin.org/gzip', [
    'decode_content' => true
]);

// Disable automatic decompression
$response = $client->request('GET', 'https://httpbin.org/gzip', [
    'decode_content' => false
]);
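
Per Guzzle's request-options documentation, decode_content also accepts a string: the value is sent as the Accept-Encoding header, and the response body is still decoded automatically.

```php
// Pass a string to advertise a specific encoding while keeping
// automatic decoding enabled.
$response = $client->request('GET', 'https://httpbin.org/gzip', [
    'decode_content' => 'gzip'  // sent as the Accept-Encoding header
]);
```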

Working with Raw Compressed Data

Sometimes you need access to the raw compressed data:

$client = new Client();

$response = $client->request('GET', 'https://httpbin.org/gzip', [
    'decode_content' => false  // Get raw compressed data
]);

$compressedBody = $response->getBody()->getContents();
$contentEncoding = $response->getHeaderLine('Content-Encoding');

echo "Content-Encoding: " . $contentEncoding . "\n";
echo "Compressed size: " . strlen($compressedBody) . " bytes\n";

// Manually decompress if needed
if ($contentEncoding === 'gzip') {
    $decompressed = gzdecode($compressedBody);
    echo "Decompressed size: " . strlen($decompressed) . " bytes\n";
} elseif ($contentEncoding === 'deflate') {
    $decompressed = gzinflate($compressedBody);
    echo "Decompressed size: " . strlen($decompressed) . " bytes\n";
}
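
The deflate case has a well-known quirk: per the HTTP spec, Content-Encoding: deflate means zlib-wrapped data (RFC 1950), but some servers send raw DEFLATE (RFC 1951). A defensive decoder (a sketch; decodeBody is our own helper, not a Guzzle API) tries both:

```php
// A defensive manual decoder. Per the HTTP spec, "deflate" means
// zlib-wrapped data (RFC 1950), but some servers send raw DEFLATE
// (RFC 1951), so we try both formats.
function decodeBody(string $body, string $encoding): string
{
    switch ($encoding) {
        case 'gzip':
            $out = gzdecode($body);
            break;
        case 'deflate':
            $out = @gzuncompress($body);   // zlib-wrapped (spec-compliant)
            if ($out === false) {
                $out = @gzinflate($body);  // raw DEFLATE (seen in the wild)
            }
            break;
        default:
            return $body;                  // identity or unknown: pass through
    }
    if ($out === false) {
        throw new RuntimeException("Failed to decode '{$encoding}' body");
    }
    return $out;
}

// Round-trip checks with locally generated data:
$original = str_repeat('hello compression ', 50);
var_dump(decodeBody(gzencode($original), 'gzip') === $original);      // bool(true)
var_dump(decodeBody(gzcompress($original), 'deflate') === $original); // bool(true)
var_dump(decodeBody(gzdeflate($original), 'deflate') === $original);  // bool(true)
```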

Advanced Configuration

Custom Accept-Encoding Header

You can customize which compression methods to accept:

$response = $client->request('GET', 'https://example.com', [
    'headers' => [
        'Accept-Encoding' => 'gzip'  // Only accept gzip compression
    ]
]);

Handling Different Compression Types

$client = new Client();

try {
    $response = $client->request('GET', 'https://httpbin.org/deflate');

    // With automatic decoding, the cURL handler may move the original value
    // to X-Encoded-Content-Encoding, so check both headers.
    $contentEncoding = $response->getHeaderLine('Content-Encoding')
        ?: $response->getHeaderLine('X-Encoded-Content-Encoding');
    $body = $response->getBody()->getContents();

    switch ($contentEncoding) {
        case 'gzip':
            echo "Response was gzip compressed\n";
            break;
        case 'deflate':
            echo "Response was deflate compressed\n";
            break;
        case 'br':
            echo "Response was Brotli compressed\n";
            break;
        default:
            echo "Response was not compressed\n";
    }

    echo "Content: " . substr($body, 0, 100) . "...\n";
} catch (RequestException $e) {
    echo "Error: " . $e->getMessage();
}

Best Practices for Web Scraping

When scraping websites, compression handling is crucial for performance:

$client = new Client([
    'timeout' => 30,
    'verify' => false,  // For development only
]);

$urls = [
    'https://example.com/page1',
    'https://example.com/page2',
    'https://example.com/page3'
];

foreach ($urls as $url) {
    try {
        $response = $client->request('GET', $url, [
            'headers' => [
                'User-Agent' => 'Mozilla/5.0 (compatible; WebScraper/1.0)',
                'Accept-Encoding' => 'gzip, deflate'
            ],
            'decode_content' => true  // Automatic decompression
        ]);

        $content = $response->getBody()->getContents();
        $actualSize = strlen($content);
        // Content-Length, when present, is the size on the wire; handlers
        // may strip or rewrite it after decoding, so treat it as optional.
        $transferred = (int) $response->getHeaderLine('Content-Length');

        echo "URL: {$url}\n";
        echo "Decompressed size: {$actualSize} bytes\n";
        if ($transferred > 0 && $actualSize > 0) {
            echo "Transferred size: {$transferred} bytes\n";
            echo "Compression ratio: " . round(($transferred / $actualSize) * 100, 2) . "%\n";
        }
        echo "\n";

    } catch (RequestException $e) {
        echo "Failed to fetch {$url}: " . $e->getMessage() . "\n";
    }
}

Troubleshooting Common Issues

Issue: Content appears garbled

// Ensure decode_content is enabled
$response = $client->request('GET', $url, [
    'decode_content' => true
]);
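
A quick way to confirm the diagnosis: gzip streams always begin with the magic bytes 0x1F 0x8B, so if a "garbled" body starts with them, it was simply never decompressed (looksGzipped is a hypothetical helper):

```php
// Gzip data always starts with the two magic bytes 0x1F 0x8B.
function looksGzipped(string $body): bool
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

var_dump(looksGzipped(gzencode('hello')));  // bool(true)
var_dump(looksGzipped('hello'));            // bool(false)
```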

Issue: Want to measure compression savings

// Content-Length (when present) is the transferred, compressed size;
// the decoded body length is the uncompressed size, so the savings are
// uncompressed minus transferred.
$response = $client->request('GET', $url);
$transferred = (int) $response->getHeaderLine('Content-Length');
$actualSize = strlen($response->getBody()->getContents());

if ($transferred > 0) {
    echo "Bandwidth saved: " . ($actualSize - $transferred) . " bytes\n";
}
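
The same arithmetic can be checked offline with PHP's zlib functions, compressing a sample payload and comparing the wire size to the decompressed size:

```php
// Repetitive markup compresses well, so the savings are substantial.
$payload = str_repeat('<li class="row">product</li>', 200);
$wire = gzencode($payload, 6);   // level 6 is a typical server default

$saved = strlen($payload) - strlen($wire);
$percent = round($saved / strlen($payload) * 100, 1);

echo "Uncompressed: " . strlen($payload) . " bytes\n";
echo "Transferred:  " . strlen($wire) . " bytes\n";
echo "Saved:        {$saved} bytes ({$percent}%)\n";
```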

Key Points

  • Automatic by default: Guzzle handles compression transparently
  • Performance benefit: Compressed responses reduce bandwidth and transfer time
  • Control available: Use decode_content option when you need raw data
  • Multiple formats: gzip and deflate are handled out of the box; other encodings such as Brotli (br) depend on the capabilities of the underlying cURL build
  • Headers matter: Check Content-Encoding header to understand compression used

For most web scraping scenarios, Guzzle's default automatic decompression is perfect and requires no additional configuration.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
