Guzzle automatically handles gzip and deflate compressed HTTP responses, making content compression transparent to developers. This feature reduces bandwidth usage and improves performance when making HTTP requests.
How Guzzle Handles Compression
By default, Guzzle:
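- Sends an Accept-Encoding header when you configure one, either through the headers option or by passing a string such as 'gzip, deflate' as decode_content (with default options, the cURL handler may not send this header at all)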
- Detects compressed responses via the Content-Encoding header
- Automatically decompresses gzip and deflate content
- Returns uncompressed content for immediate use
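The decode step in the list above can be illustrated without Guzzle at all. The following is a minimal sketch using PHP's zlib extension; decodeBody() is a hypothetical helper, not part of Guzzle, and it only approximates what the HTTP handler does internally.

```php
<?php
// Hypothetical helper (not a Guzzle API): decode a body according to its
// Content-Encoding value, using PHP's zlib extension.
function decodeBody(string $body, string $contentEncoding): string
{
    $encoding = strtolower(trim($contentEncoding));

    if ($encoding === '' || $encoding === 'identity') {
        return $body; // nothing to decode
    }

    if (in_array($encoding, ['gzip', 'x-gzip', 'deflate'], true)) {
        // zlib_decode() accepts gzip, zlib-wrapped, and raw deflate data
        $decoded = @zlib_decode($body);
        if ($decoded === false) {
            throw new RuntimeException("Could not decode {$encoding} body");
        }
        return $decoded;
    }

    throw new InvalidArgumentException("Unsupported encoding: {$encoding}");
}

// Simulate a compressed response body and decode it again
$original = str_repeat('Hello, compression! ', 50);
$compressed = gzencode($original);

echo strlen($compressed) . " bytes on the wire\n";
echo strlen(decodeBody($compressed, 'gzip')) . " bytes after decoding\n";
```

With automatic decoding enabled, you never need a helper like this; it is only useful when you fetch the raw body yourself.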
Basic Usage (Automatic Decompression)
The simplest approach is to let Guzzle handle compression automatically:
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

$client = new Client();

try {
    // decode_content defaults to true, so the body is decoded automatically
    $response = $client->request('GET', 'https://httpbin.org/gzip');

    // Content is already decompressed at this point
    $body = $response->getBody()->getContents();

    // Check if the response was compressed. After decoding, recent Guzzle
    // versions (cURL handler) move the original Content-Encoding value to
    // x-encoded-content-encoding, so look there first.
    $encoding = $response->getHeaderLine('x-encoded-content-encoding')
        ?: $response->getHeaderLine('Content-Encoding');
    if ($encoding !== '') {
        echo "Response was compressed with: " . $encoding . "\n";
    }

    echo "Decompressed content: " . $body;
} catch (RequestException $e) {
    echo "Error: " . $e->getMessage();
}
Explicit Control Over Decompression
You can explicitly control the decompression behavior:
// Explicitly enable automatic decompression (default behavior)
$response = $client->request('GET', 'https://httpbin.org/gzip', [
'decode_content' => true
]);
// Disable automatic decompression
$response = $client->request('GET', 'https://httpbin.org/gzip', [
'decode_content' => false
]);
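Besides true and false, Guzzle's documentation describes a third form of this option: a string value, which enables decoding and is also sent as the request's Accept-Encoding header. The fragment below sketches the three option arrays side by side; the variable names are illustrative only.

```php
<?php
// Request option arrays only; pass one as the third argument to
// $client->request(). Variable names are illustrative.

// Decode automatically (the default)
$decodeDefault = ['decode_content' => true];

// Leave the body exactly as the server sent it
$keepRaw = ['decode_content' => false];

// Decode automatically and send "Accept-Encoding: gzip" with the request
$decodeAndAdvertise = ['decode_content' => 'gzip'];
```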
Working with Raw Compressed Data
Sometimes you need access to the raw compressed data:
$client = new Client();
$response = $client->request('GET', 'https://httpbin.org/gzip', [
    'decode_content' => false // Get raw compressed data
]);
$compressedBody = $response->getBody()->getContents();
$contentEncoding = strtolower($response->getHeaderLine('Content-Encoding'));
echo "Content-Encoding: " . $contentEncoding . "\n";
echo "Compressed size: " . strlen($compressedBody) . " bytes\n";
// Manually decompress if needed
if ($contentEncoding === 'gzip') {
    $decompressed = gzdecode($compressedBody);
    echo "Decompressed size: " . strlen($decompressed) . " bytes\n";
} elseif ($contentEncoding === 'deflate') {
    // HTTP "deflate" is normally zlib-wrapped, though some servers send raw
    // deflate data; zlib_decode() accepts both, unlike gzinflate()
    $decompressed = zlib_decode($compressedBody);
    echo "Decompressed size: " . strlen($decompressed) . " bytes\n";
}
Advanced Configuration
Custom Accept-Encoding Header
You can customize which compression methods to accept:
$response = $client->request('GET', 'https://example.com', [
'headers' => [
'Accept-Encoding' => 'gzip' // Only accept gzip compression
]
]);
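Advertising an encoding only helps if the client can decode it. When Guzzle runs on the cURL handler, decoding is done by libcurl, so one way to pick an Accept-Encoding value is to ask the curl extension what it was built with. This is a sketch under that assumption; curlDecodableEncodings() is a hypothetical helper, not a Guzzle function.

```php
<?php
// Hypothetical helper: report which content codings the local libcurl build
// can decode. Returns an empty list when the curl extension is not loaded.
function curlDecodableEncodings(): array
{
    if (!function_exists('curl_version')) {
        return [];
    }

    $features = curl_version()['features'];
    $encodings = [];

    // zlib support covers both gzip and deflate
    if ($features & CURL_VERSION_LIBZ) {
        $encodings[] = 'gzip';
        $encodings[] = 'deflate';
    }

    // CURL_VERSION_BROTLI is only defined on newer PHP/libcurl combinations
    if (defined('CURL_VERSION_BROTLI') && ($features & CURL_VERSION_BROTLI)) {
        $encodings[] = 'br';
    }

    return $encodings;
}

echo 'Accept-Encoding: ' . implode(', ', curlDecodableEncodings()) . "\n";
```

The resulting string could be passed as the Accept-Encoding header in the request above.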
Handling Different Compression Types
$client = new Client();
try {
    $response = $client->request('GET', 'https://httpbin.org/deflate');
    // Once the body has been decoded, recent Guzzle versions (cURL handler)
    // expose the original value as x-encoded-content-encoding
    $contentEncoding = $response->getHeaderLine('x-encoded-content-encoding')
        ?: $response->getHeaderLine('Content-Encoding');
    $body = $response->getBody()->getContents();
    switch (strtolower($contentEncoding)) {
        case 'gzip':
            echo "Response was gzip compressed\n";
            break;
        case 'deflate':
            echo "Response was deflate compressed\n";
            break;
        case 'br':
            echo "Response was Brotli compressed\n";
            break;
        default:
            echo "Response was not compressed\n";
    }
    echo "Content: " . substr($body, 0, 100) . "...\n";
} catch (RequestException $e) {
    echo "Error: " . $e->getMessage();
}
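Comparing the header against exact strings can miss valid values: content codings are case-insensitive, and a server may list more than one. A small sketch of normalising the value first; parseContentEncoding() is a hypothetical helper, not part of Guzzle.

```php
<?php
// Hypothetical helper: split a Content-Encoding header value into a
// lower-cased list of codings, in the order the server applied them.
function parseContentEncoding(string $headerLine): array
{
    $codings = [];
    foreach (explode(',', $headerLine) as $part) {
        $coding = strtolower(trim($part));
        // "identity" means no encoding was applied, so it is skipped
        if ($coding !== '' && $coding !== 'identity') {
            $codings[] = $coding;
        }
    }
    return $codings;
}

print_r(parseContentEncoding('GZIP'));
print_r(parseContentEncoding('deflate, gzip'));
```

An empty result means the body was sent without any content coding.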
Best Practices for Web Scraping
When scraping many pages, compressed transfers reduce bandwidth and transfer time, so request them explicitly and let Guzzle decode the bodies:
$client = new Client([
    'timeout' => 30,
    'verify' => false, // Disables TLS certificate checks; for development only
]);
$urls = [
    'https://example.com/page1',
    'https://example.com/page2',
    'https://example.com/page3'
];
foreach ($urls as $url) {
    try {
        $response = $client->request('GET', $url, [
            'headers' => [
                'User-Agent' => 'Mozilla/5.0 (compatible; WebScraper/1.0)',
                'Accept-Encoding' => 'gzip, deflate'
            ],
            'decode_content' => true // Automatic decompression
        ]);
        // After decoding, Content-Length may describe the decoded body; recent
        // Guzzle versions keep the on-the-wire size in x-encoded-content-length
        $compressedSize = (int) ($response->getHeaderLine('x-encoded-content-length')
            ?: $response->getHeaderLine('Content-Length'));
        $content = $response->getBody()->getContents();
        $actualSize = strlen($content);
        echo "URL: {$url}\n";
        echo "Decompressed size: {$actualSize} bytes\n";
        // Either size can be unknown (for example with chunked responses),
        // so guard against dividing by zero
        if ($compressedSize > 0 && $actualSize > 0) {
            echo "Transferred size: {$compressedSize} bytes\n";
            echo "Compression ratio: " . round(($compressedSize / $actualSize) * 100, 2) . "%\n";
        }
        echo "\n";
    } catch (RequestException $e) {
        echo "Failed to fetch {$url}: " . $e->getMessage() . "\n";
    }
}
Troubleshooting Common Issues
Issue: Content appears garbled
// Ensure decode_content is enabled
$response = $client->request('GET', $url, [
'decode_content' => true
]);
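If a body still looks like binary noise, it helps to confirm that it really is compressed data and not, say, a character-set problem. A minimal sketch: gzip streams begin with the bytes 0x1f 0x8b, so a hypothetical helper such as looksGzipped() can check for them.

```php
<?php
// Hypothetical helper: guess whether a body is still gzip-compressed by
// checking for the two-byte gzip magic number (0x1f 0x8b).
function looksGzipped(string $body): bool
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

$plain = '<html>example</html>';
$gzipped = gzencode($plain);

echo looksGzipped($gzipped) ? "still gzip-compressed\n" : "not gzip data\n";
echo looksGzipped($plain) ? "still gzip-compressed\n" : "not gzip data\n";
```

This only recognises gzip; deflate data has no equally reliable signature.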
Issue: Want to measure compression savings
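// Compare the on-the-wire size with the decoded size
$response = $client->request('GET', $url);
// Recent Guzzle versions (cURL handler) keep the compressed size in
// x-encoded-content-length once the body has been decoded
$compressedSize = (int) $response->getHeaderLine('x-encoded-content-length');
$actualSize = strlen($response->getBody()->getContents());
if ($compressedSize > 0) {
    echo "Bandwidth saved: " . ($actualSize - $compressedSize) . " bytes\n";
}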
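The arithmetic is easy to get backwards, so it can be worth isolating. The sketch below assumes you already have both sizes as integers; compressionStats() is a hypothetical helper, not part of Guzzle.

```php
<?php
// Hypothetical helper: summarise how much a response shrank on the wire.
// Returns null when either size is unknown or zero.
function compressionStats(int $compressedBytes, int $decodedBytes): ?array
{
    if ($compressedBytes <= 0 || $decodedBytes <= 0) {
        return null;
    }

    return [
        // Bytes that did not have to be transferred
        'saved_bytes' => $decodedBytes - $compressedBytes,
        // Compressed size as a percentage of the decoded size
        'ratio_percent' => round($compressedBytes / $decodedBytes * 100, 2),
    ];
}

$stats = compressionStats(250, 1000);
echo "Saved {$stats['saved_bytes']} bytes ({$stats['ratio_percent']}% of original size)\n";
```

A null result signals that the server did not report a usable size, which is common with chunked responses.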
Key Points
- Automatic by default: Guzzle handles compression transparently
- Performance benefit: Compressed responses reduce bandwidth and transfer time
- Control available: Use the decode_content option when you need raw data
- Multiple formats: Supports gzip and deflate; other encodings depend on what the HTTP handler can decode
- Headers matter: Check the Content-Encoding header (or x-encoded-content-encoding after automatic decoding) to see which compression was used
For most web scraping scenarios, Guzzle's default automatic decompression is sufficient and requires no additional configuration.