Proper error handling is crucial when building reliable web scraping and API integration applications with Guzzle, PHP's popular HTTP client library. Guzzle provides several ways to handle HTTP errors, from client-side errors (4xx status codes) to server-side errors (5xx status codes), as well as network failures that never produce a response.
Understanding Guzzle's Exception Hierarchy
Guzzle throws specific exceptions based on the type of error encountered:
- GuzzleHttp\Exception\ClientException: Thrown for 4xx HTTP errors (client-side issues like 400 Bad Request, 404 Not Found)
- GuzzleHttp\Exception\ServerException: Thrown for 5xx HTTP errors (server-side issues like 500 Internal Server Error, 503 Service Unavailable)
- GuzzleHttp\Exception\ConnectException: Thrown for networking errors (DNS resolution failures, connection timeouts)
- GuzzleHttp\Exception\TooManyRedirectsException: Thrown when the redirect limit is exceeded
- GuzzleHttp\Exception\RequestException: Base class for ClientException, ServerException, and TooManyRedirectsException. In Guzzle 7, ConnectException extends TransferException rather than RequestException, so catching RequestException alone does not cover connection failures
Method 1: Exception Handling with Try-Catch
The most common approach is using try-catch blocks to handle different types of exceptions:
use GuzzleHttp\Client;
use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\ServerException;
use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\Exception\RequestException;

$client = new Client();

try {
    $response = $client->request('GET', 'https://example.com/api/resource');
    // Cast the body stream to a string before decoding it
    $data = json_decode((string) $response->getBody(), true);
    echo "Success: " . $response->getStatusCode();
} catch (ClientException $e) {
    // Handle 4xx errors
    $statusCode = $e->getResponse()->getStatusCode();
    echo "Client error ($statusCode): " . $e->getMessage();
    // Access response body for error details
    $errorBody = $e->getResponse()->getBody()->getContents();
    echo "Error details: " . $errorBody;
} catch (ServerException $e) {
    // Handle 5xx errors
    $statusCode = $e->getResponse()->getStatusCode();
    echo "Server error ($statusCode): " . $e->getMessage();
} catch (ConnectException $e) {
    // Handle connection errors (no response is available here)
    echo "Connection error: " . $e->getMessage();
} catch (RequestException $e) {
    // Handle any other request-related errors
    echo "Request error: " . $e->getMessage();
}
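The error body read in the ClientException branch is often JSON, but that depends entirely on the API. The sketch below shows one way to pull a readable message out of such a body. The function name `extractErrorMessage` and the `message` / `error` keys are assumptions made for illustration, not part of Guzzle or of any particular API.

```php
<?php
declare(strict_types=1);

/**
 * Extract a readable message from an error response body.
 * The "message" and "error" keys are hypothetical; adjust them to the API in use.
 */
function extractErrorMessage(string $body, string $fallback = 'Unknown error'): string
{
    if ($body === '') {
        return $fallback;
    }

    $decoded = json_decode($body, true);

    // Not a JSON object or array: return the raw text, truncated for logging
    if (!is_array($decoded)) {
        return substr($body, 0, 200);
    }

    // Return the first known key that holds a string
    foreach (['message', 'error'] as $key) {
        if (isset($decoded[$key]) && is_string($decoded[$key])) {
            return $decoded[$key];
        }
    }

    return $fallback;
}
```

Inside a catch block it could be called as `extractErrorMessage((string) $e->getResponse()->getBody())`.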
Method 2: Disabling HTTP Errors and Manual Status Checking
Sometimes you may prefer to disable Guzzle's automatic exceptions and check status codes yourself:
use GuzzleHttp\Client;
use GuzzleHttp\Exception\ConnectException;

$client = new Client();

try {
    $response = $client->request('GET', 'https://example.com/api/resource', [
        'http_errors' => false, // Disable automatic exception throwing
        'timeout' => 30,
        'connect_timeout' => 10
    ]);

    $statusCode = $response->getStatusCode();

    if ($statusCode >= 200 && $statusCode < 300) {
        // Success
        echo "Success: " . $response->getBody();
    } elseif ($statusCode >= 400 && $statusCode < 500) {
        // Client error
        echo "Client Error ($statusCode): " . $response->getReasonPhrase();
        echo "\nResponse: " . $response->getBody();
    } elseif ($statusCode >= 500) {
        // Server error
        echo "Server Error ($statusCode): " . $response->getReasonPhrase();
    }
} catch (ConnectException $e) {
    // Still need to catch connection errors: http_errors only affects 4xx/5xx responses
    echo "Connection failed: " . $e->getMessage();
}
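When status codes are checked manually in several places, the range checks can be collected into one helper. The following is a minimal sketch, assuming PHP 8.0 or later for `match`. The function name `classifyStatus` and the category labels are hypothetical. It also covers the 1xx and 3xx ranges, which the if/elseif chain above leaves unhandled.

```php
<?php
declare(strict_types=1);

/**
 * Map an HTTP status code to a coarse category.
 * The category names are illustrative, not a Guzzle convention.
 */
function classifyStatus(int $statusCode): string
{
    return match (true) {
        $statusCode >= 200 && $statusCode < 300 => 'success',
        $statusCode >= 300 && $statusCode < 400 => 'redirect',
        // Check 429 before the general 4xx range so it gets its own label
        $statusCode === 429 => 'rate_limited',
        $statusCode >= 400 && $statusCode < 500 => 'client_error',
        $statusCode >= 500 && $statusCode < 600 => 'server_error',
        default => 'unknown',
    };
}
```

A caller would pass `$response->getStatusCode()` and branch on the returned label.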
Method 3: Using Middleware for Global Error Handling
For applications making multiple requests, middleware provides a centralized error handling approach:
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\ResponseInterface;

// Create custom error handling middleware
$errorHandler = Middleware::mapResponse(function (ResponseInterface $response) {
    $statusCode = $response->getStatusCode();

    if ($statusCode >= 400) {
        // Log error or perform custom handling
        error_log("HTTP Error $statusCode: " . $response->getReasonPhrase());

        // Optionally modify response or throw custom exception
        if ($statusCode >= 500) {
            // Could implement retry logic here
            error_log("Server error detected, consider retrying");
        }
    }

    return $response;
});

// Create handler stack with middleware
$stack = HandlerStack::create();
$stack->push($errorHandler);

$client = new Client([
    'handler' => $stack,
    'http_errors' => false // Let middleware handle errors
]);

$response = $client->request('GET', 'https://example.com/api/resource');
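The log lines written by the middleware above contain only the status code and reason phrase. A small formatter can keep log entries consistent and add a timestamp and request details. The sketch below is one possible shape. The function name `formatHttpErrorLog` and the line layout are assumptions. Because a `mapResponse` callback receives only the response, the method and URI would have to be supplied by the calling code.

```php
<?php
declare(strict_types=1);

/**
 * Build a single log line for a failed HTTP request.
 * The layout "[timestamp] METHOD uri -> status reason" is an illustrative choice.
 */
function formatHttpErrorLog(
    string $method,
    string $uri,
    int $statusCode,
    string $reason,
    ?DateTimeImmutable $at = null
): string {
    // Default to the current time in UTC when no timestamp is supplied
    $at = $at ?? new DateTimeImmutable('now', new DateTimeZone('UTC'));

    return sprintf(
        '[%s] %s %s -> %d %s',
        $at->format(DATE_ATOM),
        strtoupper($method),
        $uri,
        $statusCode,
        $reason
    );
}
```

The result can be passed straight to `error_log()`.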
Method 4: Retry Middleware for Transient Errors
Retry middleware handles temporary failures by resending the request automatically:
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\Exception\TransferException;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

$stack = HandlerStack::create();

// Add retry middleware
$stack->push(Middleware::retry(function (
    $retries,
    RequestInterface $request,
    ?ResponseInterface $response = null,
    ?\Throwable $exception = null // In Guzzle 7, ConnectException is not a RequestException
) {
    // Retry on connection errors or 5xx responses, up to 3 times
    if ($retries < 3) {
        if ($exception instanceof ConnectException) {
            return true;
        }
        if ($response && $response->getStatusCode() >= 500) {
            return true;
        }
    }
    return false;
}, function ($retries) {
    // Exponential backoff in milliseconds: 1s, 2s, 4s
    // ($retries is 1 on the first retry)
    return 1000 * 2 ** ($retries - 1);
}));

$client = new Client(['handler' => $stack]);

try {
    $response = $client->request('GET', 'https://example.com/api/resource');
    echo "Success after retries: " . $response->getBody();
} catch (TransferException $e) {
    // TransferException covers both HTTP errors and connection failures
    echo "Failed after all retries: " . $e->getMessage();
}
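Plain exponential backoff grows without bound, and many clients retrying on the same schedule can hit a recovering server at the same moment. A common refinement is to cap the delay and add random jitter. The sketch below is one way to do that. The function name `backoffDelayMs`, its defaults, and the "keep half, randomize half" jitter scheme are illustrative choices, and it assumes the retry count starts at 1 for the first retry.

```php
<?php
declare(strict_types=1);

/**
 * Compute a capped exponential backoff delay in milliseconds.
 *
 * @param int  $retries 1 for the first retry, 2 for the second, and so on
 * @param int  $baseMs  delay before the first retry (assumed default)
 * @param int  $capMs   upper bound on any single delay (assumed default)
 * @param bool $jitter  when true, randomize the upper half of the delay
 */
function backoffDelayMs(int $retries, int $baseMs = 1000, int $capMs = 30000, bool $jitter = true): int
{
    // base, 2*base, 4*base, ... never exceeding the cap
    $exponent = max(0, $retries - 1);
    $delay = (int) min($capMs, $baseMs * (2 ** $exponent));

    if (!$jitter) {
        return $delay;
    }

    // Keep half of the delay fixed and pick the rest at random
    $half = intdiv($delay, 2);

    return $half + random_int(0, $delay - $half);
}
```

It could be supplied as the delay callback, for example `Middleware::retry($decider, fn (int $retries) => backoffDelayMs($retries))`, where `$decider` stands for a decision callback like the one in Method 4.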
Best Practices for HTTP Error Handling
- Always handle connection errors: Network issues are common in web scraping
- Implement appropriate retry logic: Use exponential backoff for transient errors
- Log errors appropriately: Include request details and timestamps for debugging
- Check response status codes: With http_errors disabled, no exception is thrown for 4xx or 5xx responses, so inspect the status code before using the body
- Handle rate limiting: Watch for 429 status codes and implement delays
- Validate response content: Check for expected data structure even on 200 responses
The following function combines several of these practices:
use GuzzleHttp\Client;
use GuzzleHttp\Exception\TransferException;

function makeRobustRequest($url, $maxRetries = 3) {
    $client = new Client(['timeout' => 30]);

    for ($attempt = 1; $attempt <= $maxRetries; $attempt++) {
        try {
            $response = $client->request('GET', $url, [
                'http_errors' => false,
                'headers' => [
                    'User-Agent' => 'MyApp/1.0'
                ]
            ]);

            $statusCode = $response->getStatusCode();

            if ($statusCode === 200) {
                return $response->getBody()->getContents();
            } elseif ($statusCode === 429 && $attempt < $maxRetries) {
                // Rate limited - wait before retry (2s, 4s, ...)
                sleep(2 ** $attempt);
                continue;
            } elseif ($statusCode >= 500 && $attempt < $maxRetries) {
                // Server error - retry with a linearly growing delay
                sleep($attempt);
                continue;
            } else {
                // Non-retryable status, or retries exhausted
                throw new Exception("HTTP Error $statusCode: " . $response->getReasonPhrase());
            }
        } catch (TransferException $e) {
            // Network-level failure (includes ConnectException in Guzzle 6 and 7)
            if ($attempt === $maxRetries) {
                throw $e;
            }
            sleep($attempt);
        }
    }

    // Only reached if $maxRetries is less than 1
    throw new Exception("Max retries exceeded");
}
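For 429 responses, servers often say how long to wait through the `Retry-After` header, which may hold either a number of seconds or an HTTP date. Honoring it is usually better than guessing a delay. The following is a minimal sketch of a parser. The function name `retryAfterSeconds`, the `$maxSeconds` cap, and the injectable `$now` parameter (added to make the function testable) are assumptions for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Convert a Retry-After header value into a wait time in seconds.
 *
 * @param string|null $headerValue     raw header value, or null/empty if absent
 * @param int         $fallbackSeconds delay to use when the header is missing or unparseable
 * @param int|null    $now             current Unix time; defaults to time()
 * @param int         $maxSeconds      upper bound on the returned delay (assumed default)
 */
function retryAfterSeconds(?string $headerValue, int $fallbackSeconds, ?int $now = null, int $maxSeconds = 300): int
{
    $now = $now ?? time();
    $value = trim((string) $headerValue);

    if ($value === '') {
        return $fallbackSeconds;
    }

    // Form 1: a non-negative number of seconds
    if (ctype_digit($value)) {
        return min((int) $value, $maxSeconds);
    }

    // Form 2: an HTTP date such as "Wed, 21 Oct 2015 07:28:00 GMT"
    $timestamp = strtotime($value);
    if ($timestamp === false) {
        return $fallbackSeconds;
    }

    // Dates in the past mean "retry now"
    return max(0, min($timestamp - $now, $maxSeconds));
}
```

In the 429 branch it could replace the fixed delay, for example `sleep(retryAfterSeconds($response->getHeaderLine('Retry-After'), 2 ** $attempt))`.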
By implementing proper error handling strategies, your Guzzle-based applications will be more resilient and provide better user experiences when dealing with unreliable network conditions or external service issues.