How to Debug Network Issues When Using Guzzle
Debugging network issues in Guzzle can be challenging, especially in complex web scraping or API integration scenarios. This guide covers debugging techniques, tools, and best practices for identifying and resolving network-related problems with the Guzzle HTTP client in PHP.
Understanding Guzzle's Built-in Debugging Features
Guzzle provides several built-in debugging capabilities that can help you diagnose network issues effectively.
Enabling Debug Mode
The simplest way to start debugging is by enabling Guzzle's debug mode:
use GuzzleHttp\Client;
$client = new Client([
'debug' => true, // Enable debug output to STDOUT
]);
$response = $client->get('https://api.example.com/data');
For more control over debug output, pass a stream resource, such as a file handle opened with fopen(), instead of true:
$debugFile = fopen('debug.log', 'a');
$client = new Client([
'debug' => $debugFile,
]);
Using the RequestOptions Debug Parameter
You can also enable debugging on a per-request basis:
use GuzzleHttp\RequestOptions;
$response = $client->get('https://api.example.com/data', [
RequestOptions::DEBUG => true
]);
Implementing Custom Logging and Monitoring
Creating a Custom Logger
For production environments, attach a PSR-3 compatible logger such as Monolog through Guzzle's log middleware:
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use GuzzleHttp\MessageFormatter;
use GuzzleHttp\Middleware;
$logger = new Logger('guzzle');
$logger->pushHandler(new StreamHandler('guzzle.log', Logger::DEBUG));
$stack = \GuzzleHttp\HandlerStack::create();
// Log the request line and body of every request
$stack->push(
Middleware::log(
$logger,
new MessageFormatter('{method} {uri} HTTP/{version} {req_body}')
)
);
$client = new Client([
'handler' => $stack,
]);
Advanced Request/Response Logging
Create detailed logs with custom formatting:
$stack->push(
Middleware::log(
$logger,
new MessageFormatter(
"REQUEST: {method} {uri}\n" .
"Headers: {req_headers}\n" .
"Body: {req_body}\n" .
"RESPONSE: {code} {phrase}\n" .
"Headers: {res_headers}\n" .
"Body: {res_body}"
// MessageFormatter has no timing placeholder; use the on_stats request option for transfer time
)
)
);
Network Request Monitoring and Analysis
Capturing Network Metrics
Monitor network performance and identify bottlenecks:
use GuzzleHttp\TransferStats;
$client->get('https://api.example.com/data', [
'on_stats' => function (TransferStats $stats) use ($logger) {
$logger->info('Request stats:', [
'url' => (string) $stats->getEffectiveUri(),
'total_time' => $stats->getTransferTime(),
'connect_time' => $stats->getHandlerStats()['connect_time'] ?? null,
'dns_time' => $stats->getHandlerStats()['namelookup_time'] ?? null,
'size_download' => $stats->getHandlerStats()['size_download'] ?? null,
'speed_download' => $stats->getHandlerStats()['speed_download'] ?? null,
]);
}
]);
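The handler stats above are cumulative counters measured from the start of the request, so a slow phase is easier to spot once they are converted into per-phase durations. The following is a minimal sketch, assuming the cURL handler is in use and that the array carries the usual curl_getinfo timing keys; the function name summarizeTimings and the sample numbers are hypothetical.

```php
<?php
/**
 * Convert cURL's cumulative timing counters into per-phase durations (seconds).
 * $handlerStats is assumed to look like TransferStats::getHandlerStats()
 * under the cURL handler; missing keys are treated as 0.
 */
function summarizeTimings(array $handlerStats): array
{
    $dns     = (float) ($handlerStats['namelookup_time'] ?? 0.0);
    $connect = (float) ($handlerStats['connect_time'] ?? 0.0);
    $tls     = (float) ($handlerStats['appconnect_time'] ?? 0.0);
    $pre     = (float) ($handlerStats['pretransfer_time'] ?? 0.0);
    $first   = (float) ($handlerStats['starttransfer_time'] ?? 0.0);
    $total   = (float) ($handlerStats['total_time'] ?? 0.0);

    return [
        'dns'           => round($dns, 4),
        'tcp_connect'   => round(max(0.0, $connect - $dns), 4),
        // appconnect_time stays 0 for plain HTTP, so report no TLS phase then
        'tls_handshake' => $tls > 0 ? round(max(0.0, $tls - $connect), 4) : 0.0,
        'server_wait'   => round(max(0.0, $first - $pre), 4),
        'download'      => round(max(0.0, $total - $first), 4),
    ];
}

// Hypothetical sample values, not real measurements
$phases = summarizeTimings([
    'namelookup_time'    => 0.012,
    'connect_time'       => 0.045,
    'appconnect_time'    => 0.130,
    'pretransfer_time'   => 0.131,
    'starttransfer_time' => 0.480,
    'total_time'         => 0.520,
]);
print_r($phases);
```

Inside an on_stats callback this could be called as summarizeTimings($stats->getHandlerStats()). A large server_wait value points at the remote application, while large dns or tcp_connect values point at the network path.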
Monitoring Redirects
Track redirect chains to identify potential issues:
// HandlerStack::create() already includes the redirect middleware, so only the request options need configuring
$client = new Client([
'handler' => $stack,
'allow_redirects' => [
'max' => 5,
'strict' => true,
'referer' => true,
'track_redirects' => true
]
]);
$response = $client->get('https://example.com/redirect-chain');
// Access redirect history
$redirectHistory = $response->getHeaderLine('X-Guzzle-Redirect-History');
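With track_redirects enabled, Guzzle also records the status code of each redirect in the X-Guzzle-Redirect-Status-History header. The sketch below pairs the two headers into a readable hop list. It is plain PHP operating on arrays, and the function name describeRedirectChain and the sample URLs are hypothetical.

```php
<?php
/**
 * Pair each redirect target with the status code that triggered it.
 * $urls and $statuses are assumed to be the values of the
 * X-Guzzle-Redirect-History and X-Guzzle-Redirect-Status-History headers,
 * as returned by getHeader() (one array entry per redirect).
 */
function describeRedirectChain(string $startUrl, array $urls, array $statuses): array
{
    $hops = [];
    $from = $startUrl;
    foreach (array_values($urls) as $i => $to) {
        $status = $statuses[$i] ?? '???'; // status history may be missing
        $hops[] = sprintf('%s %s -> %s', $status, $from, $to);
        $from = $to;
    }
    return $hops;
}

// Hypothetical redirect chain for illustration
$hops = describeRedirectChain(
    'http://example.com/',
    ['https://example.com/', 'https://example.com/home'],
    ['301', '302']
);
print_r($hops);
```

With a real response, the arguments would come from $response->getHeader('X-Guzzle-Redirect-History') and $response->getHeader('X-Guzzle-Redirect-Status-History'). The history excludes the initial request URI, which is why it is passed separately.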
Error Handling and Exception Analysis
Comprehensive Exception Handling
Implement robust error handling to capture and analyze different types of network failures:
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\Exception\TooManyRedirectsException;
use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\ServerException;
try {
$response = $client->get('https://api.example.com/data');
} catch (ConnectException $e) {
// Network connectivity issues
$logger->error('Connection failed:', [
'message' => $e->getMessage(),
'request' => \GuzzleHttp\Psr7\Message::toString($e->getRequest()), // Psr7\str() was removed in guzzlehttp/psr7 2.0
'handler_context' => $e->getHandlerContext()
]);
} catch (TooManyRedirectsException $e) {
// Redirect loop detection
$logger->error('Too many redirects:', [
'message' => $e->getMessage(),
'uri' => (string) $e->getRequest()->getUri() // the exception does not expose the redirect history
]);
} catch (ClientException $e) {
// 4xx HTTP status codes
$logger->error('Client error:', [
'status_code' => $e->getResponse()->getStatusCode(),
'response_body' => $e->getResponse()->getBody()->getContents()
]);
} catch (ServerException $e) {
// 5xx HTTP status codes
$logger->error('Server error:', [
'status_code' => $e->getResponse()->getStatusCode(),
'response_headers' => $e->getResponse()->getHeaders()
]);
} catch (RequestException $e) {
// General request exceptions
$logger->error('Request exception:', [
'message' => $e->getMessage(),
'has_response' => $e->hasResponse()
]);
}
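The handler context logged for a ConnectException is often the quickest way to tell a DNS failure from a timeout or a TLS problem. The following sketch maps libcurl error numbers to coarse categories. It assumes the cURL handler is in use and that the context array carries an 'errno' key; the function name classifyCurlError and the category labels are hypothetical.

```php
<?php
/**
 * Map a libcurl error number to a coarse failure category.
 * $handlerContext is assumed to be the array returned by
 * ConnectException::getHandlerContext() under the cURL handler.
 */
function classifyCurlError(array $handlerContext): string
{
    // Numeric values of well-known CURLE_* codes (literals avoid requiring ext-curl here)
    $categories = [
        5  => 'proxy_dns',       // CURLE_COULDNT_RESOLVE_PROXY
        6  => 'dns',             // CURLE_COULDNT_RESOLVE_HOST
        7  => 'connect',         // CURLE_COULDNT_CONNECT
        28 => 'timeout',         // CURLE_OPERATION_TIMEDOUT
        35 => 'tls_handshake',   // CURLE_SSL_CONNECT_ERROR
        52 => 'empty_reply',     // CURLE_GOT_NOTHING
        56 => 'receive',         // CURLE_RECV_ERROR
        60 => 'tls_certificate', // CURLE_PEER_FAILED_VERIFICATION
    ];

    if (!isset($handlerContext['errno'])) {
        return 'unknown'; // e.g. stream handler or mock handler
    }

    return $categories[(int) $handlerContext['errno']] ?? 'other';
}

echo classifyCurlError(['errno' => 28]), "\n";
```

In the ConnectException branch this could be logged as 'category' => classifyCurlError($e->getHandlerContext()), which makes failures easy to group in log queries.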
Analyzing SSL/TLS Issues
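Debug SSL certificate and TLS connection problems by capturing cURL's verbose output. Disabling verification is a diagnostic step only: if the request succeeds without it, the certificate chain or hostname is the problem.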
$client = new Client([
'curl' => [
CURLOPT_VERBOSE => true,
CURLOPT_STDERR => fopen('curl_verbose.log', 'a'),
CURLOPT_SSL_VERIFYPEER => false, // Only for debugging
CURLOPT_SSL_VERIFYHOST => 0, // Only for debugging (valid values are 0 and 2)
]
]);
Timeout and Connection Debugging
Configuring Timeouts for Debugging
Set appropriate timeouts and monitor their effectiveness:
$client = new Client([
'timeout' => 30, // Total request timeout in seconds
'connect_timeout' => 10, // Seconds to wait while connecting to the server
'read_timeout' => 20, // Seconds to wait when reading a streamed response body
]);
// Flag requests that come close to the 30s timeout
$client->get('https://slow-api.example.com/data', [
'on_stats' => function (TransferStats $stats) {
if ($stats->getTransferTime() > 25) {
error_log("Slow request detected: " . $stats->getTransferTime() . "s");
}
}
]);
Implementing Retry Logic with Debugging
Add retry mechanisms with detailed logging:
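use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Middleware;
// Guzzle's retry support is Middleware::retry(), which takes a decider callable
$decider = function ($retries, $request, $response = null, $exception = null) use ($logger) {
    // Retry up to 3 times on connection and request exceptions
    $retry = $retries < 3
        && ($exception instanceof ConnectException || $exception instanceof RequestException);
    if ($retry) {
        $logger->warning('Retrying request', [
            'attempt' => $retries + 1,
            'uri' => (string) $request->getUri(),
        ]);
    }
    return $retry;
};
$stack->push(Middleware::retry($decider), 'retry');
$client = new Client(['handler' => $stack]);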
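Retries are easier to reason about when the wait between attempts is explicit. Guzzle's Middleware::retry() accepts an optional second callable that returns the delay in milliseconds. The following is a minimal sketch of a capped exponential backoff; the function name backoffDelayMs and the default values are hypothetical choices.

```php
<?php
/**
 * Capped exponential backoff: base, 2x base, 4x base, ... up to $capMs.
 * $retries is the 1-based retry attempt number.
 */
function backoffDelayMs(int $retries, int $baseMs = 200, int $capMs = 5000): int
{
    $delay = $baseMs * (2 ** max(0, $retries - 1));
    return (int) min($capMs, $delay);
}

// Print the delay schedule for the first six attempts
foreach (range(1, 6) as $attempt) {
    printf("retry %d: wait %d ms\n", $attempt, backoffDelayMs($attempt));
}
```

To use it, wrap the function in a closure that accepts only the retry count, for example function ($retries) { return backoffDelayMs($retries); }, and pass that as the second argument to Middleware::retry() alongside a decider callable of your own. The wrapper matters because Guzzle passes further arguments to the delay callable, which would otherwise land in $baseMs and $capMs.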
DNS and Network Layer Debugging
DNS Resolution Debugging
Debug DNS-related issues:
$client = new Client([
'curl' => [
CURLOPT_RESOLVE => [
'api.example.com:443:192.168.1.100' // Force specific IP
]
]
]);
// Log DNS resolution time
$client->get('https://api.example.com/data', [
'on_stats' => function (TransferStats $stats) {
$handlerStats = $stats->getHandlerStats();
if (isset($handlerStats['namelookup_time'])) {
error_log("DNS lookup time: " . $handlerStats['namelookup_time'] . "s");
}
}
]);
Network Interface and Proxy Debugging
Debug proxy configurations and network interfaces:
$client = new Client([
'proxy' => [
'http' => 'http://proxy.example.com:8080', // Proxy for plain HTTP requests
'https' => 'http://proxy.example.com:8080', // Proxy for HTTPS requests
],
'curl' => [
CURLOPT_INTERFACE => '192.168.1.50', // Bind to specific interface
CURLOPT_PROXYTYPE => CURLPROXY_HTTP,
]
]);
Performance Profiling and Optimization
Request Profiling
Profile request performance to identify bottlenecks:
class RequestProfiler
{
private $profiles = [];
public function profileRequest($url, callable $requestCallback)
{
$startTime = microtime(true);
$startMemory = memory_get_usage();
try {
$result = $requestCallback();
$status = 'success';
} catch (\Throwable $e) {
$result = $e;
$status = 'error';
}
$endTime = microtime(true);
$endMemory = memory_get_usage();
$this->profiles[] = [
'url' => $url,
'duration' => $endTime - $startTime,
'memory_used' => $endMemory - $startMemory,
'status' => $status,
'timestamp' => date('Y-m-d H:i:s')
];
return $result;
}
public function getProfiles()
{
return $this->profiles;
}
}
// Usage
$profiler = new RequestProfiler();
$result = $profiler->profileRequest('https://api.example.com/data', function() use ($client) {
return $client->get('https://api.example.com/data');
});
Testing and Validation Tools
Mock Servers for Testing
Use Guzzle's mock handler for testing network scenarios:
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\Psr7\Response;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
$mock = new MockHandler([
new Response(200, ['X-Foo' => 'Bar'], 'Success'),
new Response(500, [], 'Server Error'),
new RequestException('Connection timeout', new Request('GET', 'test'))
]);
$handlerStack = HandlerStack::create($mock);
$client = new Client(['handler' => $handlerStack]);
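A mock handler becomes more useful for debugging when paired with Guzzle's history middleware, which records every request and response that passes through the stack. The sketch below assumes Guzzle 7 is installed through Composer and that vendor/autoload.php is reachable from the script; the URL is a placeholder and no real network traffic occurs.

```php
<?php
require 'vendor/autoload.php'; // assumed Composer autoloader path

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// Queue two canned responses; each request consumes one
$mock = new MockHandler([
    new Response(200, ['X-Foo' => 'Bar'], 'Success'),
    new Response(500, [], 'Server Error'),
]);

$history = [];
$stack = HandlerStack::create($mock);
$stack->push(Middleware::history($history)); // records each transaction

// http_errors is disabled so the 500 is returned instead of thrown
$client = new Client(['handler' => $stack, 'http_errors' => false]);

$first = $client->get('https://api.example.com/data');
$second = $client->get('https://api.example.com/data');

foreach ($history as $transaction) {
    printf(
        "%s %s -> %d\n",
        $transaction['request']->getMethod(),
        $transaction['request']->getUri(),
        $transaction['response']->getStatusCode()
    );
}
```

Each history entry also carries the request options and any error, so the same approach can confirm which headers or retries your middleware actually produced.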
Network Validation Tools
Implement validation tools to verify network behavior:
class NetworkValidator
{
public static function validateResponse($response)
{
$issues = [];
// Check response time
if ($response->hasHeader('X-Response-Time')) {
$responseTime = floatval($response->getHeaderLine('X-Response-Time'));
if ($responseTime > 2.0) {
$issues[] = "Slow response time: {$responseTime}s";
}
}
// Check content encoding
if ($response->hasHeader('Content-Encoding')) {
$encoding = $response->getHeaderLine('Content-Encoding');
if (!in_array($encoding, ['gzip', 'deflate', 'br'])) {
$issues[] = "Unsupported encoding: {$encoding}";
}
}
return $issues;
}
}
Command Line Debugging Tools
Using cURL for Comparison
Compare Guzzle behavior with cURL commands:
# Test basic connectivity
curl -v https://api.example.com/data
# Test with specific headers
curl -H "User-Agent: GuzzleHttp/7.0" -H "Accept: application/json" \
-v https://api.example.com/data
# Test with timeout
curl --connect-timeout 10 --max-time 30 https://api.example.com/data
# Test SSL/TLS
curl -vvv --tlsv1.2 https://api.example.com/data
Network Diagnostics Commands
Use system tools for network diagnostics:
# Test DNS resolution
nslookup api.example.com
dig api.example.com
# Test connectivity
ping api.example.com
traceroute api.example.com
# Test port connectivity
telnet api.example.com 443
nc -zv api.example.com 443
Best Practices for Production Debugging
Structured Logging
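Implement structured logging for better analysis. The snippet assumes $request, $response, and $transferStats are already in scope, for example inside an on_stats callback: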
$logger->info('Guzzle request completed', [
'method' => $request->getMethod(),
'uri' => (string) $request->getUri(),
'status_code' => $response->getStatusCode(),
'duration' => $transferStats->getTransferTime(),
'user_agent' => $request->getHeaderLine('User-Agent'),
'content_length' => $response->getHeaderLine('Content-Length'),
'server' => $response->getHeaderLine('Server')
]);
Monitoring Integration
Integrate with monitoring systems for real-time debugging:
// Example with custom metrics collection
class GuzzleMetricsCollector
{
public function collectMetrics(TransferStats $stats)
{
$metrics = [
'guzzle.request.duration' => $stats->getTransferTime(),
'guzzle.request.size' => $stats->getRequest()->getBody()->getSize(),
'guzzle.response.size' => $stats->getResponse() ?
$stats->getResponse()->getBody()->getSize() : 0
];
// Send to monitoring system (StatsD, Prometheus, etc.); sendMetric() is a placeholder to implement for your backend
foreach ($metrics as $name => $value) {
$this->sendMetric($name, $value);
}
}
}
Integration with Browser Automation Tools
When debugging complex web scraping workflows, network issues in Guzzle may need to be analyzed alongside browser automation tools. For comprehensive debugging of JavaScript-heavy sites, you might need to monitor network requests in Puppeteer to understand the complete picture of your scraping pipeline.
Conclusion
With these debugging techniques you can diagnose and resolve most network issues when using Guzzle. Remove or secure debug output in production environments, and monitor the performance impact of extensive logging and debugging features.
For web scraping applications that require JavaScript execution alongside HTTP requests, consider how these techniques complement browser automation tools when troubleshooting across the different layers of your scraping infrastructure.