What is the Difference Between Synchronous and Asynchronous Requests in Guzzle?

When working with HTTP requests in PHP using Guzzle, understanding the difference between synchronous and asynchronous requests is crucial for building efficient web scraping applications and API integrations. This distinction can significantly impact your application's performance, especially when dealing with multiple HTTP requests.

Synchronous Requests: Blocking Execution

Synchronous requests in Guzzle are the default behavior where each HTTP request blocks the execution of your script until a response is received. The script waits for the server to respond before moving to the next line of code.

Basic Synchronous Request Example

<?php
use GuzzleHttp\Client;

$client = new Client();

// This request blocks execution until response is received
$response = $client->request('GET', 'https://api.example.com/data');
echo "Response status: " . $response->getStatusCode() . "\n";

// This line only executes after the above request completes
$response2 = $client->request('GET', 'https://api.example.com/more-data');
echo "Second response status: " . $response2->getStatusCode() . "\n";

Sequential Processing with Multiple Requests

<?php
use GuzzleHttp\Client;

$client = new Client();
$urls = [
    'https://api.example.com/endpoint1',
    'https://api.example.com/endpoint2',
    'https://api.example.com/endpoint3'
];

$start_time = microtime(true);

foreach ($urls as $url) {
    $response = $client->request('GET', $url);
    echo "Status: " . $response->getStatusCode() . " for " . $url . "\n";
}

$total_time = microtime(true) - $start_time;
echo "Total execution time: " . $total_time . " seconds\n";

Asynchronous Requests: Non-Blocking Execution

Asynchronous requests allow your script to send multiple HTTP requests without waiting for each response before proceeding. This approach uses promises and enables concurrent request processing, significantly improving performance when dealing with multiple API calls.

Basic Asynchronous Request Example

<?php
use GuzzleHttp\Client;

$client = new Client();

// Send asynchronous request and get a promise
$promise = $client->requestAsync('GET', 'https://api.example.com/data');

// Do other work while the request is pending (note: with the default cURL
// handler, the transfer only progresses once the promise is waited on or
// the task queue is run)
echo "Doing other work...\n";

// Wait for the promise to resolve
$response = $promise->wait();
echo "Response status: " . $response->getStatusCode() . "\n";

Concurrent Asynchronous Requests

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Promise;

$client = new Client();
$urls = [
    'https://api.example.com/endpoint1',
    'https://api.example.com/endpoint2',
    'https://api.example.com/endpoint3'
];

$start_time = microtime(true);

// Create promises for all requests
$promises = [];
foreach ($urls as $index => $url) {
    $promises[$index] = $client->requestAsync('GET', $url);
}

// Wait for all promises to complete
// (with guzzlehttp/promises v2, use GuzzleHttp\Promise\Utils::settle() instead)
$responses = Promise\settle($promises)->wait();

// Process responses
foreach ($responses as $index => $response) {
    if ($response['state'] === 'fulfilled') {
        echo "Status: " . $response['value']->getStatusCode() . " for " . $urls[$index] . "\n";
    } else {
        echo "Failed request for " . $urls[$index] . ": " . $response['reason'] . "\n";
    }
}

$total_time = microtime(true) - $start_time;
echo "Total execution time: " . $total_time . " seconds\n";

Key Differences and Performance Impact

Execution Model

Synchronous requests:

  • Block script execution until a response is received
  • Process requests sequentially, one after another
  • Simple to understand and debug
  • Total time = sum of all individual request times

Asynchronous requests:

  • Non-blocking execution using promises
  • Multiple requests can be processed concurrently
  • More complex error handling required
  • Total time ≈ time of the slowest request (when running concurrently)

Performance Comparison

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Promise;

function benchmarkSynchronous($urls) {
    $client = new Client();
    $start_time = microtime(true);

    foreach ($urls as $url) {
        try {
            $response = $client->request('GET', $url, ['timeout' => 10]);
        } catch (Exception $e) {
            echo "Error: " . $e->getMessage() . "\n";
        }
    }

    return microtime(true) - $start_time;
}

function benchmarkAsynchronous($urls) {
    $client = new Client();
    $start_time = microtime(true);

    $promises = [];
    foreach ($urls as $url) {
        $promises[] = $client->requestAsync('GET', $url, ['timeout' => 10]);
    }

    Promise\settle($promises)->wait();

    return microtime(true) - $start_time;
}

$urls = array_fill(0, 5, 'https://httpbin.org/delay/2'); // 5 URLs with 2-second delay

$sync_time = benchmarkSynchronous($urls);
$async_time = benchmarkAsynchronous($urls);

echo "Synchronous time: " . $sync_time . " seconds\n";
echo "Asynchronous time: " . $async_time . " seconds\n";
echo "Performance improvement: " . round(($sync_time / $async_time), 2) . "x faster\n";

Advanced Asynchronous Patterns

Using Pool for Controlled Concurrency

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;

$client = new Client();
$urls = [
    'https://api.example.com/endpoint1',
    'https://api.example.com/endpoint2',
    'https://api.example.com/endpoint3',
    'https://api.example.com/endpoint4',
    'https://api.example.com/endpoint5'
];

// Create request generator
$requests = function () use ($urls) {
    foreach ($urls as $url) {
        yield new Request('GET', $url);
    }
};

// Create pool with concurrency limit
$pool = new Pool($client, $requests(), [
    'concurrency' => 3, // Maximum 3 concurrent requests
    'fulfilled' => function ($response, $index) use ($urls) {
        echo "Completed: " . $urls[$index] . " (Status: " . $response->getStatusCode() . ")\n";
    },
    'rejected' => function ($reason, $index) use ($urls) {
        echo "Failed: " . $urls[$index] . " (Reason: " . $reason . ")\n";
    },
]);

// Execute the pool
$promise = $pool->promise();
$promise->wait();

Error Handling in Asynchronous Requests

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Promise;

$client = new Client();
$urls = [
    'https://api.example.com/valid-endpoint',
    'https://invalid-domain-that-does-not-exist.com',
    'https://api.example.com/another-endpoint'
];

$promises = [];
foreach ($urls as $index => $url) {
    $promises[$index] = $client->requestAsync('GET', $url)
        ->then(
            function ($response) use ($url) {
                return [
                    'url' => $url,
                    'status' => $response->getStatusCode(),
                    'success' => true
                ];
            },
            function ($exception) use ($url) {
                return [
                    'url' => $url,
                    'error' => $exception->getMessage(),
                    'success' => false
                ];
            }
        );
}

// Every promise fulfills here, because each rejection handler above returns
// a value instead of re-throwing, so each settled result carries a 'value'
$results = Promise\settle($promises)->wait();

foreach ($results as $result) {
    $data = $result['value'];
    if ($data['success']) {
        echo "Success: " . $data['url'] . " (Status: " . $data['status'] . ")\n";
    } else {
        echo "Error: " . $data['url'] . " (" . $data['error'] . ")\n";
    }
}

When to Use Each Approach

Use Synchronous Requests When:

  1. Simple applications with few HTTP requests
  2. Sequential processing is required (each request depends on the previous one)
  3. Debugging and development phases for easier troubleshooting
  4. Memory constraints are a concern (async uses more memory)
  5. Error handling needs to be straightforward
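Point 2 above, where each request depends on the previous one, is the clearest case for blocking execution. Here is a minimal sketch of a two-step flow; the login endpoint, credentials, and JSON response shape are hypothetical:

```php
<?php
use GuzzleHttp\Client;

// Hypothetical two-step flow: the second request needs the token from the
// first, so synchronous (blocking) execution is the natural fit.
function fetchProfileWithLogin(Client $client): array
{
    // Step 1: authenticate (assumed endpoint and response shape)
    $login = $client->request('POST', 'https://api.example.com/login', [
        'json' => ['user' => 'demo', 'password' => 'secret'],
    ]);
    $token = json_decode((string) $login->getBody(), true)['token'];

    // Step 2: uses the token from step 1 -- cannot start until step 1 finishes
    $profile = $client->request('GET', 'https://api.example.com/profile', [
        'headers' => ['Authorization' => 'Bearer ' . $token],
    ]);

    return json_decode((string) $profile->getBody(), true);
}
```

Running these two calls asynchronously would gain nothing, since the second promise could not be created until the first resolved anyway.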

Use Asynchronous Requests When:

  1. Multiple independent requests need to be made
  2. Performance optimization is critical
  3. Web scraping operations with many URLs
  4. API aggregation from multiple sources
  5. Background processing of HTTP requests
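As a sketch of point 4 (API aggregation), the `Utils::all` helper from guzzlehttp/promises resolves an array of promises in a single wait and rejects if any request fails, which suits "all or nothing" aggregation. The endpoint URLs here are placeholders:

```php
<?php
use GuzzleHttp\Client;
use GuzzleHttp\Promise\Utils;

// Aggregate several independent sources concurrently. Unlike settle(),
// Utils::all() rejects as soon as any promise rejects.
function aggregateSources(Client $client): array
{
    $promises = [
        'users'  => $client->requestAsync('GET', 'https://api.example.com/users'),
        'orders' => $client->requestAsync('GET', 'https://api.example.com/orders'),
        'stats'  => $client->requestAsync('GET', 'https://api.example.com/stats'),
    ];

    // Array keys are preserved, so each decoded body is addressable by name
    $responses = Utils::all($promises)->wait();

    return array_map(
        fn ($response) => json_decode((string) $response->getBody(), true),
        $responses
    );
}
```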

Best Practices for Asynchronous Requests

1. Implement Proper Concurrency Limits

<?php
// Limit concurrent connections to avoid overwhelming servers
$pool = new Pool($client, $requests(), [
    'concurrency' => 10, // Adjust based on target server capacity
]);

2. Handle Timeouts Appropriately

<?php
$promise = $client->requestAsync('GET', $url, [
    'timeout' => 30,         // Total timeout
    'connect_timeout' => 10  // Connection timeout
]);

3. Implement Retry Logic

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;
use Psr\Http\Message\ResponseInterface;

$handlerStack = HandlerStack::create();
$handlerStack->push(Middleware::retry(function ($retries, RequestInterface $request, ResponseInterface $response = null, $exception = null) {
    // Retry on connection exceptions or 5xx responses
    if ($exception instanceof ConnectException || ($response && $response->getStatusCode() >= 500)) {
        return $retries < 3; // Retry up to 3 times
    }
    return false;
}));

$client = new Client(['handler' => $handlerStack]);

Performance Considerations

When implementing asynchronous requests in your web scraping or API integration projects, consider these performance factors:

  1. Memory Usage: Async requests consume more memory as promises are held in memory
  2. Server Load: Limit concurrency to avoid overwhelming target servers
  3. Network Resources: Monitor bandwidth usage with concurrent requests
  4. Error Rates: High concurrency may increase error rates due to rate limiting
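For point 1, one memory-friendly option is `Pool::batch`, which sends the requests with a concurrency cap and returns the settled results as a plain array. A hedged sketch (the helper name and return shape are this article's invention):

```php
<?php
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use Psr\Http\Message\ResponseInterface;

// Pool::batch caps concurrency (bounding in-flight requests and memory) and
// returns results in the original order; failed requests appear in the result
// array as exception objects rather than responses.
function batchGet(Client $client, array $urls, int $concurrency = 10): array
{
    $requests = array_map(fn ($url) => new Request('GET', $url), $urls);

    $results = Pool::batch($client, $requests, [
        'concurrency' => $concurrency,
    ]);

    // Separate successes from failures for the caller
    $ok = $failed = [];
    foreach ($results as $index => $result) {
        if ($result instanceof ResponseInterface) {
            $ok[$urls[$index]] = $result;
        } else {
            $failed[$urls[$index]] = $result; // a Throwable
        }
    }

    return ['ok' => $ok, 'failed' => $failed];
}
```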

For web scraping applications that need to handle multiple concurrent requests efficiently, understanding these patterns becomes essential. Similar to how modern browser automation tools handle concurrent page processing, Guzzle's asynchronous capabilities allow you to scale your HTTP operations effectively.

JavaScript Comparison

For developers familiar with JavaScript, Guzzle's asynchronous pattern is similar to Promise-based HTTP clients:

// JavaScript equivalent using fetch
async function asyncRequests() {
    const urls = [
        'https://api.example.com/endpoint1',
        'https://api.example.com/endpoint2',
        'https://api.example.com/endpoint3'
    ];

    // Create all promises
    const promises = urls.map(url => fetch(url));

    // Wait for all to complete
    const responses = await Promise.allSettled(promises);

    responses.forEach((response, index) => {
        if (response.status === 'fulfilled') {
            console.log(`Success: ${urls[index]} (Status: ${response.value.status})`);
        } else {
            console.log(`Failed: ${urls[index]} (${response.reason})`);
        }
    });
}

Conclusion

The choice between synchronous and asynchronous requests in Guzzle depends on your specific use case and performance requirements. Synchronous requests offer simplicity and straightforward error handling, making them ideal for simple applications and development phases. Asynchronous requests provide significant performance benefits when dealing with multiple HTTP requests, making them essential for high-performance web scraping and API integration applications.

By leveraging Guzzle's asynchronous capabilities with proper error handling, concurrency limits, and timeout configurations, you can build robust and efficient HTTP client applications that scale effectively with your requirements.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
