Table of contents

How do I configure timeouts for page loads and element interactions in Symfony Panther?

Configuring proper timeouts is crucial for reliable web scraping and testing with Symfony Panther. This guide covers various timeout configurations, from basic page load timeouts to advanced element interaction timeouts and custom wait conditions.

Understanding Symfony Panther Timeouts

Symfony Panther, built on top of ChromeDriver and Facebook WebDriver, provides several timeout mechanisms to handle different scenarios:

  • Page load timeouts: Control how long to wait for complete page loading
  • Element interaction timeouts: Set maximum wait times for element availability
  • Script execution timeouts: Configure timeout for JavaScript execution
  • Implicit waits: Set default wait times for element location

Basic Timeout Configuration

Setting Page Load Timeout

Configure the maximum time to wait for page loads using the setPageLoadTimeout() method:

<?php

use Symfony\Component\Panther\PantherTestCase;
use Symfony\Component\Panther\Client;

class WebScrapingTest extends PantherTestCase
{
    public function testPageLoadTimeout()
    {
        $client = static::createPantherClient();

        // Set page load timeout to 30 seconds
        $client->getWebDriver()->manage()->timeouts()->pageLoadTimeout(30);

        // Navigate to a page
        $crawler = $client->request('GET', 'https://example.com');

        // Page will timeout if loading takes more than 30 seconds
    }
}

Configuring Implicit Wait Timeout

Set default timeout for element location attempts:

<?php

public function testImplicitWait()
{
    $client = static::createPantherClient();

    // Set implicit wait to 10 seconds
    $client->getWebDriver()->manage()->timeouts()->implicitlyWait(10);

    $crawler = $client->request('GET', 'https://example.com');

    // All element searches will wait up to 10 seconds
    $element = $crawler->filter('#dynamic-content')->first();
}

Setting Script Execution Timeout

Configure timeout for JavaScript execution:

<?php

public function testScriptTimeout()
{
    $client = static::createPantherClient();

    // Set script execution timeout to 15 seconds
    $client->getWebDriver()->manage()->timeouts()->setScriptTimeout(15);

    $crawler = $client->request('GET', 'https://example.com');

    // Execute JavaScript with timeout protection
    $result = $client->executeScript('
        return new Promise((resolve) => {
            setTimeout(() => resolve("Done"), 5000);
        });
    ');
}

Advanced Timeout Strategies

Custom Wait Conditions

Implement custom wait conditions for specific scenarios:

<?php

use Facebook\WebDriver\WebDriverWait;
use Facebook\WebDriver\WebDriverBy;
use Facebook\WebDriver\WebDriverExpectedCondition;

public function testCustomWaitConditions()
{
    $client = static::createPantherClient();
    $crawler = $client->request('GET', 'https://example.com');

    $webDriver = $client->getWebDriver();
    $wait = new WebDriverWait($webDriver, 20); // 20 second timeout

    // Wait for element to be clickable
    $wait->until(
        WebDriverExpectedCondition::elementToBeClickable(
            WebDriverBy::id('submit-button')
        )
    );

    // Wait for text to appear
    $wait->until(
        WebDriverExpectedCondition::textToBePresentInElement(
            WebDriverBy::className('status'),
            'Processing complete'
        )
    );

    // Wait for element to disappear
    $wait->until(
        WebDriverExpectedCondition::invisibilityOfElementLocated(
            WebDriverBy::className('loading-spinner')
        )
    );
}

Fluent Wait Implementation

Create more flexible wait conditions with polling intervals:

<?php

use Facebook\WebDriver\WebDriverWait;
use Facebook\WebDriver\Exception\TimeoutException;

public function testFluentWait()
{
    $client = static::createPantherClient();
    $crawler = $client->request('GET', 'https://example.com');

    $webDriver = $client->getWebDriver();

    // Create fluent wait with custom polling
    $wait = new WebDriverWait($webDriver, 30, 500); // 30s timeout, 500ms polling

    try {
        $element = $wait->until(function ($driver) {
            $elements = $driver->findElements(WebDriverBy::className('dynamic-content'));

            if (count($elements) > 0 && $elements[0]->isDisplayed()) {
                return $elements[0];
            }

            return false;
        });

        echo "Element found and visible!";
    } catch (TimeoutException $e) {
        echo "Element not found within timeout period";
    }
}

Timeout Configuration for Different Scenarios

AJAX Content Loading

Handle dynamic content that loads via AJAX requests:

<?php

public function testAjaxContentTimeout()
{
    $client = static::createPantherClient();
    $crawler = $client->request('GET', 'https://example.com/ajax-page');

    $webDriver = $client->getWebDriver();
    $wait = new WebDriverWait($webDriver, 25);

    // Click button that triggers AJAX
    $crawler->filter('#load-data')->click();

    // Wait for AJAX content to load
    $wait->until(function ($driver) {
        $elements = $driver->findElements(WebDriverBy::className('ajax-content'));
        return count($elements) > 0 && !empty($elements[0]->getText());
    });

    // Verify content loaded
    $content = $crawler->filter('.ajax-content')->text();
    $this->assertNotEmpty($content);
}

Form Submission Timeouts

Configure timeouts for form interactions and submissions:

<?php

public function testFormSubmissionTimeout()
{
    $client = static::createPantherClient();

    // Set longer page load timeout for form submissions
    $client->getWebDriver()->manage()->timeouts()->pageLoadTimeout(45);

    $crawler = $client->request('GET', 'https://example.com/contact');

    $webDriver = $client->getWebDriver();
    $wait = new WebDriverWait($webDriver, 20);

    // Fill form fields
    $form = $crawler->selectButton('Submit')->form([
        'name' => 'John Doe',
        'email' => 'john@example.com',
        'message' => 'Test message'
    ]);

    // Submit form
    $crawler = $client->submit($form);

    // Wait for success message
    $wait->until(
        WebDriverExpectedCondition::presenceOfElementLocated(
            WebDriverBy::className('success-message')
        )
    );
}

Global Timeout Configuration

Setting Default Timeouts

Configure default timeouts in your test setup:

<?php

use Symfony\Component\Panther\PantherTestCase;

class BaseWebScrapingTest extends PantherTestCase
{
    protected function setUp(): void
    {
        parent::setUp();

        $this->configureDefaultTimeouts();
    }

    private function configureDefaultTimeouts(): void
    {
        $client = static::createPantherClient();
        $timeouts = $client->getWebDriver()->manage()->timeouts();

        // Set standard timeouts for all tests
        $timeouts->pageLoadTimeout(30);        // 30 seconds for page loads
        $timeouts->implicitlyWait(10);         // 10 seconds for element searches
        $timeouts->setScriptTimeout(20);       // 20 seconds for script execution
    }
}

Environment-Based Timeout Configuration

Adjust timeouts based on environment conditions:

<?php

public function configureEnvironmentTimeouts(): void
{
    $client = static::createPantherClient();
    $timeouts = $client->getWebDriver()->manage()->timeouts();

    // Adjust timeouts based on environment
    if (getenv('APP_ENV') === 'test') {
        // Shorter timeouts for fast test execution
        $timeouts->pageLoadTimeout(15);
        $timeouts->implicitlyWait(5);
    } elseif (getenv('CI') === 'true') {
        // Longer timeouts for CI environments
        $timeouts->pageLoadTimeout(60);
        $timeouts->implicitlyWait(15);
    } else {
        // Standard timeouts for development
        $timeouts->pageLoadTimeout(30);
        $timeouts->implicitlyWait(10);
    }
}

Error Handling and Timeout Recovery

Implementing Retry Logic

Handle timeout exceptions with retry mechanisms:

<?php

use Facebook\WebDriver\Exception\TimeoutException;

public function testWithRetryLogic()
{
    $client = static::createPantherClient();
    $maxRetries = 3;
    $retryCount = 0;

    while ($retryCount < $maxRetries) {
        try {
            $crawler = $client->request('GET', 'https://example.com');

            $webDriver = $client->getWebDriver();
            $wait = new WebDriverWait($webDriver, 15);

            $element = $wait->until(
                WebDriverExpectedCondition::presenceOfElementLocated(
                    WebDriverBy::id('target-element')
                )
            );

            // Success - break out of retry loop
            break;

        } catch (TimeoutException $e) {
            $retryCount++;

            if ($retryCount >= $maxRetries) {
                throw new \Exception("Failed after {$maxRetries} retries: " . $e->getMessage());
            }

            // Wait before retry
            sleep(2);
        }
    }
}

Best Practices for Timeout Configuration

Choosing Appropriate Timeout Values

Consider these factors when setting timeouts:

<?php

class TimeoutBestPractices
{
    // Conservative timeouts for production scraping
    const PRODUCTION_PAGE_LOAD = 45;
    const PRODUCTION_ELEMENT_WAIT = 20;
    const PRODUCTION_SCRIPT_TIMEOUT = 30;

    // Faster timeouts for testing
    const TEST_PAGE_LOAD = 15;
    const TEST_ELEMENT_WAIT = 8;
    const TEST_SCRIPT_TIMEOUT = 10;

    public function getRecommendedTimeouts(string $environment): array
    {
        return match($environment) {
            'production', 'scraping' => [
                'pageLoad' => self::PRODUCTION_PAGE_LOAD,
                'elementWait' => self::PRODUCTION_ELEMENT_WAIT,
                'scriptTimeout' => self::PRODUCTION_SCRIPT_TIMEOUT,
            ],
            'test', 'ci' => [
                'pageLoad' => self::TEST_PAGE_LOAD,
                'elementWait' => self::TEST_ELEMENT_WAIT,
                'scriptTimeout' => self::TEST_SCRIPT_TIMEOUT,
            ],
            default => [
                'pageLoad' => 30,
                'elementWait' => 15,
                'scriptTimeout' => 20,
            ]
        };
    }
}

Progressive Timeout Strategy

Implement progressive timeout increases for unstable connections:

<?php

public function testProgressiveTimeouts()
{
    $client = static::createPantherClient();
    $baseTimeout = 10;
    $maxTimeout = 60;

    for ($attempt = 1; $attempt <= 3; $attempt++) {
        $currentTimeout = min($baseTimeout * $attempt, $maxTimeout);

        try {
            $webDriver = $client->getWebDriver();
            $wait = new WebDriverWait($webDriver, $currentTimeout);

            $crawler = $client->request('GET', 'https://example.com');

            $element = $wait->until(
                WebDriverExpectedCondition::presenceOfElementLocated(
                    WebDriverBy::className('slow-loading-content')
                )
            );

            break; // Success

        } catch (TimeoutException $e) {
            if ($attempt === 3) {
                throw $e; // Final attempt failed
            }

            echo "Attempt {$attempt} timed out after {$currentTimeout}s, retrying...";
        }
    }
}

Working with WebDriver Wait Classes

Understanding WebDriverWait vs WebDriverExpectedCondition

Use these classes for sophisticated timeout handling:

<?php

public function testAdvancedWaitStrategies()
{
    $client = static::createPantherClient();
    $crawler = $client->request('GET', 'https://example.com');

    $webDriver = $client->getWebDriver();
    $wait = new WebDriverWait($webDriver, 20);

    // Wait for element to be present
    $wait->until(
        WebDriverExpectedCondition::presenceOfElementLocated(
            WebDriverBy::id('dynamic-element')
        )
    );

    // Wait for element to be visible
    $wait->until(
        WebDriverExpectedCondition::visibilityOfElementLocated(
            WebDriverBy::className('hidden-element')
        )
    );

    // Wait for title to contain specific text
    $wait->until(
        WebDriverExpectedCondition::titleContains('Expected Title')
    );

    // Wait for URL to change
    $wait->until(
        WebDriverExpectedCondition::urlContains('/new-page')
    );
}

Custom Condition Functions

Create your own waiting conditions:

<?php

public function testCustomConditions()
{
    $client = static::createPantherClient();
    $crawler = $client->request('GET', 'https://example.com');

    $webDriver = $client->getWebDriver();
    $wait = new WebDriverWait($webDriver, 30);

    // Wait for multiple elements to be present
    $wait->until(function ($driver) {
        $elements = $driver->findElements(WebDriverBy::className('item'));
        return count($elements) >= 5; // Wait for at least 5 items
    });

    // Wait for element text to match pattern
    $wait->until(function ($driver) {
        $elements = $driver->findElements(WebDriverBy::id('status'));
        if (count($elements) > 0) {
            $text = $elements[0]->getText();
            return preg_match('/^Complete/', $text);
        }
        return false;
    });
}

Integration with WebScraping.AI

For production web scraping scenarios, you might want to combine Symfony Panther's timeout capabilities with handling timeouts in Puppeteer to understand cross-platform timeout management strategies.

When dealing with single-page applications that require specific timeout handling, consider the techniques for crawling SPAs with proper timeout management to ensure reliable data extraction.

Debugging Timeout Issues

Logging and Monitoring

Add logging to understand timeout behavior:

<?php

use Psr\Log\LoggerInterface;

class TimoutDebugger
{
    private LoggerInterface $logger;

    public function __construct(LoggerInterface $logger)
    {
        $this->logger = $logger;
    }

    public function waitWithLogging($wait, $condition, string $description): mixed
    {
        $startTime = microtime(true);

        try {
            $this->logger->info("Starting wait for: {$description}");

            $result = $wait->until($condition);

            $duration = microtime(true) - $startTime;
            $this->logger->info("Wait successful after {$duration}s: {$description}");

            return $result;

        } catch (TimeoutException $e) {
            $duration = microtime(true) - $startTime;
            $this->logger->error("Wait timeout after {$duration}s: {$description}");
            throw $e;
        }
    }
}

Performance Monitoring

Track timeout performance across different scenarios:

<?php

class TimeoutPerformanceMonitor
{
    private array $metrics = [];

    public function recordTimeout(string $operation, float $duration, bool $success): void
    {
        $this->metrics[] = [
            'operation' => $operation,
            'duration' => $duration,
            'success' => $success,
            'timestamp' => time()
        ];
    }

    public function getAverageTimeout(string $operation): float
    {
        $relevant = array_filter($this->metrics, fn($m) => $m['operation'] === $operation);
        $successful = array_filter($relevant, fn($m) => $m['success']);

        if (empty($successful)) {
            return 0.0;
        }

        $totalDuration = array_sum(array_column($successful, 'duration'));
        return $totalDuration / count($successful);
    }

    public function getTimeoutRecommendation(string $operation): int
    {
        $average = $this->getAverageTimeout($operation);
        // Add 50% buffer to average for reliability
        return max(10, ceil($average * 1.5));
    }
}

Conclusion

Proper timeout configuration in Symfony Panther is essential for reliable web scraping and testing. By implementing appropriate page load timeouts, element interaction timeouts, and custom wait conditions, you can handle various scenarios from simple page navigation to complex AJAX interactions. Remember to adjust timeout values based on your environment and implement retry logic for better resilience against network issues and slow-loading content.

The key is to balance between reasonable wait times that accommodate slow-loading content and efficient execution that doesn't unnecessarily delay your scraping operations. Start with conservative timeouts and adjust based on your specific use cases and performance requirements. Monitor your timeout performance over time and use the data to optimize your configuration for maximum reliability and efficiency.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"

Try in request builder

Related Questions

Get Started Now

WebScraping.AI provides rotating proxies, Chromium rendering and built-in HTML parser for web scraping
Icon