
How do I debug JavaScript errors that occur during scraping with Symfony Panther?

Debugging JavaScript errors in Symfony Panther is harder than in an ordinary browser session because the browser runs headless in the background. This guide covers techniques for surfacing those errors, handling them from PHP, and avoiding their most common causes during web scraping.

Understanding JavaScript Errors in Symfony Panther

Symfony Panther uses Chrome/Chromium in headless mode by default, which means JavaScript errors might not be immediately visible. These errors can cause scraping failures, incomplete data extraction, or unexpected behavior in your automation scripts.

Enable Console Logging

The first step in debugging is to enable console logging to capture JavaScript errors and messages:

<?php

use Symfony\Component\Panther\PantherTestCase;
use Symfony\Component\Panther\Client;

class WebScrapingTest extends PantherTestCase
{
    public function testJavaScriptDebugging()
    {
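        // Ask ChromeDriver to record every console message. Logging preferences are
        // capabilities, so they go in the third argument (manager options).
        // Chrome's own flags (--enable-logging, --v=1) can be supplied through the
        // PANTHER_CHROME_ARGUMENTS environment variable instead.
        $client = static::createPantherClient(
            ['browser' => static::CHROME],
            [],
            [
                'capabilities' => [
                    'goog:loggingPrefs' => ['browser' => 'ALL'],
                ],
            ]
        );

        // Optional: list the log types this driver exposes (e.g. "browser", "driver")
        print_r($client->getWebDriver()->manage()->getAvailableLogTypes());

        $crawler = $client->request('GET', 'https://example.com');

        // Fetch console logs after page load. Each entry is an array with
        // "level", "message" and "timestamp" keys.
        $logs = $client->getWebDriver()->manage()->getLog('browser');

        foreach ($logs as $log) {
            echo "Level: " . $log['level'] . "\n";
            echo "Message: " . $log['message'] . "\n";
            echo "Timestamp: " . $log['timestamp'] . "\n\n";
        }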
    }
}
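
Browser logs tend to be noisy, so it often helps to keep only the entries that matter. The following is a minimal sketch, assuming each log entry is an associative array with `level`, `message` and `timestamp` keys (the shape php-webdriver returns from `getLog('browser')`). The function name `filterLogsByLevel`, the severity ranking and the sample entries are illustrative assumptions, not part of Panther.

```php
<?php

/**
 * Keep only log entries at or above a minimum severity.
 * Assumes each entry is an array with 'level', 'message' and 'timestamp' keys.
 */
function filterLogsByLevel(array $entries, string $minLevel = 'WARNING'): array
{
    // Severity ranking for the level names Chrome typically reports (assumed set).
    $rank = ['DEBUG' => 0, 'INFO' => 1, 'WARNING' => 2, 'SEVERE' => 3];
    $threshold = $rank[strtoupper($minLevel)] ?? 0;

    return array_values(array_filter(
        $entries,
        static function (array $entry) use ($rank, $threshold): bool {
            $level = strtoupper((string) ($entry['level'] ?? 'INFO'));
            // Unknown level names are kept rather than silently dropped.
            return ($rank[$level] ?? PHP_INT_MAX) >= $threshold;
        }
    ));
}

// Sample entries standing in for real browser logs.
$sample = [
    ['level' => 'INFO', 'message' => 'app booted', 'timestamp' => 1700000000000],
    ['level' => 'SEVERE', 'message' => 'Uncaught TypeError: x is undefined', 'timestamp' => 1700000000500],
];

foreach (filterLogsByLevel($sample, 'SEVERE') as $entry) {
    echo $entry['level'] . ': ' . $entry['message'] . "\n";
}
```

With a live client, the same function could be applied to the array returned by the driver's browser log before printing or saving it.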

Use Non-Headless Mode for Visual Debugging

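Running Panther in non-headless mode lets you watch the browser window and observe JavaScript execution. Headless mode is controlled by the PANTHER_NO_HEADLESS environment variable rather than by a client option, so start the run with it set, for example PANTHER_NO_HEADLESS=1 vendor/bin/phpunit:

<?php

use Symfony\Component\Panther\PantherTestCase;

class DebugTest extends PantherTestCase
{
    public function testWithVisibleBrowser()
    {
        // The window is visible only when PANTHER_NO_HEADLESS=1 is set in the
        // environment; extra Chrome flags such as --window-size=1920,1080 can be
        // passed through PANTHER_CHROME_ARGUMENTS.
        $client = static::createPantherClient([
            'browser' => static::CHROME,
        ]);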

        $crawler = $client->request('GET', 'https://example.com');

        // Add a pause to observe the page
        sleep(5);

        // Take a screenshot for debugging
        $client->takeScreenshot('debug_screenshot.png');
    }
}

Capture and Handle JavaScript Errors

Implement error handling to catch and process JavaScript errors systematically:

<?php

use Symfony\Component\Panther\PantherTestCase;
use Facebook\WebDriver\Exception\JavascriptErrorException;

class ErrorHandlingTest extends PantherTestCase
{
    public function testJavaScriptErrorHandling()
    {
        $client = static::createPantherClient();
        $crawler = $client->request('GET', 'https://example.com');

        try {
            // Execute JavaScript that might throw an error
            $result = $client->executeScript('
                try {
                    // Your JavaScript code here
                    return document.querySelector("#data").textContent;
                } catch (error) {
                    console.error("JavaScript Error:", error.message);
                    throw new Error("Element not found: " + error.message);
                }
            ');

            echo "Result: " . $result;

        } catch (JavascriptErrorException $e) {
            // An error thrown inside the script surfaces in PHP as this exception
            echo "JavaScript Error: " . $e->getMessage() . "\n";

            // Log additional debugging information
            $this->logPageState($client);
        }
    }

    private function logPageState($client)
    {
        // Get page source for debugging
        $pageSource = $client->getPageSource();
        file_put_contents('debug_page_source.html', $pageSource);

        // Get current URL
        echo "Current URL: " . $client->getCurrentURL() . "\n";

        // Get page title
        echo "Page Title: " . $client->getTitle() . "\n";
    }
}
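
An alternative to throwing from the injected script is to have it return a small result envelope and decide in PHP what counts as a failure. The sketch below assumes a `{ok, value, error}` convention invented for this example; `unwrapScriptResult` is a hypothetical helper, not a Panther or php-webdriver function. It relies on the driver converting a returned JavaScript object into a PHP associative array.

```php
<?php

/**
 * Unwrap a {ok, value, error} envelope returned by an injected script.
 * The envelope shape is a convention of this sketch, not a Panther feature.
 */
function unwrapScriptResult($result)
{
    if (!is_array($result) || !array_key_exists('ok', $result)) {
        throw new UnexpectedValueException('Script did not return the expected envelope');
    }
    if ($result['ok'] !== true) {
        throw new RuntimeException('JavaScript failed: ' . ($result['error'] ?? 'unknown error'));
    }

    return $result['value'] ?? null;
}

// JavaScript to pass to executeScript(); it reports failures instead of throwing.
$script = <<<'JS'
try {
    var el = document.querySelector("#data");
    if (!el) { return {ok: false, error: "#data not found"}; }
    return {ok: true, value: el.textContent};
} catch (e) {
    return {ok: false, error: String((e && e.message) || e)};
}
JS;

// With a live client: $text = unwrapScriptResult($client->executeScript($script));
```

Keeping the failure reason in the return value means the PHP side gets a readable message even when the driver's own exception text is terse.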

Wait for JavaScript Execution

Many JavaScript errors occur because elements aren't ready when your code tries to interact with them. Use proper waiting strategies:

<?php

use Symfony\Component\Panther\PantherTestCase;
use Facebook\WebDriver\WebDriverBy;
use Facebook\WebDriver\WebDriverWait;
use Facebook\WebDriver\WebDriverExpectedCondition;

class WaitingTest extends PantherTestCase
{
    public function testWaitForJavaScript()
    {
        $client = static::createPantherClient();
        $crawler = $client->request('GET', 'https://example.com');

        // Wait for specific element to be present
        $wait = new WebDriverWait($client->getWebDriver(), 10);

        try {
            $element = $wait->until(
                WebDriverExpectedCondition::presenceOfElementLocated(
                    WebDriverBy::id('dynamic-content')
                )
            );

            // Wait for JavaScript to modify the element
            $wait->until(function() use ($client) {
                $result = $client->executeScript('
                    var element = document.getElementById("dynamic-content");
                    return element && element.textContent.trim() !== "";
                ');
                return $result === true;
            });

            $content = $element->getText();
            echo "Content: " . $content;

        } catch (\Exception $e) {
            echo "Timeout waiting for element: " . $e->getMessage();
            $this->debugCurrentState($client);
        }
    }

    private function debugCurrentState($client)
    {
        // Check if jQuery is loaded
        $jqueryLoaded = $client->executeScript('return typeof jQuery !== "undefined"');
        echo "jQuery loaded: " . ($jqueryLoaded ? 'Yes' : 'No') . "\n";

        // Report errors the browser has already logged. Overriding console.error
        // at this point would only see future calls, so read the driver's log.
        $logs = $client->getWebDriver()->manage()->getLog('browser');
        $errors = [];
        foreach ($logs as $log) {
            if ($log['level'] === 'SEVERE') {
                $errors[] = $log['message'];
            }
        }

        if (!empty($errors)) {
            echo "JavaScript Errors Found:\n";
            foreach ($errors as $error) {
                echo "- " . $error . "\n";
            }
        }
    }
}
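
The waiting strategy above boils down to polling a condition until it holds or a deadline passes. The following is a minimal, driver-independent sketch of that idea; `waitUntil` is a hypothetical helper written for illustration and its defaults are assumptions. For the common case of waiting on a CSS selector, Panther's client also provides `waitFor()`.

```php
<?php

/**
 * Poll $condition until it returns a truthy value or the timeout elapses.
 * Returns the truthy value; throws RuntimeException on timeout.
 */
function waitUntil(callable $condition, float $timeoutSeconds = 10.0, int $intervalMs = 250)
{
    $deadline = microtime(true) + $timeoutSeconds;
    $lastError = null;

    do {
        try {
            $value = $condition();
            if ($value) {
                return $value;
            }
        } catch (Throwable $e) {
            // Transient failures (e.g. element not attached yet) are retried.
            $lastError = $e;
        }
        usleep($intervalMs * 1000);
    } while (microtime(true) < $deadline);

    // Chain the last transient error so the cause is not lost.
    throw new RuntimeException(
        'Condition not met within ' . $timeoutSeconds . 's',
        0,
        $lastError
    );
}

// With a live client, a condition might look like:
// waitUntil(function () use ($client) {
//     return $client->executeScript('return document.readyState === "complete"');
// });
```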

Debug AJAX Requests and Network Issues

Monitor network requests to identify failed AJAX calls that might cause JavaScript errors:

<?php

use Symfony\Component\Panther\PantherTestCase;

class NetworkDebugTest extends PantherTestCase
{
    public function testNetworkMonitoring()
    {
        $client = static::createPantherClient();

        // Load the page first: a script injected before navigation is discarded
        // when the new document loads.
        $crawler = $client->request('GET', 'https://example.com');

        // Wrap XMLHttpRequest so later AJAX calls are recorded. Requests sent
        // during the initial page load, and calls made with fetch(), are not captured.
        $client->getWebDriver()->executeScript('
            window.requestsLog = [];
            window.errorsLog = [];

            // Override XMLHttpRequest to track AJAX requests
            var originalXHR = window.XMLHttpRequest;
            window.XMLHttpRequest = function() {
                var xhr = new originalXHR();
                var originalOpen = xhr.open;
                var originalSend = xhr.send;

                xhr.open = function(method, url) {
                    this._method = method;
                    this._url = url;
                    return originalOpen.apply(this, arguments);
                };

                xhr.send = function() {
                    var self = this;
                    this.addEventListener("loadend", function() {
                        // responseText is only readable for text responses
                        var isText = self.responseType === "" || self.responseType === "text";
                        window.requestsLog.push({
                            method: self._method,
                            url: self._url,
                            status: self.status,
                            response: isText ? self.responseText : null
                        });

                        if (self.status >= 400) {
                            window.errorsLog.push({
                                method: self._method,
                                url: self._url,
                                status: self.status,
                                error: "HTTP " + self.status
                            });
                        }
                    });
                    return originalSend.apply(this, arguments);
                };

                return xhr;
            };
        ');

        // Give AJAX requests triggered after injection time to complete
        sleep(3);

        // Get network logs
        $requests = $client->executeScript('return window.requestsLog || []');
        $errors = $client->executeScript('return window.errorsLog || []');

        echo "Network Requests:\n";
        foreach ($requests as $request) {
            echo sprintf("- %s %s (Status: %d)\n", 
                $request['method'], 
                $request['url'], 
                $request['status']
            );
        }

        if (!empty($errors)) {
            echo "\nNetwork Errors:\n";
            foreach ($errors as $error) {
                echo sprintf("- %s %s: %s\n", 
                    $error['method'], 
                    $error['url'], 
                    $error['error']
                );
            }
        }
    }
}
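
Once the request log has been pulled back into PHP, a short summary is usually easier to scan than the raw list. The sketch below assumes entries shaped like the `window.requestsLog` array above (`method`, `url`, `status`); `summarizeRequests` is a hypothetical helper and the sample data in the usage comment is made up.

```php
<?php

/**
 * Group logged requests by HTTP status and list the failed ones.
 * Assumes each entry has 'method', 'url' and 'status' keys.
 */
function summarizeRequests(array $requests): array
{
    $byStatus = [];
    $failed = [];

    foreach ($requests as $request) {
        $status = (int) ($request['status'] ?? 0);
        $byStatus[$status] = ($byStatus[$status] ?? 0) + 1;

        // Status 0 means the XHR never received a response (aborted, blocked, network error).
        if ($status === 0 || $status >= 400) {
            $failed[] = sprintf(
                '%s %s -> %d',
                $request['method'] ?? '?',
                $request['url'] ?? '?',
                $status
            );
        }
    }
    ksort($byStatus);

    return ['byStatus' => $byStatus, 'failed' => $failed];
}

// With a live client:
// $summary = summarizeRequests($client->executeScript('return window.requestsLog || []'));
// print_r($summary['failed']);
```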

Handle Asynchronous JavaScript

Modern web applications rely heavily on asynchronous JavaScript, so wait for that work to finish before reading the page. The approach mirrors how AJAX requests are handled in Puppeteer, adapted for Symfony Panther:

<?php

use Symfony\Component\Panther\PantherTestCase;

class AsyncJavaScriptTest extends PantherTestCase
{
    public function testAsyncJavaScript()
    {
        $client = static::createPantherClient();
        $crawler = $client->request('GET', 'https://example.com');

        // Wait for async operations to complete
        $client->executeScript('
            window.asyncComplete = false;

            // Simulate async operation
            setTimeout(function() {
                // Your async code here
                window.asyncComplete = true;
            }, 2000);
        ');

        // Wait for async operation
        $wait = new \Facebook\WebDriver\WebDriverWait($client->getWebDriver(), 10);
        $wait->until(function() use ($client) {
            return $client->executeScript('return window.asyncComplete === true');
        });

        // Now proceed with scraping
        $data = $client->executeScript('
            return document.querySelector("#async-content").textContent;
        ');

        echo "Async Data: " . $data;
    }
}
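
Instead of setting a flag and polling it from PHP, the waiting can happen inside the browser with `executeAsyncScript()`, which blocks until the script invokes the callback the driver passes as its last argument. This is a sketch under assumptions: `readWhenPopulated` is a hypothetical helper, `$client` is taken to be a Panther client (anything exposing `executeAsyncScript()`), and if the element never fills in, the driver's script timeout is expected to raise an exception.

```php
<?php

/**
 * Read an element's text once it is non-empty, using an async script.
 * $client is assumed to be a Panther client (anything exposing executeAsyncScript()).
 */
function readWhenPopulated(object $client, string $selector, int $pollMs = 100)
{
    $script = <<<'JS'
var selector = arguments[0];
var pollMs = arguments[1];
// WebDriver appends the completion callback as the last argument.
var done = arguments[arguments.length - 1];
(function check() {
    var el = document.querySelector(selector);
    if (el && el.textContent.trim() !== "") {
        done(el.textContent.trim());
    } else {
        setTimeout(check, pollMs);
    }
})();
JS;

    // The selector and interval travel as script arguments, not string concatenation.
    return $client->executeAsyncScript($script, [$selector, $pollMs]);
}

// With a live client: $data = readWhenPopulated($client, '#async-content');
```

Passing the selector as an argument avoids quoting problems that arise when values are spliced into the script text.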

Debug with DevTools Protocol

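For advanced debugging, combine a global error handler with the console log and page timing data that Chrome exposes through the driver:

<?php

use Symfony\Component\Panther\PantherTestCase;

class DevToolsDebugTest extends PantherTestCase
{
    public function testAdvancedDebugging()
    {
        // Record all console output. Chrome flags such as
        // --remote-debugging-port=9222 or --enable-logging are passed through the
        // PANTHER_CHROME_ARGUMENTS environment variable, not through client options.
        $client = static::createPantherClient([], [], [
            'capabilities' => [
                'goog:loggingPrefs' => ['browser' => 'ALL'],
            ],
        ]);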

        $crawler = $client->request('GET', 'https://example.com');

        // Execute JavaScript with error handling
        $result = $client->executeScript('
            try {
                // Enable more detailed error reporting
                window.onerror = function(message, source, lineno, colno, error) {
                    console.error("Global Error:", {
                        message: message,
                        source: source,
                        line: lineno,
                        column: colno,
                        error: error ? error.stack : null
                    });
                    return false;
                };

                // Your scraping logic here
                var element = document.querySelector("#target");
                if (!element) {
                    throw new Error("Target element not found");
                }

                return element.textContent;

            } catch (error) {
                console.error("Caught Error:", error.message);
                console.error("Stack:", error.stack);
                return null;
            }
        ');

        if ($result === null) {
            echo "JavaScript execution failed. Check console logs.\n";
            $this->dumpDebuggingInfo($client);
        } else {
            echo "Result: " . $result;
        }
    }

    private function dumpDebuggingInfo($client)
    {
        // Get console logs (each entry is an array with level, message, timestamp)
        $logs = $client->getWebDriver()->manage()->getLog('browser');
        echo "Console Logs:\n";
        foreach ($logs as $log) {
            echo sprintf("[%s] %s\n", $log['level'], $log['message']);
        }

        // Get page performance data
        $performanceData = $client->executeScript('
            return {
                readyState: document.readyState,
                loadEventEnd: performance.timing.loadEventEnd,
                domContentLoaded: performance.timing.domContentLoadedEventEnd,
                navigationStart: performance.timing.navigationStart
            };
        ');

        echo "Performance Data:\n";
        print_r($performanceData);
    }
}
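
The raw `performance.timing` values are epoch timestamps in milliseconds, so they are easier to read once converted to durations. The sketch below assumes an array with the same keys the script above returns (`readyState`, `navigationStart`, `domContentLoaded`, `loadEventEnd`); `summarizeTiming` is a hypothetical helper.

```php
<?php

/**
 * Turn raw performance.timing values into durations in milliseconds.
 * A value of 0 means the browser has not reached that milestone yet.
 */
function summarizeTiming(array $timing): array
{
    $start = (int) ($timing['navigationStart'] ?? 0);

    // Elapsed time since navigation start, or null if the milestone is missing.
    $elapsed = static function ($mark) use ($start): ?int {
        $mark = (int) $mark;
        return ($start > 0 && $mark > 0) ? $mark - $start : null;
    };

    return [
        'readyState' => $timing['readyState'] ?? 'unknown',
        'domContentLoadedMs' => $elapsed($timing['domContentLoaded'] ?? 0),
        'loadMs' => $elapsed($timing['loadEventEnd'] ?? 0),
    ];
}

// With live data: print_r(summarizeTiming($performanceData));
```

A `null` for `loadMs` together with a `readyState` other than `complete` suggests the script ran before the page finished loading, which is a common cause of missing elements.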

Common JavaScript Error Patterns and Solutions

1. Element Not Found Errors

// Problem: Element not ready when JavaScript executes
// Solution: Use proper waiting strategies

$wait = new WebDriverWait($client->getWebDriver(), 10);
$element = $wait->until(
    WebDriverExpectedCondition::elementToBeClickable(
        WebDriverBy::id('button')
    )
);

2. Timing Issues with Dynamic Content

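// Problem: Content loads after page ready event
// Solution: Wait for specific conditions
// A W3C-compliant driver waits for the returned Promise and yields its resolved
// value; if it never resolves, the call fails with a script timeout.

$content = $client->executeScript('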
    function waitForContent() {
        return new Promise((resolve) => {
            const checkContent = () => {
                const element = document.querySelector("#dynamic");
                if (element && element.textContent.trim()) {
                    resolve(element.textContent);
                } else {
                    setTimeout(checkContent, 100);
                }
            };
            checkContent();
        });
    }

    return waitForContent();
');

3. Cross-Origin Issues

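// Problem: CORS errors preventing resource loading
// Solution: Disable web security for testing (development only)
// Chrome flags are the second argument of Client::createChromeClient(); a custom
// list replaces Panther's defaults, so --headless is listed explicitly.

$client = \Symfony\Component\Panther\Client::createChromeClient(null, [
    '--headless',
    '--disable-web-security',
    '--allow-running-insecure-content',
]);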

Best Practices for JavaScript Debugging

  1. Always use try-catch blocks in your JavaScript code
  2. Enable verbose logging during development
  3. Take screenshots at critical points for visual debugging
  4. Use proper waiting strategies instead of fixed sleeps
  5. Monitor network requests to identify failed API calls
  6. Test in non-headless mode when debugging complex issues

As with error handling in Puppeteer, robust error handling in Symfony Panther requires a systematic approach: catch failures, log them, and respond according to their type.

Advanced Debugging Techniques

Custom Error Reporter

<?php

class JavaScriptErrorReporter
{
    private $client;
    private $errors = [];

    public function __construct($client)
    {
        $this->client = $client;
        $this->setupErrorCapture();
    }

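    private function setupErrorCapture()
    {
        // Handlers live in the current document only: create the reporter after
        // the page has loaded, and create a new one after each navigation.
        $this->client->executeScript('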
            window.pantherErrors = [];

            window.onerror = function(message, source, lineno, colno, error) {
                window.pantherErrors.push({
                    type: "error",
                    message: message,
                    source: source,
                    line: lineno,
                    column: colno,
                    stack: error ? error.stack : null,
                    timestamp: Date.now()
                });
            };

            window.addEventListener("unhandledrejection", function(event) {
                window.pantherErrors.push({
                    type: "unhandledrejection",
                    message: event.reason ? event.reason.toString() : "Unknown promise rejection",
                    timestamp: Date.now()
                });
            });
        ');
    }

    public function getErrors()
    {
        return $this->client->executeScript('return window.pantherErrors || []');
    }

    public function clearErrors()
    {
        $this->client->executeScript('window.pantherErrors = []');
    }

    public function hasErrors()
    {
        $errors = $this->getErrors();
        return !empty($errors);
    }
}
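
Pages that fail inside a loop or timer can produce the same error many times, so it can help to collapse the captured list before reporting it. The sketch below assumes entries shaped like the `window.pantherErrors` objects above (`type`, `message`); `groupErrors` is a hypothetical helper.

```php
<?php

/**
 * Collapse captured errors into "type: message" => occurrence count.
 * Assumes each entry has 'type' and 'message' keys.
 */
function groupErrors(array $errors): array
{
    $counts = [];
    foreach ($errors as $error) {
        $key = ($error['type'] ?? 'error') . ': ' . ($error['message'] ?? '(no message)');
        $counts[$key] = ($counts[$key] ?? 0) + 1;
    }
    // Most frequent first.
    arsort($counts);

    return $counts;
}

// With a live reporter:
// foreach (groupErrors($reporter->getErrors()) as $message => $count) {
//     echo $count . 'x ' . $message . "\n";
// }
```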

Debugging with Screenshots and Page Dumps

Visual debugging can be incredibly helpful when JavaScript errors affect page rendering:

<?php

class VisualDebugger
{
    private $client;
    private $debugDir;

    public function __construct($client, $debugDir = 'debug')
    {
        $this->client = $client;
        $this->debugDir = $debugDir;

        if (!is_dir($debugDir)) {
            mkdir($debugDir, 0755, true);
        }
    }

    public function captureState($label = 'debug')
    {
        $timestamp = date('Y-m-d_H-i-s');
        $prefix = $this->debugDir . '/' . $label . '_' . $timestamp;

        // Take screenshot
        $screenshotPath = $prefix . '.png';
        $this->client->takeScreenshot($screenshotPath);

        // Save page source
        $htmlPath = $prefix . '.html';
        file_put_contents($htmlPath, $this->client->getPageSource());

        // Save console logs (each entry is an array; timestamps are in milliseconds)
        $logPath = $prefix . '_console.log';
        $logs = $this->client->getWebDriver()->manage()->getLog('browser');
        $logContent = "";
        foreach ($logs as $log) {
            $logContent .= sprintf("[%s] %s: %s\n",
                date('Y-m-d H:i:s', intdiv((int) $log['timestamp'], 1000)),
                $log['level'],
                $log['message']
            );
        }
        file_put_contents($logPath, $logContent);

        echo "Debug info saved to: $prefix.*\n";
    }
}
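
A debugger like this is most useful when it runs automatically at the moment a step fails. The following is a minimal sketch of such a wrapper; `runWithDebugCapture` is a hypothetical helper, and the capture callback is assumed to accept a label, as `captureState()` above does.

```php
<?php

/**
 * Run a scraping step; if it throws, invoke $capture with a label, then rethrow.
 */
function runWithDebugCapture(string $label, callable $step, callable $capture)
{
    try {
        return $step();
    } catch (Throwable $e) {
        try {
            $capture($label . '_failed');
        } catch (Throwable $captureError) {
            // Never let a failing capture mask the original error.
        }
        throw $e;
    }
}

// With a live client and debugger:
// $title = runWithDebugCapture(
//     'product_page',
//     function () use ($client) { return $client->getTitle(); },
//     [$debugger, 'captureState']
// );
```

Rethrowing the original exception keeps test failures and stack traces intact while still leaving a screenshot, page dump and console log behind.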

By implementing these debugging techniques and following best practices, you can effectively identify and resolve JavaScript errors in your Symfony Panther web scraping projects. Remember to always test your scraping scripts thoroughly and implement proper error handling to make your automation more robust and reliable.
