How to Handle Timeouts in Puppeteer

Handling timeouts correctly is essential for robust web scraping and automation scripts built with Puppeteer. This guide covers setting default timeouts, overriding them for individual operations, handling timeout errors, and a few advanced patterns, so that your scripts fail predictably instead of hanging.

Understanding Puppeteer Timeouts

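By default, Puppeteer operations that wait for something (navigation, waitForSelector(), waitForFunction(), and similar) time out after 30 seconds (30000 ms). When an operation exceeds its limit, it rejects with a TimeoutError. Catching that error lets your script fail gracefully instead of crashing. Passing a timeout of 0 disables the limit entirely, so use it with care.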

1. Setting Default Timeouts

Global Navigation Timeout

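Use setDefaultNavigationTimeout() to set the timeout for all navigation operations on a page, such as goto(), goBack(), goForward(), reload(), setContent(), and waitForNavigation():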

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set default navigation timeout to 60 seconds
  page.setDefaultNavigationTimeout(60000);

  await page.goto('https://example.com');
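  // A click alone does not wait for navigation; pair it with waitForNavigation()
  await Promise.all([
    page.waitForNavigation(), // Uses the 60s timeout
    page.click('a[href="/slow-page"]'),
  ]);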

  await browser.close();
})();

Global Operation Timeout

Use setDefaultTimeout() to change the default for every operation on the page that accepts a timeout, including navigation, waitForSelector(), and waitForFunction():

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set default timeout for ALL operations to 45 seconds
  page.setDefaultTimeout(45000);

  await page.goto('https://example.com');
  await page.waitForSelector('.dynamic-content'); // Uses the 45s timeout

  await browser.close();
})();
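
When both defaults are set on the same page, setDefaultNavigationTimeout() takes priority over setDefaultTimeout() for navigation methods. The sketch below shows one way to set both in a single place. The helper name applyTimeouts and its option names are hypothetical, and page is assumed to be a Puppeteer Page:

```javascript
// Hypothetical helper: apply both defaults in one place.
// `page` is assumed to be a Puppeteer Page (or an object with the same methods).
function applyTimeouts(page, { operationMs = 30000, navigationMs = operationMs } = {}) {
  for (const [name, value] of Object.entries({ operationMs, navigationMs })) {
    if (!Number.isFinite(value) || value < 0) {
      throw new RangeError(`${name} must be a non-negative number of milliseconds`);
    }
  }
  page.setDefaultTimeout(operationMs);            // waitForSelector, waitForFunction, ...
  page.setDefaultNavigationTimeout(navigationMs); // goto, reload, waitForNavigation, ...
  return { operationMs, navigationMs };
}

// Usage (assumed values): 45 s for waits, 90 s for navigation
// applyTimeouts(page, { operationMs: 45000, navigationMs: 90000 });
```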

2. Operation-Specific Timeouts

Navigation Timeouts

Specify timeouts for individual navigation operations:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Different timeouts for different pages
  await page.goto('https://fast-site.com', { timeout: 15000 });
  await page.goto('https://slow-site.com', { timeout: 90000 });

  await browser.close();
})();

Wait Operations Timeouts

Control timeouts for waiting operations:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Wait for selector with custom timeout
  await page.waitForSelector('.load-button', { timeout: 20000 });

  // Wait for function with custom timeout
  await page.waitForFunction(
    // Optional chaining avoids a TypeError while .status is not in the DOM yet
    () => document.querySelector('.status')?.textContent === 'Ready',
    { timeout: 10000 }
  );

  await browser.close();
})();
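
Some elements, such as a cookie banner, may legitimately never appear, and a timeout while waiting for them is not a failure. A minimal sketch of that case follows. The helper name waitForOptionalSelector and its 5-second default are assumptions for illustration, and page is assumed to be a Puppeteer Page:

```javascript
// Hypothetical helper: wait for an element that may never appear.
// Resolves to the element handle, or to null if the wait times out.
async function waitForOptionalSelector(page, selector, timeoutMs = 5000) {
  try {
    return await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (error) {
    if (error.name === 'TimeoutError') return null; // absence is not a failure here
    throw error; // anything else (closed page, invalid selector) is still an error
  }
}

// Usage (selector is an assumed example):
// const banner = await waitForOptionalSelector(page, '.cookie-banner', 3000);
// if (banner) await banner.click();
```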

3. Error Handling Strategies

Basic Try-Catch

Handle timeout errors gracefully:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  try {
    await page.goto('https://unreliable-site.com', { timeout: 15000 });
    console.log('Page loaded successfully');
  } catch (error) {
    if (error.name === 'TimeoutError') {
      console.log('Page took too long to load');
    } else {
      console.log('Other error occurred:', error.message);
    }
  }

  await browser.close();
})();
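
If a catch block rethrows, or an error is raised outside the try block, a browser.close() call placed after the try-catch is skipped and the browser process can be left running. One way to guard against that is a finally block. The sketch below assumes a hypothetical helper named withBrowser, and takes the launcher as a parameter (assumed to expose launch() the way the puppeteer module does):

```javascript
// Hypothetical helper: run `task(page)` and always close the browser afterwards.
// `launcher` is assumed to expose launch() like the `puppeteer` module does.
async function withBrowser(launcher, task) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    return await task(page);
  } finally {
    await browser.close(); // runs on success, on TimeoutError, and on any other error
  }
}

// Usage (requires the puppeteer package):
// const puppeteer = require('puppeteer');
// await withBrowser(puppeteer, (page) =>
//   page.goto('https://example.com', { timeout: 15000 }));
```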

Retry Logic with Timeouts

Implement retry mechanisms for unreliable operations:

const puppeteer = require('puppeteer');

async function navigateWithRetry(page, url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await page.goto(url, { 
        timeout: 20000,
        waitUntil: 'networkidle2' 
      });
      console.log(`Successfully loaded ${url}`);
      return;
    } catch (error) {
      console.log(`Attempt ${i + 1} failed: ${error.message}`);
      if (i === maxRetries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 2000)); // Wait 2s before retry
    }
  }
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  try {
    await navigateWithRetry(page, 'https://flaky-website.com');
  } catch (error) {
    console.log('All retry attempts failed');
  }

  await browser.close();
})();
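
The fixed 2-second pause above retries on every kind of error. A possible variation, sketched here under assumptions, retries only on TimeoutError and doubles the delay after each attempt (exponential backoff). The helper name retryOnTimeout and its option names are hypothetical:

```javascript
// Hypothetical helper: retry `operation` only when it fails with a TimeoutError,
// doubling the delay between attempts (exponential backoff).
async function retryOnTimeout(operation, { maxAttempts = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      const retryable = error.name === 'TimeoutError';
      if (!retryable || attempt >= maxAttempts) throw error;
      const delayMs = baseDelayMs * 2 ** (attempt - 1); // base, 2x base, 4x base, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Usage with a Puppeteer page (assumed):
// await retryOnTimeout(() => page.goto(url, { timeout: 20000, waitUntil: 'networkidle2' }));
```

Errors that are not timeouts, such as an invalid URL, are rethrown immediately, since repeating the same request is unlikely to help.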

4. Advanced Timeout Patterns

Custom Timeouts with Promise.race()

Implement custom timeout logic using Promise.race():

const puppeteer = require('puppeteer');

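function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('Custom timeout')), ms);
  });
  // Clear the timer once either promise settles so it cannot keep the process alive
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}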

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  try {
    // Custom 10-second timeout for complex operations
    await withTimeout(
      page.evaluate(() => {
        // Complex client-side operation
        return new Promise(resolve => {
          setTimeout(() => resolve('Operation complete'), 8000);
        });
      }),
      10000
    );
  } catch (error) {
    console.log('Operation timed out or failed:', error.message);
  }

  await browser.close();
})();
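
Promise.race() only stops waiting for the slower promise; it does not cancel it, so the losing operation keeps running in the browser. The sketch below shows one way to react to that, under stated assumptions: it rejects with a dedicated error type and runs an optional cleanup callback when the deadline passes. The names withDeadline, OperationTimeoutError, and onTimeout are hypothetical:

```javascript
// Hypothetical error type so callers can tell this timeout from other failures.
class OperationTimeoutError extends Error {
  constructor(ms) {
    super(`Operation exceeded ${ms} ms`);
    this.name = 'OperationTimeoutError';
  }
}

// Hypothetical helper: reject after `ms`, clear the timer if the operation
// settles first, and run `onTimeout` (e.g. () => page.close()) on timeout.
function withDeadline(promise, ms, onTimeout = () => {}) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new OperationTimeoutError(ms));
      // Best-effort cleanup; a failure here must not mask the timeout error.
      Promise.resolve().then(onTimeout).catch(() => {});
    }, ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage with a Puppeteer page (assumed):
// await withDeadline(page.evaluate(() => longRunningTask()), 10000, () => page.close());
```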

Conditional Timeouts

Adjust timeouts based on conditions:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const isProduction = process.env.NODE_ENV === 'production';
  const timeout = isProduction ? 60000 : 30000; // Longer timeout in production

  page.setDefaultTimeout(timeout);

  try {
    await page.goto('https://example.com');

    // Conditional timeout based on page content
    const hasSlowContent = await page.$('.slow-loading-widget');
    const waitTimeout = hasSlowContent ? 45000 : 15000;

    await page.waitForSelector('.main-content', { timeout: waitTimeout });
  } catch (error) {
    console.log('Timeout or error occurred:', error.message);
  }

  await browser.close();
})();
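
Instead of hard-coding the values, the timeout can also be read from the environment so that it can be tuned per deployment. The following is a minimal sketch. The variable name PUPPETEER_TIMEOUT_MS and the helper resolveTimeout are assumptions for illustration, not settings that Puppeteer itself reads:

```javascript
// Hypothetical helper: read a timeout from an environment variable, falling
// back to a default when the variable is missing or not a valid number.
// PUPPETEER_TIMEOUT_MS is an assumed variable name, not a Puppeteer setting.
function resolveTimeout(env = process.env, fallbackMs = 30000) {
  const raw = env.PUPPETEER_TIMEOUT_MS;
  if (raw === undefined || raw.trim() === '') return fallbackMs;
  const parsed = Number(raw);
  // Reject NaN, negatives, and Infinity; 0 is allowed and disables the timeout.
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : fallbackMs;
}

// Usage with a Puppeteer page (assumed):
// page.setDefaultTimeout(resolveTimeout());
```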

Best Practices

  1. Set reasonable defaults: Use setDefaultTimeout() early in your script
  2. Be specific: Use operation-specific timeouts for critical operations
  3. Handle errors gracefully: Always wrap timeout-prone operations in try-catch blocks
  4. Consider network conditions: Adjust timeouts based on your target environment
  5. Use appropriate wait conditions: Choose between load, domcontentloaded, networkidle0, and networkidle2 based on your needs
  6. Implement retries: Add retry logic for operations that might occasionally fail due to network issues

Remember that timeout values are specified in milliseconds, and finding the right balance between reliability and performance is key to successful Puppeteer automation.
