How to Navigate to Different Pages Using Puppeteer

Puppeteer provides several methods for navigating between pages in a headless Chrome browser. This guide covers the main navigation techniques with practical examples.

Installation

First, install Puppeteer in your Node.js project:

npm install puppeteer

Basic Navigation with page.goto()

The page.goto() method is the primary way to navigate to URLs:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate to a URL
  await page.goto('https://example.com');

  // Take a screenshot to verify navigation
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();

Navigation Options and Wait Strategies

Control when navigation is considered complete using the waitUntil option:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Wait for different loading states
  await page.goto('https://example.com', {
    waitUntil: 'networkidle0' // Wait until no network requests for 500ms
  });

  // Other wait strategies:
  // 'load' - Wait for load event (default)
  // 'domcontentloaded' - Wait for DOMContentLoaded event
  // 'networkidle2' - Wait until ≤2 network requests for 500ms

  await browser.close();
})();

Sequential Navigation

Navigate through multiple pages in sequence:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate through multiple pages
  const urls = [
    'https://example.com',
    'https://example.com/about',
    'https://example.com/contact'
  ];

  for (const url of urls) {
    console.log(`Navigating to: ${url}`);
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Pause briefly between navigations
    // (page.waitForTimeout was deprecated and removed in recent Puppeteer versions)
    await new Promise(resolve => setTimeout(resolve, 1000));
  }

  await browser.close();
})();

Navigation with Error Handling

Handle navigation errors gracefully:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  try {
    const response = await page.goto('https://example.com', {
      waitUntil: 'networkidle2',
      timeout: 30000 // 30 second timeout
    });

    // page.goto() can return null (e.g. same-URL hash navigation), so check first
    if (response && response.ok()) {
      console.log('Navigation successful');
    } else if (response) {
      console.log(`Navigation failed with status: ${response.status()}`);
    }
  } catch (error) {
    console.error('Navigation error:', error.message);
  }

  await browser.close();
})();

Browser Navigation Methods

Use browser-like navigation methods:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Initial navigation
  await page.goto('https://example.com');

  // Navigate to another page
  await page.goto('https://example.com/about');

  // Go back (like browser back button)
  await page.goBack();

  // Go forward (like browser forward button)
  await page.goForward();

  // Reload the current page
  await page.reload();

  await browser.close();
})();

Navigation with Custom Headers and Referrer

Set custom headers or referrer for navigation:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set custom headers
  // Set the user agent via its dedicated API, and other headers via setExtraHTTPHeaders
  await page.setUserAgent('Custom User Agent');
  await page.setExtraHTTPHeaders({
    'Accept-Language': 'en-US,en;q=0.9'
  });

  // Navigate with custom referrer
  await page.goto('https://example.com', {
    referer: 'https://google.com',
    waitUntil: 'networkidle2'
  });

  await browser.close();
})();

Waiting for Specific Elements After Navigation

Wait for specific elements to appear after navigation:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Wait for specific selector to appear
  await page.waitForSelector('#main-content', { timeout: 5000 });

  // Or wait for function to return true
  await page.waitForFunction(
    () => document.querySelector('#main-content') !== null
  );

  await browser.close();
})();

Multiple Tabs Navigation

Navigate in multiple tabs simultaneously:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();

  // Create multiple pages/tabs
  const page1 = await browser.newPage();
  const page2 = await browser.newPage();

  // Navigate both tabs simultaneously
  await Promise.all([
    page1.goto('https://example.com'),
    page2.goto('https://google.com')
  ]);

  console.log('Both pages loaded');

  await browser.close();
})();

Best Practices

  1. Always use try-catch blocks for navigation error handling
  2. Set appropriate timeouts based on expected page load times (see the sketch after this list)
  3. Choose the right waitUntil strategy for your use case
  4. Wait for specific elements when needed after navigation
  5. Handle network failures gracefully with retry logic
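
Practice 2 can be handled once per page instead of passing a timeout to every call. Here is a minimal sketch using Puppeteer's page.setDefaultNavigationTimeout() and page.setDefaultTimeout(); the timeout values and URL are illustrative:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Default timeout for navigations (goto, goBack, goForward, reload)
  page.setDefaultNavigationTimeout(60000); // 60 seconds

  // Default timeout for other waits (waitForSelector, waitForFunction, ...)
  page.setDefaultTimeout(10000); // 10 seconds

  // No per-call timeout needed; the defaults above apply
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });

  await browser.close();
})();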

Common Navigation Patterns

// Retry navigation on failure
async function navigateWithRetry(page, url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
      return; // Success
    } catch (error) {
      console.log(`Navigation attempt ${i + 1} failed:`, error.message);
      if (i === maxRetries - 1) throw error; // Last attempt failed
      await new Promise(resolve => setTimeout(resolve, 2000)); // Wait before retry
    }
  }
}
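
A minimal usage sketch for the retry helper above; the URL is just a placeholder:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Give up after three failed attempts
  await navigateWithRetry(page, 'https://example.com', 3);

  await browser.close();
})();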

Navigation in Puppeteer is powerful and flexible, letting you control exactly when and how pages load so your automation scripts stay reliable.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
