How to Use Puppeteer in Headless vs. Non-Headless Mode

Puppeteer offers two distinct execution modes: headless and non-headless (also called "headed"). Understanding when and how to use each mode is crucial for effective web scraping, testing, and automation. This guide explores both modes, their advantages, disadvantages, and practical implementation strategies.

Understanding Headless vs Non-Headless Modes

Headless Mode

Headless mode runs the browser without a visible user interface. The browser operates in the background, executing JavaScript and rendering pages without displaying them on screen. This is the default mode for Puppeteer and is ideal for automated tasks where visual feedback isn't necessary.

Non-Headless Mode

Non-headless mode displays the browser window, allowing you to see the automation in action. This mode is particularly useful for debugging, development, and scenarios where you need to observe the browser's behavior visually.

Basic Configuration

Launching Puppeteer in Headless Mode

const puppeteer = require('puppeteer');

(async () => {
  // Headless is the default, so launch() with no options behaves the
  // same as launch({ headless: true })
  const browser = await puppeteer.launch({ headless: true });

  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');

    // Your scraping logic here
    const title = await page.title();
    console.log('Page title:', title);
  } finally {
    // Close the browser even if navigation or scraping throws
    await browser.close();
  }
})();

Launching Puppeteer in Non-Headless Mode

const puppeteer = require('puppeteer');

(async () => {
  // Launch in non-headless mode
  const browser = await puppeteer.launch({
    headless: false,
    slowMo: 100, // Slow down operations for better visibility
    devtools: true // Open DevTools automatically
  });

  const page = await browser.newPage();
  await page.goto('https://example.com');

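  // Your automation logic here. The selectors are placeholders:
  // example.com has neither element, so point them at your own page.
  await page.click('button');
  await page.type('input[name="search"]', 'web scraping');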

  // Keep browser open for inspection
  // await browser.close();
})();
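
When the browser is left open for inspection, it can be handy for the script to wait until you close the window yourself and then carry on with cleanup. A minimal sketch, assuming the `disconnected` event that Puppeteer's `Browser` emits when the browser closes; the helper name `waitForManualClose` is made up for this example:

```javascript
// Hypothetical helper: resolves once the browser disconnects, for example
// because the window was closed by hand during a debugging session.
function waitForManualClose(browser) {
  return new Promise((resolve) => {
    browser.once('disconnected', resolve);
  });
}
```

In the non-headless example above, `await waitForManualClose(browser);` could take the place of the commented-out `browser.close()` call.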

Advanced Configuration Options

Puppeteer Launch Options for Different Modes

const puppeteer = require('puppeteer');

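// Headless mode for servers and containers
const headlessConfig = {
  headless: true,
  args: [
    // Disabling the sandbox is a workaround for restricted containers;
    // only use it with pages you trust
    '--no-sandbox',
    '--disable-setuid-sandbox',
    // Use a temporary directory instead of the often tiny /dev/shm
    '--disable-dev-shm-usage',
    // The next two weaken browser security (same-origin policy and
    // site isolation); drop them unless your task requires them
    '--disable-web-security',
    '--disable-features=site-per-process'
  ]
};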

// Non-headless mode with debugging features
const nonHeadlessConfig = {
  headless: false,
  devtools: true,
  slowMo: 250,
  args: [
    '--start-maximized',
    '--disable-web-security',
    '--disable-features=site-per-process'
  ],
  defaultViewport: null
};

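// Usage example (CommonJS has no top-level await, so wrap the call)
(async () => {
  const browser = await puppeteer.launch(headlessConfig);
  // ... work with the browser ...
  await browser.close();
})();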

New Headless Mode (Puppeteer 19+)

const puppeteer = require('puppeteer');

(async () => {
  // New headless mode (Chrome's new headless implementation).
  // Opt in with 'new' on Puppeteer 19 through 21. From Puppeteer 22 on,
  // `headless: true` already selects it.
  const browser = await puppeteer.launch({
    headless: 'new'
  });

  // Old headless mode (legacy).
  // Use 'shell' on Puppeteer 22 and later, or true on earlier versions.
  const browserOld = await puppeteer.launch({
    headless: 'shell'
  });

  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Close both browsers so no Chrome process is left running
  await browser.close();
  await browserOld.close();
})();
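
Because the accepted `headless` values changed between releases, a small lookup can keep the choice in one place. The sketch below is illustrative only: the function name is hypothetical, and the version boundary (22) is an assumption that should be checked against the changelog of the Puppeteer version you have installed.

```javascript
// Hypothetical helper: returns a value for the `headless` launch option.
// Assumption: Puppeteer 22 is where `true` started to mean the new
// headless mode and 'shell' the legacy one.
function headlessOptionFor(majorVersion, mode) {
  if (mode === 'headed') {
    return false;
  }
  const modern = majorVersion >= 22;
  if (mode === 'new') {
    return modern ? true : 'new';
  }
  if (mode === 'old') {
    return modern ? 'shell' : true;
  }
  throw new Error(`Unknown mode: ${mode}`);
}
```

For example, `puppeteer.launch({ headless: headlessOptionFor(21, 'new') })` would pass `'new'`, while the same call with `22` would pass `true`.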

When to Use Each Mode

Use Headless Mode For:

  1. Production Web Scraping: Maximum performance and resource efficiency
  2. Automated Testing: CI/CD pipelines and automated test suites
  3. Server Environments: Docker containers and cloud deployments
  4. Batch Processing: Large-scale data extraction tasks
  5. Performance-Critical Applications: When speed and memory usage matter

Use Non-Headless Mode For:

  1. Development and Debugging: Visualizing automation steps
  2. Interactive Applications: User-guided automation
  3. Troubleshooting: Identifying issues with selectors or timing
  4. Learning and Experimentation: Understanding how automation works
  5. Complex User Interactions: When manual intervention might be needed
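
The two lists above can be condensed into a small decision function. This is a sketch under stated assumptions: the helper name is hypothetical, the `CI` and `DEBUG` variables follow the conventions used elsewhere in this guide, and the display check reflects the fact that a headed browser on Linux needs a display server (or a virtual one such as Xvfb).

```javascript
// Hypothetical helper: derives launch options from the environment.
function chooseLaunchOptions(env = process.env, platform = process.platform) {
  const inCI = env.CI === 'true';
  const wantsDebug = env.DEBUG === 'true';
  // On Linux, a headed browser cannot start without a display server
  const hasDisplay =
    platform !== 'linux' || Boolean(env.DISPLAY || env.WAYLAND_DISPLAY);

  // Show the window only when debugging locally on a machine with a display
  const headed = wantsDebug && !inCI && hasDisplay;

  return {
    headless: !headed,
    devtools: headed,
    slowMo: headed ? 250 : 0
  };
}
```

The result can be passed straight to `puppeteer.launch()`. Falling back to headless whenever a display is missing keeps the same script usable on a laptop and inside a container.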

Python Implementation with Pyppeteer

import asyncio
from pyppeteer import launch

async def headless_scraping():
    # Headless mode
    browser = await launch(headless=True)
    page = await browser.newPage()
    await page.goto('https://example.com')

    title = await page.title()
    print(f'Page title: {title}')

    await browser.close()

async def non_headless_scraping():
    # Non-headless mode
    browser = await launch(
        headless=False,
        slowMo=100,
        devtools=True,
        args=['--start-maximized']
    )

    page = await browser.newPage()
    await page.goto('https://example.com')

    # Perform actions with visual feedback
    await page.click('button')
    await page.type('input[name="search"]', 'automation')

    # Keep browser open for inspection
    # await browser.close()

# Run the functions
asyncio.run(headless_scraping())

Dynamic Mode Switching

const puppeteer = require('puppeteer');

class PuppeteerManager {
  constructor() {
    this.browser = null;
    this.debugMode = process.env.DEBUG === 'true';
  }

  async initialize() {
    const config = {
      headless: !this.debugMode,
      devtools: this.debugMode,
      slowMo: this.debugMode ? 250 : 0,
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        ...(this.debugMode ? ['--start-maximized'] : [])
      ]
    };

    this.browser = await puppeteer.launch(config);
    return this.browser;
  }

  async createPage() {
    if (!this.browser) {
      await this.initialize();
    }
    return await this.browser.newPage();
  }

  async close() {
    if (this.browser) {
      await this.browser.close();
    }
  }
}

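// Usage (CommonJS has no top-level await, so wrap the calls)
(async () => {
  const manager = new PuppeteerManager();
  try {
    const page = await manager.createPage();
    await page.goto('https://example.com');
  } finally {
    await manager.close();
  }
})();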

Performance Considerations

Headless Mode Optimizations

const puppeteer = require('puppeteer');

const optimizedHeadlessConfig = {
  headless: true,
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--no-first-run',
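    // These two reduce the number of Chrome processes but can make the
    // browser less stable; remove them if you see crashes
    '--no-zygote',
    '--single-process',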
    '--disable-gpu'
  ]
};

(async () => {
  const browser = await puppeteer.launch(optimizedHeadlessConfig);
  const page = await browser.newPage();

  // Disable images and CSS for faster loading
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (req.resourceType() === 'stylesheet' || req.resourceType() === 'image') {
      req.abort();
    } else {
      req.continue();
    }
  });

  await page.goto('https://example.com');
  await browser.close();
})();
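
The blocking rule above can be pulled into a reusable helper so the list of blocked resource types lives in one place. This is a sketch, not a recommended default: the names `BLOCKED_TYPES`, `shouldBlock`, and `blockHeavyResources` are made up for illustration, and blocking fonts and media in addition to images and CSS is an assumption about what a text-only scrape can live without.

```javascript
// Hypothetical default: resource types a text-only scrape rarely needs
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Pure decision function, easy to unit test without a browser
function shouldBlock(resourceType, blocked = BLOCKED_TYPES) {
  return blocked.has(resourceType);
}

// Enables request interception on a page and aborts blocked resource types
async function blockHeavyResources(page, blocked = BLOCKED_TYPES) {
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (shouldBlock(req.resourceType(), blocked)) {
      req.abort();
    } else {
      req.continue();
    }
  });
}
```

Call `await blockHeavyResources(page)` before `page.goto()`. Pages that rely on CSS for layout-dependent logic may behave differently with stylesheets blocked, so pass a smaller set when that matters.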

Memory Management

const puppeteer = require('puppeteer');

async function efficientScraping() {
  const browser = await puppeteer.launch({
    headless: true,
    // --max-old-space-size is a V8 option, so it has to reach Chrome
    // through --js-flags
    args: ['--js-flags=--max-old-space-size=4096']
  });

  try {
    const page = await browser.newPage();

    // Set viewport for consistent rendering
    await page.setViewport({ width: 1920, height: 1080 });

    // Navigate and scrape
    await page.goto('https://example.com');
    const data = await page.evaluate(() => {
      // Your scraping logic
      return document.title;
    });

    console.log('Scraped data:', data);

  } finally {
    await browser.close();
  }
}
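
When many URLs are scraped with one browser, memory use grows mainly with the number of open pages. One way to keep it bounded is to cap how many pages are open at once and close each page as soon as it is done. The sketch below assumes a hypothetical helper named `scrapeTitles`; it only relies on `browser.newPage()`, `page.goto()`, `page.title()`, and `page.close()`.

```javascript
// Hypothetical helper: scrapes page titles with at most `concurrency`
// pages open at any time. Results keep the order of the input URLs.
async function scrapeTitles(browser, urls, concurrency = 3) {
  const results = new Array(urls.length);
  let next = 0;

  // Each worker takes the next unclaimed URL until none are left
  async function worker() {
    while (next < urls.length) {
      const index = next++;
      const page = await browser.newPage();
      try {
        await page.goto(urls[index]);
        results[index] = { url: urls[index], title: await page.title() };
      } catch (err) {
        // Record the failure instead of aborting the whole batch
        results[index] = { url: urls[index], error: err.message };
      } finally {
        // Closing the page releases its memory right away
        await page.close();
      }
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, urls.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

Inside `efficientScraping`, this could be called as `await scrapeTitles(browser, urls, 3)` within the existing `try` block, so the `finally` clause still closes the browser.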

Debugging and Development Tips

Console Logging in Different Modes

const puppeteer = require('puppeteer');

async function debuggingExample() {
  const browser = await puppeteer.launch({
    headless: false,
    devtools: true
  });

  const page = await browser.newPage();

  // Listen to console messages
  page.on('console', msg => {
    console.log('PAGE LOG:', msg.text());
  });

  // Listen to page errors
  page.on('pageerror', err => {
    console.log('PAGE ERROR:', err.message);
  });

  await page.goto('https://example.com');

  // Inject debugging code
  await page.evaluate(() => {
    console.log('This message will appear in both browser and Node.js console');
  });
}
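
The same listeners also work in headless mode, where there is no DevTools window to look at. A minimal sketch that gathers page output into an array for later inspection; the helper name `collectPageLogs` and the entry shape are made up for this example, while the `console`, `pageerror`, and `requestfailed` events are part of Puppeteer's `Page`.

```javascript
// Hypothetical helper: records console output, uncaught page errors,
// and failed requests in a single array.
function collectPageLogs(page) {
  const entries = [];

  page.on('console', (msg) => {
    entries.push({ kind: 'console', level: msg.type(), text: msg.text() });
  });

  page.on('pageerror', (err) => {
    entries.push({ kind: 'pageerror', level: 'error', text: err.message });
  });

  page.on('requestfailed', (req) => {
    const failure = req.failure(); // null when no failure details exist
    const reason = failure ? failure.errorText : 'unknown error';
    entries.push({
      kind: 'requestfailed',
      level: 'error',
      text: `${req.url()} ${reason}`
    });
  });

  return entries;
}
```

Call `const logs = collectPageLogs(page);` right after `browser.newPage()` and before `page.goto()`, then print or assert on `logs` once the run finishes.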

Screenshots and Visual Debugging

const puppeteer = require('puppeteer');

async function visualDebugging() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Take screenshot for debugging
  await page.screenshot({ 
    path: 'debug-screenshot.png',
    fullPage: true 
  });

  // Highlight elements for debugging
  await page.evaluate(() => {
    const elements = document.querySelectorAll('a');
    elements.forEach(el => {
      el.style.border = '2px solid red';
    });
  });

  await page.screenshot({ path: 'debug-highlighted.png' });

  await browser.close();
}
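
Screenshots are most useful at the moment something goes wrong. One possible pattern is to wrap a step so that a failure captures the page before the error propagates. The helper name `withFailureScreenshot` and the file naming scheme are assumptions made for this sketch.

```javascript
// Hypothetical helper: runs `action(page)` and, if it throws, saves a
// full-page screenshot before rethrowing the original error.
async function withFailureScreenshot(page, label, action) {
  try {
    return await action(page);
  } catch (err) {
    const path = `failure-${label}-${Date.now()}.png`;
    try {
      await page.screenshot({ path, fullPage: true });
      err.message += ` (screenshot saved to ${path})`;
    } catch (screenshotError) {
      // The page may already be closed; keep the original error as is
    }
    throw err;
  }
}
```

A step such as `await withFailureScreenshot(page, 'search', (p) => p.click('button'));` then leaves an image behind whenever the click fails, which helps when the run happened headless on a server.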

Integration with Testing Frameworks

Jest Configuration

// jest-puppeteer.config.js
module.exports = {
  launch: {
    headless: process.env.CI === 'true',
    devtools: process.env.NODE_ENV === 'development',
    slowMo: process.env.NODE_ENV === 'development' ? 250 : 0
  }
};

// test file
describe('Page Tests', () => {
  beforeEach(async () => {
    await page.goto('https://example.com');
  });

  test('should have correct title', async () => {
    const title = await page.title();
    expect(title).toBe('Expected Title');
  });
});

Best Practices and Recommendations

  1. Default to Headless: Use headless mode for production and automated tasks
  2. Debug with Non-Headless: Switch to non-headless mode during development
  3. Environment-Based Configuration: Use environment variables to control mode
  4. Resource Management: Always close browsers to prevent memory leaks
  5. Error Handling: Implement proper error handling for both modes
  6. Performance Monitoring: Monitor resource usage, especially in headless mode

Alternative Solutions

While Puppeteer is excellent for browser automation, Playwright offers similar functionality, including a headless mode, with support for additional browsers (Chromium, Firefox, and WebKit). It is worth considering for scenarios that need to run against multiple browsers.

Conclusion

Choosing between headless and non-headless modes depends on your specific use case. Headless mode offers superior performance and resource efficiency for automated tasks, while non-headless mode provides valuable visual feedback for development and debugging. By understanding the strengths and limitations of each mode, you can optimize your web scraping and automation workflows for both development and production environments.

Remember to implement proper error handling and resource management, and to weigh the trade-offs between performance and visibility when selecting a mode for your Puppeteer applications.
