How to use Puppeteer with headless vs non-headless modes?
Puppeteer offers two distinct execution modes: headless and non-headless (also called "headed"). Understanding when and how to use each mode is crucial for effective web scraping, testing, and automation. This guide explores both modes, their advantages, disadvantages, and practical implementation strategies.
Understanding Headless vs Non-Headless Modes
Headless Mode
Headless mode runs the browser without a visible user interface. The browser operates in the background, executing JavaScript and rendering pages without displaying them on screen. This is the default mode for Puppeteer and is ideal for automated tasks where visual feedback isn't necessary.
Non-Headless Mode
Non-headless mode displays the browser window, allowing you to see the automation in action. This mode is particularly useful for debugging, development, and scenarios where you need to observe the browser's behavior visually.
Basic Configuration
Launching Puppeteer in Headless Mode
const puppeteer = require('puppeteer');

(async () => {
  // Headless is the default, so puppeteer.launch() with no options behaves
  // the same way. It is set explicitly here for clarity.
  const browser = await puppeteer.launch({ headless: true });

  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');

    // Your scraping logic here
    const title = await page.title();
    console.log('Page title:', title);
  } finally {
    // Close the browser even if navigation or scraping throws
    await browser.close();
  }
})();
Launching Puppeteer in Non-Headless Mode
const puppeteer = require('puppeteer');

(async () => {
  // Launch in non-headless mode
  const browser = await puppeteer.launch({
    headless: false,
    slowMo: 100,    // Slow down operations for better visibility
    devtools: true  // Open DevTools automatically
  });

  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Your automation logic here. The selectors are placeholders:
  // replace them with ones that exist on your target page.
  await page.click('button');
  await page.type('input[name="search"]', 'web scraping');

  // Keep browser open for inspection
  // await browser.close();
})();
Advanced Configuration Options
Puppeteer Launch Options for Different Modes
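const puppeteer = require('puppeteer');

// Headless mode for containers and CI
const headlessConfig = {
  headless: true,
  args: [
    // Disabling the sandbox is sometimes needed when running as root in
    // Docker. Avoid it when loading untrusted pages.
    '--no-sandbox',
    '--disable-setuid-sandbox',
    // Write shared memory files to /tmp instead of /dev/shm, which is often
    // small in containers
    '--disable-dev-shm-usage',
    // The next two flags weaken browser security (same-origin policy and
    // site isolation). They are not performance optimizations, so enable
    // them only for trusted test targets.
    '--disable-web-security',
    '--disable-features=site-per-process'
  ]
};

// Non-headless mode with debugging features
const nonHeadlessConfig = {
  headless: false,
  devtools: true,
  slowMo: 250,
  args: [
    '--start-maximized',
    '--disable-web-security',
    '--disable-features=site-per-process'
  ],
  // null lets the page fill the window instead of the default 800x600 viewport
  defaultViewport: null
};

// Usage example (in CommonJS, await must run inside an async function)
(async () => {
  const browser = await puppeteer.launch(headlessConfig);
  // ... work with the browser ...
  await browser.close();
})();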
New Headless Mode (Puppeteer 19+)
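const puppeteer = require('puppeteer');

// The accepted values changed between releases, so check the release notes
// for the Puppeteer version you have installed.
(async () => {
  // New headless mode (Chrome's new headless implementation).
  // Puppeteer 19.x to 21.x: headless: 'new' (some earlier releases used 'chrome').
  // Puppeteer 22+: headless: true selects the new mode and 'new' is deprecated.
  const browser = await puppeteer.launch({
    headless: 'new'
  });

  const page = await browser.newPage();
  await page.goto('https://example.com');
  await browser.close();

  // Old headless mode (legacy).
  // Puppeteer 22+: headless: 'shell'.
  // Before 22: headless: true selected the old implementation.
  const browserOld = await puppeteer.launch({
    headless: 'shell'
  });
  await browserOld.close();
})();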
When to Use Each Mode
Use Headless Mode For:
- Production Web Scraping: Maximum performance and resource efficiency
- Automated Testing: CI/CD pipelines and automated test suites
- Server Environments: Docker containers and cloud deployments
- Batch Processing: Large-scale data extraction tasks
- Performance-Critical Applications: When speed and memory usage matter
Use Non-Headless Mode For:
- Development and Debugging: Visualizing automation steps
- Interactive Applications: User-guided automation
- Troubleshooting: Identifying issues with selectors or timing
- Learning and Experimentation: Understanding how automation works
- Complex User Interactions: When manual intervention might be needed
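The two lists above can be reduced to a small decision function. The following is a minimal sketch, not an established pattern from Puppeteer itself: the helper name `chooseLaunchOptions` and the `CI` and `DEBUG` environment variables are assumptions chosen for illustration. It also assumes that on Linux a headed browser needs a display server, so it falls back to headless when neither `DISPLAY` nor `WAYLAND_DISPLAY` is set.

```javascript
// Hypothetical helper: pick headless or headed launch options from the environment.
// CI and DEBUG are illustrative variable names, not something Puppeteer reads.
function chooseLaunchOptions(env = process.env, platform = process.platform) {
  const inCI = env.CI === 'true';
  const wantsDebug = env.DEBUG === 'true';

  // On Linux a visible browser window needs a display server (X11 or Wayland).
  const hasDisplay =
    platform !== 'linux' || Boolean(env.DISPLAY || env.WAYLAND_DISPLAY);

  // Only go headed when a human asked for it and a window can actually open.
  const headed = wantsDebug && !inCI && hasDisplay;

  return {
    headless: !headed,
    devtools: headed,
    slowMo: headed ? 250 : 0
  };
}
```

The returned object can be passed to `puppeteer.launch()`. Keeping the decision in a pure function makes it easy to unit test without starting a browser.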
Python Implementation with Pyppeteer
import asyncio
from pyppeteer import launch


async def headless_scraping():
    # Headless mode
    browser = await launch(headless=True)
    page = await browser.newPage()
    await page.goto('https://example.com')
    title = await page.title()
    print(f'Page title: {title}')
    await browser.close()


async def non_headless_scraping():
    # Non-headless mode
    browser = await launch(
        headless=False,
        slowMo=100,
        devtools=True,
        args=['--start-maximized']
    )
    page = await browser.newPage()
    await page.goto('https://example.com')

    # Perform actions with visual feedback
    await page.click('button')
    await page.type('input[name="search"]', 'automation')

    # Keep browser open for inspection
    # await browser.close()


# Run the functions
asyncio.run(headless_scraping())
Dynamic Mode Switching
const puppeteer = require('puppeteer');

class PuppeteerManager {
  constructor() {
    this.browser = null;
    // Set DEBUG=true to get a visible browser with DevTools
    this.debugMode = process.env.DEBUG === 'true';
  }

  async initialize() {
    const config = {
      headless: !this.debugMode,
      devtools: this.debugMode,
      slowMo: this.debugMode ? 250 : 0,
      args: [
        '--no-sandbox',
        '--disable-setuid-sandbox',
        ...(this.debugMode ? ['--start-maximized'] : [])
      ]
    };
    this.browser = await puppeteer.launch(config);
    return this.browser;
  }

  async createPage() {
    // Launch lazily on first use
    if (!this.browser) {
      await this.initialize();
    }
    return await this.browser.newPage();
  }

  async close() {
    if (this.browser) {
      await this.browser.close();
      this.browser = null;
    }
  }
}

// Usage (wrapped in an async function because CommonJS has no top-level await)
(async () => {
  const manager = new PuppeteerManager();
  try {
    const page = await manager.createPage();
    await page.goto('https://example.com');
  } finally {
    await manager.close();
  }
})();
Performance Considerations
Headless Mode Optimizations
const puppeteer = require('puppeteer');

const optimizedHeadlessConfig = {
  headless: true,
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--no-first-run',
    // --no-zygote and --single-process reduce the process count but can
    // make Chrome unstable. Test them before relying on them.
    '--no-zygote',
    '--single-process',
    '--disable-gpu'
  ]
};

(async () => {
  const browser = await puppeteer.launch(optimizedHeadlessConfig);
  try {
    const page = await browser.newPage();

    // Block images and CSS for faster loading
    await page.setRequestInterception(true);
    page.on('request', (req) => {
      if (req.resourceType() === 'stylesheet' || req.resourceType() === 'image') {
        req.abort();
      } else {
        req.continue();
      }
    });

    await page.goto('https://example.com');
  } finally {
    await browser.close();
  }
})();
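The request filter is easier to tune and test when the blocking decision lives in its own function. The sketch below is one possible arrangement: `BLOCKED_RESOURCE_TYPES`, `shouldBlock`, and `blockHeavyResources` are hypothetical names, and including `font` and `media` in the blocked set is an assumption that goes beyond the images and stylesheets discussed here. It relies only on `page.setRequestInterception()`, `req.resourceType()`, `req.abort()`, and `req.continue()`.

```javascript
// Resource types to drop. 'font' and 'media' are an assumption for this
// sketch; trim the set if your target page needs them to render data.
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// Pure decision function: true when a request of this type should be aborted.
function shouldBlock(resourceType, blocked = BLOCKED_RESOURCE_TYPES) {
  return blocked.has(resourceType);
}

// Hypothetical helper: enable interception on a page and abort blocked types.
async function blockHeavyResources(page, blocked = BLOCKED_RESOURCE_TYPES) {
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    if (shouldBlock(req.resourceType(), blocked)) {
      req.abort();
    } else {
      req.continue();
    }
  });
}
```

Call `await blockHeavyResources(page)` before `page.goto()` so the handler is in place when the first requests go out. Blocking stylesheets can change layout-dependent behavior, so it is worth comparing scraped output with and without the filter.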
Memory Management
const puppeteer = require('puppeteer');

async function efficientScraping() {
  const browser = await puppeteer.launch({
    headless: true,
    // V8 heap limits reach Chrome through --js-flags. A bare
    // --max-old-space-size is a Node.js flag and has no effect on the browser.
    args: ['--js-flags=--max-old-space-size=4096']
  });

  try {
    const page = await browser.newPage();

    // Set viewport for consistent rendering
    await page.setViewport({ width: 1920, height: 1080 });

    // Navigate and scrape
    await page.goto('https://example.com');
    const data = await page.evaluate(() => {
      // Your scraping logic
      return document.title;
    });
    console.log('Scraped data:', data);
  } finally {
    // Always release the browser process, even on errors
    await browser.close();
  }
}
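The try/finally pattern can be factored into a reusable wrapper so that every script closes its browser the same way. This is a minimal sketch under stated assumptions: `withBrowser` is a hypothetical helper name, and it takes a `launch` callback rather than importing Puppeteer directly so that the launch options stay with the caller.

```javascript
// Hypothetical helper: run a task against a browser and always close it.
// `launch` is any function returning a promise for an object with close().
async function withBrowser(launch, task) {
  const browser = await launch();
  try {
    // Await here so the finally block runs only after the task settles
    return await task(browser);
  } finally {
    await browser.close();
  }
}
```

A call might look like `withBrowser(() => puppeteer.launch({ headless: true }), async (browser) => { /* scrape */ })`. The wrapper returns whatever the task returns and rethrows whatever it throws, after the browser has been closed.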
Debugging and Development Tips
Console Logging in Different Modes
const puppeteer = require('puppeteer');

async function debuggingExample() {
  const browser = await puppeteer.launch({
    headless: false,
    devtools: true
  });
  const page = await browser.newPage();

  // Listen to console messages
  page.on('console', msg => {
    console.log('PAGE LOG:', msg.text());
  });

  // Listen to page errors
  page.on('pageerror', err => {
    console.log('PAGE ERROR:', err.message);
  });

  await page.goto('https://example.com');

  // Inject debugging code
  await page.evaluate(() => {
    console.log('This message will appear in both browser and Node.js console');
  });
}
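In headless mode there is no DevTools window, so forwarding page output to Node.js is often the only way to see it. The sketch below collects those listeners into one function; `attachPageLogging` and its `sink` parameter are hypothetical names introduced for illustration. It uses the `console` and `pageerror` page events together with `msg.type()` and `msg.text()`.

```javascript
// Hypothetical helper: forward page console output and uncaught page errors
// to a sink function (console.log by default).
function attachPageLogging(page, sink = console.log) {
  page.on('console', (msg) => {
    // msg.type() is the console method used in the page, e.g. 'log' or 'error'
    sink(`[page:${msg.type()}] ${msg.text()}`);
  });
  page.on('pageerror', (err) => {
    sink(`[page:uncaught] ${err.message}`);
  });
}
```

Passing a custom `sink`, for example one that appends to an array or a log file, keeps page output separate from the script's own logging and works the same way in both modes.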
Screenshots and Visual Debugging
const puppeteer = require('puppeteer');

async function visualDebugging() {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Take screenshot for debugging
  await page.screenshot({
    path: 'debug-screenshot.png',
    fullPage: true
  });

  // Highlight elements for debugging
  await page.evaluate(() => {
    const elements = document.querySelectorAll('a');
    elements.forEach(el => {
      el.style.border = '2px solid red';
    });
  });

  await page.screenshot({ path: 'debug-highlighted.png' });
  await browser.close();
}
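Screenshots are most useful at the moment a headless run fails. One way to capture that moment, sketched here under the assumption that a single full-page image is enough, is to wrap a step and take the screenshot only when it throws. `withFailureScreenshot` is a hypothetical helper name; it relies only on `page.screenshot()`.

```javascript
// Hypothetical helper: run a step and save a screenshot if it throws.
async function withFailureScreenshot(page, screenshotPath, step) {
  try {
    return await step();
  } catch (err) {
    try {
      await page.screenshot({ path: screenshotPath, fullPage: true });
    } catch (screenshotErr) {
      // The page may already be closed or crashed. The original error is
      // more useful to the caller, so the screenshot failure is dropped.
    }
    throw err;
  }
}
```

A typical call would be `await withFailureScreenshot(page, 'failure.png', () => page.click('#submit'))`, where `#submit` stands in for a selector on your own page. The original error is always rethrown, so test frameworks still report the failure.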
Integration with Testing Frameworks
Jest Configuration
// jest-puppeteer.config.js
module.exports = {
  launch: {
    // Headless on CI, visible browser locally
    headless: process.env.CI === 'true',
    devtools: process.env.NODE_ENV === 'development',
    slowMo: process.env.NODE_ENV === 'development' ? 250 : 0
  }
};

// test file
// Assumes the Jest config sets preset: 'jest-puppeteer', which provides
// the global `page` object used below.
describe('Page Tests', () => {
  beforeEach(async () => {
    await page.goto('https://example.com');
  });

  test('should have correct title', async () => {
    const title = await page.title();
    expect(title).toBe('Expected Title');
  });
});
Best Practices and Recommendations
- Default to Headless: Use headless mode for production and automated tasks
- Debug with Non-Headless: Switch to non-headless mode during development
- Environment-Based Configuration: Use environment variables to control mode
- Resource Management: Always close browsers to prevent memory leaks
- Error Handling: Implement proper error handling for both modes
- Performance Monitoring: Monitor resource usage, especially in headless mode
Alternative Solutions
While Puppeteer is excellent for browser automation, Playwright offers similar functionality, including a headless mode, with support for additional browsers. It is worth considering for scenarios that must run against multiple browsers.
Conclusion
Choosing between headless and non-headless modes depends on your specific use case. Headless mode offers superior performance and resource efficiency for automated tasks, while non-headless mode provides valuable visual feedback for development and debugging. By understanding the strengths and limitations of each mode, you can optimize your web scraping and automation workflows for both development and production environments.
Remember to implement proper error handling, resource management, and consider the trade-offs between performance and visibility when selecting the appropriate mode for your Puppeteer applications.