How can I take screenshots of web pages using Headless Chromium?

Headless Chromium can capture screenshots of web pages without opening a visible browser window, which is useful for web scraping, testing, and automation. This guide covers three approaches: Puppeteer, Selenium, and the Chrome command line.

JavaScript with Puppeteer

Puppeteer is a widely used Node.js library for controlling headless Chrome. It offers a high-level API for capturing screenshots.

Installation

npm install puppeteer

Basic Screenshot

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Navigate to the webpage
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });

  // Take screenshot
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();

Advanced Screenshot Options

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set viewport size
  await page.setViewport({ width: 1920, height: 1080 });

  await page.goto('https://example.com', { waitUntil: 'networkidle2' });

  // Full page screenshot
  await page.screenshot({
    path: 'full-page.png',
    fullPage: true
  });

  // Element-specific screenshot
  const element = await page.$('header');
  await element.screenshot({ path: 'header.png' });

  // Screenshot with custom quality (JPEG)
  await page.screenshot({
    path: 'compressed.jpg',
    type: 'jpeg',
    quality: 80
  });

  await browser.close();
})();

Python with Selenium

Selenium WebDriver provides cross-platform support for browser automation and screenshot capture.

Installation

pip install selenium webdriver-manager

Modern Selenium Approach (with WebDriver Manager)

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
import time

# Setup Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--window-size=1920,1080")

# Initialize driver with WebDriver Manager
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=chrome_options)

try:
    # Navigate to webpage
    driver.get("https://example.com")

    # Wait for page to load
    time.sleep(2)

    # Capture the visible viewport (save_screenshot does not capture the full page)
    driver.save_screenshot("example.png")

    # Take element screenshot
    element = driver.find_element(By.TAG_NAME, "header")
    element.screenshot("header.png")

finally:
    driver.quit()
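
Because save_screenshot captures only the visible viewport, a common workaround in headless Chrome is to resize the window to the document height before capturing. The following is a minimal sketch of that idea, not an official Selenium feature. The function name save_full_page_screenshot and the max_height safety cap are assumptions made for illustration; the driver methods it calls (execute_script, set_window_size, save_screenshot) are standard WebDriver methods.

```python
def save_full_page_screenshot(driver, path, width=1920, max_height=16000):
    """Resize the window to the document height, then capture it.

    `driver` is any Selenium-style WebDriver. `max_height` is an arbitrary
    safety cap (an assumption) so very long pages do not exhaust memory.
    """
    # Ask the page for its full rendered height.
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight,"
        " document.documentElement.scrollHeight);"
    )
    height = min(int(height), max_height)
    # In headless mode the window can be taller than a physical screen.
    driver.set_window_size(width, height)
    return driver.save_screenshot(path)
```

Call it after driver.get(...) and any waits, for example save_full_page_screenshot(driver, "full-page.png"). Pages with fixed headers or lazy-loaded images may still need extra handling.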

Legacy Selenium Approach

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")

# For systems with ChromeDriver in PATH
driver = webdriver.Chrome(options=chrome_options)

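# Or specify the ChromeDriver path (Selenium 3 only: executable_path was
# deprecated in Selenium 4 and removed in later 4.x releases; use Service instead)
# driver = webdriver.Chrome(executable_path='/path/to/chromedriver', options=chrome_options)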

driver.get("https://example.com")
driver.save_screenshot("example.png")
driver.quit()

Command Line Interface

You can also use headless Chromium directly from the command line:

# Basic screenshot
google-chrome --headless --disable-gpu --screenshot=example.png https://example.com

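# Screenshot with a custom window size (captures the window area, not the full page);
# --virtual-time-budget gives the page up to 2000 ms of virtual time to load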
google-chrome --headless --disable-gpu --screenshot=full.png --window-size=1920,1080 --virtual-time-budget=2000 https://example.com

# PDF generation
google-chrome --headless --disable-gpu --print-to-pdf=example.pdf https://example.com

Advanced Configuration Options

Viewport and Page Size

// Puppeteer - Mobile viewport
await page.setViewport({ 
  width: 375, 
  height: 667, 
  deviceScaleFactor: 2,
  isMobile: true 
});

# Selenium (Python) - Custom window size
driver.set_window_size(1920, 1080)

Wait Strategies

// Puppeteer - Wait for specific elements
await page.goto('https://example.com');
await page.waitForSelector('.dynamic-content');
await page.screenshot({ path: 'loaded.png' });

// Wait for network activity to finish
await page.goto('https://example.com', { waitUntil: 'networkidle0' });

# Selenium - Wait for elements
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CLASS_NAME, "dynamic-content")))
driver.save_screenshot("loaded.png")

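Waiting for a selector assumes you know which element signals that the page is ready. When you do not, one heuristic is to poll the page height until it stops changing, which often indicates that lazy-loaded content has settled. The sketch below illustrates that heuristic; it is not a built-in Selenium wait, and the name wait_for_stable_height and its default values are assumptions.

```python
import time


def wait_for_stable_height(driver, timeout=10.0, poll_interval=0.25,
                           stable_polls=3):
    """Return True once the page height is unchanged for `stable_polls`
    consecutive polls, or False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    last_height = None
    unchanged = 0
    while time.monotonic() < deadline:
        height = driver.execute_script(
            "return document.documentElement.scrollHeight"
        )
        if height == last_height:
            unchanged += 1
            if unchanged >= stable_polls:
                return True
        else:
            # Height changed: remember it and restart the count.
            last_height = height
            unchanged = 0
        time.sleep(poll_interval)
    return False
```

A typical use is to call wait_for_stable_height(driver) after driver.get(...) and take the screenshot only if it returns True. A stable height does not guarantee that every image or font has finished loading.
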
Error Handling

// Puppeteer error handling
try {
  await page.goto('https://example.com', { timeout: 30000 });
  await page.screenshot({ path: 'screenshot.png' });
} catch (error) {
  console.error('Screenshot failed:', error);
} finally {
  await browser.close();
}

# Selenium error handling
try:
    driver.get("https://example.com")
    driver.save_screenshot("screenshot.png")
except Exception as e:
    print(f"Screenshot failed: {e}")
finally:
    driver.quit()

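Transient network failures often succeed on a second attempt, so the try/except pattern above can be extended with retries. Here is a small sketch with exponential backoff; capture_with_retries is a hypothetical helper, and capture stands for any zero-argument function of yours that takes the screenshot and raises on failure.

```python
import time


def capture_with_retries(capture, attempts=3, base_delay=1.0):
    """Call `capture()` until it succeeds or `attempts` is exhausted.

    The delay doubles after each failed attempt (1s, 2s, 4s, ...).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return capture()
        except Exception as error:  # narrow this to the errors you expect
            last_error = error
            if attempt < attempts - 1:
                time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError(
        f"Screenshot failed after {attempts} attempts"
    ) from last_error
```

For example, capture_with_retries(lambda: driver.save_screenshot("screenshot.png")) retries the Selenium call. Catching a narrower exception type than Exception is usually preferable once you know which errors are transient.
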
Best Practices

  1. Always close browsers: Prevent memory leaks by properly closing browser instances
  2. Use appropriate wait strategies: Wait for content to load before taking screenshots
  3. Set consistent viewport sizes: Ensure reproducible screenshots across runs
  4. Handle errors gracefully: Implement proper error handling for network issues
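  5. Optimize for performance: Prefer networkidle2 (at most two open network connections for 500 ms) over networkidle0 (no open connections) when a page keeps background connections alive, since it resolves sooner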
  6. Respect robots.txt: Check website policies before automated screenshot capture
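
The first practice, always closing the browser, can be enforced with a context manager so that cleanup runs even when a screenshot fails. This is a minimal sketch; managed_browser and its create_driver parameter are hypothetical names, and the only driver method it relies on is quit().

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(create_driver):
    """Yield a driver from `create_driver()` and always quit it afterwards."""
    driver = create_driver()
    try:
        yield driver
    finally:
        # Runs on success and on error, so no browser process is left behind.
        driver.quit()
```

With Selenium this could be used as: with managed_browser(lambda: webdriver.Chrome(options=chrome_options)) as driver: followed by the usual driver.get(...) and driver.save_screenshot(...) calls.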

Troubleshooting

  • Empty screenshots: Add wait times or use waitUntil options
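  • Partial content: Use fullPage: true in Puppeteer. In Selenium, save_screenshot captures only the viewport, so scroll to the bottom to trigger lazy-loaded content and resize the window to the page height before capturing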
  • Font rendering issues: Install necessary fonts in headless environments
  • Permission errors: Use the --no-sandbox flag in containerized environments, and only for pages you trust, since it disables Chromium's sandbox

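When diagnosing empty or partial screenshots, it can help to check the captured file programmatically. The sketch below reads the width and height from a PNG header, so you can see whether the image is only viewport-sized when you expected a full page. The function name png_dimensions is an assumption; the byte layout it reads (8-byte signature followed by the IHDR chunk) comes from the PNG format itself.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) from PNG bytes, or None if not a PNG."""
    # A PNG starts with an 8-byte signature, then the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", big-endian width and height.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

For instance, read the file with open("example.png", "rb").read() and compare the returned height with the viewport height you configured. A result of None suggests the file was never written correctly.
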
Remember to comply with website terms of service and legal requirements when taking automated screenshots.
