Capturing screenshots of web pages with headless Chromium is a common task in web scraping, testing, and automation. This guide covers three approaches: Puppeteer, Selenium, and the Chrome command line.
JavaScript with Puppeteer
Puppeteer is a widely used Node.js library for controlling headless Chrome. It offers a high-level API for capturing screenshots.
Installation
npm install puppeteer
Basic Screenshot
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Navigate to the webpage
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });
  // Take screenshot
  await page.screenshot({ path: 'example.png' });
  await browser.close();
})();
Advanced Screenshot Options
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Set viewport size
  await page.setViewport({ width: 1920, height: 1080 });
  await page.goto('https://example.com', { waitUntil: 'networkidle2' });
  // Full page screenshot
  await page.screenshot({
    path: 'full-page.png',
    fullPage: true
  });
  // Element-specific screenshot
  const element = await page.$('header');
  await element.screenshot({ path: 'header.png' });
  // Screenshot with custom quality (JPEG)
  await page.screenshot({
    path: 'compressed.jpg',
    type: 'jpeg',
    quality: 80
  });
  await browser.close();
})();
Python with Selenium
Selenium WebDriver provides cross-platform support for browser automation and screenshot capture.
Installation
pip install selenium webdriver-manager
Modern Selenium Approach (with WebDriver Manager)
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
import time
# Setup Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--window-size=1920,1080")
# Initialize driver with WebDriver Manager
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=chrome_options)
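try:
    # Navigate to webpage
    driver.get("https://example.com")
    # Wait for page to load (a fixed pause; explicit waits are more reliable)
    time.sleep(2)
    # Capture the visible viewport (save_screenshot does not capture the full scrollable page)
    driver.save_screenshot("example.png")
    # Take element screenshot
    element = driver.find_element(By.TAG_NAME, "header")
    element.screenshot("header.png")
finally:
    # Always release the browser, even if a step above fails
    driver.quit()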
Legacy Selenium Approach
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
# For systems with ChromeDriver in PATH
driver = webdriver.Chrome(options=chrome_options)
# Or specify ChromeDriver path (older Selenium only; executable_path was deprecated in Selenium 4 and removed in 4.10)
# driver = webdriver.Chrome(executable_path='/path/to/chromedriver', options=chrome_options)
driver.get("https://example.com")
driver.save_screenshot("example.png")
driver.quit()
Command Line Interface
You can also use headless Chromium directly from the command line:
# Basic screenshot
google-chrome --headless --disable-gpu --screenshot=example.png https://example.com
# Screenshot with a custom window size; --virtual-time-budget gives page scripts up to 2000 ms of virtual time before capture
google-chrome --headless --disable-gpu --screenshot=full.png --window-size=1920,1080 --virtual-time-budget=2000 https://example.com
# PDF generation
google-chrome --headless --disable-gpu --print-to-pdf=example.pdf https://example.com
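When these commands are driven from a script, it can help to build the argument list in one place. The following is a minimal sketch, assuming the `google-chrome` binary is on the PATH as in the commands above. The function names `build_screenshot_command` and `capture` are hypothetical helpers, not part of any library, and only flags already shown above are used.

```python
import subprocess
from typing import List


def build_screenshot_command(url: str, output: str, width: int = 1920,
                             height: int = 1080,
                             binary: str = "google-chrome") -> List[str]:
    """Assemble the headless screenshot command as an argument list."""
    return [
        binary,
        "--headless",
        "--disable-gpu",
        f"--screenshot={output}",
        f"--window-size={width},{height}",
        url,
    ]


def capture(url: str, output: str, timeout: float = 60.0) -> None:
    """Run the browser; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_screenshot_command(url, output), check=True, timeout=timeout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the URL contains characters such as `&` or `?`.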
Advanced Configuration Options
Viewport and Page Size
// Puppeteer - Mobile viewport
await page.setViewport({
  width: 375,
  height: 667,
  deviceScaleFactor: 2,
  isMobile: true
});
# Selenium - Custom window size
driver.set_window_size(1920, 1080)
Wait Strategies
// Puppeteer - Wait for specific elements
await page.goto('https://example.com');
await page.waitForSelector('.dynamic-content');
await page.screenshot({ path: 'loaded.png' });
// Wait for network activity to finish
await page.goto('https://example.com', { waitUntil: 'networkidle0' });
# Selenium - Wait for elements
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CLASS_NAME, "dynamic-content")))
driver.save_screenshot("loaded.png")
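Where no single element reliably signals that a page has loaded, one alternative to a fixed sleep is to poll `document.readyState`. The sketch below is an assumption-laden illustration rather than a Selenium feature: `wait_for_document_ready` is a hypothetical helper that only relies on the driver's `execute_script` method. Note that `readyState` reaching `complete` does not guarantee that content rendered later by JavaScript is present, so the element-based wait above remains the better choice for dynamic content.

```python
import time


def wait_for_document_ready(driver, timeout: float = 10.0, poll: float = 0.25) -> bool:
    """Poll document.readyState until it is 'complete' or the timeout expires.

    Returns True if the document finished loading, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A caller might take the screenshot only when the function returns `True`, and log or retry otherwise.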
Error Handling
// Puppeteer error handling
try {
  await page.goto('https://example.com', { timeout: 30000 });
  await page.screenshot({ path: 'screenshot.png' });
} catch (error) {
  console.error('Screenshot failed:', error);
} finally {
  await browser.close();
}
# Selenium error handling
try:
    driver.get("https://example.com")
    driver.save_screenshot("screenshot.png")
except Exception as e:
    print(f"Screenshot failed: {e}")
finally:
    driver.quit()
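Transient network failures are often worth retrying before giving up. The following is a minimal sketch of a retry wrapper with exponential backoff. `capture_with_retries` is a hypothetical helper, and the default of three attempts and a one-second starting delay are illustrative assumptions, not recommendations from any library.

```python
import time
from typing import Callable


def capture_with_retries(capture: Callable[[], None], attempts: int = 3,
                         delay: float = 1.0) -> int:
    """Call capture() until it succeeds; return the number of attempts used.

    Re-raises the last exception if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            capture()
            return attempt
        except Exception as exc:
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed: {exc}; retrying in {delay:.1f}s")
            time.sleep(delay)
            delay *= 2  # double the pause between attempts
    raise ValueError("attempts must be at least 1")
```

The `capture` argument would typically be a small function that calls `driver.get(...)` followed by `driver.save_screenshot(...)`, so that a failed navigation is retried together with the capture.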
Best Practices
- Always close browsers: Prevent memory leaks by properly closing browser instances
- Use appropriate wait strategies: Wait for content to load before taking screenshots
- Set consistent viewport sizes: Ensure reproducible screenshots across runs
- Handle errors gracefully: Implement proper error handling for network issues
- Optimize for performance: Use networkidle2 instead of networkidle0 for faster captures
- Respect robots.txt: Check website policies before automated screenshot capture
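The first practice, always closing the browser, can be enforced structurally with a context manager. This is a sketch under stated assumptions: `managed_browser` is a hypothetical helper, and `make_driver` stands for any zero-argument function that returns an object with a `quit()` method, such as a configured Selenium driver.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def managed_browser(make_driver: Callable[[], object]) -> Iterator[object]:
    """Create a driver, hand it to the caller, and always quit it afterwards."""
    driver = make_driver()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the with-body raises
        driver.quit()
```

With this in place, a `with managed_browser(...) as driver:` block releases the browser even when navigation or capture raises an exception.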
Troubleshooting
- Empty screenshots: Add wait times or use waitUntil options
- Partial content: Use fullPage: true in Puppeteer or scroll to bottom in Selenium
- Font rendering issues: Install necessary fonts in headless environments
- Permission errors: Use --no-sandbox flag in containerized environments
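For the partial-content case in Selenium, a common workaround is to resize the headless window to the height of the document before capturing, instead of scrolling. The sketch below assumes a headless Chrome driver, which generally allows windows taller than a physical screen. `save_full_page_screenshot` is a hypothetical helper, and very tall or lazily loaded pages may still need scrolling to trigger content.

```python
def save_full_page_screenshot(driver, path: str, width: int = 1920) -> int:
    """Resize the window to the document height, then capture it.

    Returns the height in pixels that was used for the window.
    """
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight, "
        "document.documentElement.scrollHeight)"
    )
    driver.set_window_size(width, height)
    driver.save_screenshot(path)
    return height
```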
Remember to comply with website terms of service and legal requirements when taking automated screenshots.