How can I take screenshots of web pages using Headless Chromium?

To take screenshots of web pages using headless Chromium, you can use Puppeteer for JavaScript or Selenium with ChromeDriver for Python. Below are examples for both languages.

JavaScript (using Puppeteer)

Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium. Below is an example of how to take a screenshot of a webpage using Puppeteer:

  1. First, you need to install Puppeteer using npm:
npm install puppeteer
  2. Use the following script to navigate to a page and take a screenshot:
const puppeteer = require('puppeteer');

(async () => {
  // Launch headless Chrome
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Go to the webpage
  await page.goto('https://example.com');

  // Take screenshot
  await page.screenshot({path: 'example.png'});

  // Close the browser
  await browser.close();
})();

This script will create a screenshot of example.com and save it as example.png.

Python (using Selenium and ChromeDriver)

In Python, you can use Selenium with ChromeDriver to control headless Chrome. Here's how to take a screenshot:

  1. Install Selenium and ChromeDriver:

You can install Selenium using pip:

pip install selenium

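Recent Selenium releases (4.6 and later) ship with Selenium Manager, which can locate or download a matching ChromeDriver automatically, so a manual download is often unnecessary. If you manage the driver yourself, download the version that matches your OS and Chrome version from the ChromeDriver download page, then either place it in your system PATH or provide the path to the executable in your code.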

  2. Here's a Python script to take a screenshot:
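from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Set up Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")  # Run without a visible window
chrome_options.add_argument("--no-sandbox")  # Often needed in containers or when running as root
chrome_options.add_argument("--disable-dev-shm-usage")  # Avoid crashes when /dev/shm is small (e.g. Docker)

# Set the path to chromedriver for your setup
webdriver_path = '/path/to/chromedriver'

# Initialize the driver. Selenium 4 takes the driver path through a Service
# object; the older executable_path argument has been removed.
driver = webdriver.Chrome(service=Service(webdriver_path), options=chrome_options)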

# Fetch web page
driver.get("https://example.com")

# Take screenshot
driver.save_screenshot("example.png")

# Close web browser
driver.quit()

This Python script will navigate to example.com in headless mode and save a screenshot to example.png.
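If the page fails to load, the script above would stop before reaching driver.quit() and could leave a Chrome process running. Here is a minimal sketch of one way to guard against that. The names capture_screenshot, make_headless_chrome, and driver_factory are hypothetical and chosen for illustration; they are not part of Selenium. The sketch also assumes Selenium 4 can find ChromeDriver on its own (via PATH or Selenium Manager).

```python
from typing import Any, Callable


def make_headless_chrome() -> Any:
    """Create a headless Chrome driver (assumes Selenium 4 is installed)."""
    # Imported lazily so this module still loads where Selenium is absent.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def capture_screenshot(
    url: str,
    path: str,
    driver_factory: Callable[[], Any] = make_headless_chrome,
) -> bool:
    """Open url, save a screenshot to path, and always quit the driver.

    Hypothetical helper for illustration, not a Selenium API.
    """
    driver = driver_factory()
    try:
        driver.get(url)
        # save_screenshot returns True on success and False on an I/O error.
        return bool(driver.save_screenshot(path))
    finally:
        # Runs even if get() or save_screenshot() raises.
        driver.quit()
```

Calling capture_screenshot("https://example.com", "example.png") would behave like the script above, with the difference that the browser is closed even when navigation raises an exception.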

Note:

Both examples capture only the current viewport, so you may need to adjust its size before taking the screenshot. In Selenium, you can do this by setting the --window-size argument in the Chrome options or by calling the set_window_size method. In Puppeteer, call page.setViewport; to capture the full page content instead of just the viewport, pass fullPage: true to page.screenshot.

For Puppeteer:

await page.setViewport({ width: 1280, height: 800 });

For Selenium:

driver.set_window_size(1280, 800)
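The window size can also be fixed at launch through the --window-size flag mentioned in the note. Here is a minimal sketch, assuming a hypothetical helper named build_chrome_args (not part of Selenium) that assembles the same flags used in the Selenium example plus a window size:

```python
from typing import List


def build_chrome_args(width: int = 1280, height: int = 800) -> List[str]:
    """Return Chrome command-line flags for a headless window of a given size.

    Hypothetical helper for illustration; only the flags themselves come
    from Chrome.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return [
        "--headless",
        "--no-sandbox",
        "--disable-dev-shm-usage",
        # Chrome expects the size as "width,height" with no spaces.
        f"--window-size={width},{height}",
    ]
```

In a Selenium script, each returned string would be passed to chrome_options.add_argument(...) before the driver is created, which sets the size up front instead of resizing after launch.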

Remember that taking screenshots with headless browsers is generally for development, testing, or automation purposes. Ensure you have the legal right to take screenshots of the web pages you target, as some websites may have terms of service that restrict automated access or capturing content.
