How do I run JavaScript code inside a page loaded by Headless Chromium?

To run JavaScript inside a page loaded by headless Chromium, use the Puppeteer library if you're working with Node.js, or Selenium with ChromeDriver in headless mode if you're working with Python or another language that has Selenium bindings.

Here's how to do it with both Puppeteer and Selenium:

Using Puppeteer with Node.js

Puppeteer is a Node.js library, developed by the Chrome DevTools team, that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol.

  1. First, install Puppeteer using npm:
npm install puppeteer
  2. Then, you can use the following Node.js script to run JavaScript inside a page:
const puppeteer = require('puppeteer');

(async () => {
    // Launch headless Chromium
    const browser = await puppeteer.launch();

    // Open a new page
    const page = await browser.newPage();

    // Navigate to the desired URL
    await page.goto('https://example.com');

    // Run JavaScript within the page context
    const result = await page.evaluate(() => {
        // JavaScript code goes here - this function is executed in the browser context
        const title = document.title;
        return `The page title is: ${title}`;
    });

    console.log(result); // Outputs the result of the evaluate script

    // Close the browser
    await browser.close();
})();

Using Selenium with Python

Selenium is a tool for automating web browsers. It has bindings for various programming languages, including Python. To control headless Chrome with Selenium, you'll need both the Selenium package and a ChromeDriver binary that matches your Chrome version.

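  1. Install Selenium and webdriver-manager (the script below imports both):
pip install selenium webdriver-manager
  2. Make sure a ChromeDriver that corresponds to your version of Chrome is available. The script below uses webdriver-manager to download one automatically; to install it manually instead, download it from https://sites.google.com/a/chromium.org/chromedriver/downloads.

  3. Here's a Python script to run JavaScript inside a page using Selenium: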

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

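# Set up headless Chrome options.
# Recent Selenium 4 releases no longer accept `options.headless = True`,
# so pass the Chrome command-line flag instead.
options = Options()
options.add_argument("--headless=new")

# Let webdriver-manager download a matching ChromeDriver and point Selenium at it
service = Service(ChromeDriverManager().install())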

# Initialize the driver
driver = webdriver.Chrome(service=service, options=options)

# Navigate to the page
driver.get('https://example.com')

# Execute JavaScript and get the result
result = driver.execute_script("return document.title;")

print(f"The page title is: {result}")  # Outputs the page title

# Close the browser
driver.quit()
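
The script above returns a single string, but `execute_script` can also take arguments and return structured data. The following is a minimal sketch of that, not a definitive implementation. The helper names `extract_links` and `run_example` and the `min_length` parameter are hypothetical, chosen only for illustration. It assumes Selenium 4.6 or newer, where Selenium can locate a driver on its own, and a Chrome version that accepts the `--headless=new` flag.

```python
from typing import Any


def extract_links(driver: Any, min_length: int = 0) -> list:
    """Return the page's links as a list of {"text": ..., "href": ...} dicts.

    `driver` is any object exposing Selenium's `execute_script(script, *args)`.
    Values passed after the script are visible in the page as `arguments[0]`,
    `arguments[1]`, and so on.
    """
    script = """
        const minLength = arguments[0];
        return Array.from(document.querySelectorAll('a'))
            .map(a => ({text: a.textContent.trim(), href: a.href}))
            .filter(link => link.text.length >= minLength);
    """
    # Selenium converts the returned JavaScript array of objects
    # into a Python list of dicts.
    return driver.execute_script(script, min_length)


def run_example() -> None:
    """Hypothetical usage against a real headless Chrome."""
    # Imported here so extract_links() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")  # assumption: Chrome 109 or newer
    driver = webdriver.Chrome(options=options)  # assumption: Selenium 4.6+
    try:
        driver.get("https://example.com")
        for link in extract_links(driver, min_length=1):
            print(link["text"], "->", link["href"])
    finally:
        # Always close the browser, even if the script raises
        driver.quit()
```

Call `run_example()` to try it against a real browser. Passing values through `arguments` rather than formatting them into the script string avoids quoting problems, and the `try`/`finally` ensures the browser process is closed if the page script fails.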

Conclusion

Both methods will allow you to run JavaScript inside a headless Chromium browser. Choose the one that best fits your language preferences and project needs. Puppeteer is a popular choice for JavaScript developers, while Selenium is widely used across various programming languages and has a long history of development.
