Injecting JavaScript into web pages using Puppeteer enables powerful automation capabilities for testing, web scraping, and browser manipulation. Puppeteer provides several methods to execute JavaScript code directly in the browser context.
Methods for JavaScript Injection
1. page.evaluate() - Execute JavaScript in Page Context
The most common method for injecting JavaScript is page.evaluate(), which runs code in the browser's page context:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
// Execute JavaScript in the page context
const result = await page.evaluate(() => {
// Modify page content
document.title = 'Modified by Puppeteer';
// Extract data
const links = Array.from(document.querySelectorAll('a')).map(a => ({
text: a.textContent,
href: a.href
}));
return links;
});
console.log('Extracted links:', result);
await browser.close();
})();
2. page.addScriptTag() - Inject External Scripts
Use page.addScriptTag() to inject external JavaScript files or inline scripts:
// Inject external library (e.g., jQuery)
await page.addScriptTag({
url: 'https://code.jquery.com/jquery-3.6.0.min.js'
});
// Inject inline script
await page.addScriptTag({
content: `
window.myCustomFunction = function() {
console.log('Custom function injected!');
};
`
});
// Inject local file
await page.addScriptTag({
path: './my-script.js'
});
3. page.evaluateOnNewDocument() - Run Script on Every Page Load
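Use page.evaluateOnNewDocument() to register a script that runs in every new document, before any of the page's own scripts. Register it before navigating: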
await page.evaluateOnNewDocument(() => {
// Override navigator properties
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined,
});
// Add custom global variables
window.injectedData = { timestamp: Date.now() };
});
await page.goto('https://example.com');
Practical Examples
Example 1: Form Automation
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com/form');
// Inject form filling logic
await page.evaluate(() => {
const form = document.querySelector('#myForm');
if (form) {
form.querySelector('#name').value = 'John Doe';
form.querySelector('#email').value = 'john@example.com';
form.querySelector('#submit').click();
}
});
await browser.close();
})();
Example 2: Data Extraction with Parameters
const searchTerm = 'puppeteer';
const maxResults = 10;
const results = await page.evaluate((term, limit) => {
const elements = document.querySelectorAll(`[data-search*="${term}"]`);
return Array.from(elements)
.slice(0, limit)
.map(el => ({
title: el.textContent,
url: el.href || el.dataset.url
}));
}, searchTerm, maxResults);
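Interpolating the search term straight into the attribute selector breaks if the term contains a double quote or a backslash. One possible guard, sketched here under the assumption that the selector is built in Node.js and then passed in as an argument, is a small escaping helper. The name buildContainsSelector is hypothetical and not part of Puppeteer, and the attribute name is assumed to come from trusted code.

```javascript
// Hypothetical helper (not a Puppeteer API): builds an [attr*="value"] selector
// with the value escaped for use inside a double-quoted CSS string.
function buildContainsSelector(attribute, value) {
  const escaped = String(value)
    .replace(/\\/g, '\\\\') // escape backslashes first
    .replace(/"/g, '\\"') // then escape double quotes
    // line breaks are not allowed raw inside a CSS string, so hex-escape them
    .replace(/[\n\r\f]/g, (ch) => `\\${ch.charCodeAt(0).toString(16)} `);
  return `[${attribute}*="${escaped}"]`;
}

// Assumption: `attribute` is a trusted, valid attribute name such as 'data-search'.
const exampleSelector = buildContainsSelector('data-search', 'say "hi"');
```

The resulting string can be passed to page.evaluate() as an ordinary argument and handed to document.querySelectorAll() inside the page function, in place of the template literal used above.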
Example 3: Wait for Dynamic Content
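// Wait inside the page until the dynamic content appears, then return its text
const dynamicText = await page.evaluate(() => {
  return new Promise((resolve) => {
    // Resolve immediately if the element is already in the DOM;
    // otherwise the observer might never fire and the promise would hang
    const existing = document.querySelector('.dynamic-content');
    if (existing) return resolve(existing.textContent);
    const observer = new MutationObserver(() => {
      const targetElement = document.querySelector('.dynamic-content');
      if (targetElement) {
        observer.disconnect();
        resolve(targetElement.textContent);
      }
    });
    observer.observe(document.body, {
      childList: true,
      subtree: true
    });
  });
});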
Important Considerations
Context Isolation
- Code in page.evaluate() runs in the browser context, not Node.js
- You cannot access Node.js variables or modules directly
- Pass data as function parameters:
const nodeData = { user: 'admin', token: 'abc123' };
await page.evaluate((data) => {
// Use data.user and data.token here
localStorage.setItem('auth', JSON.stringify(data));
}, nodeData);
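Why closures do not carry over can be illustrated without a browser. The sketch below is a rough simulation, not Puppeteer's actual implementation: it assumes the function travels as source text and is rebuilt in a place where the original scope no longer exists. The names runDetached and demo are hypothetical.

```javascript
// Rough simulation (an assumption about the mechanism, not Puppeteer code):
// turn the function into source text, then rebuild it in the global scope,
// where the variables it closed over are no longer visible.
function runDetached(fn, ...args) {
  const source = fn.toString();
  const rebuilt = new Function(`return (${source});`)();
  return rebuilt(...args);
}

function demo() {
  const secret = 'abc123'; // exists only inside this closure

  let closureError = null;
  try {
    // Relies on the closure: fails once the function is rebuilt elsewhere
    runDetached(() => secret.length);
  } catch (err) {
    closureError = err.name;
  }

  // Passing the value as an argument works, mirroring page.evaluate(fn, data)
  const viaArgument = runDetached((value) => value.length, secret);

  return { closureError, viaArgument };
}

// demo() returns { closureError: 'ReferenceError', viaArgument: 6 }
```

The second call succeeds for the same reason the nodeData example does: the value is handed over explicitly instead of being captured from the surrounding scope.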
Return Value Limitations
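- Only serializable values can be returned; according to the Puppeteer documentation, page.evaluate() resolves to undefined when the function returns a non-serializable value
- Functions and DOM elements cannot be returned directly; use page.evaluateHandle() when you need a reference to an in-page object
- Convert complex data to plain objects: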
const domData = await page.evaluate(() => {
const elements = document.querySelectorAll('div');
return Array.from(elements).map(el => ({
tagName: el.tagName,
className: el.className,
textContent: el.textContent.trim()
}));
});
Error Handling
try {
const result = await page.evaluate(() => {
const element = document.querySelector('#nonexistent');
if (!element) {
throw new Error('Element not found');
}
return element.textContent;
});
} catch (error) {
console.error('JavaScript injection failed:', error.message);
}
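When many injections need the same try/catch treatment, the pattern can be wrapped once. The following is a minimal sketch, not a Puppeteer feature: safeEvaluate is a hypothetical helper that assumes only that the page object exposes an evaluate(fn, ...args) method returning a promise, and it reports failures as data instead of throwing.

```javascript
// Hypothetical helper (not part of Puppeteer): runs page.evaluate() and
// returns a result object instead of throwing.
async function safeEvaluate(page, pageFunction, ...args) {
  try {
    const value = await page.evaluate(pageFunction, ...args);
    return { ok: true, value };
  } catch (error) {
    // Errors thrown inside the page surface here as a rejected promise
    return { ok: false, message: error.message };
  }
}

// Example call, assuming `page` is an open Puppeteer page:
//   const outcome = await safeEvaluate(page, () => document.title);
//   if (!outcome.ok) console.error('Injection failed:', outcome.message);
```

Returning a result object keeps the calling code linear, at the cost of having to check outcome.ok after every call; whether that trade-off is worthwhile depends on how the script handles failures elsewhere.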
Best Practices
- Wait for elements before injection:
await page.waitForSelector('#targetElement');
await page.evaluate(() => {
document.querySelector('#targetElement').click();
});
- Use modern JavaScript features:
await page.evaluate(() => {
  // Use arrow functions, async/await, destructuring
  // Note: the property is document.URL (a string); document.url is undefined
  const { title, URL: url } = document;
  return { title, url };
});
- Inject utility functions once:
await page.evaluateOnNewDocument(() => {
window.utils = {
waitForElement: (selector, timeout = 5000) => {
return new Promise((resolve, reject) => {
const element = document.querySelector(selector);
if (element) return resolve(element);
const observer = new MutationObserver(() => {
const element = document.querySelector(selector);
if (element) {
observer.disconnect();
resolve(element);
}
});
observer.observe(document.body, {
childList: true,
subtree: true
});
setTimeout(() => {
observer.disconnect();
reject(new Error(`Element ${selector} not found within ${timeout}ms`));
}, timeout);
});
}
};
});
JavaScript injection with Puppeteer provides powerful capabilities for browser automation, enabling you to modify page behavior, extract data, and automate complex user interactions programmatically.