How to monitor network requests in Puppeteer?

Puppeteer is a Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. It's commonly used for web scraping, automated testing of web pages, and capturing screenshots or PDFs of pages. Monitoring network requests lets you see exactly what data your page or application sends and receives.

In Puppeteer, monitoring network requests is done by listening for the request event on the page object.

Here's a simple example in JavaScript:

const puppeteer = require('puppeteer');

async function monitorNetwork() {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Listen for all network requests
    page.on('request', request => {
        console.log('Request URL:', request.url());
    });

    await page.goto('https://example.com');
    await browser.close();
}

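// Report launch or navigation errors instead of leaving the rejection unhandled
monitorNetwork().catch(console.error);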

In this example, we're launching the browser, opening a new page, and then setting up an event listener for the request event. Every time a network request is made by the page, our callback function is called, and we log the URL of the request.
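
The same listener pattern can do more than log URLs. As a minimal sketch, the helper below tallies requests by resource type. The name countRequestsByType is hypothetical (it is not part of Puppeteer), and the helper only assumes that page emits 'request' events whose argument has a resourceType() method, as a Puppeteer page does.

```javascript
// Sketch: tally requests by resource type for any object that emits
// Puppeteer-style 'request' events (for example, a Puppeteer page).
// `countRequestsByType` is a hypothetical helper name, not a Puppeteer API.
function countRequestsByType(page) {
    const counts = {};

    const onRequest = request => {
        // resourceType() returns values such as 'document', 'script', 'image' or 'xhr'
        const type = request.resourceType();
        counts[type] = (counts[type] || 0) + 1;
    };

    page.on('request', onRequest);

    return {
        counts,
        // Detach the listener once monitoring is no longer needed
        stop: () => page.off('request', onRequest),
    };
}
```

With a real page, this would typically be called before page.goto(), for example const monitor = countRequestsByType(page), and monitor.counts inspected after navigation completes.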

Note that the request event is emitted as soon as the request is initiated, before any data is sent. If you want to examine the response to the request, you can listen for the response event:

page.on('response', response => {
    console.log('Response URL:', response.url());
});
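
Not every request produces a response. Puppeteer also emits a requestfailed event when a request fails at the network level, while HTTP error statuses such as 404 or 500 still arrive as ordinary response events. The sketch below records both kinds of problem in one list. The helper name trackFailures and the shape of the recorded objects are assumptions made for illustration.

```javascript
// Sketch: record network-level failures and HTTP error responses.
// `trackFailures` is a hypothetical helper name, not a Puppeteer API.
function trackFailures(page) {
    const failures = [];

    // 'requestfailed' fires when a request fails at the network level,
    // for example by timing out or being aborted
    page.on('requestfailed', request => {
        const failure = request.failure(); // { errorText } or null
        failures.push({
            url: request.url(),
            reason: failure ? failure.errorText : 'unknown',
        });
    });

    // HTTP error statuses (404, 500, ...) still arrive as 'response' events
    page.on('response', response => {
        if (response.status() >= 400) {
            failures.push({ url: response.url(), reason: `HTTP ${response.status()}` });
        }
    });

    return failures;
}
```

Calling const failures = trackFailures(page) before navigation and printing failures afterwards gives a quick overview of broken resources on the page.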

You can also access more information about the request or response, such as the HTTP method, headers, and body. For example:

page.on('request', request => {
    console.log('Request method:', request.method());
    console.log('Request headers:', request.headers());
});

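page.on('response', async response => {
    console.log('Response status:', response.status());
    console.log('Response headers:', response.headers());
    try {
        console.log('Response body:', await response.text());
    } catch (err) {
        // Some responses (such as redirects) have no readable body
        console.log('Response body unavailable:', err.message);
    }
});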

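Remember that response.text() returns a Promise that resolves to the response body in text form, so you need to await it if you want to log the body content. The Promise can also reject, for example for redirect responses, which have no readable body, or when the browser closes before the body has been read, so wrap the call in try/catch when you listen to every response.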
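
Reading every body is rarely necessary. A common scraping case is capturing only the JSON that a page loads from its API. The sketch below filters responses by a URL substring and by the content-type header, then parses the body with response.json(). The names captureJson and urlPart are hypothetical, and the filter rules are assumptions that would need adjusting for a real site.

```javascript
// Sketch: collect parsed JSON bodies from responses whose URL contains a
// given substring. `captureJson` and `urlPart` are hypothetical names.
function captureJson(page, urlPart) {
    const captured = [];

    page.on('response', async response => {
        // Puppeteer reports header names in lower case
        const contentType = response.headers()['content-type'] || '';
        if (!response.url().includes(urlPart) || !contentType.includes('application/json')) {
            return;
        }
        try {
            // json() returns a Promise, like text()
            captured.push({ url: response.url(), data: await response.json() });
        } catch (err) {
            // The body may be unavailable or may not be valid JSON
            console.warn('Could not read body for', response.url(), err.message);
        }
    });

    return captured;
}
```

Because the bodies are read asynchronously, the captured array may still be filling when page.goto() resolves, so it is safest to inspect it before calling browser.close() and after any waiting your page requires.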

This way, you can monitor the network requests and responses made by a page under Puppeteer's control. Listeners are attached per page, so each new page you open needs its own.
