How to handle authentication in Puppeteer?

Puppeteer is a Node.js library that provides a high-level API to control Chrome or Chromium over the DevTools Protocol. Puppeteer runs headless by default but can be configured to run full (non-headless) Chrome or Chromium.

Handling authentication in Puppeteer is straightforward. It can be done in several ways, but the two most common are:

  1. Using the page.authenticate method
  2. Using the page.type method to fill in a login form

Here's how to do it:

1. Using the page.authenticate method

The page.authenticate method sets the credentials Puppeteer supplies in response to HTTP authentication challenges (such as Basic auth). It also works for proxies that require authentication.

Here's a code example:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Register the credentials before navigating; Puppeteer uses them to
  // answer the server's HTTP authentication challenge.
  await page.authenticate({
    username: 'USERNAME',
    password: 'PASSWORD',
  });

  await page.goto('http://example.com');
  await browser.close();
})();

In this example, USERNAME and PASSWORD are your HTTP authentication credentials.
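For context, HTTP Basic authentication (the most common case page.authenticate handles) works by sending an Authorization header containing the base64-encoded credentials. Here's a minimal sketch of how that header is derived, using placeholder credentials:

```javascript
// Sketch of how an HTTP Basic auth Authorization header is built.
// 'USERNAME' and 'PASSWORD' are placeholders, not real credentials.
function basicAuthHeader(username, password) {
  // Basic auth encodes "username:password" in base64
  const token = Buffer.from(`${username}:${password}`).toString('base64');
  return `Basic ${token}`;
}

console.log(basicAuthHeader('USERNAME', 'PASSWORD'));
// → Basic VVNFUk5BTUU6UEFTU1dPUkQ=
```

You don't need to build this header yourself; page.authenticate handles the exchange for you, including retrying the request after the server's 401 challenge.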

2. Using the page.type method to fill in the form

If the site you're trying to scrape uses a login form, you can use the page.type method to fill in the fields and the page.click method to submit the form.

Here's a code example:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://example.com/login');

  // Use `page.type` to fill in the form
  await page.type('#username', 'USERNAME');
  await page.type('#password', 'PASSWORD');

  // Submit the form and wait for the resulting navigation to complete,
  // so the browser isn't closed before the login request finishes
  await Promise.all([
    page.waitForNavigation(),
    page.click('#submit'),
  ]);

  await browser.close();
})();

In this example, USERNAME and PASSWORD are your credentials, and #username, #password, and #submit are the selectors for the username field, password field, and submit button, respectively. You will need to adjust these to match the actual selectors used on the site you're logging in to.

Please note that it's very important to respect the terms and conditions of the website you're scraping. Some websites do not allow web scraping, and doing so can get you banned.
