How can I manage cookies while scraping with Selenium?

Cookies are small pieces of data that a website stores in the browser. They matter for web scraping whenever a site requires a login or keeps session state. Here's how you can manage cookies while scraping with Selenium in both Python and JavaScript:

Python

Selenium WebDriver provides several methods to manage cookies. Here are some examples:

Adding a Cookie

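You can add a cookie with the add_cookie() method. Load a page on the cookie's domain first, because WebDriver only accepts cookies for the domain of the current page:

from selenium import webdriver

driver = webdriver.Firefox()
# Navigate first so the cookie is attached to this domain
driver.get("http://www.example.com")
cookie = {'name': 'foo', 'value': 'bar'}
driver.add_cookie(cookie)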
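Besides name and value, the dictionary passed to add_cookie() can carry optional keys such as path, domain, secure, and expiry. The sketch below builds such a dictionary without touching a browser. The helper name build_cookie and its max_age parameter are hypothetical, introduced only for illustration.

```python
import time


def build_cookie(name, value, domain=None, path="/", secure=False, max_age=None):
    """Return a cookie dict in the shape add_cookie() accepts.

    build_cookie and max_age are hypothetical names used for this sketch.
    """
    cookie = {"name": name, "value": value, "path": path, "secure": secure}
    if domain is not None:
        cookie["domain"] = domain
    if max_age is not None:
        # 'expiry' is an integer Unix timestamp in seconds
        cookie["expiry"] = int(time.time()) + int(max_age)
    return cookie


cookie = build_cookie("foo", "bar", domain="www.example.com", max_age=3600)
print(cookie)
# With a live session on www.example.com: driver.add_cookie(cookie)
```

Leaving out expiry produces a session cookie, which the browser drops when it closes.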

Getting a Cookie

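You can read a single cookie by name with the get_cookie() method. It returns the cookie as a dictionary, or None if no cookie with that name exists:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
# Prints None unless the site or an earlier add_cookie() call set 'foo' in this session
cookie = driver.get_cookie('foo')
print(cookie)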
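Because get_cookie() can return None, indexing the result directly fails when the cookie is absent. A small wrapper, sketched here under the assumption that driver is any object with a get_cookie() method, returns just the value or a fallback. The name cookie_value is hypothetical.

```python
def cookie_value(driver, name, default=None):
    """Return the value of the named cookie, or `default` if it is not set.

    cookie_value is a hypothetical helper; `driver` is assumed to be a
    Selenium WebDriver instance (anything with get_cookie() works).
    """
    cookie = driver.get_cookie(name)  # a dict, or None when the cookie is missing
    return default if cookie is None else cookie["value"]


# With a live session:
#   token = cookie_value(driver, "foo", default="")
```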

Getting All Cookies

You can get all cookies visible to the current page with the get_cookies() method, which returns a list of dictionaries:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
cookies = driver.get_cookies()
print(cookies)
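A common use of get_cookies() in scraping is to save a logged-in session to disk and restore it in a later run. The following is a minimal sketch of that idea, not a complete session manager. The function names save_cookies and load_cookies and the JSON file layout are assumptions made for this example.

```python
import json
from pathlib import Path


def save_cookies(driver, path):
    """Write every cookie from the current page to a JSON file."""
    Path(path).write_text(json.dumps(driver.get_cookies(), indent=2))


def load_cookies(driver, path):
    """Add cookies from a JSON file to the session and return how many were added.

    The driver must already be on a page of the cookies' domain,
    otherwise add_cookie() rejects them.
    """
    cookies = json.loads(Path(path).read_text())
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)


# With a live session (file name is illustrative):
#   save_cookies(driver, "cookies.json")
#   ...later, in a new session...
#   driver.get("http://www.example.com")
#   load_cookies(driver, "cookies.json")
#   driver.refresh()  # reload so the page sees the restored cookies
```

Restored cookies only help while they remain valid on the server, so an expired session still needs a fresh login.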

Deleting a Cookie

You can delete a single cookie by name with the delete_cookie() method:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
driver.delete_cookie('foo')
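When several cookies need to go but the rest of the session should stay, get_cookies() and delete_cookie() can be combined. This sketch removes cookies by name prefix. The helper name delete_cookies_with_prefix is hypothetical, and matching on a prefix is just one possible rule.

```python
def delete_cookies_with_prefix(driver, prefix):
    """Delete every cookie whose name starts with `prefix`; return the deleted names.

    Hypothetical helper; `driver` is assumed to be a Selenium WebDriver instance.
    """
    # Collect the names first so the list is not affected by the deletions
    names = [c["name"] for c in driver.get_cookies() if c["name"].startswith(prefix)]
    for name in names:
        driver.delete_cookie(name)
    return names


# With a live session:
#   removed = delete_cookies_with_prefix(driver, "_ga")
```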

Deleting All Cookies

You can delete every cookie visible to the current page with the delete_all_cookies() method:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
driver.delete_all_cookies()

JavaScript

In JavaScript, the selenium-webdriver package provides the same operations through driver.manage().

Adding a Cookie

const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://www.example.com');
    await driver.manage().addCookie({name: 'foo', value: 'bar'});
})();

Getting a Cookie

const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://www.example.com');
    let cookie = await driver.manage().getCookie('foo');
    console.log(cookie);
})();

Getting All Cookies

const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://www.example.com');
    let cookies = await driver.manage().getCookies();
    console.log(cookies);
})();

Deleting a Cookie

const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://www.example.com');
    await driver.manage().deleteCookie('foo');
})();

Deleting All Cookies

const {Builder} = require('selenium-webdriver');

(async function myFunction() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://www.example.com');
    await driver.manage().deleteAllCookies();
})();

By adding, reading, and deleting cookies, you can keep a login or other session state alive within your Selenium WebDriver sessions.
