Cookies are small pieces of data stored on the client side, and they play an important role in web scraping, especially when dealing with websites that require login or hold session information. Here's how you can manage cookies while scraping with Selenium in both Python and JavaScript:
Python
Selenium WebDriver provides several methods to manage cookies. Here are some examples:
Adding a Cookie
You can add a cookie using the add_cookie() method:
from selenium import webdriver

driver = webdriver.Firefox()
# Load a page first: a cookie can only be set for the domain that is currently open.
driver.get("http://www.example.com")
cookie = {'name': 'foo', 'value': 'bar'}
driver.add_cookie(cookie)
driver.quit()
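Beyond name and value, the cookie dictionary can carry optional keys such as path, secure, and expiry. The sketch below builds such a dictionary in plain Python so it can be inspected without a browser. The helper name build_cookie and its max_age parameter are hypothetical, chosen only for this illustration.

```python
import time


def build_cookie(name, value, *, path="/", secure=False, max_age=None):
    """Return a cookie dict in the shape add_cookie() accepts.

    build_cookie and max_age are hypothetical names used for this sketch.
    """
    cookie = {"name": name, "value": value, "path": path, "secure": secure}
    if max_age is not None:
        # 'expiry' is a Unix timestamp in whole seconds.
        cookie["expiry"] = int(time.time()) + max_age
    return cookie


# A session cookie (no expiry) and one that should last about a week.
session_cookie = build_cookie("foo", "bar")
week_cookie = build_cookie("remember_me", "1", secure=True, max_age=7 * 24 * 3600)
print(session_cookie)
```

Either dictionary could then be passed to driver.add_cookie() once the driver has loaded a page on the matching domain.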
Getting a Cookie
You can get a cookie using the get_cookie() method:
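from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
# Set the cookie first so there is something to read back.
driver.add_cookie({'name': 'foo', 'value': 'bar'})
# get_cookie() returns the cookie as a dict, or None if the name is not found.
cookie = driver.get_cookie('foo')
print(cookie)
driver.quit()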
Getting All Cookies
You can get all cookies using the get_cookies() method:
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
# get_cookies() returns a list of dicts, one per cookie visible to the current page.
cookies = driver.get_cookies()
print(cookies)
driver.quit()
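The list returned by get_cookies() is often easier to work with as a simple name-to-value mapping, for example when handing the cookies to another HTTP client. Here is a minimal sketch of that conversion. The function name cookies_to_dict is hypothetical, and the sample data is made up to mirror the shape of a real result.

```python
def cookies_to_dict(cookies):
    """Collapse a list of cookie dicts into a {name: value} mapping."""
    return {cookie["name"]: cookie["value"] for cookie in cookies}


# Illustrative data in the shape get_cookies() returns; the values are made up.
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "www.example.com",
     "path": "/", "secure": False, "httpOnly": True},
    {"name": "theme", "value": "dark", "domain": "www.example.com",
     "path": "/", "secure": False, "httpOnly": False},
]

jar = cookies_to_dict(sample_cookies)
print(jar)  # {'sessionid': 'abc123', 'theme': 'dark'}
```

In a real script, sample_cookies would be replaced by the result of driver.get_cookies().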
Deleting a Cookie
You can delete a cookie using the delete_cookie() method:
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
# Remove only the cookie named 'foo'; other cookies are left in place.
driver.delete_cookie('foo')
driver.quit()
Deleting All Cookies
You can delete all cookies using the delete_all_cookies() method:
from selenium import webdriver

driver = webdriver.Firefox()
driver.get("http://www.example.com")
# Remove every cookie visible to the current page.
driver.delete_all_cookies()
driver.quit()
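A common use of these methods together is keeping a login alive between scraping runs: dump the cookies to a file after logging in, then load them back in a later session. The following is a rough sketch of that idea, not a definitive implementation. The helper names save_cookies and load_cookies and the JSON file layout are assumptions made for this example, and it assumes the saved cookies belong to the domain the driver currently has open.

```python
import json
from pathlib import Path


def save_cookies(driver, path):
    """Write every cookie from the current session to a JSON file."""
    Path(path).write_text(json.dumps(driver.get_cookies(), indent=2))


def load_cookies(driver, path):
    """Add each saved cookie to the session and return how many were added.

    The driver should already be on the cookies' domain (call driver.get() first).
    """
    cookies = json.loads(Path(path).read_text())
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

A typical flow would be to call save_cookies(driver, "cookies.json") right after logging in, and in a later run call driver.get() on the same site followed by load_cookies(driver, "cookies.json") and a page refresh.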
JavaScript
In JavaScript, the selenium-webdriver package provides equivalent methods for managing cookies through driver.manage().
Adding a Cookie
const {Builder} = require('selenium-webdriver');

(async function myFunction() {
  let driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('http://www.example.com');
    await driver.manage().addCookie({name: 'foo', value: 'bar'});
  } finally {
    // Always close the browser, even if a step above throws.
    await driver.quit();
  }
})();
Getting a Cookie
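const {Builder} = require('selenium-webdriver');

(async function myFunction() {
  let driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('http://www.example.com');
    // Set the cookie first so there is something to read back.
    await driver.manage().addCookie({name: 'foo', value: 'bar'});
    let cookie = await driver.manage().getCookie('foo');
    console.log(cookie);
  } finally {
    await driver.quit();
  }
})();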
Getting All Cookies
const {Builder} = require('selenium-webdriver');

(async function myFunction() {
  let driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('http://www.example.com');
    // Resolves to an array of cookie objects for the current page.
    let cookies = await driver.manage().getCookies();
    console.log(cookies);
  } finally {
    await driver.quit();
  }
})();
Deleting a Cookie
const {Builder} = require('selenium-webdriver');

(async function myFunction() {
  let driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('http://www.example.com');
    await driver.manage().deleteCookie('foo');
  } finally {
    await driver.quit();
  }
})();
Deleting All Cookies
const {Builder} = require('selenium-webdriver');

(async function myFunction() {
  let driver = await new Builder().forBrowser('firefox').build();
  try {
    await driver.get('http://www.example.com');
    await driver.manage().deleteAllCookies();
  } finally {
    await driver.quit();
  }
})();
Managing cookies matters most on websites that require login or store session information. By adding, reading, and deleting cookies, you can maintain or reset state within your Selenium WebDriver sessions.