How does Headless Chromium handle website cookies and local storage?

Headless Chromium handles website cookies and local storage the same way a regular GUI-based Chrome browser does. When you visit a website in headless mode, the browser receives and stores cookies and reads and writes local storage just as it would in a regular browser session.

Here's how headless Chromium handles cookies and local storage:

Cookies

  • Setting Cookies: When a server sends a Set-Cookie header as part of an HTTP response, headless Chromium will store the cookie based on the current cookie policy. By default, Chromium accepts first-party cookies; how third-party cookies are handled depends on the browser version and settings.

  • Sending Cookies: When making subsequent requests to the same domain, headless Chromium will automatically include the appropriate cookies in the HTTP request headers, just as a regular browser would.

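  • Cookie Isolation: Cookies are isolated by profile or session. If you start a new headless session without specifying a user data directory, it will not have access to any existing cookies. Automation tools such as Selenium (through ChromeDriver) and Puppeteer create a temporary profile for each run in that case.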

  • Persistent Cookies: If you want to persist cookies across sessions, you can specify a user data directory using the --user-data-dir flag. This flag tells Chromium where to store user data such as cookies, local storage, and cache.

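  • Session Cookies: Any cookies set without an explicit expiration (no Expires or Max-Age attribute) are session cookies. They are discarded when the headless browser session ends and are generally not restored on the next launch, even when a user data directory is specified. To reuse them, export them before closing the browser and add them again in the next session.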
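The difference between session and persistent cookies comes down to whether the Set-Cookie header carries a lifetime attribute. The following is a minimal sketch using only Python's standard library; the function name classify_set_cookie and the sample header values are hypothetical and chosen for illustration. It does not reproduce Chromium's internal logic, only the basic rule described above.

```python
from http.cookies import SimpleCookie


def classify_set_cookie(header_value: str) -> dict:
    """Parse one Set-Cookie header value and report each cookie's lifetime type."""
    jar = SimpleCookie()
    jar.load(header_value)
    result = {}
    for name, morsel in jar.items():
        # A cookie with neither Expires nor Max-Age is a session cookie.
        has_lifetime = bool(morsel["expires"]) or bool(morsel["max-age"])
        result[name] = "persistent" if has_lifetime else "session"
    return result


if __name__ == "__main__":
    # Hypothetical header values, as a server might send them.
    print(classify_set_cookie("sid=abc123; Path=/; HttpOnly"))       # {'sid': 'session'}
    print(classify_set_cookie("theme=dark; Max-Age=86400; Path=/"))  # {'theme': 'persistent'}
```

A check like this can help decide which cookies need to be exported manually if they should outlive the browser session.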

Local Storage

  • Writing to Local Storage: When a webpage uses JavaScript to store data in local storage, headless Chromium will save this data in the browser profile: the specified user data directory, or a temporary profile if none is specified.

  • Reading from Local Storage: When a webpage accesses local storage, headless Chromium will read the data from the same profile, if available. Local storage is scoped to the page's origin, so a page only sees data written by its own origin.

  • Persistence: Local storage entries have no expiration date. Data will remain across sessions when you use the --user-data-dir flag. Without this flag, the data will be lost when the session ends.
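When a persistent profile is not an option, local storage can also be copied out and written back through JavaScript execution. The sketch below assumes a Selenium-style driver object that exposes execute_script(script, *args); the helper names dump_local_storage and restore_local_storage and the JSON snapshot file are hypothetical, not part of any library.

```python
import json

# JavaScript that copies every localStorage entry into a plain object.
DUMP_JS = """
const out = {};
for (let i = 0; i < localStorage.length; i++) {
  const k = localStorage.key(i);
  out[k] = localStorage.getItem(k);
}
return out;
"""

# JavaScript that writes each entry of the first argument into localStorage.
RESTORE_JS = """
const items = arguments[0];
for (const [k, v] of Object.entries(items)) {
  localStorage.setItem(k, v);
}
"""


def dump_local_storage(driver, path):
    """Save the current origin's localStorage to a JSON file."""
    # Assumption: driver behaves like a Selenium WebDriver.
    items = driver.execute_script(DUMP_JS)
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(items, fh)
    return items


def restore_local_storage(driver, path):
    """Load a JSON snapshot and write it into the current origin's localStorage."""
    with open(path, encoding="utf-8") as fh:
        items = json.load(fh)
    driver.execute_script(RESTORE_JS, items)
    return items
```

Because local storage is scoped per origin, the driver should already be on a page of the target site before either helper is called.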

How to Use Cookies and Local Storage in Headless Chromium

Here are some practical examples of how to use cookies and local storage in headless Chromium:

Python (with Selenium and ChromeDriver)

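from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
options.add_argument("--headless")
# Reuse this profile directory so cookies and local storage persist between runs
options.add_argument("--user-data-dir=/path/to/your/user/data/dir")

# Selenium 4 takes the ChromeDriver path through a Service object
# (the executable_path argument was deprecated and later removed)
driver = webdriver.Chrome(service=Service('/path/to/chromedriver'), options=options)

try:
    driver.get('http://example.com')

    # To check cookies
    cookies = driver.get_cookies()
    print(cookies)

    # To add a cookie (the browser must already be on a page of the cookie's domain)
    driver.add_cookie({'name': 'key', 'value': 'value'})

    # Local storage is accessed through JavaScript execution
    driver.execute_script("localStorage.setItem('key', 'value');")
    local_storage_item = driver.execute_script("return localStorage.getItem('key');")
    print(local_storage_item)
finally:
    # Always close the browser, even if a step above fails
    driver.quit()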
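Cookies can also be carried between sessions without a shared profile by exporting them to a file and adding them back later. This is a rough sketch, assuming a Selenium WebDriver-like object with get_cookies() and add_cookie(); the helper names save_cookies and load_cookies and the JSON file path are hypothetical.

```python
import json


def save_cookies(driver, path):
    """Write all cookies visible to the current page to a JSON file."""
    # Assumption: driver behaves like a Selenium WebDriver.
    cookies = driver.get_cookies()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cookies, fh)
    return len(cookies)


def load_cookies(driver, path):
    """Read cookies from a JSON file and add each one to the browser."""
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        # With Selenium, the browser must already be on the cookie's domain.
        driver.add_cookie(cookie)
    return len(cookies)
```

A typical flow would call save_cookies(driver, 'cookies.json') before driver.quit(), then in the next run navigate to the site, call load_cookies(driver, 'cookies.json'), and reload the page.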

JavaScript (with Puppeteer)

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    userDataDir: '/path/to/your/user/data/dir'
  });

  const page = await browser.newPage();
  await page.goto('http://example.com');

  // To check cookies
  const cookies = await page.cookies();
  console.log(cookies);

  // To set a cookie
  await page.setCookie({name: 'key', value: 'value'});

  // To write and then read local storage data
  const localStorageData = await page.evaluate(() => {
    localStorage.setItem('key', 'value');
    return localStorage.getItem('key');
  });
  console.log(localStorageData);

  await browser.close();
})();

Using a user data directory is key to persisting cookies and local storage data across sessions. Without it, each headless session is essentially a "fresh" instance, and all stored data will be lost when the session ends.
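The choice between a fresh and a persistent session can be reduced to whether the --user-data-dir flag is passed. The helper below is a small sketch of that decision; the function name chrome_args is hypothetical, and only the --headless and --user-data-dir flags mentioned above are used.

```python
from pathlib import Path


def chrome_args(profile_dir=None):
    """Return headless Chromium flags, creating the profile directory if one is given."""
    args = ["--headless"]
    if profile_dir is not None:
        path = Path(profile_dir).expanduser().resolve()
        # Make sure the directory exists so the profile can be stored there.
        path.mkdir(parents=True, exist_ok=True)
        args.append(f"--user-data-dir={path}")
    return args
```

Calling chrome_args() yields a throwaway session, while chrome_args('/path/to/your/user/data/dir') yields flags for a session that reuses the same profile on each run. Each returned flag can be passed to options.add_argument() in Selenium or included in the args launch option in Puppeteer.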
