
How can I set custom headers or cookies in Headless Chromium?

Setting custom headers and cookies in headless Chromium is essential for web scraping, testing, and automation tasks. Whether you need to simulate user sessions, pass authentication tokens, or modify request headers, there are several methods available depending on your technology stack.

Why Set Custom Headers and Cookies?

  • Authentication: Pass bearer tokens or API keys
  • Session simulation: Maintain logged-in user states
  • User-Agent spoofing: Avoid being flagged as an automated scraper
  • Localization: Set language/region preferences
  • Rate limiting: Include API keys or request identifiers that the server uses to track quotas

Method 1: JavaScript with Puppeteer (Recommended)

Puppeteer offers the most straightforward approach for setting both headers and cookies:

Setting Custom Headers

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  const page = await browser.newPage();

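  // Override the User-Agent with setUserAgent so navigator.userAgent matches the header
  await page.setUserAgent('MyCustomBot/1.0');

  // Set additional headers; they are sent with every request the page makes
  await page.setExtraHTTPHeaders({
    'Authorization': 'Bearer your-token-here',
    'Custom-API-Key': 'your-api-key',
    'Accept-Language': 'en-US,en;q=0.9'
  });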

  await page.goto('https://httpbin.org/headers');

  const content = await page.content();
  console.log(content);

  await browser.close();
})();

Setting Cookies

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Set multiple cookies
  await page.setCookie(
    {
      name: 'session_id',
      value: 'abc123def456',
      domain: 'example.com',
      path: '/',
      httpOnly: true,
      secure: true
    },
    {
      name: 'user_preference',
      value: 'dark_mode',
      domain: 'example.com',
      path: '/'
    }
  );

  await page.goto('https://example.com');

  // Verify cookies were set
  const cookies = await page.cookies();
  console.log('Current cookies:', cookies);

  await browser.close();
})();

Combined Headers and Cookies Example

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Set headers first
  await page.setExtraHTTPHeaders({
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Referer': 'https://google.com',
    'X-Requested-With': 'XMLHttpRequest'
  });

  // Navigate to domain first, then set cookies
  await page.goto('https://example.com');

  await page.setCookie({
    name: 'auth_token',
    value: 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...',
    domain: 'example.com',
    httpOnly: true,
    secure: true,
    sameSite: 'Strict'
  });

  // Now make authenticated requests
  await page.goto('https://example.com/dashboard');

  await browser.close();
})();

Method 2: Python with Selenium

Setting Cookies with Selenium

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Configure Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")

# Initialize driver
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service, options=chrome_options)

try:
    # Navigate to domain first (required for cookies)
    driver.get("https://example.com")

    # Set multiple cookies
    cookies = [
        {"name": "session_id", "value": "abc123def456"},
        {"name": "user_lang", "value": "en-US"},
        {"name": "theme", "value": "dark", "path": "/", "secure": True}
    ]

    for cookie in cookies:
        driver.add_cookie(cookie)

    # Refresh to apply cookies
    driver.refresh()

    # Verify cookies
    all_cookies = driver.get_cookies()
    print(f"Set {len(all_cookies)} cookies")

finally:
    driver.quit()
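
When simulating a session, the cookies often start life as a single `Cookie:` header string copied from the browser's DevTools. The sketch below shows one way to turn such a string into the list of dictionaries that `driver.add_cookie` expects, using only the standard library. The function name `cookie_header_to_dicts` and the default `path` are assumptions made for illustration; `SimpleCookie` may skip values containing characters it considers invalid, so check the result for real-world headers.

```python
from http.cookies import SimpleCookie


def cookie_header_to_dicts(header, path="/"):
    """Split a 'name=value; name2=value2' string into Selenium-style cookie dicts."""
    jar = SimpleCookie()
    jar.load(header)
    # Each parsed entry is a Morsel; only its name and value matter here.
    return [
        {"name": name, "value": morsel.value, "path": path}
        for name, morsel in jar.items()
    ]
```

Each returned dictionary can be passed to `driver.add_cookie` inside the loop shown above, after `driver.get` has loaded a page on the target domain.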

Setting Headers with Selenium (Advanced)

Selenium has no built-in API for custom request headers, but with Chrome you can send Chrome DevTools Protocol (CDP) commands through execute_cdp_cmd:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_experimental_option("useAutomationExtension", False)
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])

driver = webdriver.Chrome(options=chrome_options)

# Enable CDP
driver.execute_cdp_cmd('Network.enable', {})

# Set custom headers using CDP
headers = {
    "User-Agent": "CustomBot/1.0",
    "Authorization": "Bearer token123",
    "Custom-Header": "custom-value"
}

driver.execute_cdp_cmd('Network.setUserAgentOverride', {
    "userAgent": headers.get("User-Agent", "")
})

# For other headers, use Network.setExtraHTTPHeaders
driver.execute_cdp_cmd('Network.setExtraHTTPHeaders', {
    "headers": {k: v for k, v in headers.items() if k != "User-Agent"}
})

driver.get("https://httpbin.org/headers")
driver.quit()
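
The split between the User-Agent and the remaining headers can be factored into a small helper. This is a minimal sketch, not part of Selenium: `split_headers_for_cdp` is a hypothetical name, and it assumes the two CDP methods used above (`Network.setUserAgentOverride` and `Network.setExtraHTTPHeaders`). Unlike the inline version, it matches the User-Agent key case-insensitively and skips the override entirely when no User-Agent is given.

```python
def split_headers_for_cdp(headers):
    """Return a list of (cdp_method, params) pairs for a plain headers dict."""
    user_agent = None
    extra = {}
    for name, value in headers.items():
        if name.lower() == "user-agent":
            user_agent = value
        else:
            # CDP expects header values to be strings.
            extra[name] = str(value)

    commands = []
    if user_agent is not None:
        commands.append(("Network.setUserAgentOverride", {"userAgent": user_agent}))
    if extra:
        commands.append(("Network.setExtraHTTPHeaders", {"headers": extra}))
    return commands
```

With a live driver, the result would be applied with `for method, params in split_headers_for_cdp(headers): driver.execute_cdp_cmd(method, params)`.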

Method 3: Chrome DevTools Protocol (CDP)

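For more advanced control, send CDP commands directly. The example below uses pyppeteer, an unofficial Python port of Puppeteer; note that page._client is a private attribute and may change between releases: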

import asyncio
from pyppeteer import launch

async def set_headers_cookies_cdp():
    browser = await launch(headless=True)
    page = await browser.newPage()

    # Enable network domain
    await page._client.send('Network.enable')

    # Set custom headers
    await page._client.send('Network.setExtraHTTPHeaders', {
        'headers': {
            'Custom-Token': 'abc123',
            'API-Version': 'v2',
            'User-Agent': 'CDP-Bot/1.0'
        }
    })

    # Set cookies via CDP
    await page._client.send('Network.setCookie', {
        'name': 'session',
        'value': 'cdp-session-123',
        'domain': 'example.com',
        'path': '/',
        'httpOnly': True
    })

    await page.goto('https://example.com')
    await browser.close()

# Run the async function
asyncio.run(set_headers_cookies_cdp())
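
Cookies collected with Selenium do not map one-to-one onto CDP parameters: Selenium names the expiration field `expiry`, while `Network.setCookie` uses `expires`, and CDP needs either a `domain` or a `url` for every cookie. The sketch below shows one possible conversion. The helper name `to_cdp_cookie` and its `default_domain` parameter are assumptions for illustration; the `sameSite`/`secure` check reflects Chrome's rule that `SameSite=None` cookies must also be `Secure`.

```python
def to_cdp_cookie(cookie, default_domain=None):
    """Convert a Selenium-style cookie dict into Network.setCookie parameters."""
    params = {key: value for key, value in cookie.items() if key != "expiry"}
    if "expiry" in cookie:
        # Selenium calls the field "expiry"; CDP calls it "expires" (epoch seconds).
        params["expires"] = cookie["expiry"]

    if default_domain and "domain" not in params and "url" not in params:
        params["domain"] = default_domain
    if "domain" not in params and "url" not in params:
        raise ValueError("CDP needs a domain or url for each cookie")

    if params.get("sameSite") == "None" and not params.get("secure"):
        raise ValueError("sameSite=None cookies must also be secure")

    params.setdefault("path", "/")
    return params
```

In the pyppeteer example above, the result could be passed as the second argument of `page._client.send('Network.setCookie', ...)`.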

Method 4: Command Line Options

The command line is more limited. Chrome has a switch for the User-Agent but none for arbitrary request headers or individual cookies; to reuse cookies, point Chrome at an existing profile directory:

# Set user agent
google-chrome --headless --disable-gpu --user-agent="MyBot/1.0" --dump-dom https://example.com

# Reuse cookies stored in an existing profile directory
google-chrome --headless --disable-gpu --user-data-dir=/path/to/profile --dump-dom https://example.com

# Set additional Chrome flags
google-chrome --headless \
  --disable-gpu \
  --no-sandbox \
  --disable-dev-shm-usage \
  --user-agent="Custom Agent" \
  --dump-dom https://example.com
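
When these commands are launched from a script, building the argument list in code avoids shell-quoting problems with User-Agent strings that contain spaces. The following is a minimal sketch; `build_chrome_argv` is a hypothetical helper, the binary name `google-chrome` is taken from the commands above, and `--user-data-dir` is the standard Chromium switch for pointing at a profile directory whose stored cookies are reused.

```python
def build_chrome_argv(url, user_agent=None, user_data_dir=None, binary="google-chrome"):
    """Build an argument list for a one-shot headless Chrome DOM dump."""
    argv = [binary, "--headless"]
    if user_agent:
        # No manual quoting: each list item is passed to Chrome as one argument.
        argv.append(f"--user-agent={user_agent}")
    if user_data_dir:
        argv.append(f"--user-data-dir={user_data_dir}")
    argv += ["--dump-dom", url]
    return argv
```

The list can be handed to `subprocess.run(argv, capture_output=True, text=True)`, in which case the dumped DOM is available on the result's `stdout` attribute.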

Best Practices

Cookie Management

  • With Selenium, navigate to the target domain before adding cookies; with Puppeteer or CDP, give each cookie an explicit domain or url instead
  • Set appropriate cookie attributes (httpOnly, secure, sameSite)
  • Handle cookie expiration for long-running sessions
  • Clear cookies between sessions to avoid conflicts
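
For long-running sessions, one way to handle expiration is to filter out stale cookies before restoring a saved set. The sketch below assumes cookies in the shapes returned by the tools above: Selenium reports the expiration as `expiry` and omits it for session cookies, while Puppeteer reports `expires` and uses `-1` for session cookies. The helper name `drop_expired` is hypothetical.

```python
import time


def drop_expired(cookies, now=None):
    """Return only the cookies that are still valid at time `now` (epoch seconds)."""
    now = time.time() if now is None else now
    kept = []
    for cookie in cookies:
        expires = cookie.get("expires", cookie.get("expiry"))
        # A missing or negative value marks a session cookie, which never expires by date.
        if expires is None or expires < 0 or expires > now:
            kept.append(cookie)
    return kept
```

Passing an explicit `now` makes the function easy to test; in normal use it defaults to the current time.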

Header Configuration

  • Set headers before making requests
  • Use realistic User-Agent strings to avoid detection
  • Include standard headers like Accept, Accept-Language
  • Treat header names as case-insensitive and keep their casing consistent; HTTP/2 sends them in lowercase
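
Because header names are case-insensitive, combining defaults with per-request overrides can accidentally produce both `Accept-Language` and `accept-language`. The sketch below merges several header dictionaries under lowercase keys and rejects values that are not plain single-line strings. `merge_headers` is a hypothetical helper, not part of Puppeteer or Selenium.

```python
def merge_headers(*header_dicts):
    """Merge header dicts case-insensitively; later dicts override earlier ones."""
    merged = {}
    for headers in header_dicts:
        for name, value in headers.items():
            if not isinstance(value, str):
                raise TypeError(f"header {name!r} must be a string, got {type(value).__name__}")
            if "\r" in value or "\n" in value:
                raise ValueError(f"header {name!r} contains a line break")
            merged[name.lower()] = value
    return merged
```

The merged dictionary can then be passed to whichever header-setting call the chosen tool provides.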

Error Handling

const puppeteer = require('puppeteer');

(async () => {
  let browser;
  try {
    browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();

    // Set headers with error handling
    try {
      await page.setExtraHTTPHeaders({
        'Authorization': 'Bearer ' + process.env.AUTH_TOKEN
      });
    } catch (error) {
      console.error('Failed to set headers:', error);
    }

    // Set cookies with validation
    const cookiesToSet = [
      { name: 'session', value: 'abc123', domain: 'example.com' }
    ];

    for (const cookie of cookiesToSet) {
      try {
        await page.setCookie(cookie);
      } catch (error) {
        console.error(`Failed to set cookie ${cookie.name}:`, error);
      }
    }

    await page.goto('https://example.com');

  } catch (error) {
    console.error('Browser operation failed:', error);
  } finally {
    if (browser) {
      await browser.close();
    }
  }
})();

Troubleshooting Common Issues

  1. Cookies not being sent: With Selenium, navigate to the cookie's domain before adding it; with Puppeteer, check that the cookie's domain and path match the URL you request
  2. Headers not applied: Set headers before navigation, not after
  3. SSL errors: Use --ignore-certificate-errors flag for development
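  4. CORS issues: CORS is enforced by the browser from the server's Access-Control-* response headers, so overriding Origin or Referer is usually not enough on its own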
  5. Bot detection: Use realistic headers and implement delays between requests
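
To confirm that headers were actually applied, the page returned by an echo endpoint such as https://httpbin.org/headers can be compared against what was expected. The sketch below assumes the response is a JSON object with a top-level `headers` key, which Chrome typically wraps in a `<pre>` element when rendering it as a page. `missing_headers` is a hypothetical helper.

```python
import html
import json
import re


def missing_headers(page_source, expected):
    """Return the expected headers that the echo endpoint did not report back."""
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, re.S)
    # Fall back to the raw text when the JSON is not wrapped in a <pre> element.
    body = html.unescape(match.group(1)) if match else page_source
    echoed = {name.lower(): value for name, value in json.loads(body)["headers"].items()}
    return {
        name: value
        for name, value in expected.items()
        if echoed.get(name.lower()) != value
    }
```

An empty result means every expected header was echoed back with the expected value; `page_source` can come from `page.content()` in Puppeteer or `driver.page_source` in Selenium.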

Choose the method that best fits your technology stack and requirements. Puppeteer is generally recommended for its simplicity and comprehensive feature set.
