Websites increasingly use sophisticated detection methods to identify and block headless browsers. Modern bot detection systems analyze browser fingerprints, behavioral patterns, and JavaScript properties to distinguish automated scripts from real users.
This guide covers proven techniques to make your headless Chromium browser appear more human-like and bypass common detection mechanisms.
Understanding Detection Methods
Before implementing evasion techniques, it's important to understand how websites detect headless browsers:
- Navigator properties (e.g., navigator.webdriver = true)
- Missing browser features (plugins, WebGL, canvas)
- Behavioral patterns (no mouse movements, consistent timing)
- HTTP fingerprinting (headers, TLS signatures)
- Canvas and WebGL fingerprinting
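To make the list above concrete, here is a minimal sketch of how a detection script might score a collected fingerprint. The function name, the dictionary keys, and the specific heuristics are illustrative assumptions, not the logic of any real detection product.

```python
from typing import Any, Dict, List


def find_headless_signals(fp: Dict[str, Any]) -> List[str]:
    """Return a description of every heuristic that the fingerprint trips.

    `fp` is a hypothetical dictionary of values collected in the browser.
    """
    signals = []
    # Navigator property set by automation frameworks
    if fp.get("webdriver") is True:
        signals.append("navigator.webdriver is true")
    # Missing browser features
    if fp.get("plugins_count", 0) == 0:
        signals.append("no plugins")
    if not fp.get("webgl_vendor"):
        signals.append("missing WebGL vendor")
    # Telltale token in the user agent string
    if "HeadlessChrome" in fp.get("user_agent", ""):
        signals.append("HeadlessChrome in user agent")
    # Behavioral pattern: no pointer activity at all
    if fp.get("mouse_events", 0) == 0:
        signals.append("no mouse movement")
    return signals
```

Running your own collected values through a checker like this is a quick way to see which of the categories above a given configuration still exposes.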
Core Evasion Techniques
1. Disable Automation Indicators
A common first step is to disable Chromium's automation control features:
# Python with Selenium
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--headless=new") # Use new headless mode
options.add_argument("--disable-blink-features=AutomationControlled")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options)
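# Hide the webdriver property on every page. execute_script would only patch
# the document that is currently loaded, so register the script through CDP.
driver.execute_cdp_cmd(
    "Page.addScriptToEvaluateOnNewDocument",
    {"source": "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"},
)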
// Node.js with Puppeteer
const puppeteer = require('puppeteer');
const browser = await puppeteer.launch({
  headless: 'new',
  args: [
    '--disable-blink-features=AutomationControlled',
    '--disable-dev-shm-usage',
    '--no-sandbox'
  ],
  ignoreDefaultArgs: ["--enable-automation"]
});
const page = await browser.newPage();
// Hide webdriver property
await page.evaluateOnNewDocument(() => {
  Object.defineProperty(navigator, 'webdriver', {
    get: () => undefined,
  });
});
2. User Agent and Headers Management
Set realistic user agents and HTTP headers that match real browsers:
# Python - Dynamic user agent rotation
import random
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
]
options.add_argument(f"--user-agent={random.choice(user_agents)}")
# Add a realistic viewport. --start-maximized is omitted because it conflicts
# with an explicit --window-size.
options.add_argument("--window-size=1366,768")
// JavaScript - Set headers and viewport
await page.setUserAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36');
await page.setViewport({
  width: 1366,
  height: 768,
  deviceScaleFactor: 1,
  hasTouch: false,
  isLandscape: true,
  isMobile: false
});
// Set additional headers
await page.setExtraHTTPHeaders({
  'Accept-Language': 'en-US,en;q=0.9',
  'Accept-Encoding': 'gzip, deflate, br',
  'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
  'Connection': 'keep-alive',
  'Upgrade-Insecure-Requests': '1'
});
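A rotated user agent only helps if the other values the browser reports agree with it. The following sketch, with hypothetical helper names, derives the navigator.platform value that typically accompanies each desktop Chrome user agent and extracts the Chrome major version, so a configuration can be checked for mismatches before use.

```python
import re
from typing import Optional

# OS token in the user agent -> navigator.platform value usually reported
# by desktop Chrome on that OS
PLATFORM_BY_OS_TOKEN = {
    "Windows NT": "Win32",
    "Macintosh": "MacIntel",
    "X11; Linux": "Linux x86_64",
}


def expected_platform(user_agent: str) -> Optional[str]:
    """Return the navigator.platform value that matches the user agent."""
    for token, platform in PLATFORM_BY_OS_TOKEN.items():
        if token in user_agent:
            return platform
    return None


def chrome_major_version(user_agent: str) -> Optional[int]:
    """Extract the Chrome major version, e.g. 120 from 'Chrome/120.0.0.0'."""
    match = re.search(r"Chrome/(\d+)\.", user_agent)
    return int(match.group(1)) if match else None


def is_consistent(user_agent: str, reported_platform: str) -> bool:
    """True when the reported platform is the one the user agent implies."""
    return expected_platform(user_agent) == reported_platform
```

For example, pairing the Windows user agent from the list above with a browser that reports a Linux platform would fail `is_consistent`, which is the kind of mismatch fingerprinting scripts look for.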
3. Advanced Stealth with Puppeteer Extra
For Node.js projects, puppeteer-extra-plugin-stealth provides comprehensive evasion:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
// Configure stealth plugin
puppeteer.use(StealthPlugin({
  // Enable all evasion techniques
  enabledEvasions: new Set([
    'chrome.app',
    'chrome.csi',
    'chrome.loadTimes',
    'chrome.runtime',
    'defaultArgs',
    'iframe.contentWindow',
    'media.codecs',
    'navigator.hardwareConcurrency',
    'navigator.languages',
    'navigator.permissions',
    'navigator.plugins',
    'navigator.webdriver',
    'sourceurl',
    'user-agent-override',
    'webgl.vendor',
    'window.outerdimensions'
  ])
}));
const browser = await puppeteer.launch({
  headless: 'new',
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-infobars',
    '--window-position=0,0',
    '--ignore-certificate-errors',
    '--ignore-certificate-errors-spki-list',
    '--ignore-ssl-errors'
  ]
});
4. Behavioral Simulation
Make your automation behave more like a human user:
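# Python - Human-like interactions
import time
import random
from selenium.webdriver.common.action_chains import ActionChains

def human_like_delay():
    time.sleep(random.uniform(1.5, 4.0))

def simulate_human_behavior(driver, width=1200, height=700):
    # Random mouse movements
    actions = ActionChains(driver)
    # move_by_offset is relative to the current pointer position, so track the
    # position and compute each offset to keep the pointer inside the viewport.
    # Assumes the pointer starts at the top-left corner (fresh session).
    x, y = 0, 0
    for _ in range(random.randint(2, 5)):
        target_x = random.randint(100, width)
        target_y = random.randint(100, height)
        actions.move_by_offset(target_x - x, target_y - y)
        actions.pause(random.uniform(0.5, 1.5))
        x, y = target_x, target_y
    actions.perform()
    human_like_delay()

# Usage
driver.get("https://example.com")
simulate_human_behavior(driver)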
// JavaScript - Mouse movements and scrolling
// Small sleep helper; page.waitForTimeout has been removed from recent Puppeteer releases
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function humanLikeInteraction(page) {
  // Pick the number of movements once, so the loop bound does not change per iteration
  const moves = Math.floor(Math.random() * 5) + 2;
  for (let i = 0; i < moves; i++) {
    await page.mouse.move(
      Math.random() * 1200,
      Math.random() * 700
    );
    await sleep(Math.random() * 1000 + 500);
  }
  // Scroll down by a random amount
  await page.evaluate(() => {
    const scrollStep = Math.random() * 500 + 200;
    window.scrollTo(0, scrollStep);
  });
  await sleep(Math.random() * 2000 + 1000);
}
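Both examples above jump the pointer directly between random positions. One possible refinement is to generate intermediate points so that the pointer accelerates and decelerates along the way. The sketch below is an illustrative assumption rather than part of either library: `mouse_path`, its parameters, and the smoothstep easing are all hypothetical choices.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def mouse_path(start: Point, end: Point, steps: int = 20,
               jitter: float = 3.0,
               rng: Optional[random.Random] = None) -> List[Point]:
    """Return `steps` points from start to end with easing and small jitter."""
    rng = rng or random.Random()
    x0, y0 = start
    x1, y1 = end
    points = []
    for i in range(1, steps + 1):
        t = i / steps
        # Smoothstep easing: slow at both ends, faster in the middle
        eased = t * t * (3 - 2 * t)
        x = x0 + (x1 - x0) * eased
        y = y0 + (y1 - y0) * eased
        # Jitter every point except the last, so the path ends on target
        if i < steps:
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        points.append((round(x), round(y)))
    return points
```

Each point can then be passed to the automation library's move call with a short pause in between. With Selenium's `move_by_offset`, the absolute points would first need converting to offsets relative to the previous point.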
5. Fingerprint Randomization
Randomize browser fingerprints to avoid pattern detection:
// JavaScript - Canvas and WebGL fingerprint evasion
await page.evaluateOnNewDocument(() => {
  // Canvas fingerprint randomization: wrap the original toDataURL
  const originalToDataURL = HTMLCanvasElement.prototype.toDataURL;
  HTMLCanvasElement.prototype.toDataURL = function(type) {
    if (type === 'image/png' && this.width === 280 && this.height === 60) {
      // Add slight randomization to the red channel of each pixel
      const context = this.getContext('2d');
      const imageData = context.getImageData(0, 0, this.width, this.height);
      for (let i = 0; i < imageData.data.length; i += 4) {
        imageData.data[i] += Math.floor(Math.random() * 3) - 1;
      }
      context.putImageData(imageData, 0, 0);
    }
    return originalToDataURL.apply(this, arguments);
  };
  // WebGL fingerprint evasion
  const getParameter = WebGLRenderingContext.prototype.getParameter;
  WebGLRenderingContext.prototype.getParameter = function(parameter) {
    if (parameter === 37445) { // UNMASKED_VENDOR_WEBGL
      return 'Intel Inc.'; // Generic GPU vendor
    }
    if (parameter === 37446) { // UNMASKED_RENDERER_WEBGL
      return 'Intel(R) HD Graphics'; // Generic GPU renderer
    }
    return getParameter.apply(this, arguments);
  };
});
6. Proxy and Network Management
Implement proper proxy rotation and network patterns:
# Python - Proxy rotation with realistic timing
import itertools
import random
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

class ProxyRotator:
    def __init__(self, proxy_list):
        self.proxies = itertools.cycle(proxy_list)
        self.current_proxy = None

    def get_next_proxy(self):
        self.current_proxy = next(self.proxies)
        return {
            'http': f'http://{self.current_proxy}',
            'https': f'http://{self.current_proxy}'
        }

    def configure_selenium(self, options):
        if self.current_proxy:
            options.add_argument(f'--proxy-server=http://{self.current_proxy}')

# Usage
proxy_list = ['proxy1:port', 'proxy2:port', 'proxy3:port']
urls_to_scrape = ['https://example.com']  # placeholder list of target URLs
rotator = ProxyRotator(proxy_list)
for url in urls_to_scrape:
    rotator.get_next_proxy()  # advance to the next proxy in the cycle
    # Configure new browser instance with proxy
    options = Options()
    rotator.configure_selenium(options)
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # ... scraping logic ...
    finally:
        # Always release the browser, even if scraping raises
        driver.quit()
    # Human-like delay between requests
    time.sleep(random.uniform(10, 30))
Complete Stealth Configuration Example
Here's a comprehensive example combining all techniques:
# Python - Complete stealth setup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import random

def create_stealth_driver():
    options = Options()
    # Basic stealth options
    options.add_argument("--headless=new")
    options.add_argument("--disable-blink-features=AutomationControlled")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    # Stability and debugging flags, mainly relevant in containerized environments
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--no-sandbox")
    options.add_argument("--disable-gpu")
    options.add_argument("--remote-debugging-port=9222")
    # Random viewport
    viewports = [(1366, 768), (1920, 1080), (1440, 900), (1280, 720)]
    width, height = random.choice(viewports)
    options.add_argument(f"--window-size={width},{height}")
    # Random user agent
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ]
    options.add_argument(f"--user-agent={random.choice(user_agents)}")
    driver = webdriver.Chrome(options=options)
    # Register the stealth scripts through CDP so they run on every new
    # document; execute_script would only patch the currently loaded page
    stealth_js = """
    Object.defineProperty(navigator, 'webdriver', {get: () => undefined});
    Object.defineProperty(navigator, 'plugins', {get: () => [1, 2, 3, 4, 5]});
    Object.defineProperty(navigator, 'languages', {get: () => ['en-US', 'en']});
    window.chrome = {runtime: {}};
    """
    driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": stealth_js}
    )
    return driver

# Usage
driver = create_stealth_driver()
driver.get("https://bot-detection-test.com")
Detection Testing and Validation
Test your stealth configuration against bot detection services:
// Test against common detection services
// Small sleep helper; page.waitForTimeout has been removed from recent Puppeteer releases
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
const testUrls = [
  'https://intoli.com/blog/not-possible-to-block-chrome-headless/chrome-headless-test.html',
  'https://arh.antoinevastel.com/bots/areyouheadless',
  'https://bot.sannysoft.com/'
];
for (const url of testUrls) {
  console.log(`Testing: ${url}`);
  await page.goto(url, { waitUntil: 'networkidle2' });
  // Take screenshot to verify results. Name the file after the hostname,
  // because the last path segment is empty for URLs that end in "/"
  await page.screenshot({
    path: `test-${new URL(url).hostname}.png`,
    fullPage: true
  });
  await sleep(5000);
}
Best Practices and Considerations
Compliance and Ethics
- Always respect robots.txt and website terms of service
- Implement rate limiting to avoid overwhelming servers
- Use official APIs when available instead of scraping
- Consider legal implications in your jurisdiction
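The first two points can be enforced in code rather than left to convention. Below is a minimal sketch using the standard library's `urllib.robotparser`. The `PoliteFetchGate` class, its parameters, and the 5-second default delay are assumptions made for illustration.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteFetchGate:
    """Checks robots.txt rules and spaces out requests to one site."""

    def __init__(self, parser, user_agent, default_delay=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.parser = parser
        self.user_agent = user_agent
        self.default_delay = default_delay
        self.clock = clock
        self.sleep = sleep
        self._last_request = None

    def delay(self) -> float:
        # Prefer the site's Crawl-delay directive when it declares one
        crawl_delay = self.parser.crawl_delay(self.user_agent)
        return float(crawl_delay) if crawl_delay is not None else self.default_delay

    def wait_for_turn(self, url: str) -> bool:
        """Return False if robots.txt disallows the URL, else wait and return True."""
        if not self.parser.can_fetch(self.user_agent, url):
            return False
        if self._last_request is not None:
            remaining = self.delay() - (self.clock() - self._last_request)
            if remaining > 0:
                self.sleep(remaining)
        self._last_request = self.clock()
        return True
```

Calling `wait_for_turn(url)` before each `driver.get(url)` skips disallowed paths and keeps the request rate at or below what the site asks for. The injectable `clock` and `sleep` exist only to make the gate easy to test.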
Performance Optimization
- Reuse browser instances when possible to reduce overhead
- Implement connection pooling for better resource management
- Monitor memory usage to prevent crashes during long sessions
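As a rough illustration of reusing instances while bounding memory growth, the sketch below keeps a fixed number of browsers and replaces each one after a set number of uses. `BrowserPool`, `factory`, `close`, and `max_uses` are hypothetical names. In practice `factory` might be a function such as `create_stealth_driver` and `close` might call `driver.quit()`.

```python
import queue
from contextlib import contextmanager


class BrowserPool:
    """Reuses browser instances and recycles each one after max_uses uses."""

    def __init__(self, factory, close, size=2, max_uses=50):
        self._factory = factory
        self._close = close
        self._max_uses = max_uses
        self._idle = queue.Queue()
        for _ in range(size):
            # Store each instance together with its use count
            self._idle.put((factory(), 0))

    @contextmanager
    def acquire(self):
        # Blocks until an instance is free
        browser, uses = self._idle.get()
        try:
            yield browser
        finally:
            uses += 1
            if uses >= self._max_uses:
                # Replace long-lived instances to limit memory growth
                self._close(browser)
                browser, uses = self._factory(), 0
            self._idle.put((browser, uses))

    def shutdown(self):
        """Close every idle instance."""
        while not self._idle.empty():
            browser, _ = self._idle.get_nowait()
            self._close(browser)
```

Usage follows the context-manager pattern: `with pool.acquire() as driver: driver.get(url)`. The instance goes back to the pool even if the body raises.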
Monitoring and Maintenance
- Log detection events to identify pattern failures
- Update user agents regularly to match current browser versions
- Test configurations periodically as detection methods evolve
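Logging detection events first requires deciding what counts as one. The sketch below treats certain status codes and page markers as a probable block and logs them with the configuration in use. The status codes, marker strings, and function names are illustrative assumptions and should be tuned to the sites involved.

```python
import logging
from typing import Optional

logger = logging.getLogger("scraper.detection")

# Assumed signals of a block; adjust for the sites you work with
BLOCK_STATUS_CODES = {403, 429, 503}
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic")


def detect_block(status_code: int, body: str) -> Optional[str]:
    """Return a short reason if the response looks like a block, else None."""
    if status_code in BLOCK_STATUS_CODES:
        return f"status {status_code}"
    lowered = body.lower()
    for marker in BLOCK_MARKERS:
        if marker in lowered:
            return f"marker '{marker}'"
    return None


def record_response(url: str, status_code: int, body: str,
                    config_name: str) -> bool:
    """Log a warning for probable blocks; return True when one is detected."""
    reason = detect_block(status_code, body)
    if reason is None:
        return False
    logger.warning("possible block: url=%s config=%s reason=%s",
                   url, config_name, reason)
    return True
```

Recording the configuration name with each event makes it possible to see which user agent, viewport, or proxy combinations start failing as detection methods change.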
Alternative Solutions
For large-scale or mission-critical scraping, consider:
- Residential proxy services with automatic rotation
- Browser automation services with built-in stealth features
- Web scraping APIs that handle detection evasion automatically
- CAPTCHA solving services for interactive challenges
Remember that detection techniques continuously evolve, so no evasion method is permanently effective. The key is combining multiple techniques and staying updated with the latest developments in both bot detection and evasion methods.