How can I configure custom user agents in Playwright?

Configuring a custom user agent in Playwright is useful in web scraping projects where you need to simulate different browsers or devices, or reduce the chance of being flagged as automated traffic. Playwright lets you set the user agent at several levels, from the browser context down to individual pages. This guide covers each method along with best practices for user agent configuration.

What is a User Agent?

A user agent is a string that identifies the browser, operating system, and device making the request to a web server. Websites often use user agents to serve different content based on the client's capabilities or to block automated requests. When web scraping, customizing user agents helps you:

  • Simulate real browser behavior
  • Access mobile or desktop-specific content
  • Bypass basic bot detection
  • Test how websites respond to different browsers
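To make the structure of a user agent string concrete, here is a minimal Python sketch that splits one into its product tokens and platform details. The helper name `describe_user_agent` is hypothetical, and the regular expressions are a simplification for illustration rather than a complete parser.

```python
import re


def describe_user_agent(ua: str) -> dict:
    """Split a user agent string into product tokens and the first platform comment."""
    # Product tokens look like "Name/Version", e.g. "Chrome/91.0.4472.124".
    products = dict(re.findall(r"([A-Za-z]+)/([\d.]+)", ua))
    # The first parenthesised group usually describes the OS and device.
    match = re.search(r"\(([^)]*)\)", ua)
    platform = [part.strip() for part in match.group(1).split(";")] if match else []
    return {"products": products, "platform": platform}


ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
)
info = describe_user_agent(ua)
print(info["platform"])            # OS and architecture details
print(info["products"]["Chrome"])  # browser version token
```

Servers that tailor or block content typically look at these same pieces: the platform comment for the device and the product tokens for the browser and its version.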

Setting User Agent at Browser Context Level

The most common approach is to set the user agent when creating a browser context. This applies the user agent to all pages within that context:

JavaScript/Node.js Example

const { chromium } = require('playwright');

async function scrapeWithCustomUserAgent() {
  const browser = await chromium.launch();

  // Set custom user agent for the entire context
  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
  });

  const page = await context.newPage();
  await page.goto('https://httpbin.org/user-agent');

  // The page will use the custom user agent
  const userAgent = await page.textContent('body');
  console.log('User Agent:', userAgent);

  await browser.close();
}

scrapeWithCustomUserAgent();

Python Example

from playwright.sync_api import sync_playwright

def scrape_with_custom_user_agent():
    with sync_playwright() as p:
        browser = p.chromium.launch()

        # Set custom user agent at context level
        context = browser.new_context(
            user_agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
        )

        page = context.new_page()
        page.goto("https://httpbin.org/user-agent")

        # Extract and print the user agent
        user_agent_text = page.text_content("body")
        print(f"User Agent: {user_agent_text}")

        browser.close()

scrape_with_custom_user_agent()

Setting User Agent at Page Level

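You can also override the User-Agent header for an individual page using the setExtraHTTPHeaders method. This changes the header sent with that page's requests, but it does not change the value scripts read from navigator.userAgent, so prefer the context-level userAgent option when both need to match: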

JavaScript Example

const { chromium } = require('playwright');

async function setPageUserAgent() {
  const browser = await chromium.launch();
  const context = await browser.newContext();
  const page = await context.newPage();

  // Set user agent for this specific page
  await page.setExtraHTTPHeaders({
    'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1'
  });

  await page.goto('https://httpbin.org/headers');

  const headers = await page.textContent('pre');
  console.log('Headers:', headers);

  await browser.close();
}

setPageUserAgent();

Python Example

from playwright.sync_api import sync_playwright

def set_page_user_agent():
    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context()
        page = context.new_page()

        # Set user agent for this specific page
        page.set_extra_http_headers({
            "User-Agent": "Mozilla/5.0 (iPad; CPU OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1"
        })

        page.goto("https://httpbin.org/headers")

        headers = page.text_content("pre")
        print(f"Headers: {headers}")

        browser.close()

set_page_user_agent()

Using Predefined Device User Agents

Playwright provides predefined device configurations that include appropriate user agents. This is particularly useful for mobile emulation:

JavaScript Example

const { chromium, devices } = require('playwright');

async function useDeviceUserAgent() {
  const browser = await chromium.launch();

  // Use iPhone 12 configuration (includes user agent)
  const iPhone12 = devices['iPhone 12'];
  const context = await browser.newContext({
    ...iPhone12,
  });

  const page = await context.newPage();
  await page.goto('https://httpbin.org/user-agent');

  const userAgent = await page.textContent('body');
  console.log('iPhone 12 User Agent:', userAgent);

  await browser.close();
}

useDeviceUserAgent();

Python Example

from playwright.sync_api import sync_playwright

def use_device_user_agent():
    with sync_playwright() as p:
        browser = p.chromium.launch()

        # Use iPhone 12 Pro configuration
        iphone_12_pro = p.devices['iPhone 12 Pro']
        context = browser.new_context(**iphone_12_pro)

        page = context.new_page()
        page.goto("https://httpbin.org/user-agent")

        user_agent = page.text_content("body")
        print(f"iPhone 12 Pro User Agent: {user_agent}")

        browser.close()

use_device_user_agent()
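If you want a predefined device's viewport and touch settings but a different user agent string, one option is to copy the descriptor and replace only that field. The sketch below assumes the descriptor is a plain dictionary with a `user_agent` key, as in the Python example above; `with_custom_user_agent` is a hypothetical helper, and `example_device` is an illustrative stand-in for `p.devices['iPhone 12 Pro']` with made-up values.

```python
def with_custom_user_agent(device: dict, user_agent: str) -> dict:
    """Return a copy of a device descriptor with only the user agent replaced."""
    return {**device, "user_agent": user_agent}


# Stand-in for p.devices["iPhone 12 Pro"]; these values are illustrative assumptions.
example_device = {
    "user_agent": (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 "
        "(KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1"
    ),
    "viewport": {"width": 390, "height": 664},
    "device_scale_factor": 3,
    "is_mobile": True,
    "has_touch": True,
}

custom_ua = (
    "Mozilla/5.0 (iPad; CPU OS 14_6 like Mac OS X) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1"
)
options = with_custom_user_agent(example_device, custom_ua)
print(options["user_agent"])
```

With a real browser object, the result would be passed along the lines of `browser.new_context(**options)`. The original descriptor is left unmodified, so it can be reused for other contexts.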

Dynamic User Agent Rotation

For advanced web scraping scenarios, you might want to rotate user agents to avoid detection. Here's how to implement user agent rotation:

JavaScript Example

const { chromium } = require('playwright');

const userAgents = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0'
];

async function rotateUserAgents() {
  const browser = await chromium.launch();

  for (let i = 0; i < userAgents.length; i++) {
    const context = await browser.newContext({
      userAgent: userAgents[i]
    });

    const page = await context.newPage();
    await page.goto('https://httpbin.org/user-agent');

    const userAgent = await page.textContent('body');
    console.log(`Request ${i + 1} User Agent:`, userAgent);

    await context.close();
  }

  await browser.close();
}

rotateUserAgents();
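A Python counterpart to the rotation loop might look like the sketch below. It assumes `browser` is a Playwright `Browser` from the sync API; the function names and the `USER_AGENTS` pool are illustrative choices, not part of Playwright.

```python
from itertools import cycle

# Illustrative pool; replace with strings from current browser releases.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
]


def assign_user_agents(urls, user_agents):
    """Pair each URL with a user agent, cycling through the pool in order."""
    return list(zip(urls, cycle(user_agents)))


def scrape_with_rotation(browser, urls, user_agents=USER_AGENTS):
    """Visit each URL in a fresh context that uses the next user agent in the pool."""
    results = []
    for url, user_agent in assign_user_agents(urls, user_agents):
        context = browser.new_context(user_agent=user_agent)
        try:
            page = context.new_page()
            page.goto(url)
            results.append((url, user_agent, page.title()))
        finally:
            # Close the context even if navigation fails, so contexts do not pile up.
            context.close()
    return results
```

Creating one context per user agent keeps cookies and storage separate between identities, which mirrors what the JavaScript loop above does.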

User Agent Best Practices

1. Use Realistic User Agents

Always use real, current user agent strings from actual browsers. Avoid outdated or obviously fake user agents:

// Good - real Chrome user agent format (swap in a current version; Chrome 91 dates from 2021)
const goodUA = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36';

// Bad - obviously non-browser user agent
const badUA = 'MyBot/1.0 (Web Scraper)';
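One way to keep a pool of user agents from going stale is to check the Chrome major version in each string against a minimum you maintain yourself. The sketch below is a simple heuristic; `is_current_chrome` and the `minimum_major` threshold are assumptions for illustration, and the right threshold depends on which Chrome releases are current when you run it.

```python
import re
from typing import Optional


def chrome_major_version(ua: str) -> Optional[int]:
    """Return the Chrome major version from a user agent string, or None if absent."""
    match = re.search(r"Chrome/(\d+)\.", ua)
    return int(match.group(1)) if match else None


def is_current_chrome(ua: str, minimum_major: int) -> bool:
    """True if the string is a Chrome user agent at or above the given major version."""
    major = chrome_major_version(ua)
    return major is not None and major >= minimum_major


chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
)
print(chrome_major_version(chrome_ua))
print(is_current_chrome(chrome_ua, minimum_major=120))
print(is_current_chrome("MyBot/1.0 (Web Scraper)", minimum_major=120))
```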

2. Match User Agent with Other Headers

When setting custom user agents, ensure other headers are consistent. For example, when using a mobile user agent, also set appropriate viewport and headers:

const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
  viewport: { width: 375, height: 667 },
  extraHTTPHeaders: {
    'Accept-Language': 'en-US,en;q=0.9',
    'Accept-Encoding': 'gzip, deflate, br'
  }
});

3. Test User Agent Detection

Always verify that your custom user agent is being sent correctly:

// Check if user agent is properly set
const userAgent = await page.evaluate(() => navigator.userAgent);
console.log('Browser User Agent:', userAgent);

// Also check server-side detection
await page.goto('https://httpbin.org/user-agent');
const serverUA = await page.textContent('body');
console.log('Server-detected User Agent:', serverUA);
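The same two-sided check can be sketched in Python. httpbin.org/user-agent responds with a JSON object whose `user-agent` key holds the header it received, so the helper below parses that instead of comparing raw text. The function names are hypothetical, and the sketch assumes the page's body text begins with that JSON.

```python
import json


def server_seen_user_agent(body_text: str) -> str:
    """Extract the user agent from the body of an httpbin.org/user-agent response."""
    # raw_decode parses the leading JSON value and ignores anything after it.
    data, _ = json.JSONDecoder().raw_decode(body_text.lstrip())
    return data["user-agent"]


def user_agent_applied(expected: str, browser_ua: str, body_text: str) -> bool:
    """True only if both the in-page value and the server-side value match."""
    return browser_ua == expected and server_seen_user_agent(body_text) == expected


expected = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
sample_body = json.dumps({"user-agent": expected}) + "\n"
print(user_agent_applied(expected, expected, sample_body))
```

With a Playwright page, `browser_ua` would come from `page.evaluate("navigator.userAgent")` and `body_text` from `page.text_content("body")` after navigating to the httpbin URL. A mismatch on only one side usually means the user agent was set through request headers rather than the context option.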

Common User Agent Strings

Here are some commonly used user agent strings for different browsers and devices:

Desktop Browsers

const desktopUserAgents = {
  chrome: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  firefox: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0',
  safari: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.1 Safari/605.1.15',
  edge: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36 Edg/91.0.864.59'
};

Mobile Browsers

const mobileUserAgents = {
  iphone: 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
  android: 'Mozilla/5.0 (Linux; Android 10; SM-G973F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Mobile Safari/537.36',
  ipad: 'Mozilla/5.0 (iPad; CPU OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1'
};

Combining User Agents with Browser Context Options

To present a consistent browser fingerprint, combine user agent settings with other context options:

const { chromium } = require('playwright');

async function comprehensiveBrowserEmulation() {
  const browser = await chromium.launch();

  const context = await browser.newContext({
    userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    viewport: { width: 1366, height: 768 },
    locale: 'en-US',
    timezoneId: 'America/New_York',
    // extraHTTPHeaders are sent with every request from this context,
    // so only set headers that are valid for all request types.
    // Sec-Fetch-* headers are omitted: the browser sets them per request.
    extraHTTPHeaders: {
      'Accept-Language': 'en-US,en;q=0.9',
      'Accept-Encoding': 'gzip, deflate, br'
    }
  });

  const page = await context.newPage();
  await page.goto('https://httpbin.org/headers');

  const headers = await page.textContent('pre');
  console.log('Complete Headers:', headers);

  await browser.close();
}

comprehensiveBrowserEmulation();

Troubleshooting User Agent Issues

Issue 1: User Agent Not Being Applied

If your custom user agent isn't being applied, check that:

  1. The user agent is set before navigating to the page
  2. The user agent string is properly formatted
  3. The website is not relying on other browser fingerprints to identify the client
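For the formatting check, a few cheap heuristics catch the most common copy-and-paste mistakes. The sketch below is an assumption about what "properly formatted" means in practice (no stray whitespace or line breaks, ASCII only, the usual `Mozilla/5.0` prefix, balanced parentheses); `user_agent_problems` is a hypothetical helper, not a Playwright function.

```python
def user_agent_problems(ua: str) -> list:
    """Return a list of formatting problems found in a user agent string."""
    problems = []
    if ua != ua.strip():
        problems.append("leading or trailing whitespace")
    if "\r" in ua or "\n" in ua:
        problems.append("contains a line break")
    if not ua.isascii():
        problems.append("contains non-ASCII characters")
    if not ua.startswith("Mozilla/5.0 "):
        problems.append("does not start with 'Mozilla/5.0 '")
    if ua.count("(") != ua.count(")"):
        problems.append("unbalanced parentheses")
    return problems


good = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
)
print(user_agent_problems(good))
print(user_agent_problems(good + "\n"))
print(user_agent_problems("Mozilla/5.0 (Windows NT 10.0; Win64"))
```

An empty list only means the string passed these checks, not that a given site will accept it.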

Issue 2: Inconsistent Behavior

Some websites check multiple factors beyond user agents. Consider also setting:

  • Viewport size
  • Accept headers
  • Accept-Language headers
  • Platform-specific features

Issue 3: Mobile User Agent Detection

When using mobile user agents, also configure:

const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1',
  viewport: { width: 375, height: 667 },
  hasTouch: true,
  isMobile: true
});
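Before creating a context, it can help to check that the options agree with the user agent. The sketch below uses the Python option names (`user_agent`, `is_mobile`, `has_touch`); `find_mismatches` and the `MOBILE_MARKERS` list are hypothetical and deliberately simple.

```python
# Substrings that suggest a mobile user agent; an illustrative, incomplete list.
MOBILE_MARKERS = ("iPhone", "iPad", "Android")


def find_mismatches(options: dict) -> list:
    """Return warnings for context options that contradict the user agent."""
    user_agent = options.get("user_agent", "")
    ua_is_mobile = any(marker in user_agent for marker in MOBILE_MARKERS)
    problems = []
    if ua_is_mobile:
        if not options.get("is_mobile"):
            problems.append("mobile user agent but is_mobile is not enabled")
        if not options.get("has_touch"):
            problems.append("mobile user agent but has_touch is not enabled")
    elif options.get("is_mobile"):
        problems.append("is_mobile is enabled but the user agent looks like a desktop browser")
    return problems


mobile_options = {
    "user_agent": (
        "Mozilla/5.0 (iPhone; CPU iPhone OS 14_6 like Mac OS X) AppleWebKit/605.1.15 "
        "(KHTML, like Gecko) Version/14.0.3 Mobile/15E148 Safari/604.1"
    ),
    "viewport": {"width": 375, "height": 667},
    "has_touch": True,
    "is_mobile": True,
}
print(find_mismatches(mobile_options))
print(find_mismatches({"user_agent": mobile_options["user_agent"]}))
```

Options that pass the check can then be passed as `browser.new_context(**mobile_options)`.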

Using User Agents with Proxy Servers

When combining user agents with proxy servers, keep the pairing consistent so that a given proxy does not present a different browser on every request:

const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
  proxy: {
    server: 'http://proxy-server:port',
    username: 'username',
    password: 'password'
  }
});
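One way to keep user agents and proxies consistent is to fix the pairing up front, so each proxy always uses the same user agent. The sketch below builds such pairs as keyword arguments for the Python `new_context` call. `build_identities` is a hypothetical helper, and the proxy hostnames and credentials are placeholders.

```python
def build_identities(user_agents, proxies):
    """Pair each proxy with one fixed user agent, cycling through the pool if needed."""
    if not user_agents:
        raise ValueError("at least one user agent is required")
    return [
        {"user_agent": user_agents[i % len(user_agents)], "proxy": proxy}
        for i, proxy in enumerate(proxies)
    ]


user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
]
# Placeholder proxy settings; the hosts and credentials are not real.
proxies = [
    {"server": "http://proxy1.example.com:8080", "username": "username", "password": "password"},
    {"server": "http://proxy2.example.com:8080", "username": "username", "password": "password"},
    {"server": "http://proxy3.example.com:8080", "username": "username", "password": "password"},
]

identities = build_identities(user_agents, proxies)
for identity in identities:
    print(identity["proxy"]["server"], "->", identity["user_agent"][:40])
```

Each entry can be passed as `browser.new_context(**identity)`. Because the pairing is computed once, repeated runs send the same user agent through the same proxy.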

Integration with Web Scraping APIs

When using web scraping services, you can often specify custom user agents. Similar to how you might handle browser sessions in Puppeteer, many scraping APIs allow user agent customization through parameters.

In more advanced scenarios, such as handling authentication in Puppeteer, custom user agents can be combined with other headers and session management techniques.

Conclusion

Configuring custom user agents in Playwright is straightforward and essential for effective web scraping. Whether you're setting them at the browser context level, page level, or using predefined device configurations, the key is to use realistic user agent strings that match your scraping requirements.

Remember to:

  • Use current, realistic user agent strings
  • Match user agents with appropriate headers and viewport settings
  • Test your configuration to ensure it works as expected
  • Consider rotating user agents for large-scale scraping operations
  • Combine user agents with other browser context options for comprehensive emulation
  • Respect website terms of service and robots.txt files

By following these practices, you'll be able to effectively simulate different browsers and devices in your Playwright automation scripts, leading to more successful web scraping outcomes.
