How do I set up proxy configuration in Playwright?

Setting up proxy configuration in Playwright is essential for web scraping projects that require IP rotation, bypassing geo-restrictions, or routing traffic through specific servers. Playwright provides flexible proxy support for HTTP, HTTPS, and SOCKS proxies at both browser and context levels.

Basic Proxy Configuration

Browser-Level Proxy Setup

The most common approach is to configure the proxy when launching the browser. This applies the proxy settings to all contexts and pages within that browser instance.

JavaScript:

const { chromium } = require('playwright');

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy-server.com:8080'
  }
});

const context = await browser.newContext();
const page = await context.newPage();

// All requests will now go through the proxy
await page.goto('https://httpbin.org/ip');
Python:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(
        proxy={
            'server': 'http://proxy-server.com:8080'
        }
    )

    context = browser.new_context()
    page = context.new_page()

    page.goto('https://httpbin.org/ip')
    browser.close()

Context-Level Proxy Configuration

For more granular control, you can configure proxies at the context level, allowing different contexts to use different proxies.

JavaScript:

const { chromium } = require('playwright');

const browser = await chromium.launch();

const context = await browser.newContext({
  proxy: {
    server: 'http://proxy-server.com:8080'
  }
});

const page = await context.newPage();
await page.goto('https://example.com');
Python:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()

    context = browser.new_context(
        proxy={
            'server': 'http://proxy-server.com:8080'
        }
    )

    page = context.new_page()
    page.goto('https://example.com')
    browser.close()

Proxy Authentication

Many proxy services require authentication. Playwright supports both username/password and token-based authentication.

Username and Password Authentication

JavaScript:

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy-server.com:8080',
    username: 'your-username',
    password: 'your-password'
  }
});
Python:

browser = p.chromium.launch(
    proxy={
        'server': 'http://proxy-server.com:8080',
        'username': 'your-username',
        'password': 'your-password'
    }
)

Advanced Authentication Headers

Playwright's supported proxy authentication mechanism is the username/password pair shown above. If your provider expects a token instead, you can try sending it as a custom header; note, however, that extraHTTPHeaders attaches headers to the page's outgoing requests rather than to the proxy handshake itself, so whether the proxy actually sees this header depends on your setup:

const context = await browser.newContext({
  proxy: {
    server: 'http://proxy-server.com:8080'
  },
  extraHTTPHeaders: {
    'Proxy-Authorization': 'Bearer your-token-here'
  }
});

Different Proxy Types

HTTP/HTTPS Proxies

// HTTP proxy
const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy-server.com:8080'
  }
});

// HTTPS proxy
const browser = await chromium.launch({
  proxy: {
    server: 'https://secure-proxy.com:8080'
  }
});

SOCKS Proxies

Playwright supports both SOCKS4 and SOCKS5 proxies. Note that authentication is not supported for SOCKS proxies, so the username and password options are ignored for them:

JavaScript:

// SOCKS5 proxy
const browser = await chromium.launch({
  proxy: {
    server: 'socks5://proxy-server.com:1080'
  }
});

// SOCKS4 proxy
const browser = await chromium.launch({
  proxy: {
    server: 'socks4://proxy-server.com:1080'
  }
});
Python:

# SOCKS5 proxy
browser = p.chromium.launch(
    proxy={
        'server': 'socks5://proxy-server.com:1080'
    }
)

Proxy Bypass Configuration

You can configure Playwright to bypass the proxy for specific URLs or domains:

JavaScript:

const browser = await chromium.launch({
  proxy: {
    server: 'http://proxy-server.com:8080',
    bypass: 'localhost,127.0.0.1,*.internal.com'
  }
});
Python:

browser = p.chromium.launch(
    proxy={
        'server': 'http://proxy-server.com:8080',
        'bypass': 'localhost,127.0.0.1,*.internal.com'
    }
)

Multiple Proxy Configuration

For advanced web scraping scenarios, you might need to rotate between multiple proxies. The example below launches a fresh browser for each URL because the proxy is fixed at launch time; since contexts accept their own proxy option, the same rotation can also be done more cheaply with a single browser and a new context per request. Here's how to implement proxy rotation:

const proxies = [
  { server: 'http://proxy1.com:8080', username: 'user1', password: 'pass1' },
  { server: 'http://proxy2.com:8080', username: 'user2', password: 'pass2' },
  { server: 'http://proxy3.com:8080', username: 'user3', password: 'pass3' }
];

async function scrapeWithProxyRotation(urls) {
  const { chromium } = require('playwright');

  for (let i = 0; i < urls.length; i++) {
    const proxy = proxies[i % proxies.length];

    const browser = await chromium.launch({ proxy });
    const context = await browser.newContext();
    const page = await context.newPage();

    try {
      await page.goto(urls[i]);
      // Process the page
      const content = await page.content();
      console.log(`Scraped ${urls[i]} (${content.length} chars) via ${proxy.server}`);
    } catch (error) {
      console.error(`Failed to scrape ${urls[i]}:`, error.message);
    } finally {
      await browser.close();
    }
  }
}

Proxy Health Checking

It's important to verify that your proxy is working correctly. Here's a utility function to test proxy connectivity:

async function testProxy(proxyConfig) {
  const { chromium } = require('playwright');

  try {
    const browser = await chromium.launch({ proxy: proxyConfig });
    const context = await browser.newContext();
    const page = await context.newPage();

    // Test the proxy by checking IP
    await page.goto('https://httpbin.org/ip', { timeout: 10000 });
    const response = await page.textContent('body');
    const ipData = JSON.parse(response);

    console.log('Proxy working. Current IP:', ipData.origin);

    await browser.close();
    return true;
  } catch (error) {
    console.error('Proxy test failed:', error.message);
    return false;
  }
}

// Usage
const proxyConfig = {
  server: 'http://proxy-server.com:8080',
  username: 'your-username',
  password: 'your-password'
};

await testProxy(proxyConfig);
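To vet an entire proxy pool before a scraping run, the check above can be applied to each candidate. The helper below is a hypothetical sketch with the check function injected, so it can wrap testProxy() from above or any stub:

```javascript
// Hypothetical helper: filter a proxy list down to the entries that pass a
// health check. The check function is injected so this stays independent of
// Playwright; in practice you would pass testProxy.
async function filterWorkingProxies(proxies, check) {
  const results = await Promise.all(proxies.map(p => check(p)));
  return proxies.filter((_, i) => results[i]);
}

// Example with a stubbed check that only accepts proxy1:
const candidates = [
  { server: 'http://proxy1.com:8080' },
  { server: 'http://proxy2.com:8080' },
];
filterWorkingProxies(candidates, async p => p.server.includes('proxy1'))
  .then(working => console.log(working.map(p => p.server))); // [ 'http://proxy1.com:8080' ]
```

Because the checks run in parallel via Promise.all, a large pool is vetted in roughly the time of the slowest single check.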

Environment-Based Proxy Configuration

For production applications, it's best practice to store proxy configuration in environment variables:

JavaScript:

const proxyConfig = {
  server: process.env.PROXY_SERVER,
  username: process.env.PROXY_USERNAME,
  password: process.env.PROXY_PASSWORD
};

const browser = await chromium.launch({
  proxy: proxyConfig.server ? proxyConfig : undefined
});
Python:

import os

proxy_config = {
    'server': os.getenv('PROXY_SERVER'),
    'username': os.getenv('PROXY_USERNAME'),
    'password': os.getenv('PROXY_PASSWORD')
}

# Only use proxy if server is configured
proxy_settings = proxy_config if proxy_config['server'] else None

browser = p.chromium.launch(proxy=proxy_settings)

Troubleshooting Common Issues

Connection Timeouts

If you're experiencing connection timeouts, increase the timeout values:

const page = await context.newPage();
await page.goto('https://example.com', { 
  timeout: 30000,  // 30 seconds
  waitUntil: 'networkidle'
});

Proxy Authentication Errors

For authentication issues, verify your credentials first. Playwright's documented mechanism is the separate username and password fields; some setups also accept credentials embedded in the server URL, although support for that form varies by browser:

// Try different authentication approaches
const configs = [
  {
    server: 'http://proxy-server.com:8080',
    username: 'user',
    password: 'pass'
  },
  {
    server: 'http://user:pass@proxy-server.com:8080'
  }
];
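If your provider only hands out a single URL with embedded credentials, it can help to split it into the separate fields Playwright documents. A small sketch using Node's built-in URL class (the helper name is hypothetical):

```javascript
// Hypothetical helper: turn a proxy URL, possibly with embedded credentials,
// into the { server, username, password } shape Playwright expects.
function proxyFromUrl(proxyUrl) {
  const u = new URL(proxyUrl);
  const config = { server: `${u.protocol}//${u.host}` };
  if (u.username) config.username = decodeURIComponent(u.username);
  if (u.password) config.password = decodeURIComponent(u.password);
  return config;
}

console.log(proxyFromUrl('http://user:pass@proxy-server.com:8080'));
// { server: 'http://proxy-server.com:8080', username: 'user', password: 'pass' }
```

decodeURIComponent matters here because the URL class percent-encodes special characters in credentials, which proxies expect in their raw form.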

SSL Certificate Issues

For HTTPS proxies with SSL issues, you might need to ignore SSL errors:

const context = await browser.newContext({
  proxy: {
    server: 'https://proxy-server.com:8080'
  },
  ignoreHTTPSErrors: true
});

Best Practices

  1. Test Proxy Configuration: Always test your proxy setup before running production scraping tasks
  2. Handle Failures Gracefully: Implement retry logic and fallback mechanisms
  3. Monitor Proxy Performance: Track response times and success rates
  4. Rotate Proxies: Use multiple proxies to distribute load and avoid rate limiting
  5. Secure Credentials: Store proxy credentials securely using environment variables or secret management systems
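Practices 2 and 4 can be combined into a simple retry-with-fallback wrapper. This is a hypothetical sketch with the scraping task injected as a function, so it stays independent of Playwright; in practice the task would launch a browser with the given proxy and scrape:

```javascript
// Hypothetical helper: run a task, falling back to the next proxy in the
// pool each time it fails, up to maxAttempts tries.
async function withProxyFallback(proxies, task, maxAttempts = proxies.length) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const proxy = proxies[attempt % proxies.length];
    try {
      return await task(proxy);
    } catch (error) {
      lastError = error;
      console.warn(`Attempt ${attempt + 1} via ${proxy.server} failed: ${error.message}`);
    }
  }
  throw lastError;
}

// Usage with a stubbed task where the first proxy is "down":
const pool = [{ server: 'http://proxy1.com:8080' }, { server: 'http://proxy2.com:8080' }];
withProxyFallback(pool, async proxy => {
  if (proxy.server.includes('proxy1')) throw new Error('connection refused');
  return `ok via ${proxy.server}`;
}).then(result => console.log(result)); // ok via http://proxy2.com:8080
```

Rethrowing the last error after the loop means callers still see a failure when every proxy in the pool is down.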

Integration with Web Scraping Workflows

When building robust web scraping solutions, proper proxy configuration is crucial for avoiding detection and maintaining consistent access to target websites. Similar to how you might handle browser sessions in Puppeteer, managing proxy connections requires careful planning and error handling.

For complex scraping scenarios involving multiple pages or extensive data extraction, consider implementing proxy rotation strategies alongside other anti-detection measures. This approach works particularly well when monitoring network requests in Puppeteer to understand traffic patterns and optimize your scraping strategy.

By properly configuring proxies in Playwright, you can create more resilient web scraping applications that can handle various network conditions and access restrictions while maintaining the reliability and performance your projects require.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
