How can I intercept and modify network requests in Playwright?

Playwright provides powerful network interception capabilities that allow you to monitor, modify, and mock HTTP requests and responses. This functionality is essential for testing scenarios, debugging web applications, and controlling network behavior during web scraping operations.

Understanding Network Interception in Playwright

Network interception in Playwright works by registering route handlers for requests whose URLs match a pattern. Each handler decides what happens to a matched request: continue it unchanged, modify it before it is sent, fulfill it with custom data, or abort it. Passive monitoring is also available through page events, without a route handler.

Basic Request Interception

JavaScript/TypeScript Implementation

const { chromium } = require('playwright');

async function interceptBasicRequests() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Intercept all requests
  await page.route('**/*', (route) => {
    console.log(`Intercepted: ${route.request().method()} ${route.request().url()}`);
    // Continue with the original request
    route.continue();
  });

  await page.goto('https://example.com');
  await browser.close();
}

interceptBasicRequests();

Python Implementation

import asyncio
from playwright.async_api import async_playwright

async def intercept_basic_requests():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        # Intercept all requests
        async def handle_route(route):
            print(f"Intercepted: {route.request.method} {route.request.url}")
            await route.continue_()

        await page.route("**/*", handle_route)
        await page.goto("https://example.com")
        await browser.close()

asyncio.run(intercept_basic_requests())

Modifying Request Headers and Data

You can modify requests before they're sent to the server by changing headers, URL parameters, or request body:

JavaScript Example

async function modifyRequestHeaders() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.route('**/*', (route) => {
    const request = route.request();

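    // Modify headers. Playwright reports header names in lower case,
    // so lower-case keys replace the originals instead of duplicating them
    const headers = {
      ...request.headers(),
      'user-agent': 'Custom-Bot/1.0',
      'x-custom-header': 'Modified-Request'
    };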

    // Continue with modified headers
    route.continue({
      headers: headers
    });
  });

  await page.goto('https://httpbin.org/headers');
  await browser.close();
}

Python Example

async def modify_request_headers():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        async def handle_route(route):
            # Copy the original headers (Playwright lower-cases the names)
            headers = dict(route.request.headers)

            # Modify headers using lower-case names so they replace the originals
            headers['user-agent'] = 'Custom-Bot/1.0'
            headers['x-custom-header'] = 'Modified-Request'

            await route.continue_(headers=headers)

        await page.route("**/*", handle_route)
        await page.goto("https://httpbin.org/headers")
        await browser.close()
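
Playwright reports request header names in lower case, so an override keyed as `User-Agent` would sit next to an existing `user-agent` entry rather than replace it. The sketch below shows one way to normalize names before calling `route.continue_()`. `merge_headers` is a hypothetical helper introduced for illustration, not part of Playwright.

```python
def merge_headers(original, overrides):
    """Return a copy of `original` with `overrides` applied, all names lower-cased."""
    merged = {name.lower(): value for name, value in original.items()}
    for name, value in overrides.items():
        merged[name.lower()] = value
    return merged


async def handle_route(route):
    # Register with: await page.route("**/*", handle_route)
    headers = merge_headers(
        route.request.headers,
        {"User-Agent": "Custom-Bot/1.0", "X-Custom-Header": "Modified-Request"},
    )
    await route.continue_(headers=headers)
```

Because the helper is a plain function, it can be checked without launching a browser.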

Intercepting Specific Request Types

Intercepting API Calls

async function interceptAPIRequests() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Intercept only API requests
  await page.route('**/api/**', (route) => {
    const request = route.request();
    console.log(`API Request: ${request.method()} ${request.url()}`);

    // Log request body for POST requests
    if (request.method() === 'POST') {
      console.log('Request body:', request.postData());
    }

    route.continue();
  });

  await page.goto('https://example.com');
  await browser.close();
}

Intercepting Image Requests

async function blockImageRequests() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Block all image requests to speed up loading
  await page.route('**/*.{png,jpg,jpeg,gif,svg,webp}', (route) => {
    console.log(`Blocked image: ${route.request().url()}`);
    route.abort();
  });

  await page.goto('https://example.com');
  await browser.close();
}
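
Extension-based globs can miss images served from URLs without a file extension or with a query string after it. A possible alternative is to decide by the request's resource type, which the Python API exposes as `request.resource_type`. The `BLOCKED_TYPES` set and the helper names below are illustrative assumptions.

```python
BLOCKED_TYPES = {"image", "media", "font"}  # illustrative choice, adjust as needed


def should_block(resource_type, blocked=BLOCKED_TYPES):
    """Return True when a request of this resource type should be aborted."""
    return resource_type in blocked


async def block_heavy_resources(route):
    # Register with: await page.route("**/*", block_heavy_resources)
    if should_block(route.request.resource_type):
        await route.abort()
    else:
        await route.continue_()
```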

Mocking Network Responses

Static Response Mocking

async function mockStaticResponse() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Mock API response
  await page.route('**/api/users', (route) => {
    const mockData = {
      users: [
        { id: 1, name: 'John Doe', email: 'john@example.com' },
        { id: 2, name: 'Jane Smith', email: 'jane@example.com' }
      ]
    };

    route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(mockData)
    });
  });

  await page.goto('https://example.com');
  await browser.close();
}
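
The same static mock can be sketched in Python. The helper name `json_mock` is an assumption made for this example. It only builds the keyword arguments (`status`, `content_type`, `body`) that `route.fulfill()` accepts in the Python API.

```python
import json


def json_mock(payload, status=200):
    """Build keyword arguments for route.fulfill() describing a JSON response."""
    return {
        "status": status,
        "content_type": "application/json",
        "body": json.dumps(payload),
    }


async def mock_users(route):
    # Register with: await page.route("**/api/users", mock_users)
    users = {"users": [{"id": 1, "name": "John Doe", "email": "john@example.com"}]}
    await route.fulfill(**json_mock(users))
```

Keeping the payload construction separate from the handler makes it easy to reuse one helper for several mocked endpoints and status codes.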

Dynamic Response Modification

async def modify_response_data():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        async def handle_route(route):
            # Get the original response
            response = await route.fetch()

            if 'application/json' in response.headers.get('content-type', ''):
                # Modify JSON response
                original_data = await response.json()
                original_data['modified'] = True
                original_data['timestamp'] = '2024-01-01T00:00:00Z'

                await route.fulfill(
                    response=response,
                    json=original_data
                )
            else:
                # Continue with original response
                await route.fulfill(response=response)

        await page.route("**/api/**", handle_route)
        await page.goto("https://example.com")
        await browser.close()
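
The handler above assumes the JSON body is an object. An API may also return an array or a scalar, where assigning keys would fail. A minimal sketch of a guard follows. `add_marker_fields` is a hypothetical helper name.

```python
def add_marker_fields(data, extra):
    """Merge `extra` into a JSON object; return any other JSON value unchanged."""
    if isinstance(data, dict):
        return {**data, **extra}
    return data
```

Inside the route handler, the result could then be passed as `json=add_marker_fields(original_data, {"modified": True})` when calling `route.fulfill()`.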

Advanced Network Interception Patterns

Conditional Request Modification

async function conditionalInterception() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.route('**/*', (route) => {
    const request = route.request();
    const url = request.url();

    // Different handling based on URL patterns
    if (url.includes('/slow-endpoint')) {
      // Speed up slow endpoints with cached response
      route.fulfill({
        status: 200,
        contentType: 'application/json',
        body: JSON.stringify({ cached: true, data: 'fast response' })
      });
    } else if (url.includes('/analytics')) {
      // Block analytics requests
      route.abort();
    } else if (request.method() === 'POST' && url.includes('/form')) {
      // Log form submissions
      console.log('Form data:', request.postData());
      route.continue();
    } else {
      // Default behavior
      route.continue();
    }
  });

  await page.goto('https://example.com');
  await browser.close();
}
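
As the number of conditions grows, the if/else chain can be replaced by a small rule table. The following Python sketch mirrors the branches above. The `RULES` list, the action names, and `choose_action` are assumptions made for illustration.

```python
RULES = [
    # (predicate on (method, url), action name); first match wins
    (lambda method, url: "/slow-endpoint" in url, "mock"),
    (lambda method, url: "/analytics" in url, "abort"),
    (lambda method, url: method == "POST" and "/form" in url, "log"),
]


def choose_action(method, url, rules=RULES):
    """Return the action of the first matching rule, or "continue"."""
    for matches, action in rules:
        if matches(method, url):
            return action
    return "continue"


async def handle_route(route):
    # Register with: await page.route("**/*", handle_route)
    request = route.request
    action = choose_action(request.method, request.url)
    if action == "mock":
        await route.fulfill(json={"cached": True, "data": "fast response"})
    elif action == "abort":
        await route.abort()
    else:
        if action == "log":
            print("Form data:", request.post_data)
        await route.continue_()
```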

Request Timing and Performance Analysis

import time

async def analyze_request_performance():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        request_timings = {}

        async def handle_route(route):
            start_time = time.perf_counter()
            request_url = route.request.url

            # Fetch the response inside the handler so the elapsed time
            # covers the network round trip; route.continue_() alone
            # returns before the response arrives
            response = await route.fetch()
            request_timings[request_url] = time.perf_counter() - start_time

            # Hand the fetched response back to the page unchanged
            await route.fulfill(response=response)

        await page.route("**/*", handle_route)
        await page.goto("https://example.com")

        # Print timing analysis
        for url, timing in request_timings.items():
            print(f"{url}: {timing:.2f}s")

        await browser.close()
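
Timing can also be read without a route handler from `request.timing`, which Playwright fills in once a request has finished. The sketch below assumes the documented dictionary shape: keys such as `requestStart`, `responseStart`, and `responseEnd`, in milliseconds relative to the start of the request, with -1 when a value is unavailable. `summarize_timing` is a hypothetical helper.

```python
def summarize_timing(timing):
    """Return (time_to_first_byte_ms, total_ms); each is None when unavailable."""
    def known(value):
        return value if value is not None and value >= 0 else None

    request_start = known(timing.get("requestStart"))
    response_start = known(timing.get("responseStart"))
    response_end = known(timing.get("responseEnd"))

    ttfb = None
    if request_start is not None and response_start is not None:
        ttfb = response_start - request_start
    return ttfb, response_end


def on_request_finished(request):
    # Register with: page.on("requestfinished", on_request_finished)
    ttfb, total = summarize_timing(request.timing)
    print(f"{request.url}: ttfb={ttfb} ms, total={total} ms")
```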

Working with Request and Response Events

Monitoring All Network Activity

async function monitorNetworkActivity() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  // Listen to all requests
  page.on('request', (request) => {
    console.log(`→ ${request.method()} ${request.url()}`);
  });

  // Listen to all responses
  page.on('response', (response) => {
    console.log(`← ${response.status()} ${response.url()}`);
  });

  // Listen to failed requests
  page.on('requestfailed', (request) => {
    console.log(`✗ Failed: ${request.url()} - ${request.failure().errorText}`);
  });

  await page.goto('https://example.com');
  await browser.close();
}
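
A Python counterpart might collect the events in a list instead of printing them, which is convenient for assertions in tests. `NetworkLog` is a hypothetical class for this sketch. It relies on the Python API exposing `request.method`, `request.url`, `response.status`, and `request.failure` (the error text, or `None`) as properties.

```python
class NetworkLog:
    """Collects (kind, detail) tuples from Playwright page events."""

    def __init__(self):
        self.entries = []

    def on_request(self, request):
        self.entries.append(("request", f"{request.method} {request.url}"))

    def on_response(self, response):
        self.entries.append(("response", f"{response.status} {response.url}"))

    def on_request_failed(self, request):
        self.entries.append(("failed", f"{request.url} - {request.failure}"))

    def attach(self, page):
        # Subscribe to the same three events used in the JavaScript example
        page.on("request", self.on_request)
        page.on("response", self.on_response)
        page.on("requestfailed", self.on_request_failed)
```

Calling `log.attach(page)` before `page.goto(...)` would record the traffic of that navigation in `log.entries`.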

Intercepting and Modifying POST Requests

async function interceptPostRequests() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.route('**/api/submit', (route) => {
    const request = route.request();

    // Only rewrite POST requests that actually carry a body
    if (request.method() === 'POST' && request.postData() !== null) {
      // Get original POST data
      const postData = request.postData();
      let modifiedData;

      try {
        // Parse and modify JSON data
        const originalData = JSON.parse(postData);
        modifiedData = {
          ...originalData,
          timestamp: new Date().toISOString(),
          modified: true
        };
      } catch (e) {
        // Handle non-JSON data
        modifiedData = postData + '&modified=true';
      }

      // Continue with modified data
      route.continue({
        postData: typeof modifiedData === 'object' 
          ? JSON.stringify(modifiedData) 
          : modifiedData
      });
    } else {
      route.continue();
    }
  });

  await page.goto('https://example.com');
  await browser.close();
}
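
The body rewriting step can be isolated into a pure function, which makes the edge cases (no body, JSON arrays, form-encoded data) easier to reason about. The Python sketch below is one possible approach. `mark_post_body` is a hypothetical helper, and treating every non-JSON body as form-encoded is an assumption.

```python
import json
from urllib.parse import parse_qsl, urlencode


def mark_post_body(post_data):
    """Return a modified body, or None when there is no body to modify."""
    if not post_data:
        return None
    try:
        parsed = json.loads(post_data)
    except ValueError:
        # Assume anything that is not JSON is form-encoded key/value pairs
        pairs = parse_qsl(post_data, keep_blank_values=True)
        return urlencode(pairs + [("modified", "true")])
    if isinstance(parsed, dict):
        return json.dumps({**parsed, "modified": True})
    return post_data  # JSON arrays and scalars are passed through unchanged


async def handle_route(route):
    # Register with: await page.route("**/api/submit", handle_route)
    body = mark_post_body(route.request.post_data)
    if body is None:
        await route.continue_()
    else:
        await route.continue_(post_data=body)
```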

Best Practices for Network Interception

1. Efficient Route Patterns

Use specific route patterns to avoid unnecessary interception:

// Good - specific patterns
await page.route('**/api/v1/**', handler);
await page.route('**/*.{js,css}', handler);

// Avoid - overly broad patterns that match everything
await page.route('**/*', handler);
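
When a glob is not precise enough, `page.route()` also accepts a predicate that receives the URL. The Python sketch below matches on the parsed path only, so a query string that happens to contain `/api/v1/` does not trigger the handler. `is_v1_api` is a hypothetical name.

```python
from urllib.parse import urlparse


def is_v1_api(url):
    """Return True for URLs whose path starts with /api/v1/."""
    return urlparse(url).path.startswith("/api/v1/")

# Register with: await page.route(is_v1_api, handler)
```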

2. Proper Error Handling

async function robustInterception() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.route('**/*', async (route) => {
    try {
      const request = route.request();

      // Your interception logic here
      await route.continue();
    } catch (error) {
      console.error('Route handler error:', error);
      // Fallback to continue the request
      try {
        await route.continue();
      } catch (fallbackError) {
        console.error('Fallback error:', fallbackError);
      }
    }
  });

  await page.goto('https://example.com');
  await browser.close();
}
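
In Python, the same try/fallback structure can be written once as a decorator and reused across handlers. This is a minimal sketch. `with_fallback` is a hypothetical helper, and swallowing the fallback error is a design choice that may not suit every test suite.

```python
import functools


def with_fallback(handler):
    """Wrap a route handler so an exception falls back to continuing the request."""
    @functools.wraps(handler)
    async def wrapper(route):
        try:
            await handler(route)
        except Exception as error:
            print(f"Route handler error: {error}")
            try:
                await route.continue_()
            except Exception as fallback_error:
                # The route may already have been handled before the failure
                print(f"Fallback error: {fallback_error}")
    return wrapper
```

A handler decorated with `@with_fallback` can then be passed to `page.route()` like any other handler.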

3. Cleanup and Resource Management

async def proper_cleanup():
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()

        try:
            async def handle_route(route):
                await route.continue_()

            await page.route("**/*", handle_route)
            await page.goto("https://example.com")

            # Your scraping logic here

        finally:
            # Ensure cleanup
            await page.unroute("**/*")
            await browser.close()

Common Use Cases

Testing API Integration

Network interception is particularly useful for testing applications that depend on external APIs. You can simulate different API responses, network failures, and latency scenarios without relying on actual external services.
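
As a rough illustration of these scenarios, the Python sketch below builds handlers for an outage, a server error, and added latency. `make_fault_handler` and its mode names are assumptions for this example. `"connectionrefused"` is one of the error codes that `route.abort()` accepts.

```python
import asyncio


def make_fault_handler(mode, delay_seconds=0.0):
    """Build a route handler that simulates one failure mode.

    mode: "outage" aborts the request, "server_error" returns HTTP 503,
    "slow" waits delay_seconds and then lets the request through.
    """
    async def handler(route):
        if mode == "outage":
            await route.abort("connectionrefused")
        elif mode == "server_error":
            await route.fulfill(status=503, body="Service Unavailable")
        elif mode == "slow":
            await asyncio.sleep(delay_seconds)
            await route.continue_()
        else:
            raise ValueError(f"unknown mode: {mode}")
    return handler
```

A test could register one of these with `await page.route("**/api/**", make_fault_handler("slow", 2.0))` and then assert how the page behaves.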

Performance Optimization

By intercepting and blocking unnecessary requests (like analytics, ads, or large images), you can significantly improve page load times for web scraping operations, similar to techniques used when monitoring network requests in Puppeteer.

Data Collection and Analysis

Intercepting network requests allows you to collect detailed information about API calls, form submissions, and other network activity, which can be valuable for understanding application behavior and data flow.

Debugging Network Issues

Request/Response Logging

async function debugNetworkIssues() {
  const browser = await chromium.launch();
  const page = await browser.newPage();

  await page.route('**/*', (route) => {
    const request = route.request();

    console.log(`Request: ${request.method()} ${request.url()}`);
    console.log(`Headers:`, request.headers());

    if (request.postData()) {
      console.log(`Body:`, request.postData());
    }

    route.continue();
  });

  page.on('response', (response) => {
    console.log(`Response: ${response.status()} ${response.url()}`);
    console.log(`Headers:`, response.headers());
  });

  await page.goto('https://example.com');
  await browser.close();
}

Conclusion

Playwright's network interception capabilities provide a powerful toolkit for controlling and monitoring HTTP traffic. Whether you're testing web applications, optimizing scraping performance, or debugging network issues, these techniques enable fine-grained control over network behavior.

The key to effective network interception is understanding your specific use case and applying the appropriate level of interception - from simple monitoring to complete request mocking. Always remember to handle errors gracefully and clean up resources properly to maintain robust automation scripts.

For more advanced scenarios involving request handling and authentication, consider exploring authentication handling techniques in Puppeteer, which share similar concepts across browser automation tools.
