How do I use Selenium with MCP servers for web scraping?

Selenium is one of the most popular browser automation frameworks, and when combined with Model Context Protocol (MCP) servers, it becomes a powerful tool for web scraping workflows. This guide will show you how to integrate Selenium with MCP servers to create robust, scalable web scraping solutions.

Understanding Selenium and MCP Integration

The Model Context Protocol (MCP) is an open protocol that standardizes how applications provide context to AI assistants. While MCP servers traditionally focus on Playwright and Puppeteer integrations, you can create custom MCP servers that leverage Selenium's capabilities for browser automation and web scraping.

Selenium offers several advantages for web scraping:

  • Cross-browser support (Chrome, Firefox, Safari, Edge)
  • Mature ecosystem with extensive documentation
  • Support for multiple programming languages (Python, Java, JavaScript, C#)
  • Robust handling of dynamic content and JavaScript-heavy websites
  • Built-in wait mechanisms and element interactions

Setting Up Selenium with MCP Servers

Prerequisites

Before integrating Selenium with MCP servers, ensure you have:

  1. Selenium WebDriver installed for your preferred programming language
  2. Browser drivers (ChromeDriver, GeckoDriver, etc.)
  3. MCP SDK for building custom servers
  4. Python or Node.js runtime environment

Installation

For Python:

pip install selenium
pip install mcp
pip install webdriver-manager

For JavaScript/Node.js:

npm install selenium-webdriver
npm install @modelcontextprotocol/sdk
npm install webdriver-manager

Installing Browser Drivers

Use WebDriver Manager to automatically handle browser drivers. Note that Selenium 4.6+ also bundles Selenium Manager, which resolves a matching driver automatically when none is specified, so this step is optional on recent versions:

Python:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service

service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
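
The same pattern works for other browsers; here's a brief sketch for Firefox using GeckoDriverManager:

from selenium import webdriver
from selenium.webdriver.firefox.service import Service
from webdriver_manager.firefox import GeckoDriverManager

# GeckoDriverManager downloads and caches a matching geckodriver binary
service = Service(GeckoDriverManager().install())
driver = webdriver.Firefox(service=service)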

JavaScript:

const {Builder} = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

const driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(new chrome.Options())
    .build();

Building a Custom MCP Server with Selenium

Since there isn't a native Selenium MCP server like there is for Playwright, you'll need to create a custom MCP server that exposes Selenium's capabilities as MCP tools.

Python MCP Server Example

Here's an example of building an MCP server with Selenium in Python. It uses the FastMCP interface from the official MCP Python SDK; check your installed SDK version, as the API has evolved:

from mcp.server.fastmcp import FastMCP
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service

class SeleniumMCPServer:
    def __init__(self):
        self.server = FastMCP("selenium-scraper")
        self.driver = None
        self.setup_tools()

    def setup_tools(self):
        @self.server.tool()
        async def navigate(url: str) -> dict:
            """Navigate to a URL"""
            if not self.driver:
                self.init_driver()

            self.driver.get(url)
            return {
                "success": True,
                "url": self.driver.current_url,
                "title": self.driver.title
            }

        @self.server.tool()
        async def get_element_text(selector: str, by: str = "css") -> dict:
            """Extract text from an element using CSS selector or XPath"""
            if not self.driver:
                return {"error": "Browser not initialized"}

            try:
                by_type = By.CSS_SELECTOR if by == "css" else By.XPATH
                element = WebDriverWait(self.driver, 10).until(
                    EC.presence_of_element_located((by_type, selector))
                )
                return {
                    "success": True,
                    "text": element.text,
                    "tag": element.tag_name
                }
            except Exception as e:
                return {"error": str(e)}

        @self.server.tool()
        async def get_elements(selector: str, by: str = "css") -> dict:
            """Extract multiple elements"""
            if not self.driver:
                return {"error": "Browser not initialized"}

            try:
                by_type = By.CSS_SELECTOR if by == "css" else By.XPATH
                elements = self.driver.find_elements(by_type, selector)

                results = []
                for elem in elements:
                    results.append({
                        "text": elem.text,
                        "tag": elem.tag_name,
                        "attributes": {
                            "class": elem.get_attribute("class"),
                            "id": elem.get_attribute("id")
                        }
                    })

                return {
                    "success": True,
                    "count": len(results),
                    "elements": results
                }
            except Exception as e:
                return {"error": str(e)}

        @self.server.tool()
        async def click_element(selector: str, by: str = "css") -> dict:
            """Click an element"""
            if not self.driver:
                return {"error": "Browser not initialized"}

            try:
                by_type = By.CSS_SELECTOR if by == "css" else By.XPATH
                element = WebDriverWait(self.driver, 10).until(
                    EC.element_to_be_clickable((by_type, selector))
                )
                element.click()
                return {"success": True}
            except Exception as e:
                return {"error": str(e)}

        @self.server.tool()
        async def get_page_source() -> dict:
            """Get the complete page HTML"""
            if not self.driver:
                return {"error": "Browser not initialized"}

            return {
                "success": True,
                "html": self.driver.page_source,
                "url": self.driver.current_url
            }

        @self.server.tool()
        async def screenshot(filename: str = "screenshot.png") -> dict:
            """Take a screenshot of the current page"""
            if not self.driver:
                return {"error": "Browser not initialized"}

            try:
                self.driver.save_screenshot(filename)
                return {"success": True, "filename": filename}
            except Exception as e:
                return {"error": str(e)}

        @self.server.tool()
        async def close_browser() -> dict:
            """Close the browser and clean up"""
            if self.driver:
                self.driver.quit()
                self.driver = None
                return {"success": True}
            return {"error": "No browser to close"}

    def init_driver(self, headless: bool = True):
        """Initialize the Selenium WebDriver"""
        chrome_options = Options()
        if headless:
            chrome_options.add_argument("--headless")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")

        service = Service(ChromeDriverManager().install())
        self.driver = webdriver.Chrome(service=service, options=chrome_options)

    def run(self):
        """Run the MCP server over stdio"""
        self.server.run()

if __name__ == "__main__":
    server = SeleniumMCPServer()
    server.run()
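
Once the server runs over stdio, register it with an MCP client. For example, a Claude Desktop claude_desktop_config.json entry might look like this (the script path is a placeholder for wherever you saved the server):

{
  "mcpServers": {
    "selenium-scraper": {
      "command": "python",
      "args": ["/path/to/selenium_mcp_server.py"]
    }
  }
}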

JavaScript/Node.js MCP Server Example

Here's a sketch of a similar server in JavaScript. Treat the addTool calls below as illustrative: recent versions of @modelcontextprotocol/sdk register tools through McpServer with schema-validated handlers, so check the SDK docs for the exact API:

const { Server } = require('@modelcontextprotocol/sdk');
const { Builder, By, until } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

class SeleniumMCPServer {
    constructor() {
        this.server = new Server({ name: 'selenium-scraper' });
        this.driver = null;
        this.setupTools();
    }

    async initDriver(headless = true) {
        const options = new chrome.Options();
        if (headless) {
            options.addArguments('--headless');
        }
        options.addArguments('--no-sandbox');
        options.addArguments('--disable-dev-shm-usage');

        this.driver = await new Builder()
            .forBrowser('chrome')
            .setChromeOptions(options)
            .build();
    }

    setupTools() {
        this.server.addTool({
            name: 'navigate',
            description: 'Navigate to a URL',
            parameters: {
                url: { type: 'string', required: true }
            },
            execute: async ({ url }) => {
                if (!this.driver) {
                    await this.initDriver();
                }

                await this.driver.get(url);
                const title = await this.driver.getTitle();
                const currentUrl = await this.driver.getCurrentUrl();

                return {
                    success: true,
                    url: currentUrl,
                    title: title
                };
            }
        });

        this.server.addTool({
            name: 'getElementText',
            description: 'Extract text from an element',
            parameters: {
                selector: { type: 'string', required: true },
                by: { type: 'string', default: 'css' }
            },
            execute: async ({ selector, by }) => {
                if (!this.driver) {
                    return { error: 'Browser not initialized' };
                }

                try {
                    const byType = by === 'css' ? By.css : By.xpath;
                    const element = await this.driver.wait(
                        until.elementLocated(byType(selector)),
                        10000
                    );
                    const text = await element.getText();
                    const tag = await element.getTagName();

                    return {
                        success: true,
                        text: text,
                        tag: tag
                    };
                } catch (error) {
                    return { error: error.message };
                }
            }
        });

        this.server.addTool({
            name: 'getPageSource',
            description: 'Get the complete page HTML',
            parameters: {},
            execute: async () => {
                if (!this.driver) {
                    return { error: 'Browser not initialized' };
                }

                const html = await this.driver.getPageSource();
                const url = await this.driver.getCurrentUrl();

                return {
                    success: true,
                    html: html,
                    url: url
                };
            }
        });

        this.server.addTool({
            name: 'closeBrowser',
            description: 'Close the browser',
            parameters: {},
            execute: async () => {
                if (this.driver) {
                    await this.driver.quit();
                    this.driver = null;
                    return { success: true };
                }
                return { error: 'No browser to close' };
            }
        });
    }

    async run() {
        await this.server.start();
    }
}

const server = new SeleniumMCPServer();
server.run().catch(console.error);

Practical Web Scraping Examples

Example 1: Scraping E-commerce Product Data

The examples below call the MCP tools defined earlier (navigate, get_elements, and so on) as if they were local async functions. In practice an MCP client or AI assistant issues these tool calls; the Python shown here is a sketch of the call sequence.

async def scrape_products():
    # Navigate to product listing page
    await navigate(url="https://example.com/products")

    # Wait for products to load and extract them
    products = await get_elements(selector=".product-card", by="css")

    product_data = []
    for i in range(products['count']):
        # Extract product details
        title_elem = await get_element_text(
            selector=f".product-card:nth-child({i+1}) .product-title",
            by="css"
        )
        price_elem = await get_element_text(
            selector=f".product-card:nth-child({i+1}) .product-price",
            by="css"
        )

        product_data.append({
            "title": title_elem['text'],
            "price": price_elem['text']
        })

    await close_browser()
    return product_data

Example 2: Handling Dynamic Content

Similar to how you would handle AJAX requests using Puppeteer, Selenium provides robust waiting mechanisms:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for dynamic content to load
wait = WebDriverWait(driver, 10)
element = wait.until(
    EC.presence_of_element_located((By.CLASS_NAME, "dynamic-content"))
)

# Wait for AJAX call to complete
wait.until(
    EC.invisibility_of_element_located((By.CLASS_NAME, "loading-spinner"))
)
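
You can also wait on arbitrary JavaScript conditions. A common pattern, sketched here, is polling document.readyState:

# Wait until the browser reports the document is fully loaded
WebDriverWait(driver, 10).until(
    lambda d: d.execute_script("return document.readyState") == "complete"
)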

Example 3: Handling Pagination

async def scrape_paginated_content():
    all_data = []
    page = 1

    while True:
        # Navigate to page
        await navigate(url=f"https://example.com/data?page={page}")

        # Extract data
        items = await get_elements(selector=".data-item", by="css")
        all_data.extend(items['elements'])

        # Check if a next-page link exists (on the last page this call
        # waits out its 10-second timeout before returning an error)
        next_button = await get_element_text(
            selector=".next-page",
            by="css"
        )

        if 'error' in next_button:
            break  # No more pages

        page += 1

    return all_data
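
When page numbers aren't exposed in the URL, you can click the next-page button instead. A sketch reusing the hypothetical tools defined earlier:

async def scrape_by_clicking():
    all_data = []
    await navigate(url="https://example.com/data")

    while True:
        items = await get_elements(selector=".data-item", by="css")
        all_data.extend(items['elements'])

        # click_element returns an error dict once the button is gone
        result = await click_element(selector=".next-page", by="css")
        if 'error' in result:
            break

    await close_browser()
    return all_data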

Best Practices for Selenium with MCP Servers

1. Resource Management

Always properly close browser instances to prevent memory leaks:

try:
    # Your scraping code
    await navigate(url="https://example.com")
    data = await get_page_source()
finally:
    await close_browser()

2. Error Handling

Implement robust error handling for network issues and element not found errors:

from selenium.common.exceptions import TimeoutException, NoSuchElementException

try:
    element = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "myElement"))
    )
except TimeoutException:
    print("Element not found within timeout period")
except NoSuchElementException:
    print("Element does not exist on the page")

3. Headless Mode

Run browsers in headless mode for better performance:

chrome_options = Options()
chrome_options.add_argument("--headless=new")  # plain "--headless" on Chrome < 109
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--window-size=1920,1080")

4. Wait Strategies

Use explicit waits instead of implicit waits for better control, similar to using the waitFor function in Puppeteer:

# Explicit wait (recommended)
wait = WebDriverWait(driver, 10)
element = wait.until(EC.element_to_be_clickable((By.ID, "submit")))

# Avoid implicit waits globally
# driver.implicitly_wait(10)  # Not recommended

5. Handle Authentication

For sites requiring authentication, drive the login flow with WebDriver before scraping, much as you would when handling authentication in Puppeteer:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def login(driver, username, password):
    driver.get("https://example.com/login")

    # Fill in the credentials
    driver.find_element(By.ID, "username").send_keys(username)
    driver.find_element(By.ID, "password").send_keys(password)

    # Submit the form
    driver.find_element(By.CSS_SELECTOR, "#login-button").click()

    # Wait for the post-login redirect
    WebDriverWait(driver, 10).until(EC.url_contains("dashboard"))

Performance Optimization

1. Disable Unnecessary Features

chrome_options = Options()
# Skip image downloads to save bandwidth (Chrome has no "--disable-images" switch)
chrome_options.add_argument("--blink-settings=imagesEnabled=false")
chrome_options.add_argument("--disable-extensions")

2. Use Page Load Strategies

chrome_options = Options()
# 'normal' waits for the full load event, 'eager' returns at
# DOMContentLoaded, 'none' returns as soon as the page starts loading
chrome_options.page_load_strategy = 'eager'

3. Parallel Scraping

Run multiple browser instances for concurrent scraping:

from concurrent.futures import ThreadPoolExecutor

def scrape_url(url):
    # Must be a regular function: ThreadPoolExecutor does not await coroutines.
    # Each thread gets its own browser instance.
    driver = webdriver.Chrome(options=chrome_options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()

urls = ["https://example.com/page1", "https://example.com/page2"]
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(scrape_url, urls))

Debugging and Troubleshooting

Enable Logging

import logging

logging.basicConfig(level=logging.DEBUG)
selenium_logger = logging.getLogger('selenium')
selenium_logger.setLevel(logging.DEBUG)

Take Screenshots on Errors

try:
    element = driver.find_element(By.ID, "myElement")
except Exception:
    driver.save_screenshot("error_screenshot.png")
    raise  # re-raise with the original traceback

Check Console Logs

# Browser console logs are available from Chromium-based drivers
logs = driver.get_log('browser')
for entry in logs:
    print(entry['level'], entry['message'])

Comparison: Selenium vs. Playwright with MCP

While Selenium can be integrated with MCP servers through custom implementations, Playwright has native MCP server support. Here's when to choose each:

Choose Selenium when:

  • You need cross-browser testing (Safari, older browsers)
  • Your team is already familiar with Selenium
  • You have existing Selenium infrastructure
  • You need Java or C# support

Choose Playwright when:

  • You want native MCP integration
  • You need modern browser features
  • Performance is critical
  • You're starting a new project
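
If you choose Playwright, its official MCP server can be registered directly in an MCP client configuration with no custom server code, for example (assuming npx is available):

{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}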

Conclusion

While Selenium doesn't have native MCP server support like Playwright or Puppeteer, you can create powerful custom MCP servers that leverage Selenium's robust browser automation capabilities. This approach gives you the flexibility to use Selenium's mature ecosystem while benefiting from the Model Context Protocol's standardized integration with AI assistants.

By following the examples and best practices outlined in this guide, you can build scalable web scraping solutions that combine Selenium's reliability with MCP's modern architecture for AI-powered data extraction workflows.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
