How do I Find a List of Available MCP Servers?
You can find Model Context Protocol (MCP) servers for web scraping and automation through several official and community resources. MCP servers expose specialized tools, from browser automation to data extraction, that you can integrate into your scraping workflows.
Official MCP Server Directory
The primary resource for finding MCP servers is the official Anthropic MCP servers repository on GitHub (modelcontextprotocol/servers). It hosts reference server implementations, and its README links to third-party and community servers.
Browsing the Official Repository
You can browse the repository on GitHub, or clone it and list the server directories locally:
# Clone the official MCP servers repository
git clone https://github.com/modelcontextprotocol/servers.git
cd servers
# List all available server directories
ls -la src/
The repository is organized with each server in its own directory under the src/ folder. Common servers for web scraping include:
- Playwright MCP Server - Browser automation and web scraping
- Puppeteer MCP Server - Headless Chrome automation
- Fetch MCP Server - HTTP requests and API interactions
- Memory MCP Server - State management across scraping sessions
- Filesystem MCP Server - File operations for saving scraped data
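As a small illustration of the directory layout described above, the following Python sketch lists the server directories in a local clone. It assumes that servers sit directly under src/, as described, and that you pass in the path of your own checkout. The function name list_server_dirs is hypothetical, not part of any MCP tooling.

```python
from pathlib import Path


def list_server_dirs(repo_root):
    """Return the sorted names of the directories under <repo_root>/src."""
    src = Path(repo_root) / "src"
    if not src.is_dir():
        # Layout assumption violated: this checkout has no src/ folder.
        return []
    return sorted(p.name for p in src.iterdir() if p.is_dir())


if __name__ == "__main__":
    # "servers" is the directory created by the git clone command above.
    for name in list_server_dirs("servers"):
        print(name)
```

Returning an empty list for a missing src/ folder keeps the helper safe to call before the clone has finished or when the path is wrong.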
Using the MCP CLI to Discover Servers
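Command-line tooling can also help you discover and manage servers. Confirm that the package and subcommands shown here are available in your environment before scripting against them: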
# Install the MCP CLI globally
npm install -g @modelcontextprotocol/cli
# List installed MCP servers
mcp list
# Search for available MCP servers
mcp search web-scraping
mcp search browser
mcp search automation
Installing Servers from NPM
Many MCP servers are published as npm packages, making them easy to discover and install:
# Search for MCP servers on npm
npm search @modelcontextprotocol
# Install a specific MCP server
npm install @modelcontextprotocol/server-playwright
# Install the Puppeteer MCP server for browser automation
npm install @modelcontextprotocol/server-puppeteer
Configuring MCP Servers in Claude Desktop
Once you've identified servers you want to use, add them to the Claude Desktop configuration file (claude_desktop_config.json):
{
  "mcpServers": {
    "playwright": {
      "command": "node",
      "args": [
        "/path/to/node_modules/@modelcontextprotocol/server-playwright/dist/index.js"
      ]
    },
    "puppeteer": {
      "command": "node",
      "args": [
        "/path/to/node_modules/@modelcontextprotocol/server-puppeteer/dist/index.js"
      ]
    },
    "webscraping-ai": {
      "command": "npx",
      "args": [
        "-y",
        "@drakula2k/webscraping-ai-mcp-server"
      ],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your-api-key-here"
      }
    }
  }
}
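To double-check a configuration like the one above before restarting Claude Desktop, a short script can parse it and summarize what each entry would launch. This is a minimal sketch that assumes the mcpServers layout shown above; summarize_mcp_config and EXAMPLE_CONFIG are hypothetical names introduced for illustration.

```python
import json


def summarize_mcp_config(config_text):
    """Return (name, command line, env var names) for each configured server."""
    config = json.loads(config_text)
    summary = []
    for name, entry in config.get("mcpServers", {}).items():
        command_line = " ".join([entry["command"], *entry.get("args", [])])
        # Report only the names of environment variables, never their values,
        # so API keys do not end up in logs.
        env_names = sorted(entry.get("env", {}))
        summary.append((name, command_line, env_names))
    return summary


# A trimmed copy of the configuration above, used as sample input.
EXAMPLE_CONFIG = """
{
  "mcpServers": {
    "webscraping-ai": {
      "command": "npx",
      "args": ["-y", "@drakula2k/webscraping-ai-mcp-server"],
      "env": {"WEBSCRAPING_AI_API_KEY": "your-api-key-here"}
    }
  }
}
"""

if __name__ == "__main__":
    for name, command_line, env_names in summarize_mcp_config(EXAMPLE_CONFIG):
        print(f"{name}: {command_line} (env: {', '.join(env_names) or 'none'})")
```

A malformed file raises json.JSONDecodeError here, which is usually easier to diagnose than a server that silently fails to appear in the client.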
Programmatically Listing Available MCP Tools
After connecting to an MCP server, you can programmatically list available tools using the MCP SDK:
Python Example
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def list_mcp_tools(server_command, server_args):
    """List all tools available from an MCP server"""
    server_params = StdioServerParameters(
        command=server_command,
        args=server_args
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # List all available tools
            tools = await session.list_tools()
            print(f"Available tools from {server_command}:")
            for tool in tools.tools:
                print(f"\n- {tool.name}")
                print(f"  Description: {tool.description}")
                print(f"  Input schema: {tool.inputSchema}")
            return tools


# Example: List tools from Playwright MCP server
asyncio.run(list_mcp_tools(
    "node",
    ["node_modules/@modelcontextprotocol/server-playwright/dist/index.js"]
))
JavaScript Example
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function listMcpTools(serverCommand, serverArgs) {
  // Create transport for stdio communication
  const transport = new StdioClientTransport({
    command: serverCommand,
    args: serverArgs
  });

  // Create MCP client
  const client = new Client({
    name: "mcp-tool-lister",
    version: "1.0.0"
  }, {
    capabilities: {}
  });

  await client.connect(transport);

  // List all available tools
  const tools = await client.listTools();
  console.log(`Available tools from ${serverCommand}:`);
  tools.tools.forEach(tool => {
    console.log(`\n- ${tool.name}`);
    console.log(`  Description: ${tool.description}`);
    console.log(`  Input schema:`, JSON.stringify(tool.inputSchema, null, 2));
  });

  await client.close();
  return tools;
}

// Example: List tools from Puppeteer MCP server
listMcpTools(
  "node",
  ["node_modules/@modelcontextprotocol/server-puppeteer/dist/index.js"]
).catch(console.error);
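Once a server's tools have been listed, it is often useful to narrow them down to the ones relevant to a task. The Python sketch below filters tool descriptors by keyword. It assumes the tools have been converted to plain dictionaries with the name and description fields printed in the examples above; filter_tools and the sample tool names are invented for illustration and do not come from any particular server.

```python
def filter_tools(tools, keyword):
    """Return tools whose name or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for tool in tools:
        # A description may be missing or None, so fall back to an empty string.
        haystack = f"{tool.get('name', '')} {tool.get('description') or ''}".lower()
        if needle in haystack:
            matches.append(tool)
    return matches


# Illustrative tool descriptors; the names are invented for this sketch.
SAMPLE_TOOLS = [
    {"name": "navigate", "description": "Open a URL in the browser"},
    {"name": "screenshot", "description": "Capture the current page"},
    {"name": "read_file", "description": None},
]

if __name__ == "__main__":
    for tool in filter_tools(SAMPLE_TOOLS, "page"):
        print(tool["name"])
```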
Community MCP Server Resources
Beyond the official repository, you can find community-built MCP servers through several channels:
GitHub Search
Use GitHub's search functionality to discover MCP server repositories, for example with these queries:
- https://github.com/search?q=mcp+server+web+scraping
- https://github.com/search?q=modelcontextprotocol+server
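If you run such searches often, the search URLs above can be generated from a list of terms. This is a minimal sketch using only the Python standard library; github_search_url is a hypothetical helper, and it reproduces only the q parameter seen in the URLs above.

```python
from urllib.parse import urlencode


def github_search_url(*terms):
    """Build a GitHub search URL for the given search terms."""
    query = " ".join(terms)
    # urlencode escapes special characters and encodes spaces as "+".
    return "https://github.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(github_search_url("mcp", "server", "web", "scraping"))
    print(github_search_url("modelcontextprotocol", "server"))
```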
Popular Community MCP Servers for Web Scraping
- WebScraping.AI MCP Server - Full-featured web scraping with proxy support and AI extraction
- Selenium MCP Server - WebDriver-based browser automation
- Axios MCP Server - HTTP client for API requests
- Cheerio MCP Server - Fast HTML parsing and manipulation
- BeautifulSoup MCP Server - Python HTML/XML parsing
Listing MCP Resources Programmatically
MCP servers can expose resources (like URLs, files, or data sources) that can be listed:
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def list_mcp_resources(server_command, server_args):
    """List all resources available from an MCP server"""
    server_params = StdioServerParameters(
        command=server_command,
        args=server_args
    )
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # List all available resources
            resources = await session.list_resources()
            print(f"Available resources from {server_command}:")
            for resource in resources.resources:
                print(f"\n- {resource.uri}")
                print(f"  Name: {resource.name}")
                print(f"  Description: {resource.description}")
                print(f"  MIME type: {resource.mimeType}")
            return resources
Testing MCP Server Availability
Before integrating an MCP server into your workflow, test its availability and functionality. Create a test script (test-mcp-server.js):
// Uses ES module imports: set "type": "module" in package.json or name the file .mjs
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function testMcpServer(serverCommand, serverArgs) {
  try {
    const transport = new StdioClientTransport({
      command: serverCommand,
      args: serverArgs
    });
    const client = new Client({
      name: "mcp-tester",
      version: "1.0.0"
    }, {
      capabilities: {}
    });

    await client.connect(transport);
    console.log("✓ Successfully connected to MCP server");

    const tools = await client.listTools();
    console.log(`✓ Server provides ${tools.tools.length} tools`);

    // Only list resources if the server advertises that capability;
    // tools-only servers reject the request
    if (client.getServerCapabilities()?.resources) {
      const resources = await client.listResources();
      console.log(`✓ Server provides ${resources.resources.length} resources`);
    } else {
      console.log("- Server does not expose resources");
    }

    await client.close();
    console.log("✓ Test completed successfully");
    return true;
  } catch (error) {
    console.error("✗ MCP server test failed:", error);
    return false;
  }
}

// Test Playwright MCP server
testMcpServer(
  "node",
  ["node_modules/@modelcontextprotocol/server-playwright/dist/index.js"]
);
Then run it:
# Test MCP server connection
node test-mcp-server.js
MCP Server Discovery Best Practices
When searching for MCP servers for web scraping tasks, consider these best practices:
- Check Server Maintenance - Verify the repository is actively maintained with recent commits
- Review Documentation - Ensure the server has clear documentation and examples
- Test Locally First - Always test servers in a development environment before production use
- Check Dependencies - Review the server's dependencies and security status
- Community Support - Look for servers with active community support and issue resolution
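The maintenance check in the list above can be partly automated once you know the date of a repository's most recent commit. The sketch below is one possible heuristic, not an established rule: the 180-day threshold is an arbitrary assumption, is_recently_maintained is a hypothetical name, and the timestamp is expected in ISO 8601 form with an explicit offset such as +00:00.

```python
from datetime import datetime, timedelta, timezone


def is_recently_maintained(last_commit_iso, max_age_days=180, now=None):
    """Return True if the last commit is no older than max_age_days."""
    last_commit = datetime.fromisoformat(last_commit_iso)
    if last_commit.tzinfo is None:
        # Treat naive timestamps as UTC so the comparison below is well-defined.
        last_commit = last_commit.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return now - last_commit <= timedelta(days=max_age_days)


if __name__ == "__main__":
    # Fixed reference date so the example does not depend on the current time.
    reference = datetime(2024, 6, 1, tzinfo=timezone.utc)
    print(is_recently_maintained("2024-05-01T12:00:00+00:00", now=reference))
    print(is_recently_maintained("2023-01-01", now=reference))
```

Commit recency is only one signal; it says nothing about documentation quality or dependency health, which still need a manual review.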
Integration with Web Scraping Workflows
Once you've found suitable MCP servers, integrate them into your scraping workflows. Browser automation tasks such as handling browser sessions or monitoring network requests in Puppeteer become easier with MCP-powered tools.
Creating a Custom MCP Server Registry
For organizations managing multiple MCP servers, consider creating a custom registry file, for example mcp-registry.json:
{
  "servers": [
    {
      "name": "playwright-scraper",
      "description": "Browser automation with Playwright",
      "category": "browser-automation",
      "command": "node",
      "args": ["./servers/playwright/index.js"],
      "tags": ["browser", "scraping", "automation"]
    },
    {
      "name": "webscraping-ai",
      "description": "AI-powered web scraping API",
      "category": "api-scraping",
      "command": "npx",
      "args": ["-y", "@drakula2k/webscraping-ai-mcp-server"],
      "tags": ["api", "ai", "extraction"],
      "requiresApiKey": true
    }
  ]
}
Load and use the registry:
import fs from 'fs/promises';

async function loadMcpRegistry(registryPath) {
  const registry = JSON.parse(
    await fs.readFile(registryPath, 'utf-8')
  );

  // Filter servers by category or tags
  const browserServers = registry.servers.filter(
    s => s.category === 'browser-automation'
  );
  const apiServers = registry.servers.filter(
    s => s.tags.includes('api')
  );

  return {
    all: registry.servers,
    browser: browserServers,
    api: apiServers
  };
}

// Use the registry
const servers = await loadMcpRegistry('./mcp-registry.json');
console.log('Available browser automation servers:', servers.browser);
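A shared registry is easier to trust if it is validated before use. The following Python sketch checks a registry shaped like the mcp-registry.json example above for missing fields, duplicate names, and malformed tags. The choice of required fields is an assumption made for this example, and validate_registry is a hypothetical helper rather than part of any MCP SDK.

```python
# Fields this sketch treats as mandatory; adjust to your own registry rules.
REQUIRED_FIELDS = ("name", "command", "args")


def validate_registry(registry):
    """Return a list of human-readable problems found in the registry."""
    problems = []
    seen_names = set()
    for index, server in enumerate(registry.get("servers", [])):
        label = server.get("name", f"entry #{index}")
        for field in REQUIRED_FIELDS:
            if field not in server:
                problems.append(f"{label}: missing '{field}'")
        if label in seen_names:
            problems.append(f"{label}: duplicate name")
        seen_names.add(label)
        # Tags are optional, but when present they must be a list of strings.
        tags = server.get("tags", [])
        if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
            problems.append(f"{label}: 'tags' must be a list of strings")
    return problems


if __name__ == "__main__":
    sample = {
        "servers": [
            {"name": "playwright-scraper", "command": "node",
             "args": ["./servers/playwright/index.js"], "tags": ["browser"]},
            {"name": "broken-entry", "args": [], "tags": "api"},
        ]
    }
    for problem in validate_registry(sample):
        print(problem)
```

Collecting every problem in a list, rather than raising on the first one, lets a maintainer fix a registry file in a single pass.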
Conclusion
Finding and using MCP servers for web scraping involves exploring the official repository, community resources, and npm packages. By leveraging the MCP SDK's discovery tools and programmatically listing available tools and resources, you can build powerful, modular scraping workflows that combine multiple specialized servers.
Start with the official Anthropic MCP servers repository, test servers locally, and gradually build your custom registry of trusted MCP servers for your specific web scraping needs. Whether you're using browser automation with Puppeteer or API-based scraping solutions, MCP servers provide a standardized way to extend your capabilities.