How do I authenticate with an MCP server?
Authentication with MCP (Model Context Protocol) servers is primarily handled through environment variables, configuration files, and credential management systems rather than direct authentication protocols. Unlike traditional client-server architectures with login flows, MCP servers typically run as local processes or trusted services where credentials for external APIs and resources are securely passed during initialization.
Understanding how to properly configure authentication for MCP servers is crucial for web scraping applications that need to access protected APIs, handle authenticated web sessions, or manage sensitive credentials like API keys for services such as WebScraping.AI.
MCP Authentication Architecture
MCP servers themselves don't implement user authentication in the traditional sense (the MCP specification does define an OAuth-based authorization flow, but it applies to remote HTTP transports rather than local servers). For local servers, authentication happens at three levels:
- Server Process Access: MCP servers run as child processes spawned by MCP clients (like Claude Desktop), inheriting the security context of the parent process
- External API Authentication: Credentials for third-party services (scraping APIs, databases) are passed via environment variables
- Resource Authorization: MCP servers can implement their own authorization logic for tool execution
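The third level has no login flow to lean on, so in practice it is just explicit checks in the tool handler. A minimal sketch of all three levels (the tool names and allowlist here are illustrative, not part of the MCP SDK):

```python
import os

# Levels 1-2: credentials arrive via the environment inherited from the MCP client
API_KEY = os.environ.get("WEBSCRAPING_AI_API_KEY")

# Level 3: hypothetical allowlist gating which tools may execute
ALLOWED_TOOLS = {"scrape_page", "fetch_sitemap"}

def authorize_tool_call(tool_name: str) -> bool:
    """Server-side authorization check run before executing any tool."""
    return tool_name in ALLOWED_TOOLS
```

In a real server this check would sit at the top of the tool-call handler, rejecting anything outside the allowlist before any credential is used.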
Environment Variable Authentication
The most common and recommended method for authentication is using environment variables to pass API keys and credentials to MCP servers.
Basic Environment Variable Configuration
When configuring an MCP server in Claude Desktop, you can specify environment variables:
macOS Configuration (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/mcp-server/dist/index.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_api_key_here",
        "PROXY_USERNAME": "proxy_user",
        "PROXY_PASSWORD": "proxy_pass",
        "DATABASE_URL": "postgresql://user:pass@localhost/db"
      }
    }
  }
}
Windows Configuration (%APPDATA%\Claude\claude_desktop_config.json):
{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["C:\\path\\to\\mcp-server\\dist\\index.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_api_key_here",
        "API_TIMEOUT": "30000"
      }
    }
  }
}
Using System Environment Variables
Instead of hardcoding credentials in the configuration file, you can reference system environment variables. Note that ${VAR} expansion is client-dependent; verify that your MCP client actually substitutes these placeholders before relying on this syntax:
{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": {
        "API_KEY": "${WEBSCRAPING_AI_API_KEY}",
        "DATABASE_URL": "${DATABASE_URL}"
      }
    }
  }
}
Set the environment variables in your system:
macOS/Linux:
# Add to ~/.bashrc or ~/.zshrc
export WEBSCRAPING_AI_API_KEY="your_api_key_here"
export DATABASE_URL="postgresql://localhost/scraping_db"
Windows (PowerShell):
[System.Environment]::SetEnvironmentVariable('WEBSCRAPING_AI_API_KEY', 'your_api_key_here', 'User')
Windows (Command Prompt):
setx WEBSCRAPING_AI_API_KEY "your_api_key_here"
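If your MCP client does not expand ${VAR} placeholders itself, the server can resolve them at startup. A minimal sketch (the placeholder syntax mirrors the config above and is an assumption of this example, not an SDK feature):

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def resolve_placeholders(value: str) -> str:
    """Replace ${VAR} references with values from the process environment.

    Unset variables are left as-is so a missing credential fails loudly
    at validation time instead of silently becoming an empty string.
    """
    return _PLACEHOLDER.sub(
        lambda m: os.environ.get(m.group(1), m.group(0)),
        value,
    )
```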
Implementing Authentication in MCP Servers
Python MCP Server Authentication
Here's how to implement secure authentication in a Python MCP server:
import os
import asyncio

import httpx
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

# Load credentials from environment variables
API_KEY = os.environ.get("WEBSCRAPING_AI_API_KEY")
PROXY_URL = os.environ.get("PROXY_URL")

if not API_KEY:
    raise ValueError("WEBSCRAPING_AI_API_KEY environment variable is required")

# Create server instance
app = Server("authenticated-webscraping-server")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="scrape_authenticated",
            description="Scrape a webpage with API authentication",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to scrape"},
                    "wait_for": {"type": "string", "description": "CSS selector to wait for"}
                },
                "required": ["url"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "scrape_authenticated":
        # Build query params, dropping unset optional values so they
        # are not sent as empty parameters
        params = {
            "url": arguments["url"],
            "api_key": API_KEY,  # Secure API key from environment
            "js": "true",
            "wait_for": arguments.get("wait_for"),
            "proxy": PROXY_URL,  # Optional proxy configuration
        }
        params = {k: v for k, v in params.items() if v is not None}

        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.webscraping.ai/html",
                params=params,
                timeout=30.0
            )
            response.raise_for_status()
            return [TextContent(
                type="text",
                text=f"Successfully scraped {arguments['url']}\n{response.text}"
            )]

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream, write_stream, app.create_initialization_options()
        )

if __name__ == "__main__":
    asyncio.run(main())
JavaScript/TypeScript MCP Server Authentication
For Node.js-based MCP servers, implement authentication similarly:
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import axios from "axios";

// Load and validate credentials
const API_KEY = process.env.WEBSCRAPING_AI_API_KEY;
const PROXY_USERNAME = process.env.PROXY_USERNAME;
const PROXY_PASSWORD = process.env.PROXY_PASSWORD;

if (!API_KEY) {
  throw new Error("WEBSCRAPING_AI_API_KEY environment variable is required");
}

// Create authenticated server
const server = new Server(
  {
    name: "authenticated-scraping-server",
    version: "1.0.0",
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: "secure_scrape",
        description: "Scrape websites with authenticated API access",
        inputSchema: {
          type: "object",
          properties: {
            url: { type: "string", description: "Target URL" },
            use_proxy: { type: "boolean", description: "Use authenticated proxy" },
          },
          required: ["url"],
        },
      },
    ],
  };
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === "secure_scrape") {
    try {
      const config: any = {
        params: {
          url: args.url,
          api_key: API_KEY, // Secure API authentication
          js: true,
        },
        timeout: 30000,
      };

      // Add proxy authentication if requested
      if (args.use_proxy && PROXY_USERNAME && PROXY_PASSWORD) {
        config.params.proxy_username = PROXY_USERNAME;
        config.params.proxy_password = PROXY_PASSWORD;
      }

      const response = await axios.get(
        "https://api.webscraping.ai/html",
        config
      );

      return {
        content: [
          {
            type: "text",
            text: response.data,
          },
        ],
      };
    } catch (error: any) {
      return {
        content: [
          {
            type: "text",
            text: `Authentication error: ${error.message}`,
          },
        ],
        isError: true,
      };
    }
  }

  throw new Error(`Unknown tool: ${name}`);
});

// Start authenticated server
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("Authenticated MCP Server running");
}

main().catch((error) => {
  console.error("Server initialization failed:", error);
  process.exit(1);
});
Advanced Authentication Patterns
Multi-Service Authentication
When your MCP server needs to authenticate with multiple services, organize credentials systematically:
import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServiceCredentials:
    """Centralized credential management"""
    webscraping_api_key: str
    database_url: str
    redis_url: Optional[str] = None
    oauth_token: Optional[str] = None

    @classmethod
    def from_environment(cls):
        """Load all credentials from environment variables"""
        return cls(
            webscraping_api_key=os.environ["WEBSCRAPING_AI_API_KEY"],
            database_url=os.environ["DATABASE_URL"],
            redis_url=os.environ.get("REDIS_URL"),
            oauth_token=os.environ.get("OAUTH_TOKEN")
        )

    def validate(self):
        """Validate required credentials are present"""
        if not self.webscraping_api_key:
            raise ValueError("WebScraping.AI API key is required")
        if not self.database_url:
            raise ValueError("Database URL is required")
        return True

# Initialize credentials at server startup
credentials = ServiceCredentials.from_environment()
credentials.validate()

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "scrape_and_store":
        # Use credentials for API call
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.webscraping.ai/html",
                params={
                    "url": arguments["url"],
                    "api_key": credentials.webscraping_api_key
                }
            )
        # Use credentials for database storage
        # (database connection code here)
        return [TextContent(type="text", text="Data scraped and stored")]
OAuth Token Management
For services requiring OAuth authentication, implement token refresh logic:
class TokenManager {
  private accessToken: string | null = null;
  private refreshToken: string | null = null;
  private expiresAt: number = 0;

  constructor() {
    this.accessToken = process.env.OAUTH_ACCESS_TOKEN || null;
    this.refreshToken = process.env.OAUTH_REFRESH_TOKEN || null;
  }

  async getValidToken(): Promise<string> {
    // Check if token is still valid
    if (this.accessToken && Date.now() < this.expiresAt) {
      return this.accessToken;
    }

    // Refresh token if expired
    if (this.refreshToken) {
      await this.refreshAccessToken();
      return this.accessToken!;
    }

    throw new Error("No valid OAuth token available");
  }

  private async refreshAccessToken() {
    const response = await axios.post("https://oauth.example.com/token", {
      grant_type: "refresh_token",
      refresh_token: this.refreshToken,
      client_id: process.env.OAUTH_CLIENT_ID,
      client_secret: process.env.OAUTH_CLIENT_SECRET,
    });

    this.accessToken = response.data.access_token;
    this.expiresAt = Date.now() + response.data.expires_in * 1000;
  }
}

const tokenManager = new TokenManager();

// Use in tool handlers
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const token = await tokenManager.getValidToken();

  // Use token for authenticated requests
  const response = await axios.get("https://api.example.com/data", {
    headers: {
      Authorization: `Bearer ${token}`,
    },
  });

  return {
    content: [{ type: "text", text: JSON.stringify(response.data) }],
  };
});
Credential Rotation and Secrets Management
For production environments, implement credential rotation:
import boto3
from datetime import datetime, timedelta

class SecretsManager:
    """AWS Secrets Manager integration for MCP server"""

    def __init__(self):
        self.client = boto3.client('secretsmanager')
        self.cache = {}
        self.cache_duration = timedelta(hours=1)

    def get_secret(self, secret_name: str) -> str:
        """Retrieve secret with caching"""
        if secret_name in self.cache:
            cached_value, cached_time = self.cache[secret_name]
            if datetime.now() - cached_time < self.cache_duration:
                return cached_value

        # Fetch from AWS Secrets Manager
        response = self.client.get_secret_value(SecretId=secret_name)
        secret_value = response['SecretString']

        # Update cache
        self.cache[secret_name] = (secret_value, datetime.now())
        return secret_value

    def get_api_key(self) -> str:
        """Get WebScraping.AI API key"""
        return self.get_secret('webscraping-ai-api-key')

# Initialize secrets manager
secrets = SecretsManager()

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    # Retrieve fresh API key
    api_key = secrets.get_api_key()

    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.webscraping.ai/html",
            params={"url": arguments["url"], "api_key": api_key}
        )
        return [TextContent(type="text", text=response.text)]
Authentication for Different Scraping Scenarios
Authenticating with Target Websites
When scraping authenticated websites, pass session credentials through your MCP server similar to how you handle authentication in Puppeteer:
@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "scrape_authenticated_site":
        # Use cookies for website authentication
        cookies = arguments.get("cookies", {})

        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.webscraping.ai/html",
                params={
                    "url": arguments["url"],
                    "api_key": API_KEY
                },
                json={
                    "cookies": cookies,
                    "headers": arguments.get("headers", {})
                }
            )
            return [TextContent(type="text", text=response.text)]
Proxy Authentication
For proxies requiring authentication, configure credentials properly:
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === "scrape_with_proxy") {
    const proxyConfig = {
      url: args.url,
      api_key: API_KEY,
      proxy: "residential",
      proxy_username: process.env.PROXY_USERNAME,
      proxy_password: process.env.PROXY_PASSWORD,
    };

    const response = await axios.get(
      "https://api.webscraping.ai/html",
      { params: proxyConfig }
    );

    return {
      content: [{ type: "text", text: response.data }],
    };
  }
});
Security Best Practices
1. Never Hardcode Credentials
Bad Practice:
const API_KEY = "sk_live_abc123..."; // NEVER do this
Good Practice:
const API_KEY = process.env.WEBSCRAPING_AI_API_KEY;

if (!API_KEY) {
  throw new Error("API key not configured");
}
2. Validate Environment Variables at Startup
import os

REQUIRED_ENV_VARS = [
    "WEBSCRAPING_AI_API_KEY",
    "DATABASE_URL"
]

def validate_environment():
    """Ensure all required credentials are present"""
    missing = [var for var in REQUIRED_ENV_VARS if not os.environ.get(var)]
    if missing:
        raise ValueError(
            f"Missing required environment variables: {', '.join(missing)}"
        )

# Validate before starting server
validate_environment()
3. Use Least Privilege Access
Grant only the permissions needed:
# Good: API key with limited scope
API_KEY = os.environ["WEBSCRAPING_API_KEY_READONLY"]
# Bad: Using admin credentials for read-only operations
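One way to make least privilege concrete is to rank credential scopes and have each tool declare the minimum it needs. A sketch (the scope names and tool-to-scope mapping are hypothetical):

```python
# Hypothetical scope ranking: higher numbers grant strictly more access
SCOPE_ORDER = {"readonly": 0, "readwrite": 1, "admin": 2}

# Each tool declares the minimum scope it needs; unknown tools default to admin
TOOL_SCOPES = {"scrape_page": "readonly", "purge_cache": "admin"}

def has_scope(granted: str, required: str) -> bool:
    """True if the granted credential scope covers the required one."""
    return SCOPE_ORDER[granted] >= SCOPE_ORDER[required]

def authorize(tool_name: str, granted: str) -> bool:
    """Check a tool call against the scope the credential actually carries."""
    return has_scope(granted, TOOL_SCOPES.get(tool_name, "admin"))
```

Defaulting unknown tools to the highest scope means a forgotten mapping fails closed rather than open.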
4. Implement Rate Limiting
Protect your credentials from abuse:
from datetime import datetime, timedelta
from collections import defaultdict

class RateLimiter:
    def __init__(self, max_requests: int, time_window: timedelta):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = defaultdict(list)

    def check_rate_limit(self, identifier: str) -> bool:
        """Check if request is within rate limit"""
        now = datetime.now()

        # Remove old requests outside time window
        self.requests[identifier] = [
            req_time for req_time in self.requests[identifier]
            if now - req_time < self.time_window
        ]

        # Check if under limit
        if len(self.requests[identifier]) >= self.max_requests:
            return False

        self.requests[identifier].append(now)
        return True

rate_limiter = RateLimiter(max_requests=100, time_window=timedelta(minutes=1))

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if not rate_limiter.check_rate_limit(API_KEY):
        raise ValueError("Rate limit exceeded")
    # Proceed with API call
5. Log Authentication Failures
Implement comprehensive logging without exposing credentials:
import winston from "winston";

const logger = winston.createLogger({
  level: "info",
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: "mcp-server.log" }),
  ],
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { arguments: args } = request.params;

  try {
    const response = await axios.get(
      "https://api.webscraping.ai/html",
      { params: { url: args.url, api_key: API_KEY } }
    );

    logger.info("Successful scrape", {
      url: args.url,
      status: response.status
    });

    return { content: [{ type: "text", text: response.data }] };
  } catch (error: any) {
    // Log error without exposing API key
    logger.error("Authentication failed", {
      url: args.url,
      error: error.message,
      status: error.response?.status,
    });
    throw error;
  }
});
Troubleshooting Authentication Issues
Error: "API key not configured"
Solution: Verify environment variables are set correctly:
# Check if variable is set (macOS/Linux)
echo $WEBSCRAPING_AI_API_KEY
# Check if variable is set (Windows)
echo %WEBSCRAPING_AI_API_KEY%
Ensure the MCP server configuration includes the env section:
{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_key_here"
      }
    }
  }
}
Error: "401 Unauthorized"
Causes:
- Invalid or expired API key
- API key lacks required permissions
- Request not properly formatted
Solution:
# Validate API key format before use
import os
import re

def validate_api_key(key: str) -> bool:
    """Validate API key format"""
    # Example: Check if key matches expected pattern
    if not re.match(r'^[a-zA-Z0-9_-]+$', key):
        return False
    if len(key) < 20:
        return False
    return True

API_KEY = os.environ.get("WEBSCRAPING_AI_API_KEY", "")
if not validate_api_key(API_KEY):
    raise ValueError("Invalid API key format")
Error: "Environment variable not found"
Solution: Restart the MCP client after setting environment variables:
# macOS/Linux: Reload shell configuration
source ~/.bashrc # or ~/.zshrc
# Windows: Restart application or reboot
For Claude Desktop, quit the application completely and relaunch it after updating claude_desktop_config.json.
Testing Authentication
Create a test script to verify authentication works correctly:
import asyncio
import os

import httpx

async def test_authentication():
    """Test MCP server authentication"""
    api_key = os.environ.get("WEBSCRAPING_AI_API_KEY")

    if not api_key:
        print("❌ API key not found in environment")
        return False

    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.webscraping.ai/html",
                params={
                    "url": "https://example.com",
                    "api_key": api_key
                },
                timeout=10.0
            )

            if response.status_code == 200:
                print("✅ Authentication successful")
                return True
            else:
                print(f"❌ Authentication failed: {response.status_code}")
                return False
    except Exception as e:
        print(f"❌ Authentication test failed: {e}")
        return False

if __name__ == "__main__":
    asyncio.run(test_authentication())
Run the test:
python test_auth.py
Conclusion
Authentication with MCP servers relies on securely passing credentials through environment variables and configuration files rather than implementing traditional login mechanisms. By following best practices like storing API keys in environment variables, validating credentials at startup, implementing rate limiting, and using secrets management systems for production deployments, you can build secure MCP servers for web scraping applications.
When connecting to an MCP server, always ensure authentication is properly configured before attempting to use scraping tools. For production environments, consider integrating with secrets management services like AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault to automatically rotate credentials and enhance security.
Remember that the security of your MCP server depends not only on how you authenticate with external services but also on how you protect the credentials themselves. Never commit credentials to version control, always use environment-specific configuration, and implement comprehensive logging to detect and respond to authentication failures quickly.