How do I authenticate with an MCP server?

Authentication with MCP (Model Context Protocol) servers is primarily handled through environment variables, configuration files, and credential management systems rather than direct authentication protocols. Unlike traditional client-server architectures with login flows, MCP servers typically run as local processes or trusted services where credentials for external APIs and resources are securely passed during initialization. (The MCP specification also defines OAuth-based authorization for remote servers accessed over HTTP, but this guide focuses on locally spawned servers.)

Understanding how to properly configure authentication for MCP servers is crucial for web scraping applications that need to access protected APIs, handle authenticated web sessions, or manage sensitive credentials like API keys for services such as WebScraping.AI.

MCP Authentication Architecture

MCP servers themselves don't implement user authentication in the traditional sense. Instead, authentication happens at three levels:

  1. Server Process Access: MCP servers run as child processes spawned by MCP clients (like Claude Desktop), inheriting the security context of the parent process
  2. External API Authentication: Credentials for third-party services (scraping APIs, databases) are passed via environment variables
  3. Resource Authorization: MCP servers can implement their own authorization logic for tool execution
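The first level above can be sketched in code: the client spawns the server as a child process and layers credentials on top of the inherited environment. A minimal illustration of the idea (the function name and variables are hypothetical, not part of any MCP SDK):

```python
"""Sketch of level 1: how an MCP client might spawn a server child process."""
import os
import subprocess

def spawn_mcp_server(command: list[str], credentials: dict[str, str]) -> subprocess.Popen:
    # The child inherits the parent's environment (its security context);
    # only the credentials we explicitly add are layered on top of it.
    env = os.environ.copy()
    env.update(credentials)
    return subprocess.Popen(
        command,
        env=env,
        stdin=subprocess.PIPE,   # MCP messages flow over stdio
        stdout=subprocess.PIPE,
    )
```

Real clients such as Claude Desktop do this internally, driven by the command, args, and env fields of their configuration file.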

Environment Variable Authentication

The most common and recommended method for authentication is using environment variables to pass API keys and credentials to MCP servers.

Basic Environment Variable Configuration

When configuring an MCP server in Claude Desktop, you can specify environment variables:

macOS Configuration (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/mcp-server/dist/index.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_api_key_here",
        "PROXY_USERNAME": "proxy_user",
        "PROXY_PASSWORD": "proxy_pass",
        "DATABASE_URL": "postgresql://user:pass@localhost/db"
      }
    }
  }
}

Windows Configuration (%APPDATA%\Claude\claude_desktop_config.json):

{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["C:\\path\\to\\mcp-server\\dist\\index.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_api_key_here",
        "API_TIMEOUT": "30000"
      }
    }
  }
}

Using System Environment Variables

Instead of hardcoding credentials in the configuration file, you can reference system environment variables. Note that not every MCP client expands ${VAR} placeholders, so verify that yours supports this syntax:

{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": {
        "API_KEY": "${WEBSCRAPING_AI_API_KEY}",
        "DATABASE_URL": "${DATABASE_URL}"
      }
    }
  }
}
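If your MCP client does not expand ${VAR} placeholders, a thin launcher script can pull credentials from the system environment, validate them, and start the server itself. A sketch under that assumption; the variable names and server path are examples:

```python
"""Launcher that validates system env vars before starting an MCP server."""
import os
import subprocess
import sys

REQUIRED_VARS = ["WEBSCRAPING_AI_API_KEY"]  # adjust to your server's needs

def build_env() -> dict[str, str]:
    # Fail fast with a clear message instead of starting a half-configured server
    env = os.environ.copy()
    missing = [v for v in REQUIRED_VARS if not env.get(v)]
    if missing:
        sys.exit(f"Missing required environment variables: {', '.join(missing)}")
    return env

def launch(server_cmd: list[str]) -> None:
    # The server child process inherits the validated environment
    subprocess.run(server_cmd, env=build_env(), check=True)

# Example (hypothetical path): launch(["node", "/path/to/server.js"])
```

Point the MCP client's command at this launcher instead of at the server entrypoint directly.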

Set the environment variables in your system:

macOS/Linux:

# Add to ~/.bashrc or ~/.zshrc
export WEBSCRAPING_AI_API_KEY="your_api_key_here"
export DATABASE_URL="postgresql://localhost/scraping_db"

Windows (PowerShell):

[System.Environment]::SetEnvironmentVariable('WEBSCRAPING_AI_API_KEY', 'your_api_key_here', 'User')

Windows (Command Prompt):

setx WEBSCRAPING_AI_API_KEY "your_api_key_here"

Implementing Authentication in MCP Servers

Python MCP Server Authentication

Here's how to implement secure authentication in a Python MCP server:

import os
import asyncio
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent
import httpx

# Load credentials from environment variables
API_KEY = os.environ.get("WEBSCRAPING_AI_API_KEY")
PROXY_URL = os.environ.get("PROXY_URL")

if not API_KEY:
    raise ValueError("WEBSCRAPING_AI_API_KEY environment variable is required")

# Create server instance
app = Server("authenticated-webscraping-server")

@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="scrape_authenticated",
            description="Scrape a webpage with API authentication",
            inputSchema={
                "type": "object",
                "properties": {
                    "url": {"type": "string", "description": "URL to scrape"},
                    "wait_for": {"type": "string", "description": "CSS selector to wait for"}
                },
                "required": ["url"]
            }
        )
    ]

@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "scrape_authenticated":
        async with httpx.AsyncClient() as client:
            # Build query params explicitly, omitting unset optional values
            params = {
                "url": arguments["url"],
                "api_key": API_KEY,  # Secure API key from environment
                "js": "true",
            }
            if arguments.get("wait_for"):
                params["wait_for"] = arguments["wait_for"]
            if PROXY_URL:
                params["proxy"] = PROXY_URL  # Optional proxy authentication
            response = await client.get(
                "https://api.webscraping.ai/html",
                params=params,
                timeout=30.0,
            )

            response.raise_for_status()

            return [TextContent(
                type="text",
                text=f"Successfully scraped {arguments['url']}\n{response.text}"
            )]

    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())

JavaScript/TypeScript MCP Server Authentication

For Node.js-based MCP servers, implement authentication similarly:

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import axios from "axios";

// Load and validate credentials
const API_KEY = process.env.WEBSCRAPING_AI_API_KEY;
const PROXY_USERNAME = process.env.PROXY_USERNAME;
const PROXY_PASSWORD = process.env.PROXY_PASSWORD;

if (!API_KEY) {
  throw new Error("WEBSCRAPING_AI_API_KEY environment variable is required");
}

// Create authenticated server
const server = new Server(
  {
    name: "authenticated-scraping-server",
    version: "1.0.0",
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: "secure_scrape",
        description: "Scrape websites with authenticated API access",
        inputSchema: {
          type: "object",
          properties: {
            url: { type: "string", description: "Target URL" },
            use_proxy: { type: "boolean", description: "Use authenticated proxy" },
          },
          required: ["url"],
        },
      },
    ],
  };
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === "secure_scrape") {
    try {
      const config: any = {
        params: {
          url: args.url,
          api_key: API_KEY,  // Secure API authentication
          js: true,
        },
        timeout: 30000,
      };

      // Add proxy authentication if requested
      if (args.use_proxy && PROXY_USERNAME && PROXY_PASSWORD) {
        config.params.proxy_username = PROXY_USERNAME;
        config.params.proxy_password = PROXY_PASSWORD;
      }

      const response = await axios.get(
        "https://api.webscraping.ai/html",
        config
      );

      return {
        content: [
          {
            type: "text",
            text: response.data,
          },
        ],
      };
    } catch (error: any) {
      return {
        content: [
          {
            type: "text",
            text: `Authentication error: ${error.message}`,
          },
        ],
        isError: true,
      };
    }
  }

  throw new Error(`Unknown tool: ${name}`);
});

// Start authenticated server
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("Authenticated MCP Server running");
}

main().catch((error) => {
  console.error("Server initialization failed:", error);
  process.exit(1);
});

Advanced Authentication Patterns

Multi-Service Authentication

When your MCP server needs to authenticate with multiple services, organize credentials systematically:

import os
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServiceCredentials:
    """Centralized credential management"""
    webscraping_api_key: str
    database_url: str
    redis_url: Optional[str] = None
    oauth_token: Optional[str] = None

    @classmethod
    def from_environment(cls):
        """Load all credentials from environment variables"""
        return cls(
            webscraping_api_key=os.environ.get("WEBSCRAPING_AI_API_KEY", ""),
            database_url=os.environ.get("DATABASE_URL", ""),
            redis_url=os.environ.get("REDIS_URL"),
            oauth_token=os.environ.get("OAUTH_TOKEN")
        )

    def validate(self):
        """Validate required credentials are present"""
        if not self.webscraping_api_key:
            raise ValueError("WebScraping.AI API key is required")
        if not self.database_url:
            raise ValueError("Database URL is required")
        return True

# Initialize credentials at server startup
credentials = ServiceCredentials.from_environment()
credentials.validate()

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "scrape_and_store":
        # Use credentials for API call
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.webscraping.ai/html",
                params={
                    "url": arguments["url"],
                    "api_key": credentials.webscraping_api_key
                }
            )

            # Use credentials for database storage
            # (database connection code here)

        return [TextContent(type="text", text="Data scraped and stored")]

OAuth Token Management

For services requiring OAuth authentication, implement token refresh logic:

class TokenManager {
  private accessToken: string | null = null;
  private refreshToken: string | null = null;
  private expiresAt: number = 0;

  constructor() {
    this.accessToken = process.env.OAUTH_ACCESS_TOKEN || null;
    this.refreshToken = process.env.OAUTH_REFRESH_TOKEN || null;
  }

  async getValidToken(): Promise<string> {
    // Check if token is still valid
    if (this.accessToken && Date.now() < this.expiresAt) {
      return this.accessToken;
    }

    // Refresh token if expired
    if (this.refreshToken) {
      await this.refreshAccessToken();
      return this.accessToken!;
    }

    throw new Error("No valid OAuth token available");
  }

  private async refreshAccessToken() {
    const response = await axios.post("https://oauth.example.com/token", {
      grant_type: "refresh_token",
      refresh_token: this.refreshToken,
      client_id: process.env.OAUTH_CLIENT_ID,
      client_secret: process.env.OAUTH_CLIENT_SECRET,
    });

    this.accessToken = response.data.access_token;
    this.expiresAt = Date.now() + response.data.expires_in * 1000;
  }
}

const tokenManager = new TokenManager();

// Use in tool handlers
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const token = await tokenManager.getValidToken();

  // Use token for authenticated requests
  const response = await axios.get("https://api.example.com/data", {
    headers: {
      Authorization: `Bearer ${token}`,
    },
  });

  return { content: [{ type: "text", text: response.data }] };
});

Credential Rotation and Secrets Management

For production environments, implement credential rotation:

import boto3
from datetime import datetime, timedelta

class SecretsManager:
    """AWS Secrets Manager integration for MCP server"""

    def __init__(self):
        self.client = boto3.client('secretsmanager')
        self.cache = {}
        self.cache_duration = timedelta(hours=1)

    def get_secret(self, secret_name: str) -> str:
        """Retrieve secret with caching"""
        if secret_name in self.cache:
            cached_value, cached_time = self.cache[secret_name]
            if datetime.now() - cached_time < self.cache_duration:
                return cached_value

        # Fetch from AWS Secrets Manager
        response = self.client.get_secret_value(SecretId=secret_name)
        secret_value = response['SecretString']

        # Update cache
        self.cache[secret_name] = (secret_value, datetime.now())
        return secret_value

    def get_api_key(self) -> str:
        """Get WebScraping.AI API key"""
        return self.get_secret('webscraping-ai-api-key')

# Initialize secrets manager
secrets = SecretsManager()

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    # Retrieve fresh API key
    api_key = secrets.get_api_key()

    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.webscraping.ai/html",
            params={"url": arguments["url"], "api_key": api_key}
        )

    return [TextContent(type="text", text=response.text)]

Authentication for Different Scraping Scenarios

Authenticating with Target Websites

When scraping authenticated websites, pass session credentials through your MCP server similar to how you handle authentication in Puppeteer:

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "scrape_authenticated_site":
        # Use cookies for website authentication
        cookies = arguments.get("cookies", {})

        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.webscraping.ai/html",
                params={
                    "url": arguments["url"],
                    "api_key": API_KEY
                },
                json={
                    "cookies": cookies,
                    "headers": arguments.get("headers", {})
                }
            )

            return [TextContent(type="text", text=response.text)]

Proxy Authentication

For proxies requiring authentication, configure credentials properly:

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;

  if (name === "scrape_with_proxy") {
    const proxyConfig = {
      url: args.url,
      api_key: API_KEY,
      proxy: "residential",
      proxy_username: process.env.PROXY_USERNAME,
      proxy_password: process.env.PROXY_PASSWORD,
    };

    const response = await axios.get(
      "https://api.webscraping.ai/html",
      { params: proxyConfig }
    );

    return {
      content: [{ type: "text", text: response.data }],
    };
  }

  throw new Error(`Unknown tool: ${name}`);
});

Security Best Practices

1. Never Hardcode Credentials

Bad Practice:

const API_KEY = "sk_live_abc123...";  // NEVER do this

Good Practice:

const API_KEY = process.env.WEBSCRAPING_AI_API_KEY;
if (!API_KEY) {
  throw new Error("API key not configured");
}

2. Validate Environment Variables at Startup

import os

REQUIRED_ENV_VARS = [
    "WEBSCRAPING_AI_API_KEY",
    "DATABASE_URL"
]

def validate_environment():
    """Ensure all required credentials are present"""
    missing = [var for var in REQUIRED_ENV_VARS if not os.environ.get(var)]

    if missing:
        raise ValueError(
            f"Missing required environment variables: {', '.join(missing)}"
        )

# Validate before starting server
validate_environment()

3. Use Least Privilege Access

Grant only the permissions needed:

# Good: API key with limited scope
API_KEY = os.environ["WEBSCRAPING_API_KEY_READONLY"]

# Bad: Using admin credentials for read-only operations

4. Implement Rate Limiting

Protect your credentials from abuse:

from datetime import datetime, timedelta
from collections import defaultdict

class RateLimiter:
    def __init__(self, max_requests: int, time_window: timedelta):
        self.max_requests = max_requests
        self.time_window = time_window
        self.requests = defaultdict(list)

    def check_rate_limit(self, identifier: str) -> bool:
        """Check if request is within rate limit"""
        now = datetime.now()

        # Remove old requests outside time window
        self.requests[identifier] = [
            req_time for req_time in self.requests[identifier]
            if now - req_time < self.time_window
        ]

        # Check if under limit
        if len(self.requests[identifier]) >= self.max_requests:
            return False

        self.requests[identifier].append(now)
        return True

rate_limiter = RateLimiter(max_requests=100, time_window=timedelta(minutes=1))

@app.call_tool()
async def call_tool(name: str, arguments: dict):
    if not rate_limiter.check_rate_limit(API_KEY):
        raise ValueError("Rate limit exceeded")

    # Proceed with API call

5. Log Authentication Failures

Implement comprehensive logging without exposing credentials:

import winston from "winston";

const logger = winston.createLogger({
  level: "info",
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: "mcp-server.log" }),
  ],
});

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { arguments: args } = request.params;

  try {
    const response = await axios.get(
      "https://api.webscraping.ai/html",
      { params: { url: args.url, api_key: API_KEY } }
    );

    logger.info("Successful scrape", {
      url: args.url,
      status: response.status
    });

    return { content: [{ type: "text", text: response.data }] };
  } catch (error: any) {
    // Log error without exposing API key
    logger.error("Authentication failed", {
      url: args.url,
      error: error.message,
      status: error.response?.status,
    });

    throw error;
  }
});

Troubleshooting Authentication Issues

Error: "API key not configured"

Solution: Verify environment variables are set correctly:

# Check if variable is set (macOS/Linux)
echo $WEBSCRAPING_AI_API_KEY

# Check if variable is set (Windows)
echo %WEBSCRAPING_AI_API_KEY%

Ensure the MCP server configuration includes the env section:

{
  "mcpServers": {
    "webscraping": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "env": {
        "WEBSCRAPING_AI_API_KEY": "your_key_here"
      }
    }
  }
}

Error: "401 Unauthorized"

Causes:

  • Invalid or expired API key
  • API key lacks required permissions
  • Request not properly formatted

Solution:

# Validate API key format before use
import re

def validate_api_key(key: str) -> bool:
    """Validate API key format"""
    # Example: Check if key matches expected pattern
    if not re.match(r'^[a-zA-Z0-9_-]+$', key):
        return False
    if len(key) < 20:
        return False
    return True

API_KEY = os.environ.get("WEBSCRAPING_AI_API_KEY", "")
if not validate_api_key(API_KEY):
    raise ValueError("Invalid API key format")

Error: "Environment variable not found"

Solution: Restart the MCP client after setting environment variables:

# macOS/Linux: Reload shell configuration
source ~/.bashrc  # or ~/.zshrc

# Windows: Restart application or reboot

For Claude Desktop, quit completely and relaunch after updating claude_desktop_config.json.
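To confirm which variables the server process actually received, a small diagnostic helper can log their presence (never their values) at startup. A sketch; the variable names are examples. Output goes to stderr because stdio-based MCP servers must keep stdout reserved for protocol messages:

```python
import os
import sys

def report_env(var_names: list[str]) -> dict[str, bool]:
    """Log to stderr whether each credential is visible, without printing values."""
    status = {name: bool(os.environ.get(name)) for name in var_names}
    for name, present in status.items():
        print(f"{name}: {'set' if present else 'MISSING'}", file=sys.stderr)
    return status

# Example: report_env(["WEBSCRAPING_AI_API_KEY", "DATABASE_URL"])
```

Call this once during server startup, then check the MCP client's server logs for the MISSING lines.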

Testing Authentication

Create a test script to verify authentication works correctly:

import asyncio
import httpx
import os

async def test_authentication():
    """Test MCP server authentication"""
    api_key = os.environ.get("WEBSCRAPING_AI_API_KEY")

    if not api_key:
        print("❌ API key not found in environment")
        return False

    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(
                "https://api.webscraping.ai/html",
                params={
                    "url": "https://example.com",
                    "api_key": api_key
                },
                timeout=10.0
            )

            if response.status_code == 200:
                print("✅ Authentication successful")
                return True
            else:
                print(f"❌ Authentication failed: {response.status_code}")
                return False

    except Exception as e:
        print(f"❌ Authentication test failed: {e}")
        return False

if __name__ == "__main__":
    asyncio.run(test_authentication())

Run the test:

python test_auth.py

Conclusion

Authentication with MCP servers relies on securely passing credentials through environment variables and configuration files rather than implementing traditional login mechanisms. By following best practices like storing API keys in environment variables, validating credentials at startup, implementing rate limiting, and using secrets management systems for production deployments, you can build secure MCP servers for web scraping applications.

When connecting to an MCP server, always ensure authentication is properly configured before attempting to use scraping tools. For production environments, consider integrating with secrets management services like AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault to automatically rotate credentials and enhance security.

Remember that the security of your MCP server depends not only on how you authenticate with external services but also on how you protect the credentials themselves. Never commit credentials to version control, always use environment-specific configuration, and implement comprehensive logging to detect and respond to authentication failures quickly.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl -G "https://api.webscraping.ai/ai/question" \
  --data-urlencode "url=https://example.com" \
  --data-urlencode "question=What is the main topic?" \
  --data-urlencode "api_key=YOUR_API_KEY"

Extract structured data:

curl -G "https://api.webscraping.ai/ai/fields" \
  --data-urlencode "url=https://example.com" \
  --data-urlencode "fields[title]=Page title" \
  --data-urlencode "fields[price]=Product price" \
  --data-urlencode "api_key=YOUR_API_KEY"
