What are the token limits for Claude API in web scraping?
Understanding token limits is crucial when using the Claude API for web scraping tasks. Claude's token limits determine how much text you can process in a single API call, which directly impacts your ability to extract data from web pages efficiently.
Claude API Token Limits by Model
Different Claude models have varying token limits, also known as context windows. Here's a breakdown of the current limits:
Claude 3.5 Sonnet (claude-3-5-sonnet-20241022)
- Context Window: 200,000 tokens
- Maximum Output: 8,192 tokens
- Best For: Complex web scraping tasks requiring deep analysis of large pages
Claude 3 Opus (claude-3-opus-20240229)
- Context Window: 200,000 tokens
- Maximum Output: 4,096 tokens
- Best For: High-accuracy extraction from extensive HTML documents
Claude 3 Sonnet (claude-3-sonnet-20240229)
- Context Window: 200,000 tokens
- Maximum Output: 4,096 tokens
- Best For: Balanced performance for medium-sized web pages
Claude 3 Haiku (claude-3-haiku-20240307)
- Context Window: 200,000 tokens
- Maximum Output: 4,096 tokens
- Best For: Fast, cost-effective scraping of simpler pages
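All four models share the same 200,000-token context window; what differs is the output cap. If you want these limits available programmatically, the sketch below encodes them as a lookup table (MODEL_LIMITS and fits_in_context are illustrative helpers, not part of the Anthropic SDK, and they assume the input plus the requested output must fit inside the context window for these models):

# Illustrative only: limits copied from the list above, not provided by the SDK
MODEL_LIMITS = {
    "claude-3-5-sonnet-20241022": {"context": 200_000, "max_output": 8_192},
    "claude-3-opus-20240229": {"context": 200_000, "max_output": 4_096},
    "claude-3-sonnet-20240229": {"context": 200_000, "max_output": 4_096},
    "claude-3-haiku-20240307": {"context": 200_000, "max_output": 4_096},
}

def fits_in_context(model, estimated_input_tokens, max_tokens):
    """Check that estimated input plus requested output stays inside the window."""
    limits = MODEL_LIMITS[model]
    if max_tokens > limits["max_output"]:
        return False  # the model cannot emit that many output tokens
    return estimated_input_tokens + max_tokens <= limits["context"]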
Understanding Tokens
A token is approximately 3-4 characters in English text. For web scraping:
- 1 token ≈ 0.75 words
- 100 tokens ≈ 75 words
- 1,000 tokens ≈ 750 words
- 10,000 tokens ≈ 7,500 words
HTML markup significantly increases token count compared to plain text, as tags, attributes, and whitespace all consume tokens.
Token Consumption in Web Scraping
When scraping with Claude, tokens are consumed by:
- System Instructions: Your prompts and instructions (typically 100-500 tokens)
- HTML Content: The web page content you're analyzing (varies widely)
- Examples: Few-shot examples you provide (if any)
- Response: Claude's extracted data output
Example Token Calculation
import requests

def estimate_tokens(text):
    """Rough estimation: 1 token ≈ 4 characters"""
    return len(text) // 4

# Fetch a web page
url = "https://example.com/products"
response = requests.get(url)
html_content = response.text

# Estimate tokens for the prompt plus the page content
prompt = "Extract all product names and prices from this HTML"
total_input_tokens = estimate_tokens(prompt + html_content)

print(f"Estimated input tokens: {total_input_tokens:,}")
print(f"Remaining capacity: {200000 - total_input_tokens:,} tokens")
Optimizing Token Usage for Web Scraping
1. HTML Preprocessing
Remove unnecessary content before sending to Claude:
import requests
import anthropic
from bs4 import BeautifulSoup, Comment

def clean_html_for_claude(html_content):
    """Remove scripts, styles, and other non-content elements"""
    soup = BeautifulSoup(html_content, 'html.parser')

    # Remove script, style, SVG, and noscript elements
    for element in soup(["script", "style", "svg", "noscript"]):
        element.decompose()

    # Remove HTML comments
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        comment.extract()

    # Return the simplified HTML
    return str(soup)

# Use with Claude API
client = anthropic.Anthropic(api_key="your-api-key")

html = requests.get("https://example.com/article").text
cleaned_html = clean_html_for_claude(html)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": f"Extract the article title, author, and publication date:\n\n{cleaned_html}"
        }
    ]
)

print(message.content[0].text)
2. Chunking Large Pages
For pages exceeding token limits, split content into chunks:
const Anthropic = require('@anthropic-ai/sdk');
const axios = require('axios');
const cheerio = require('cheerio');

async function scrapeInChunks(url) {
  const client = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY,
  });

  // Fetch and parse HTML
  const response = await axios.get(url);
  const $ = cheerio.load(response.data);

  // Split content into sections
  const sections = [];
  $('article section').each((i, section) => {
    sections.push($(section).html());
  });

  // Process each section sequentially
  const results = [];
  for (const section of sections) {
    const message = await client.messages.create({
      model: 'claude-3-haiku-20240307',
      max_tokens: 2048,
      messages: [{
        role: 'user',
        content: `Extract key information from this section:\n\n${section}`
      }]
    });
    results.push(message.content[0].text);
  }

  return results;
}

scrapeInChunks('https://example.com/long-article')
  .then(data => console.log(data));
3. Use Selective Extraction
Target specific elements instead of sending entire pages:
import requests
import anthropic
from bs4 import BeautifulSoup

def extract_product_sections(html):
    """Extract only product-related sections"""
    soup = BeautifulSoup(html, 'html.parser')

    # Find product containers
    products = soup.find_all('div', class_='product-card')

    # Combine into compact HTML, limited to the first 50 products
    return '\n'.join([str(p) for p in products[:50]])

client = anthropic.Anthropic(api_key="your-api-key")

html = requests.get("https://example.com/products").text
product_html = extract_product_sections(html)

response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=4096,  # Claude 3 Haiku caps output at 4,096 tokens
    messages=[{
        "role": "user",
        "content": f"Extract product names, prices, and ratings as JSON:\n\n{product_html}"
    }]
)

print(response.content[0].text)
Handling Token Limit Errors
When your request exceeds the context window, the API rejects it with an error. Here's one way to handle it:
import anthropic
from anthropic import APIError

def scrape_with_fallback(html_content, prompt):
    client = anthropic.Anthropic(api_key="your-api-key")

    models = [
        "claude-3-5-sonnet-20241022",
        "claude-3-haiku-20240307"
    ]

    for model in models:
        try:
            message = client.messages.create(
                model=model,
                max_tokens=4096,
                messages=[{
                    "role": "user",
                    "content": f"{prompt}\n\n{html_content}"
                }]
            )
            return message.content[0].text
        except APIError as e:
            # Heuristic: treat token/length errors as "prompt too large"
            if "token" in str(e).lower() or "too long" in str(e).lower():
                # Halve the content and move on to the next attempt
                html_content = html_content[:len(html_content) // 2]
                print("Reducing content size and retrying...")
            else:
                raise

    return None
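A quick usage sketch (the URL is a placeholder, and clean_html_for_claude is the preprocessing helper defined earlier):

import requests

html = requests.get("https://example.com/products").text
result = scrape_with_fallback(
    clean_html_for_claude(html),
    "Extract all product names and prices as JSON"
)
print(result)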
Cost Optimization Strategies
Token usage directly impacts costs. Here are strategies to optimize:
1. Use Markdown Instead of HTML
Converting HTML to Markdown strips tags and attributes, typically reducing token count by 40-60%:
from markdownify import markdownify as md
html = "<div><h1>Product Title</h1><p>Description here</p></div>"
markdown = md(html)
# Markdown uses fewer tokens than HTML
print(f"HTML length: {len(html)}")
print(f"Markdown length: {len(markdown)}")
2. Cache Common Prompts
Define a compact system prompt once and reuse it across requests:
def create_scraper_with_cache(client):
    """Create a scraper function with a reusable system prompt"""
    system_prompt = """You are a web scraping assistant.
Extract structured data from HTML and return as JSON.
Focus on accuracy and completeness."""

    def scrape(html_content, fields):
        return client.messages.create(
            model="claude-3-haiku-20240307",
            max_tokens=2048,
            system=system_prompt,
            messages=[{
                "role": "user",
                "content": f"Extract these fields: {fields}\n\nHTML:\n{html_content}"
            }]
        )

    return scrape
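Note that reusing the same system prompt string does not by itself reduce input tokens, since every request still sends it. If your account has access to Anthropic's prompt caching feature, you can mark the system prompt with cache_control so repeated requests read the cached prefix at a reduced input-token rate. A minimal sketch, assuming prompt caching is available on your plan and SDK version (the HTML below is a placeholder):

import anthropic

client = anthropic.Anthropic(api_key="your-api-key")

html_content = "<div class='product-card'>...</div>"  # placeholder HTML

system_prompt = """You are a web scraping assistant.
Extract structured data from HTML and return as JSON.
Focus on accuracy and completeness."""

message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=2048,
    system=[{
        "type": "text",
        "text": system_prompt,
        # Mark the prefix as cacheable; very short prompts may fall below the
        # minimum cacheable length and will simply not be cached
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{
        "role": "user",
        "content": f"Extract these fields: name, price\n\nHTML:\n{html_content}"
    }]
)

# When caching applies, usage reports cache creation/read token counts
print(message.usage)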
3. Batch Similar Requests
Process multiple similar pages in one request:
const Anthropic = require('@anthropic-ai/sdk');
const axios = require('axios');

async function batchScrapeProducts(urls) {
  const client = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY,
  });

  // Fetch all pages in parallel
  const pages = await Promise.all(
    urls.map(url => axios.get(url))
  );

  // Combine into one prompt, separating pages with a delimiter
  const combined = pages.map((page, i) =>
    `PAGE ${i + 1}:\n${page.data}`
  ).join('\n\n---\n\n');

  const message = await client.messages.create({
    model: 'claude-3-haiku-20240307',
    max_tokens: 4096,
    messages: [{
      role: 'user',
      content: `Extract product data from each page:\n\n${combined}`
    }]
  });

  return message.content[0].text;
}
Monitoring Token Usage
Track token consumption to optimize your scraping pipeline:
import anthropic
client = anthropic.Anthropic(api_key="your-api-key")
message = client.messages.create(
model="claude-3-5-sonnet-20241022",
max_tokens=1024,
messages=[{
"role": "user",
"content": "Extract data from this HTML: <html>...</html>"
}]
)
# Check token usage
usage = message.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Output tokens: {usage.output_tokens}")
print(f"Total tokens: {usage.input_tokens + usage.output_tokens}")
# Calculate cost (example rates)
input_cost = usage.input_tokens * 0.003 / 1000 # $0.003 per 1K tokens
output_cost = usage.output_tokens * 0.015 / 1000 # $0.015 per 1K tokens
print(f"Estimated cost: ${input_cost + output_cost:.6f}")
Best Practices for Token Management
- Preprocess HTML: Remove scripts, styles, and unnecessary attributes before sending to Claude
- Use Selective Selectors: Extract only relevant sections using CSS selectors or XPath
- Choose the Right Model: Use Claude Haiku for simple extractions to save tokens and costs
- Implement Chunking: Split large pages into manageable sections
- Monitor Usage: Track token consumption per request to identify optimization opportunities
- Cache Results: Store extracted data to avoid re-processing the same content
- Use Streaming: For large responses, use streaming to get partial results faster (see the sketch after this list)
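For the last point, the Python SDK supports streaming out of the box; a minimal sketch:

import anthropic

client = anthropic.Anthropic(api_key="your-api-key")

# Stream the response so extracted data arrives as it is generated
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=4096,
    messages=[{
        "role": "user",
        "content": "Extract data from this HTML: <html>...</html>"
    }]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)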
Comparing with Traditional Scraping
While Claude offers powerful AI-based extraction, traditional approaches such as handling AJAX requests with Puppeteer or targeting elements with CSS selectors can be more token-efficient for structured data. Consider using Claude when:
- Page structure varies significantly
- You need semantic understanding of content
- Traditional selectors are fragile or complex
- You're extracting data from dynamic single-page applications
Conclusion
Claude API's 200,000 token context window provides ample capacity for most web scraping tasks. By understanding token consumption, preprocessing HTML content, and implementing chunking strategies, you can efficiently extract data from even the largest web pages while managing costs effectively.
Remember that token limits affect both input (your prompts and HTML) and output (Claude's responses). Always monitor usage, optimize your preprocessing pipeline, and choose the appropriate Claude model based on your accuracy and cost requirements. When dealing with complex browser automation scenarios, consider combining Claude with Puppeteer's browser session handling to create a robust scraping solution.