How Many Examples Should I Include in My LLM Scraping Prompt?
When using Large Language Models (LLMs) for web scraping, the number of examples you include in your prompt significantly impacts the quality of extracted data, processing costs, and response times. The optimal range is typically 2-5 examples, with 3 examples being the sweet spot for most use cases.
The Golden Rule: 2-5 Examples
Based on extensive testing and real-world applications, here's the recommended approach:
- 2 examples: Minimum for establishing a pattern
- 3 examples: Optimal for most scenarios (recommended)
- 5 examples: Maximum before diminishing returns
- 6+ examples: Usually unnecessary and wasteful
Why 3 Examples Is Often Ideal
Three examples strike a strong balance across four factors:
- Pattern Recognition: Enough variation for the LLM to understand the extraction pattern
- Cost Efficiency: Minimizes token usage while maintaining accuracy
- Context Window: Leaves room for actual content to be scraped
- Processing Time: Keeps response times reasonable
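One way to see the token-cost and context-window trade-off concretely is to count the tokens each example adds. A minimal sketch using the tiktoken library (the instruction and example strings are placeholders for your own prompt):

import tiktoken

# gpt-4 family models use the cl100k_base encoding
encoding = tiktoken.encoding_for_model("gpt-4")

base_instructions = "Extract product information from the HTML below and return it as JSON."
example = ('HTML: <div class="product"><h2>Laptop Pro</h2><span>$1299</span></div>\n'
           'Output: {"name": "Laptop Pro", "price": 1299}')

base_tokens = len(encoding.encode(base_instructions))
per_example = len(encoding.encode(example))

for n in [1, 2, 3, 5, 7]:
    # Rough estimate: instructions + n examples (the HTML to scrape comes on top)
    print(f"{n} examples: ~{base_tokens + n * per_example} prompt tokens")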
Practical Example: Product Data Extraction
Let's compare different approaches when scraping product information:
One Example (Insufficient)
import openai

client = openai.OpenAI()  # reads OPENAI_API_KEY from the environment

prompt_template = """
Extract product information from the HTML below and return it as JSON.
Example:
HTML: <div class="product"><h2>Laptop Pro</h2><span>$1299</span></div>
Output: {"name": "Laptop Pro", "price": 1299}
Now extract from this HTML:
{html_content}
"""

# Interpolate with str.replace rather than str.format: the JSON braces
# in the example output would otherwise be parsed as format fields
prompt = prompt_template.replace("{html_content}", html_content)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}]
)
Problem: With only one example, the LLM might struggle with variations in HTML structure or edge cases.
Three Examples (Optimal)
prompt = """
Extract product information from the HTML below and return it as JSON.
Examples:
1. HTML: <div class="product"><h2>Laptop Pro</h2><span class="price">$1299</span></div>
Output: {"name": "Laptop Pro", "price": 1299, "currency": "USD"}
2. HTML: <div class="item"><h3>Wireless Mouse</h3><p class="cost">€29.99</p></div>
Output: {"name": "Wireless Mouse", "price": 29.99, "currency": "EUR"}
3. HTML: <article><div class="title">USB-C Cable</div><div class="pricing">¥1200</div></article>
Output: {"name": "USB-C Cable", "price": 1200, "currency": "JPY"}
Now extract from this HTML:
{html_content}
"""
Advantages: Shows different HTML structures, currencies, and element types while remaining concise.
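A minimal sketch of running this three-example prompt end to end, reusing the client from the one-example snippet. It assumes the model returns bare JSON, and the asserted keys come from the example outputs above:

import json

filled = prompt.replace("{html_content}", html_content)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": filled}],
    temperature=0.1,  # low temperature for consistent extraction
)

data = json.loads(response.choices[0].message.content)
# Sanity-check against the fields the examples establish
assert {"name", "price", "currency"} <= data.keys()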
Seven Examples (Excessive)
# Including 7+ examples in your prompt
prompt = """
Extract product information from the HTML below...
Example 1: ...
Example 2: ...
Example 3: ...
Example 4: ...
Example 5: ...
Example 6: ...
Example 7: ...
Now extract from: {html_content}
"""
Problems:
- Wastes tokens (and money)
- Increases processing time
- May hit context limits sooner
- Marginal accuracy improvement
JavaScript Implementation
Here's a practical Node.js example using OpenAI's API with optimal example count:
const OpenAI = require('openai');

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function scrapeWithLLM(htmlContent) {
  const prompt = `
Extract article metadata from HTML and return as JSON with fields: title, author, date, category.
Examples:
1. HTML: <article><h1>AI Trends 2024</h1><span class="author">John Doe</span><time>2024-01-15</time><span class="cat">Technology</span></article>
   Output: {"title": "AI Trends 2024", "author": "John Doe", "date": "2024-01-15", "category": "Technology"}
2. HTML: <div class="post"><h2>Climate Change</h2><p class="byline">By Jane Smith</p><p class="published">March 3, 2024</p><div class="topic">Science</div></div>
   Output: {"title": "Climate Change", "author": "Jane Smith", "date": "2024-03-03", "category": "Science"}
3. HTML: <main><header><h1>Cooking Tips</h1></header><div class="meta"><span>Author: Bob Chef</span><span>Date: 2024-02-20</span></div><aside>Food</aside></main>
   Output: {"title": "Cooking Tips", "author": "Bob Chef", "date": "2024-02-20", "category": "Food"}
Extract from:
${htmlContent}
`;

  const response = await openai.chat.completions.create({
    model: "gpt-4-turbo-preview",
    messages: [
      {
        role: "system",
        content: "You are a web scraping assistant. Always return valid JSON."
      },
      {
        role: "user",
        content: prompt
      }
    ],
    temperature: 0.1, // Low temperature for consistent extraction
    response_format: { type: "json_object" }
  });

  return JSON.parse(response.choices[0].message.content);
}

// Usage
const html = '<article>...your HTML here...</article>';
scrapeWithLLM(html).then(data => console.log(data));
When to Adjust the Example Count
Use 2 Examples When:
- The HTML structure is very consistent
- You're extracting simple, single-field data
- Token costs are a primary concern
- The pattern is straightforward (e.g., always the same tags)
Use 4-5 Examples When:
- HTML structures vary significantly across pages
- You're dealing with complex nested data
- Edge cases are common (missing fields, different formats)
- High accuracy is critical and worth the extra cost
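These guidelines can be encoded as a simple starting-point heuristic. The function and its inputs below are illustrative, not a standard API:

def choose_example_count(structure_varies: bool,
                         common_edge_cases: bool,
                         accuracy_critical: bool) -> int:
    """Pick a starting few-shot example count from the guidelines above."""
    if not structure_varies and not common_edge_cases:
        return 2  # consistent HTML, straightforward pattern
    if accuracy_critical:
        return 5  # high accuracy is worth the extra cost
    if structure_varies and common_edge_cases:
        return 4  # varied pages with messy data
    return 3      # the default sweet spot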
Cost-Benefit Analysis
Let's look at typical token consumption and costs (using GPT-4 pricing as a reference):
| Examples | Avg Tokens | Cost per 1K Requests | Accuracy |
|----------|------------|----------------------|----------|
| 1        | ~200       | $6                   | 75%      |
| 2        | ~350       | $10.50               | 85%      |
| 3        | ~500       | $15                  | 93%      |
| 5        | ~800       | $24                  | 95%      |
| 10       | ~1500      | $45                  | 96%      |
Note: These are approximate values based on typical scraping scenarios
The 3-example approach delivers 93% accuracy at $15 per 1,000 requests, while 10 examples only marginally improve accuracy to 96% but triple the cost.
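The per-request arithmetic behind these numbers is straightforward. A quick sketch using the table's reference rate of $0.03 per 1K input tokens:

GPT4_INPUT_PRICE = 0.03  # USD per 1K input tokens (reference rate used above)

def cost_per_1k_requests(avg_prompt_tokens: int) -> float:
    """Input-token cost of sending 1,000 extraction requests."""
    per_request = (avg_prompt_tokens / 1000) * GPT4_INPUT_PRICE
    return per_request * 1000

print(cost_per_1k_requests(500))   # 3 examples:  ~$15.00
print(cost_per_1k_requests(1500))  # 10 examples: ~$45.00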
Advanced Technique: Few-Shot Learning with Variation
When crafting your examples, ensure they demonstrate meaningful variation:
# Good: Shows different structures and edge cases
examples = [
    {
        "html": "<div><h1>Product A</h1><span>$50</span></div>",
        "output": {"name": "Product A", "price": 50}
    },
    {
        "html": "<article><h2>Product B</h2><p class='sale'>Was $100, now $75</p></article>",
        "output": {"name": "Product B", "price": 75, "original_price": 100}
    },
    {
        "html": "<section><div class='name'>Product C</div><div>Price: Contact Us</div></section>",
        "output": {"name": "Product C", "price": None}  # None serializes to JSON null
    }
]

# Bad: Repetitive examples with minimal variation
examples = [
    {"html": "<div><h1>Product A</h1><span>$50</span></div>", "output": {...}},
    {"html": "<div><h1>Product B</h1><span>$60</span></div>", "output": {...}},
    {"html": "<div><h1>Product C</h1><span>$70</span></div>", "output": {...}}
]
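A small helper can turn an example list like the "good" one above into the few-shot section of your prompt. The helper name and layout are just one possible convention:

import json

def build_prompt(examples, html_content,
                 task="Extract product information from the HTML below and return it as JSON."):
    """Assemble a few-shot extraction prompt from {html, output} example dicts."""
    lines = [task, "Examples:"]
    for i, ex in enumerate(examples, start=1):
        lines.append(f"{i}. HTML: {ex['html']}")
        lines.append(f"   Output: {json.dumps(ex['output'])}")
    lines.append("Now extract from this HTML:")
    lines.append(html_content)
    return "\n".join(lines)

print(build_prompt(examples, "<div><h1>Product D</h1><span>$80</span></div>"))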
Integration with Traditional Scraping
For optimal results, combine LLM-based extraction with traditional scraping methods. Use headless browsers to render JavaScript and navigate pages, then apply LLM extraction to the complex parts:
const puppeteer = require('puppeteer');

async function hybridScraping(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle0' });

  // Get rendered HTML
  const htmlContent = await page.content();

  // Extract with LLM (3 examples in prompt)
  const structuredData = await scrapeWithLLM(htmlContent);

  await browser.close();
  return structuredData;
}
Testing Different Example Counts
Here's a systematic approach to determine the optimal number for your specific use case:
import openai
from typing import List, Dict

def test_example_counts(html_samples: List[str], ground_truth: List[Dict]):
    """Test different numbers of examples to find the optimal count."""
    results = {}
    for num_examples in [1, 2, 3, 4, 5, 7, 10]:
        correct = 0
        total_tokens = 0
        for html, expected in zip(html_samples, ground_truth):
            # extract_with_n_examples is your own helper: it builds a prompt
            # with the first n examples and calls the chat completions API
            response = extract_with_n_examples(html, num_examples)
            total_tokens += response.usage.total_tokens
            if response.data == expected:
                correct += 1
        accuracy = correct / len(html_samples)
        avg_tokens = total_tokens / len(html_samples)
        results[num_examples] = {
            "accuracy": accuracy,
            "avg_tokens": avg_tokens,
            # (tokens/1K) * $0.03 per request, times 1,000 requests (GPT-4 input pricing)
            "cost_per_1k": (avg_tokens / 1000) * 0.03 * 1000
        }
    return results

# Analyze results to find the best balance
test_results = test_example_counts(my_html_samples, my_ground_truth)
for count, metrics in test_results.items():
    print(f"{count} examples: {metrics['accuracy']:.1%} accuracy, "
          f"${metrics['cost_per_1k']:.2f} per 1K requests")
Best Practices Summary
- Start with 3 examples as your baseline
- Ensure diversity in your examples (different structures, edge cases)
- Monitor accuracy and adjust if needed
- Track token usage to manage costs
- Include edge cases in your examples (null values, missing fields, unusual formats)
- Keep examples concise - remove unnecessary HTML attributes (see the sketch after this list)
- Test systematically before scaling to production
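For the "keep examples concise" point, a small cleanup pass with BeautifulSoup can strip attributes that only add tokens. Which attributes to keep is a judgment call; class and id are kept here because the few-shot examples rely on them:

from bs4 import BeautifulSoup

def slim_html(html: str, keep=("class", "id")) -> str:
    """Drop HTML attributes that waste tokens, keeping structural hints."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(True):
        tag.attrs = {k: v for k, v in tag.attrs.items() if k in keep}
    return str(soup)

print(slim_html('<div class="product" data-sku="123" style="color:red"><h2>Laptop Pro</h2></div>'))
# <div class="product"><h2>Laptop Pro</h2></div>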
When LLM Scraping Makes Sense
Understanding when to use an LLM for web scraping helps you decide whether few-shot prompting is the right approach. LLMs excel when:
- HTML structures vary significantly between pages
- You need to extract semantic meaning, not just text
- The data requires interpretation or normalization
- Traditional selectors would be too brittle
Conclusion
For most web scraping scenarios, 3 well-crafted examples provide the optimal balance of accuracy, cost, and performance. Start with three diverse examples that cover common patterns and edge cases, then adjust based on your specific accuracy requirements and budget constraints.
Remember that quality matters more than quantity - three highly relevant examples that demonstrate structural variation and edge cases will outperform ten repetitive examples every time.