How Does Firecrawl Handle API Throttling and Rate Limiting?
Firecrawl implements sophisticated API throttling and rate limiting mechanisms to ensure fair resource allocation, maintain service quality, and prevent abuse. Understanding how these systems work is crucial for building reliable web scraping applications that can handle large-scale data extraction without interruptions.
This comprehensive guide explains Firecrawl's internal rate limiting architecture, automatic retry mechanisms, response headers, and practical strategies for working within rate limits while maximizing scraping efficiency.
Understanding Firecrawl's Rate Limiting Architecture
Firecrawl employs a multi-layered rate limiting system that operates at different levels to ensure optimal performance and fair usage across all users.
Rate Limit Types
Firecrawl enforces several types of rate limits simultaneously:
1. Per-Second Request Limits
Controls the maximum number of requests you can make per second to prevent burst traffic:
# Example rate structure (actual limits vary by plan)
{
    'requests_per_second': 10,    # Instant burst control
    'requests_per_minute': 500,   # Sustained rate control
    'requests_per_hour': 10000    # Long-term quota
}
2. Concurrent Request Limits
Restricts the number of simultaneous active requests to manage server load:
// Firecrawl tracks concurrent requests per API key
{
  maxConcurrentRequests: 25,  // Professional plan example
  currentActive: 12,          // Currently processing
  queued: 0                   // Waiting in queue
}
3. Credit-Based Limits
Each API operation consumes credits from your monthly or annual allocation:
# Different operations consume different credit amounts
CREDIT_COSTS = {
    'scrape': 1,       # Single page scrape
    'crawl_page': 1,   # Each page in a crawl job
    'screenshot': 2,   # Screenshot generation
    'extract': 3,      # AI-powered extraction
}
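Using the illustrative costs above (actual credit pricing depends on your plan and may differ), you can roughly estimate what a job will consume before running it. This is a back-of-the-envelope sketch, not an official calculator:

# Estimate credits for a planned job using the illustrative CREDIT_COSTS above
def estimate_credits(pages=0, screenshots=0, extracts=0):
    return (
        pages * CREDIT_COSTS['crawl_page']
        + screenshots * CREDIT_COSTS['screenshot']
        + extracts * CREDIT_COSTS['extract']
    )

# A 500-page crawl with 50 screenshots and 20 AI extractions
print(estimate_credits(pages=500, screenshots=50, extracts=20))  # 660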
Token Bucket Algorithm
Firecrawl uses a token bucket algorithm for rate limiting, which allows for controlled bursts while maintaining average rate limits:
import time

class TokenBucket:
    """Simplified representation of Firecrawl's rate limiting"""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # Maximum tokens (burst capacity)
        self.tokens = capacity          # Current available tokens
        self.refill_rate = refill_rate  # Tokens added per second
        self.last_refill = time.time()

    def consume(self, tokens=1):
        """Attempt to consume tokens for a request"""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False

    def _refill(self):
        """Refill tokens based on elapsed time"""
        now = time.time()
        elapsed = now - self.last_refill
        # Add tokens based on time passed
        new_tokens = elapsed * self.refill_rate
        self.tokens = min(self.capacity, self.tokens + new_tokens)
        self.last_refill = now
This algorithm enables you to make quick bursts of requests up to the capacity while maintaining a sustainable rate over time.
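To make the burst behavior concrete, here is a small local illustration using the TokenBucket class above (it only simulates the algorithm; the real limiter runs on Firecrawl's side). With a capacity of 10 and a refill rate of 5 tokens per second, the first 10 calls pass immediately and later calls wait for refills:

# Local simulation using the TokenBucket class defined above
bucket = TokenBucket(capacity=10, refill_rate=5)  # burst of 10, sustained 5 req/s

for i in range(15):
    while not bucket.consume():
        time.sleep(0.05)  # bucket empty: wait for tokens to refill
    print(f"Request {i + 1} allowed")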
HTTP Response Headers for Rate Limiting
Firecrawl includes rate limiting information in HTTP response headers, allowing you to track your usage and adapt your scraping strategy accordingly.
Key Rate Limit Headers
When you make a request to Firecrawl, the response includes these headers:
X-RateLimit-Limit: 500 # Maximum requests per time window
X-RateLimit-Remaining: 487 # Requests remaining in current window
X-RateLimit-Reset: 1678901234 # Unix timestamp when limit resets
Retry-After: 60 # Seconds to wait (only on 429 errors)
Reading Rate Limit Headers in Code
Python Implementation:
import requests
from datetime import datetime

class RateLimitError(Exception):
    pass

class RateLimitAwareFirecrawl:
    """Firecrawl client that monitors rate limits via headers"""

    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = 'https://api.firecrawl.dev/v1'
        self.rate_limit_info = {}

    def scrape_with_headers(self, url, params=None):
        """Scrape URL and track rate limit headers"""
        headers = {
            'Authorization': f'Bearer {self.api_key}',
            'Content-Type': 'application/json'
        }
        payload = {'url': url}
        if params:
            payload.update(params)
        response = requests.post(
            f'{self.base_url}/scrape',
            headers=headers,
            json=payload
        )
        # Extract rate limit information
        self.rate_limit_info = {
            'limit': int(response.headers.get('X-RateLimit-Limit', 0)),
            'remaining': int(response.headers.get('X-RateLimit-Remaining', 0)),
            'reset': int(response.headers.get('X-RateLimit-Reset', 0)),
            'reset_time': datetime.fromtimestamp(
                int(response.headers.get('X-RateLimit-Reset', 0))
            )
        }
        # Log current rate limit status
        print(f"Rate Limit: {self.rate_limit_info['remaining']}/{self.rate_limit_info['limit']}")
        print(f"Resets at: {self.rate_limit_info['reset_time']}")
        # Handle rate limit errors
        if response.status_code == 429:
            retry_after = int(response.headers.get('Retry-After', 60))
            raise RateLimitError(f"Rate limited. Retry after {retry_after} seconds")
        response.raise_for_status()
        return response.json()

    def get_rate_limit_status(self):
        """Get current rate limit status"""
        return self.rate_limit_info
# Usage
app = RateLimitAwareFirecrawl(api_key='your_api_key')

try:
    result = app.scrape_with_headers('https://example.com')
    status = app.get_rate_limit_status()
    print(f"Requests remaining: {status['remaining']}")
except RateLimitError as e:
    print(f"Rate limit hit: {e}")
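Because the client above stores the latest header values, you can also pause proactively before hitting the limit rather than waiting for a 429. A minimal sketch, assuming X-RateLimit-Reset is a Unix timestamp as shown above and using an arbitrary threshold of 5 remaining requests:

import time

def wait_if_near_limit(client, threshold=5):
    """Sleep until the rate limit window resets if few requests remain."""
    info = client.get_rate_limit_status()
    if info and info.get('remaining', threshold) < threshold:
        sleep_for = max(0, info['reset'] - time.time())
        print(f"Near the rate limit; sleeping {sleep_for:.0f}s until reset")
        time.sleep(sleep_for)

# Call before each request in a long-running loop
wait_if_near_limit(app)
result = app.scrape_with_headers('https://example.com/another-page')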
JavaScript Implementation:
import axios from 'axios';

class RateLimitAwareFirecrawl {
  constructor(apiKey) {
    this.apiKey = apiKey;
    this.baseUrl = 'https://api.firecrawl.dev/v1';
    this.rateLimitInfo = {};
  }

  async scrapeWithHeaders(url, params = {}) {
    const response = await axios.post(
      `${this.baseUrl}/scrape`,
      { url, ...params },
      {
        headers: {
          'Authorization': `Bearer ${this.apiKey}`,
          'Content-Type': 'application/json'
        },
        validateStatus: status => status < 500 // Don't throw on 429
      }
    );

    // Extract rate limit information
    this.rateLimitInfo = {
      limit: parseInt(response.headers['x-ratelimit-limit'] || '0'),
      remaining: parseInt(response.headers['x-ratelimit-remaining'] || '0'),
      reset: parseInt(response.headers['x-ratelimit-reset'] || '0'),
      resetTime: new Date(parseInt(response.headers['x-ratelimit-reset'] || '0') * 1000)
    };

    console.log(`Rate Limit: ${this.rateLimitInfo.remaining}/${this.rateLimitInfo.limit}`);
    console.log(`Resets at: ${this.rateLimitInfo.resetTime.toLocaleString()}`);

    // Handle rate limit errors
    if (response.status === 429) {
      const retryAfter = parseInt(response.headers['retry-after'] || '60');
      throw new Error(`Rate limited. Retry after ${retryAfter} seconds`);
    }
    if (response.status >= 400) {
      throw new Error(`Request failed with status ${response.status}`);
    }

    return response.data;
  }

  getRateLimitStatus() {
    return this.rateLimitInfo;
  }
}

// Usage
const app = new RateLimitAwareFirecrawl('your_api_key');

try {
  const result = await app.scrapeWithHeaders('https://example.com');
  const status = app.getRateLimitStatus();
  console.log(`Requests remaining: ${status.remaining}`);
} catch (error) {
  console.error('Error:', error.message);
}
Automatic Retry and Backoff Mechanisms
Firecrawl's SDK includes built-in retry logic for handling transient failures and rate limits. Understanding how this works helps you configure optimal retry strategies.
Default Retry Behavior
The Firecrawl SDK automatically retries failed requests with exponential backoff:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

# The SDK automatically handles retries internally
# Default configuration (conceptual):
DEFAULT_RETRY_CONFIG = {
    'max_retries': 3,
    'base_delay': 1,         # Start with 1 second
    'max_delay': 60,         # Cap at 60 seconds
    'exponential_base': 2,   # Double each time
    'jitter': True           # Add randomization
}
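To see what these conceptual values imply, the delay grows as base_delay * exponential_base ** attempt, capped at max_delay. The short calculation below just prints that schedule:

# Illustrative only: print the backoff delays implied by the conceptual config above
config = DEFAULT_RETRY_CONFIG
for attempt in range(config['max_retries']):
    delay = min(config['base_delay'] * config['exponential_base'] ** attempt,
                config['max_delay'])
    print(f"Retry {attempt + 1}: wait ~{delay}s (plus jitter)")
# Retry 1: wait ~1s, Retry 2: wait ~2s, Retry 3: wait ~4s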
Custom Retry Implementation
For more control, you can implement your own retry logic, similar to handling timeouts in Puppeteer:
Python Advanced Retry Handler:
import time
import random
from functools import wraps

def retry_with_backoff(max_retries=5, base_delay=1, max_delay=120, backoff_factor=2):
    """Decorator for exponential backoff with jitter"""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            retries = 0
            while retries < max_retries:
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    error_msg = str(e).lower()
                    # Check if it's a rate limit error
                    if '429' not in error_msg and 'rate limit' not in error_msg:
                        # Not a rate limit error, don't retry
                        raise
                    retries += 1
                    if retries >= max_retries:
                        raise Exception(f"Max retries ({max_retries}) exceeded") from e
                    # Calculate delay with exponential backoff
                    delay = min(
                        base_delay * (backoff_factor ** (retries - 1)),
                        max_delay
                    )
                    # Add jitter (randomize ±25%)
                    jitter = delay * 0.25 * (random.random() * 2 - 1)
                    delay_with_jitter = delay + jitter
                    print(f"Retry {retries}/{max_retries} after {delay_with_jitter:.2f}s")
                    time.sleep(delay_with_jitter)
            raise Exception("Unexpected retry loop exit")
        return wrapper
    return decorator

# Usage
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

@retry_with_backoff(max_retries=5, base_delay=2, backoff_factor=2)
def scrape_url(url):
    """Scrape URL with automatic retry on rate limits"""
    return app.scrape_url(url)

# This will automatically retry with backoff on rate limit errors
result = scrape_url('https://example.com')
print("Successfully scraped:", result['metadata']['title'])
JavaScript Advanced Retry Handler:
import FirecrawlApp from '@mendable/firecrawl-js';

async function retryWithBackoff(
  fn,
  maxRetries = 5,
  baseDelay = 1000,
  maxDelay = 120000,
  backoffFactor = 2
) {
  let retries = 0;
  while (retries < maxRetries) {
    try {
      return await fn();
    } catch (error) {
      const errorMsg = error.message.toLowerCase();
      // Check if it's a rate limit error
      if (!errorMsg.includes('429') && !errorMsg.includes('rate limit')) {
        throw error;
      }
      retries++;
      if (retries >= maxRetries) {
        throw new Error(`Max retries (${maxRetries}) exceeded: ${error.message}`);
      }
      // Calculate delay with exponential backoff
      let delay = Math.min(
        baseDelay * Math.pow(backoffFactor, retries - 1),
        maxDelay
      );
      // Add jitter (randomize ±25%)
      const jitter = delay * 0.25 * (Math.random() * 2 - 1);
      delay = delay + jitter;
      console.log(`Retry ${retries}/${maxRetries} after ${(delay / 1000).toFixed(2)}s`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Unexpected retry loop exit');
}

// Usage
const app = new FirecrawlApp({ apiKey: 'your_api_key' });

async function scrapeUrl(url) {
  return retryWithBackoff(
    () => app.scrapeUrl(url),
    5,      // maxRetries
    2000,   // baseDelay (2 seconds)
    120000, // maxDelay (2 minutes)
    2       // backoffFactor
  );
}

const result = await scrapeUrl('https://example.com');
console.log('Successfully scraped:', result.metadata.title);
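Both handlers above back off on a fixed exponential schedule. If you call the REST API directly, as in the earlier header-reading example, you can instead honor the Retry-After value the server sends with a 429. A minimal Python sketch (endpoint and headers as shown earlier; error handling kept deliberately simple):

import time
import requests

def scrape_honoring_retry_after(url, api_key, max_attempts=5):
    """POST to /scrape, sleeping for the server-suggested Retry-After on 429s."""
    for attempt in range(max_attempts):
        response = requests.post(
            'https://api.firecrawl.dev/v1/scrape',
            headers={'Authorization': f'Bearer {api_key}'},
            json={'url': url}
        )
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        wait = int(response.headers.get('Retry-After', 60))
        print(f"429 received; waiting {wait}s as instructed (attempt {attempt + 1})")
        time.sleep(wait)
    raise RuntimeError(f"Still rate limited after {max_attempts} attempts")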
Adaptive Rate Limiting Strategies
Smart applications adapt their request rate based on server responses, preventing unnecessary delays while respecting limits.
Dynamic Rate Adjustment
import time
from collections import deque
from datetime import datetime, timedelta

class AdaptiveRateLimiter:
    """Dynamically adjust request rate based on API responses"""

    def __init__(self, initial_rate=10, min_rate=1, max_rate=50):
        self.current_rate = initial_rate  # Requests per second
        self.min_rate = min_rate
        self.max_rate = max_rate
        # Track recent request timings
        self.request_times = deque(maxlen=100)
        self.rate_limit_hits = deque(maxlen=10)

    def wait_if_needed(self):
        """Wait to maintain current rate limit"""
        now = datetime.now()
        # Remove requests older than 1 second
        cutoff = now - timedelta(seconds=1)
        while self.request_times and self.request_times[0] < cutoff:
            self.request_times.popleft()
        # Check if we need to wait
        if len(self.request_times) >= self.current_rate:
            # Calculate how long to wait
            oldest_request = self.request_times[0]
            wait_until = oldest_request + timedelta(seconds=1)
            wait_seconds = (wait_until - now).total_seconds()
            if wait_seconds > 0:
                time.sleep(wait_seconds)
        self.request_times.append(now)

    def record_rate_limit(self):
        """Record that we hit a rate limit"""
        self.rate_limit_hits.append(datetime.now())
        # Reduce rate if we're hitting limits frequently
        recent_hits = sum(
            1 for hit_time in self.rate_limit_hits
            if datetime.now() - hit_time < timedelta(minutes=1)
        )
        if recent_hits >= 3:
            # Hitting rate limits frequently, slow down
            self.current_rate = max(self.min_rate, self.current_rate * 0.7)
            print(f"Reducing rate to {self.current_rate:.1f} req/s due to rate limits")

    def record_success(self):
        """Record a successful request"""
        # If we haven't hit rate limits recently, we can speed up
        recent_hits = sum(
            1 for hit_time in self.rate_limit_hits
            if datetime.now() - hit_time < timedelta(minutes=2)
        )
        if recent_hits == 0 and len(self.request_times) >= 20:
            # No recent rate limits and consistent requests, can increase
            self.current_rate = min(self.max_rate, self.current_rate * 1.1)
            print(f"Increasing rate to {self.current_rate:.1f} req/s")

# Usage with Firecrawl
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')
limiter = AdaptiveRateLimiter(initial_rate=10, min_rate=2, max_rate=30)

def adaptive_scrape(url):
    """Scrape with adaptive rate limiting"""
    limiter.wait_if_needed()
    try:
        result = app.scrape_url(url)
        limiter.record_success()
        return result
    except Exception as e:
        if '429' in str(e) or 'rate limit' in str(e).lower():
            limiter.record_rate_limit()
            # Wait and retry
            time.sleep(30)
            return adaptive_scrape(url)
        raise

# Scrape multiple URLs with adaptive rate limiting
urls = ['https://example.com/page' + str(i) for i in range(100)]
for url in urls:
    result = adaptive_scrape(url)
    print(f"Scraped: {url}")
Working with Crawl Jobs and Rate Limits
When using Firecrawl's crawl functionality, rate limiting is handled automatically, but understanding the mechanics helps you optimize large-scale operations.
Crawl Job Rate Management
Firecrawl manages rate limits internally during crawl jobs, similar to handling browser sessions in Puppeteer:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

# Initiate a crawl job - Firecrawl handles rate limiting internally
crawl_result = app.crawl_url(
    'https://example.com',
    params={
        'limit': 500,
        'scrapeOptions': {
            'formats': ['markdown', 'html'],
            'onlyMainContent': True
        }
    },
    poll_interval=5  # Check status every 5 seconds
)

print(f"Crawl completed: {len(crawl_result['data'])} pages")
The crawl endpoint automatically:
- Respects your account's rate limits
- Queues pages for processing
- Retries failed pages
- Distributes load over time
Monitoring Crawl Progress
import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({ apiKey: 'your_api_key' });

async function monitoredCrawl(url, limit = 100) {
  // Start crawl without waiting
  const crawlId = await app.crawlUrl(url, {
    limit,
    scrapeOptions: {
      formats: ['markdown']
    }
  }, false); // Don't wait for completion

  console.log(`Crawl started with ID: ${crawlId}`);

  // Poll for status
  let isComplete = false;
  let pages = [];

  while (!isComplete) {
    await new Promise(resolve => setTimeout(resolve, 5000)); // Wait 5 seconds

    const status = await app.checkCrawlStatus(crawlId);
    console.log(`Status: ${status.status}`);
    console.log(`Pages completed: ${status.completed || 0}/${limit}`);

    if (status.status === 'completed') {
      isComplete = true;
      pages = status.data;
    } else if (status.status === 'failed') {
      throw new Error('Crawl failed');
    }
  }

  console.log(`Crawl completed: ${pages.length} pages`);
  return pages;
}
const results = await monitoredCrawl('https://example.com', 100);
Best Practices for Throttling Compliance
1. Implement Circuit Breakers
Prevent cascading failures when rate limits are consistently exceeded:
import time
from firecrawl import FirecrawlApp

class CircuitBreaker:
    """Stops requests after repeated rate limit failures"""

    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.failures = 0
        self.last_failure_time = None
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        """Execute function with circuit breaker protection"""
        # Check if circuit is open
        if self.state == 'OPEN':
            if time.time() - self.last_failure_time < self.timeout:
                raise Exception('Circuit breaker is OPEN - too many rate limit failures')
            else:
                self.state = 'HALF_OPEN'
                print("Circuit breaker entering HALF_OPEN state")
        try:
            result = func(*args, **kwargs)
            # Success - reset circuit breaker
            if self.state == 'HALF_OPEN':
                print("Circuit breaker closing after successful request")
                self.state = 'CLOSED'
                self.failures = 0
            return result
        except Exception as e:
            if '429' in str(e) or 'rate limit' in str(e).lower():
                self.failures += 1
                self.last_failure_time = time.time()
                if self.failures >= self.failure_threshold:
                    self.state = 'OPEN'
                    print(f"Circuit breaker OPEN after {self.failures} failures")
            raise

# Usage
breaker = CircuitBreaker(failure_threshold=5, timeout=120)
app = FirecrawlApp(api_key='your_api_key')

def safe_scrape(url):
    return breaker.call(app.scrape_url, url)

try:
    result = safe_scrape('https://example.com')
except Exception as e:
    print(f"Scraping failed: {e}")
2. Use Request Queuing
Queue requests to maintain steady throughput without exceeding limits:
import FirecrawlApp from '@mendable/firecrawl-js';

class RequestQueue {
  constructor(rateLimit = 10) {
    this.rateLimit = rateLimit; // Requests per second
    this.queue = [];
    this.processing = false;
  }

  async add(fn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ fn, resolve, reject });
      this.process();
    });
  }

  async process() {
    if (this.processing) return;
    this.processing = true;

    while (this.queue.length > 0) {
      const { fn, resolve, reject } = this.queue.shift();
      const startTime = Date.now();

      try {
        const result = await fn();
        resolve(result);
      } catch (error) {
        reject(error);
      }

      // Ensure we don't exceed rate limit
      const elapsed = Date.now() - startTime;
      const minInterval = 1000 / this.rateLimit;
      if (elapsed < minInterval) {
        await new Promise(r => setTimeout(r, minInterval - elapsed));
      }
    }

    this.processing = false;
  }
}

// Usage
const queue = new RequestQueue(10); // 10 requests per second
const app = new FirecrawlApp({ apiKey: 'your_api_key' });

async function queuedScrape(url) {
  return queue.add(() => app.scrapeUrl(url));
}

// All requests will be automatically queued and rate-limited
const urls = ['https://example.com/1', 'https://example.com/2', 'https://example.com/3'];
const results = await Promise.all(urls.map(url => queuedScrape(url)));
3. Monitor and Alert on Rate Limit Issues
Track rate limit metrics to proactively address issues:
import time
import logging
from datetime import datetime
from firecrawl import FirecrawlApp

class RateLimitMonitor:
    """Monitor and log rate limit metrics"""

    def __init__(self):
        self.total_requests = 0
        self.rate_limited_requests = 0
        self.total_wait_time = 0
        logging.basicConfig(
            filename=f'rate_limits_{datetime.now().strftime("%Y%m%d")}.log',
            level=logging.INFO
        )

    def record_request(self, success=True, rate_limited=False, wait_time=0):
        """Record request metrics"""
        self.total_requests += 1
        if rate_limited:
            self.rate_limited_requests += 1
            self.total_wait_time += wait_time
            logging.warning(
                f"Rate limit hit (#{self.rate_limited_requests}). "
                f"Waited {wait_time:.2f}s"
            )
        # Alert if rate limit rate is too high
        if self.total_requests > 100:
            rate_limit_percentage = (self.rate_limited_requests / self.total_requests) * 100
            if rate_limit_percentage > 10:
                logging.error(
                    f"HIGH RATE LIMIT RATE: {rate_limit_percentage:.1f}% "
                    f"({self.rate_limited_requests}/{self.total_requests})"
                )

    def get_stats(self):
        """Get current statistics"""
        return {
            'total_requests': self.total_requests,
            'rate_limited': self.rate_limited_requests,
            'rate_limit_percentage': (
                (self.rate_limited_requests / self.total_requests * 100)
                if self.total_requests > 0 else 0
            ),
            'avg_wait_time': (
                self.total_wait_time / self.rate_limited_requests
                if self.rate_limited_requests > 0 else 0
            )
        }

# Usage
monitor = RateLimitMonitor()
app = FirecrawlApp(api_key='your_api_key')

def monitored_scrape(url):
    start_time = time.time()
    try:
        result = app.scrape_url(url)
        monitor.record_request(success=True)
        return result
    except Exception as e:
        if '429' in str(e) or 'rate limit' in str(e).lower():
            wait_time = time.time() - start_time
            monitor.record_request(success=False, rate_limited=True, wait_time=wait_time)
        raise

# Print statistics periodically
stats = monitor.get_stats()
print(f"Rate limit stats: {stats}")
Conclusion
Firecrawl's throttling and rate limiting mechanisms are designed to ensure fair usage and optimal performance for all users. By understanding how these systems work—from token bucket algorithms to HTTP response headers and automatic retry logic—you can build robust web scraping applications that work efficiently within these constraints.
Key takeaways:
- Multi-layered limits: Firecrawl enforces per-second, per-minute, concurrent, and credit-based limits
- Response headers: Monitor X-RateLimit-* headers to track your usage in real-time
- Automatic retries: The SDK includes built-in exponential backoff for rate limit errors
- Adaptive strategies: Implement dynamic rate adjustment based on API responses
- Crawl jobs: Let Firecrawl manage rate limits automatically for large-scale crawling
- Best practices: Use circuit breakers, request queuing, and monitoring for production systems
By implementing these strategies and respecting Firecrawl's rate limits, you'll ensure reliable, efficient web scraping operations that scale effectively while maintaining service quality.