What is the Firecrawl Documentation and Where Can I Find It?
Firecrawl is a powerful web scraping and crawling API that transforms websites into clean, LLM-ready markdown or structured data. For developers looking to integrate Firecrawl into their applications, comprehensive documentation is essential. This guide covers everything you need to know about accessing and using Firecrawl's documentation resources.
Official Firecrawl Documentation
The primary source for Firecrawl documentation is the official Firecrawl docs website at docs.firecrawl.dev. This comprehensive resource provides detailed information about all Firecrawl features, API endpoints, and integration methods.
Key Documentation Sections
The Firecrawl documentation is organized into several main sections:
1. Getting Started
- Quick start guides for different programming languages
- Authentication and API key setup
- Basic usage examples
- Installation instructions for SDKs

2. API Reference
- Complete endpoint documentation
- Request/response schemas
- Parameter descriptions
- Error codes and handling

3. SDK Documentation
- Language-specific implementation guides
- Code examples and best practices
- Advanced configuration options

4. Features and Capabilities
- Web scraping options
- Crawling strategies
- Data extraction methods
- Output format customization
Firecrawl API Endpoints
Firecrawl provides several core API endpoints that are thoroughly documented:
Scrape Endpoint
The /scrape endpoint extracts data from a single URL and converts it to clean markdown or structured data.
Python Example:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

# Scrape a single page
result = app.scrape_url('https://example.com', {
    'formats': ['markdown', 'html'],
    'onlyMainContent': True
})

print(result['markdown'])
JavaScript Example:
import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({apiKey: 'your_api_key'});

// Scrape a single page
const result = await app.scrapeUrl('https://example.com', {
  formats: ['markdown', 'html'],
  onlyMainContent: true
});

console.log(result.markdown);
Crawl Endpoint
The /crawl endpoint recursively crawls multiple pages from a website, following links up to a specified depth.
Python Example:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

# Crawl a website
crawl_result = app.crawl_url('https://example.com', {
    'limit': 100,
    'scrapeOptions': {
        'formats': ['markdown']
    }
})

for page in crawl_result['data']:
    # each page's metadata uses the sourceURL field (see Response Formats)
    print(f"URL: {page['metadata']['sourceURL']}")
    print(f"Content: {page['markdown'][:200]}...")
JavaScript Example:
import FirecrawlApp from '@mendable/firecrawl-js';

const app = new FirecrawlApp({apiKey: 'your_api_key'});

// Crawl a website
const crawlResult = await app.crawlUrl('https://example.com', {
  limit: 100,
  scrapeOptions: {
    formats: ['markdown']
  }
});

crawlResult.data.forEach(page => {
  // each page's metadata uses the sourceURL field (see Response Formats)
  console.log(`URL: ${page.metadata.sourceURL}`);
  console.log(`Content: ${page.markdown.substring(0, 200)}...`);
});
Map Endpoint
The /map endpoint returns a list of all URLs found on a website without scraping their content.
Python Example:
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key='your_api_key')

# Map a website
map_result = app.map_url('https://example.com')

print(f"Found {len(map_result['links'])} links")
for link in map_result['links'][:10]:
    print(link)
GitHub Repository
The Firecrawl GitHub repository at github.com/mendableai/firecrawl is another valuable documentation resource. Here you'll find:
- Complete source code (for self-hosting)
- README with quick start instructions
- Examples directory with practical use cases
- Issue tracker for bug reports and feature requests
- Contributing guidelines
- Changelog and release notes
Exploring the Examples Directory
The GitHub repository contains an examples/ directory with practical implementations:
# Clone the repository
git clone https://github.com/mendableai/firecrawl.git
cd firecrawl/examples
# Browse examples for different use cases
ls -la
SDK-Specific Documentation
Firecrawl provides official SDKs for multiple programming languages, each with its own documentation:
Python SDK
Install via pip:
pip install firecrawl-py
The Python SDK documentation includes:
- Synchronous and asynchronous operations
- Error handling patterns
- Configuration options
- Type hints and IDE support
Advanced Python Example:
from firecrawl import FirecrawlApp
import asyncio

app = FirecrawlApp(api_key='your_api_key')

async def scrape_multiple_urls():
    urls = [
        'https://example.com/page1',
        'https://example.com/page2',
        'https://example.com/page3'
    ]
    # scrape_url is a synchronous call, so run each one in a worker
    # thread to scrape the pages concurrently
    tasks = [
        asyncio.to_thread(app.scrape_url, url, {'formats': ['markdown']})
        for url in urls
    ]
    return await asyncio.gather(*tasks)

# Run the async function
results = asyncio.run(scrape_multiple_urls())
JavaScript/TypeScript SDK
Install via npm:
npm install @mendable/firecrawl-js
The JavaScript SDK supports both CommonJS and ES modules, with full TypeScript definitions:
TypeScript Example:
import FirecrawlApp, { ScrapeResponse } from '@mendable/firecrawl-js';

const app = new FirecrawlApp({apiKey: process.env.FIRECRAWL_API_KEY!});

async function scrapeWithTypes(url: string): Promise<ScrapeResponse> {
  const result = await app.scrapeUrl(url, {
    formats: ['markdown', 'html'],
    onlyMainContent: true,
    waitFor: 2000
  });
  return result;
}
Other Language SDKs
Firecrawl also provides SDKs for:
- Go: github.com/mendableai/firecrawl-go
- Ruby: Available via RubyGems
- Rust: Community-maintained implementations
Check the official documentation for installation and usage instructions for each SDK.
API Reference Documentation
The API reference provides detailed information about every parameter and option:
Common Parameters
Scraping Options:
- formats: Array of output formats (markdown, html, rawHtml, links, screenshot)
- onlyMainContent: Boolean to extract only the main content
- includeTags: Array of HTML tags to include
- excludeTags: Array of HTML tags to exclude
- waitFor: Milliseconds to wait for JavaScript rendering
- timeout: Maximum time to wait for page load
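To illustrate how these options fit together, here is a sketch of a combined options object; the values chosen are arbitrary examples, not recommendations:

```python
# Example scrape options using the parameters listed above;
# the specific values are illustrative only.
scrape_options = {
    "formats": ["markdown", "html"],
    "onlyMainContent": True,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "waitFor": 2000,   # wait 2 seconds for JavaScript rendering
    "timeout": 30000,  # give up after 30 seconds
}

# Passed as the second argument to scrape_url, e.g.:
# result = app.scrape_url('https://example.com', scrape_options)
```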
Crawling Options:
- limit: Maximum number of pages to crawl
- maxDepth: Maximum crawl depth
- allowBackwardLinks: Whether to follow links to parent pages
- allowExternalLinks: Whether to follow links to external domains
- ignoreSitemap: Skip the sitemap when crawling
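A combined crawl configuration might look like the sketch below; the values are illustrative, and scrapeOptions nests the per-page scraping options described earlier:

```python
# Example crawl options using the parameters listed above;
# the specific values are illustrative only.
crawl_options = {
    "limit": 50,                  # stop after 50 pages
    "maxDepth": 2,                # follow links at most 2 hops deep
    "allowBackwardLinks": False,  # stay below the start URL
    "allowExternalLinks": False,  # stay on the same domain
    "ignoreSitemap": False,
    "scrapeOptions": {"formats": ["markdown"]},
}

# Passed as the second argument to crawl_url, e.g.:
# crawl_result = app.crawl_url('https://example.com', crawl_options)
```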
Response Formats
Understanding the response structure is crucial for processing scraped data effectively:
{
  "success": true,
  "data": {
    "markdown": "# Page Title\n\nPage content...",
    "html": "<html>...</html>",
    "metadata": {
      "title": "Page Title",
      "description": "Page description",
      "language": "en",
      "sourceURL": "https://example.com",
      "statusCode": 200
    },
    "links": ["https://example.com/page1", "https://example.com/page2"]
  }
}
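As a sketch, a small helper can validate this envelope before use; the field names below are taken from the sample response above, and the helper itself is hypothetical:

```python
def extract_page(response: dict) -> dict:
    """Pull the useful fields out of a successful scrape response."""
    if not response.get("success"):
        raise ValueError("scrape failed")
    data = response["data"]
    meta = data.get("metadata", {})
    return {
        "url": meta.get("sourceURL"),
        "title": meta.get("title"),
        "markdown": data.get("markdown", ""),
    }

# Using a response shaped like the sample above:
sample = {
    "success": True,
    "data": {
        "markdown": "# Page Title\n\nPage content...",
        "metadata": {"title": "Page Title",
                     "sourceURL": "https://example.com"},
    },
}
page = extract_page(sample)
```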
Learning Resources
Beyond the official documentation, several resources can help you master Firecrawl:
Video Tutorials
The Firecrawl YouTube channel features:
- Getting started tutorials
- Advanced use case demonstrations
- Integration examples with popular frameworks
- Performance optimization tips
Blog and Articles
The Mendable.ai blog publishes regular articles about:
- New feature announcements
- Best practices for web scraping
- Case studies and success stories
- Comparisons with other scraping tools
Community Resources
Discord Server: Join the Firecrawl community Discord for:
- Real-time support from developers
- Sharing use cases and solutions
- Feature discussions and feedback
- Networking with other Firecrawl users
Stack Overflow: Search for the firecrawl tag to find:
- Common issues and solutions
- Code examples from the community
- Best practices and patterns
Docker Documentation
For self-hosting Firecrawl, the documentation includes comprehensive Docker deployment guides:
# Pull the official Docker image
docker pull mendableai/firecrawl

# Run Firecrawl locally
docker run -p 3002:3002 \
  -e FIRECRAWL_API_KEY=your_api_key \
  mendableai/firecrawl
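To sanity-check a locally running instance, you can post a scrape request to it. The sketch below assumes the self-hosted API mirrors the hosted v1 scrape route on port 3002 (both the /v1/scrape path and the auth header are assumptions; check them against your deployment):

```python
import json
from urllib.request import Request, urlopen

# Build a scrape request against a locally running instance.
# The /v1/scrape path and port 3002 are assumptions based on the
# docker run command above; verify against your deployment.
payload = json.dumps({
    "url": "https://example.com",
    "formats": ["markdown"],
}).encode()

req = Request(
    "http://localhost:3002/v1/scrape",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer your_api_key",
    },
)

# Sending the request requires the container to be running:
# with urlopen(req) as resp:
#     print(json.load(resp)["data"]["markdown"])
```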
The Docker documentation covers:
- Environment variable configuration
- Volume mounting for persistent data
- Scaling with Docker Compose
- Production deployment considerations
Similar to using Puppeteer with Docker, Firecrawl's containerized deployment simplifies infrastructure management.
API Rate Limits and Quotas
The documentation clearly outlines rate limits for different subscription tiers:
- Free Tier: 500 credits/month
- Starter Plan: 10,000 credits/month
- Standard Plan: 100,000 credits/month
- Growth Plan: 500,000 credits/month
Each endpoint consumes a different number of credits:
- Scrape: 1 credit per page
- Crawl: 1 credit per page crawled
- Map: 1 credit per website
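With these per-endpoint costs, estimating monthly usage is simple arithmetic; the helpers below are hypothetical illustrations, not part of any SDK:

```python
# Hypothetical helpers: estimate monthly credit usage from the
# per-endpoint costs listed above (1 credit per scraped page,
# per crawled page, and per mapped site).
def estimate_credits(scraped_pages=0, crawled_pages=0, mapped_sites=0):
    return scraped_pages + crawled_pages + mapped_sites

def fits_plan(credits, monthly_quota):
    """Check an estimate against a plan's monthly credit quota."""
    return credits <= monthly_quota

# e.g. 100 scrapes, a 350-page crawl, and 5 maps cost 455 credits,
# which fits within the 500-credit free tier.
```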
Error Handling Documentation
Proper error handling is essential when building robust scraping solutions. The documentation provides detailed error codes:
from firecrawl import FirecrawlApp
from firecrawl.exceptions import FirecrawlException

app = FirecrawlApp(api_key='your_api_key')

try:
    result = app.scrape_url('https://example.com')
except FirecrawlException as e:
    if e.status_code == 402:
        print("Insufficient credits")
    elif e.status_code == 429:
        print("Rate limit exceeded")
    elif e.status_code == 500:
        print("Server error, retry later")
    else:
        print(f"Error: {e.message}")
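For the retriable errors above (429 rate limits, 500 server errors), exponential backoff is a common pattern. The wrapper below is a minimal sketch; its name and structure are illustrative, not part of the SDK:

```python
import time

def scrape_with_retry(scrape_fn, url, max_retries=3, base_delay=1.0,
                      retriable_codes=(429, 500)):
    """Call scrape_fn(url), retrying with exponential backoff when the
    raised exception carries a retriable status_code (rate limit or
    server error). Non-retriable errors are re-raised immediately."""
    for attempt in range(max_retries + 1):
        try:
            return scrape_fn(url)
        except Exception as exc:
            code = getattr(exc, "status_code", None)
            if code in retriable_codes and attempt < max_retries:
                time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
                continue
            raise

# e.g. scrape_with_retry(lambda u: app.scrape_url(u), 'https://example.com')
```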
Understanding how to handle errors in Puppeteer can also help you implement robust error handling patterns in your Firecrawl integrations.
OpenAPI Specification
For developers who prefer working with OpenAPI/Swagger specifications, Firecrawl provides:
- Complete OpenAPI 3.0 specification file
- Interactive API explorer
- Auto-generated client libraries
- Contract testing capabilities
Access the OpenAPI spec at: https://api.firecrawl.dev/openapi.json
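A quick way to explore the spec is to load it and list its documented operations. The helper below is a sketch; fetching the live file requires network access, so the fetch itself is shown commented out:

```python
import json
from urllib.request import urlopen

def list_endpoints(spec: dict) -> list:
    """Return 'METHOD /path' strings from an OpenAPI 3.0 spec dict."""
    return [
        f"{method.upper()} {path}"
        for path, operations in spec.get("paths", {}).items()
        for method in operations
    ]

# Fetching the published spec (URL from the docs above):
# spec = json.load(urlopen("https://api.firecrawl.dev/openapi.json"))
# print("\n".join(list_endpoints(spec)))
```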
Conclusion
The Firecrawl documentation ecosystem provides comprehensive resources for developers at all skill levels. Start with the official docs at docs.firecrawl.dev, explore the GitHub repository for examples, and leverage the SDK-specific documentation for your preferred programming language. With these resources, you'll be able to build powerful web scraping solutions that convert any website into clean, structured data ready for analysis or LLM processing.
For ongoing updates, subscribe to the Firecrawl newsletter, watch the GitHub repository, and join the community Discord to stay informed about new features, best practices, and integration techniques.