What is the Deepseek API Pricing Structure?
Deepseek offers competitive pricing for their API services, making it an attractive option for developers working on web scraping and data extraction projects. Understanding the pricing structure is essential for budgeting and optimizing your API usage effectively.
Deepseek Pricing Tiers
Deepseek provides a tiered pricing model based on the specific model you use and the number of tokens processed. As of 2025, Deepseek offers several models with different pricing structures:
Deepseek V3 Pricing
Deepseek V3 is the flagship model offering excellent performance for complex data extraction tasks:
- Input tokens: $0.27 per million tokens
- Cached input tokens: $0.027 per million tokens (90% discount)
- Output tokens: $1.10 per million tokens
This pricing makes Deepseek V3 one of the most cost-effective large language models available, especially when utilizing the caching feature for repeated web scraping tasks.
Deepseek R1 Pricing
Deepseek R1 is optimized for reasoning tasks and structured data extraction:
- Input tokens: $0.55 per million tokens
- Cached input tokens: $0.055 per million tokens (90% discount)
- Output tokens: $2.19 per million tokens
While slightly more expensive than V3, R1 provides enhanced reasoning capabilities that can be valuable for complex web scraping scenarios requiring logical inference.
Deepseek Chat Model Pricing
The Deepseek Chat model offers a balance between cost and performance:
- Input tokens: $0.14 per million tokens
- Cached input tokens: $0.014 per million tokens
- Output tokens: $0.28 per million tokens
Understanding Token-Based Pricing
Deepseek charges based on tokens, which are pieces of text processed by the API. For web scraping applications:
- Input tokens: The HTML content, your prompt instructions, and any examples you provide
- Output tokens: The extracted data returned by the API (typically structured as JSON)
- Cached tokens: Previously processed input (such as a repeated system prompt) that is billed at a 90% discount
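Putting these three rates together, the billed cost of a single request reduces to one formula. Here is a minimal sketch using the Deepseek V3 rates quoted above (the helper name and default rates are illustrative, not part of the API):

```python
def request_cost(prompt_tokens, cached_tokens, completion_tokens,
                 input_rate=0.27, cached_rate=0.027, output_rate=1.10):
    """Dollar cost of one request; rates are $ per 1M tokens.

    Cached tokens are a subset of prompt tokens and are billed at
    the discounted rate; the remainder pays the full input rate.
    """
    uncached = prompt_tokens - cached_tokens
    return (uncached * input_rate
            + cached_tokens * cached_rate
            + completion_tokens * output_rate) / 1_000_000

# A 4,000-token prompt with 1,000 cached tokens and a 300-token response
print(f"${request_cost(4000, 1000, 300):.6f}")  # → $0.001167
```

The same formula underlies every cost figure in the examples below; only the rates change per model.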
Token Estimation
On average:
- 1 token ≈ 4 characters in English
- 1 token ≈ ¾ of a word
- A typical web page (10KB HTML) ≈ 2,500-3,000 tokens
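These rules of thumb are easy to turn into a quick pre-flight estimator. This is a rough sketch: real counts depend on the model's tokenizer, so treat the 4-characters-per-token ratio as an approximation, not a billing guarantee:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 characters/token rule."""
    return max(1, len(text) // 4)

def estimate_page_tokens(html_kb: float) -> int:
    """Approximate tokens for an HTML page of the given size in KB."""
    return int(html_kb * 1024 / 4)

# A 10KB page lands in the 2,500-3,000 token ballpark
print(estimate_page_tokens(10))  # → 2560
```

Running your corpus through an estimator like this before scraping gives you a budget ceiling before the first API call is made.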
Comparing Costs with Other Providers
Deepseek's pricing is significantly more competitive than that of other major LLM providers:
| Provider | Model | Input ($/1M tokens) | Output ($/1M tokens) |
|-----------|-------------------|---------------------|----------------------|
| Deepseek | V3 | $0.27 | $1.10 |
| Deepseek | R1 | $0.55 | $2.19 |
| OpenAI | GPT-4 Turbo | $10.00 | $30.00 |
| Anthropic | Claude 3.5 Sonnet | $3.00 | $15.00 |
| Google | Gemini 1.5 Pro | $1.25 | $5.00 |
This makes Deepseek particularly attractive for high-volume web scraping projects where cost efficiency is critical.
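To see what the table means for a concrete job, you can plug the per-million rates into a small calculator. This sketch uses the figures from the table above; actual provider pricing changes over time, so verify current rates before budgeting:

```python
# $ per 1M tokens (input, output), taken from the comparison table
PRICING = {
    "Deepseek V3": (0.27, 1.10),
    "OpenAI GPT-4 Turbo": (10.00, 30.00),
    "Anthropic Claude 3.5 Sonnet": (3.00, 15.00),
    "Google Gemini 1.5 Pro": (1.25, 5.00),
}

def job_cost(pages, in_tokens_per_page, out_tokens_per_page, rates):
    """Total dollar cost for a scraping job of `pages` pages."""
    in_rate, out_rate = rates
    return pages * (in_tokens_per_page * in_rate
                    + out_tokens_per_page * out_rate) / 1_000_000

# 100,000 pages at ~3,000 input / 250 output tokens each
for provider, rates in PRICING.items():
    print(f"{provider}: ${job_cost(100_000, 3000, 250, rates):,.2f}")
```

At that volume the same job costs roughly $108 on Deepseek V3 versus $3,750 on GPT-4 Turbo, which is where the cost advantage for bulk scraping becomes hard to ignore.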
Practical Cost Examples for Web Scraping
Example 1: Product Data Extraction
Scraping 1,000 e-commerce product pages:
- Average HTML size: 15KB per page
- Input tokens: ~3,750 per page
- Output tokens: ~200 per page (structured JSON)
- Total cost with Deepseek V3:
- Input: (1,000 × 3,750 / 1,000,000) × $0.27 ≈ $1.01
- Output: (1,000 × 200 / 1,000,000) × $1.10 = $0.22
- Total: ~$1.23 for all 1,000 pages
```python
import requests
import json

def extract_product_data(html_content, api_key):
    """
    Extract product data using the Deepseek API.
    Cost-efficient approach for bulk scraping.
    """
    url = "https://api.deepseek.com/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": "deepseek-chat",
        "messages": [
            {
                "role": "system",
                "content": "Extract product information as JSON with fields: name, price, description, rating"
            },
            {
                "role": "user",
                "content": html_content
            }
        ],
        "temperature": 0.1,
        "max_tokens": 500
    }

    response = requests.post(url, headers=headers, json=payload)
    result = response.json()

    # Track token usage for cost monitoring
    usage = result.get("usage", {})
    print(f"Tokens used - Input: {usage.get('prompt_tokens')}, Output: {usage.get('completion_tokens')}")

    return json.loads(result["choices"][0]["message"]["content"])

# Example usage
api_key = "your_deepseek_api_key"
with open("product_page.html", "r") as f:
    html = f.read()

product_data = extract_product_data(html, api_key)
print(json.dumps(product_data, indent=2))
```
Example 2: News Article Scraping
Scraping 10,000 news articles for sentiment analysis:
- Average HTML size: 25KB per article
- Input tokens: ~6,250 per article
- Output tokens: ~300 per article
- Total cost with Deepseek V3:
- Input: (10,000 × 6,250 / 1,000,000) × $0.27 ≈ $16.88
- Output: (10,000 × 300 / 1,000,000) × $1.10 = $3.30
- Total: ~$20.18 for all 10,000 articles
```javascript
const axios = require('axios');

async function extractArticleData(htmlContent, apiKey) {
  const url = 'https://api.deepseek.com/v1/chat/completions';
  const payload = {
    model: 'deepseek-chat',
    messages: [
      {
        role: 'system',
        content: 'Extract article metadata as JSON: title, author, publish_date, summary, sentiment (positive/negative/neutral)'
      },
      {
        role: 'user',
        content: htmlContent
      }
    ],
    temperature: 0.1,
    max_tokens: 600
  };

  try {
    const response = await axios.post(url, payload, {
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json'
      }
    });

    const usage = response.data.usage;
    console.log(`Tokens - Input: ${usage.prompt_tokens}, Output: ${usage.completion_tokens}`);
    console.log(`Estimated cost: $${(usage.prompt_tokens * 0.14 + usage.completion_tokens * 0.28) / 1000000}`);

    return JSON.parse(response.data.choices[0].message.content);
  } catch (error) {
    console.error('API Error:', error.response?.data || error.message);
    throw error;
  }
}

// Example usage
const fs = require('fs').promises;

(async () => {
  const apiKey = 'your_deepseek_api_key';
  const html = await fs.readFile('article.html', 'utf-8');
  const articleData = await extractArticleData(html, apiKey);
  console.log(JSON.stringify(articleData, null, 2));
})();
```
Optimizing Costs with Caching
Deepseek offers a 90% discount on cached input tokens, which is particularly valuable for web scraping when:
- Scraping similar pages: Product listings, search results, or category pages with similar structures
- Using consistent prompts: Reusing the same extraction instructions across multiple pages
- Batch processing: Processing multiple pages in sequence where the system prompt remains constant
Caching Example
```python
import requests
import time

def scrape_with_caching(pages, api_key):
    """
    Leverage caching for cost-efficient bulk scraping.
    The first request incurs full cost; subsequent requests get a
    90% discount on the cached portion of the prompt.
    """
    url = "https://api.deepseek.com/v1/chat/completions"

    # System prompt will be cached after first use
    system_prompt = """Extract the following fields from the HTML:
- product_name
- price
- availability
- rating
Return as JSON only."""

    results = []
    total_cost = 0

    for idx, page_html in enumerate(pages):
        payload = {
            "model": "deepseek-chat",
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": page_html}
            ],
            "max_tokens": 400
        }

        response = requests.post(
            url,
            headers={"Authorization": f"Bearer {api_key}"},
            json=payload
        )
        data = response.json()
        usage = data.get("usage", {})

        # Calculate cost considering cached tokens; Deepseek reports
        # cache hits in the usage field prompt_cache_hit_tokens
        prompt_tokens = usage.get("prompt_tokens", 0)
        cached_tokens = usage.get("prompt_cache_hit_tokens", 0)
        completion_tokens = usage.get("completion_tokens", 0)

        # Cached tokens are 90% cheaper
        request_cost = (
            (prompt_tokens - cached_tokens) * 0.14 / 1_000_000 +
            cached_tokens * 0.014 / 1_000_000 +
            completion_tokens * 0.28 / 1_000_000
        )
        total_cost += request_cost

        print(f"Page {idx + 1}: ${request_cost:.6f} (Cached: {cached_tokens} tokens)")
        results.append(data["choices"][0]["message"]["content"])
        time.sleep(0.1)  # Rate limiting

    print(f"\nTotal cost for {len(pages)} pages: ${total_cost:.4f}")
    return results
```
Cost Optimization Strategies
1. Minimize Input Size
Strip unnecessary HTML elements before sending to the API:
```python
from bs4 import BeautifulSoup

def clean_html(html_content):
    """Remove scripts, styles, and boilerplate to reduce token count"""
    soup = BeautifulSoup(html_content, 'html.parser')

    # Remove unwanted elements
    for element in soup(['script', 'style', 'nav', 'footer', 'header']):
        element.decompose()

    # Get only the main content
    main_content = soup.find('main') or soup.find('article') or soup.body
    return str(main_content) if main_content else str(soup)

# Reduces token count by 40-60% on average
cleaned_html = clean_html(raw_html)
```
2. Use Batch Processing
Process multiple items in a single API call when appropriate:
```javascript
const axios = require('axios');

async function batchExtract(items, apiKey) {
  // Combine multiple small extractions into one request
  const batchPrompt = items.map((item, idx) =>
    `Item ${idx + 1}: ${item}`
  ).join('\n\n');

  // Single API call instead of multiple
  const response = await axios.post('https://api.deepseek.com/v1/chat/completions', {
    model: 'deepseek-chat',
    messages: [
      { role: 'system', content: 'Extract data from each item and return as JSON array' },
      { role: 'user', content: batchPrompt }
    ]
  }, {
    headers: { 'Authorization': `Bearer ${apiKey}` }
  });

  return response.data;
}
```
3. Set Appropriate Token Limits
Control output token usage by setting `max_tokens`:

```python
# For simple extractions
payload = {
    "model": "deepseek-chat",
    "max_tokens": 200,  # Limit output to reduce costs
    "messages": [...]
}

# For complex extractions
payload = {
    "model": "deepseek-chat",
    "max_tokens": 1000,  # Allow more detailed responses
    "messages": [...]
}
```
Monitoring and Tracking Costs
Implement cost tracking in your scraping pipeline:
```python
class DeepseekCostTracker:
    def __init__(self, model="deepseek-chat"):
        self.model = model
        self.total_input_tokens = 0
        self.total_output_tokens = 0
        self.total_cached_tokens = 0

        # Pricing per model ($ per 1M tokens)
        self.pricing = {
            "deepseek-chat": {"input": 0.14, "output": 0.28, "cached": 0.014},
            "deepseek-v3": {"input": 0.27, "output": 1.10, "cached": 0.027},
            "deepseek-r1": {"input": 0.55, "output": 2.19, "cached": 0.055}
        }

    def add_request(self, usage_data):
        self.total_input_tokens += usage_data.get("prompt_tokens", 0)
        self.total_output_tokens += usage_data.get("completion_tokens", 0)
        # Deepseek reports cache hits as prompt_cache_hit_tokens
        self.total_cached_tokens += usage_data.get("prompt_cache_hit_tokens", 0)

    def get_total_cost(self):
        prices = self.pricing[self.model]
        # Tokens billed at the full input rate (excluding cached)
        actual_input = self.total_input_tokens - self.total_cached_tokens
        cost = (
            actual_input * prices["input"] / 1_000_000 +
            self.total_cached_tokens * prices["cached"] / 1_000_000 +
            self.total_output_tokens * prices["output"] / 1_000_000
        )
        return cost

    def print_summary(self):
        print("\n=== Cost Summary ===")
        print(f"Model: {self.model}")
        print(f"Input tokens: {self.total_input_tokens:,}")
        print(f"Cached tokens: {self.total_cached_tokens:,}")
        print(f"Output tokens: {self.total_output_tokens:,}")
        print(f"Total cost: ${self.get_total_cost():.4f}")
        print("====================\n")

# Usage
tracker = DeepseekCostTracker("deepseek-chat")
for page in pages:
    response = scrape_page(page)
    tracker.add_request(response["usage"])
tracker.print_summary()
```
When to Choose Deepseek for Web Scraping
Deepseek's pricing makes it ideal for:
- High-volume scraping: Processing thousands or millions of pages
- Structured data extraction: Converting HTML to JSON with consistent schemas
- Sentiment analysis: Analyzing content from scraped pages
- Content summarization: Creating summaries of long articles or reviews
- Multi-language extraction: Processing content in various languages
For modern browser automation workflows that require both JavaScript rendering and AI-powered extraction, combining tools like Puppeteer with Deepseek can provide excellent results at minimal cost.
Conclusion
Deepseek offers one of the most competitive pricing structures in the LLM market, making it an excellent choice for web scraping and data extraction projects. With input rates starting at $0.14 per million tokens for the Chat model ($0.27 for Deepseek V3), and the ability to cache frequently used prompts at a 90% discount, developers can build cost-effective, scalable scraping solutions.
By implementing optimization strategies like HTML cleaning, caching, and batch processing, you can further reduce costs while maintaining high-quality data extraction. Always monitor your token usage and track costs to ensure your scraping projects remain within budget.
For developers looking to integrate AI-powered extraction into their browser automation pipelines, Deepseek's affordable pricing enables experimentation and scaling without breaking the bank.