Where Can I Find the Deepseek Documentation for API Integration?
The official Deepseek API documentation is available at platform.deepseek.com/api-docs and api-docs.deepseek.com. This comprehensive resource provides detailed information about API endpoints, authentication methods, request/response formats, and model specifications for integrating Deepseek's large language models into your applications.
Official Deepseek Documentation Resources
Primary Documentation Hub
The main Deepseek documentation portal is hosted at platform.deepseek.com, where you can find:
- API Reference: Complete endpoint specifications and parameters
- Authentication Guide: API key generation and management
- Model Documentation: Detailed information about DeepSeek-V3, DeepSeek-R1, and other models
- Rate Limits and Pricing: Usage quotas and cost structures
- SDKs and Libraries: Official client libraries for various programming languages
- Code Examples: Sample implementations in Python, JavaScript, and other languages
Alternative Documentation Sources
In addition to the official platform, Deepseek provides documentation through:
- GitHub Repository: github.com/deepseek-ai - Contains open-source models, tools, and example projects
- API Documentation Site: api-docs.deepseek.com - Dedicated API reference with interactive examples
- Developer Community: Discord and GitHub Discussions for community support and updates
Getting Started with Deepseek API
Obtaining API Credentials
Before you can integrate the Deepseek API, you need an API key:
- Visit platform.deepseek.com
- Create an account or sign in
- Navigate to the API Keys section in your dashboard
- Generate a new API key
- Store your API key securely (never commit it to version control)
Basic API Integration Example (Python)
Here's a simple example of using the Deepseek API for web scraping data extraction:
import requests
import json

# Configuration
DEEPSEEK_API_KEY = "your_api_key_here"
DEEPSEEK_API_URL = "https://api.deepseek.com/v1/chat/completions"

def extract_data_with_deepseek(html_content, extraction_prompt):
    """
    Use Deepseek to extract structured data from HTML content
    """
    headers = {
        "Authorization": f"Bearer {DEEPSEEK_API_KEY}",
        "Content-Type": "application/json"
    }
    payload = {
        "model": "deepseek-chat",
        "messages": [
            {
                "role": "system",
                "content": "You are a web scraping assistant. Extract structured data from HTML."
            },
            {
                "role": "user",
                "content": f"{extraction_prompt}\n\nHTML:\n{html_content}"
            }
        ],
        "temperature": 0.1,
        "max_tokens": 2000,
        "response_format": {"type": "json_object"}
    }
    response = requests.post(DEEPSEEK_API_URL, headers=headers, json=payload)
    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage for web scraping
html_snippet = """
<div class="product">
    <h2>Wireless Headphones</h2>
    <span class="price">$89.99</span>
    <p class="rating">4.5 stars (1,234 reviews)</p>
</div>
"""
extraction_prompt = """
Extract the following fields from the product HTML:
- product_name
- price
- rating
- review_count
Return as JSON.
"""
result = extract_data_with_deepseek(html_snippet, extraction_prompt)
print(json.loads(result))
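The reply arrives as a string, and the model may still omit a field you asked for. The following is a minimal sketch of a validation step, not part of the Deepseek API: the helper name parse_extraction and the REQUIRED_FIELDS tuple are assumptions introduced for illustration, using the four field names from the prompt above.

```python
import json

# Hypothetical helper (not part of the Deepseek API): parse the model's
# reply and report which expected fields are absent.
REQUIRED_FIELDS = ("product_name", "price", "rating", "review_count")

def parse_extraction(raw_content, required_fields=REQUIRED_FIELDS):
    """Return (data, missing), where missing lists absent field names."""
    try:
        data = json.loads(raw_content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    missing = [name for name in required_fields if name not in data]
    return data, missing

# Illustrative reply with one field left out
sample_reply = '{"product_name": "Wireless Headphones", "price": "$89.99", "rating": 4.5}'
data, missing = parse_extraction(sample_reply)
print(data["product_name"])
print(missing)
```

Checking for missing fields before storing the result makes it easier to decide whether to retry the request or flag the page for manual review.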
JavaScript/Node.js Integration
For JavaScript developers, here's how to integrate the Deepseek API:
const axios = require('axios');

const DEEPSEEK_API_KEY = 'your_api_key_here';
const DEEPSEEK_API_URL = 'https://api.deepseek.com/v1/chat/completions';

async function extractDataWithDeepseek(htmlContent, extractionPrompt) {
  try {
    const response = await axios.post(
      DEEPSEEK_API_URL,
      {
        model: 'deepseek-chat',
        messages: [
          {
            role: 'system',
            content: 'You are a web scraping assistant. Extract structured data from HTML.'
          },
          {
            role: 'user',
            content: `${extractionPrompt}\n\nHTML:\n${htmlContent}`
          }
        ],
        temperature: 0.1,
        max_tokens: 2000,
        response_format: { type: 'json_object' }
      },
      {
        headers: {
          'Authorization': `Bearer ${DEEPSEEK_API_KEY}`,
          'Content-Type': 'application/json'
        }
      }
    );
    return JSON.parse(response.data.choices[0].message.content);
  } catch (error) {
    console.error('Deepseek API Error:', error.response?.data || error.message);
    throw error;
  }
}

// Example usage
const htmlContent = `
<article class="blog-post">
  <h1>Understanding AI in Web Scraping</h1>
  <span class="author">John Doe</span>
  <time>2025-01-15</time>
</article>
`;
const prompt = 'Extract title, author, and publication date as JSON';
extractDataWithDeepseek(htmlContent, prompt)
  .then(data => console.log(data))
  .catch(error => console.error(error));
Key API Endpoints and Parameters
Chat Completions Endpoint
The primary endpoint for the Deepseek API is /v1/chat/completions, which follows the OpenAI-compatible format:
Endpoint: POST https://api.deepseek.com/v1/chat/completions
Key Parameters:
- model (string, required): Model identifier (e.g., "deepseek-chat", "deepseek-coder")
- messages (array, required): Conversation history with role and content
- temperature (float, optional): Controls randomness (0.0-2.0, default: 1.0)
- max_tokens (integer, optional): Maximum tokens in response
- top_p (float, optional): Nucleus sampling parameter
- frequency_penalty (float, optional): Reduces repetition (-2.0 to 2.0)
- presence_penalty (float, optional): Encourages new topics (-2.0 to 2.0)
- response_format (object, optional): Specify JSON output format
- stream (boolean, optional): Enable streaming responses
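To catch out-of-range values before they reach the API, the parameters can be assembled by a small builder. This is a sketch under stated assumptions: build_chat_payload is a hypothetical helper, and the ranges it enforces are the ones listed above rather than limits confirmed against the live service.

```python
def build_chat_payload(messages, model="deepseek-chat", temperature=1.0,
                       max_tokens=None, frequency_penalty=0.0,
                       presence_penalty=0.0, json_output=False, stream=False):
    """Hypothetical helper: build a chat completions request body,
    rejecting values outside the ranges listed in this guide."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    for name, value in (("frequency_penalty", frequency_penalty),
                        ("presence_penalty", presence_penalty)):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    payload = {
        "model": model,
        "messages": list(messages),
        "temperature": temperature,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "stream": stream,
    }
    # Optional fields are only sent when the caller asks for them
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if json_output:
        payload["response_format"] = {"type": "json_object"}
    return payload

payload = build_chat_payload(
    [{"role": "user", "content": "Extract the title as JSON"}],
    temperature=0.1,
    max_tokens=2000,
    json_output=True,
)
print(sorted(payload))
```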
Response Format
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "deepseek-chat",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"product_name\": \"Wireless Headphones\", \"price\": 89.99}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 50,
    "total_tokens": 200
  }
}
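A response like the one above can be reduced to the fields a scraper usually needs. The sketch below assumes the OpenAI-compatible shape shown here; summarize_completion is a hypothetical helper, and treating a finish_reason of "length" as truncation follows the OpenAI convention.

```python
import json

# The sample response from this section, as a Python dict
sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1704067200,
    "model": "deepseek-chat",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"product_name\": \"Wireless Headphones\", \"price\": 89.99}",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 150, "completion_tokens": 50, "total_tokens": 200},
}

def summarize_completion(response_body):
    """Hypothetical helper: pull out the reply text, a truncation flag,
    and the token total from a chat completion response."""
    choice = response_body["choices"][0]
    usage = response_body.get("usage", {})
    return {
        "content": choice["message"]["content"],
        # "length" means the reply was cut off at max_tokens
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
    }

summary = summarize_completion(sample_response)
print(json.loads(summary["content"]))
print(summary["truncated"], summary["total_tokens"])
```

A truncated reply is often invalid JSON, so checking the flag before calling json.loads gives a clearer error than a parse failure.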
Available Deepseek Models
DeepSeek-V3
The latest and most powerful model, optimized for complex reasoning and large context windows:
- Model ID: deepseek-chat
- Context Window: 64K tokens
- Strengths: Advanced reasoning, code generation, multilingual support
- Use Case: Complex data extraction from unstructured web content
DeepSeek-R1
Specialized reasoning model with chain-of-thought capabilities:
- Model ID: deepseek-reasoner
- Context Window: 64K tokens
- Strengths: Multi-step reasoning, problem-solving
- Use Case: Analyzing complex web page structures and extracting nested data
DeepSeek-Coder
Optimized for code generation and technical tasks:
- Model ID: deepseek-coder
- Context Window: 16K tokens
- Strengths: Code understanding, technical documentation
- Use Case: Generating scraping scripts and parsing logic
Integration with Web Scraping Workflows
Combining Deepseek with Traditional Scrapers
Deepseek pairs well with traditional scraping tools: a parser fetches and trims the page, and the model extracts the fields. Here's an example workflow:
import requests
from bs4 import BeautifulSoup

def scrape_and_extract(url):
    # Step 1: Fetch HTML content (fail fast on timeouts and HTTP errors)
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    html_content = response.text

    # Step 2: Pre-process with BeautifulSoup (optional)
    soup = BeautifulSoup(html_content, 'html.parser')
    main_content = soup.find('main') or soup.find('article')
    cleaned_html = str(main_content) if main_content else html_content

    # Step 3: Extract data using Deepseek
    # (extract_data_with_deepseek is defined in the Python example above)
    prompt = """
    Extract all product information including:
    - Name
    - Price
    - Description
    - Availability
    - SKU
    Return as structured JSON.
    """
    extracted_data = extract_data_with_deepseek(cleaned_html, prompt)
    return extracted_data
This hybrid approach leverages the speed of traditional parsing for HTML cleanup while using Deepseek's intelligence for complex data extraction. When working with dynamic content, you might need to use browser automation tools to handle AJAX requests before passing the HTML to Deepseek.
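The cleanup step can go further than selecting the main element: dropping scripts and styles shrinks the prompt and the token bill. The sketch below uses only the standard library's html.parser (a substitute for BeautifulSoup, chosen so it runs without dependencies). VisibleTextExtractor and visible_text are hypothetical names, and sending text only, without tags, is an assumption that suits pages where markup carries little meaning.

```python
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside script/style/noscript."""
    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)

def visible_text(html):
    """Return the page's visible text, one text node per line."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

page = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<script>track()</script>"
    "<h2>Wireless Headphones</h2><span class='price'>$89.99</span>"
    "</body></html>"
)
print(visible_text(page))
```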
Authentication Best Practices
Secure API Key Management
Never hardcode API keys in your source code. Instead, use environment variables:
import os
from dotenv import load_dotenv

load_dotenv()
DEEPSEEK_API_KEY = os.getenv('DEEPSEEK_API_KEY')
if not DEEPSEEK_API_KEY:
    raise ValueError("DEEPSEEK_API_KEY environment variable not set")
Create a .env file (and add it to .gitignore):
DEEPSEEK_API_KEY=your_actual_api_key_here
Rate Limiting and Error Handling
Implement proper rate limiting and retry logic:
import time
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retries():
    session = requests.Session()
    retry = Retry(
        total=3,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
        # POST is not retried by default; opt in explicitly (urllib3 >= 1.26)
        allowed_methods=["POST"]
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

def rate_limited_request(session, url, headers, payload, requests_per_minute=20):
    response = session.post(url, headers=headers, json=payload)
    time.sleep(60 / requests_per_minute)  # Rate limiting
    return response
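The fixed sleep in rate_limited_request waits the full interval even when the request itself already took that long. One alternative, sketched below, sleeps only for whatever remains of the interval. MinIntervalLimiter is a hypothetical class, and the injectable clock and sleep arguments exist only so the behavior can be checked without real waiting.

```python
import time

class MinIntervalLimiter:
    """Hypothetical limiter: keeps calls at least 60/requests_per_minute
    seconds apart, sleeping only for the time still owed."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until the interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.interval - (now - self._last_call))
            if delay:
                self._sleep(delay)
        # Record when this call is released, including any time slept
        self._last_call = now + delay
        return delay

limiter = MinIntervalLimiter(requests_per_minute=20)
print(limiter.interval)
print(limiter.wait())
```

Calling limiter.wait() immediately before each session.post(...) keeps requests spaced out without penalizing slow responses.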
Pricing and Usage Limits
According to the official documentation, Deepseek offers competitive pricing:
- DeepSeek-V3: $0.27 per million input tokens, $1.10 per million output tokens
- DeepSeek-R1: Similar pricing structure with reasoning tokens counted separately
- Free Tier: Available with limited monthly quotas for testing
Always check the current pricing at platform.deepseek.com/pricing as rates may change.
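For budgeting, token counts from the usage field can be turned into a rough cost. The sketch below hardcodes the DeepSeek-V3 rates quoted above, which may be out of date; estimate_cost_usd is a hypothetical helper, and the per-page token counts in the example are illustrative, not measurements.

```python
from decimal import Decimal

# Rates quoted in this guide for DeepSeek-V3, in USD per million tokens.
# Assumption: verify against the current pricing page before relying on them.
INPUT_RATE = Decimal("0.27")
OUTPUT_RATE = Decimal("1.10")
MILLION = Decimal(1_000_000)

def estimate_cost_usd(prompt_tokens, completion_tokens):
    """Hypothetical helper: estimated cost in USD for the given token counts."""
    return (Decimal(prompt_tokens) * INPUT_RATE
            + Decimal(completion_tokens) * OUTPUT_RATE) / MILLION

# Illustrative workload: 10,000 pages at 150 input and 50 output tokens each
cost = estimate_cost_usd(150 * 10_000, 50 * 10_000)
print(f"Estimated cost: ${cost:.2f}")
```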
Troubleshooting Common Issues
API Connection Errors
If you encounter connection issues:
try:
    response = requests.post(DEEPSEEK_API_URL, headers=headers, json=payload, timeout=30)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out. Check your network connection.")
except requests.exceptions.ConnectionError:
    print("Failed to connect to Deepseek API. Verify the API URL.")
except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e.response.status_code} - {e.response.text}")
Rate Limit Exceeded
When you hit rate limits (HTTP 429), implement exponential backoff:
def exponential_backoff_request(url, headers, payload, max_retries=5):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code == 429:
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time} seconds...")
            time.sleep(wait_time)
            continue
        return response
    raise Exception("Max retries exceeded")
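When many workers hit the limit together, identical wait times make them retry in lockstep. A common refinement is to cap the delay and add random jitter. The sketch below is one way to compute such a delay; backoff_delay is a hypothetical helper, and whether Deepseek sends a Retry-After header on 429 responses is an assumption, so the function treats that value as optional.

```python
import random

def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based).

    If the server supplied a numeric Retry-After value, honor it (capped).
    Otherwise use capped exponential backoff with full jitter.
    """
    if retry_after is not None:
        try:
            return min(cap, max(0.0, float(retry_after)))
        except (TypeError, ValueError):
            pass  # not a number of seconds (e.g. an HTTP date); fall through
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly between 0 and the exponential ceiling
    return rng() * ceiling

for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 2))
```

In exponential_backoff_request, this value could replace wait_time, passing response.headers.get("Retry-After") as retry_after.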
Additional Resources
For more advanced use cases and integration patterns, explore:
- OpenAI Compatibility: Deepseek API is compatible with OpenAI client libraries, making migration easy
- Streaming Responses: Use stream=True for real-time token generation in long responses
- Function Calling: Deepseek supports function calling for structured outputs
- Batch Processing: Process multiple scraping tasks efficiently with async requests
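With streaming enabled, an OpenAI-compatible API sends the reply as server-sent events, one "data:" line per chunk. The sketch below reassembles the text from such lines. It assumes the OpenAI chunk shape (choices[0].delta.content and a final "data: [DONE]" line). collect_stream_text is a hypothetical helper, and the sample lines are hand-written, not captured output.

```python
import json

def collect_stream_text(lines):
    """Hypothetical helper: join the content deltas from SSE lines
    in the OpenAI-compatible streaming format."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment/keep-alive lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)

# Hand-written sample of what a stream might look like
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    "",
    ": keep-alive",
    'data: {"choices": [{"index": 0, "delta": {"content": "Wireless "}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Headphones"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample_stream))
```

With the requests library, the same function could consume response.iter_lines(decode_unicode=True) from a request made with stream=True.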
When building complex scraping workflows that require handling browser sessions or dealing with dynamic content, consider integrating Deepseek with headless browser automation for comprehensive data extraction capabilities.
Conclusion
The Deepseek API documentation at platform.deepseek.com/api-docs provides everything you need to integrate powerful LLM capabilities into your web scraping projects. With its OpenAI-compatible interface, competitive pricing, and advanced reasoning models, Deepseek offers an excellent solution for extracting structured data from complex web pages.
Start by obtaining your API key from the platform, review the official documentation, and experiment with the code examples provided in this guide. The combination of traditional web scraping tools and Deepseek's AI-powered extraction creates a robust solution for modern data collection challenges.