Where Can I Find the Deepseek Documentation for API Integration?

The official Deepseek API documentation is available at platform.deepseek.com/api-docs and api-docs.deepseek.com. This comprehensive resource provides detailed information about API endpoints, authentication methods, request/response formats, and model specifications for integrating Deepseek's large language models into your applications.

Official Deepseek Documentation Resources

Primary Documentation Hub

The main Deepseek documentation portal is hosted at platform.deepseek.com, where you can find:

  • API Reference: Complete endpoint specifications and parameters
  • Authentication Guide: API key generation and management
  • Model Documentation: Detailed information about DeepSeek-V3, DeepSeek-R1, and other models
  • Rate Limits and Pricing: Usage quotas and cost structures
  • SDKs and Libraries: Official client libraries for various programming languages
  • Code Examples: Sample implementations in Python, JavaScript, and other languages

Alternative Documentation Sources

In addition to the official platform, Deepseek provides documentation through:

  1. GitHub Repository: github.com/deepseek-ai - Contains open-source models, tools, and example projects
  2. API Documentation Site: api-docs.deepseek.com - Dedicated API reference with interactive examples
  3. Developer Community: Discord and GitHub Discussions for community support and updates

Getting Started with Deepseek API

Obtaining API Credentials

Before you can integrate the Deepseek API, you need to obtain an API key (a quick key-verification sketch follows these steps):

  1. Visit platform.deepseek.com
  2. Create an account or sign in
  3. Navigate to the API Keys section in your dashboard
  4. Generate a new API key
  5. Store your API key securely (never commit it to version control)
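
Once the key is generated, a quick way to confirm it works is to list the available models. This is a minimal sketch assuming the OpenAI-compatible GET /v1/models route; check the API reference for the current endpoint:

import os
import requests

# Assumes DEEPSEEK_API_KEY is set in your environment (never hardcode keys)
api_key = os.getenv("DEEPSEEK_API_KEY")

response = requests.get(
    "https://api.deepseek.com/v1/models",  # OpenAI-compatible model-listing route (assumption)
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=15,
)
response.raise_for_status()
print([model["id"] for model in response.json()["data"]])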

Basic API Integration Example (Python)

Here's a simple example of using the Deepseek API to extract structured data from scraped HTML:

import requests
import json

# Configuration
DEEPSEEK_API_KEY = "your_api_key_here"
DEEPSEEK_API_URL = "https://api.deepseek.com/v1/chat/completions"

def extract_data_with_deepseek(html_content, extraction_prompt):
    """
    Use Deepseek to extract structured data from HTML content
    """
    headers = {
        "Authorization": f"Bearer {DEEPSEEK_API_KEY}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "deepseek-chat",
        "messages": [
            {
                "role": "system",
                "content": "You are a web scraping assistant. Extract structured data from HTML."
            },
            {
                "role": "user",
                "content": f"{extraction_prompt}\n\nHTML:\n{html_content}"
            }
        ],
        "temperature": 0.1,
        "max_tokens": 2000,
        "response_format": {"type": "json_object"}
    }

    response = requests.post(DEEPSEEK_API_URL, headers=headers, json=payload)

    if response.status_code == 200:
        return response.json()["choices"][0]["message"]["content"]
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Example usage for web scraping
html_snippet = """
<div class="product">
    <h2>Wireless Headphones</h2>
    <span class="price">$89.99</span>
    <p class="rating">4.5 stars (1,234 reviews)</p>
</div>
"""

extraction_prompt = """
Extract the following fields from the product HTML:
- product_name
- price
- rating
- review_count
Return as JSON.
"""

result = extract_data_with_deepseek(html_snippet, extraction_prompt)
print(json.loads(result))

JavaScript/Node.js Integration

For JavaScript developers, here's how to integrate the Deepseek API:

const axios = require('axios');

const DEEPSEEK_API_KEY = 'your_api_key_here';
const DEEPSEEK_API_URL = 'https://api.deepseek.com/v1/chat/completions';

async function extractDataWithDeepseek(htmlContent, extractionPrompt) {
    try {
        const response = await axios.post(
            DEEPSEEK_API_URL,
            {
                model: 'deepseek-chat',
                messages: [
                    {
                        role: 'system',
                        content: 'You are a web scraping assistant. Extract structured data from HTML.'
                    },
                    {
                        role: 'user',
                        content: `${extractionPrompt}\n\nHTML:\n${htmlContent}`
                    }
                ],
                temperature: 0.1,
                max_tokens: 2000,
                response_format: { type: 'json_object' }
            },
            {
                headers: {
                    'Authorization': `Bearer ${DEEPSEEK_API_KEY}`,
                    'Content-Type': 'application/json'
                }
            }
        );

        return JSON.parse(response.data.choices[0].message.content);
    } catch (error) {
        console.error('Deepseek API Error:', error.response?.data || error.message);
        throw error;
    }
}

// Example usage
const htmlContent = `
<article class="blog-post">
    <h1>Understanding AI in Web Scraping</h1>
    <span class="author">John Doe</span>
    <time>2025-01-15</time>
</article>
`;

const prompt = 'Extract title, author, and publication date as JSON';

extractDataWithDeepseek(htmlContent, prompt)
    .then(data => console.log(data))
    .catch(error => console.error(error));

Key API Endpoints and Parameters

Chat Completions Endpoint

The primary endpoint for the Deepseek API is /v1/chat/completions, which follows the OpenAI-compatible format:

Endpoint: POST https://api.deepseek.com/v1/chat/completions

Key Parameters:

  • model (string, required): Model identifier (e.g., "deepseek-chat", "deepseek-coder")
  • messages (array, required): Conversation history with role and content
  • temperature (float, optional): Controls randomness (0.0-2.0, default: 1.0)
  • max_tokens (integer, optional): Maximum tokens in response
  • top_p (float, optional): Nucleus sampling parameter
  • frequency_penalty (float, optional): Reduces repetition (-2.0 to 2.0)
  • presence_penalty (float, optional): Encourages new topics (-2.0 to 2.0)
  • response_format (object, optional): Specify JSON output format
  • stream (boolean, optional): Enable streaming responses

Response Format

{
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1704067200,
    "model": "deepseek-chat",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"product_name\": \"Wireless Headphones\", \"price\": 89.99}"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 150,
        "completion_tokens": 50,
        "total_tokens": 200
    }
}

Available Deepseek Models

DeepSeek-V3

The latest and most powerful model, optimized for complex reasoning and large context windows:

  • Model ID: deepseek-chat
  • Context Window: 64K tokens
  • Strengths: Advanced reasoning, code generation, multilingual support
  • Use Case: Complex data extraction from unstructured web content

DeepSeek-R1

Specialized reasoning model with chain-of-thought capabilities (a minimal request sketch follows the summary below):

  • Model ID: deepseek-reasoner
  • Context Window: 64K tokens
  • Strengths: Multi-step reasoning, problem-solving
  • Use Case: Analyzing complex web page structures and extracting nested data
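
The request shape for the reasoning model is the same as for deepseek-chat; only the model ID changes. Below is a minimal sketch — the separate reasoning_content field reflects how the reasoning model is described in the docs, but verify the field name against the current API reference:

import os
import requests

payload = {
    "model": "deepseek-reasoner",  # DeepSeek-R1 model ID
    "messages": [
        {"role": "user", "content": "Explain step by step how to extract the price from: <span class='price'>$89.99</span>"}
    ],
    "max_tokens": 2000
}

response = requests.post(
    "https://api.deepseek.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.getenv('DEEPSEEK_API_KEY')}"},
    json=payload,
    timeout=60,
)
message = response.json()["choices"][0]["message"]

# The reasoning model returns its chain of thought separately from the final answer
print(message.get("reasoning_content"))  # intermediate reasoning (field name is an assumption)
print(message["content"])                # final answer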

DeepSeek-Coder

Optimized for code generation and technical tasks:

  • Model ID: deepseek-coder
  • Context Window: 16K tokens
  • Strengths: Code understanding, technical documentation
  • Use Case: Generating scraping scripts and parsing logic

Integration with Web Scraping Workflows

Combining Deepseek with Traditional Scrapers

Deepseek works exceptionally well when combined with traditional scraping tools. Here's an example workflow:

import requests
from bs4 import BeautifulSoup

def scrape_and_extract(url):
    # Step 1: Fetch HTML content
    response = requests.get(url)
    html_content = response.text

    # Step 2: Pre-process with BeautifulSoup (optional)
    soup = BeautifulSoup(html_content, 'html.parser')
    main_content = soup.find('main') or soup.find('article')
    cleaned_html = str(main_content) if main_content else html_content

    # Step 3: Extract data using Deepseek
    prompt = """
    Extract all product information including:
    - Name
    - Price
    - Description
    - Availability
    - SKU
    Return as structured JSON.
    """

    extracted_data = extract_data_with_deepseek(cleaned_html, prompt)
    return extracted_data

This hybrid approach leverages the speed of traditional parsing for HTML cleanup while using Deepseek's intelligence for complex data extraction. When working with dynamic content, you might need to use browser automation tools to handle AJAX requests before passing the HTML to Deepseek.
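
For example, a rendering step with Playwright (one common headless-browser option; any equivalent tool works) might look like the sketch below before handing the HTML to Deepseek:

from playwright.sync_api import sync_playwright

def fetch_rendered_html(url):
    """Render a JavaScript-heavy page and return its final HTML."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")  # wait for AJAX requests to settle
        html = page.content()
        browser.close()
    return html

# Hand the rendered HTML to the extraction helper defined earlier
# extracted = extract_data_with_deepseek(fetch_rendered_html("https://example.com/products"), prompt)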

Authentication Best Practices

Secure API Key Management

Never hardcode API keys in your source code. Instead, use environment variables:

import os
from dotenv import load_dotenv

load_dotenv()
DEEPSEEK_API_KEY = os.getenv('DEEPSEEK_API_KEY')

if not DEEPSEEK_API_KEY:
    raise ValueError("DEEPSEEK_API_KEY environment variable not set")

Create a .env file (and add it to .gitignore):

DEEPSEEK_API_KEY=your_actual_api_key_here

Rate Limiting and Error Handling

Implement proper rate limiting and retry logic:

import time
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session_with_retries():
    session = requests.Session()
    retry = Retry(
        total=3,
        backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["POST"]  # POST is not retried by default (urllib3 >= 1.26)
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

def rate_limited_request(session, url, headers, payload, requests_per_minute=20):
    response = session.post(url, headers=headers, json=payload)
    time.sleep(60 / requests_per_minute)  # Rate limiting
    return response

Pricing and Usage Limits

According to the official documentation, Deepseek offers competitive pricing:

  • DeepSeek-V3: $0.27 per million input tokens, $1.10 per million output tokens
  • DeepSeek-R1: Similar pricing structure with reasoning tokens counted separately
  • Free Tier: Available with limited monthly quotas for testing

Always check the current pricing at platform.deepseek.com/pricing as rates may change.
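
Because every response includes a usage block (see the response format above), you can track spend directly in your scraping pipeline. Here is a rough sketch using the DeepSeek-V3 rates quoted above; confirm current rates before relying on the numbers:

# Prices per million tokens for deepseek-chat, as listed above
INPUT_PRICE_PER_M = 0.27
OUTPUT_PRICE_PER_M = 1.10

def estimate_cost(usage):
    """Estimate the cost of a single API response from its 'usage' block."""
    input_cost = usage["prompt_tokens"] / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = usage["completion_tokens"] / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_cost + output_cost

# Example with the usage block shown earlier: 150 prompt tokens + 50 completion tokens
print(f"${estimate_cost({'prompt_tokens': 150, 'completion_tokens': 50}):.6f}")  # roughly $0.0001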

Troubleshooting Common Issues

API Connection Errors

If you encounter connection issues:

try:
    response = requests.post(DEEPSEEK_API_URL, headers=headers, json=payload, timeout=30)
    response.raise_for_status()
except requests.exceptions.Timeout:
    print("Request timed out. Check your network connection.")
except requests.exceptions.ConnectionError:
    print("Failed to connect to Deepseek API. Verify the API URL.")
except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e.response.status_code} - {e.response.text}")

Rate Limit Exceeded

When you hit rate limits (HTTP 429), implement exponential backoff:

def exponential_backoff_request(url, headers, payload, max_retries=5):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=payload)

        if response.status_code == 429:
            wait_time = 2 ** attempt
            print(f"Rate limited. Waiting {wait_time} seconds...")
            time.sleep(wait_time)
            continue

        return response

    raise Exception("Max retries exceeded")

Additional Resources

For more advanced use cases and integration patterns, explore:

  1. OpenAI Compatibility: The Deepseek API is compatible with OpenAI client libraries, making migration straightforward (see the sketch after this list)
  2. Streaming Responses: Use stream=True for real-time token generation in long responses
  3. Function Calling: Deepseek supports function calling for structured outputs
  4. Batch Processing: Process multiple scraping tasks efficiently with async requests
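
As a quick illustration of the first two points, here is a minimal sketch using the official openai Python package pointed at Deepseek's endpoint (base URL per Deepseek's OpenAI-compatibility notes; verify against the current docs):

from openai import OpenAI

# The OpenAI SDK works against Deepseek's OpenAI-compatible endpoint
client = OpenAI(
    api_key="your_api_key_here",
    base_url="https://api.deepseek.com"
)

stream = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "List three common HTML patterns for product prices."}],
    stream=True  # tokens arrive incrementally instead of in one response
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)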

When building complex scraping workflows that require handling browser sessions or dealing with dynamic content, consider integrating Deepseek with headless browser automation for comprehensive data extraction capabilities.

Conclusion

The Deepseek API documentation at platform.deepseek.com/api-docs provides everything you need to integrate powerful LLM capabilities into your web scraping projects. With its OpenAI-compatible interface, competitive pricing, and advanced reasoning models, Deepseek offers an excellent solution for extracting structured data from complex web pages.

Start by obtaining your API key from the platform, review the official documentation, and experiment with the code examples provided in this guide. The combination of traditional web scraping tools and Deepseek's AI-powered extraction creates a robust solution for modern data collection challenges.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
