What does the --max-time option do in Curl?

The --max-time option in Curl sets the maximum time in seconds that the entire operation is allowed to take. This is a crucial timeout control that prevents Curl requests from hanging indefinitely and ensures your web scraping operations complete within a reasonable timeframe.

Understanding --max-time vs Other Timeout Options

Curl provides several timeout-related options, and it's important to understand how --max-time differs from others:

  • --max-time: Controls the total time for the entire operation (connection + data transfer)
  • --connect-timeout: Only controls the connection establishment phase
  • --speed-time: Works with --speed-limit to abort slow transfers

The --max-time option is the most comprehensive timeout control, covering the entire request lifecycle from initial connection to final data transfer completion.
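Because the connect phase counts toward the overall budget, --max-time should always be at least as large as --connect-timeout. A minimal sketch that only assembles the command, so the relationship is explicit (the helper name and URL are illustrative; nothing is fetched):

```shell
# Build (but do not run) a curl command with both a connect-phase budget and
# a whole-operation budget. Since connecting counts against the total,
# the total budget must be >= the connect budget.
build_curl_cmd() {
    local connect_budget=$1 total_budget=$2 url=$3
    if [ "$total_budget" -lt "$connect_budget" ]; then
        echo "total budget must be >= connect budget" >&2
        return 1
    fi
    echo "curl --connect-timeout $connect_budget --max-time $total_budget $url"
}

build_curl_cmd 5 30 "https://example.com/data"
# prints: curl --connect-timeout 5 --max-time 30 https://example.com/data
```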

Basic Syntax and Usage

Command Line Syntax

curl --max-time SECONDS URL
# or using the short form
curl -m SECONDS URL
# SECONDS accepts decimal values (e.g. 2.5) in curl 7.32.0 and later

Basic Examples

# Set maximum time to 30 seconds
curl --max-time 30 https://example.com

# Using short form (-m)
curl -m 10 https://api.example.com/data

# Combining with other options
curl -m 60 -H "User-Agent: MyBot/1.0" https://example.com/api

Practical Use Cases

1. Web Scraping with Timeout Protection

When scraping multiple pages, timeouts prevent your script from getting stuck on slow-responding servers:

#!/bin/bash
urls=(
    "https://example.com/page1"
    "https://example.com/page2"
    "https://example.com/page3"
)

for url in "${urls[@]}"; do
    echo "Scraping: $url"
    curl -m 30 -s "$url" > "$(basename "$url").html" || echo "Failed or timed out: $url"
done

2. API Testing with Time Constraints

# Test API endpoint with 5-second timeout
curl -m 5 -w "Total time: %{time_total}s\n" https://api.example.com/health

# JSON API request with timeout
curl -m 15 \
  -H "Content-Type: application/json" \
  -d '{"query": "test"}' \
  https://api.example.com/search

3. Downloading Files with Size Limits

# Download with both time and progress monitoring
curl -m 300 -# -o largefile.zip https://example.com/largefile.zip

# Download with fallback if timeout occurs
curl -m 120 -o data.json https://example.com/data.json || \
curl -m 60 -o data.json https://backup.example.com/data.json

Advanced Configuration Examples

Combining Multiple Timeout Options

# Comprehensive timeout configuration
curl \
  --connect-timeout 10 \
  --max-time 60 \
  --speed-time 30 \
  --speed-limit 1000 \
  https://example.com/data

This configuration:

  • Allows 10 seconds for connection establishment
  • Sets the total operation limit to 60 seconds
  • Aborts if the transfer speed stays below 1000 bytes/sec for 30 seconds

Using with Retry Logic

#!/bin/bash
max_attempts=3
timeout_seconds=30
url="https://example.com/api/data"

for attempt in $(seq 1 $max_attempts); do
    echo "Attempt $attempt/$max_attempts"

    if curl -m $timeout_seconds -f -s "$url" -o response.json; then
        echo "Success on attempt $attempt"
        break
    else
        echo "Failed on attempt $attempt"
        if [ $attempt -eq $max_attempts ]; then
            echo "All attempts failed"
            exit 1
        fi
        sleep 2
    fi
done

Programming Language Integration

Python with subprocess

import subprocess
import json

def fetch_with_timeout(url, timeout=30):
    try:
        result = subprocess.run([
            'curl', '-m', str(timeout), '-s', '-f', url
        ], capture_output=True, text=True, timeout=timeout+5)

        if result.returncode == 0:
            return result.stdout
        else:
            print(f"Curl failed with code {result.returncode}")
            return None
    except subprocess.TimeoutExpired:
        print(f"Request timed out after {timeout} seconds")
        return None

# Usage
data = fetch_with_timeout('https://api.example.com/data.json', 15)
if data:
    json_data = json.loads(data)
    print(json_data)

Node.js with child_process

const { exec } = require('child_process');

function fetchWithTimeout(url, timeout = 30) {
    return new Promise((resolve, reject) => {
        const command = `curl -m ${timeout} -s -f "${url}"`;

        exec(command, (error, stdout, stderr) => {
            if (error) {
                if (error.code === 28) {
                    reject(new Error(`Request timed out after ${timeout} seconds`));
                } else {
                    reject(new Error(`Curl failed: ${error.message}`));
                }
                return;
            }
            resolve(stdout);
        });
    });
}

// Usage
fetchWithTimeout('https://api.example.com/data.json', 20)
    .then(data => {
        console.log('Response:', JSON.parse(data));
    })
    .catch(error => {
        console.error('Error:', error.message);
    });

Error Handling and Exit Codes

When --max-time timeout is reached, Curl exits with code 28. Understanding exit codes helps with proper error handling:

curl -m 5 https://httpbin.org/delay/10
echo "Exit code: $?"
# Output: Exit code: 28
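Exit code 28 is also useful for retry decisions: a timeout may be transient, while errors like an unresolvable host (exit 6) are usually permanent. A hedged sketch of timeout-only retries, using a stand-in function in place of curl so the control flow can run without a network:

```shell
# Retry a command only when it exits with curl's timeout code (28).
# Any other non-zero exit is treated as permanent and returned immediately.
retry_on_timeout() {
    local max_attempts=$1; shift
    local attempt rc
    for attempt in $(seq 1 "$max_attempts"); do
        "$@"; rc=$?
        [ "$rc" -eq 0 ] && return 0
        [ "$rc" -ne 28 ] && return "$rc"   # permanent failure: stop retrying
        echo "Timeout on attempt $attempt/$max_attempts" >&2
    done
    return 28
}

# Demo with a stand-in that "times out" twice, then succeeds
# (in real use: retry_on_timeout 3 curl -m 10 -f -s "$url" -o out.json)
attempts=0
fake_curl() {
    attempts=$((attempts + 1))
    [ "$attempts" -lt 3 ] && return 28
    return 0
}
retry_on_timeout 5 fake_curl && echo "succeeded after $attempts attempts"
# prints: succeeded after 3 attempts
```

Unlike the generic retry loop shown below, this refuses to waste attempts on failures that will never succeed, such as DNS errors.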

Bash Script with Error Handling

#!/bin/bash
handle_curl_response() {
    local url=$1
    local timeout=${2:-30}

    local metrics
    metrics=$(curl -m "$timeout" -s -o /dev/null -w "HTTPCODE:%{http_code};TIME:%{time_total}" "$url")
    local exit_code=$?

    case $exit_code in
        0)
            echo "Success ($metrics)"
            ;;
        28)
            echo "Error: Operation timed out after $timeout seconds"
            ;;
        6)
            echo "Error: Couldn't resolve host"
            ;;
        7)
            echo "Error: Failed to connect to host"
            ;;
        *)
            echo "Error: Curl failed with exit code $exit_code"
            ;;
    esac

    return $exit_code
}

Performance Monitoring with --max-time

Combine --max-time with Curl's built-in timing variables for performance analysis:

# Monitor timing metrics with timeout protection
curl -m 30 -w "
Total time: %{time_total}s
Connect time: %{time_connect}s
Time to first byte: %{time_starttransfer}s
HTTP code: %{http_code}
" -s -o /dev/null https://example.com

# Create a reusable performance monitoring function
monitor_endpoint() {
    local url=$1
    local max_time=${2:-30}

    curl -m $max_time \
         -w "URL: $url\nTotal: %{time_total}s\nConnect: %{time_connect}s\nHTTP: %{http_code}\n---\n" \
         -s -o /dev/null "$url"
}

# Monitor multiple endpoints
monitor_endpoint "https://api.example.com/health" 10
monitor_endpoint "https://api.example.com/data" 25

Best Practices and Recommendations

1. Choose Appropriate Timeout Values

  • API calls: 10-30 seconds depending on complexity
  • File downloads: Calculate based on expected file size and minimum acceptable speed
  • Web scraping: 15-60 seconds depending on page complexity
  • Health checks: 5-10 seconds for quick validation
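The file-download guideline above reduces to simple arithmetic: expected size divided by the minimum acceptable speed, plus a safety margin. A small helper sketch (the helper name and the numbers are illustrative):

```shell
# Derive a --max-time value from an expected file size and a minimum
# acceptable transfer speed, plus a margin for connection setup.
# Pure integer arithmetic; no network access.
download_timeout() {
    local size_bytes=$1        # expected file size in bytes
    local min_speed=$2         # minimum acceptable speed in bytes/sec
    local margin=${3:-30}      # extra seconds for connection setup etc.
    echo $(( size_bytes / min_speed + margin ))
}

timeout=$(download_timeout 52428800 1048576)   # 50 MiB at a 1 MiB/s floor
echo "$timeout"    # 80
echo "curl -m $timeout -o largefile.zip https://example.com/largefile.zip"
```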

2. Consider Network Conditions

# Different timeouts for different environments
if [ "$ENVIRONMENT" = "production" ]; then
    MAX_TIME=30
elif [ "$ENVIRONMENT" = "staging" ]; then
    MAX_TIME=60
else
    MAX_TIME=120  # Development with potentially slower connections
fi

curl -m $MAX_TIME https://api.example.com/data

3. Implement Graceful Degradation

When building robust web scraping systems, plan for timeouts explicitly: attempt a fast primary source first, then fall back to a slower or secondary source rather than failing outright.

# Primary attempt with strict timeout
if ! curl -m 10 -f -s "$PRIMARY_URL" -o data.json; then
    echo "Primary source timed out, trying backup..."
    # Backup attempt with longer timeout
    curl -m 30 -f -s "$BACKUP_URL" -o data.json
fi

Integration with Web Scraping Workflows

The --max-time option becomes especially valuable when integrated into larger web scraping workflows. Similar to how timeouts are managed in browser automation tools, Curl's timeout controls help maintain system stability and prevent resource exhaustion.

Batch Processing with Timeouts

#!/bin/bash
process_urls_from_file() {
    local url_file=$1
    local timeout=${2:-30}
    local max_parallel=${3:-5}

    while IFS= read -r url; do
        {
            curl -m "$timeout" -f -s "$url" -o "$(basename "$url").html"
            echo "Processed: $url"
        } &

        # Limit parallel processes
        (($(jobs -r | wc -l) >= max_parallel)) && wait
    done < "$url_file"

    wait  # Wait for all background jobs to complete
}

Troubleshooting Common Issues

Issue 1: Timeouts Too Short

# Problem: Legitimate requests timing out
curl -m 5 https://slow-api.example.com/data

# Solution: Increase timeout for known slow endpoints
curl -m 60 https://slow-api.example.com/data

Issue 2: Inconsistent Network Performance

# Adaptive timeout based on previous response times
get_adaptive_timeout() {
    local base_timeout=30
    local min_timeout=10
    local last_time
    last_time=$(curl -m "$base_timeout" -w "%{time_total}" -s -o /dev/null "$1" 2>/dev/null)

    if [ -n "$last_time" ]; then
        # Set timeout to 3x the last response time, enforcing a 10-second minimum
        local adaptive
        adaptive=$(printf "%.0f" "$(echo "$last_time * 3" | bc)")
        [ "$adaptive" -lt "$min_timeout" ] && adaptive=$min_timeout
        echo "$adaptive"
    else
        echo "$base_timeout"
    fi
}

Conclusion

The --max-time option is an essential tool for controlling Curl's execution time and building reliable web scraping and API interaction workflows. By setting appropriate timeout values, you can prevent hanging requests, improve system stability, and create more predictable automation scripts.

Whether you're building simple data collection scripts or complex web scraping systems, understanding and properly implementing timeout controls with --max-time will significantly improve the robustness and reliability of your HTTP-based operations.

Remember to test your timeout values under various network conditions and adjust them based on your specific use case requirements. The key is finding the balance between allowing sufficient time for legitimate requests while preventing indefinite hangs that can compromise your system's performance.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"

