# What does the --max-time option do in Curl?
The `--max-time` option in Curl sets the maximum time, in seconds, that the entire operation is allowed to take. This is a crucial timeout control that prevents Curl requests from hanging indefinitely and ensures your web scraping operations complete within a reasonable timeframe.
## Understanding --max-time vs Other Timeout Options
Curl provides several timeout-related options, and it's important to understand how `--max-time` differs from the others:

- `--max-time`: Controls the total time for the entire operation (connection + data transfer)
- `--connect-timeout`: Only controls the connection establishment phase
- `--speed-time`: Works with `--speed-limit` to abort slow transfers

The `--max-time` option is the most comprehensive timeout control, covering the entire request lifecycle from initial connection to final data transfer completion.
## Basic Syntax and Usage

### Command Line Syntax

```bash
curl --max-time SECONDS URL
# or using the short form
curl -m SECONDS URL
```
### Basic Examples

```bash
# Set maximum time to 30 seconds
curl --max-time 30 https://example.com

# Using short form (-m)
curl -m 10 https://api.example.com/data

# Combining with other options
curl -m 60 -H "User-Agent: MyBot/1.0" https://example.com/api
```
## Practical Use Cases

### 1. Web Scraping with Timeout Protection

When scraping multiple pages, timeouts prevent your script from getting stuck on slow-responding servers:
```bash
#!/bin/bash
urls=(
  "https://example.com/page1"
  "https://example.com/page2"
  "https://example.com/page3"
)

for url in "${urls[@]}"; do
  echo "Scraping: $url"
  curl -m 30 -s "$url" > "$(basename "$url").html" || echo "Failed or timed out: $url"
done
```
### 2. API Testing with Time Constraints

```bash
# Test API endpoint with 5-second timeout
curl -m 5 -w "Total time: %{time_total}s\n" https://api.example.com/health

# JSON API request with timeout
curl -m 15 \
  -H "Content-Type: application/json" \
  -d '{"query": "test"}' \
  https://api.example.com/search
```
### 3. Downloading Files with Time Limits

```bash
# Download with both a time limit and progress monitoring
curl -m 300 -# -o largefile.zip https://example.com/largefile.zip

# Download with fallback if timeout occurs
curl -m 120 -o data.json https://example.com/data.json || \
  curl -m 60 -o data.json https://backup.example.com/data.json
```
## Advanced Configuration Examples

### Combining Multiple Timeout Options

```bash
# Comprehensive timeout configuration
curl \
  --connect-timeout 10 \
  --max-time 60 \
  --speed-time 30 \
  --speed-limit 1000 \
  https://example.com/data
```

This configuration:

- Allows 10 seconds for connection establishment
- Sets the total operation limit to 60 seconds
- Aborts if transfer speed drops below 1000 bytes/sec for 30 seconds
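Since connection time counts toward the overall `--max-time` budget, a `--connect-timeout` equal to or larger than `--max-time` can never actually fire. A small sanity check can catch this misconfiguration before a script runs; `check_timeouts` below is a helper invented for this sketch, not part of Curl:

```bash
#!/bin/bash
# Sketch: warn when --connect-timeout can never trigger because
# --max-time would always fire first. check_timeouts is a
# hypothetical helper, not a standard tool.
check_timeouts() {
  local connect=$1
  local max=$2
  if [ "$connect" -ge "$max" ]; then
    echo "warning: --connect-timeout (${connect}s) >= --max-time (${max}s)"
    return 1
  fi
  return 0
}

check_timeouts 10 60 && echo "config ok"       # connect phase gets 10s of a 60s budget
check_timeouts 60 30 || echo "fix your timeouts"  # misconfigured: prints a warning
```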
### Using with Retry Logic

```bash
#!/bin/bash
max_attempts=3
timeout_seconds=30
url="https://example.com/api/data"

for attempt in $(seq 1 $max_attempts); do
  echo "Attempt $attempt/$max_attempts"
  if curl -m "$timeout_seconds" -f -s "$url" -o response.json; then
    echo "Success on attempt $attempt"
    break
  else
    echo "Failed on attempt $attempt"
    if [ "$attempt" -eq "$max_attempts" ]; then
      echo "All attempts failed"
      exit 1
    fi
    sleep 2
  fi
done
```
## Programming Language Integration

### Python with subprocess

```python
import subprocess
import json

def fetch_with_timeout(url, timeout=30):
    try:
        # Give the subprocess a slightly longer kill deadline than curl itself
        result = subprocess.run([
            'curl', '-m', str(timeout), '-s', '-f', url
        ], capture_output=True, text=True, timeout=timeout + 5)
        if result.returncode == 0:
            return result.stdout
        else:
            print(f"Curl failed with code {result.returncode}")
            return None
    except subprocess.TimeoutExpired:
        print(f"Request timed out after {timeout} seconds")
        return None

# Usage
data = fetch_with_timeout('https://api.example.com/data.json', 15)
if data:
    json_data = json.loads(data)
    print(json_data)
```
### Node.js with child_process

```javascript
const { exec } = require('child_process');

function fetchWithTimeout(url, timeout = 30) {
  return new Promise((resolve, reject) => {
    const command = `curl -m ${timeout} -s -f "${url}"`;
    exec(command, (error, stdout, stderr) => {
      if (error) {
        // Curl exits with code 28 when --max-time is exceeded
        if (error.code === 28) {
          reject(new Error(`Request timed out after ${timeout} seconds`));
        } else {
          reject(new Error(`Curl failed: ${error.message}`));
        }
        return;
      }
      resolve(stdout);
    });
  });
}

// Usage
fetchWithTimeout('https://api.example.com/data.json', 20)
  .then(data => {
    console.log('Response:', JSON.parse(data));
  })
  .catch(error => {
    console.error('Error:', error.message);
  });
```
## Error Handling and Exit Codes

When the `--max-time` timeout is reached, Curl exits with code 28. Understanding exit codes helps with proper error handling:

```bash
curl -m 5 https://httpbin.org/delay/10
echo "Exit code: $?"
# Output: Exit code: 28
```
### Bash Script with Error Handling

```bash
#!/bin/bash
handle_curl_response() {
  local url=$1
  local timeout=${2:-30}

  curl -m "$timeout" -s -w "HTTPCODE:%{http_code};TIME:%{time_total}" "$url"
  local exit_code=$?

  case $exit_code in
    0)
      echo "Success"
      ;;
    28)
      echo "Error: Operation timed out after $timeout seconds"
      ;;
    6)
      echo "Error: Couldn't resolve host"
      ;;
    7)
      echo "Error: Failed to connect to host"
      ;;
    *)
      echo "Error: Curl failed with exit code $exit_code"
      ;;
  esac
  return $exit_code
}
```
## Performance Monitoring with --max-time

Combine `--max-time` with Curl's built-in timing variables for performance analysis:

```bash
# Monitor timing metrics with timeout protection
curl -m 30 -w "
Total time: %{time_total}s
Connect time: %{time_connect}s
Time to first byte: %{time_starttransfer}s
HTTP code: %{http_code}
" -s -o /dev/null https://example.com
```

A small script can monitor multiple endpoints:

```bash
#!/bin/bash
monitor_endpoint() {
  local url=$1
  local max_time=${2:-30}

  curl -m "$max_time" \
    -w "URL: $url\nTotal: %{time_total}s\nConnect: %{time_connect}s\nHTTP: %{http_code}\n---\n" \
    -s -o /dev/null "$url"
}

# Monitor multiple endpoints
monitor_endpoint "https://api.example.com/health" 10
monitor_endpoint "https://api.example.com/data" 25
```
## Best Practices and Recommendations

### 1. Choose Appropriate Timeout Values

- API calls: 10-30 seconds depending on complexity
- File downloads: Calculate based on expected file size and minimum acceptable speed
- Web scraping: 15-60 seconds depending on page complexity
- Health checks: 5-10 seconds for quick validation
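The guidance above can be encoded directly in a script so every call site uses a consistent value. `recommended_timeout` is a hypothetical helper reflecting the rough ranges listed, not a standard tool; tune the numbers to your own measurements:

```bash
#!/bin/bash
# Sketch: map a request category to a default --max-time value,
# following the rough ranges above. recommended_timeout is a
# name invented for this example.
recommended_timeout() {
  case $1 in
    api)    echo 30 ;;   # API calls: allow up to 30s for complex queries
    scrape) echo 60 ;;   # Web scraping: heavier pages
    health) echo 10 ;;   # Health checks: fail fast
    *)      echo 30 ;;   # Sensible default
  esac
}

# e.g.: curl -m "$(recommended_timeout health)" https://api.example.com/health
```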
### 2. Consider Network Conditions

```bash
# Different timeouts for different environments
if [ "$ENVIRONMENT" = "production" ]; then
  MAX_TIME=30
elif [ "$ENVIRONMENT" = "staging" ]; then
  MAX_TIME=60
else
  MAX_TIME=120  # Development with potentially slower connections
fi

curl -m $MAX_TIME https://api.example.com/data
```
### 3. Implement Graceful Degradation

When building robust web scraping systems, handling timeouts gracefully is crucial for maintaining reliability. A common pattern is to fall back to a backup source when the primary one times out:

```bash
# Primary attempt with strict timeout
if ! curl -m 10 -f -s "$PRIMARY_URL" -o data.json; then
  echo "Primary source timed out, trying backup..."
  # Backup attempt with longer timeout
  curl -m 30 -f -s "$BACKUP_URL" -o data.json
fi
```
## Integration with Web Scraping Workflows

The `--max-time` option becomes especially valuable when integrated into larger web scraping workflows. Similar to how timeouts are managed in browser automation tools, Curl's timeout controls help maintain system stability and prevent resource exhaustion.
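One simple way to keep timeout policy consistent across every step of such a workflow is to derive each `-m` value from a single environment-driven default. `CURL_MAX_TIME` and `build_fetch_cmd` below are names invented for this sketch; Curl itself does not read any such variable:

```bash
#!/bin/bash
# Sketch: one workflow-wide timeout default, overridable per run.
# CURL_MAX_TIME is a variable of this script, not recognized by curl.
build_fetch_cmd() {
  local url=$1
  local timeout=${CURL_MAX_TIME:-30}  # fall back to 30 seconds
  echo "curl -m $timeout -f -s $url"
}

build_fetch_cmd "https://example.com/page1"
# export CURL_MAX_TIME=90 before running to loosen every step at once
```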
### Batch Processing with Timeouts

```bash
#!/bin/bash
process_urls_from_file() {
  local url_file=$1
  local timeout=${2:-30}
  local max_parallel=${3:-5}

  while IFS= read -r url; do
    {
      curl -m "$timeout" -f -s "$url" -o "$(basename "$url").html"
      echo "Processed: $url"
    } &

    # Limit parallel processes
    (( $(jobs -r | wc -l) >= max_parallel )) && wait
  done < "$url_file"

  wait  # Wait for all remaining background jobs to complete
}
```
## Troubleshooting Common Issues

### Issue 1: Timeouts Too Short

```bash
# Problem: legitimate requests timing out
curl -m 5 https://slow-api.example.com/data

# Solution: increase the timeout for known slow endpoints
curl -m 60 https://slow-api.example.com/data
```
### Issue 2: Inconsistent Network Performance

```bash
# Adaptive timeout based on the previous response time
get_adaptive_timeout() {
  local base_timeout=30
  local last_time=$(curl -m $base_timeout -w "%{time_total}" -s -o /dev/null "$1" 2>/dev/null)

  if [ -n "$last_time" ]; then
    # Set timeout to 3x the last response time, with a 10-second minimum
    local t=$(echo "$last_time * 3" | bc | xargs printf "%.0f")
    (( t < 10 )) && t=10
    echo $t
  else
    echo $base_timeout
  fi
}
```
## Conclusion

The `--max-time` option is an essential tool for controlling Curl's execution time and building reliable web scraping and API interaction workflows. By setting appropriate timeout values, you can prevent hanging requests, improve system stability, and create more predictable automation scripts.

Whether you're building simple data collection scripts or complex web scraping systems, understanding and properly implementing timeout controls with `--max-time` will significantly improve the robustness and reliability of your HTTP-based operations.

Remember to test your timeout values under various network conditions and adjust them to your specific use case. The key is finding the balance between allowing sufficient time for legitimate requests and preventing indefinite hangs that can compromise your system's performance.