How do I use Curl to perform health checks on web services?
Health checks are essential for monitoring web service availability and performance. Curl provides powerful options for automated health monitoring, allowing you to verify service status, response times, and content validation. This guide covers comprehensive techniques for implementing robust health checks using Curl.
Basic Health Check Commands
Simple Status Check
The most basic health check verifies that a service responds with a successful HTTP status code:
# Basic health check
curl -f -s -o /dev/null -w "%{http_code}" https://api.example.com/health
# With connection timeout
curl -f -s -o /dev/null -w "%{http_code}" --connect-timeout 5 https://api.example.com/health
The -f
flag makes curl fail silently on HTTP errors, -s
suppresses progress output, -o /dev/null
discards response body, and -w "%{http_code}"
outputs only the HTTP status code.
Comprehensive Health Check
For more detailed monitoring, include response time and connection details:
curl -w "HTTP Status: %{http_code}\nResponse Time: %{time_total}s\nDNS Lookup: %{time_namelookup}s\nConnect Time: %{time_connect}s\n" \
-o /dev/null -s --connect-timeout 10 --max-time 30 \
https://api.example.com/health
Advanced Health Check Techniques
JSON Response Validation
Many health endpoints return JSON with status information. Use jq
to parse and validate responses:
# Check if service status is "healthy"
response=$(curl -s --connect-timeout 5 --max-time 10 https://api.example.com/health)
status=$(echo "$response" | jq -r '.status')
if [ "$status" = "healthy" ]; then
echo "Service is healthy"
exit 0
else
echo "Service is unhealthy: $status"
exit 1
fi
Multiple Endpoint Monitoring
Check multiple services or endpoints simultaneously:
#!/bin/bash
endpoints=(
"https://api.example.com/health"
"https://database.example.com/status"
"https://cache.example.com/ping"
)
for endpoint in "${endpoints[@]}"; do
http_code=$(curl -f -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "$endpoint")
if [ "$http_code" -eq 200 ]; then
echo "✓ $endpoint - OK"
else
echo "✗ $endpoint - FAILED (HTTP $http_code)"
fi
done
Timeout and Error Handling
Setting Appropriate Timeouts
Configure timeouts to prevent hanging health checks:
# Conservative timeouts for production monitoring
curl --connect-timeout 5 \ # 5 seconds for connection
--max-time 15 \ # 15 seconds total timeout
--retry 3 \ # Retry 3 times on failure
--retry-delay 2 \ # 2 seconds between retries
-f -s -o /dev/null -w "%{http_code}" \
https://api.example.com/health
Error Code Interpretation
Handle different types of failures appropriately:
#!/bin/bash
health_check() {
local url=$1
local http_code
http_code=$(curl -f -s -o /dev/null -w "%{http_code}" \
--connect-timeout 5 --max-time 10 "$url" 2>/dev/null)
case $http_code in
200)
echo "Service healthy"
return 0
;;
404|502|503|504)
echo "Service unavailable (HTTP $http_code)"
return 1
;;
000)
echo "Connection failed"
return 2
;;
*)
echo "Unexpected response (HTTP $http_code)"
return 3
;;
esac
}
Authentication and Headers
API Key Authentication
Include authentication headers for protected health endpoints:
# API key authentication
curl -H "Authorization: Bearer YOUR_API_TOKEN" \
-H "Content-Type: application/json" \
-f -s -o /dev/null -w "%{http_code}" \
https://api.example.com/health
# Basic authentication
curl -u username:password \
-f -s -o /dev/null -w "%{http_code}" \
https://api.example.com/health
Custom User Agent
Set a custom user agent to identify health check requests:
curl -H "User-Agent: HealthChecker/1.0" \
-f -s -o /dev/null -w "%{http_code}" \
https://api.example.com/health
Monitoring Scripts and Automation
Comprehensive Health Check Script
Create a reusable script for production monitoring:
#!/bin/bash
# health_check.sh
SERVICE_URL="${1:-https://api.example.com/health}"
TIMEOUT="${2:-10}"
EXPECTED_STATUS="${3:-200}"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
health_check() {
local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
# Perform health check with detailed timing
local curl_output
curl_output=$(curl -w "http_code:%{http_code};time_total:%{time_total};time_connect:%{time_connect}" \
-s -o /dev/null --connect-timeout 5 --max-time "$TIMEOUT" "$SERVICE_URL" 2>&1)
local exit_code=$?
if [ $exit_code -eq 0 ]; then
local http_code=$(echo "$curl_output" | grep -o 'http_code:[0-9]*' | cut -d: -f2)
local time_total=$(echo "$curl_output" | grep -o 'time_total:[0-9.]*' | cut -d: -f2)
local time_connect=$(echo "$curl_output" | grep -o 'time_connect:[0-9.]*' | cut -d: -f2)
if [ "$http_code" -eq "$EXPECTED_STATUS" ]; then
echo -e "${GREEN}[$timestamp] ✓ Service healthy${NC} - HTTP $http_code (${time_total}s)"
return 0
else
echo -e "${YELLOW}[$timestamp] ⚠ Unexpected status${NC} - HTTP $http_code (${time_total}s)"
return 1
fi
else
echo -e "${RED}[$timestamp] ✗ Service unreachable${NC} - Connection failed"
return 2
fi
}
# Run health check
health_check
exit $?
Continuous Monitoring Loop
For ongoing monitoring, create a loop with configurable intervals:
#!/bin/bash
# continuous_monitor.sh
INTERVAL="${1:-60}" # Check every 60 seconds
SERVICE_URL="${2:-https://api.example.com/health}"
echo "Starting continuous health monitoring (interval: ${INTERVAL}s)"
echo "Monitoring: $SERVICE_URL"
echo "Press Ctrl+C to stop"
echo
while true; do
./health_check.sh "$SERVICE_URL"
sleep "$INTERVAL"
done
Integration with Monitoring Systems
Nagios Integration
Create a Nagios-compatible health check plugin:
#!/bin/bash
# nagios_health_check.sh
SERVICE_URL="$1"
WARNING_TIME="$2"
CRITICAL_TIME="$3"
if [ -z "$SERVICE_URL" ]; then
echo "UNKNOWN - Missing service URL parameter"
exit 3
fi
# Perform health check
curl_output=$(curl -w "%{http_code};%{time_total}" -s -o /dev/null \
--connect-timeout 5 --max-time 15 "$SERVICE_URL" 2>/dev/null)
if [ $? -ne 0 ]; then
echo "CRITICAL - Service unreachable"
exit 2
fi
http_code=$(echo "$curl_output" | cut -d';' -f1)
response_time=$(echo "$curl_output" | cut -d';' -f2)
if [ "$http_code" -ne 200 ]; then
echo "CRITICAL - HTTP $http_code"
exit 2
fi
# Check response time thresholds
if (( $(echo "$response_time > $CRITICAL_TIME" | bc -l) )); then
echo "CRITICAL - Response time ${response_time}s exceeds ${CRITICAL_TIME}s"
exit 2
elif (( $(echo "$response_time > $WARNING_TIME" | bc -l) )); then
echo "WARNING - Response time ${response_time}s exceeds ${WARNING_TIME}s"
exit 1
else
echo "OK - Service responding in ${response_time}s"
exit 0
fi
Prometheus Metrics
Export health check metrics for Prometheus monitoring:
#!/bin/bash
# prometheus_health_exporter.sh
METRICS_FILE="/var/lib/prometheus/node-exporter/health_check.prom"
perform_check() {
local service_name="$1"
local service_url="$2"
local start_time=$(date +%s.%N)
local http_code=$(curl -f -s -o /dev/null -w "%{http_code}" \
--connect-timeout 5 --max-time 10 "$service_url" 2>/dev/null)
local end_time=$(date +%s.%N)
local response_time=$(echo "$end_time - $start_time" | bc)
local status=0
if [ "$http_code" -eq 200 ]; then
status=1
fi
# Write metrics
echo "health_check_status{service=\"$service_name\"} $status" >> "$METRICS_FILE.tmp"
echo "health_check_response_time{service=\"$service_name\"} $response_time" >> "$METRICS_FILE.tmp"
echo "health_check_http_code{service=\"$service_name\"} $http_code" >> "$METRICS_FILE.tmp"
}
# Clear previous metrics
> "$METRICS_FILE.tmp"
# Check multiple services
perform_check "api" "https://api.example.com/health"
perform_check "database" "https://db.example.com/status"
perform_check "cache" "https://cache.example.com/ping"
# Atomically update metrics file
mv "$METRICS_FILE.tmp" "$METRICS_FILE"
Best Practices and Considerations
Performance Optimization
- Use appropriate timeouts to prevent hanging checks
- Implement exponential backoff for retry logic
- Cache DNS lookups when checking multiple endpoints
- Use connection pooling for frequent checks
Security Considerations
When implementing health checks, consider:
- Use HTTPS for sensitive health endpoints
- Implement proper authentication for internal services
- Avoid exposing sensitive information in health responses
- Use dedicated health check endpoints separate from main APIs
Alerting Strategy
Design effective alerting based on health check results:
# Example alerting logic
consecutive_failures=0
max_failures=3
while true; do
if ! ./health_check.sh; then
((consecutive_failures++))
if [ $consecutive_failures -ge $max_failures ]; then
# Send alert (email, Slack, PagerDuty, etc.)
echo "ALERT: Service down for $consecutive_failures consecutive checks"
fi
else
consecutive_failures=0
fi
sleep 60
done
Conclusion
Curl provides a robust foundation for implementing comprehensive health checks. By combining proper timeout configuration, response validation, and integration with monitoring systems, you can create reliable service monitoring solutions. Remember to implement appropriate retry logic, handle different failure scenarios, and design alerting strategies that minimize false positives while ensuring rapid incident response.
For more advanced monitoring scenarios involving JavaScript-heavy applications, consider complementing Curl-based checks with browser automation tools for comprehensive application monitoring.