How to Set Custom Headers for All Requests in a Guzzle Client

When building web scraping applications or consuming APIs with Guzzle, you often need to set custom headers that should be included with every HTTP request. This is particularly important for authentication, user agent spoofing, or API key management. This guide covers multiple approaches to configure default headers in Guzzle clients.

Understanding Guzzle Headers

Guzzle is a powerful PHP HTTP client library that provides flexible ways to configure request headers. Headers can be set at different levels:

  • Client-level: Headers applied to all requests made by the client
  • Request-level: Headers specific to individual requests
  • Middleware-level: Headers added through custom middleware
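How the client and request levels interact can be sketched as follows. This is a minimal illustration, assuming Guzzle 7 installed through Composer; the URLs and the X-App header are placeholders, and a MockHandler stands in for the server so nothing is sent over the network.

```php
<?php
require __DIR__ . '/vendor/autoload.php'; // assumes a Composer install of Guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

// Queue canned responses so the example runs offline.
$mock = new MockHandler([new Response(200), new Response(200)]);
$stack = HandlerStack::create($mock);

// Record every outgoing request for inspection.
$history = [];
$stack->push(Middleware::history($history));

$client = new Client([
    'handler' => $stack,
    'headers' => ['Accept' => 'application/json', 'X-App' => 'demo'],
]);

// 1) No request-level headers: the client defaults apply.
$client->get('https://api.example.com/users');

// 2) A request-level header replaces the default of the same name;
//    defaults that are not overridden are still sent.
$client->get('https://api.example.com/report', [
    'headers' => ['Accept' => 'text/csv'],
]);

$first = $history[0]['request'];
$second = $history[1]['request'];

echo $first->getHeaderLine('Accept'), "\n";  // application/json
echo $second->getHeaderLine('Accept'), "\n"; // text/csv
echo $second->getHeaderLine('X-App'), "\n";  // demo
```

In short, client-level headers act as defaults: a header passed with an individual request wins over the default with the same name.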

Method 1: Setting Headers During Client Instantiation

The most straightforward approach is to configure default headers when creating the Guzzle client:

<?php
use GuzzleHttp\Client;

$client = new Client([
    'base_uri' => 'https://api.example.com/',
    'timeout' => 30,
    'headers' => [
        'User-Agent' => 'Mozilla/5.0 (compatible; MyBot/1.0)',
        'Accept' => 'application/json',
        'Content-Type' => 'application/json',
        'Authorization' => 'Bearer your-api-token',
        'X-API-Key' => 'your-api-key',
        'X-Custom-Header' => 'custom-value'
    ]
]);

// All requests will now include these headers
$response = $client->get('/users');
$response = $client->post('/data', ['json' => ['key' => 'value']]);

Method 2: Using Configuration Arrays

For more complex configurations, you can define the headers in their own array and pass it to the client as part of a configuration array:

<?php
use GuzzleHttp\Client;

$defaultHeaders = [
    'User-Agent' => 'WebScraper/2.0 (+https://example.com/bot)',
    'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'en-US,en;q=0.5',
    'Accept-Encoding' => 'gzip, deflate',
    'DNT' => '1',
    'Connection' => 'keep-alive',
    'Upgrade-Insecure-Requests' => '1'
];

$config = [
    'base_uri' => 'https://target-website.com/',
    'timeout' => 60,
    'headers' => $defaultHeaders,
    'cookies' => true,
    'allow_redirects' => true
];

$client = new Client($config);

Method 3: Environment-Based Header Configuration

For production applications, it's best practice to manage sensitive headers through environment variables:

<?php
use GuzzleHttp\Client;

class ApiClient
{
    private $client;

    public function __construct()
    {
        $this->client = new Client([
            'base_uri' => getenv('API_BASE_URL'),
            'timeout' => 30,
            'headers' => [
                'User-Agent' => getenv('USER_AGENT') ?: 'DefaultBot/1.0',
                'Authorization' => 'Bearer ' . getenv('API_TOKEN'),
                'X-API-Key' => getenv('API_KEY'),
                'Accept' => 'application/json',
                'Content-Type' => 'application/json'
            ]
        ]);
    }

    public function makeRequest($method, $uri, $options = [])
    {
        return $this->client->request($method, $uri, $options);
    }
}

// Usage
$apiClient = new ApiClient();
$response = $apiClient->makeRequest('GET', '/endpoint');
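One caveat with reading environment variables directly is that getenv() returns false when a variable is missing, which would produce an empty token or key. The sketch below shows one way to fail fast instead. The function name buildHeadersFromEnv() is hypothetical, and the variable names simply follow the example above.

```php
<?php
/**
 * Build default headers from environment variables.
 * buildHeadersFromEnv() is an illustrative helper, not part of Guzzle.
 */
function buildHeadersFromEnv(): array
{
    // Treat a missing or empty token as a configuration error.
    $token = getenv('API_TOKEN');
    if ($token === false || $token === '') {
        throw new RuntimeException('API_TOKEN is not set');
    }

    $headers = [
        'User-Agent' => getenv('USER_AGENT') ?: 'DefaultBot/1.0',
        'Authorization' => 'Bearer ' . $token,
        'Accept' => 'application/json',
    ];

    // Only send X-API-Key when a key is actually configured.
    $apiKey = getenv('API_KEY');
    if ($apiKey !== false && $apiKey !== '') {
        $headers['X-API-Key'] = $apiKey;
    }

    return $headers;
}
```

The returned array can be passed as the 'headers' option when constructing the client, so a misconfigured deployment fails at startup rather than on the first rejected request.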

Method 4: Using Middleware for Dynamic Headers

For scenarios where headers need to be computed dynamically or modified based on request context, middleware provides the most flexibility:

<?php
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;

$stack = HandlerStack::create();

// Add middleware to modify headers for every request
$stack->push(Middleware::mapRequest(function (RequestInterface $request) {
    return $request
        ->withHeader('User-Agent', 'DynamicBot/1.0')
        ->withHeader('X-Request-ID', uniqid())
        ->withHeader('X-Timestamp', (string) time())
        ->withHeader('X-Client-Version', '2.1.0');
}));

$client = new Client([
    'handler' => $stack,
    'base_uri' => 'https://api.example.com/',
    'timeout' => 30
]);
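Note that withHeader() replaces any existing value, so middleware like the one above also overwrites headers the caller set on an individual request. A possible variation, sketched here under the assumption of Guzzle 7 installed through Composer, adds the header only when it is absent. The middleware name 'request_id' and the URLs are placeholders, and a MockHandler replaces the network.

```php
<?php
require __DIR__ . '/vendor/autoload.php'; // assumes a Composer install of Guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;
use Psr\Http\Message\RequestInterface;

$stack = HandlerStack::create(new MockHandler([new Response(200), new Response(200)]));

// Add X-Request-ID only when the caller has not supplied one.
$stack->push(Middleware::mapRequest(function (RequestInterface $request) {
    if ($request->hasHeader('X-Request-ID')) {
        return $request;
    }
    return $request->withHeader('X-Request-ID', bin2hex(random_bytes(8)));
}), 'request_id');

// Pushed last, so it records the request after the middleware above has run.
$history = [];
$stack->push(Middleware::history($history));

$client = new Client(['handler' => $stack]);

$client->get('https://api.example.com/a');
$client->get('https://api.example.com/b', [
    'headers' => ['X-Request-ID' => 'caller-supplied'],
]);

$generated = $history[0]['request']->getHeaderLine('X-Request-ID');
$preserved = $history[1]['request']->getHeaderLine('X-Request-ID');

echo $generated, "\n"; // 16 random hex characters
echo $preserved, "\n"; // caller-supplied
```

random_bytes() is used here instead of uniqid() because uniqid() is derived from the current time and is not guaranteed to be unique.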

Method 5: Combining Static and Dynamic Headers

You can combine static client-level headers with dynamic middleware-based headers:

<?php
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;

$stack = HandlerStack::create();

// Middleware for dynamic headers
$stack->push(Middleware::mapRequest(function (RequestInterface $request) {
    $sessionId = session_id() ?: 'no-session';
    return $request
        ->withHeader('X-Session-ID', $sessionId)
        ->withHeader('X-Request-Time', date('c'));
}));

$client = new Client([
    'handler' => $stack,
    'base_uri' => 'https://api.example.com/',
    'headers' => [
        // Static headers
        'User-Agent' => 'MyApp/1.0',
        'Accept' => 'application/json',
        'Authorization' => 'Bearer ' . getenv('API_TOKEN')
    ]
]);

Common Use Cases and Headers

Web Scraping Headers

When scraping websites, you'll want to mimic real browser behavior:

<?php
use GuzzleHttp\Client;

$scrapingHeaders = [
    'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36',
    'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'Accept-Language' => 'en-US,en;q=0.9',
    // Advertise 'br' only if your HTTP handler (e.g. your cURL build) can decode Brotli
    'Accept-Encoding' => 'gzip, deflate, br',
    'DNT' => '1',
    'Connection' => 'keep-alive',
    'Upgrade-Insecure-Requests' => '1',
    'Sec-Fetch-Dest' => 'document',
    'Sec-Fetch-Mode' => 'navigate',
    'Sec-Fetch-Site' => 'none'
];

$client = new Client([
    'headers' => $scrapingHeaders,
    'cookies' => true
]);

API Authentication Headers

For API consumption with various authentication methods:

<?php
// Bearer Token Authentication
$apiHeaders = [
    'Authorization' => 'Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...',
    'Accept' => 'application/json',
    'Content-Type' => 'application/json'
];

// API Key Authentication
$apiKeyHeaders = [
    'X-API-Key' => 'your-secret-api-key',
    'X-API-Version' => 'v2',
    'Accept' => 'application/json'
];

// Basic Authentication
$basicAuthHeaders = [
    'Authorization' => 'Basic ' . base64_encode('username:password'),
    'Accept' => 'application/json'
];
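For Basic authentication specifically, Guzzle's auth request option can build the Authorization header for you, which avoids encoding the credentials by hand. A minimal sketch, assuming Guzzle 7 installed through Composer; the credentials and URL are placeholders, and a MockHandler stands in for the server.

```php
<?php
require __DIR__ . '/vendor/autoload.php'; // assumes a Composer install of Guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\Psr7\Response;

$stack = HandlerStack::create(new MockHandler([new Response(200)]));

// Record the outgoing request so the generated header can be inspected.
$history = [];
$stack->push(Middleware::history($history));

$client = new Client([
    'handler' => $stack,
    'auth' => ['username', 'password'], // placeholder credentials
    'headers' => ['Accept' => 'application/json'],
]);

$client->get('https://api.example.com/me');

// Guzzle has produced "Basic " followed by base64("username:password").
$sent = $history[0]['request']->getHeaderLine('Authorization');
echo $sent, "\n";
```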

Advanced Header Management

Conditional Headers

Sometimes you need different headers based on the request type or target:

<?php
use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use Psr\Http\Message\RequestInterface;

$stack = HandlerStack::create();

$stack->push(Middleware::mapRequest(function (RequestInterface $request) {
    $uri = $request->getUri();

    // Different headers for different endpoints
    if (strpos($uri->getPath(), '/api/') === 0) {
        return $request
            ->withHeader('Accept', 'application/json')
            ->withHeader('X-API-Client', 'PHP-Client');
    } elseif (strpos($uri->getPath(), '/upload') === 0) {
        return $request
            ->withHeader('Accept', '*/*')
            ->withHeader('X-Upload-Client', 'FileUploader');
    }

    return $request;
}));

$client = new Client(['handler' => $stack]);

Rate Limiting Headers

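Headers such as X-RateLimit-Remaining and X-RateLimit-Reset are conventionally response headers: the server sends them and the client reads them, so they should not be set as defaults on outgoing requests. If an API asks clients to identify themselves so that it can apply per-client limits, send that identifier as a default header:

<?php
// Request header identifying this client; the exact header name depends on the API
$rateLimitHeaders = [
    'X-Client-ID' => 'client-12345'
];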
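On the client side, rate-limit information is usually something to read from the response. The sketch below assumes the API reports X-RateLimit-Remaining and X-RateLimit-Reset, with the reset given as a Unix timestamp; header names and formats vary between APIs. It also assumes Guzzle 7 installed through Composer, and a MockHandler simulates the server's reply.

```php
<?php
require __DIR__ . '/vendor/autoload.php'; // assumes a Composer install of Guzzle

use GuzzleHttp\Client;
use GuzzleHttp\Handler\MockHandler;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Psr7\Response;

// Simulated reply from an API with no requests left in the current window.
$mock = new MockHandler([
    new Response(200, [
        'X-RateLimit-Remaining' => '0',
        'X-RateLimit-Reset' => (string) (time() + 30),
    ]),
]);

$client = new Client([
    'handler' => HandlerStack::create($mock),
    'headers' => ['X-Client-ID' => 'client-12345'],
]);

$response = $client->get('https://api.example.com/items');

$remaining = (int) $response->getHeaderLine('X-RateLimit-Remaining');
$resetAt = (int) $response->getHeaderLine('X-RateLimit-Reset');

// Seconds to pause before the next request; zero while quota remains.
$waitSeconds = $remaining > 0 ? 0 : max(0, $resetAt - time());

echo "Remaining: {$remaining}, wait {$waitSeconds}s\n";
```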

Testing and Debugging Headers

When working with custom headers, it's important to verify they're being sent correctly:

<?php
use GuzzleHttp\Client;
use GuzzleHttp\Middleware;
use GuzzleHttp\HandlerStack;

$stack = HandlerStack::create();

// Add history middleware to track requests
$history = [];
$stack->push(Middleware::history($history));

$client = new Client([
    'handler' => $stack,
    'headers' => [
        'User-Agent' => 'TestBot/1.0',
        'X-Debug' => 'true'
    ]
]);

$response = $client->get('https://httpbin.org/headers');

// Inspect the actual request headers sent
$request = $history[0]['request'];
foreach ($request->getHeaders() as $name => $values) {
    echo $name . ': ' . implode(', ', $values) . "\n";
}

Best Practices

  1. Use Environment Variables: Store sensitive headers like API keys in environment variables
  2. Header Validation: Validate header values before setting them
  3. Consistent Naming: Use consistent header naming conventions across your application
  4. Documentation: Document custom headers and their purposes
  5. Testing: Always test header configuration with real endpoints
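The header validation point can be illustrated with a small guard that runs before headers reach the client. This is a minimal sketch: assertSafeHeaders() is a hypothetical helper, and it checks only that names use valid token characters and that values contain no CR, LF, or NUL characters, which is the usual route for header injection.

```php
<?php
/**
 * Reject header names and values that could enable header injection.
 * assertSafeHeaders() is an illustrative helper, not part of Guzzle.
 */
function assertSafeHeaders(array $headers): array
{
    foreach ($headers as $name => $value) {
        // Header names are limited to HTTP "token" characters.
        if (!is_string($name) || preg_match('/^[A-Za-z0-9!#$%&\'*+.^_`|~-]+$/', $name) !== 1) {
            throw new InvalidArgumentException('Invalid header name: ' . var_export($name, true));
        }

        // A header may carry a single value or a list of values.
        foreach ((array) $value as $item) {
            if (!is_scalar($item) || preg_match('/[\r\n\0]/', (string) $item) === 1) {
                throw new InvalidArgumentException("Invalid value for header {$name}");
            }
        }
    }

    return $headers;
}
```

Because the function returns the array unchanged when it is valid, it can wrap the value passed as the 'headers' option, which is most useful when header values come from user input or configuration files.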

Integration with Web Scraping Services

When using web scraping APIs or services, proper header configuration becomes even more critical. Many modern websites employ sophisticated bot detection mechanisms that analyze request headers. While tools like Puppeteer handle browser sessions automatically, HTTP clients like Guzzle require manual header configuration to appear legitimate.

For complex scenarios involving JavaScript-heavy sites, you might need to combine Guzzle for the initial requests with a headless browser such as Puppeteer to handle content loaded through AJAX.

Conclusion

Setting custom headers for all requests in a Guzzle client is essential for professional web scraping and API consumption. The method you choose depends on your specific requirements:

  • Use client instantiation for simple, static headers
  • Implement middleware for dynamic or conditional headers
  • Leverage environment variables for sensitive configuration
  • Combine approaches for complex scenarios

Remember to always respect rate limits, robots.txt files, and terms of service when implementing automated HTTP requests with custom headers.
