OAuth lets a scraper or API client make authorized requests without handling a user's password directly, which is why many APIs require it. This guide shows how to implement OAuth 1.0 and OAuth 2.0 authentication with Guzzle, a widely used PHP HTTP client.
Prerequisites
- PHP 7.4 or higher
- Composer installed
- Basic understanding of OAuth concepts
- Valid API credentials from your target service
OAuth 1.0 Implementation
OAuth 1.0 signs every request with your consumer and token secrets and is used by older APIs such as Twitter's v1.1 API. Here's how to implement it with Guzzle:
Installation
composer require guzzlehttp/guzzle
composer require guzzlehttp/oauth-subscriber
Basic OAuth 1.0 Setup
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Subscriber\Oauth\Oauth1;

// Create handler stack and add OAuth middleware
$stack = HandlerStack::create();
$oauth = new Oauth1([
    'consumer_key' => 'your_consumer_key',
    'consumer_secret' => 'your_consumer_secret',
    'token' => 'your_access_token',
    'token_secret' => 'your_access_token_secret'
]);
$stack->push($oauth);

// Create client with OAuth handler
$client = new Client([
    'base_uri' => 'https://api.example.com/',
    'handler' => $stack,
    'auth' => 'oauth'
]);

try {
    $response = $client->get('1.1/statuses/user_timeline.json');
    $data = json_decode($response->getBody(), true);
    print_r($data);
} catch (Exception $e) {
    echo "Error: " . $e->getMessage();
}
OAuth 1.0 with Custom Parameters
<?php
use GuzzleHttp\Subscriber\Oauth\Oauth1;

// For APIs requiring specific OAuth parameters
$oauth = new Oauth1([
    'consumer_key' => 'your_consumer_key',
    'consumer_secret' => 'your_consumer_secret',
    'token' => 'your_access_token',
    'token_secret' => 'your_access_token_secret',
    'signature_method' => Oauth1::SIGNATURE_METHOD_HMAC, // HMAC-SHA1
    'realm' => 'your_realm', // optional
    'version' => '1.0'
]);
OAuth 2.0 Implementation
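OAuth 2.0 is the newer and more widely adopted protocol. It replaces per-request signatures with bearer access tokens obtained through a grant type. The examples below cover the most common grant types: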
Installation
composer require kamermans/guzzle-oauth2-subscriber
Client Credentials Grant (Server-to-Server)
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use kamermans\OAuth2\OAuth2Middleware;
use kamermans\OAuth2\GrantType\ClientCredentials;

// Client used only for token requests. The grant type sends its token request
// to this client's base_uri, so set it to the full token endpoint URL.
$reauthClient = new Client(['base_uri' => 'https://api.example.com/oauth/token']);

// Create OAuth2 middleware
$stack = HandlerStack::create();
$oauth2 = new OAuth2Middleware(
    new ClientCredentials($reauthClient, [
        'client_id' => 'your_client_id',
        'client_secret' => 'your_client_secret',
        'scope' => 'read write',
    ])
);
$stack->push($oauth2);

// Create authenticated client. 'auth' => 'oauth' tells the middleware to
// attach the access token; without it, requests are sent unauthenticated.
$client = new Client([
    'base_uri' => 'https://api.example.com/',
    'handler' => $stack,
    'auth' => 'oauth',
]);

try {
    $response = $client->get('/api/data');
    $data = json_decode($response->getBody(), true);
    print_r($data);
} catch (Exception $e) {
    echo "Error: " . $e->getMessage();
}
Authorization Code Grant (Web Applications)
<?php
use GuzzleHttp\Client;
use kamermans\OAuth2\OAuth2Middleware;
use kamermans\OAuth2\GrantType\AuthorizationCode;

// Step 1: Get authorization code (redirect user to authorization server)
$authUrl = 'https://api.example.com/oauth/authorize?' . http_build_query([
    'client_id' => 'your_client_id',
    'redirect_uri' => 'https://yourapp.com/callback',
    'response_type' => 'code',
    'scope' => 'read write'
]);

// Step 2: On the callback, exchange the authorization code for an access token.
// The token client's base_uri should be the full token endpoint URL, and the
// grant type reads the authorization code from the 'code' config key.
$oauth2 = new OAuth2Middleware(
    new AuthorizationCode(
        new Client(['base_uri' => 'https://api.example.com/oauth/token']),
        [
            'client_id' => 'your_client_id',
            'client_secret' => 'your_client_secret',
            'redirect_uri' => 'https://yourapp.com/callback',
            'code' => $_GET['code'], // From callback
        ]
    )
);
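The authorization request above does not include a `state` parameter, which OAuth 2.0 recommends for protecting the callback against cross-site request forgery. A minimal sketch of generating and checking one is shown here; the function names are hypothetical and are not part of Guzzle or the kamermans library.

```php
<?php
// Hypothetical helpers for illustration; not provided by any library used above.

/** Generate an unguessable state value to send with the authorization request. */
function generateOAuthState(): string
{
    return bin2hex(random_bytes(16)); // 16 random bytes as 32 hex characters
}

/** Compare the state echoed back on the callback with the one stored earlier. */
function isValidOAuthState(?string $expected, ?string $received): bool
{
    if ($expected === null || $expected === '' || $received === null) {
        return false;
    }
    return hash_equals($expected, $received); // constant-time comparison
}
```

In a web application you would typically store the generated value in the session (for example under an assumed key such as `$_SESSION['oauth_state']`), add `'state' => $state` to the `http_build_query` array in Step 1, and reject the callback before reading `$_GET['code']` if `isValidOAuthState` returns false.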
Password Grant (Resource Owner Password Credentials)
<?php
use GuzzleHttp\Client;
use kamermans\OAuth2\OAuth2Middleware;
use kamermans\OAuth2\GrantType\PasswordCredentials;

// The token client's base_uri should be the full token endpoint URL
$oauth2 = new OAuth2Middleware(
    new PasswordCredentials(
        new Client(['base_uri' => 'https://api.example.com/oauth/token']),
        [
            'client_id' => 'your_client_id',
            'client_secret' => 'your_client_secret',
            'username' => 'user@example.com',
            'password' => 'user_password',
            'scope' => 'read write',
        ]
    )
);
Advanced Configuration
Token Persistence
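<?php
use GuzzleHttp\Client;
use kamermans\OAuth2\OAuth2Middleware;
use kamermans\OAuth2\GrantType\ClientCredentials;
use kamermans\OAuth2\Persistence\FileTokenPersistence;

// Save tokens to file for reuse
$tokenPersistence = new FileTokenPersistence('/path/to/token.json');

$oauth2 = new OAuth2Middleware(
    new ClientCredentials(
        new Client(['base_uri' => 'https://api.example.com/oauth/token']),
        [
            'client_id' => 'your_client_id',
            'client_secret' => 'your_client_secret',
        ]
    )
);

// Persistence is attached with a setter; the constructor's second argument
// is reserved for a refresh-token grant type.
$oauth2->setTokenPersistence($tokenPersistence);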
Custom Token Refresh
<?php
use GuzzleHttp\Client;
use kamermans\OAuth2\OAuth2Middleware;
use kamermans\OAuth2\GrantType\RefreshToken;

// Obtain access tokens from an existing refresh token.
// The token client's base_uri should be the full token endpoint URL.
$refreshGrant = new RefreshToken(
    new Client(['base_uri' => 'https://api.example.com/oauth/token']),
    [
        'client_id' => 'your_client_id',
        'client_secret' => 'your_client_secret',
        'refresh_token' => 'your_refresh_token',
    ]
);
$oauth2 = new OAuth2Middleware($refreshGrant);
Error Handling and Retries
<?php
use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\ServerException;

function makeAuthenticatedRequest($client, $endpoint) {
    $maxRetries = 3;
    $retryDelay = 1; // seconds

    for ($i = 0; $i < $maxRetries; $i++) {
        try {
            $response = $client->get($endpoint);
            return json_decode($response->getBody(), true);
        } catch (ClientException $e) {
            if ($e->getResponse()->getStatusCode() === 401) {
                // Token expired, middleware should handle refresh
                if ($i === $maxRetries - 1) {
                    throw new Exception('Authentication failed after retries');
                }
                sleep($retryDelay);
                continue;
            }
            throw $e; // Other 4xx errors are not retried
        } catch (ServerException $e) {
            if ($i === $maxRetries - 1) {
                throw $e;
            }
            sleep($retryDelay * (2 ** $i)); // Exponential backoff: 1s, then 2s
        }
    }
}
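If you raise the retry count, an uncapped doubling delay grows quickly. The following sketch shows one way to compute a capped exponential delay; `backoffDelay` and its default values are assumptions for illustration, not part of Guzzle.

```php
<?php
// Hypothetical helper: the name and the defaults are illustrative assumptions.

/**
 * Delay in seconds before retry number $attempt (0-based):
 * $baseDelay * 2^$attempt, capped at $maxDelay.
 */
function backoffDelay(int $attempt, int $baseDelay = 1, int $maxDelay = 30): int
{
    $attempt = max(0, $attempt);
    // Limit the exponent so very large attempt counts cannot overflow an integer.
    $delay = $baseDelay * (2 ** min($attempt, 20));
    return min($delay, $maxDelay);
}
```

Inside a retry loop this could replace the inline arithmetic, for example `sleep(backoffDelay($i));`.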
Complete Example: Twitter API with OAuth 1.0
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Subscriber\Oauth\Oauth1;

class TwitterScraper {
    private $client;

    public function __construct($consumerKey, $consumerSecret, $accessToken, $accessTokenSecret) {
        $stack = HandlerStack::create();
        $oauth = new Oauth1([
            'consumer_key' => $consumerKey,
            'consumer_secret' => $consumerSecret,
            'token' => $accessToken,
            'token_secret' => $accessTokenSecret
        ]);
        $stack->push($oauth);

        $this->client = new Client([
            'base_uri' => 'https://api.twitter.com/',
            'handler' => $stack,
            'auth' => 'oauth'
        ]);
    }

    public function getUserTimeline($username, $count = 20) {
        try {
            $response = $this->client->get('1.1/statuses/user_timeline.json', [
                'query' => [
                    'screen_name' => $username,
                    'count' => $count,
                    'tweet_mode' => 'extended'
                ]
            ]);
            return json_decode($response->getBody(), true);
        } catch (Exception $e) {
            throw new Exception("Failed to fetch timeline: " . $e->getMessage());
        }
    }
}

// Usage
$scraper = new TwitterScraper(
    'your_consumer_key',
    'your_consumer_secret',
    'your_access_token',
    'your_access_token_secret'
);
$tweets = $scraper->getUserTimeline('username');
foreach ($tweets as $tweet) {
    echo $tweet['full_text'] . "\n\n";
}
Best Practices
- Environment Variables: Store credentials in environment variables, not in code
- Token Refresh: Implement automatic token refresh for OAuth 2.0
- Error Handling: Always handle authentication errors gracefully
- Rate Limiting: Respect API rate limits and implement backoff strategies
- Logging: Log authentication events for debugging
- Secure Storage: Use secure methods to store tokens and credentials
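As an illustration of the first practice, credentials can be read from the environment and checked up front. This is a minimal sketch; the helper name and the variable names in the usage note are assumptions, not a convention required by Guzzle.

```php
<?php
// Hypothetical helper; the function name is an illustrative assumption.

/**
 * Read required settings from the environment and fail fast if any is missing.
 *
 * @param string[] $names Environment variable names to read
 * @return array<string, string> Values keyed by variable name
 */
function loadCredentialsFromEnv(array $names): array
{
    $credentials = [];
    foreach ($names as $name) {
        $value = getenv($name);
        if ($value === false || $value === '') {
            throw new RuntimeException("Missing environment variable: {$name}");
        }
        $credentials[$name] = $value;
    }
    return $credentials;
}
```

A call such as `loadCredentialsFromEnv(['OAUTH_CLIENT_ID', 'OAUTH_CLIENT_SECRET'])` returns the values to pass into the grant type configuration instead of hard-coded strings.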
Common Issues and Solutions
Token Expiration
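// Check if token is expired and refresh if needed.
// With Guzzle's default 'http_errors' => true, a 401 response throws a
// ClientException instead, so this check only runs when the request was
// sent with 'http_errors' => false.
if ($response->getStatusCode() === 401) {
    // Token expired - OAuth2Middleware should handle this automatically
    // For manual handling, implement token refresh logic
}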
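If you manage tokens yourself, you can also check the expiry time before sending a request rather than waiting for a 401. The sketch below assumes your application stores the token as an array with an `expires_at` Unix timestamp; that field name and the helper itself are assumptions, not something the middleware exposes.

```php
<?php
// Hypothetical helper; the 'expires_at' field (Unix timestamp) is an assumption
// about how your own application stores tokens.

/**
 * Report whether a stored token has expired or will expire within the leeway.
 */
function isTokenExpired(array $token, ?int $now = null, int $leewaySeconds = 60): bool
{
    if (!isset($token['expires_at'])) {
        return false; // No expiry information: rely on the API's 401 response
    }
    $now = $now ?? time();
    return ((int) $token['expires_at'] - $leewaySeconds) <= $now;
}
```

The leeway treats a token as expired slightly early, which avoids sending a request with a token that lapses while the request is in flight.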
Invalid Signatures (OAuth 1.0)
// Ensure system time is synchronized
// Verify all OAuth parameters are correctly encoded
// Check that the signature method matches the API requirements
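When debugging a rejected signature, it can help to rebuild the signature base string by hand and compare it with what the server expects. The following is a simplified sketch of the HMAC-SHA1 method described in RFC 5849; the function names are hypothetical, it assumes each parameter name appears only once, and the oauth-subscriber package performs the real signing internally.

```php
<?php
// Hypothetical debugging helpers; not part of guzzlehttp/oauth-subscriber.

/**
 * Build the signature base string: METHOD & encoded URL & encoded parameters.
 * $url must not contain a query string; $params holds the query, body, and
 * oauth_* parameters (except oauth_signature).
 */
function oauth1BaseString(string $method, string $url, array $params): string
{
    // Percent-encode names and values first, then sort by encoded name.
    $encoded = [];
    foreach ($params as $key => $value) {
        $encoded[rawurlencode((string) $key)] = rawurlencode((string) $value);
    }
    ksort($encoded, SORT_STRING);

    $pairs = [];
    foreach ($encoded as $key => $value) {
        $pairs[] = $key . '=' . $value;
    }

    return strtoupper($method)
        . '&' . rawurlencode($url)
        . '&' . rawurlencode(implode('&', $pairs));
}

/** Sign the base string with HMAC-SHA1 and return the Base64 signature. */
function oauth1HmacSha1(string $baseString, string $consumerSecret, string $tokenSecret = ''): string
{
    // The key is both secrets, each percent-encoded, joined by '&'.
    $key = rawurlencode($consumerSecret) . '&' . rawurlencode($tokenSecret);
    return base64_encode(hash_hmac('sha1', $baseString, $key, true));
}
```

Note that values are encoded twice in the base string (a space becomes `%2520`), which is a frequent source of mismatches when parameters are encoded with `urlencode` instead of `rawurlencode`.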
Legal and Ethical Considerations
When using OAuth for web scraping:
- Terms of Service: Always comply with API terms of service
- Rate Limits: Respect imposed rate limits
- Data Usage: Only collect data you're authorized to access
- Privacy: Handle user data responsibly and in compliance with privacy laws
- Authentication: Never share or expose authentication credentials
OAuth authentication provides secure, authorized access to APIs. Use it responsibly and in accordance with the service provider's terms and applicable laws.