How can I use Guzzle to download files?

Guzzle is a powerful PHP HTTP client that makes downloading files straightforward and memory-efficient. You can download files by sending a GET request to the file URL and streaming the response directly to disk using the sink option.

Basic File Download

Installation

First, install Guzzle via Composer:

composer require guzzlehttp/guzzle

Simple Download Example

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

$client = new Client();
$url = 'https://example.com/file.pdf';
$saveTo = 'downloads/file.pdf';

try {
    $response = $client->request('GET', $url, ['sink' => $saveTo]);
    echo "File downloaded successfully to " . $saveTo . "\n";
    echo "File size: " . filesize($saveTo) . " bytes\n";
} catch (RequestException $e) {
    echo "Download failed: " . $e->getMessage() . "\n";
    if ($e->hasResponse()) {
        echo "HTTP Status: " . $e->getResponse()->getStatusCode() . "\n";
    }
}

Advanced Download Features

Download with Progress Tracking

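Pass a callable as the progress option to monitor large downloads. Guzzle invokes it repeatedly with the expected and transferred byte counts for both the download and the upload: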

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

$client = new Client();
$url = 'https://example.com/large-file.zip';
$saveTo = 'downloads/large-file.zip';

try {
    $response = $client->request('GET', $url, [
        'sink' => $saveTo,
        'progress' => function ($downloadTotal, $downloadedBytes, $uploadTotal, $uploadedBytes) {
            if ($downloadTotal > 0) {
                $percent = round(($downloadedBytes / $downloadTotal) * 100, 2);
                echo "\rProgress: {$percent}% ({$downloadedBytes}/{$downloadTotal} bytes)";
            }
        }
    ]);
    echo "\nDownload completed!\n";
} catch (RequestException $e) {
    echo "\nDownload failed: " . $e->getMessage() . "\n";
}
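
When the server sends no Content-Length header, Guzzle reports the expected download total as 0, so the callback above prints nothing. The sketch below shows one way to fall back to a running byte count. The helper name formatProgress is hypothetical and not part of Guzzle.

```php
<?php

/**
 * Build a one-line progress message.
 * formatProgress is a hypothetical helper, not a Guzzle function.
 */
function formatProgress(int $downloadTotal, int $downloadedBytes): string
{
    if ($downloadTotal > 0) {
        $percent = round(($downloadedBytes / $downloadTotal) * 100, 2);
        return "Progress: {$percent}% ({$downloadedBytes}/{$downloadTotal} bytes)";
    }

    // Total unknown (reported as 0): show only the running byte count.
    return "Progress: {$downloadedBytes} bytes";
}

// Usage inside the 'progress' request option:
// 'progress' => function ($downloadTotal, $downloadedBytes, $uploadTotal, $uploadedBytes) {
//     echo "\r" . formatProgress((int) $downloadTotal, (int) $downloadedBytes);
// },
```

Keeping the formatting in a separate function also makes it easy to send progress to a log instead of the terminal.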

Download with Custom Headers and Authentication

Download files that require authentication or custom headers:

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

$client = new Client();
$url = 'https://api.example.com/secure/file.pdf';
$saveTo = 'downloads/secure-file.pdf';

try {
    $response = $client->request('GET', $url, [
        'sink' => $saveTo,
        'headers' => [
            'Authorization' => 'Bearer YOUR_API_TOKEN',
            'User-Agent' => 'MyApp/1.0'
        ],
        'timeout' => 30, // 30 second timeout
        'verify' => true // Verify SSL certificates
    ]);
    echo "Secure file downloaded successfully!\n";
} catch (RequestException $e) {
    echo "Download failed: " . $e->getMessage() . "\n";
}

Download with Directory Creation

Automatically create directories if they don't exist:

<?php

require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

function downloadFile($url, $saveTo) {
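    // Create the directory if it doesn't exist; report failure in the same shape as other errors
    $directory = dirname($saveTo);
    if (!is_dir($directory) && !mkdir($directory, 0755, true) && !is_dir($directory)) {
        return [
            'success' => false,
            'error' => "Could not create directory {$directory}",
            'status_code' => null
        ];
    }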

    $client = new Client();

    try {
        $response = $client->request('GET', $url, [
            'sink' => $saveTo,
            'headers' => [
                'User-Agent' => 'Mozilla/5.0 (compatible; FileDownloader/1.0)'
            ]
        ]);

        return [
            'success' => true,
            'file_size' => filesize($saveTo),
            'content_type' => $response->getHeader('Content-Type')[0] ?? 'unknown'
        ];
    } catch (RequestException $e) {
        return [
            'success' => false,
            'error' => $e->getMessage(),
            'status_code' => $e->hasResponse() ? $e->getResponse()->getStatusCode() : null
        ];
    }
}

// Usage
$result = downloadFile('https://example.com/document.pdf', 'downloads/documents/document.pdf');

if ($result['success']) {
    echo "Downloaded successfully!\n";
    echo "File size: {$result['file_size']} bytes\n";
    echo "Content type: {$result['content_type']}\n";
} else {
    echo "Download failed: {$result['error']}\n";
    if ($result['status_code']) {
        echo "HTTP Status: {$result['status_code']}\n";
    }
}

Important Considerations

Memory Efficiency

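The sink option streams the response body directly to disk, so memory use stays low regardless of file size. Without sink, Guzzle buffers the body in a temporary stream, and converting that body to a string (as file_put_contents does in the second example) loads the whole file into memory: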

// Memory-efficient (recommended)
$client->request('GET', $url, ['sink' => $saveTo]);

// Memory-intensive (avoid for large files)
$response = $client->request('GET', $url);
file_put_contents($saveTo, $response->getBody());
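
If each chunk needs inspecting as it arrives (for example, to compute a checksum), Guzzle's stream request option is another way to keep memory use bounded. The following is a minimal sketch rather than a definitive implementation. The function name streamToFile and the 8192-byte chunk size are illustrative choices, and Composer's autoloader is assumed to be loaded as in the examples above.

```php
<?php

use GuzzleHttp\Client;

/**
 * Copy a response to disk in fixed-size chunks using the 'stream' option.
 * streamToFile and the default chunk size are illustrative, not Guzzle features.
 *
 * @return int number of bytes written
 */
function streamToFile(Client $client, string $url, string $saveTo, int $chunkSize = 8192): int
{
    // With 'stream' => true the body is read on demand instead of being buffered up front.
    $response = $client->request('GET', $url, ['stream' => true]);
    $body = $response->getBody();

    $handle = fopen($saveTo, 'wb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$saveTo} for writing");
    }

    $written = 0;
    try {
        while (!$body->eof()) {
            $chunk = $body->read($chunkSize);
            // A place to hash, scan, or count the chunk before it is written.
            $bytes = fwrite($handle, $chunk);
            if ($bytes === false) {
                throw new RuntimeException("Write to {$saveTo} failed");
            }
            $written += $bytes;
        }
    } finally {
        fclose($handle);
    }

    return $written;
}

// Usage:
// $bytes = streamToFile(new Client(), 'https://example.com/large-file.zip', 'downloads/large-file.zip');
```

For a plain download with no per-chunk processing, sink remains the simpler choice.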

File Permissions

Ensure the destination directory has write permissions:

$directory = dirname($saveTo);
if (!is_writable($directory)) {
    throw new Exception("Directory {$directory} is not writable");
}
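
The writability check can be combined with directory creation into a single step that runs before the request. The sketch below shows one possible approach. The helper name ensureWritableDirectory is hypothetical, and the 0755 mode simply mirrors the earlier example.

```php
<?php

/**
 * Make sure the directory that will hold $saveTo exists and is writable.
 * ensureWritableDirectory is a hypothetical helper name.
 *
 * @return string the directory path
 * @throws RuntimeException if the directory cannot be created or written to
 */
function ensureWritableDirectory(string $saveTo): string
{
    $directory = dirname($saveTo);

    // The second is_dir() covers the case where another process created it first.
    if (!is_dir($directory) && !mkdir($directory, 0755, true) && !is_dir($directory)) {
        throw new RuntimeException("Could not create directory {$directory}");
    }

    if (!is_writable($directory)) {
        throw new RuntimeException("Directory {$directory} is not writable");
    }

    return $directory;
}

// Usage, before calling $client->request(...):
// ensureWritableDirectory('downloads/documents/document.pdf');
```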

Error Handling Best Practices

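Catch the most specific exception types first and the general RequestException last. In Guzzle 7, ConnectException does not extend RequestException, so connection failures need their own catch block: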

use GuzzleHttp\Exception\ConnectException;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Exception\ClientException;
use GuzzleHttp\Exception\ServerException;

try {
    $response = $client->request('GET', $url, ['sink' => $saveTo]);
} catch (ConnectException $e) {
    echo "Connection failed: " . $e->getMessage() . "\n";
} catch (ClientException $e) {
    echo "Client error (4xx): " . $e->getResponse()->getStatusCode() . "\n";
} catch (ServerException $e) {
    echo "Server error (5xx): " . $e->getResponse()->getStatusCode() . "\n";
} catch (RequestException $e) {
    echo "Request failed: " . $e->getMessage() . "\n";
}
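
The handlers above report the failure but leave the destination file alone, and a failed transfer can leave a partial or unwanted file at the sink path. One way to avoid that is to download to a temporary name and move the file into place only on success. This is a sketch under stated assumptions: the function name downloadAtomically and the ".part" suffix are illustrative choices, and Composer's autoloader is assumed to be loaded as in the examples above.

```php
<?php

use GuzzleHttp\Client;
use GuzzleHttp\Exception\GuzzleException;

/**
 * Download to "<target>.part" and move it into place only on success.
 * downloadAtomically and the ".part" suffix are illustrative, not Guzzle features.
 *
 * @throws GuzzleException when the request fails (after the partial file is removed)
 */
function downloadAtomically(Client $client, string $url, string $saveTo): void
{
    $partial = $saveTo . '.part';

    try {
        $client->request('GET', $url, ['sink' => $partial]);
    } catch (GuzzleException $e) {
        // Remove whatever was written before the failure, then re-throw.
        if (is_file($partial)) {
            unlink($partial);
        }
        throw $e;
    }

    if (!rename($partial, $saveTo)) {
        throw new RuntimeException("Could not move {$partial} to {$saveTo}");
    }
}

// Usage:
// downloadAtomically(new Client(), 'https://example.com/file.pdf', 'downloads/file.pdf');
```

GuzzleException is the interface implemented by all Guzzle exceptions, so the cleanup runs for connection failures as well as HTTP errors. Callers can still catch the more specific types shown above.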

Guzzle's sink option, together with request options such as progress, headers, and timeout, covers most file-download needs while keeping memory usage low, even for large files.
