
How do I Stream Large Files with HttpClient (C#)?

When working with large files in C#, streaming is essential to avoid loading entire files into memory, which can cause performance issues or even application crashes. HttpClient provides several methods to stream large files efficiently, allowing you to download files of any size while keeping memory usage minimal.

Why Stream Large Files?

Loading a large file entirely into memory using methods like GetStringAsync() or GetByteArrayAsync() can cause:

  • High memory consumption - A 1GB file would require at least 1GB of RAM
  • OutOfMemoryException errors for very large files
  • Poor performance due to waiting for the entire download before processing
  • Lack of progress tracking during long downloads

Streaming solves these problems by processing data in small chunks as it arrives.
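
For contrast, here is the pattern to avoid; a minimal sketch where the URL and file name are placeholders. GetByteArrayAsync buffers the entire response body before returning:

// Anti-pattern: the whole response body is held in memory at once.
// A 1 GB download allocates at least a 1 GB byte array.
byte[] allBytes = await httpClient.GetByteArrayAsync("https://example.com/large-file.zip");
await File.WriteAllBytesAsync("output.zip", allBytes);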

Basic File Streaming with HttpClient

The most straightforward way to stream a large file is using HttpCompletionOption.ResponseHeadersRead, which tells HttpClient to return control as soon as headers are received rather than waiting for the entire content.

using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public class FileDownloader
{
    private static readonly HttpClient httpClient = new HttpClient();

    public async Task DownloadFileAsync(string url, string outputPath)
    {
        // Request with ResponseHeadersRead to start streaming immediately
        using (HttpResponseMessage response = await httpClient.GetAsync(url,
            HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();

            // Get the content stream
            using (Stream contentStream = await response.Content.ReadAsStreamAsync())
            using (FileStream fileStream = new FileStream(outputPath,
                FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
            {
                // Copy the content stream to file stream
                await contentStream.CopyToAsync(fileStream);
            }
        }
    }
}

This approach downloads the file in chunks without loading it entirely into memory.
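
Calling it looks like this (the URL and output path are placeholders):

var downloader = new FileDownloader();
await downloader.DownloadFileAsync("https://example.com/large-file.zip", "output.zip");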

Streaming with Progress Tracking

For large downloads, users often need progress feedback. Here's how to implement progress tracking while streaming:

using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public class FileDownloaderWithProgress
{
    private static readonly HttpClient httpClient = new HttpClient();

    public async Task DownloadFileWithProgressAsync(
        string url,
        string outputPath,
        IProgress<DownloadProgress> progress = null)
    {
        using (HttpResponseMessage response = await httpClient.GetAsync(url,
            HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();

            long? totalBytes = response.Content.Headers.ContentLength;

            using (Stream contentStream = await response.Content.ReadAsStreamAsync())
            using (FileStream fileStream = new FileStream(outputPath,
                FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
            {
                var buffer = new byte[8192];
                long totalBytesRead = 0;
                int bytesRead;

                while ((bytesRead = await contentStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await fileStream.WriteAsync(buffer, 0, bytesRead);
                    totalBytesRead += bytesRead;

                    // Report progress
                    if (totalBytes.HasValue)
                    {
                        var progressInfo = new DownloadProgress
                        {
                            BytesReceived = totalBytesRead,
                            TotalBytes = totalBytes.Value,
                            PercentComplete = (double)totalBytesRead / totalBytes.Value * 100
                        };
                        progress?.Report(progressInfo);
                    }
                }
            }
        }
    }
}

public class DownloadProgress
{
    public long BytesReceived { get; set; }
    public long TotalBytes { get; set; }
    public double PercentComplete { get; set; }
}

Using the Progress Tracker

var downloader = new FileDownloaderWithProgress();
var progressReporter = new Progress<DownloadProgress>(progress =>
{
    Console.WriteLine($"Downloaded {progress.BytesReceived:N0} of {progress.TotalBytes:N0} bytes " +
                      $"({progress.PercentComplete:F2}%)");
});

await downloader.DownloadFileWithProgressAsync(
    "https://example.com/large-file.zip",
    "output.zip",
    progressReporter
);

Streaming with Cancellation Support

For long-running downloads, you should support cancellation to allow users to stop the download:

public async Task DownloadFileWithCancellationAsync(
    string url,
    string outputPath,
    CancellationToken cancellationToken = default)
{
    using (HttpResponseMessage response = await httpClient.GetAsync(url,
        HttpCompletionOption.ResponseHeadersRead, cancellationToken))
    {
        response.EnsureSuccessStatusCode();

        using (Stream contentStream = await response.Content.ReadAsStreamAsync())
        using (FileStream fileStream = new FileStream(outputPath,
            FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
        {
            var buffer = new byte[8192];
            int bytesRead;

            while ((bytesRead = await contentStream.ReadAsync(
                buffer, 0, buffer.Length, cancellationToken)) > 0)
            {
                await fileStream.WriteAsync(buffer, 0, bytesRead, cancellationToken);
            }
        }
    }
}

Example with Cancellation

var cts = new CancellationTokenSource();

// Cancel after 30 seconds
cts.CancelAfter(TimeSpan.FromSeconds(30));

try
{
    await DownloadFileWithCancellationAsync(
        "https://example.com/large-file.zip",
        "output.zip",
        cts.Token
    );
}
catch (OperationCanceledException)
{
    Console.WriteLine("Download was cancelled");
}

Complete Production-Ready Implementation

Here's a robust, production-ready implementation combining all best practices:

using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class RobustFileDownloader
{
    private static readonly HttpClient httpClient = new HttpClient
    {
        Timeout = Timeout.InfiniteTimeSpan // We'll use CancellationToken instead
    };

    public async Task<bool> DownloadFileAsync(
        string url,
        string outputPath,
        IProgress<DownloadProgress> progress = null,
        CancellationToken cancellationToken = default)
    {
        try
        {
            // Ensure directory exists
            string directory = Path.GetDirectoryName(outputPath);
            if (!string.IsNullOrEmpty(directory))
            {
                Directory.CreateDirectory(directory);
            }

            using (HttpResponseMessage response = await httpClient.GetAsync(url,
                HttpCompletionOption.ResponseHeadersRead, cancellationToken))
            {
                response.EnsureSuccessStatusCode();

                long? totalBytes = response.Content.Headers.ContentLength;

                using (Stream contentStream = await response.Content.ReadAsStreamAsync())
                using (FileStream fileStream = new FileStream(outputPath,
                    FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
                {
                    var buffer = new byte[8192];
                    long totalBytesRead = 0;
                    int bytesRead;
                    var lastReportTime = DateTime.UtcNow;

                    while ((bytesRead = await contentStream.ReadAsync(
                        buffer, 0, buffer.Length, cancellationToken)) > 0)
                    {
                        await fileStream.WriteAsync(buffer, 0, bytesRead, cancellationToken);
                        totalBytesRead += bytesRead;

                        // Report progress (throttle to avoid excessive updates)
                        if (progress != null && DateTime.UtcNow - lastReportTime > TimeSpan.FromMilliseconds(100))
                        {
                            var progressInfo = new DownloadProgress
                            {
                                BytesReceived = totalBytesRead,
                                TotalBytes = totalBytes ?? 0,
                                PercentComplete = totalBytes.HasValue
                                    ? (double)totalBytesRead / totalBytes.Value * 100
                                    : 0
                            };
                            progress.Report(progressInfo);
                            lastReportTime = DateTime.UtcNow;
                        }
                    }

                    // Final progress report
                    if (progress != null && totalBytes.HasValue)
                    {
                        progress.Report(new DownloadProgress
                        {
                            BytesReceived = totalBytesRead,
                            TotalBytes = totalBytes.Value,
                            PercentComplete = 100
                        });
                    }
                }
            }

            return true;
        }
        catch (HttpRequestException ex)
        {
            Console.WriteLine($"HTTP error: {ex.Message}");
            return false;
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("Download was cancelled");
            return false;
        }
        catch (IOException ex)
        {
            Console.WriteLine($"I/O error: {ex.Message}");
            return false;
        }
    }
}
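
A usage sketch tying the robust downloader to console progress reporting and a five-minute timeout (URL and paths are placeholders):

var downloader = new RobustFileDownloader();

var progressReporter = new Progress<DownloadProgress>(p =>
    Console.WriteLine($"{p.PercentComplete:F1}% ({p.BytesReceived:N0} bytes)"));

var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));

bool succeeded = await downloader.DownloadFileAsync(
    "https://example.com/large-file.zip",
    "downloads/output.zip",
    progressReporter,
    cts.Token);

Console.WriteLine(succeeded ? "Download completed" : "Download failed");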

Memory-Efficient Streaming for Processing

If you need to process the file content while streaming (e.g., parsing data during download), you can avoid writing to disk:

public async Task ProcessStreamedDataAsync(string url)
{
    using (HttpResponseMessage response = await httpClient.GetAsync(url,
        HttpCompletionOption.ResponseHeadersRead))
    {
        response.EnsureSuccessStatusCode();

        using (Stream contentStream = await response.Content.ReadAsStreamAsync())
        using (StreamReader reader = new StreamReader(contentStream))
        {
            string line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                // Process each line as it arrives
                ProcessLine(line);
            }
        }
    }
}

private void ProcessLine(string line)
{
    // Your processing logic here
    Console.WriteLine($"Processing: {line}");
}

Best Practices for Streaming Large Files

1. Use HttpCompletionOption.ResponseHeadersRead

Always use HttpCompletionOption.ResponseHeadersRead to start processing as soon as headers arrive:

await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);

2. Configure Appropriate Buffer Sizes

The 8,192-byte (8 KB) buffer used in the examples above works well for most scenarios, but you can adjust it to match your network and disk speed:

var buffer = new byte[81920]; // 80 KB for faster networks

3. Implement Timeout Handling

Large downloads can easily outlast HttpClient's default 100-second Timeout. Set Timeout to Timeout.InfiniteTimeSpan on the shared instance (as the robust implementation above does) and rely on a CancellationToken for timeout control instead:

var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));
await DownloadFileAsync(url, path, null, cts.Token);

4. Reuse HttpClient Instance

Always reuse the same HttpClient instance across your application to avoid socket exhaustion:

private static readonly HttpClient httpClient = new HttpClient();

5. Handle Partial Downloads

For resumable downloads, check if the server supports range requests and implement retry logic with proper range headers.
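
Here's a minimal sketch of resuming a partial download, assuming the server honors the Range header (responding with 206 Partial Content) and reusing the shared httpClient field from the examples above:

// Requires: using System.Net; using System.Net.Http.Headers;
public async Task ResumeDownloadAsync(string url, string outputPath)
{
    // Start from however many bytes are already on disk.
    long existingLength = File.Exists(outputPath) ? new FileInfo(outputPath).Length : 0;

    using (var request = new HttpRequestMessage(HttpMethod.Get, url))
    {
        // Ask the server for the remaining bytes only.
        request.Headers.Range = new RangeHeaderValue(existingLength, null);

        using (HttpResponseMessage response = await httpClient.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead))
        {
            response.EnsureSuccessStatusCode();

            // 206 means the server honored the range; 200 means it resent the whole file.
            bool resumed = response.StatusCode == HttpStatusCode.PartialContent;

            using (Stream contentStream = await response.Content.ReadAsStreamAsync())
            using (FileStream fileStream = new FileStream(outputPath,
                resumed ? FileMode.Append : FileMode.Create,
                FileAccess.Write, FileShare.None, 8192, true))
            {
                await contentStream.CopyToAsync(fileStream);
            }
        }
    }
}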

Streaming in Web Scraping Scenarios

When downloading large datasets or media files during web scraping operations, streaming becomes crucial. While C# HttpClient is excellent for API-based scraping, you might also need to handle file downloads in Puppeteer when dealing with browser-based downloads.

For comprehensive data extraction from streaming responses, consider using specialized APIs that handle the complexity of parsing large responses efficiently without memory overhead.

Performance Considerations

Memory Usage

Streaming keeps memory usage constant regardless of file size. With an 8 KB buffer, the download's working memory stays under 1 MB even when downloading files larger than 10 GB.

Speed Optimization

To maximize download speed:

  • Use appropriate buffer sizes (8KB to 80KB)
  • Enable async I/O for file operations
  • Avoid blocking operations in the download loop
  • Consider compression if supported by the server (see the sketch after this list)
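
For the compression point, a minimal sketch of turning on automatic gzip/deflate decompression via HttpClientHandler (the compressingClient name is just for illustration):

// Requires: using System.Net; using System.Threading;
var handler = new HttpClientHandler
{
    // Sends Accept-Encoding and transparently decompresses the response stream.
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};

// Create once and reuse, just like the plain HttpClient instances above.
var compressingClient = new HttpClient(handler)
{
    Timeout = Timeout.InfiniteTimeSpan
};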

Throttling Progress Reports

Reporting progress on every chunk can slow down downloads. Throttle updates to 100-200ms intervals:

if (DateTime.UtcNow - lastReportTime > TimeSpan.FromMilliseconds(100))
{
    progress.Report(progressInfo);
    lastReportTime = DateTime.UtcNow;
}

Error Handling and Retry Logic

Implement retry logic for transient failures:

public async Task<bool> DownloadWithRetryAsync(
    string url,
    string outputPath,
    int maxRetries = 3)
{
    for (int attempt = 0; attempt < maxRetries; attempt++)
    {
        try
        {
            // Assumes a downloader that throws on failure, such as FileDownloader above
            // (RobustFileDownloader catches its own exceptions and returns false instead).
            await DownloadFileAsync(url, outputPath);
            return true;
        }
        catch (HttpRequestException ex)
        {
            Console.WriteLine($"Attempt {attempt + 1} failed: {ex.Message}");

            if (attempt < maxRetries - 1)
            {
                // Exponential backoff: wait 1s, 2s, 4s, ... between attempts
                await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
    return false;
}

Conclusion

Streaming large files with HttpClient in C# is essential for building robust, memory-efficient applications. By using HttpCompletionOption.ResponseHeadersRead, implementing progress tracking, and supporting cancellation, you can create production-ready download functionality that handles files of any size without overwhelming system resources.

The key principles are:

  • Always stream instead of loading entire files into memory
  • Use async/await throughout your implementation
  • Implement proper error handling and cancellation support
  • Throttle progress updates for optimal performance
  • Reuse HttpClient instances to avoid resource exhaustion

Whether you're building a file download manager, processing large datasets, or implementing web scraping solutions, these streaming techniques will ensure your application remains responsive and efficient even when handling multi-gigabyte files.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
