How do I Stream Large Files with HttpClient (C#)?
When working with large files in C#, streaming is essential to avoid loading entire files into memory, which can cause performance issues or even application crashes. HttpClient provides several methods to stream large files efficiently, allowing you to download files of any size while keeping memory usage minimal.
Why Stream Large Files?
Loading a large file entirely into memory using methods like GetStringAsync() or GetByteArrayAsync() can cause:
- High memory consumption - A 1GB file would require at least 1GB of RAM
- OutOfMemoryException errors for very large files
- Poor performance due to waiting for the entire download before processing
- Lack of progress tracking during long downloads
Streaming solves these problems by processing data in small chunks as it arrives.
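For contrast, this is the buffered pattern those problems stem from — a minimal hypothetical sketch with a placeholder URL and output path, assuming a shared HttpClient instance like the ones used throughout this article:
// Anti-pattern for large files: the entire response body is buffered in memory
// before anything is written to disk. Fine for small payloads, risky at gigabyte scale.
byte[] data = await httpClient.GetByteArrayAsync("https://example.com/large-file.zip");
await File.WriteAllBytesAsync("output.zip", data);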
Basic File Streaming with HttpClient
The most straightforward way to stream a large file is to use HttpCompletionOption.ResponseHeadersRead, which tells HttpClient to return control as soon as the response headers are received rather than waiting for the entire content.
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
public class FileDownloader
{
private static readonly HttpClient httpClient = new HttpClient();
public async Task DownloadFileAsync(string url, string outputPath)
{
// Request with ResponseHeadersRead to start streaming immediately
using (HttpResponseMessage response = await httpClient.GetAsync(url,
HttpCompletionOption.ResponseHeadersRead))
{
response.EnsureSuccessStatusCode();
// Get the content stream
using (Stream contentStream = await response.Content.ReadAsStreamAsync())
using (FileStream fileStream = new FileStream(outputPath,
FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
{
// Copy the content stream to file stream
await contentStream.CopyToAsync(fileStream);
}
}
}
}
This approach downloads the file in chunks without loading it entirely into memory.
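Calling it is straightforward (the URL and output path below are placeholders):
var downloader = new FileDownloader();
await downloader.DownloadFileAsync("https://example.com/large-file.zip", "output.zip");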
Streaming with Progress Tracking
For large downloads, users often need progress feedback. Here's how to implement progress tracking while streaming:
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;
public class FileDownloaderWithProgress
{
private static readonly HttpClient httpClient = new HttpClient();
public async Task DownloadFileWithProgressAsync(
string url,
string outputPath,
IProgress<DownloadProgress> progress = null)
{
using (HttpResponseMessage response = await httpClient.GetAsync(url,
HttpCompletionOption.ResponseHeadersRead))
{
response.EnsureSuccessStatusCode();
long? totalBytes = response.Content.Headers.ContentLength;
using (Stream contentStream = await response.Content.ReadAsStreamAsync())
using (FileStream fileStream = new FileStream(outputPath,
FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
{
var buffer = new byte[8192];
long totalBytesRead = 0;
int bytesRead;
while ((bytesRead = await contentStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
{
await fileStream.WriteAsync(buffer, 0, bytesRead);
totalBytesRead += bytesRead;
// Report progress
if (totalBytes.HasValue)
{
var progressInfo = new DownloadProgress
{
BytesReceived = totalBytesRead,
TotalBytes = totalBytes.Value,
PercentComplete = (double)totalBytesRead / totalBytes.Value * 100
};
progress?.Report(progressInfo);
}
}
}
}
}
}
public class DownloadProgress
{
public long BytesReceived { get; set; }
public long TotalBytes { get; set; }
public double PercentComplete { get; set; }
}
Using the Progress Tracker
var downloader = new FileDownloaderWithProgress();
var progressReporter = new Progress<DownloadProgress>(progress =>
{
Console.WriteLine($"Downloaded {progress.BytesReceived:N0} of {progress.TotalBytes:N0} bytes " +
$"({progress.PercentComplete:F2}%)");
});
await downloader.DownloadFileWithProgressAsync(
"https://example.com/large-file.zip",
"output.zip",
progressReporter
);
Streaming with Cancellation Support
For long-running downloads, you should support cancellation to allow users to stop the download:
public async Task DownloadFileWithCancellationAsync(
string url,
string outputPath,
CancellationToken cancellationToken = default)
{
using (HttpResponseMessage response = await httpClient.GetAsync(url,
HttpCompletionOption.ResponseHeadersRead, cancellationToken))
{
response.EnsureSuccessStatusCode();
using (Stream contentStream = await response.Content.ReadAsStreamAsync())
using (FileStream fileStream = new FileStream(outputPath,
FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
{
var buffer = new byte[8192];
int bytesRead;
while ((bytesRead = await contentStream.ReadAsync(
buffer, 0, buffer.Length, cancellationToken)) > 0)
{
await fileStream.WriteAsync(buffer, 0, bytesRead, cancellationToken);
}
}
}
}
Example with Cancellation
var cts = new CancellationTokenSource();
// Cancel after 30 seconds
cts.CancelAfter(TimeSpan.FromSeconds(30));
try
{
await DownloadFileWithCancellationAsync(
"https://example.com/large-file.zip",
"output.zip",
cts.Token
);
}
catch (OperationCanceledException)
{
Console.WriteLine("Download was cancelled");
}
Complete Production-Ready Implementation
Here's a robust, production-ready implementation combining all best practices:
using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
public class RobustFileDownloader
{
private static readonly HttpClient httpClient = new HttpClient
{
Timeout = Timeout.InfiniteTimeSpan // We'll use CancellationToken instead
};
public async Task<bool> DownloadFileAsync(
string url,
string outputPath,
IProgress<DownloadProgress> progress = null,
CancellationToken cancellationToken = default)
{
try
{
// Ensure directory exists
string directory = Path.GetDirectoryName(outputPath);
if (!string.IsNullOrEmpty(directory))
{
Directory.CreateDirectory(directory);
}
using (HttpResponseMessage response = await httpClient.GetAsync(url,
HttpCompletionOption.ResponseHeadersRead, cancellationToken))
{
response.EnsureSuccessStatusCode();
long? totalBytes = response.Content.Headers.ContentLength;
using (Stream contentStream = await response.Content.ReadAsStreamAsync())
using (FileStream fileStream = new FileStream(outputPath,
FileMode.Create, FileAccess.Write, FileShare.None, 8192, true))
{
var buffer = new byte[8192];
long totalBytesRead = 0;
int bytesRead;
var lastReportTime = DateTime.UtcNow;
while ((bytesRead = await contentStream.ReadAsync(
buffer, 0, buffer.Length, cancellationToken)) > 0)
{
await fileStream.WriteAsync(buffer, 0, bytesRead, cancellationToken);
totalBytesRead += bytesRead;
// Report progress (throttle to avoid excessive updates)
if (progress != null && DateTime.UtcNow - lastReportTime > TimeSpan.FromMilliseconds(100))
{
var progressInfo = new DownloadProgress
{
BytesReceived = totalBytesRead,
TotalBytes = totalBytes ?? 0,
PercentComplete = totalBytes.HasValue
? (double)totalBytesRead / totalBytes.Value * 100
: 0
};
progress.Report(progressInfo);
lastReportTime = DateTime.UtcNow;
}
}
// Final progress report
if (progress != null && totalBytes.HasValue)
{
progress.Report(new DownloadProgress
{
BytesReceived = totalBytesRead,
TotalBytes = totalBytes.Value,
PercentComplete = 100
});
}
}
}
return true;
}
catch (HttpRequestException ex)
{
Console.WriteLine($"HTTP error: {ex.Message}");
return false;
}
catch (OperationCanceledException)
{
Console.WriteLine("Download was cancelled");
return false;
}
catch (IOException ex)
{
Console.WriteLine($"I/O error: {ex.Message}");
return false;
}
}
}
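Putting it together, a call that combines progress reporting with an overall time limit might look like this (placeholder URL and paths; the 10-minute limit is an arbitrary example value):
var downloader = new RobustFileDownloader();
var progressReporter = new Progress<DownloadProgress>(p =>
    Console.WriteLine($"{p.PercentComplete:F1}% ({p.BytesReceived:N0} bytes)"));
// Cancel automatically if the download takes longer than 10 minutes
var cts = new CancellationTokenSource(TimeSpan.FromMinutes(10));
bool success = await downloader.DownloadFileAsync(
    "https://example.com/large-file.zip",
    "downloads/output.zip",
    progressReporter,
    cts.Token);
Console.WriteLine(success ? "Download complete" : "Download failed");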
Memory-Efficient Streaming for Processing
If you need to process the file content while streaming (e.g., parsing data during download), you can avoid writing to disk:
public async Task ProcessStreamedDataAsync(string url)
{
using (HttpResponseMessage response = await httpClient.GetAsync(url,
HttpCompletionOption.ResponseHeadersRead))
{
response.EnsureSuccessStatusCode();
using (Stream contentStream = await response.Content.ReadAsStreamAsync())
using (StreamReader reader = new StreamReader(contentStream))
{
string line;
while ((line = await reader.ReadLineAsync()) != null)
{
// Process each line as it arrives
ProcessLine(line);
}
}
}
}
private void ProcessLine(string line)
{
// Your processing logic here
Console.WriteLine($"Processing: {line}");
}
Best Practices for Streaming Large Files
1. Use HttpCompletionOption.ResponseHeadersRead
Always use HttpCompletionOption.ResponseHeadersRead to start processing as soon as the headers arrive:
await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
2. Configure Appropriate Buffer Sizes
The default 8192-byte buffer is suitable for most scenarios, but you can adjust based on your needs:
var buffer = new byte[81920]; // 80 KB for faster networks
3. Implement Timeout Handling
While streaming, use a CancellationToken for timeout control:
var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));
await DownloadFileAsync(url, path, null, cts.Token);
4. Reuse HttpClient Instance
Always reuse the same HttpClient instance across your application to avoid socket exhaustion:
private static readonly HttpClient httpClient = new HttpClient();
5. Handle Partial Downloads
For resumable downloads, check if the server supports range requests and implement retry logic with proper range headers.
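The exact flow depends on the server, but a minimal sketch of resuming from however many bytes are already on disk could look like this (hypothetical method name; assumes the shared httpClient field used in the earlier examples):
public async Task ResumeDownloadAsync(string url, string outputPath)
{
    // Start from however many bytes are already on disk
    long existingLength = File.Exists(outputPath) ? new FileInfo(outputPath).Length : 0;
    using (var request = new HttpRequestMessage(HttpMethod.Get, url))
    {
        // Ask the server for the remaining bytes only
        request.Headers.Range = new System.Net.Http.Headers.RangeHeaderValue(existingLength, null);
        using (HttpResponseMessage response = await httpClient.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead))
        {
            // 206 Partial Content means the server honored the range;
            // a plain 200 OK means it is sending the whole file again
            bool isPartial = response.StatusCode == System.Net.HttpStatusCode.PartialContent;
            response.EnsureSuccessStatusCode();
            using (Stream contentStream = await response.Content.ReadAsStreamAsync())
            using (FileStream fileStream = new FileStream(outputPath,
                isPartial ? FileMode.Append : FileMode.Create,
                FileAccess.Write, FileShare.None, 8192, true))
            {
                await contentStream.CopyToAsync(fileStream);
            }
        }
    }
}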
Streaming in Web Scraping Scenarios
When downloading large datasets or media files during web scraping operations, streaming becomes crucial. While C# HttpClient is excellent for API-based scraping, you might also need to handle file downloads in Puppeteer when dealing with browser-based downloads.
For comprehensive data extraction from streaming responses, consider using specialized APIs that handle the complexity of parsing large responses efficiently with minimal memory overhead.
Performance Considerations
Memory Usage
Streaming keeps memory usage essentially constant regardless of file size. With an 8 KB buffer, memory usage stays under 1 MB even when downloading files larger than 10 GB.
Speed Optimization
To maximize download speed:
- Use appropriate buffer sizes (8KB to 80KB)
- Enable async I/O for file operations
- Avoid blocking operations in the download loop
- Consider compression if supported by the server (see the sketch below)
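For the compression point above, one common approach is enabling automatic response decompression on the handler — a minimal sketch:
// Enable automatic gzip/deflate decompression for compressed responses.
// Create this client once and reuse it, as recommended earlier.
var handler = new HttpClientHandler
{
    AutomaticDecompression = System.Net.DecompressionMethods.GZip
                             | System.Net.DecompressionMethods.Deflate
};
var httpClient = new HttpClient(handler);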
Throttling Progress Reports
Reporting progress on every chunk can slow down downloads. Throttle updates to 100-200ms intervals:
if (DateTime.UtcNow - lastReportTime > TimeSpan.FromMilliseconds(100))
{
progress.Report(progressInfo);
lastReportTime = DateTime.UtcNow;
}
Error Handling and Retry Logic
Implement retry logic for transient failures. The wrapper below assumes a DownloadFileAsync that throws on failure, like the basic example earlier, rather than the RobustFileDownloader, which catches its own exceptions:
public async Task<bool> DownloadWithRetryAsync(
string url,
string outputPath,
int maxRetries = 3)
{
for (int attempt = 0; attempt < maxRetries; attempt++)
{
try
{
await DownloadFileAsync(url, outputPath);
return true;
}
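// The when filter excludes the final attempt, so its exception propagates to the caller instead of returning false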
catch (HttpRequestException ex) when (attempt < maxRetries - 1)
{
Console.WriteLine($"Attempt {attempt + 1} failed: {ex.Message}");
await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt))); // Exponential backoff
}
}
return false;
}
Conclusion
Streaming large files with HttpClient in C# is essential for building robust, memory-efficient applications. By using HttpCompletionOption.ResponseHeadersRead, implementing progress tracking, and supporting cancellation, you can create production-ready download functionality that handles files of any size without overwhelming system resources.
The key principles are:
- Always stream instead of loading entire files into memory
- Use async/await throughout your implementation
- Implement proper error handling and cancellation support
- Throttle progress updates for optimal performance
- Reuse HttpClient instances to avoid resource exhaustion
Whether you're building a file download manager, processing large datasets, or implementing web scraping solutions, these streaming techniques will ensure your application remains responsive and efficient even when handling multi-gigabyte files.