How do I use HttpClient (C#) as a singleton in my application?

Using HttpClient as a singleton is a critical best practice in C# applications to prevent socket exhaustion and improve performance. When you create and dispose of HttpClient instances for each request, you can quickly run out of available sockets, causing SocketException errors and degraded application performance.

Understanding the HttpClient Socket Exhaustion Problem

While HttpClient implements IDisposable, wrapping it in a using statement for each request is actually an anti-pattern. Each HttpClient instance maintains its own connection pool, and disposing of it doesn't immediately release the underlying sockets. These sockets remain in a TIME_WAIT state, which can lead to port exhaustion under high load.

Implementing HttpClient as a Singleton

Basic Singleton Pattern

The simplest approach is to create a static readonly instance:

public class ApiClient
{
    private static readonly HttpClient _httpClient = new HttpClient
    {
        BaseAddress = new Uri("https://api.webscraping.ai"),
        Timeout = TimeSpan.FromSeconds(30)
    };

    static ApiClient()
    {
        _httpClient.DefaultRequestHeaders.Add("Accept", "application/json");
        _httpClient.DefaultRequestHeaders.Add("User-Agent", "MyApp/1.0");
    }

    public async Task<string> GetHtmlAsync(string url)
    {
        var response = await _httpClient.GetAsync($"/html?url={Uri.EscapeDataString(url)}");
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
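If construction must not happen until first use, a Lazy<T> wrapper gives the same singleton semantics with guaranteed thread-safe, one-time initialization. This is a minimal sketch; the SharedHttp class and Instance property names are illustrative:

```csharp
using System;
using System.Net.Http;

public static class SharedHttp
{
    // Lazy<T> is thread-safe by default: the factory delegate runs exactly once,
    // even if the first access comes from several threads at the same time.
    private static readonly Lazy<HttpClient> _client = new Lazy<HttpClient>(() =>
        new HttpClient
        {
            BaseAddress = new Uri("https://api.webscraping.ai"),
            Timeout = TimeSpan.FromSeconds(30)
        });

    public static HttpClient Instance => _client.Value;
}
```

Every caller that reads SharedHttp.Instance gets the same HttpClient, so the connection pool is shared across the whole application.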

Using Dependency Injection (Recommended)

In modern .NET applications, the preferred approach is using IHttpClientFactory, which manages the lifetime of HttpClient instances and handles DNS changes properly:

// Startup.cs or Program.cs
public void ConfigureServices(IServiceCollection services)
{
    services.AddHttpClient("WebScrapingClient", client =>
    {
        client.BaseAddress = new Uri("https://api.webscraping.ai");
        client.Timeout = TimeSpan.FromSeconds(30);
        client.DefaultRequestHeaders.Add("Accept", "application/json");
    });
}

// Your service class
public class ScrapingService
{
    private readonly IHttpClientFactory _httpClientFactory;

    public ScrapingService(IHttpClientFactory httpClientFactory)
    {
        _httpClientFactory = httpClientFactory;
    }

    public async Task<string> ScrapeWebsiteAsync(string url)
    {
        var client = _httpClientFactory.CreateClient("WebScrapingClient");
        var response = await client.GetAsync($"/html?url={Uri.EscapeDataString(url)}");
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

Typed HttpClient Pattern

For even better encapsulation, use typed clients:

// Startup.cs or Program.cs
public void ConfigureServices(IServiceCollection services)
{
    services.AddHttpClient<WebScrapingApiClient>(client =>
    {
        client.BaseAddress = new Uri("https://api.webscraping.ai");
        client.Timeout = TimeSpan.FromSeconds(30);
    });
}

// Typed client
public class WebScrapingApiClient
{
    private readonly HttpClient _httpClient;

    public WebScrapingApiClient(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<string> GetHtmlAsync(string url, string apiKey)
    {
        var requestUri = $"/html?url={Uri.EscapeDataString(url)}&api_key={apiKey}";
        var response = await _httpClient.GetAsync(requestUri);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }

    public async Task<T> GetStructuredDataAsync<T>(string url, string apiKey) where T : class
    {
        var requestUri = $"/html?url={Uri.EscapeDataString(url)}&api_key={apiKey}";
        var response = await _httpClient.GetAsync(requestUri);
        response.EnsureSuccessStatusCode();

        var json = await response.Content.ReadAsStringAsync();
        return JsonSerializer.Deserialize<T>(json);
    }
}

// Usage in a controller or service
public class DataController : ControllerBase
{
    private readonly WebScrapingApiClient _scrapingClient;

    public DataController(WebScrapingApiClient scrapingClient)
    {
        _scrapingClient = scrapingClient;
    }

    [HttpGet("scrape")]
    public async Task<IActionResult> ScrapeData(string url)
    {
        try
        {
            var html = await _scrapingClient.GetHtmlAsync(url, "your_api_key");
            return Ok(new { html });
        }
        catch (HttpRequestException ex)
        {
            return StatusCode(500, $"Request failed: {ex.Message}");
        }
    }
}

Handling Authentication and Headers

When implementing authentication in your web scraping requests, you can configure headers at the singleton level:

public class AuthenticatedScrapingClient
{
    private static readonly HttpClient _httpClient;

    static AuthenticatedScrapingClient()
    {
        _httpClient = new HttpClient
        {
            BaseAddress = new Uri("https://api.webscraping.ai")
        };

        // Set the authentication header once at startup.
        // Configuration.GetApiKey() is a placeholder for your own configuration/secret store.
        var apiKey = Configuration.GetApiKey();
        _httpClient.DefaultRequestHeaders.Add("X-API-Key", apiKey);
    }

    public static async Task<string> GetAsync(string endpoint)
    {
        var response = await _httpClient.GetAsync(endpoint);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}

Advanced Configuration with Polly for Resilience

Combine IHttpClientFactory with Polly (via the Microsoft.Extensions.Http.Polly NuGet package) for retry policies and circuit breakers:

public void ConfigureServices(IServiceCollection services)
{
    services.AddHttpClient<WebScrapingApiClient>(client =>
    {
        client.BaseAddress = new Uri("https://api.webscraping.ai");
        client.Timeout = TimeSpan.FromSeconds(30);
    })
    .AddTransientHttpErrorPolicy(policy =>
        policy.WaitAndRetryAsync(3, retryAttempt =>
            TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))))
    .AddTransientHttpErrorPolicy(policy =>
        policy.CircuitBreakerAsync(5, TimeSpan.FromSeconds(30)));
}

Handling DNS Changes

One limitation of a singleton HttpClient is that its pooled connections can outlive DNS changes, so requests may keep going to a stale IP address. IHttpClientFactory solves this by recycling handlers:

services.AddHttpClient("WebScrapingClient")
    .SetHandlerLifetime(TimeSpan.FromMinutes(5));
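If you'd rather keep a plain singleton than adopt the factory, the same DNS problem can be addressed on .NET Core 2.1 and later by giving SocketsHttpHandler a bounded connection lifetime. A sketch, where the five-minute value is illustrative:

```csharp
using System;
using System.Net.Http;

// Recycling pooled connections periodically forces fresh DNS lookups,
// so a long-lived HttpClient still notices when a host's records change.
var handler = new SocketsHttpHandler
{
    PooledConnectionLifetime = TimeSpan.FromMinutes(5)
};

var client = new HttpClient(handler, disposeHandler: true);
```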

Best Practices for Production

1. Configure Timeouts Appropriately

services.AddHttpClient<WebScrapingApiClient>(client =>
{
    client.Timeout = TimeSpan.FromSeconds(100); // For long-running scraping operations
});

2. Handle Timeout Exceptions

When handling timeouts, wrap your calls in try-catch blocks. On .NET 5 and later, a client-side timeout surfaces as a TaskCanceledException whose InnerException is a TimeoutException; on earlier runtimes the inner exception is not set, so the exception filter below will not match there:

public async Task<string> GetWithTimeoutHandlingAsync(string url)
{
    try
    {
        var response = await _httpClient.GetAsync(url);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
    catch (TaskCanceledException ex) when (ex.InnerException is TimeoutException)
    {
        // Handle timeout
        throw new ApplicationException($"Request to {url} timed out", ex);
    }
    catch (HttpRequestException ex)
    {
        // Handle network errors
        throw new ApplicationException($"Network error while requesting {url}", ex);
    }
}
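The client-wide Timeout applies to every request. When a single slow endpoint needs a tighter limit, you can pass a per-request CancellationTokenSource instead, leaving the shared client's configuration untouched. A sketch; the PerRequestTimeout helper name is illustrative:

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class PerRequestTimeout
{
    public static async Task<string> GetAsync(HttpClient client, string url, TimeSpan timeout)
    {
        // The token cancels this request only; other requests on the shared
        // client keep using the client-wide Timeout.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            var response = await client.GetAsync(url, cts.Token);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
        catch (OperationCanceledException ex) when (cts.IsCancellationRequested)
        {
            throw new TimeoutException($"Request to {url} exceeded {timeout.TotalSeconds}s", ex);
        }
    }
}
```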

3. Use Connection Limits

On .NET Framework, connection limits are configured through ServicePointManager:

ServicePointManager.DefaultConnectionLimit = 100;
ServicePointManager.MaxServicePointIdleTime = 90000;

Note that ServicePointManager has no effect on HttpClient in .NET Core and .NET 5+; there, configure the handler instead:

services.AddHttpClient("WebScrapingClient")
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        PooledConnectionLifetime = TimeSpan.FromMinutes(2),
        MaxConnectionsPerServer = 20
    });

Thread Safety Considerations

HttpClient's request methods are thread-safe, but its configuration properties are not. Configure all properties during initialization, before the first request is sent:

public class ThreadSafeApiClient
{
    private static readonly HttpClient _httpClient;
    private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(10, 10);

    static ThreadSafeApiClient()
    {
        _httpClient = new HttpClient
        {
            BaseAddress = new Uri("https://api.webscraping.ai"),
            Timeout = TimeSpan.FromSeconds(30)
        };
    }

    public static async Task<string> GetAsync(string url)
    {
        await _semaphore.WaitAsync();
        try
        {
            var response = await _httpClient.GetAsync(url);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
        finally
        {
            _semaphore.Release();
        }
    }
}

Common Pitfalls to Avoid

❌ Don't Do This

// Anti-pattern: Creating new instances
public async Task<string> BadExample(string url)
{
    using (var client = new HttpClient())
    {
        return await client.GetStringAsync(url);
    }
}

✅ Do This Instead

// Correct: Reuse singleton instance
private static readonly HttpClient _client = new HttpClient();

public async Task<string> GoodExample(string url)
{
    return await _client.GetStringAsync(url);
}

Testing Considerations

When unit testing code that uses HttpClient, use IHttpClientFactory for easier mocking:

[Fact]
public async Task TestScrapingService()
{
    // Arrange
    var mockFactory = new Mock<IHttpClientFactory>();
    var mockHttpMessageHandler = new Mock<HttpMessageHandler>();

    mockHttpMessageHandler.Protected()
        .Setup<Task<HttpResponseMessage>>(
            "SendAsync",
            ItExpr.IsAny<HttpRequestMessage>(),
            ItExpr.IsAny<CancellationToken>())
        .ReturnsAsync(new HttpResponseMessage
        {
            StatusCode = HttpStatusCode.OK,
            Content = new StringContent("<html><body>Test</body></html>")
        });

    // ScrapingService requests a relative URI, so the mocked client needs a BaseAddress
    var client = new HttpClient(mockHttpMessageHandler.Object)
    {
        BaseAddress = new Uri("https://api.webscraping.ai")
    };
    mockFactory.Setup(_ => _.CreateClient(It.IsAny<string>())).Returns(client);

    var service = new ScrapingService(mockFactory.Object);

    // Act
    var result = await service.ScrapeWebsiteAsync("https://example.com");

    // Assert
    Assert.Contains("Test", result);
}

Monitoring Network Requests

When building robust scraping applications, it's important to monitor network requests for debugging and optimization. You can add logging to your HttpClient:

public class LoggingHandler : DelegatingHandler
{
    private readonly ILogger<LoggingHandler> _logger;

    public LoggingHandler(ILogger<LoggingHandler> logger)
    {
        _logger = logger;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request,
        CancellationToken cancellationToken)
    {
        _logger.LogInformation($"Request: {request.Method} {request.RequestUri}");

        var response = await base.SendAsync(request, cancellationToken);

        _logger.LogInformation($"Response: {response.StatusCode}");

        return response;
    }
}

// Register the handler
services.AddTransient<LoggingHandler>();
services.AddHttpClient<WebScrapingApiClient>()
    .AddHttpMessageHandler<LoggingHandler>();

Conclusion

Using HttpClient as a singleton is essential for building robust and performant C# applications. While the simple static singleton pattern works, IHttpClientFactory provides a more sophisticated solution that handles DNS changes, supports resilience patterns, and integrates seamlessly with dependency injection.

For production web scraping applications, always use IHttpClientFactory with typed clients, configure appropriate timeouts and retry policies, and monitor your application's socket usage to ensure optimal performance.

Try WebScraping.AI for Your Web Scraping Needs

Looking for a powerful web scraping solution? WebScraping.AI provides an LLM-powered API that combines Chromium JavaScript rendering with rotating proxies for reliable data extraction.

Key Features:

  • AI-powered extraction: Ask questions about web pages or extract structured data fields
  • JavaScript rendering: Full Chromium browser support for dynamic content
  • Rotating proxies: Datacenter and residential proxies from multiple countries
  • Easy integration: Simple REST API with SDKs for Python, Ruby, PHP, and more
  • Reliable & scalable: Built for developers who need consistent results

Getting Started:

Get page content with AI analysis:

curl "https://api.webscraping.ai/ai/question?url=https://example.com&question=What is the main topic?&api_key=YOUR_API_KEY"

Extract structured data:

curl "https://api.webscraping.ai/ai/fields?url=https://example.com&fields[title]=Page title&fields[price]=Product price&api_key=YOUR_API_KEY"
