How do I configure user-agent strings in Puppeteer-Sharp?

Configuring the user-agent string in Puppeteer-Sharp lets a browser session present itself as a specific device, browser, or application, which matters for web scraping and automation. Websites use the user-agent to decide what type of client is making a request, so setting it deliberately can reduce the chance of being flagged as automated and helps target websites serve the content you expect.

What is a User-Agent String?

A user-agent string is the value of the User-Agent HTTP request header, which identifies the client software (browser, web scraper, or application) to the web server. It typically contains the browser name and version, the operating system, and the rendering engine. Websites often use this information to serve appropriate content or implement access controls.
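To make the structure concrete, here is a minimal sketch that pulls the product/version tokens and the platform comment out of a user-agent string. The class and method names (UserAgentTokens, ParseProducts, ParsePlatform) are hypothetical and not part of Puppeteer-Sharp, and the regular expressions only cover the common "Product/Version (comment)" layout, so treat this as an illustration rather than a full parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of Puppeteer-Sharp.
public static class UserAgentTokens
{
    // Matches "Product/Version" pairs such as "Chrome/120.0.0.0".
    private static readonly Regex ProductPattern =
        new Regex(@"([A-Za-z][A-Za-z0-9]*)/([0-9A-Za-z.]+)", RegexOptions.Compiled);

    // Returns each product token mapped to its version.
    public static Dictionary<string, string> ParseProducts(string userAgent)
    {
        var products = new Dictionary<string, string>();
        foreach (Match match in ProductPattern.Matches(userAgent))
        {
            // Keep the first occurrence if a product name repeats.
            products.TryAdd(match.Groups[1].Value, match.Groups[2].Value);
        }
        return products;
    }

    // Returns the first parenthesized comment, which usually describes the platform.
    public static string ParsePlatform(string userAgent)
    {
        var match = Regex.Match(userAgent, @"\(([^)]*)\)");
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

For the desktop Chrome string used throughout this article, ParseProducts yields entries for Mozilla, AppleWebKit, Chrome, and Safari, and ParsePlatform returns "Windows NT 10.0; Win64; x64".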

Basic User-Agent Configuration

The most straightforward way to set a user-agent string in Puppeteer-Sharp is using the SetUserAgentAsync() method on a page instance:

using PuppeteerSharp;

// Launch browser and create page
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true
});

var page = await browser.NewPageAsync();

// Set custom user-agent
await page.SetUserAgentAsync("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36");

// Navigate to target website
await page.GoToAsync("https://example.com");

// Verify the user-agent was set correctly
var userAgent = await page.EvaluateFunctionAsync<string>("() => navigator.userAgent");
Console.WriteLine($"Current User-Agent: {userAgent}");

Setting User-Agent for Multiple Pages

SetUserAgentAsync() applies to a single page, so calling it does not give a browser context a default user-agent. When working with multiple pages, keep the string in one variable and apply it to each new page:

using PuppeteerSharp;

var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
    Headless = true
});

// Create an isolated (incognito) browser context
// (newer Puppeteer-Sharp releases may name this method CreateBrowserContextAsync)
var context = await browser.CreateIncognitoBrowserContextAsync();

// User-agent string to reuse for every page in this context
var customUserAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

var page = await context.NewPageAsync();
await page.SetUserAgentAsync(customUserAgent);

// Each additional page needs its own SetUserAgentAsync call
var page2 = await context.NewPageAsync();
await page2.SetUserAgentAsync(customUserAgent);

Popular User-Agent Strings

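Here are some commonly used user-agent strings for different scenarios. The version numbers (for example Chrome 120) are snapshots and should be refreshed as browsers update: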

// Modern Chrome on Windows
string chromeWindows = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

// Chrome on macOS
string chromeMac = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";

// Firefox on Windows
string firefoxWindows = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0";

// Safari on macOS
string safariMac = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15";

// Mobile Chrome on Android
string mobileChrome = "Mozilla/5.0 (Linux; Android 10; SM-G975F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36";

// Mobile Safari on iOS
string mobileSafari = "Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1";

Advanced User-Agent Configuration with Random Selection

For web scraping scenarios where you need to rotate user-agents to avoid detection, you can implement a random user-agent selector:

using PuppeteerSharp;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class UserAgentManager
{
    private static readonly List<string> UserAgents = new List<string>
    {
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:120.0) Gecko/20100101 Firefox/120.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    };

    private static readonly Random Random = new Random();

    public static string GetRandomUserAgent()
    {
        return UserAgents[Random.Next(UserAgents.Count)];
    }

    public static async Task SetRandomUserAgentAsync(IPage page)
    {
        var randomUserAgent = GetRandomUserAgent();
        await page.SetUserAgentAsync(randomUserAgent);
        Console.WriteLine($"Set User-Agent: {randomUserAgent}");
    }
}

// Usage example (in a single-file program, these top-level statements must come before the class declaration)
var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
var page = await browser.NewPageAsync();

// Set random user-agent
await UserAgentManager.SetRandomUserAgentAsync(page);
await page.GoToAsync("https://example.com");

User-Agent Configuration with Extra Headers

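Sometimes you need to set additional headers alongside the user-agent for a more realistic browser simulation, using the same mechanism as setting custom headers for requests in Puppeteer-Sharp. Note that headers set with SetExtraHttpHeadersAsync() are sent with every request the page makes, including requests for images, scripts, and other subresources, so navigation-specific values such as Sec-Fetch-Dest: document will also appear on those requests: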

using PuppeteerSharp;
using System.Collections.Generic;

var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
var page = await browser.NewPageAsync();

// Set user-agent
await page.SetUserAgentAsync("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36");

// Set additional headers for more realistic simulation
await page.SetExtraHttpHeadersAsync(new Dictionary<string, string>
{
    {"Accept-Language", "en-US,en;q=0.9"},
    {"Accept-Encoding", "gzip, deflate, br"},
    {"Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8"},
    {"Sec-Fetch-Site", "none"},
    {"Sec-Fetch-Mode", "navigate"},
    {"Sec-Fetch-User", "?1"},
    {"Sec-Fetch-Dest", "document"}
});

await page.GoToAsync("https://example.com");

Validating User-Agent Configuration

It's important to verify that your user-agent configuration is working correctly. Here's how to check the current user-agent:

var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
var page = await browser.NewPageAsync();

// Set user-agent
var targetUserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36";
await page.SetUserAgentAsync(targetUserAgent);

// Navigate to a page
await page.GoToAsync("https://httpbin.org/user-agent");

// Extract and verify user-agent from the response
var userAgentResponse = await page.EvaluateFunctionAsync<string>("() => document.body.textContent");
Console.WriteLine($"Server received User-Agent: {userAgentResponse}");

// Also check via JavaScript navigator object
var navigatorUserAgent = await page.EvaluateFunctionAsync<string>("() => navigator.userAgent");
Console.WriteLine($"Navigator User-Agent: {navigatorUserAgent}");

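// Verify that both views match the configured value
Console.WriteLine($"Navigator matches: {navigatorUserAgent == targetUserAgent}");
Console.WriteLine($"Server matches: {userAgentResponse.Contains(targetUserAgent)}");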

Mobile User-Agent Simulation

For mobile web scraping or testing, you'll want to configure user-agents that represent mobile devices. This approach is often combined with updating the viewport size or device scale factor in Puppeteer-Sharp for realistic mobile simulation:

var browser = await Puppeteer.LaunchAsync(new LaunchOptions { Headless = true });
var page = await browser.NewPageAsync();

// Set mobile user-agent
await page.SetUserAgentAsync("Mozilla/5.0 (iPhone; CPU iPhone OS 17_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Mobile/15E148 Safari/604.1");

// Set mobile viewport
await page.SetViewportAsync(new ViewPortOptions
{
    Width = 375,
    Height = 667,
    IsMobile = true,
    HasTouch = true
});

await page.GoToAsync("https://example.com");

Best Practices for User-Agent Configuration

  1. Keep User-Agents Updated: Use recent browser versions in your user-agent strings to avoid detection as outdated clients.

  2. Match User-Agent with Viewport: When simulating mobile devices, ensure your user-agent matches the viewport configuration.

  3. Rotate User-Agents: For large-scale scraping operations, implement user-agent rotation to distribute requests across different apparent client types.

  4. Test Your Configuration: Always verify that your user-agent is being set correctly by checking both the navigator object and the header the server actually receives.

  5. Be Consistent: If you're maintaining sessions across multiple requests, keep the same user-agent throughout the session to maintain consistency.
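Practices 2, 3, and 5 can pull against each other: rotation changes the user-agent, while sessions need it to stay fixed and to agree with the viewport. One way to reconcile them, sketched below, is to rotate per session rather than per request and to store the viewport next to the user-agent. DeviceProfile and SessionProfileSelector are hypothetical names introduced for this example, and the code assumes .NET 6 or later (records and Random.Shared).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical type: pairs a user-agent with the viewport settings that should accompany it.
public sealed record DeviceProfile(string UserAgent, int Width, int Height, bool IsMobile);

// Hypothetical helper: assigns one profile per session and keeps returning it.
public sealed class SessionProfileSelector
{
    private readonly IReadOnlyList<DeviceProfile> _profiles;
    private readonly ConcurrentDictionary<string, DeviceProfile> _assigned = new();

    public SessionProfileSelector(IReadOnlyList<DeviceProfile> profiles)
    {
        if (profiles == null || profiles.Count == 0)
            throw new ArgumentException("At least one profile is required.", nameof(profiles));
        _profiles = profiles;
    }

    // Picks a random profile the first time a session ID is seen,
    // then returns the same profile for every later call with that ID.
    public DeviceProfile GetProfile(string sessionId) =>
        _assigned.GetOrAdd(sessionId, _ => _profiles[Random.Shared.Next(_profiles.Count)]);
}
```

The selected profile would then be applied to a page with the calls already shown in this article: SetUserAgentAsync() for the UserAgent value and SetViewportAsync() with a ViewPortOptions built from Width, Height, and IsMobile.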

Troubleshooting Common Issues

User-Agent Not Being Applied

If your user-agent changes aren't taking effect, ensure you're calling SetUserAgentAsync() before navigating to any pages:

// Correct order
var page = await browser.NewPageAsync();
await page.SetUserAgentAsync(customUserAgent); // Set BEFORE navigation
await page.GoToAsync("https://example.com");

// Incorrect order - won't affect the initial page load
var latePage = await browser.NewPageAsync();
await latePage.GoToAsync("https://example.com");
await latePage.SetUserAgentAsync(customUserAgent); // Too late for initial load

Headers Not Matching User-Agent

Some websites perform sophisticated client detection by comparing user-agent strings with other HTTP headers. Ensure consistency between your user-agent and other headers like Accept, Accept-Language, and Sec-* headers.
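A simple pre-flight check can catch the most obvious mismatches before any request is sent. The sketch below is illustrative only: HeaderConsistency and FindMismatches are hypothetical names, the two rules are heuristics rather than a complete model of what sites inspect, and they rest on the assumption that Sec-CH-UA client hint headers are sent by Chromium-based browsers but not by Firefox or Safari, which holds at the time of writing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of Puppeteer-Sharp.
public static class HeaderConsistency
{
    // Returns human-readable warnings for obvious mismatches; an empty list means none were found.
    // Header names are case-insensitive in HTTP, so pass a dictionary created with
    // StringComparer.OrdinalIgnoreCase if the casing of your keys may vary.
    public static List<string> FindMismatches(string userAgent, IReadOnlyDictionary<string, string> headers)
    {
        var warnings = new List<string>();
        bool uaIsMobile = userAgent.Contains("Mobile", StringComparison.Ordinal);
        bool uaIsChromium = userAgent.Contains("Chrome/", StringComparison.Ordinal);

        // Sec-CH-UA-Mobile uses "?1" for mobile and "?0" for desktop.
        if (headers.TryGetValue("Sec-CH-UA-Mobile", out var mobileHint))
        {
            bool hintIsMobile = mobileHint.Trim() == "?1";
            if (hintIsMobile != uaIsMobile)
                warnings.Add("Sec-CH-UA-Mobile disagrees with the user-agent's mobile marker.");
        }

        // Assumption: only Chromium-based browsers send Sec-CH-UA client hints.
        if (!uaIsChromium && headers.ContainsKey("Sec-CH-UA"))
            warnings.Add("Sec-CH-UA is present but the user-agent is not Chromium-based.");

        return warnings;
    }
}
```

Running this against the same dictionary you pass to SetExtraHttpHeadersAsync() gives a cheap sanity check, for example when user-agents are rotated but the header set is shared.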

Managing Sessions with Consistent User-Agents

When working with authentication workflows, maintaining consistent user-agent strings throughout the session is crucial. This approach works well with managing cookies in Puppeteer-Sharp for maintaining login states.

Advanced Configuration Patterns

For enterprise applications, consider creating a configuration class that manages user-agent strings alongside other browser settings:

using System.Collections.Generic;
using System.Threading.Tasks;
using PuppeteerSharp;

public class BrowserConfiguration
{
    public string UserAgent { get; set; }
    public Dictionary<string, string> Headers { get; set; }
    public ViewPortOptions Viewport { get; set; }

    public static BrowserConfiguration CreateDesktopChrome()
    {
        return new BrowserConfiguration
        {
            UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            Headers = new Dictionary<string, string>
            {
                {"Accept-Language", "en-US,en;q=0.9"},
                {"Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"}
            },
            Viewport = new ViewPortOptions { Width = 1920, Height = 1080 }
        };
    }

    // Despite the name, this applies the user-agent, headers, and viewport to a single page
    public async Task ApplyToBrowserAsync(IPage page)
    {
        await page.SetUserAgentAsync(UserAgent);
        if (Headers?.Count > 0)
            await page.SetExtraHttpHeadersAsync(Headers);
        if (Viewport != null)
            await page.SetViewportAsync(Viewport);
    }
}

User-agent configuration in Puppeteer-Sharp is a small but important part of web automation and scraping. Setting and managing user-agent strings consistently makes your sessions look like ordinary browser clients to target websites, which improves compatibility and reduces the likelihood of being blocked or served different content than expected.
