Is Html Agility Pack compatible with Blazor applications?

Html Agility Pack (HAP) compatibility with Blazor depends on the hosting model you're using, because the hosting model determines where your C# code runs. Here's a breakdown:

Blazor Server Applications

Yes. Html Agility Pack works in Blazor Server without special setup because the code runs on the server, where the full .NET runtime is available and outbound HTTP requests are not subject to browser restrictions.

Installation

First, install the Html Agility Pack NuGet package:

dotnet add package HtmlAgilityPack

Basic Service Implementation

using HtmlAgilityPack;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public class HtmlParsingService
{
    private readonly HttpClient _httpClient;

    public HtmlParsingService(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<HtmlDocument> LoadHtmlFromUrlAsync(string url)
    {
        var response = await _httpClient.GetAsync(url);
        response.EnsureSuccessStatusCode();

        var content = await response.Content.ReadAsStringAsync();

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(content);

        return htmlDoc;
    }

    public async Task<List<string>> ExtractLinksAsync(string url)
    {
        var doc = await LoadHtmlFromUrlAsync(url);
        var links = doc.DocumentNode
            .SelectNodes("//a[@href]")
            ?.Select(node => node.GetAttributeValue("href", ""))
            .Where(href => !string.IsNullOrEmpty(href))
            .ToList() ?? new List<string>();

        return links;
    }
}
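
The hrefs returned by ExtractLinksAsync are raw attribute values, so they can be relative paths, fragments, or non-web schemes such as mailto:. A minimal sketch of one way to normalize them, assuming you want absolute http/https URLs only. LinkNormalizer and ToAbsoluteLinks are hypothetical names that are not part of Html Agility Pack; the sketch uses only System.Uri from the base class library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Html Agility Pack): turns raw href values
// into absolute http/https URLs, resolved against the page they came from.
public static class LinkNormalizer
{
    public static List<string> ToAbsoluteLinks(string pageUrl, IEnumerable<string> hrefs)
    {
        var result = new List<string>();

        // Without a valid absolute page URL there is nothing to resolve against.
        if (!Uri.TryCreate(pageUrl, UriKind.Absolute, out var baseUri))
        {
            return result;
        }

        var seen = new HashSet<string>();
        foreach (var href in hrefs)
        {
            // Resolve relative hrefs against the page URL, keep only web links,
            // and skip duplicates while preserving the original order.
            if (Uri.TryCreate(baseUri, href, out var absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp || absolute.Scheme == Uri.UriSchemeHttps)
                && seen.Add(absolute.AbsoluteUri))
            {
                result.Add(absolute.AbsoluteUri);
            }
        }

        return result;
    }
}
```

For a page at https://example.com/docs/page.html, an href of guide.html would resolve to https://example.com/docs/guide.html. You could call this helper on the list returned by ExtractLinksAsync before showing it in the component.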

Using in a Blazor Component

@page "/html-parser"
@inject HtmlParsingService HtmlParser

<h3>HTML Parser</h3>

<input @bind="url" placeholder="Enter URL" />
<button @onclick="ParseHtml">Parse HTML</button>

@if (links.Any())
{
    <ul>
        @foreach (var link in links)
        {
            <li>@link</li>
        }
    </ul>
}

@code {
    private string url = "";
    private List<string> links = new();

    private async Task ParseHtml()
    {
        if (!string.IsNullOrEmpty(url))
        {
            try
            {
                links = await HtmlParser.ExtractLinksAsync(url);
            }
            catch (Exception ex)
            {
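                // In Blazor Server this writes to the server console, not the browser.
                // In a real app, log through ILogger and show a message in the UI.
                Console.WriteLine($"Error: {ex.Message}");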
            }
        }
    }
}

Service Registration

Register the service in Program.cs:

builder.Services.AddHttpClient<HtmlParsingService>();
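
Because the typed client fetches arbitrary pages, it can help to set a timeout and a User-Agent at registration time. The following is a sketch rather than a required setup: AddHtmlParsing is a hypothetical extension method name, and the 15-second timeout and the "MyBlazorApp/1.0" User-Agent string are placeholder values chosen for illustration.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical extension method that wraps the typed-client registration
// for the HtmlParsingService class shown earlier.
public static class HtmlParsingServiceCollectionExtensions
{
    public static IServiceCollection AddHtmlParsing(this IServiceCollection services)
    {
        services.AddHttpClient<HtmlParsingService>(client =>
        {
            // Placeholder values: tune these for the sites you fetch.
            client.Timeout = TimeSpan.FromSeconds(15);
            client.DefaultRequestHeaders.UserAgent.ParseAdd("MyBlazorApp/1.0");
        });

        return services;
    }
}
```

With this in place, Program.cs would call builder.Services.AddHtmlParsing(); instead of the plain AddHttpClient registration.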

Blazor WebAssembly Applications

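Fetching and parsing remote pages with Html Agility Pack does NOT work directly in Blazor WebAssembly. The parser itself is managed code that targets .NET Standard, so the obstacle is the browser sandbox rather than the library:
- HTTP requests are sent through the browser, so pages on other origins are blocked unless the target site allows them via CORS
- Loading helpers that perform their own synchronous network calls, such as HtmlWeb.Load, may not be supported by the WebAssembly runtime
- File system access is restricted in the browser, so loading documents from local paths is not an option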

Workaround: Server-Side API

Create a server-side API to handle HTML parsing:

API Controller

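using HtmlAgilityPack;
using Microsoft.AspNetCore.Mvc;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Server project. Program.cs must call builder.Services.AddControllers(),
// builder.Services.AddHttpClient() (so HttpClient can be injected),
// and app.MapControllers().
[ApiController]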
[Route("api/[controller]")]
public class HtmlParsingController : ControllerBase
{
    private readonly HttpClient _httpClient;

    public HtmlParsingController(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    [HttpPost("extract-data")]
    public async Task<ActionResult<HtmlParseResult>> ExtractData([FromBody] ParseRequest request)
    {
        try
        {
            var response = await _httpClient.GetAsync(request.Url);
            response.EnsureSuccessStatusCode();

            var content = await response.Content.ReadAsStringAsync();
            var doc = new HtmlDocument();
            doc.LoadHtml(content);

            var result = new HtmlParseResult
            {
                Title = doc.DocumentNode.SelectSingleNode("//title")?.InnerText,
                Links = doc.DocumentNode
                    .SelectNodes("//a[@href]")
                    ?.Select(n => n.GetAttributeValue("href", ""))
                    .Where(href => !string.IsNullOrEmpty(href))
                    .ToList() ?? new List<string>(),
                Images = doc.DocumentNode
                    .SelectNodes("//img[@src]")
                    ?.Select(n => n.GetAttributeValue("src", ""))
                    .Where(src => !string.IsNullOrEmpty(src))
                    .ToList() ?? new List<string>()
            };

            return Ok(result);
        }
        catch (Exception ex)
        {
            return BadRequest($"Error parsing HTML: {ex.Message}");
        }
    }
}

public class ParseRequest
{
    public string Url { get; set; } = "";
}

public class HtmlParseResult
{
    public string? Title { get; set; }
    public List<string> Links { get; set; } = new();
    public List<string> Images { get; set; } = new();
}
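
The controller above fetches whatever URL the client sends, so it is worth rejecting obviously unsafe input before calling GetAsync. A minimal sketch of such a check, assuming only http and https targets should be allowed. UrlGuard and IsAllowedUrl are hypothetical names, and this is not a complete defense against server-side request forgery: for example, it does not catch host names that resolve to internal addresses.

```csharp
using System;

// Hypothetical helper for validating user-supplied URLs before fetching them.
public static class UrlGuard
{
    public static bool IsAllowedUrl(string? url)
    {
        // Reject null, empty, relative, or malformed input.
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
        {
            return false;
        }

        // Allow only web schemes; this excludes file:, ftp:, and similar.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
        {
            return false;
        }

        // Reject localhost and loopback addresses such as 127.0.0.1.
        return !uri.IsLoopback;
    }
}
```

In ExtractData, you could return BadRequest("Invalid URL") when UrlGuard.IsAllowedUrl(request.Url) is false, before making the outbound request.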

Blazor WASM Client Service

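using System;
using System.Net.Http;
using System.Net.Http.Json; // PostAsJsonAsync and ReadFromJsonAsync
using System.Threading.Tasks;

// ParseRequest and HtmlParseResult must also be visible to the client project,
// for example through a shared class library.
public class HtmlParsingApiService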
{
    private readonly HttpClient _httpClient;

    public HtmlParsingApiService(HttpClient httpClient)
    {
        _httpClient = httpClient;
    }

    public async Task<HtmlParseResult?> ParseHtmlAsync(string url)
    {
        try
        {
            var request = new ParseRequest { Url = url };
            var response = await _httpClient.PostAsJsonAsync("api/htmlparsing/extract-data", request);

            if (response.IsSuccessStatusCode)
            {
                return await response.Content.ReadFromJsonAsync<HtmlParseResult>();
            }

            return null;
        }
        catch (Exception ex)
        {
            Console.WriteLine($"API Error: {ex.Message}");
            return null;
        }
    }
}
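
The client service also has to be registered in the WebAssembly project before it can be injected. A sketch of what the client's Program.cs might look like, assuming the standard Blazor WebAssembly project template (where App is the root component) and an API hosted on the same origin as the app. If your API lives on a different origin, the BaseAddress and the server's CORS policy would need to change accordingly.

```csharp
using System;
using System.Net.Http;
using Microsoft.AspNetCore.Components.Web;
using Microsoft.AspNetCore.Components.WebAssembly.Hosting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebAssemblyHostBuilder.CreateDefault(args);

// App is the root component generated by the project template (assumption).
builder.RootComponents.Add<App>("#app");
builder.RootComponents.Add<HeadOutlet>("head::after");

// HttpClient pointed at the app's own origin, so the relative path
// "api/htmlparsing/extract-data" reaches the hosting server.
builder.Services.AddScoped(sp => new HttpClient
{
    BaseAddress = new Uri(builder.HostEnvironment.BaseAddress)
});

// Makes HtmlParsingApiService available to @inject in components.
builder.Services.AddScoped<HtmlParsingApiService>();

await builder.Build().RunAsync();
```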

Using in Blazor WASM Component

@page "/wasm-parser"
@inject HtmlParsingApiService ApiService

<h3>WebAssembly HTML Parser</h3>

<input @bind="url" placeholder="Enter URL" />
<button @onclick="ParseHtml" disabled="@isLoading">
    @if (isLoading) { <span>Loading...</span> } else { <span>Parse HTML</span> }
</button>

@if (result != null)
{
    <div>
        <h4>Title: @result.Title</h4>
        <h5>Links (@result.Links.Count):</h5>
        <ul>
            @foreach (var link in result.Links.Take(10))
            {
                <li>@link</li>
            }
        </ul>
    </div>
}

@code {
    private string url = "";
    private HtmlParseResult? result;
    private bool isLoading = false;

    private async Task ParseHtml()
    {
        if (string.IsNullOrWhiteSpace(url))
        {
            return;
        }

        isLoading = true;
        try
        {
            result = await ApiService.ParseHtmlAsync(url);
        }
        finally
        {
            // Re-enable the button even if the call throws
            isLoading = false;
        }
    }
}

Summary

| Blazor Model | Html Agility Pack Support | Implementation |
|--------------|---------------------------|----------------|
| Blazor Server | ✅ Full support | Direct usage in services and components |
| Blazor WebAssembly | ❌ Not supported | Requires server-side API with proxy calls |

For Blazor Server applications, you can use Html Agility Pack directly. For Blazor WebAssembly, implement a server-side API that handles HTML parsing and returns the processed data to your client application.
