How do I select specific HTML elements using Kanna?

Kanna is a Swift library for parsing HTML and XML. It is commonly used in iOS and macOS development to extract information from web content. Kanna lets you navigate a parsed document and select specific elements with XPath expressions or CSS selectors.

To use Kanna, you need to import the library into your Swift project. If you haven't already added Kanna to your project, you can add it via CocoaPods, Carthage, or Swift Package Manager.
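
For the Swift Package Manager route, the dependency is declared in Package.swift. The manifest below is a minimal sketch, not taken from the original answer: the package name MyScraper is hypothetical, and the version bound is an assumption that you should check against the current Kanna release.

```swift
// swift-tools-version:5.5
import PackageDescription

let package = Package(
    name: "MyScraper", // hypothetical package name
    dependencies: [
        // Kanna's GitHub repository; the "from" version is an assumption.
        .package(url: "https://github.com/tid-kijyun/Kanna.git", from: "5.2.2")
    ],
    targets: [
        .executableTarget(
            name: "MyScraper",
            dependencies: [
                // Link the Kanna library product into this target.
                .product(name: "Kanna", package: "Kanna")
            ]
        )
    ]
)
```

In an Xcode project, the same repository URL can be added through the package dependency dialog instead of a manifest.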

Here is an example of how to use Kanna to select specific HTML elements:

First, make sure to import Kanna at the top of your Swift file:

import Kanna

Then, you can parse an HTML string and use CSS or XPath selectors to select specific elements:

// Sample HTML string
let htmlString = """
    <html>
        <head>
            <title>My Test Page</title>
        </head>
        <body>
            <h1>Welcome to My Test Page</h1>
            <p class="description">This is a sample paragraph with a <a href="https://example.com">link</a>.</p>
            <ul class="items">
                <li>Item 1</li>
                <li>Item 2</li>
                <li>Item 3</li>
            </ul>
        </body>
    </html>
"""

do {
    // Parse the HTML
    let doc = try HTML(html: htmlString, encoding: .utf8)

    // Read the <title> text through the convenience property
    if let title = doc.title {
        print("Title: \(title)")
    }

    // Select all <a> and <link> elements using a CSS selector group
    for link in doc.css("a, link") {
        print("Link: \(link.text ?? ""), Href: \(link["href"] ?? "")")
    }

    // Select elements using XPath
    for li in doc.xpath("//ul[@class='items']/li") {
        print("List item: \(li.text ?? "")")
    }

    // Select the first <p> element with the class "description"
    if let paragraph = doc.at_css("p.description") {
        print("Paragraph text: \(paragraph.text ?? "")")
    }
} catch {
    print("Error parsing HTML: \(error)")
}

In the example above:

  • We parse the HTML string with HTML(html:encoding:), which returns an HTMLDocument.
  • We use the doc.title property to read the content of the <title> tag.
  • We use the doc.css method with the CSS selector "a, link" to find all <a> and <link> elements and print their text and href attributes. The sample page contains no <link> element, so only the <a> inside the paragraph matches.
  • We use the doc.xpath method with an XPath expression to select all <li> elements that are direct children of the <ul> whose class attribute is exactly items.
  • We use the doc.at_css method to select the first <p> element with the class description.
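
Selectors can also be run against a single element rather than the whole document, which helps when you want to scope a search. The following is a minimal sketch, assuming a recent Kanna release in which elements expose the same xpath and at_xpath methods as the document; the helper name itemTexts is hypothetical.

```swift
import Foundation
import Kanna

/// Hypothetical helper: returns the trimmed text of every <li> that is a
/// direct child of the first <ul class="items">, or an empty array if no
/// such list exists.
func itemTexts(in html: String) throws -> [String] {
    let doc = try HTML(html: html, encoding: .utf8)

    // at_xpath returns only the first matching element, or nil.
    guard let list = doc.at_xpath("//ul[@class='items']") else {
        return []
    }

    // A relative XPath ("./li") is evaluated from `list`, not from the root.
    return list.xpath("./li").compactMap { li in
        li.text?.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}
```

Calling itemTexts(in: htmlString) with the sample page above should yield the three item labels. Using a relative XPath keeps the query confined to the chosen list even if the page contains other <li> elements.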

Remember to handle parsing errors with a do-catch block since the parser can throw an error if there is an issue with the HTML content or encoding.
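
If you would rather not repeat the do-catch block at every call site, one option is to wrap parsing in a small function that turns a failure into nil. This is only a sketch; the helper name pageTitle is hypothetical, and logging with print is an assumption about how you want to report errors.

```swift
import Kanna

/// Hypothetical helper: returns the page title, or nil if parsing fails
/// or the document has no <title>.
func pageTitle(of html: String) -> String? {
    do {
        let doc = try HTML(html: html, encoding: .utf8)
        return doc.title
    } catch {
        // Kanna reports parse failures as thrown errors; log and return nil.
        print("Error parsing HTML: \(error)")
        return nil
    }
}
```

The trade-off is that callers can no longer tell a parse failure from a missing title, so keep the throwing form when that distinction matters.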

Please note that Kanna is specific to Swift and not available for other programming languages like Python or JavaScript. If you are looking for similar libraries in other languages, you can use Beautiful Soup in Python and cheerio in JavaScript (Node.js environment) for HTML parsing and element selection.
