What is the syntax for using CSS selectors in Kanna?

Kanna is a Swift library used for parsing HTML and XML. It provides a way to select and manipulate HTML elements using CSS selectors, which is similar to the functionality provided by libraries like BeautifulSoup in Python or jQuery in JavaScript.

When using Kanna, you typically start by parsing an HTML or XML string into an HTMLDocument or XMLDocument. After that, you can call the css method on the document to select the elements that match a CSS selector.

Here is the basic syntax for using CSS selectors with Kanna in Swift:

import Kanna

// Assume you have an HTML string called `htmlString`
let htmlString = """
<html>
<head>
    <title>Sample Page</title>
</head>
<body>
    <div class="content">
        <p class="text">Hello World!</p>
        <a href="http://example.com" class="link">Example Link</a>
    </div>
</body>
</html>
"""

do {
    // Parse the HTML string into a document
    let doc = try HTML(html: htmlString, encoding: .utf8)

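    // Use a CSS selector to select elements; "a, p" matches both links and paragraphs
    for element in doc.css("a, p") {
        // Extract the text content of each matched element
        print(element.text ?? "")
        // Only the <a> element has an href attribute; the subscript returns nil for <p>
        if let href = element["href"] {
            print(href)
        }
    }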

    // You can also use more specific selectors
    for paragraph in doc.css("div.content p.text") {
        print(paragraph.text ?? "")
    }

} catch {
    print(error)
}

In the above code:

  • We first import the Kanna library.
  • We have an HTML string htmlString that contains the HTML content we want to parse.
  • We parse the HTML content into an HTMLDocument object using try HTML(html:encoding:), which throws if parsing fails.
  • We use the css method to select elements. It takes the CSS selector as a string.
  • The css method returns a collection of elements that match the provided selector.
  • We iterate through the selected elements and extract the content or attributes as needed.
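
The steps above can be extended with a short sketch of how the result of css might be handled when you only need one element or want to check whether anything matched. It assumes the count property on the returned collection and the at_css method, which returns the first match or nil, as exposed by recent Kanna releases. The html constant is a trimmed, hypothetical version of the sample markup.

```swift
import Kanna

// Hypothetical fragment based on the sample page above
let html = """
<div class="content">
    <p class="text">Hello World!</p>
    <a href="http://example.com" class="link">Example Link</a>
</div>
"""

do {
    let doc = try HTML(html: html, encoding: .utf8)

    // css(_:) returns a sequence of matches; count reports how many there are
    let links = doc.css("a.link")
    print(links.count)

    // at_css(_:) returns only the first match, or nil when nothing matches
    if let link = doc.at_css("a.link") {
        print(link.text ?? "")
        print(link["href"] ?? "")
    }

    // A selector with no matches yields an empty result rather than throwing
    print(doc.css("table").count)
} catch {
    print(error)
}
```

Using at_css avoids a loop when the page is expected to contain a single matching element, while checking count first is a simple way to detect that a page layout has changed.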

Here are some examples of CSS selectors you might use with Kanna:

  • doc.css("p") - Selects all <p> elements.
  • doc.css(".content") - Selects elements with the class content.
  • doc.css("#footer") - Selects the element with the ID footer.
  • doc.css("div.content p.text") - Selects <p> elements with the class text inside a <div> with the class content.
  • doc.css("a[href]") - Selects all <a> elements that have an href attribute.
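
To see how these selectors behave, the following sketch runs each of them against one document and prints what matched. The sample markup is hypothetical: it extends the earlier page with a footer and an <a> element without an href so that every selector has something to match or exclude.

```swift
import Kanna

// Hypothetical markup; the footer and the second <a> are added for illustration
let sample = """
<html>
<body>
    <div class="content">
        <p class="text">Hello World!</p>
        <a href="http://example.com" class="link">Example Link</a>
        <a name="anchor-only">No href here</a>
    </div>
    <div id="footer"><p>Footer text</p></div>
</body>
</html>
"""

// The selectors from the list above
let selectors = ["p", ".content", "#footer", "div.content p.text", "a[href]"]

do {
    let doc = try HTML(html: sample, encoding: .utf8)
    for selector in selectors {
        let matches = doc.css(selector)
        print("\(selector): \(matches.count) match(es)")
        for element in matches {
            // tagName and text are optional, so fall back to placeholders
            let tag = element.tagName ?? "?"
            let text = element.text ?? ""
            print("  <\(tag)> \(text)")
        }
    }
} catch {
    print(error)
}
```

Note that "a[href]" skips the second <a> element because it has no href attribute, and "p" also picks up the paragraph inside the footer.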

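Kanna translates CSS selectors into XPath queries internally, so the common selector forms (type, class, ID, attribute, and descendant selectors) work as they do in CSS or in libraries like jQuery. Less common selectors may not be supported, so test them against your markup before relying on them.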
