Kanna is a Swift library used for parsing HTML and XML. It provides a way to select and manipulate HTML elements using CSS selectors, which is similar to the functionality provided by libraries like BeautifulSoup in Python or jQuery in JavaScript.
When using Kanna, you typically start by parsing an HTML string into an HTMLDocument or XMLDocument. After that, you can use the css method to select elements with CSS selectors.
Here is the basic syntax for using CSS selectors with Kanna in Swift:
import Kanna

// Assume you have an HTML string called `htmlString`
let htmlString = """
<html>
  <head>
    <title>Sample Page</title>
  </head>
  <body>
    <div class="content">
      <p class="text">Hello World!</p>
      <a href="http://example.com" class="link">Example Link</a>
    </div>
  </body>
</html>
"""

do {
    // Parse the HTML string into a document
    let doc = try HTML(html: htmlString, encoding: .utf8)

    // Use CSS selector to select the element(s)
    for link in doc.css("a, p") {
        // Extract the content or attributes of the elements
        print(link.text ?? "")
        if let href = link["href"] {
            print(href)
        }
    }

    // You can also use more specific selectors
    for paragraph in doc.css("div.content p.text") {
        print(paragraph.text ?? "")
    }
} catch {
    print(error)
}
In the above code:
- We first import the Kanna library.
- We have an HTML string, htmlString, that contains the HTML content we want to parse.
- We parse the HTML content into an HTMLDocument object using try HTML(html:encoding:).
- We use the css method to select elements. You pass a CSS selector string to this method.
- The css method returns a collection of elements that match the provided selector.
- We iterate through the selected elements and extract the content or attributes as needed.
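The steps above can be wrapped in a small reusable function. The following is a minimal sketch, not part of Kanna itself: the name extractLinks(from:) is a hypothetical helper introduced here for illustration, and it assumes the input is UTF-8 HTML.

```swift
import Kanna

/// Hypothetical helper (not a Kanna API): returns the href value of every
/// <a> element in the given HTML that actually has an href attribute.
func extractLinks(from html: String) throws -> [String] {
    // Parse the HTML string; this throws if the document cannot be parsed.
    let doc = try HTML(html: html, encoding: .utf8)

    // The result of css(_:) can be iterated like any Swift sequence, so
    // compactMap collects the non-nil href attribute of each match.
    return doc.css("a[href]").compactMap { $0["href"] }
}
```

Calling try extractLinks(from: htmlString) with the sample page above should yield a single-element array containing the example.com URL, since that page has only one anchor.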
Here are some examples of CSS selectors you might use with Kanna:
- doc.css("p") - Selects all <p> elements.
- doc.css(".content") - Selects elements with the class content.
- doc.css("#footer") - Selects the element with the ID footer.
- doc.css("div.content p.text") - Selects <p> elements with the class text inside a <div> with the class content.
- doc.css("a[href]") - Selects all <a> elements that have an href attribute.
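When only the first match is needed, as with an ID selector such as #footer, iterating over a collection is unnecessary. The sketch below assumes the at_css method that Kanna provides alongside css, which returns the first matching element or nil. The function name firstText(matching:in:) and the sample markup are hypothetical and exist only for this illustration.

```swift
import Kanna

/// Hypothetical helper (not a Kanna API): returns the text of the first
/// element matching `selector`, or nil when nothing matches.
func firstText(matching selector: String, in html: String) throws -> String? {
    let doc = try HTML(html: html, encoding: .utf8)
    // at_css returns an optional element, so optional chaining covers the
    // "no match" case without an explicit loop.
    return doc.at_css(selector)?.text
}

// Sample markup made up for this sketch.
let sampleHTML = """
<div class="content"><p class="text">Hello World!</p></div>
<div id="footer"><p>Footer text</p></div>
"""
```

For instance, try firstText(matching: "#footer p", in: sampleHTML) should return the footer paragraph's text, while a selector that matches nothing, such as "h1", should return nil.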
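Remember that Kanna translates CSS selectors into XPath queries internally, so the common selector forms you would use in CSS or in libraries like jQuery (tags, classes, IDs, attributes, and combinators) work as expected. Less common selectors and pseudo-classes may not be supported, so test them against a sample document before relying on them.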