How do I filter elements by their text content in SwiftSoup?

To filter elements by their text content in SwiftSoup, you can use the getElementsContainingOwnText(_:) method (or the equivalent :containsOwn(text) selector) to find elements that directly contain the specified text, or you can iterate through the elements and check their text content manually. SwiftSoup is a Swift library for parsing and manipulating HTML and XML documents.

Here is an example of how you might use SwiftSoup to filter elements by their text content:

import SwiftSoup

func filterElementsByTextContent(html: String, searchText: String) throws -> Elements {
    // Parse the HTML content into a Document
    let document = try SwiftSoup.parse(html)

    // Built-in search: getElementsContainingOwnText matches case-insensitively on each
    // element's own text. Return this result directly if it is all you need.
    let elementsContainingText = try document.getElementsContainingOwnText(searchText)

    // For exact matches or custom conditions, iterate and test each element yourself.
    // ownText() is used because text() also includes descendant text, so ancestors
    // such as <html> and <body> would match as well.
    let allElements = try document.getAllElements()
    let filteredElements = Elements()
    for element in allElements.array() {
        let text = element.ownText()
        if text.contains(searchText) { // case-sensitive; use `text == searchText` for an exact match
            try filteredElements.add(element)
        }
    }

    return filteredElements
}

// Example usage:
let html = """
<html>
    <body>
        <div>Hello, world!</div>
        <div>Welcome to SwiftSoup</div>
        <p>This is a paragraph.</p>
        <p>Another paragraph with SwiftSoup</p>
    </body>
</html>
"""

do {
    let elementsWithText = try filterElementsByTextContent(html: html, searchText: "SwiftSoup")
    for element in elementsWithText.array() {
        print(try element.outerHtml())
    }
} catch {
    print("An error occurred: \(error)")
}

In this example, the filterElementsByTextContent function takes an HTML string and a search text and returns an Elements collection of the elements whose text passed the manual check. That check uses String.contains, which is case-sensitive. The getElementsContainingOwnText method, by contrast, matches case-insensitively and finds elements that contain the given text directly within themselves, not within a child element.
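The own-text match can also be expressed as a selector. The following is a minimal sketch, assuming SwiftSoup's jsoup-derived :containsOwn(text) and :contains(text) pseudo-selectors; the helper name outerHtmlOfMatches and the sample HTML are made up for illustration:

```swift
import SwiftSoup

// Hypothetical helper: runs a selector and returns the outer HTML of each match
func outerHtmlOfMatches(for selector: String, in html: String) throws -> [String] {
    let document = try SwiftSoup.parse(html)
    return try document.select(selector).array().map { try $0.outerHtml() }
}

// Made-up sample input: the second "SwiftSoup" sits inside a nested <b>
let sampleHtml = "<div>Welcome to SwiftSoup</div><p>Another paragraph with <b>SwiftSoup</b></p>"

do {
    // :containsOwn(text) checks only the text directly inside each element
    let own = try outerHtmlOfMatches(for: "p:containsOwn(SwiftSoup)", in: sampleHtml)
    // :contains(text) also checks text nested in child elements
    let deep = try outerHtmlOfMatches(for: "p:contains(SwiftSoup)", in: sampleHtml)
    print("own:", own)
    print("deep:", deep)
} catch {
    print("An error occurred: \(error)")
}
```

With this input, the first query should come back empty, because the word only appears inside the nested <b> element, while the second should return the paragraph.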

If you need more control over the matching criteria, you can iterate over all the elements and check their text content manually. This allows you to perform operations like exact matches, case-insensitive comparisons, or even regular expression matches.
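As a rough sketch of those custom criteria, the helper below takes the matching rule as a closure. The name filterElements and the sample HTML are hypothetical, and the matching itself relies on Foundation's String.range(of:options:) rather than anything SwiftSoup-specific:

```swift
import Foundation
import SwiftSoup

// Hypothetical helper: keeps elements whose own text satisfies `predicate`
func filterElements(in html: String, where predicate: (String) -> Bool) throws -> [Element] {
    let document = try SwiftSoup.parse(html)
    return try document.getAllElements().array().filter { predicate($0.ownText()) }
}

// Made-up sample input
let sample = "<div>Hello, world!</div><p>hello again</p><p>Item 42</p>"

do {
    // Exact match on the element's own text
    let exact = try filterElements(in: sample) { $0 == "Hello, world!" }

    // Case-insensitive substring match
    let insensitive = try filterElements(in: sample) {
        $0.range(of: "hello", options: .caseInsensitive) != nil
    }

    // Regular expression match: own text that ends in digits
    let numbered = try filterElements(in: sample) {
        $0.range(of: "[0-9]+$", options: .regularExpression) != nil
    }

    print(exact.count, insensitive.count, numbered.count)
} catch {
    print("An error occurred: \(error)")
}
```

Passing the rule as a closure keeps the traversal in one place, so switching from an exact match to a regular expression only changes the call site.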

Remember to handle errors properly by wrapping these calls in do-catch blocks (or propagating them with throws), as many SwiftSoup methods are marked throws and must be called with try.
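One possible shape for that error handling is sketched below. It assumes SwiftSoup's Exception.Error(type:Message:) error case, as shown in the library's README; the function name ownTextMatches is hypothetical, and returning an empty array on failure is just one policy among several:

```swift
import SwiftSoup

// Hypothetical helper: returns the outer HTML of own-text matches,
// or an empty array if parsing or querying fails
func ownTextMatches(in html: String, text: String) -> [String] {
    do {
        let document = try SwiftSoup.parse(html)
        let matches = try document.getElementsContainingOwnText(text)
        return try matches.array().map { try $0.outerHtml() }
    } catch Exception.Error(let type, let message) {
        // Errors raised by SwiftSoup itself carry a type and a message
        print("SwiftSoup error (\(type)): \(message)")
        return []
    } catch {
        // Anything else that might be thrown
        print("Unexpected error: \(error)")
        return []
    }
}

let results = ownTextMatches(in: "<p>Another paragraph with SwiftSoup</p>", text: "SwiftSoup")
results.forEach { print($0) }
```

Catching the library's error case first lets you log its message separately from unrelated failures.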
