SwiftSoup is a pure Swift library for parsing and manipulating HTML, similar to jsoup for Java. SwiftSoup has no method of its own for downloading and saving images from a webpage, but you can use it to extract the image URLs and then download the images with URLSession or another networking API available in Swift.
Here's a step-by-step guide on how to use SwiftSoup to download and save images from a webpage:
Step 1: Parse the HTML document
Start by fetching the HTML content of the webpage and parsing it using SwiftSoup.
import Foundation
import SwiftSoup

// Fetches the page at `url` and parses it into a SwiftSoup Document.
func fetchHTML(from url: String, completion: @escaping (Result<Document, Error>) -> Void) {
    guard let url = URL(string: url) else {
        completion(.failure(NSError(domain: "", code: 0, userInfo: [NSLocalizedDescriptionKey: "Invalid URL"])))
        return
    }
    URLSession.shared.dataTask(with: url) { data, response, error in
        if let error = error {
            completion(.failure(error))
            return
        }
        // Assumes the page is UTF-8 encoded; other encodings fail here.
        guard let data = data, let html = String(data: data, encoding: .utf8) else {
            completion(.failure(NSError(domain: "", code: 0, userInfo: [NSLocalizedDescriptionKey: "Failed to decode HTML"])))
            return
        }
        do {
            // Pass the page URL as the base URI so SwiftSoup can resolve relative links.
            let document = try SwiftSoup.parse(html, url.absoluteString)
            completion(.success(document))
        } catch {
            completion(.failure(error))
        }
    }.resume()
}
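One thing to watch in this step: String(data:encoding:) with .utf8 returns nil for pages served in another encoding, so fetchHTML reports "Failed to decode HTML" for them. A minimal sketch of a more forgiving decoder, assuming a hypothetical helper named decodeHTML (not part of SwiftSoup or Foundation) and assuming ISO Latin-1 is an acceptable fallback for your pages:

```swift
import Foundation

/// Hypothetical helper: decodes raw page bytes, trying UTF-8 first and
/// falling back to ISO Latin-1, which accepts any byte sequence.
func decodeHTML(_ data: Data) -> String {
    if let utf8 = String(data: data, encoding: .utf8) {
        return utf8
    }
    // ISO Latin-1 maps every byte to a character, so decoding succeeds,
    // but the text may be wrong if the page used yet another encoding.
    return String(data: data, encoding: .isoLatin1) ?? ""
}

let utf8Bytes = Data("café".utf8)
let latin1Bytes = Data([0x63, 0x61, 0x66, 0xE9]) // "café" encoded as ISO Latin-1
print(decodeHTML(utf8Bytes))
print(decodeHTML(latin1Bytes))
```

If the server sends a charset in its Content-Type header, honoring that value would likely be more accurate than a fixed fallback.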
Step 2: Extract image URLs
Once you have the parsed HTML Document, extract all the image URLs using SwiftSoup.
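// Collects the src of every <img> element as a URL.
func extractImageURLs(from document: Document) -> [URL] {
    do {
        let imageElements = try document.select("img")
        let srcs = imageElements.array().compactMap { element -> String? in
            // absUrl resolves relative paths but returns "" when the document has no base URI,
            // so fall back to the raw src attribute in that case.
            let absolute = (try? element.absUrl("src")) ?? ""
            let raw = (try? element.attr("src")) ?? ""
            let src = (absolute.isEmpty ? raw : absolute).trimmingCharacters(in: .whitespacesAndNewlines)
            return src.isEmpty ? nil : src
        }
        return srcs.compactMap { URL(string: $0) }
    } catch {
        print("Error extracting image URLs: \(error)")
        return []
    }
}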
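Many pages use relative src values such as /images/logo.png, and a URL built from one of those has no scheme or host, so URLSession cannot download it. If you have the page URL at hand, you can resolve each value against it with Foundation alone. This is a minimal sketch; resolveImageURLs is a hypothetical helper name, and dropping everything that is not http or https (for example inline data: URIs) is an assumption about what you want to download:

```swift
import Foundation

/// Hypothetical helper: resolves raw src strings against the page URL and
/// keeps only http(s) results.
func resolveImageURLs(_ sources: [String], relativeTo pageURL: URL) -> [URL] {
    return sources.compactMap { source -> URL? in
        let trimmed = source.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmed.isEmpty,
              let resolved = URL(string: trimmed, relativeTo: pageURL)?.absoluteURL,
              let scheme = resolved.scheme?.lowercased(),
              scheme == "http" || scheme == "https" else {
            return nil // Skips empty values, data: URIs, and unparseable strings.
        }
        return resolved
    }
}

let page = URL(string: "https://example.com/blog/post.html")!
let urls = resolveImageURLs(
    ["/images/logo.png", "photo.jpg", "//cdn.example.com/a.png", "data:image/png;base64,AAAA", "  "],
    relativeTo: page
)
urls.forEach { print($0.absoluteString) }
```

Here the root-relative, document-relative, and protocol-relative values are all resolved against the page URL, while the data: URI and the blank string are dropped.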
Step 3: Download and save images
With the URLs extracted, you can now download and save the images to the local file system.
// Downloads the image at `url` and writes it into `directory`.
func downloadImage(from url: URL, to directory: URL, completion: @escaping (Error?) -> Void) {
    URLSession.shared.dataTask(with: url) { data, response, error in
        if let error = error {
            completion(error)
            return
        }
        guard let data = data, let httpResponse = response as? HTTPURLResponse, httpResponse.statusCode == 200 else {
            completion(NSError(domain: "", code: 0, userInfo: [NSLocalizedDescriptionKey: "Invalid response or data"]))
            return
        }
        // The file name comes from the URL, so images that share a last path component overwrite each other.
        let fileName = url.lastPathComponent
        let fileURL = directory.appendingPathComponent(fileName)
        do {
            try data.write(to: fileURL)
            completion(nil)
        } catch {
            completion(error)
        }
    }.resume()
}
// Example usage
let urlString = "http://example.com"
let saveDirectory = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
fetchHTML(from: urlString) { result in
    switch result {
    case .success(let document):
        let imageURLs = extractImageURLs(from: document)
        imageURLs.forEach { imageURL in
            downloadImage(from: imageURL, to: saveDirectory) { error in
                if let error = error {
                    print("Error downloading image: \(error)")
                } else {
                    print("Image downloaded: \(imageURL.lastPathComponent)")
                }
            }
        }
    case .failure(let error):
        print("Error fetching HTML: \(error)")
    }
}
In this Swift code:
- We first define a function fetchHTML to fetch the HTML content of a webpage and parse it using SwiftSoup.
- Then, extractImageURLs takes the parsed Document and extracts the src attribute of all img tags, returning an array of URL objects.
- The downloadImage function then takes each URL, downloads the image data using URLSession, and writes it to the specified directory (the example gets the Documents directory from FileManager).
- Finally, we use these functions together to download images from a given webpage URL.
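Because downloadImage names each file after the last path component of its URL, two images called logo.png from different folders would overwrite each other. One way around that is to pick a free file name before writing. The sketch below is only an illustration: uniqueFileURL is a hypothetical helper, and the "image" placeholder name and the -1, -2 suffix scheme are assumptions, not anything SwiftSoup or Foundation prescribes:

```swift
import Foundation

/// Hypothetical helper: picks a destination inside `directory` that does not
/// collide with an existing file, appending -1, -2, ... before the extension.
func uniqueFileURL(for remoteURL: URL, in directory: URL) -> URL {
    var name = remoteURL.lastPathComponent
    if name.isEmpty || name == "/" {
        name = "image" // Placeholder (an assumption) for URLs without a file name.
    }
    let nameURL = URL(fileURLWithPath: name)
    let base = nameURL.deletingPathExtension().lastPathComponent
    let ext = nameURL.pathExtension
    var candidate = directory.appendingPathComponent(name)
    var counter = 1
    while FileManager.default.fileExists(atPath: candidate.path) {
        let numbered = ext.isEmpty ? "\(base)-\(counter)" : "\(base)-\(counter).\(ext)"
        candidate = directory.appendingPathComponent(numbered)
        counter += 1
    }
    return candidate
}

// Demonstration in a fresh temporary directory.
let directory = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString)
try FileManager.default.createDirectory(at: directory, withIntermediateDirectories: true)
let remote = URL(string: "https://example.com/images/logo.png")!

let first = uniqueFileURL(for: remote, in: directory)
try Data().write(to: first)
let second = uniqueFileURL(for: remote, in: directory)
print(first.lastPathComponent, second.lastPathComponent)
```

Note that the existence check and the write are separate steps, so when several downloads finish at the same moment two of them could still pick the same name; serializing the writes on one queue would be one way to close that gap.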
Please make sure you have the legal right to scrape the content from the webpage and that you are complying with the website's terms of service or robots.txt file.