GoQuery is a Go library that brings jQuery-style syntax to HTML document traversal and manipulation. To select elements with specific attributes, you pass its Find method the same CSS attribute selectors you would use in jQuery.
Here's how you can select elements with specific attributes using GoQuery:
- Import GoQuery:
First, install the package with go get github.com/PuerkitoBio/goquery, then import it:
package main

import (
	"fmt" // used by the selection examples
	"log"
	"net/http"

	"github.com/PuerkitoBio/goquery"
)
- Load the HTML document: You can load an HTML document from a URL, a file, or a string. Here's an example of how to load an HTML document from a URL:
// Request the HTML page.
res, err := http.Get("http://example.com/")
if err != nil {
	log.Fatal(err)
}
defer res.Body.Close()
if res.StatusCode != http.StatusOK {
	log.Fatalf("status code error: %d %s", res.StatusCode, res.Status)
}

// Load the HTML document.
doc, err := goquery.NewDocumentFromReader(res.Body)
if err != nil {
	log.Fatal(err)
}
- Select elements with specific attributes: Pass a CSS attribute selector to doc.Find, then iterate over the matches with Each. Here are some examples:
Select elements by attribute presence:
doc.Find("[data-custom]").Each(func(i int, s *goquery.Selection) {
	// For each element found, do something.
	// s is the found Selection.
	fmt.Println(s.Text())
})
Select elements by attribute value:
doc.Find("input[type='checkbox']").Each(func(i int, s *goquery.Selection) {
	// Process each checkbox input.
	// goquery has no Val method, so read the value attribute directly.
	value, _ := s.Attr("value")
	fmt.Println(value)
})
Select elements whose attribute value contains a specified substring:
doc.Find("a[href*='example']").Each(func(i int, s *goquery.Selection) {
	// Process each link whose href contains "example"
	href, _ := s.Attr("href")
	fmt.Println(href)
})
Select elements whose attribute value starts with a specified string:
doc.Find("img[src^='http']").Each(func(i int, s *goquery.Selection) {
	// Process each img whose src attribute starts with "http"
	src, _ := s.Attr("src")
	fmt.Println(src)
})
Select elements whose attribute value ends with a specified string:
doc.Find("link[href$='.css']").Each(func(i int, s *goquery.Selection) {
	// Process each link element whose href attribute ends with ".css"
	href, _ := s.Attr("href")
	fmt.Println(href)
})
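In real-world code, handle errors appropriately instead of exiting with log.Fatal, and check the boolean that Attr returns alongside the value. The selection examples above discard that boolean for brevity, so a missing attribute is indistinguishable from an empty one.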
By mastering GoQuery's selection methods, you can effectively scrape and manipulate HTML documents according to your requirements.