How do I select elements with specific attributes using GoQuery?

GoQuery is a library for Go that emulates jQuery syntax for HTML document traversal and manipulation, making it very convenient to work with HTML content. When working with GoQuery to select elements with specific attributes, you can use the same CSS selector syntax as you would in jQuery.

Here's how you can select elements with specific attributes using GoQuery:

  1. Import GoQuery: First, install the package with go get github.com/PuerkitoBio/goquery, then import it along with the standard library packages used in the examples below.
   package main

   import (
       "fmt"
       "log"
       "net/http"

       "github.com/PuerkitoBio/goquery"
   )
  2. Load the HTML document: You can load an HTML document from a URL, a file, or a string. Here's an example of how to load an HTML document from a URL (these statements go inside a function such as main):
   // Request the HTML page.
   res, err := http.Get("http://example.com/")
   if err != nil {
       log.Fatal(err)
   }
   defer res.Body.Close()
   if res.StatusCode != http.StatusOK {
       log.Fatalf("status code error: %d %s", res.StatusCode, res.Status)
   }

   // Load the HTML document
   doc, err := goquery.NewDocumentFromReader(res.Body)
   if err != nil {
       log.Fatal(err)
   }
  3. Select elements with specific attributes: Pass a CSS attribute selector to doc.Find to match elements by their attributes. Here are some examples:
  • Select elements by attribute presence:

     doc.Find("[data-custom]").Each(func(i int, s *goquery.Selection) {
         // For each element found, do something
         // s is the found Selection
         fmt.Println(s.Text())
     })
    
  • Select elements by attribute value:

     doc.Find("input[type='checkbox']").Each(func(i int, s *goquery.Selection) {
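         // Process each checkbox input. goquery's Selection has no Val()
         // method, so read the value attribute directly.
         value, _ := s.Attr("value")
         fmt.Println(value)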
     })
    
  • Select elements whose attribute value contains a specified substring:

     doc.Find("a[href*='example']").Each(func(i int, s *goquery.Selection) {
         // Process each link whose href contains "example"
         href, _ := s.Attr("href")
         fmt.Println(href)
     })
    
  • Select elements whose attribute value starts with a specified string:

     doc.Find("img[src^='http']").Each(func(i int, s *goquery.Selection) {
         // Process each img whose src attribute starts with "http"
         src, _ := s.Attr("src")
         fmt.Println(src)
     })
    
  • Select elements whose attribute value ends with a specified string:

     doc.Find("link[href$='.css']").Each(func(i int, s *goquery.Selection) {
         // Process each link element whose href attribute ends with ".css"
         href, _ := s.Attr("href")
         fmt.Println(href)
     })
    

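Remember to check errors in real-world code and handle them appropriately. Note that s.Attr returns the attribute value together with a boolean reporting whether the attribute exists; the examples above discard that boolean with _ for brevity.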

By mastering GoQuery's selection methods, you can effectively scrape and manipulate HTML documents according to your requirements.
