Nokogiri is a popular Ruby library for parsing HTML and XML. It provides a simple way to navigate and manipulate these types of documents. To use CSS selectors with Nokogiri, call the css method, which selects elements much as you would in a web browser with JavaScript or in CSS itself.
Here's the basic syntax for using CSS selectors in Nokogiri:
require 'nokogiri'
require 'open-uri'
# Load the HTML document (URI.open comes from open-uri; plain open no longer accepts URLs as of Ruby 3.0)
html = URI.open('http://example.com/')
doc = Nokogiri::HTML(html)
# Select elements using CSS selectors
elements = doc.css('selector')
# Example: Select all paragraph elements
paragraphs = doc.css('p')
# Example: Select elements with class 'example'
class_elements = doc.css('.example')
# Example: Select elements with id 'header'
id_elements = doc.css('#header')
# Example: Select all links within a list item
links_in_list = doc.css('li a')
The css method can also be called on any Nokogiri element, not just the document, to select descendants of that element.
# Select a specific div by id (css returns a NodeSet even when only one element matches)
specific_div = doc.css('#specific_div')
# Within that div, select elements with the class 'nested'
nested_elements = specific_div.css('.nested')
Nokogiri also supports many CSS pseudo-classes, which are useful for selecting elements by their position in the document. Because Nokogiri works on a static parsed document, the structural pseudo-classes are the relevant ones here.
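# Select every paragraph that is the first <p> among its siblings
# (for only the first paragraph in the document, use doc.at_css('p'))
first_paragraph = doc.css('p:first-of-type')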
# Select all even rows in a table
even_rows = doc.css('tr:nth-child(even)')
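Remember that CSS selectors in Nokogiri are case-sensitive, so class names and IDs must match the case used in the document. For HTML documents the parser normalizes element names to lowercase, so write element names in lowercase in your selectors; for XML documents, element names must match the source exactly.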
Keep in mind that Nokogiri's css method returns a NodeSet, which is similar to an array and contains all the elements that match the selector. You can iterate over this NodeSet or access individual elements by index.