How can I search for elements with a specific CSS class using Nokogiri?

Nokogiri is a Ruby library for parsing HTML and XML documents. To search for elements with a specific CSS class using Nokogiri, you can use the .css method, which allows you to select elements using CSS selectors.

Here is a step-by-step guide on how to search for elements with a specific CSS class using Nokogiri:

  1. Install Nokogiri: If you haven't already, install the Nokogiri gem by running the following command in your terminal:
gem install nokogiri
  2. Require Nokogiri in Your Ruby Script: At the top of your Ruby script, require the Nokogiri library:
require 'nokogiri'
  3. Parse the HTML Document: Use Nokogiri to parse the HTML document you want to search. The HTML content can come from a file, a string, or the body of a web response. Pick whichever source applies; the two assignments below are alternatives, not consecutive steps.
# Option A: read the HTML from a file
html_content = File.read('example.html')

# Option B: use an HTML string directly
html_content = '<div class="my-class">Content</div>'

# Parse the HTML content
doc = Nokogiri::HTML(html_content)
  4. Search for Elements with a Specific CSS Class: Call the .css method with a class selector (a dot followed by the class name) to find all elements that have that class.
# Search for elements with the class "my-class"
elements = doc.css('.my-class')
  5. Iterate Over and Work with the Elements: The .css method returns a collection of matching elements, which you can iterate over and process as needed.
# Iterate over each element with the class "my-class"
elements.each do |element|
  puts element.content # Output the content of each element
end

Here's a full example that combines all the steps:

require 'nokogiri'

# Sample HTML content
html_content = <<-HTML
<html>
<head>
  <title>My webpage</title>
</head>
<body>
  <div class="my-class">Content 1</div>
  <div class="my-class">Content 2</div>
  <div class="other-class">Other Content</div>
</body>
</html>
HTML

# Parse the HTML content
doc = Nokogiri::HTML(html_content)

# Search for elements with the class "my-class"
elements = doc.css('.my-class')

# Iterate over each element with the class "my-class"
elements.each do |element|
  puts element.content # Output the content of each element
end

When you run the above Ruby script, it will output the content of each div element with the class my-class:

Content 1
Content 2

Remember that the .css method can take any valid CSS selector, so you can also search for elements by tag name, ID, attribute, or any combination thereof.
