How do I select sibling or parent elements using Nokogiri?

Selecting Sibling Elements with Nokogiri

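Nokogiri is a Ruby library for parsing HTML and XML. To select sibling nodes, you can use next_sibling and previous_sibling (also available under the aliases next and previous), which return the adjacent node of any type, including text nodes. To skip text and comment nodes and get only elements, use next_element and previous_element. CSS and XPath selectors can also express sibling relationships directly.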

Here's how to select sibling elements using Nokogiri:

Using CSS Selectors

CSS selectors are a convenient way to locate an element in an HTML document. Once you have the element, Nokogiri's navigation methods move from it to its siblings.

require 'nokogiri'

# Assuming you have your HTML content in html_content
doc = Nokogiri::HTML(html_content)

# Locate the reference element first (at_css returns nil if nothing matches)
node = doc.at_css('.your-class')

# Next sibling node of any type; this is often a whitespace text node
sibling = node.next_sibling

# Next sibling element (skips text and comment nodes)
next_element = node.next_element

# Previous sibling node of any type
previous_sibling = node.previous_sibling

# Previous sibling element (skips text and comment nodes)
previous_element = node.previous_element

Using XPath Selectors

XPath is a language for selecting nodes in XML documents, which can also be used with HTML.

require 'nokogiri'

# Assuming you have your HTML content in html_content
doc = Nokogiri::HTML(html_content)

# Next sibling element of the matched node.
# Note: contains() matches substrings, so "your-class" also matches "your-class-wide".
sibling = doc.at_xpath('//*[contains(@class, "your-class")]/following-sibling::*[1]')

# Nearest previous sibling element of the matched node
previous_sibling = doc.at_xpath('//*[contains(@class, "your-class")]/preceding-sibling::*[1]')

Selecting Parent Elements with Nokogiri

To select parent elements of a given node, you can use the parent method or the appropriate XPath selectors.

Using Method Chaining

require 'nokogiri'

# Assuming you have your HTML content in html_content
doc = Nokogiri::HTML(html_content)

# To select the parent of an element with a specific class
parent = doc.at_css('.child-class').parent

Using XPath Selectors

require 'nokogiri'

# Assuming you have your HTML content in html_content
doc = Nokogiri::HTML(html_content)

# To select the parent of an element using XPath
parent = doc.at_xpath('//*[contains(@class, "child-class")]/..')

Examples in Context

Let's say you have the following HTML snippet and you want to select the span element that is a sibling of the div with class my-div, and its parent section.

<section>
  <div class="my-div">This is a div.</div>
  <span>This is a span.</span>
</section>

Select the Sibling span

require 'nokogiri'

html_content = <<-HTML
<section>
  <div class="my-div">This is a div.</div>
  <span>This is a span.</span>
</section>
HTML

doc = Nokogiri::HTML(html_content)
span_sibling = doc.at_css('.my-div').next_sibling # Any node type; here typically the whitespace text before <span>
span_element_sibling = doc.at_css('.my-div').next_element # The <span> element

puts span_element_sibling.to_html # => <span>This is a span.</span>

Select the Parent section

require 'nokogiri'

html_content = <<-HTML
<section>
  <div class="my-div">This is a div.</div>
  <span>This is a span.</span>
</section>
HTML

doc = Nokogiri::HTML(html_content)
parent_section = doc.at_css('.my-div').parent

puts parent_section.to_html # => <section>...</section> (with its inner content)

These examples demonstrate how to navigate through an HTML document using Nokogiri to select sibling and parent elements. Remember to adjust the selectors according to the structure of your HTML document.
