Is it possible to modify HTML or XML content with Nokogiri?

Yes. Nokogiri is a Ruby gem for parsing and manipulating HTML and XML documents. It lets you search a document with XPath or CSS selectors and then change the content, attributes, and structure of the nodes you find.

Below is an example of how you can modify HTML content using Nokogiri in Ruby:

require 'nokogiri'

# Sample HTML content
html_content = <<-HTML
<html>
  <head>
    <title>My Sample Page</title>
  </head>
  <body>
    <h1>Welcome to My Sample Page</h1>
    <p id="greeting">Hello, World!</p>
  </body>
</html>
HTML

# Parse the HTML content
doc = Nokogiri::HTML(html_content)

# Modify the title
doc.at_css('title').content = 'My New Title'

# Add a class to the paragraph
doc.at_css('#greeting')['class'] = 'welcome-message'

# Change the text of the paragraph
doc.at_css('#greeting').content = 'Hello, Nokogiri!'

# Add a new element
new_paragraph = Nokogiri::XML::Node.new('p', doc)
new_paragraph.content = 'This is a new paragraph.'
doc.at_css('body') << new_paragraph

# Print the modified HTML
puts doc.to_html

The example above does the following:

  1. Parses the given HTML content into a Nokogiri document.
  2. Modifies the content of the <title> tag.
  3. Adds a new class to the paragraph with the id greeting.
  4. Changes the text of the paragraph with the id greeting.
  5. Creates a new paragraph and appends it to the <body> of the HTML.

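After running this code, the output reflects the changes made to the original HTML. It will look similar to the following; the exact whitespace and the DOCTYPE and <meta> lines that the parser adds can vary with the Nokogiri and libxml2 versions: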

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>My New Title</title>
  </head>
  <body>
    <h1>Welcome to My Sample Page</h1>
    <p id="greeting" class="welcome-message">Hello, Nokogiri!</p>
    <p>This is a new paragraph.</p>
  </body>
</html>

To work with an XML document, parse it with Nokogiri::XML instead of Nokogiri::HTML. The same search and modification methods apply.

Keep in mind that Nokogiri is a Ruby library and is not available for JavaScript. For similar operations in JavaScript, you can use cheerio (server-side manipulation with a jQuery-like syntax) or the built-in DOM API in the browser.
