Yes, Nokogiri can parse and extract information from RSS or Atom feeds. Nokogiri is a popular Ruby library for parsing HTML and XML, and since RSS and Atom feeds are XML-based formats, Nokogiri is well-suited for this task.
Here's an example of how you could use Nokogiri to parse an RSS feed and extract some basic information from it:
require 'nokogiri'
require 'open-uri'
# URL of the RSS or Atom feed
feed_url = 'http://example.com/feed.xml'
# Open and read the feed (URI.open is required on Ruby 3.0+,
# where Kernel#open no longer accepts URLs)
xml_content = URI.open(feed_url).read
# Parse the feed content with Nokogiri
doc = Nokogiri::XML(xml_content)
# Extract information from the feed
doc.xpath('//item').each do |item|
  title = item.xpath('title').text
  link = item.xpath('link').text
  description = item.xpath('description').text
  puts "Title: #{title}"
  puts "Link: #{link}"
  puts "Description: #{description}"
  puts "---"
end
In this example, we're using open-uri to fetch the feed and Nokogiri::XML to parse it. We then navigate through the XML structure using XPath queries to extract the title, link, and description of each item in the feed.
Please replace 'http://example.com/feed.xml' with the actual URL of the RSS or Atom feed you want to parse.
Remember to handle exceptions and errors that might occur when fetching or parsing the feed, such as network errors or invalid XML content.
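Also, keep in mind that RSS and Atom are structured differently, so the XPath expressions above only match RSS feeds. Atom uses entry elements instead of item, declares a default XML namespace (http://www.w3.org/2005/Atom) that your XPath queries must reference, and stores each link in the href attribute of a link element rather than as element text. Adjust the expressions to the specific format of the feed you're working with.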