Web feeds (sometimes known as news feeds, and more commonly as just “feeds”) are special XML files designed to contain multiple items of content (such as news). They’re commonly used by blogs and news sites as a way for users to subscribe to them. A feed reader regularly checks the RSS and Atom feeds (the two most popular feed formats) of the sites the user is subscribed to, and notifies the user whenever a new item appears within a feed. Most feeds let users read a synopsis of each item and click a link to visit the site that has been updated.

■ Note Another common use for feeds has been in delivering podcasts, a popular method of distributing audio content online in a radio subscription–type format.

Processing RSS and Atom feeds has become a popular task in languages such as Ruby. As feeds are formatted in a machine-friendly format, they’re easier for programs to process and use than scanning through inconsistent HTML.
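To make the point about machine-friendly formatting concrete, here is a minimal sketch that pulls item titles and links out of a small RSS document using REXML. The sample feed, the extract_items helper, and the example.com URLs are made up for illustration, and the code assumes a plain RSS 2.0 layout (rss/channel/item); real feeds vary far more than this.

```ruby
require 'rexml/document'

# A hand-written RSS 2.0 snippet standing in for a real feed (illustrative data only).
sample_rss = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Feed</title>
      <link>http://example.com/</link>
      <item>
        <title>First post</title>
        <link>http://example.com/first</link>
        <description>Summary of the first post.</description>
      </item>
      <item>
        <title>Second post</title>
        <link>http://example.com/second</link>
        <description>Summary of the second post.</description>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: collect the title and link of every <item> in the channel.
# Missing child elements come back as nil rather than raising an error.
def extract_items(xml)
  doc = REXML::Document.new(xml)
  items = []
  doc.elements.each('rss/channel/item') do |item|
    items << {
      title: item.elements['title']&.text,
      link: item.elements['link']&.text
    }
  end
  items
end

extract_items(sample_rss).each do |item|
  puts "#{item[:title]} (#{item[:link]})"
end
```

Because every item sits at a predictable path, a single XPath-style expression is enough to find them all. The same job against arbitrary HTML would need site-specific scraping rules. Note that this sketch handles only one tidy RSS layout and knows nothing about Atom or malformed feeds.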
FeedTools (http://sporkmonger.com/projects/feedtools/) is a Ruby library for handling RSS and Atom feeds. It’s available as a RubyGem with gem install feedtools. It’s a liberal feed parser, which means it tends to excuse as many faults and formatting problems in the feeds it reads as possible. This makes it an ideal choice for processing feeds, rather than creating your own parser manually with REXML or another XML library.
For the examples in this section you’ll use the RSS feed provided by RubyInside.com, a popular Ruby weblog. Let’s look at how to process a feed rapidly by retrieving it from the Web and printing out various details about it and its constituent items:

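require 'rubygems'
require 'feed_tools'

# Fetch and parse the feed (this makes a network request)
feed = FeedTools::Feed.open('http://www.rubyinside.com/feed/')
puts "This feed's title is #{feed.title}"
puts "This feed's Web site is at #{feed.link}"

# Interpolation copes with a nil title or description, where + would raise a TypeError
feed.items.each do |item|
  puts "#{item.title}\n---\n#{item.description}\n\n"
end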
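Item descriptions may contain HTML markup and can run long, which makes raw console output hard to read. The following is a minimal sketch of a formatting helper, not part of FeedTools: plain_text and summarize are hypothetical names, the 80-character limit is arbitrary, and a Struct stands in for a real feed item on the assumption that items respond to title and description as in the example above.

```ruby
# Crude tag stripper: removes anything that looks like <...> and squeezes whitespace.
# A regular expression is not a real HTML parser; this is only for quick console output.
def plain_text(html)
  html.to_s.gsub(/<[^>]*>/, ' ').gsub(/\s+/, ' ').strip
end

# Build a short, nil-safe summary for any object responding to title and description.
def summarize(item, limit = 80)
  title = item.title.to_s.strip
  title = '(untitled)' if title.empty?
  text = plain_text(item.description)
  text = "#{text[0, limit].rstrip}..." if text.length > limit
  "#{title}\n---\n#{text}"
end

# Stand-in for a feed item (hypothetical test double, not a FeedTools class).
sample_item = Struct.new(:title, :description)
post = sample_item.new('Hello', '<p>Some <b>bold</b> text</p>')
puts summarize(post)
```

Inside the feed.items.each loop, calling puts summarize(item) would take the place of building the output string by hand.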
