XML TUTORIALS

RSS 2.0
By: O'Reilly Media
    2008-04-03

    Table of Contents:
  • RSS 2.0
  • The Basic Structure
  • item Elements
  • The Simplest Possible RSS 2.0 Feed



    RSS 2.0 - The Basic Structure


    (Page 2 of 4 )

    The top level of an RSS 2.0 document is the <rss version="2.0"> element. It contains a single channel element. The channel element contains the entire feed contents and all associated metadata.

    Required Channel Subelements

    There are 3 required and 16 optional subelements of channel within RSS 2.0. Here are the required subelements:

    title

    The name of the feed. In most cases, this is the same name as the associated web site or service.

      <title>RSS and Atom</title>

    link

    A URL pointing to the associated resource, usually a web site. The link must use an IANA-registered URI scheme, such as http://, https://, news://, or ftp://, though it isn't necessary for an application developer to support all of these by default. The most common by a large margin is http://. For example:

      <link>http://www.benhammersley.com</link>

    description

    Some words to describe your channel.

      <description>This is a nice RSS 2.0 feed of an even nicer weblog</description>

    Although it isn't explicitly stated in the specification, it is highly recommended that you do not put anything other than plain text in the channel/title or channel/description elements. There are some existing feeds with HTML within those elements, but these cause a considerable amount of wailing, and at least a small amount of gnashing of teeth. Do not do it. Use plain text only in these elements. The following sidebar, "Including HTML Within title or description," gives a fuller account of this, but in my opinion it's a bad idea.
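    As a minimal sketch of the structure described so far, the following Python builds an rss element holding one channel with the three required subelements. It uses the standard library's xml.etree.ElementTree; the helper name build_minimal_channel is hypothetical and chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def build_minimal_channel(title, link, description):
    """Return an <rss version="2.0"> element with one channel that
    carries only the three required subelements."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    # Plain text only, as recommended above for title and description.
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    return rss


feed = build_minimal_channel(
    "RSS and Atom",
    "http://www.benhammersley.com",
    "This is a nice RSS 2.0 feed of an even nicer weblog",
)
xml_text = ET.tostring(feed, encoding="unicode")
print(xml_text)
```

    A feed built this way has no item elements yet, so it only illustrates the channel-level skeleton.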

    Optional Channel Subelements

    There are 16 optional channel subelements of RSS 2.0. Technically speaking, you can leave these out altogether. However, I encourage you to add as many as you can. Much of this stuff is static; the content of the element never changes. Placing it into your RSS template or adding another line to a script is little work for the additional value of your feed's metadata. This is especially true for the first three subelements listed here:


    Including HTML Within title or description

    Since the early days of RSS 0.91, there's been an ongoing debate about whether the item/title or item/description elements may, or should, contain HTML. In my opinion, they should not, for both practical and philosophical reasons. Practically speaking, including HTML markup requires the client software to be able to parse or filter it. While this is fine with many desktop agents, it restricts developers looking for other uses of the data. This brings us to the philosophical aspect. RSS's second use, after providing headlines and content to desktop readers and sites, is to provide indexable metadata. By combining presentation and content (i.e., by including HTML markup within the description element), you could disable this feature.

    However, my opinion lost out on this one. RSS 2.0 now allows for entity-encoded HTML within the item/description tag. It doesn't mention anything, in either direction, regarding item/title, and people are basically making it up as they go along. With that in mind, I still state that item/title at least should be considered plain text.

    If you want to put HTML within the item/description element, you can do it in two ways:

    Entity encoding

    With entity encoding, the angle brackets of HTML tags are converted to their respective HTML entities, &lt; and &gt;. If you need to show angle brackets as literal characters, the ampersand character itself should be encoded as well:

      This is a &lt;em&gt;lovely left angle bracket:&lt;/em&gt; &amp;lt;

    Within a CDATA block

    The alternative is to enclose the HTML within a CDATA block. This removes one level of entity encoding, as in:

      <![CDATA[This is a <em>lovely left angle bracket:</em> &lt;]]>

    Either approach is acceptable according to the specification, and there is no way for a program to tell the difference between the two, or to tell if the description is actually just plain text that resembles encoded HTML. This is a major problem with the RSS 2.0 specification, as you'll see when we talk about parsing feeds. Atom and RSS 1.0 both have their own ways around this issue.
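    To see why a program cannot tell the two forms apart, here is a small Python sketch that parses both sidebar examples with the standard library's xml.etree.ElementTree. Wrapping each example in a description element is an assumption made so that the fragments parse on their own.

```python
import xml.etree.ElementTree as ET

# The entity-encoded form from the sidebar, wrapped in a description element.
entity_encoded = (
    "<description>This is a &lt;em&gt;lovely left angle bracket:&lt;/em&gt; "
    "&amp;lt;</description>"
)

# The CDATA form from the sidebar, wrapped the same way.
cdata_wrapped = (
    "<description><![CDATA[This is a <em>lovely left angle bracket:</em> "
    "&lt;]]></description>"
)

# After XML parsing, both yield the same character data; nothing records
# which encoding the feed author used, or whether HTML was intended at all.
text_from_entities = ET.fromstring(entity_encoded).text
text_from_cdata = ET.fromstring(cdata_wrapped).text
same = text_from_entities == text_from_cdata
print(same)
```

    The parser hands the application one string either way, so any decision to treat it as HTML has to be a guess made after parsing.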


    language

    The language the feed is written in. This allows aggregators to index feeds by language and should contain the standard Internet language codes as per RFC 1766.

      <language>en-US</language>

    copyright

    A copyright notice for the content in the feed:

      <copyright>Copyright 2004 Ben Hammersley</copyright>

    managingEditor

    The email address of the person to contact for editorial enquiries. It should be in the format: name@example.com (FirstName LastName).

      <managingEditor>ben@benhammersley.com (Ben Hammersley)</managingEditor>

    webMaster

    The email address of the person responsible for technical issues with the feed:

        
      <webMaster>techsupport@benhammersley.com (Geek McNerdy)</webMaster>

    pubDate

    The publication date of the content within the feed. For example, a daily morning newspaper publishes at a certain time early every morning. Technically, any information in the feed should not be displayed until after the publication date, so you can set pubDate to a time in the future and expect that the feed won't be displayed until after that time. Few existing RSS readers take any notice of this element in this way, however. Nevertheless, it should be in the format outlined in RFC 822:

      <pubDate>Sun, 12 Sep 2004 19:00:40 GMT</pubDate>

    lastBuildDate

    The date and time, RFC 822-style, when the feed last changed. Note the difference between this and channel/pubDate. lastBuildDate must be in the past. It is this element that feed applications should take as the "last time updated" value and not channel/pubDate.

      <lastBuildDate>Sun, 12 Sep 2004 19:01:55 GMT</lastBuildDate>
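    The RFC 822 date format used by pubDate and lastBuildDate can be produced and read with Python's standard email.utils module. The sketch below is one way to do it; the function names to_rss_date and is_published are hypothetical, and is_published shows the "do not display before the publication date" rule that, as noted above, few readers actually apply.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_rss_date(moment):
    """Format a timezone-aware datetime as an RFC 822-style date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def is_published(pub_date_text, now):
    """Return True once the channel/pubDate value is no longer in the future."""
    return parsedate_to_datetime(pub_date_text) <= now


published = datetime(2004, 9, 12, 19, 0, 40, tzinfo=timezone.utc)
print(to_rss_date(published))
```

    Both functions expect timezone-aware datetime values; mixing in naive ones would raise an error rather than silently compare the wrong instants.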

    category

    Identical in syntax to the item/category element you'll see later. This takes one optional attribute, domain. The value of category should be a forward-slash-separated string that identifies a hierarchical location in a taxonomy represented by the domain attribute. Sadly, there is no consensus either within the specification or in the real world as to any standard format for the domain attribute. It would seem most sensible to restrict it to a URL; however, it needn't necessarily be so.

      <category domain="Syndic8">1765</category>

    generator

    This should contain a string indicating which program created the RSS file:

      <generator>Movable Type v3.1b3</generator>

    docs

    A URL that points to an explanation of the standard for future reference. This should point to http://blogs.law.harvard.edu/tech/rss:

      <docs>http://blogs.law.harvard.edu/tech/rss</docs>

    cloud

    The <cloud/> element enables a rarely used feature known as "Publish and Subscribe," which we shall investigate fully in Chapter 9. It takes no value itself, but it has five mandatory attributes, themselves also explained in Chapter 9: domain, path, port, registerProcedure, and protocol.

      <cloud domain="rpc.sys.com" port="80" path="/RPC2" registerProcedure="pingMe" protocol="soap"/>

    ttl

    ttl, short for Time-to-Live, should contain a number, which is the minimum number of minutes the reader should wait before refreshing the feed from its source. Feed authors should adjust this figure to reflect the time between updates and the number of times they wish their feed to be requested, versus how up to date they need their consumers to be.

      <ttl>60</ttl>
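    On the reader's side, honoring ttl comes down to comparing the time since the last fetch with the stated number of minutes. The following Python is a minimal sketch of that check; the function name may_refresh and the idea that the client records its own last fetch time are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def may_refresh(last_fetch, ttl_minutes, now):
    """Return True once at least ttl_minutes have passed since last_fetch."""
    return now - last_fetch >= timedelta(minutes=ttl_minutes)


last_fetch = datetime(2004, 9, 12, 19, 0, tzinfo=timezone.utc)
# With <ttl>60</ttl>, a request 30 minutes later should wait.
print(may_refresh(last_fetch, 60, last_fetch + timedelta(minutes=30)))
# Exactly 60 minutes later, the feed may be requested again.
print(may_refresh(last_fetch, 60, last_fetch + timedelta(minutes=60)))
```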

    image

    This describes a feed's accompanying image. It's optional, but many aggregators look prettier if you include one. It has three required and two optional subelements of its own:

    url

    The URL of a GIF, JPG, or PNG image that corresponds to the feed. It is, quite obviously, required.

    title

    A description of the image, normally used within the ALT attribute of HTML's <img> tag. It is required.

    link

    The URL to which the image should be linked. This is usually the same as the channel/link.

    width and height

    The width and height of the icon, in pixels. The icons should be a maximum of 144 pixels wide by 400 pixels high. The emergent standard is 88 pixels wide by 31 pixels high. Both elements are optional.

      <image>
        <title>RSS2.0 Example</title>
        <url>http://www.exampleurl.com/example/images/logo.gif</url>
        <link>http://www.exampleurl.com/example/index.html</link>
        <width>88</width>
        <height>31</height>
        <description>The World's Leading Technical Publisher</description>
      </image>
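    A feed generator could check an image element against the rules above before publishing. The Python below is a rough sketch of such a check; check_image is a hypothetical helper, and falling back to 88 by 31 when width or height is absent is this sketch's assumption, based on the size the text calls the emergent standard.

```python
import xml.etree.ElementTree as ET

MAX_WIDTH, MAX_HEIGHT = 144, 400          # limits stated in the text
ASSUMED_WIDTH, ASSUMED_HEIGHT = 88, 31    # assumption: used when omitted


def check_image(image):
    """Return a list of problems found in an <image> element (empty if none)."""
    problems = []
    for tag in ("url", "title", "link"):
        if not (image.findtext(tag) or "").strip():
            problems.append(f"missing required <{tag}>")
    width = int(image.findtext("width") or ASSUMED_WIDTH)
    height = int(image.findtext("height") or ASSUMED_HEIGHT)
    if width > MAX_WIDTH:
        problems.append(f"width {width} exceeds {MAX_WIDTH}")
    if height > MAX_HEIGHT:
        problems.append(f"height {height} exceeds {MAX_HEIGHT}")
    return problems


sample = ET.fromstring(
    "<image><title>RSS2.0 Example</title>"
    "<url>http://www.exampleurl.com/example/images/logo.gif</url>"
    "<link>http://www.exampleurl.com/example/index.html</link>"
    "<width>88</width><height>31</height></image>"
)
print(check_image(sample))
```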

    rating

    The PICS rating for the feed; it helps parents and teachers control what children access on the Internet. More information on PICS can be found at http://www.w3.org/PICS/. This labeling scheme is little used at present, but an example of a PICS rating would be:

      <rating>(PICS-1.1 "http://www.gcf.org/v2.5" labels on "1994.11.05T08:15-0500"
      until "1995.12.31T23:59-0000" for "http://w3.org/PICS/Overview.html" ratings
      (suds 0.5 density 0 color/hue 1))</rating>

    textInput

    An element that lets RSS feeds display a small text box and Submit button, and associates them with a CGI application. Many RSS parsers support this feature, and many sites use it to offer archive searching or email newsletter sign-ups, for example. textInput has four required subelements:

    title

    The label for the Submit button. It can have a maximum of 100 characters.

    description

    Text to explain what the textInput actually does. It can have a maximum of 500 characters.

    name

    The name of the text object that is passed to the CGI script. It can have a maximum of 20 characters.

    link

    The URL of the CGI script.

      <textInput>
        <title>Search</title>
        <description>Search the Archives</description>
        <name>query</name>
        <link>http://www.exampleurl.com/example/search.cgi</link>
      </textInput>

    skipDays and skipHours

    A set of elements that can control when a feed user reads the feed. skipDays can contain up to seven day subelements: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, or Sunday. skipHours contains up to 24 hour subelements, the numbers 0-23, representing the hour in Greenwich Mean Time (GMT), with the hour beginning at midnight being hour 0. The client should not retrieve the feed during any day or hour listed within these two elements. The elements are ORed, not ANDed: in the example here, the application is instructed not to request the feed during the 8 p.m. hour on any day, and never on a Monday:

      <skipDays><day>Monday</day> </skipDays><skipHours><hour>20</hour></skipHours>
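    The ORed behavior can be sketched in a few lines of Python. The function name should_skip is hypothetical, and the sketch assumes the hour value is read as the hour of the day on a 24-hour GMT clock, so that 20 means 8 p.m. as in the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def should_skip(channel, now):
    """Return True if the feed should not be requested at the moment `now`."""
    now = now.astimezone(timezone.utc)  # skipHours is expressed in GMT
    skip_days = {d.text.strip() for d in channel.findall("skipDays/day") if d.text}
    skip_hours = {int(h.text) for h in channel.findall("skipHours/hour") if h.text}
    # ORed, not ANDed: a listed day or a listed hour is enough to skip.
    return DAY_NAMES[now.weekday()] in skip_days or now.hour in skip_hours


channel = ET.fromstring(
    "<channel><skipDays><day>Monday</day></skipDays>"
    "<skipHours><hour>20</hour></skipHours></channel>"
)
monday_morning = datetime(2004, 9, 13, 10, 0, tzinfo=timezone.utc)
print(should_skip(channel, monday_morning))
```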


    This article is excerpted from chapter four of the book Developing Feeds with RSS and Atom, written by Ben Hammersley (O'Reilly; ISBN: 0596008813).

    © 2003-2008 by Developer Shed. All rights reserved. DS Cluster 4 hosted by Hostway