- From: Danny Ayers <danny.ayers@gmail.com>
- Date: Fri, 3 Feb 2006 11:02:08 +0100
- To: Semantic Web <semantic-web@w3.org>, rss-dev@yahoogroups.com, atom-owl@googlegroups.com
The ugliness of escaped content in XML has been known for a while around RSS. Atom (RFC 4287 [1]) allows you to use full (namespace-qualified) XHTML content. Norm Walsh has declared [2] he's going to kill off his RSS feeds, leaving only Atom, which doesn't seem an unreasonable course of action under the circumstances. However, unlike RSS 1.0, Atom isn't RDF/XML, so there's a hunk of baby going with this bathwater.

As Dan Connolly suggests [3], there should be a way of fixing RSS/RDF content through the use of rdf:parseType="Literal". I believe this is the approach taken with RSS 1.1, but that specification has (so far) failed to get significant adoption, so a tweak of RSS 1.0 would seem preferable.

I can't see a perfect solution, but here are some options.

One way would be to add this attribute to the content:encoded element (as used in RSS) and keep the content as XHTML. However, the very definition of the property is that the markup is escaped, so this seems a little perverse. It may make pragmatic sense: the addition of the attribute is unlikely to cause too many problems with existing aggregators/newsreaders, as they're liberal and usually ignore unknown elements/attributes.

Another possibility would be to define a new property, something like content:literal, to use in place of (or in addition to) content:encoded, and again use rdf:parseType="Literal".

One final approach that comes to mind is to take advantage of the (in-progress) mapping of Atom to RDF/OWL [4]. Unfortunately the content element is one place at which Atom syntax can't trivially be read as RDF/XML. For example, this is what it looks like in one of Norm's entries:

    <content type="xhtml" xml:base="http://norman.walsh.name/2006/02/01/rssrip">
      <div xmlns="http://www.w3.org/1999/xhtml">
        ...
      </div>
    </content>

Here's how it looks with the current Atom/OWL mapping, as an entry in a feed (cut down for clarity):

    <?xml version="1.0" encoding="utf-8"?>
    <rdf:RDF
      xmlns="http://purl.org/rss/1.0/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:awol="http://www.w3.org/2005/10/23/Atom#"
    >
      <item rdf:about="http://norman.walsh.name/2006/02/01/docbook50b3">
        <title>DocBook V5.0b3</title>
        <awol:content xml:base="http://norman.walsh.name/2006/02/01/docbook50b3"
                      rdf:parseType="Resource">
          <awol:type>xhtml</awol:type>
          <rdf:value rdf:parseType="Literal">
            <div xmlns="http://www.w3.org/1999/xhtml">
              <p id="p1">This release includes the changes <a href="http://docbook.org/minutes/2006-01-18.txt" shape="rect">agreed</a> at the 18 Jan 2006 meeting.</p>
            </div>
          </rdf:value>
        </awol:content>
      </item>
    </rdf:RDF>

I've used the awol prefix to avoid confusion with the Atom namespace proper, which is the same string with the trailing #. (There's an unfinished atom2rdfxml.xsl with other bits and pieces at [5].)

Cheers,
Danny.

[1] http://www.ietf.org/rfc/rfc4287
[2] http://norman.walsh.name/2006/02/01/rssrip
[3] http://dig.csail.mit.edu/breadcrumbs/node/78
[4] http://atomowl.org
[5] http://pragmatron.org/trac/browser/pragmatron/atom-owl/

--
http://dannyayers.com
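
For illustration only, the second option above (a new property alongside content:encoded) might look something like the following in an RSS 1.0 item. This is a rough sketch, not a proposal with any standing: content:literal is a hypothetical name taken from the "something like content:literal" suggestion, it is not defined by the RSS 1.0 content module, and placing it in that module's namespace is an assumption. The feed is cut down in the same way as the Atom/OWL example, so the channel and other required elements are omitted.

```xml
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF
  xmlns="http://purl.org/rss/1.0/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <item rdf:about="http://norman.walsh.name/2006/02/01/docbook50b3">
    <title>DocBook V5.0b3</title>
    <!-- content:literal is hypothetical; the content module does not define it -->
    <content:literal rdf:parseType="Literal">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <p id="p1">This release includes the changes <a href="http://docbook.org/minutes/2006-01-18.txt" shape="rect">agreed</a> at the 18 Jan 2006 meeting.</p>
      </div>
    </content:literal>
  </item>
</rdf:RDF>
```

Because the XHTML sits directly under a property element with rdf:parseType="Literal", an RDF/XML parser would read it as an XML literal value of the item, with no escaping and no intermediate node of the kind the Atom/OWL mapping introduces.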
Received on Friday, 3 February 2006 10:02:13 UTC