
Re: looking for the use case for HTML->Atom conversion

From: Tantek Celik <tantek@cs.stanford.edu>
Date: Thu, 15 Apr 2010 18:42:44 +0000
Message-ID: <778843603-1271357179-cardhu_decombobulator_blackberry.rim.net-2108226919-@bda088.bisx.prod.on.blackberry>
To: "Edward O'Connor" <hober0@gmail.com>,"HTML WG" <public-html@w3.org>
Ted wrote:

> The hAtom spec[1] doesn't actually define what to use for <atom:id>.[2]
> It *does* define something called the Entry Permalink like so:

Ted,

Thanks for the follow-up and the suggestions for how to improve hAtom to Atom conversion.

Please be sure to add any of the issues you mentioned (if they're not already on the wiki), and certainly add your suggested improvements as possible resolutions to the hatom-issues page you referenced so that they're not lost in email.

Thanks much!

Tantek

------Original Message------
From: Edward O'Connor
Sender: public-html-request@w3.org
To: HTML WG
Subject: Re: looking for the use case for HTML->Atom conversion
Sent: Apr 15, 2010 11:00

Maciej wrote:
> Would using hAtom be a viable option for you, as the second tool apparently
> does already?

hAtom is great--I'm a big fan, and I use it already. In fact, the
widespread use of hAtom in blog templates is one of the sources of
inspiration for <article> & <time pubdate> in the first place. That
said, the converting-hAtom-to-Atom story is actually worse than the
converting-HTML-to-Atom story.

The hAtom spec[1] doesn't actually define what to use for <atom:id>.[2]
It *does* define something called the Entry Permalink like so:

* an Entry Permalink element is identified by rel-bookmark
* an Entry should have an Entry Permalink
* an Entry Permalink element represents the concept of an Atom link in
  an entry
* if the Entry Permalink is missing, use the URI of the page; if the
  Entry has an "id" attribute, add that as a fragment to the page URI to
  distinguish individual entries
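
The four rules above can be sketched roughly as follows. This is only an
illustration, not code from the hAtom spec or from any existing tool: the
function name entry_permalink and its parameters are hypothetical, and
resolving a relative rel-bookmark href against the page URL is an
assumption the rules do not spell out.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin


def entry_permalink(page_url: str,
                    bookmark_href: Optional[str] = None,
                    entry_id: Optional[str] = None) -> str:
    """Hypothetical helper: pick the Entry Permalink for one hAtom entry.

    page_url      -- URI of the page that contains the entry
    bookmark_href -- href of the entry's rel-bookmark link, if it has one
    entry_id      -- value of the entry element's id attribute, if any
    """
    if bookmark_href:
        # An Entry Permalink is identified by rel-bookmark.
        # Assumption: relative hrefs are resolved against the page URL.
        return urljoin(page_url, bookmark_href)
    if entry_id:
        # Fallback: page URI plus the entry's id as a fragment.
        return urldefrag(page_url).url + "#" + entry_id
    # Last resort: the bare page URI, identical for every such entry.
    return page_url
```

Calling it for two entries that have neither a rel-bookmark link nor an
id attribute returns the same page URL both times.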

So the Entry Permalink is the equivalent of <atom:link>, not <atom:id>.
Also, note the use of RFC2119 SHOULD and the fallback, for every entry,
to the document URL.

The non-normative hAtom parsing document[3] says to use the Entry
Permalink for <atom:id> as well as for an <atom:link>, and this is
almost what hAtom2Atom implements. If you ran a (valid hAtom) page with
several entries, all of which fail to provide a permalink (and lack
id="") through hAtom2Atom, the resultant <atom:entry>s would all have
the page's URL for their <atom:id>s, as hAtom specifies. That would be
bad enough, but hAtom2Atom doesn't implement the fallback to the
document URL--it generates empty <atom:id/>s instead, and so produces
invalid Atom. Here's a test case:

http://edward.oconnor.cx/tests/html5/ISSUE-86/hAtom-no-id.html

Run through hAtom2Atom:

http://lukearno.com/projects/hatom2atom/?url=http://edward.oconnor.cx/tests/html5/ISSUE-86/hAtom-no-id.html&ctype=application/atom%2Bxml&tidy=yes

So converting hAtom to Atom with hAtom2Atom suffers from worse Atom
conformance issues than the HTML5 spec's HTML to Atom algorithm. Empty
<atom:id/>s are worse than unstable <atom:id>s in my book. Software that
implemented the Entry Permalink fallback correctly would suffer from a
worse <atom:id> story too, because in the above scenario all of the
distinct <atom:entry>s in the feed would share the same <atom:id>.
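
Both failure modes described above (empty ids and ids shared between
distinct entries) are easy to detect once the <atom:id> values have been
pulled out of a converted feed. A minimal sketch, assuming the ids are
already available as a list of strings; the function name
atom_id_problems is hypothetical and is not part of hAtom2Atom or any
other tool mentioned here.

```python
from collections import Counter
from typing import Dict, List, Optional


def atom_id_problems(entry_ids: List[Optional[str]]) -> Dict[str, list]:
    """Hypothetical check over the <atom:id> values of one feed's entries.

    Returns the positions of entries whose id is missing or blank, and
    the id values that are used by more than one entry.
    """
    # Positions of entries with an empty (or absent) id.
    empty = [i for i, v in enumerate(entry_ids) if not (v or "").strip()]
    # Count how often each non-empty id occurs.
    counts = Counter(v.strip() for v in entry_ids if (v or "").strip())
    duplicated = sorted(v for v, n in counts.items() if n > 1)
    return {"empty": empty, "duplicated": duplicated}
```

For a feed whose entries all came out with empty ids, every position is
listed under "empty"; for a feed where the page-URL fallback was applied
to every entry, the page URL shows up under "duplicated".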


Ted

1. http://microformats.org/wiki/hatom
2. There are several open hAtom issues related to feed and entry IDs:
   http://microformats.org/wiki/hatom-issues#Entry_id_.28atom:id.29
   http://microformats.org/wiki/hatom-issues#Feed_id_.28atom:id.29
   http://microformats.org/wiki/hatom-issues#Relationship_of_rel-bookmark_to_url.2Buid
   http://microformats.org/wiki/hatom-issues#add_url_property_to_hentry
3. http://microformats.org/wiki/hatom-parsing
Received on Thursday, 15 April 2010 18:46:57 GMT
