
Re: looking for the use case for HTML->Atom conversion

From: Edward O'Connor <hober0@gmail.com>
Date: Thu, 15 Apr 2010 11:00:23 -0700
Message-ID: <s2l3b31caf91004151100w838090bbg331832812ce72ea2@mail.gmail.com>
To: HTML WG <public-html@w3.org>

Maciej wrote:
> Would using hAtom be a viable option for you, as the second tool apparently
> does already?

hAtom is great--I'm a big fan, and I use it already. In fact, the
widespread use of hAtom in blog templates is one of the sources of
inspiration for <article> & <time pubdate> in the first place. That
said, the converting-hAtom-to-Atom story is actually worse than the
converting-HTML-to-Atom story.

The hAtom spec[1] doesn't actually define what to use for <atom:id>.[2]
It *does* define something called the Entry Permalink like so:

* an Entry Permalink element is identified by rel-bookmark
* an Entry should have an Entry Permalink
* an Entry Permalink element represents the concept of an Atom link in
  an entry
* if the Entry Permalink is missing, use the URI of the page; if the
  Entry has an "id" attribute, add that as a fragment to the page URI to
  distinguish individual entries

So the Entry Permalink is the equivalent of <atom:link>, not <atom:id>.
Also, note the use of RFC2119 SHOULD and the fallback, for every entry,
to the document URL.
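
To make the fallback concrete, here is a rough Python sketch of the
Entry Permalink rules as I read them from the list above. The function
name, its parameters, and the example.org URLs are all made up for
illustration; this is not code from the hAtom spec or from any existing
parser.

```python
from typing import Optional
from urllib.parse import urldefrag, urljoin


def entry_permalink(page_uri: str,
                    bookmark_href: Optional[str] = None,
                    entry_id: Optional[str] = None) -> str:
    """Hypothetical helper: pick an Entry Permalink for one hAtom entry."""
    if bookmark_href:
        # A rel-bookmark link is present: resolve it against the page URI.
        return urljoin(page_uri, bookmark_href)
    # Assumption: drop any fragment already on the page URI before
    # appending the entry's own id.
    base = urldefrag(page_uri).url
    if entry_id:
        # No permalink, but the entry has an id="" attribute.
        return f"{base}#{entry_id}"
    # No permalink and no id="": fall back to the bare page URI.
    return base


page = "http://example.org/blog/"
# Three entries with neither rel-bookmark nor id="" all get the same URI.
fallbacks = [entry_permalink(page) for _ in range(3)]
print(len(set(fallbacks)))
```

Every entry that hits the last branch ends up with the same value, which
is the collision discussed further down.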

The non-normative hAtom parsing document[3] says to use the Entry
Permalink for <atom:id> as well as for an <atom:link>, and this is
almost what hAtom2Atom implements. Suppose you ran a (valid hAtom) page
with several entries, none of which provides a permalink or an id="",
through hAtom2Atom. Per hAtom, the resulting <atom:entry>s should all
get the page's URL as their <atom:id>. That would be bad enough, but
hAtom2Atom doesn't implement the fallback to the document URL--it
generates empty <atom:id/>s instead, and so produces invalid Atom.
Here's a test case:


Run through hAtom2Atom:


So converting hAtom to Atom with hAtom2Atom suffers from worse Atom
conformance issues than the HTML5 spec's HTML to Atom algorithm. Empty
<atom:id/>s are worse than unstable <atom:id>s in my book. Software that
implemented the Entry Permalink fallback correctly would suffer from a
worse <atom:id> story too, because in the above scenario all of the
distinct <atom:entry>s in the feed would share the same <atom:id>.
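
Both failure modes are easy to check for mechanically. Below is a small
Python sketch that counts empty and repeated <atom:id>s in a feed. The
helper name and the sample feed are my own invention; the feed is
hand-written to show both problems side by side and is not the actual
output of hAtom2Atom.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ATOM_NS = "{http://www.w3.org/2005/Atom}"


def audit_entry_ids(feed_xml: str):
    """Hypothetical helper: return (empty_id_count, duplicated_ids)."""
    root = ET.fromstring(feed_xml)
    # findtext() gives None for a missing <id> and "" for an empty one;
    # treat both as empty.
    ids = [(entry.findtext(f"{ATOM_NS}id") or "").strip()
           for entry in root.iter(f"{ATOM_NS}entry")]
    empty = sum(1 for value in ids if not value)
    counts = Counter(value for value in ids if value)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return empty, duplicates


# Hand-written illustration only (not real converter output).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><id/></entry>
  <entry><id/></entry>
  <entry><id>http://example.org/blog/</id></entry>
  <entry><id>http://example.org/blog/</id></entry>
</feed>"""

empty_count, duplicated = audit_entry_ids(SAMPLE_FEED)
print(empty_count, duplicated)
```

The first two entries model the empty-id case; the last two model a
converter that applies the page-URL fallback to every entry.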


1. http://microformats.org/wiki/hatom
2. There are several open hAtom issues related to feed and entry IDs:
3. http://microformats.org/wiki/hatom-parsing
Received on Thursday, 15 April 2010 18:01:16 UTC
