Re: Producing Atom from Sam Ruby on 2009-10-05 (public-html@w3.org from October 2009)

From: Sam Ruby <rubys@intertwingly.net>
Date: Sun, 04 Oct 2009 20:22:50 -0400
To: Maciej Stachowiak <mjs@apple.com>
CC: Ian Hickson <ian@hixie.ch>, public-html@w3.org
Message-ID: <4AC93C5A.1040607@intertwingly.net>
Maciej Stachowiak wrote:
> 
> On Oct 4, 2009, at 4:11 PM, Sam Ruby wrote:
> 
>> Ian Hickson wrote:
>>> On Sun, 4 Oct 2009, Sam Ruby wrote:
>>>>>> Furthermore, why couldn't additional specifications define 
>>>>>> additional information to be placed into the feed?
>>>>> They could, but that wouldn't change whether HTML5's algorithm 
>>>>> alone produced conforming Atom.
>>>> Why does an HTML5 to Atom algorithm need to be in the HTML5 spec?
>>>>
>>>> If it were in a separate spec, it could normatively reference the 
>>>> vCard algorithm.
>>> The point presumably is not just a political "HTML shouldn't 
>>> reference vCard"; my understanding is that there are technical 
>>> reasons for which we would want the basic HTML algorithms to not 
>>> reference specific microdata vocabularies. Otherwise, there wouldn't 
>>> be any problem with just having the algorithm in HTML5 and 
>>> referencing the vocabulary, presumably. Putting the algorithm in a 
>>> separate spec presumably wouldn't change the technical requirements here.
>>> The HTML-to-Atom conversion IMHO should be in HTML5 for the same 
>>> reason the HTML-to-RDF and HTML-to-JSON conversions are in HTML5: 
>>> they are a key part of the language and address specific use cases 
>>> that have been brought forward. Being able to interpret HTML as a 
>>> feed is a fundamental aspect of the language, it's not an arbitrary 
>>> secondary feature. (For example, it is the reason behind the 
>>> existence of <time pubdate> and its requirements.)
>>
>> The original statement was "the HTML-to-Atom conversion algorithm can 
>> no longer output valid Atom".  Stated that way, it is a bug.
>>
>> "whether HTML5's algorithm alone produced conforming Atom." is a 
>> related, but separate issue.
>>
>> As currently documented, "a user agent must run the following 
>> algorithm to extract an Atom feed".  Documenting the minimum 
>> information that must be included in an Atom feed is goodness.  But 
>> specifying it as "THE" algorithm without any provisions for other 
>> information to be included is a problem.  You said "information added 
>> by particular vendors".  That's not what I intended to discuss.  I 
>> described another possibility: other relevant standards.  When you 
>> said that "It's also not really relevant", it became clear that you 
>> are on a different point than I am.
>>
>> Just so it is clear: if HTML provides a conversion algorithm to Atom, 
>> and there is no possibility for that conversion algorithm to produce 
>> valid Atom, then that's the bug I want to discuss.
> 
> Here's the algorithm: <http://dev.w3.org/html5/spec/Overview.html#atom>. 
> It looks to me like it will never add an <atom:author> element, so its 
> output is always invalid Atom.

Section 2.2.2 may provide an out: "extension specification can be written 
that overrides the requirements in this specification".
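
For reference, the Atom constraint at issue is in RFC 4287: an atom:feed 
must contain at least one atom:author unless every child atom:entry 
carries one (directly, or via an atom:source that has one). A minimal 
sketch of that check follows. The function name and the sample feeds are 
hypothetical, it is not taken from the HTML5 draft or from any 
implementation, and it tests only the author rule, not full Atom validity.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace (RFC 4287)


def has_required_authors(feed_xml: str) -> bool:
    """Check only the RFC 4287 atom:author rule for a feed document."""
    feed = ET.fromstring(feed_xml)
    if feed.tag != ATOM + "feed":
        raise ValueError("not an atom:feed document")

    # A feed-level author satisfies the rule for the feed and its entries.
    if feed.find(ATOM + "author") is not None:
        return True

    def entry_has_author(entry):
        # Either a direct atom:author, or one inside atom:source.
        return (entry.find(ATOM + "author") is not None
                or entry.find(ATOM + "source/" + ATOM + "author") is not None)

    entries = feed.findall(ATOM + "entry")
    # Assumption: a feed with no author and no entries is treated as
    # failing the rule. This is a conservative choice for the sketch.
    return bool(entries) and all(entry_has_author(e) for e in entries)


# Hypothetical feed resembling output that never includes atom:author.
FEED_WITHOUT_AUTHOR = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <id>urn:example:feed</id>
  <updated>2009-10-04T20:22:50-04:00</updated>
  <entry>
    <title>Post</title>
    <id>urn:example:post</id>
    <updated>2009-10-04T20:22:50-04:00</updated>
  </entry>
</feed>
"""

# The same feed with a feed-level author added, e.g. by an extension.
FEED_WITH_AUTHOR = FEED_WITHOUT_AUTHOR.replace(
    "<title>Example</title>",
    "<title>Example</title>\n  <author><name>Example Author</name></author>",
    1,
)

if __name__ == "__main__":
    print(has_required_authors(FEED_WITHOUT_AUTHOR))
    print(has_required_authors(FEED_WITH_AUTHOR))
```

Under these assumptions, a feed produced with no author anywhere fails 
the check, while one augmented with a feed-level author passes it.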

>> If it is possible for the conversion algorithm to produce 
>> non-conforming Atom, then I believe that an informative statement to 
>> that effect is in order, and ideally in that informative statement 
>> some guidance should be provided.
> 
> As written, it's not only possible but necessary.

If you draw that conclusion, and the intent was that implementations may 
augment this in any way, then this should be clarified.  Something as 
simple as an informative statement would address the issue.

>> I have no problem with that statement discouraging vendor-specific 
>> proprietary extensions, and encouraging vendor neutral extensions to 
>> this specification where appropriate.
> 
> Here are some possible options:
> 
> 1) Leave the HTML-to-Atom algorithm in HTML5, generating Atom that is 
> always nonconforming.
> 2) Remove the HTML-to-Atom algorithm from the HTML5 spec (perhaps it can 
> be in a separate specification).
> 3) Define the conversion algorithm in HTML5, but have it require the 
> inclusion of <atom:author> in some way that HTML5 itself does not 
> specify. Other specifications may fill in the gap, but HTML5 won't 
> reference them.
> 4) Leave the HTML-to-Atom algorithm in HTML5, generating Atom that is 
> always nonconforming, but allowing arbitrary additional information to 
> be added to the Atom output as part of the conversion. Other 
> specifications may fill in the gap, but HTML5 won't reference them.
> 5) HTML5 references the vCard vocabulary and specifies how to include 
> <atom:author> info, thus rendering its default Atom output conforming.
> 6) Option 3, but HTML5 does reference the spec that describes how to 
> include author info.
> 7) Option 4, but HTML5 does reference the spec that describes how to 
> include author info.
> 
> I think #1 is unacceptable, because I believe generating nonconforming 
> Atom does not satisfy the use cases for HTML-to-Atom conversion. I 
> will leave it to others to determine which of the other options, if any, 
> might be acceptable. In my opinion, all of the other options are 
> effectively equivalent to either #2 or #5.
> 
>> Would it be helpful if I opened a bug in bugzilla?
> 
> It would be useful to be clear about what the bug is, and possible ways 
> to resolve it.

My point was that #1 is unacceptable.

Other options include moving the Atom production or even all of 
Microdata out of the HTML5 spec.  (Perhaps those specs could reference 
vCard, perhaps not; but either way, those are separate questions.)

As to which of the remaining options are preferable, I have a preference 
for options that leave open the possibility that the production of Atom 
feeds could include information found in the page annotated by either 
the hAtom microformat or RDFa.

- Sam Ruby

> Regards,
> Maciej
Received on Monday, 5 October 2009 00:23:29 UTC