Re: Producing Atom

On Oct 4, 2009, at 4:11 PM, Sam Ruby wrote:

> Ian Hickson wrote:
>> On Sun, 4 Oct 2009, Sam Ruby wrote:
>>>>> Furthermore, why couldn't additional specifications define  
>>>>> additional information to be placed into the feed?
>>>> They could, but that wouldn't change whether HTML5's algorithm  
>>>> alone produced conforming Atom.
>>> Why does an HTML5 to Atom algorithm need to be in the HTML5 spec?
>>>
>>> If it were in a separate spec, it could normatively reference the  
>>> vCard algorithm.
>> The point presumably is not just a political "HTML shouldn't  
>> reference vCard"; my understanding is that there are technical  
>> reasons for which we would want the basic HTML algorithms to not  
>> reference specific microdata vocabularies. Otherwise, there  
>> wouldn't be any problem with just having the algorithm in HTML5 and  
>> referencing the vocabulary, presumably. Putting the algorithm in a  
>> separate spec presumably wouldn't change the technical requirements  
>> here.
>> The HTML-to-Atom conversion IMHO should be in HTML5 for the same  
>> reason the HTML-to-RDF and HTML-to-JSON conversions are in HTML5:  
>> they are a key part of the language and address specific use cases  
>> that have been brought forward. Being able to interpret HTML as a  
>> feed is a fundamental aspect of the language, it's not an arbitrary  
>> secondary feature. (For example, it is the reason behind the  
>> existence of <time pubdate> and its requirements.)
>
> The original statement was "the HTML-to-Atom conversion algorithm  
> can no longer output valid Atom".  Stated that way, it is a bug.
>
> "whether HTML5's algorithm alone produced conforming Atom." is a  
> related, but separate issue.
>
> As currently documented, "a user agent must run the following  
> algorithm to extract an Atom feed".  Documenting the minimum  
> information that must be included in an Atom feed is goodness.  But  
> specifying it as "THE" algorithm without any provisions for other  
> information to be included is a problem.  You said "information  
> added by particular vendors".  That's not what I intended to  
> discuss.  I described another possibility: other relevant  
> standards.  When you said that "It's also not really relevant", it  
> became clear that you are on a different point than I am.
>
> Just so it is clear: if HTML provides a conversion algorithm to  
> Atom, and there is no possibility for that conversion algorithm to  
> produce valid Atom, then that's the bug I want to discuss.

Here's the algorithm: <http://dev.w3.org/html5/spec/Overview.html#atom>.
It looks to me like it will never add an <atom:author> element, so its
output is always invalid Atom.
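
To make the conformance problem concrete, here is a minimal sketch of the
author check as I read RFC 4287 section 4.1.1: a feed needs an
<atom:author> of its own unless every entry carries one. The function
name, the sample feeds, and the decision to treat an entry-less,
author-less feed as failing are my own assumptions for illustration;
none of this comes from the HTML5 spec, and it is not a full Atom
validator.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom(name: str) -> str:
    """Return the namespace-qualified tag name that ElementTree expects."""
    return f"{{{ATOM_NS}}}{name}"


def satisfies_author_rule(feed_xml: str) -> bool:
    """Hypothetical helper: check only the atom:author requirement on a feed."""
    feed = ET.fromstring(feed_xml)
    if feed.tag != atom("feed"):
        raise ValueError("expected an atom:feed document")

    # A feed-level author satisfies the requirement for the whole feed.
    if feed.find(atom("author")) is not None:
        return True

    # Otherwise every entry must carry its own author. Assumption: a feed
    # with no entries and no author is treated as failing (conservative).
    entries = feed.findall(atom("entry"))
    return bool(entries) and all(
        entry.find(atom("author")) is not None for entry in entries
    )


# Illustrative feed with no author anywhere, which is the shape of output
# the message above describes.
NO_AUTHOR = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <id>urn:example:feed</id>
  <updated>2009-10-04T23:53:55Z</updated>
  <entry>
    <title>Post</title>
    <id>urn:example:post</id>
    <updated>2009-10-04T23:53:55Z</updated>
  </entry>
</feed>"""

# The same feed with a feed-level author added.
WITH_AUTHOR = NO_AUTHOR.replace(
    "<title>Example</title>",
    "<title>Example</title><author><name>Example Author</name></author>",
)
```

Under these assumptions, satisfies_author_rule(NO_AUTHOR) is false and
satisfies_author_rule(WITH_AUTHOR) is true, which is the gap any of the
options below would have to close.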

> If it is possible for the conversion algorithm to produce non- 
> conforming Atom, then I believe that an informative statement to  
> that effect is in order, and ideally in that informative statement  
> some guidance should be provided.

As written, it's not only possible but necessary.

> I have no problem with that statement discouraging vendor-specific  
> proprietary extensions, and encouraging vendor neutral extensions to  
> this specification where appropriate.

Here are some possible options:

1) Leave the HTML-to-Atom algorithm in HTML5, generating Atom that is  
always nonconforming.
2) Remove the HTML-to-Atom algorithm from the HTML5 spec (perhaps it  
can be in a separate specification).
3) Define the conversion algorithm in HTML5, but have it require the  
inclusion of <atom:author> in some way that HTML5 itself does not  
specify. Other specifications may fill in the gap, but HTML5 won't  
reference them.
4) Leave the HTML-to-Atom algorithm in HTML5, generating Atom that is  
always nonconforming, but allowing arbitrary additional information to  
be added to the Atom output as part of the conversion. Other  
specifications may fill in the gap, but HTML5 won't reference them.
5) HTML5 references the vCard vocabulary and specifies how to include  
<atom:author> info, thus rendering its default Atom output conforming.
6) Option 3, but HTML5 does reference the spec that describes how to  
include author info.
7) Option 4, but HTML5 does reference the spec that describes how to  
include author info.

I think #1 is unacceptable, because I believe generating nonconforming  
Atom does not satisfy the use cases for HTML-to-Atom conversion. I  
will leave it to others to determine which of the other options, if  
any, might be acceptable. In my opinion, all of the other options are  
effectively equivalent to either #2 or #5.

> Would it be helpful if I opened a bug in bugzilla?

It would be useful to be clear about what the bug is, and possible  
ways to resolve it.

Regards,
Maciej

Received on Sunday, 4 October 2009 23:53:55 UTC