- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Thu, 08 Apr 2010 14:09:01 +0200
- To: Maciej Stachowiak <mjs@apple.com>
- CC: "public-html@w3.org WG" <public-html@w3.org>
On 07.04.2010 02:46, Maciej Stachowiak wrote:
>
> Thank you for your submission!
>
> Recorded here: http://dev.w3.org/html5/status/issue-status.html#ISSUE-086
>
> Regards,
> Maciej
> ...

Below is a slightly updated version, based on feedback from the atom-syntax mailing list.

-- snip --

Revision 2; taking feedback from the Atom-Syntax mailing list into account.

SUMMARY

The HTML5 spec contains an algorithm for producing an Atom (RFC4287) feed document from an HTML page. The definition both relaxes a MUST-level requirement from RFC4287 and adds a needless restriction.

Also, it's not clear *at all* whether this is a feature that people really want, and if they do, whether it needs to be part of HTML5. Given the fact that it's non-trivial to generate a valid Atom feed from HTML, but the reverse *is* trivial, we should also consider removing this feature altogether (I'd be happy to write a 2nd change proposal if people want to see that as well). (See [2])

RATIONALE

Instructions to derive a secondary format from HTML documents shouldn't be misleading, and should also make clear which conditions need to be met to produce valid documents.

DETAILS

There are two problems, both with the following step (4.15.1, step 15.9 as of April 6):

"Otherwise Let id be a user-agent-defined undereferenceable yet globally unique valid absolute URL. The same absolute URL should be generated for each run of this algorithm when given the same input. Let has-alternate be false."

Problem #1: RFC 4287 does not require the ID to be undereferenceable. This was a conscious decision of the IETF WG. There's absolutely no point in adding this requirement, except for the spec author's distaste for URIs that are both dereferenceable *and* act as a globally unique and stable identifier. Furthermore, there's no way to ensure that a URL is "undereferenceable", or remains so in the future. As soon as a dereferencing service has been written, it's not "undereferenceable" anymore. (See [1]).

Note from <http://greenbytes.de/tech/webdav/rfc4287.html#rfc.section.4.2.6.p.2>:

"...Though the IRI might use a dereferencable scheme, Atom Processors MUST NOT assume it can be dereferenced."

Problem #2: RFC 4287 makes it a MUST-level requirement to generate the same ID every time the feed is regenerated.

From <http://greenbytes.de/tech/webdav/rfc4287.html#rfc.section.4.2.6.p.3>:

"When an Atom Document is relocated, migrated, syndicated, republished, exported, or imported, the content of its atom:id element MUST NOT change. Put another way, an atom:id element pertains to all instantiations of a particular Atom entry or feed; revisions retain the same content in their atom:id elements. It is suggested that the atom:id element be stored along with the associated resource."

HTML5 relaxes this to a should-level requirement. I do agree that generating valid Atom feeds from HTML *is* hard, but violating a MUST-level requirement from the Atom spec is not acceptable.

Proposed changes:

For issue #1: Leave out "undereferenceable", changing the sentence to:

"Let id be a user-agent-defined yet globally unique valid absolute URL."

For issue #2: Change

"The same absolute URL should be generated for each run of this algorithm when given the same input."

to

"The same absolute URL must be generated for each run of this algorithm when given the same input. If this requirement can not be fulfilled, then generating a valid Atom feed is not possible and this algorithm should be aborted."

IMPACT

1. Positive Effects

Consistency between the applicable specs. Also, authors are correctly informed about what it takes to generate proper Atom feeds.

2. Negative Effects

None.

3. Conformance Classes Changes

Atom feed generators are actually required to generate valid Atom documents (with respect to atom:id).

4. Risks

None.

REFERENCES

[1] <http://www.imc.org/atom-syntax/mail-archive/msg21400.html>
[2] <http://www.imc.org/atom-syntax/mail-archive/msg21396.html>
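
The wording proposed for issue #2 (same input, same ID, otherwise abort) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `stable_feed_id` and `FeedGenerationAborted` are hypothetical, the "input" is assumed to be some stable string the user agent has for the document, and the use of a name-based UUID URN is one possible choice that neither HTML5 nor RFC 4287 prescribes.

```python
import uuid
from typing import Optional


class FeedGenerationAborted(Exception):
    """Hypothetical error: no stable atom:id can be produced."""


def stable_feed_id(stable_key: Optional[str]) -> str:
    """Derive an atom:id that is identical on every run for the same input.

    `stable_key` is an assumed, user-agent-chosen string that does not
    change between runs for the same document.
    """
    if not stable_key:
        # Mirrors the proposed text: if the same ID cannot be guaranteed,
        # a valid Atom feed cannot be generated, so stop here.
        raise FeedGenerationAborted("no stable input to derive atom:id from")
    # uuid5 is name-based (SHA-1), so equal inputs always give equal UUIDs.
    return uuid.uuid5(uuid.NAMESPACE_URL, stable_key).urn


if __name__ == "__main__":
    first = stable_feed_id("http://example.org/page.html")
    second = stable_feed_id("http://example.org/page.html")
    print(first == second)
```

Whether the resulting identifier can be dereferenced plays no role in the sketch, which matches the change proposed for issue #1.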
Received on Thursday, 8 April 2010 12:09:39 UTC