Re: Feedback on "Offline Web Applications" (Editor's Draft 17 November 2007)

Ian Hickson wrote:
> ...
>> 3. Offline Application Caching APIs -- seems the spec defines a new text 
>> format for defining the application caching. [...] Not sure why this 
>> isn't simply an XML format; instead of defining yet another special text 
>> format with (IMHO) quite obscure parsing rules
> 
> The main reason not to use XML is that defining the error handling for how 
> to process XML is orders of magnitude more complicated than desired here, 
> and, more importantly, it is frequently the case that UAs get it wrong. 
> For example, many UAs of XML-based vocabularies check the namespace of the 
> root element and then ignore the namespace of other nodes, so that things 
> like:
> 
>    <manifest xmlns="http://www.w3.org/ns/manifest">
>      <file xmlns="http://bogus.example.com/" href="..."/>
>    </manifest>
> 
> ...are treated the same as:
> 
>    <manifest xmlns="http://www.w3.org/ns/manifest">
>      <file href="..."/>
>    </manifest>
> 
> ...and it is hard work to get UAs to get this right.

These UAs are broken and need to be fixed. That they exist is IMHO not 
sufficient reason not to use XML.
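
For illustration only, here is a minimal Python sketch of the check under discussion, using the standard library's xml.etree.ElementTree. The function name file_hrefs, the check_child_namespace flag and the "a.html" hrefs are hypothetical (the quoted example elides the hrefs); only the two namespace URIs come from the quoted text. It is not taken from any real UA.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the quoted example above.
MANIFEST_NS = "http://www.w3.org/ns/manifest"

# The two documents from the quoted example; "a.html" is a made-up href.
BOGUS_CHILD = (
    '<manifest xmlns="http://www.w3.org/ns/manifest">'
    '<file xmlns="http://bogus.example.com/" href="a.html"/>'
    '</manifest>'
)
PLAIN_CHILD = (
    '<manifest xmlns="http://www.w3.org/ns/manifest">'
    '<file href="a.html"/>'
    '</manifest>'
)


def file_hrefs(xml_text, check_child_namespace=True):
    """Return the href of each <file> child of the manifest root.

    With check_child_namespace=False the function imitates the lenient
    behaviour described above: only the root's namespace is verified.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}manifest" % MANIFEST_NS:
        raise ValueError("root element is not a manifest")
    hrefs = []
    for child in root:
        # ElementTree spells qualified names as "{namespace}localname".
        namespace, _, local_name = child.tag.rpartition("}")
        namespace = namespace.lstrip("{")
        if local_name != "file":
            continue
        if check_child_namespace and namespace != MANIFEST_NS:
            continue  # a namespace-aware consumer skips the bogus child
        hrefs.append(child.get("href"))
    return hrefs
```

With the check enabled, file_hrefs(BOGUS_CHILD) yields an empty list while file_hrefs(PLAIN_CHILD) yields the one href; with the check disabled, both documents give the same result, which is the behaviour being complained about.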

> Also, XML is really overkill for this. After all, we only want a list of 
> URLs and URL pairs; having a syntax that allows arbitrary nesting, 
> arbitrary name/value pairs, namespaces, PIs, multiple ways to escape 
> characters, multiple encodings, etc., is unnecessary.

Famous last words :-). I'll just link to: 
<http://norman.walsh.name/2008/05/13/thetax>.

> Finally, there is the draconian error handling problem. We don't want to 
> require that UAs parse the whole manifest before starting to process the 
> manifest, and having UAs fail half-way when they hit a well-formedness 
> error seems suboptimal.

Not convinced (either that doing so would be a problem, or that requiring 
UAs to load the full manifest first would be one).
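
To make the scenario concrete, the following Python sketch shows what "failing half-way" looks like with an incremental XML parser (xml.etree.ElementTree.XMLPullParser from the standard library). The function name hrefs_until_error, the chunking and the hrefs are assumptions made for this example; it takes no position on whether the behaviour is acceptable.

```python
import xml.etree.ElementTree as ET


def hrefs_until_error(chunks):
    """Feed text chunks to an incremental parser.

    Returns (hrefs, error): the href of every <file> element that was
    completely parsed, plus the well-formedness error that stopped
    parsing, or None if the document was well-formed.
    """
    parser = ET.XMLPullParser(events=("end",))
    hrefs = []

    def drain():
        # read_events() yields completed elements and re-raises any
        # parse error recorded by an earlier feed() call.
        for _event, elem in parser.read_events():
            if elem.tag.rpartition("}")[2] == "file":
                hrefs.append(elem.get("href"))

    try:
        for chunk in chunks:
            parser.feed(chunk)
            drain()
        parser.close()
        drain()
    except ET.ParseError as exc:
        return hrefs, exc
    return hrefs, None
```

For example, feeding '<manifest xmlns="http://www.w3.org/ns/manifest"><file href="a.html"/>' followed by '<file href="b.html"></manifest>' returns the first href together with a ParseError for the mismatched tag: the consumer has already seen a.html by the time the error surfaces. A consumer that loads the whole manifest first would instead reject the document before acting on any entry.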

>> However, *what* is defined over there ("Note: This is a willful double 
>> violation of RFC2046.") makes me nervous.
> 
> As I understand it, RFC2046 requires us not to support LF only, which is 
> incompatible with typical workflows on the Web, and requires us to not use 
> UTF-8 as the default, which is somewhat silly in this day and age.

That may be correct, but just saying "we are aware that we are breaking 
a standard", and then not offering any insight into what's going on 
seems to me to be the wrong thing for a spec.
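
As a rough illustration of the two deviations being discussed (accepting bare LF line endings and defaulting to UTF-8), a line splitter along these lines might look like the Python sketch below. This is an assumption-laden example, not the draft's actual parsing algorithm; the function name split_manifest_lines and the choice to replace undecodable bytes rather than fail are mine.

```python
import re

# Accept CRLF, bare CR, or bare LF as a line terminator.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def split_manifest_lines(data):
    """Decode bytes as UTF-8 and split them into lines.

    Undecodable bytes become U+FFFD instead of raising an error
    (an assumption made for this sketch).
    """
    text = data.decode("utf-8", errors="replace")
    return _LINE_BREAK.split(text)
```

Calling split_manifest_lines(b"a\nb\r\nc\rd") gives the four lines "a", "b", "c" and "d", whichever convention the author's editor used.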

> On Sun, 18 Nov 2007, Julian Reschke wrote:
>> Henri Sivonen wrote:
>>> RFC 2046 was created with email legacy considerations in mind. The 
>>> encoding rules there are not only unhelpful but downright harmful in 
>>> the contemporary HTTP context with UTF-8 decoding readily available.
>>>
>>> The Web needs a text/5 spec.
>> That may be true, but then take that to the relevant standards body, 
>> instead of simply violating a spec on purpose. This seems to follow a 
>> pattern of "we ignore what the specs do, we can do better" with which I 
>> strongly disagree.
> 
> Is there any chance I can ask you to help us here? You're probably in a 
> better position to take it to the relevant standards body than I am. Any 
> help you could provide here would be great.

I can only recommend raising the issue where it belongs, which is 
probably the ietf-types mailing list.

>> If you don't like the defaults for a text/* format, use application/*.
> 
> It would be sad to deprecate the text/* type just because of an outdated 
> spec.

There are people who think it's not outdated.

> ...

BR, Julian

Received on Wednesday, 14 May 2008 13:25:13 UTC