- From: Rob Lanphier <robla@real.com>
- Date: Tue, 28 May 2002 18:14:37 -0700 (Pacific Daylight Time)
- To: Tim Bray <tbray@textuality.com>
- cc: "www-tag@w3.org" <www-tag@w3.org>
Hi Tim,

Thanks for your response. I agree that the issue is still pretty fuzzy,
so here's my attempt to scare the issue out from the underbrush. More
inline below.

On Tue, 28 May 2002, Tim Bray wrote:
> Rob Lanphier wrote:
> > Summary: this is a request that the TAG issue a finding regarding
> > appropriate error resilience/recovery/"second guessing" in web
> > software.
>
> The TAG spent some time on this on May 27, and while there's an issue
> lurking out there, it needs a bit more cooking before we're ready to
> take it up officially.
>
> > * Should future XML-based language specifications from W3C extend
> >   traditional XML strictness into attribute values and other areas
> >   left undefined by XML?
>
> The answer seems to be "it depends". I'm having trouble imagining what
> kind of thing we could say that would cover the general case. Is there
> a general case here?

Maybe not, and maybe that's what needs to be said: say "it depends"
explicitly somewhere, rather than have people assume one thing or
another.

The IETF has a well-known motto: "Be liberal in what you accept, and
conservative in what you send". It's documented in more detail here:
http://www.ietf.org/rfc/rfc2360.txt (section 2.9)

The strictness embodied in XML departs from that principle (though not
as far from the detailed explanation in RFC 2360 as one might think). I
think it would be very helpful for the TAG to somehow adapt section 2.9
of RFC 2360 to the Web.

> > * Should specifications be clear on what is safe to ignore? (I
> >   would hope so....not always the case, so perhaps this should be
> >   written down)
> > * When is it safe to specify that unknown issues can be ignored
> >   ("ignorability"), and when must specification writers not allow
> >   ignorability?
>
> Same comment, really. I'm having trouble seeing the general case or
> imagining what a TAG finding could say.

I guess I'd like to call attention to what the TAG has already said:

  "An example of incorrect and dangerous behavior is a user-agent that
  reads some part of the body of a response and decides to treat it as
  HTML based on its containing a <!DOCTYPE declaration or <title> tag,
  when it was served as text/plain or some other non-HTML type."

There's clearly a bigger architectural principle driving that statement
than just the second-guessing of a media type based on content, though
I'm having trouble teasing it out myself. Here's an attempt at a pithy
phrase: "Do what I say, not what I 'mean'". I.e., if I say that
something is text/plain, then IT'S TEXT/PLAIN ALREADY!!! I DIDN'T "MEAN"
TEXT/HTML!!!!! :) The document should be treated as text/plain.

In general, HTML documents marked as "text/plain" are totally valid
"text/plain" documents. It's not as though there's an error, per se.
Legitimate responses from web servers should therefore not be treated as
"errors", even if there's a 99% probability that what is seen is a
configuration problem.
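To put the contrast in code, here's a minimal sketch (the function names
and logic are mine, purely for illustration; no actual user-agent is
being quoted):

```python
# Two hypothetical user-agent policies for a body served as text/plain.

def sniffed_type(declared_type, body):
    """The behavior the TAG calls incorrect and dangerous: peek at the
    body for HTML-ish markers and second-guess the server."""
    head = body[:512].lower()
    if declared_type == "text/plain" and ("<!doctype" in head or "<title>" in head):
        return "text/html"  # "what I 'mean'"
    return declared_type

def declared_type_wins(declared_type, body):
    """The behavior argued for above: the declared media type is
    authoritative, so there is no error to recover from."""
    return declared_type

body = "<!DOCTYPE html><title>oops</title>a legitimate, if odd, response"
print(sniffed_type("text/plain", body))        # text/html  (second-guessed)
print(declared_type_wins("text/plain", body))  # text/plain (as served)
```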
One other issue to give guidance on is how vendor-specific extensions
should be handled. Some W3C specifications, like the SMIL 2.0 Language
profile, are very strict: only elements and attributes that are
explicitly scoped or qualified to a namespace URI are allowed. This has
the effect that *all* proprietary extensions to SMIL 2.0 must be
traceable to a URI. See the following section of the SMIL document for
the nitty-gritty of this:
http://www.w3.org/TR/smil20/smil20-profile.html#SMILProfileNS-LanguageConformance
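To make that rule concrete, here's a rough sketch of the check (my
illustration, not the SMIL test suite; the extension namespace and the
tiny known-attribute set are made up):

```python
import xml.etree.ElementTree as ET

KNOWN_ATTRS = {"src", "dur", "begin", "end"}  # stand-in for the real profile

doc = """\
<smil xmlns="http://www.w3.org/2001/SMIL20/Language"
      xmlns:ext="http://example.com/2002/player-extensions">
  <body>
    <video src="movie.mpg" ext:hardwareDecode="true" turbo="on"/>
  </body>
</smil>"""

root = ET.fromstring(doc)
for elem in root.iter():
    for name in elem.attrib:
        # ElementTree spells namespace-qualified names as {uri}local, so
        # ext:hardwareDecode is traceable to its URI, while the bare
        # "turbo" is traceable to nothing and must be rejected.
        if name.startswith("{") or name in KNOWN_ATTRS:
            continue
        print("non-conformant extension attribute:", name)
```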
In doing this, the SMIL working group thought we were going in the
direction that the consortium as a whole was tending toward. If we
misread the situation, or if circumstances have changed, then it would
be good to know what the current read on matters is. Clearly, existing
document formats are what they are, but it would be good to provide
guidance to new working groups as to how strict new document formats
should be.

> I'm trying not to diss the issue that Rob's raising here. Clearly the
> decision as to how-liberal-to-be-in-what-you-accept is architectural
> in scope. On the other hand, the W3C specifies languages designed for
> authorship by nontechnical humans, protocols for significant
> e-commerce payloads, and pretty well everything in between, so is
> there an architectural principle that cuts across the spectrum? For
> example, I (perhaps in a minority) am OK with HTML processors being
> very liberal in what they accept; it helps let everyone publish to
> the web. I also believe that XML's draconian error-handling was the
> right design decision.

Should the W3C ever publish a new recommendation that is as liberal as
HTML, or is that seen as a legacy issue? If XML's draconian
error-handling was the right decision, is it the right decision for
more than XML? These are the kinds of questions it would be nice to
answer.

Perhaps I'm bundling up many issues in a single request, but I'm not
sure I know how to break this up into bite-sized chunks. If there are
suggestions, I'd like to hear them.

Thanks,
Rob

Received on Tuesday, 28 May 2002 21:13:33 UTC