- From: Michael(tm) Smith <mike@w3.org>
- Date: Wed, 17 Jun 2009 20:03:01 +0900
- To: Shane McCarron <shane@aptest.com>
- Cc: public-html@w3.org
[cc trimmed]

Shane McCarron <shane@aptest.com>, 2009-06-17 00:01 -0500:

> Most modern standards and "recommendations" define what happens
> when presented with conforming input, and implicitly or explicitly state
> that non-conforming input results unspecified or undefined behavior
> (see *all* the IEEE POSIX standards, for example).

The unique scale and scope of the problems we need to address in developing rules for optimal-for-the-needs-of-actual-end-users processing of resources on the public Web may necessarily be very different from those of most other standards. So I think it is perhaps to be expected that the standards we develop in this group might need to be somewhat different from most other standards.

But that said, in my previous experience with development of e-mail software and services for mobile operators and mobile-device makers and ISPs, I can think of a number of cases where we had to add ad-hoc error-handling behavior to our applications, downstream, to deal with non-conforming input being generated by third-party upstream systems over which we had no control -- cases for which the relevant standard left the behavior unspecified and undefined. And I'm certain that other vendors who had downstream apps that needed to process that same broken input also had to guess at and add their own ad-hoc error-handling behavior, very likely in a way that was not fully interoperable with what we ended up deploying.

So I guess I'm not convinced that other standards for handling Internet content shouldn't also be specced out with a degree of rigor and thoroughness in defining (and iteratively updating during further development and testing) rules for error-handling similar to what we have ended up with in the current HTML5 draft (and other drafts we have now that are related to or spun off from it).

> It is an unparalleled act of hubris that this work attempts not
> only to define good behavior, but also to define *positive*
> behavior in the face of bad data. That's just insane. If it
> is critical to define the behavior of implementations when
> handed invalid input, then say that the implementation is
> required to tell the user "the input is broken"!

In many cases, doing that would seem to amount to punishing end users for mistakes made by authors. Or the producer of the erroneous input may be a broken upstream implementation (or content provider or author) over which the users of the downstream consumer apps receiving the data have zero control. For that case, having the client/consumer application simply tell the end user "the input is broken" and refuse to process it does not seem like the right solution to the problem.

Deploying to any significant number of users a client that behaved that way for that case would seem to incur costs to respond to customer-support inquiries from those end users -- and would maybe be a great way to hand competing vendors an opportunity to take business away from you (by providing clients that are capable of processing the input in spite of the errors).

On that note, there's what seems to me a relevant bit in this essay from back in 2004:

  http://dbaron.org/www/df-frag

  ...While solution by the market may not sound inherently bad, it is worth remembering that the rules for error-handling in traditional HTML were solved by the market, and the end result was bad for competition and bad for small devices...

In practice, it seems like for a lot of cases we are left with the same kind of choice: When there is any market value for a client that can correct a particular type of error in some class of input, we can all collectively either:

  A. Follow (repeat) the strategy of waiting and watching as the whims of the market (which in some cases largely just amounts to whatever vendor has a majority market share) determine for us what the error-handling rules are going to end up being.

  B. Work together to consider the possible error cases up front and try to get agreement on what the error-handling behavior for those cases should be, and spec it out. And to iterate through that process as new kinds of error cases develop (or are discovered).

> Defining that an implementation will somehow cleverly deal with
> every broken piece of input is not just silly, it's impossible.

In practice, it seems like it's often not a situation of trying to anticipate and define how to deal with every possible piece of broken input, but instead a situation of defining how to deal with actual cases that we're already aware of -- whether it's because they're discovered during testing or are found in instances of existing content developed by actual users or early adopters, or whatever -- and iterating back through that process after further testing (or even deployment) exposes more such cases.

  --Mike

-- 
Michael(tm) Smith
http://people.w3.org/mike/
Received on Wednesday, 17 June 2009 12:46:38 UTC