Re: XHTML Ruby specification error handling rules

On Tue, 5 Jul 2005, Martin Duerst wrote:
> >
> >   http://www.w3.org/TR/ruby/
> >
> >...with the intent of writing test cases and adding Ruby support to the 
> >HTML5 specification [1],
> 
> <rant>
> I'm not really sure it's appropriate to use a name such as HTML5
> for a project that's essentially trying to legitimize broken
> behavior so that some browser manufacturers can try to claim that
> they are not just following the broken behavior used by the
> market leader, just at a moment when that market leader finally
> shows some signs of moving ahead, hopefully in the right direction.
> </rant>

The goal of the HTML5 project is to create the next version of HTML. It 
certainly isn't "trying to legitimize broken behavior", and I see no 
reason why browser vendors would want to "claim that they are not just 
following the broken behavior used by the market leader" -- we have been 
claiming this all along, and raising it as the main reason it is 
absolutely critical for specs to include error handling rules.


> >However, I have run into a problem. In the conformance section, it 
> >states that an interpreter must reject non-conformant Ruby markup, but 
> >Web browsers are effectively unable to do this, primarily because Web 
> >authors have been conditioned to expect browsers to handle errors,
> 
> Well, yes, for old crap. There is no such need for new stuff, I hope.

I don't see why. Authors are not going to suddenly become perfect. Just 
try to find instances of valid XHTML on the Web.


> >but also because conformance checking is an expensive operation, and 
> >rendering is a performance-sensitive operation.
> 
> I understand that rendering is expensive. I'm not exactly sure why 
> conformance checking should be that expensive. Could you explain?

Verifying a content model is expensive (that is, it takes more than a 
couple of milliseconds per page). I am not familiar with the details, so I 
cannot give you detailed reasons why this is the case, but Web browser 
implementors have told me in no uncertain terms that implementing 
validation is not an option.


> >To be able to put Ruby in HTML5, therefore, I need a Ruby processing 
> >model that is well-defined even in the face of bogus markup,
> 
> What do you mean by bogus markup? No end quotes on attributes?
> Missing end tags? Interleaving start and end tags? Or what else?

None of the above. The processing of all of those is already pretty well 
defined. I meant things like:

 * <ruby> <em>...</em> <rt>...</rt> </ruby>
 * <ruby> <rb>...</rb> <rp>...</rp> <ruby> ... </ruby> </ruby>
 * <ruby> ... </ruby>
 * <ruby> ... <rt>...</rt> <rb>...</rb> </ruby>

...etc.

Ideally this would be defined in terms of a processing model, not in terms 
of error handling. For example, instead of saying "If there is more than 
one RT element, then ignore all but the first", the processing model would 
say something like "the UA must take the first RT element and do X, then 
must take the first RB element and do Y, then...".
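
As a rough illustration of that style, here is a sketch in Python of a 
processing model written as operations on the tree rather than as a 
content model plus error rules. The function name pair_ruby and the two 
rules it applies are made up for this example; they are not taken from 
the Ruby specification or from any HTML5 draft.

```python
import xml.etree.ElementTree as ET


def pair_ruby(ruby):
    """Return (base_text, annotation_text) for any <ruby> element.

    Hypothetical rules, chosen only for illustration:
      1. The annotation is the text of the first <rt> child, or "" if
         there is no <rt> child.
      2. The base is the text of the first <rb> child if there is one;
         otherwise it is all text in <ruby> outside <rt> and <rp> children.
    """
    first_rt = ruby.find("rt")  # first direct <rt> child, or None
    annotation = "".join(first_rt.itertext()) if first_rt is not None else ""

    first_rb = ruby.find("rb")  # first direct <rb> child, or None
    if first_rb is not None:
        base = "".join(first_rb.itertext())
    else:
        parts = [ruby.text or ""]
        for child in ruby:
            if child.tag not in ("rt", "rp"):
                parts.append("".join(child.itertext()))
            parts.append(child.tail or "")
        base = "".join(parts)

    return base.strip(), annotation.strip()


# Inputs shaped like the four cases listed above, with placeholder text.
examples = [
    "<ruby> <em>kanji</em> <rt>kana</rt> </ruby>",
    "<ruby> <rb>a</rb> <rp>(</rp> <ruby>b</ruby> </ruby>",
    "<ruby>plain</ruby>",
    "<ruby>x <rt>t</rt> <rb>b</rb> </ruby>",
]
for markup in examples:
    print(markup, "->", pair_ruby(ET.fromstring(markup)))
```

Every input yields some result and nothing is rejected, because each step 
is defined for whatever children happen to be present. Whether these 
particular rules are the right ones is a separate question.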


> Correct markup is well defined. Bogus markup is a very wide field. This 
> is the main reason why trying to do something like "well-defined even in 
> the face of bogus markup" is an extremely hard task.

It need not be hard if the definition assumes bad markup is common (as it 
is) and works from there. By defining the processing in terms of 
operations on the DOM, for instance, instead of in terms of the correct 
content model, error handling becomes implicit and well-defined.


Incidentally, is there a document somewhere where I can read the CR 
implementation report for Ruby? I would be interested in examining the 
test suite that was used to verify interoperability, as well as testing 
the implementations that were found to be interoperable to see how they 
handle various error conditions.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 12 July 2005 22:16:48 UTC