[Bug 11363] ins/del inside ruby

From: <bugzilla@jessica.w3.org>
Date: Sun, 21 Nov 2010 08:53:40 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1PK5g4-0000bC-4O@jessica.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11363

--- Comment #1 from Michael(tm) Smith <mike@w3.org> 2010-11-21 08:53:38 UTC ---
(In reply to comment #0)
> It wasn't entirely clear to me from reading the spec whether the following was
> supposed to be valid:
> 
> <!DOCTYPE html>
> <title>Ruby vs ins/del</title>
> <p><ruby>X<rt>x</rt>Y<ins><rt>y</rt></ins></ruby></p>
> 
> I think so, given that it's semantically meaningful, <ins> is supposed to be
> transparent, and <ins> is occurring in a context where phrasing content is (one
> of the things) expected. On the other hand, this is going to be tough to
> validate and validator.nu says it's invalid.

The messages that validator.nu emits for this markup are actually generated by
the parser, before any validation takes place. They indicate a parse error,
because the HTML parsing algorithm in the spec defines this case as one. See
the case for 'A start tag whose tag name is one of: "rp", "rt"' in section
8.2.5.10 'The "in body" insertion mode':

http://dev.w3.org/html5/spec/tokenization.html#parsing-main-inbody
"If the stack of open elements has a ruby element in scope, then generate
implied end tags. If the current node is not then a ruby element, this is a
parse error; pop all the nodes from the current node up to the node immediately
before the bottommost ruby element on the stack of open elements."

So when the parser hits the rt start tag after the ins start tag within the
ruby element, it reports a parse error and pops the ins element off the stack
of open elements (in effect closing it), so the ins and rt elements end up as
siblings in the DOM.

You can check this by looking at the DOM view for this markup in Live DOM
Viewer running in Firefox 4, a recent Chrome version, or a WebKit nightly
(which all now have parsers that conform to the parsing algorithm in the spec).

http://software.hixie.ch/utilities/js/live-dom-viewer/

I'm not sure why the spec requires this behavior to begin with. I'm also not
sure why it requires this behavior only when the rt start tag occurs inside a
ruby element; if you omit the ruby element (which makes the markup invalid),
the parsing behavior is different, and the rt element does end up as a child
of the ins element in the DOM (instead of as its sibling).
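
To make the two outcomes concrete, here is a rough sketch of the quoted
"rp"/"rt" start-tag step, applied to a stack of open elements. It is only a
simplified model, not the spec's algorithm: elements are plain tag-name
strings, the function name handle_rt_start_tag is made up for illustration,
and "has a ruby element in scope" is approximated by "ruby is anywhere on the
stack". The set of implied end tags is the one I understand the parsing
section to list at the time of writing.

```python
# Simplified model of the quoted "rp"/"rt" start-tag rule.
# Assumption: elements are tag-name strings, bottom of the list = first opened.

# Elements the parser may close implicitly ("generate implied end tags").
IMPLIED_END_TAGS = {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}


def handle_rt_start_tag(stack):
    """Return (new_stack, parse_error, parent_of_rt) for an rt start tag."""
    stack = list(stack)  # work on a copy
    parse_error = False
    # Assumption: membership test stands in for the real "in scope" check.
    if "ruby" in stack:
        # Generate implied end tags.
        while stack[-1] in IMPLIED_END_TAGS:
            stack.pop()
        if stack[-1] != "ruby":
            # Parse error: pop everything above the most recently opened ruby.
            parse_error = True
            while stack[-1] != "ruby":
                stack.pop()
    parent = stack[-1]
    stack.append("rt")  # insert the rt element under the current node
    return stack, parse_error, parent


# Inside ruby: ins is popped, so rt becomes a child of ruby (sibling of ins).
with_ruby = handle_rt_start_tag(["html", "body", "p", "ruby", "ins"])

# No ruby on the stack: nothing is popped, so rt becomes a child of ins.
without_ruby = handle_rt_start_tag(["html", "body", "p", "ins"])
```

In this model the first call reports a parse error and gives "ruby" as the
parent of the new rt element, while the second reports no error from this step
and gives "ins" as the parent.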

> Either way, I think it's worth a note/example in the spec.

It is documented in the section that specifies the parsing algorithm, but it
really would be good to have it also noted at the point of use, in the
authoring-conformance section that defines the content models for the ruby,
rt, and rp elements. There are other cases like this one -- where you can only
understand the behavior if you also read the parsing algorithm. But since most
authors are not going to read the parsing algorithm, it would be good to have
all of these cases also noted in the corresponding authoring-conformance parts
of the per-element sections of the spec.

Received on Sunday, 21 November 2010 08:53:42 GMT