[Bug 14709] lang tag validation is insufficiently specified

http://www.w3.org/Bugs/Public/show_bug.cgi?id=14709

--- Comment #22 from Glenn Adams <glenn@skynav.com> 2011-11-08 15:38:07 UTC ---
(In reply to comment #21)
> To me, it seems like a bad idea to help legacy language tags proliferate. I
> think document conformance should require strict RFC 4646 validity and,
> furthermore, OpenType values shouldn't leak to HTML. That is, I think we should
> require lang=my in HTML and leave it to OpenType implementations to map my to
> BRM. This way, the burden of dealing with legacy would be contained to
> implementations that deal with OpenType instead of burdening all kinds of
> implementations.

I agree, except that it should be RFC 5646, which obsoletes 4646. The question
remains of how to treat invalid values. Should they simply be ignored (as if not
specified at all)? Should they be treated as specifying the empty string? Should
an invalid value be visible via the lang IDL attribute?

My suggestion would be that they (i.e., language values that are not
well-formed or are otherwise non-conformant) be ignored internally (in the UA)
for the purpose of further processing. However, for the lang IDL attribute, I
would suggest they be retained, even if not well-formed or otherwise
non-conformant. In other words, the following from 2.1.3 applies:

"When it is stated that some element or attribute is ignored, or treated as
some other value, or handled as if it was something else, this refers only to
the processing of the node after it is in the DOM. A user agent must not mutate
the DOM in such situations."
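
To make the suggestion concrete, here is a rough sketch in Python of the
behavior I have in mind. It is only an illustration under stated assumptions:
the Element class, is_well_formed() and effective_language() are hypothetical
names (not DOM or UA APIs), and the pattern is a simplified reading of the
RFC 5646 langtag grammar that omits grandfathered and private-use-only tags.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified RFC 5646 "langtag" well-formedness check (an approximation:
# grandfathered tags and private-use-only tags such as "x-foo" are omitted).
_LANGTAG = re.compile(r"""
    ^
    (?: [a-z]{2,3} (?: -[a-z]{3} ){0,3}              # language + extlangs
      | [a-z]{4,8} )                                 # longer language subtag
    (?: -[a-z]{4} )?                                 # script
    (?: -(?: [a-z]{2} | [0-9]{3} ) )?                # region
    (?: -(?: [a-z0-9]{5,8} | [0-9][a-z0-9]{3} ) )*   # variants
    (?: -[0-9a-wy-z] (?: -[a-z0-9]{2,8} )+ )*        # extensions
    (?: -x (?: -[a-z0-9]{1,8} )+ )?                  # private use
    $
""", re.VERBOSE | re.IGNORECASE)


def is_well_formed(tag: str) -> bool:
    """Return True if tag matches the simplified langtag pattern above."""
    return _LANGTAG.match(tag) is not None


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element; not a real DOM interface."""
    lang_attr: Optional[str] = None  # raw content attribute, never mutated

    @property
    def lang(self) -> str:
        # The IDL attribute reflects the raw value, well-formed or not.
        return self.lang_attr if self.lang_attr is not None else ""

    def effective_language(self, inherited: str = "") -> str:
        # For further processing, a value that is not well-formed is ignored,
        # i.e. treated as if the attribute were not specified at all.
        if self.lang_attr is None:
            return inherited
        if self.lang_attr == "":
            return ""  # the empty string explicitly means "unknown" in HTML
        return self.lang_attr if is_well_formed(self.lang_attr) else inherited
```

With this sketch, Element("en_US").lang still returns "en_US", while
Element("en_US").effective_language("fr") falls back to the inherited "fr".
The DOM keeps the author's value, and only later processing ignores it.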


Received on Tuesday, 8 November 2011 15:38:11 UTC