- From: <bugzilla@jessica.w3.org>
- Date: Sun, 06 Nov 2011 19:52:38 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=14709

Summary: lang tag validation is insufficiently specified
Product: HTML WG
Version: unspecified
Platform: PC
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: HTML5 spec (editor: Ian Hickson)
AssignedTo: ian@hixie.ch
ReportedBy: jdaggett@mozilla.com
QAContact: public-html-bugzilla@w3.org
CC: mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org

In section "The lang and xml:lang attributes", which describes the behavior of language tags in HTML elements, there's wording that makes it difficult to determine exactly if and when some form of language tag validation should occur. The spec currently contains this wording:

    If the resulting value is not a recognized language tag, then it must be treated as an unknown language having the given language tag, distinct from all other languages. For the purposes of round-tripping or communicating with other services that expect language tags, user agents should pass unknown language tags through unmodified.

    Thus, for instance, an element with lang="xyzzy" would be matched by the selector :lang(xyzzy) (e.g. in CSS), but it would not be matched by :lang(abcde), even though both are equally invalid. Similarly, if a Web browser and screen reader working in unison communicated about the language of the element, the browser would tell the screen reader that the language was "xyzzy", even if it knew it was invalid, just in case the screen reader actually supported a language with that tag after all.

To give a concrete example of where this leads to fuzzy interpretation in implementations, consider the language tag 'mya', the ISO 639-3 language code for Burmese. ISO 639-1 has a two-letter code for the same language, 'my', so the valid BCP47 language tag is 'my'. So what's the exact behavior for user agents that use APIs that make use of language tag information, for example OpenType APIs that use OpenType language tags? Should the language tag be validated and a default used if it is not valid? Or should 'mya' be passed through to these APIs just in case it might be a supported OpenType tag? The spec can be read either way, especially given the example of a screen reader which "actually supported a language with that tag after all".

I think the wording needs to be stronger than this. The spec specifically needs to say that when the language is used, if it is not a valid BCP47 language tag (as with 'mya'), then the only interpretation is that it's the equivalent of an unknown language when passed along to an API. As is, the spec merely defines the *expectation* that the language code is a BCP47 code but allows for an entirely different language tag format to be used in its place.
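
To make the ambiguity concrete, here is a minimal Python sketch of the two readings. It is only an illustration, not what the spec or any user agent actually does. The names KNOWN_PRIMARY_SUBTAGS, is_recognized, tag_for_api_passthrough and tag_for_api_strict are hypothetical, the subtag set is a tiny stand-in for the real language subtag registry, and returning None to mean "unknown language" is an assumption. The lang_matches helper follows the usual :lang() rule of an exact match or a prefix match followed by a hyphen.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny stand-in for the language subtag registry.
# 'my' (Burmese) is present; the ISO 639-3 code 'mya' is not.
KNOWN_PRIMARY_SUBTAGS = {"en", "fr", "ja", "my"}

# Loose shape check only: hyphen-separated alphanumeric subtags of 1 to 8 characters.
WELL_FORMED = re.compile(r"[A-Za-z0-9]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_recognized(tag: str) -> bool:
    """Return True if the tag is well formed and its primary subtag is known."""
    if not WELL_FORMED.fullmatch(tag):
        return False
    return tag.split("-")[0].lower() in KNOWN_PRIMARY_SUBTAGS


def lang_matches(element_lang: str, selector_lang: str) -> bool:
    """:lang()-style match: equal, or a prefix followed by '-', ignoring ASCII case."""
    element = element_lang.lower()
    wanted = selector_lang.lower()
    return element == wanted or element.startswith(wanted + "-")


def tag_for_api_passthrough(tag: str) -> str:
    """Reading 1: hand the tag to the downstream API unmodified, valid or not."""
    return tag


def tag_for_api_strict(tag: str) -> Optional[str]:
    """Reading 2: hand over only recognized tags; None stands for 'unknown language'."""
    return tag if is_recognized(tag) else None
```

Under the first reading, a font API would receive 'mya' and might or might not map it to something; under the second, it would see an unknown language and fall back to its default. Selector matching is the same in both cases: lang_matches("xyzzy", "xyzzy") holds and lang_matches("xyzzy", "abcde") does not, regardless of validity.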
Received on Sunday, 6 November 2011 19:52:42 UTC