- From: <bugzilla@wiggum.w3.org>
- Date: Mon, 30 Mar 2009 00:57:06 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=6746

--- Comment #4 from Nick Levinson <Nick_Levinson@yahoo.com> 2009-03-30 00:57:06 ---

Interesting. I don't know the Turkish situation. Maybe someone else can explore it and any similar situations around the world. We don't need to add a security hole; perhaps there's a solution that meets both sets of needs.

On whether ASCII is all that is of interest, the standard, <http://www.w3.org/html/wg/markup-spec/>, as accessed today (29th), defines _case-insensitivity_ separately from _ASCII case-insensitivity_. The term "case-insensitive" fits within the term "ASCII case-insensitive", so defining both as separate semantic entities only makes sense in a concise document if the meanings are at least subtly different. Both offer essentially the same definitions as to the 26 letters. No other character within 7-bit ASCII, to my knowledge, is subject to case differentiation. So case-insensitivity that is not ASCII case-insensitivity must encompass, either now or in the future, non-ASCII case-insensitivity.

Non-ASCII case-insensitivity, if it is not to be redundant, must encompass letters other than the 26. I assume that includes not only diacritically-marked letters (we treat all of them for computer purposes as not of the 26) but also some like the yogh, the thorn, and the edh, which have case (I don't know if they come with diacriticals).

Attribute names may consist of almost any Unicode character (per id., section 5.6), thus of letters not of the 26. If no attribute is now spelled with a letter not of the 26, section 5.6 anticipates such attributes being added later. Attribute values may be spelled with almost any Unicode character (per id., sections 5.6 (value) and 5.7 (text)), thus of letters not of the 26, and that's now, not just in the future. Scripts may use almost any Unicode character (per id., sections 5.5 and 5.7), thus again letters not of the 26.

Does this mean the Turkish issue is already an issue in HTML5? I don't know enough to answer that.

Should HTML5 and compliant user agents and tools treat a letter not of the 26 case-insensitively or case-sensitively when it is found in an attribute value or name? I would favor insensitivity for those contexts, for the sake of consistency and meeting authors' expectations. I would extend case-insensitivity within a context from ASCII to non-ASCII, although not from contexts where insensitivity is now required by HTML to contexts where it is not, such as phrasing content or what normally appears visibly to a user in a browser window.

On the other hand, I would favor case-sensitivity within scripts, albeit not for the attribute names and values of the script element, because script content is often not HTML and thus must follow the requirements that apply to a script language such as JavaScript, which HTML should not constrain any more than it has to.

Thank you for helping me bring the argument more tightly within HTML5.

-- Nick
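
To make the distinction between ASCII and non-ASCII case-insensitivity concrete, here is a minimal sketch in Python. It is only an illustration, not anything taken from the HTML5 draft: the function names are hypothetical, and using Python's locale-independent `str.casefold()` to stand in for "non-ASCII case-insensitivity" is an assumption, since the draft may define the term differently.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only the 26 letters A-Z; leave every other character as-is."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings, ignoring case for the 26 ASCII letters only."""
    return ascii_lower(a) == ascii_lower(b)


def unicode_case_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings using Unicode default case folding.

    Assumption: str.casefold() is used here as one possible reading of
    "non-ASCII case-insensitivity". It is not locale-aware, so it does
    not apply the Turkish pairing of "I" with dotless "ı".
    """
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    pairs = [("TITLE", "title"), ("Þing", "þing"), ("I", "ı")]
    for a, b in pairs:
        print(a, b,
              ascii_case_insensitive_equal(a, b),
              unicode_case_insensitive_equal(a, b))
```

Under these assumptions, the thorn pair matches only under the Unicode comparison, while "I" and dotless "ı" match under neither, because the Turkish pairing is locale-specific and default case folding does not apply it. A comparison that did apply it would need locale-tailored rules, which is where the Turkish question raised earlier in this bug comes in.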
Received on Monday, 30 March 2009 00:57:14 UTC