i18n Polyglot Markup/attr values (6th issue)

Richard Ishida, Tue, 13 Jul 2010 20:40:24 +0100:

> FWIW, the i18n group keeps track of comments on your doc at 
> http://www.w3.org/International/reviews/1007-polyglot/

	6th issue:
		]]
			6.2.3 Attribute values	Case requirements
			"however, case requirements do not apply to non-ASCII
			letters such as Greek, Cyrillic, or non-ASCII Latin
			letters."
 
		We are confused by this text. Scripts such as Greek, Cyrillic, and 
Armenian do have case distinctions, and those distinctions are 
significant in XML if you have attribute names or values in those 
scripts. But we are not clear when any characters from those scripts or 
non-ASCII Latin letters are used for attribute names or values in HTML.

Please clarify for us what the intent is.

(There is similar text in 6.2.2)
		[[

	Comment: I think I may have had a say in what the spec says here. The 
purpose is to express that while ASCII letters are generally treated 
case-insensitively in HTML (in contrast to XHTML), the same is not true 
of non-ASCII letters. Thus XHTML and HTML agree that non-ASCII letters 
are treated case _sensitively_, whereas they disagree about ASCII 
letters: XHTML treats them case-sensitively, while HTML treats them 
case-insensitively. For programmers, it is perhaps obvious that ASCII 
case sensitivity and non-ASCII case sensitivity are different things. 
But to more ordinary people, it is not logical that some letters are 
treated case-sensitively while others are not. It is also common to say 
that XML is case-sensitive, in contrast to HTML. But the fact is that 
HTML and XML differ with regard to case sensitivity only when it comes 
to ASCII.
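
To make the distinction concrete, here is a rough Python sketch of the 
two comparison rules. The function names (ascii_lower, html_style_equal, 
xml_style_equal) are made up for this illustration and come from neither 
spec; the sketch only assumes that "ASCII case-insensitive" means 
folding A-Z to a-z and nothing else.

```python
def ascii_lower(text: str) -> str:
    """Lowercase only A-Z; leave every other character untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def html_style_equal(a: str, b: str) -> bool:
    """ASCII case-insensitive comparison (the HTML-like rule)."""
    return ascii_lower(a) == ascii_lower(b)


def xml_style_equal(a: str, b: str) -> bool:
    """Exact, case-sensitive comparison (the XML-like rule)."""
    return a == b


# ASCII letters: the two rules disagree.
print(html_style_equal("CHECKBOX", "checkbox"))  # True
print(xml_style_equal("CHECKBOX", "checkbox"))   # False

# Greek letters: both rules treat the strings as different.
print(html_style_equal("ΑΛΦΑ", "αλφα"))          # False
print(xml_style_equal("ΑΛΦΑ", "αλφα"))           # False
```

Only the first pair of calls gives different answers, which is the whole 
point: the two rules part ways for ASCII letters and nowhere else.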

For the record, HTML5, when it talks about the data-* attributes, says 
the same thing: the attribute name in data-ASCII="" is treated 
case-insensitively, whereas the name in data-ÆØÅ="" is not.
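
As a hedged illustration of the data-* case, the sketch below imitates 
what happens to attribute names in an HTML document, assuming that only 
the ASCII letters A-Z are lowercased. The helper html_attr_name is 
hypothetical. The sketch also shows why a general Unicode lowercase 
(Python's str.lower) would be the wrong tool, since it folds Æ, Ø and Å 
as well.

```python
import string

# Translation table covering A-Z only; non-ASCII letters are not in it.
ASCII_UPPER_TO_LOWER = str.maketrans(
    string.ascii_uppercase, string.ascii_lowercase
)


def html_attr_name(name: str) -> str:
    """Hypothetical helper: lowercase only the ASCII letters of a name."""
    return name.translate(ASCII_UPPER_TO_LOWER)


print(html_attr_name("data-ASCII"))  # data-ascii
print(html_attr_name("data-ÆØÅ"))    # data-ÆØÅ (unchanged)

# Unicode-aware lowercasing would wrongly merge the two spellings.
print("data-ÆØÅ".lower())            # data-æøå
```

Under this assumption data-ASCII and data-ascii end up as the same 
name, while data-ÆØÅ and data-æøå remain two distinct names.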
-- 
leif halvard silli

Received on Thursday, 15 July 2010 21:00:38 UTC