
Re: [VE][122] Smart quotes not allowed in the value of attribute "id"

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Fri, 21 Nov 2008 20:23:34 +0200
Message-ID: <AA3F8C8AD0154DDE8960F19DFFE36652@JukanPC>
To: "Dan Dascalescu" <ddascalescu@gmail.com>, <www-validator@w3.org>

Dan Dascalescu wrote:

> I was validating the following XHTML 1.0 Strict fragment:
>
> <h1 id="&#8220; &#8221;">header text</h1>
>
> and got the error that the "smart" quotes were not allowed.

That is correct, though the error message is formulated differently.

> I'm
> curious as to why that is, since smart quotes are high-ASCII Unicode
> characters like any other,

There is no such thing as high-ASCII. ASCII covers only octets 7F (hexadecimal) 
and below, and the "smart" quotes, U+201C and U+201D, aren't there.

> and hex-encoded Chinese characters (e.g.
> &#x884C;) are perfectly allowed in 'id' attribute values.

The character U+884C, whether as such or as a character reference, is 
classified as a letter, for the purposes of XML syntax at least. Here's an 
excerpt from the XML spec:

Ideographic    ::=    [#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]

The "smart" quotes aren't letters under any definition, and they aren't 
allowed in identifiers in XML by any other rule either. Why _would_ you use 
quotation marks in an identifier?
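
The difference between the two cases can be sketched in Python. This is a 
deliberately simplified check, not a validator: it treats only ASCII letters 
and the Ideographic ranges quoted above as letters, and it assumes the XML 1.0 
Name shape of a letter, "_" or ":" followed by letters, digits, ".", "-", "_" 
or ":". It therefore rejects many names that real XML accepts (accented 
letters, for example). The function names are invented for this illustration.

```python
# Ranges from the Ideographic production quoted from the XML spec.
IDEOGRAPHIC = [(0x4E00, 0x9FA5), (0x3007, 0x3007), (0x3021, 0x3029)]


def is_letter(ch: str) -> bool:
    """Simplified 'Letter': ASCII letters plus the Ideographic ranges only."""
    if ch.isascii() and ch.isalpha():
        return True
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in IDEOGRAPHIC)


def is_name_like(value: str) -> bool:
    """Simplified approximation of an XML 1.0 Name (an assumption, see above)."""
    if not value:
        return False
    first, rest = value[0], value[1:]
    if not (is_letter(first) or first in "_:"):
        return False
    return all(
        is_letter(c) or (c.isascii() and c.isdigit()) or c in ".-_:"
        for c in rest
    )


print(is_name_like("\u884c"))         # True: U+884C is in [#x4E00-#x9FA5]
print(is_name_like("\u201c \u201d"))  # False: quotation marks and a space
```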

Generally, it is surely safest to stick to good old ASCII letters (and maybe 
a few other ASCII characters as allowed by old HTML specs) in identifiers. 
After all, identifiers are supposed to be mainly machine-processable, not 
something that end users see (though they may accidentally see them, e.g. 
when an identifier appears in the fragment identifier of a link).
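
A conservative check along these lines might look like the Python sketch 
below. It assumes the HTML 4 rule for ID tokens: an ASCII letter first, then 
any number of ASCII letters, digits, hyphens, underscores, colons and periods. 
The names CONSERVATIVE_ID and is_conservative_id are hypothetical.

```python
import re

# ASCII letter first, then letters, digits, "-", "_", ":" or "."
# (the HTML 4 ID token rule, taken here as an assumption).
CONSERVATIVE_ID = re.compile(r"[A-Za-z][A-Za-z0-9\-_:.]*")


def is_conservative_id(value: str) -> bool:
    """Return True if value uses only the old ASCII identifier characters."""
    # fullmatch requires the whole string to match, not just a prefix.
    return CONSERVATIVE_ID.fullmatch(value) is not None


print(is_conservative_id("header-1"))       # True
print(is_conservative_id("\u201c \u201d"))  # False
```

An identifier that passes this check is also free of the quotation marks and 
space that caused the original error.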

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/ 
Received on Friday, 21 November 2008 18:24:28 GMT
