- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Tue, 18 Dec 2007 17:36:18 +0200
- To: "Magni Hovgaard" <m_hovgaard@hotmail.com>, <www-validator@w3.org>
Magni Hovgaard wrote:

> http://www.clarecoco.ie/services/gaeilge/gaeilge.html
>
> The online says the page is valid, while my local installation
> returns these errors.
- -
> Line 87, column 18: character "�" is not allowed in the value of
> attribute "name" <h2><a name="Réamhrá" id="Réamhrá"></a>

Looks like an encoding problem, as you suspect. The offending character seems to be U+FFFD, REPLACEMENT CHARACTER, which indicates a character data error. Oddly enough, the markup quoted seems to contain the character properly, as "á".

> Encoding:
> iso-8859-1

Even more puzzling.

As a workaround, you could represent "á" as "&aacute;", as elsewhere on the page. There is no reason why this entity reference could not be used in an attribute value, too. But of course you _should_ be able to write the character as such as well.

On the other hand, non-ASCII characters are risky in ID values. It would be safer to omit the diacritic, i.e. use just "Reamhra", since this is mostly just an internal code rather than something visible to users. It becomes visible as part of a URL if someone uses it in a fragment identifier in a link, but this in turn implies problems, since not all browsers can handle non-ASCII characters in URLs properly.

(Actually I was somewhat astonished at noticing that XHTML indeed allows e.g. "á" in an identifier. I should have remembered that - my book on Unicode has a longish discussion of the identifier concept in XML - but the concept is fairly confusing and complex and rarely applied in web authoring. People just tend to stick to ASCII letters there.)

Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/
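
P.S. A minimal Python sketch of the two points above. It shows one way a U+FFFD can arise (ISO-8859-1 bytes misread as UTF-8; this is only an assumption about what the local installation might be doing, not a diagnosis) and how a diacritic-free ID such as "Reamhra" could be derived. The helper name ascii_id is hypothetical, made up for this illustration.

```python
import unicodedata


def ascii_id(name: str) -> str:
    """Hypothetical helper: drop combining marks, e.g. 'Réamhrá' -> 'Reamhra'."""
    # NFD splits each accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFD", name)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The attribute value as it would be stored in an ISO-8859-1 encoded file.
latin1_bytes = "Réamhrá".encode("iso-8859-1")

# Assumption for illustration: a tool wrongly decodes those bytes as UTF-8.
# The lone bytes 0xE9 and 0xE1 are not valid UTF-8 sequences, so each is
# replaced by U+FFFD, REPLACEMENT CHARACTER.
misread = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the declared encoding recovers the original text.
correct = latin1_bytes.decode("iso-8859-1")

safe_id = ascii_id(correct)
```

With an ID like the one in safe_id, the fragment identifier in a link stays pure ASCII, which avoids the browser problems with non-ASCII URLs mentioned above.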
Received on Tuesday, 18 December 2007 15:36:04 UTC