- From: Martin Duerst <duerst@w3.org>
- Date: Tue, 20 Aug 2002 11:34:46 +0900
- To: maherb@brimworks.com, www-validator@w3.org
At 12:29 02/08/19 -0400, maherb@brimworks.com wrote:
>In the HTML 4.01 specification on this page,
>
>http://www.w3.org/TR/html4/charset.html#h-5.3.1
>
>describes numeric character references which are perfectly legal,
>however when validating with such numeric character reference, I
>receive an error:
>
>    * Line 106, column 5:
>
>      &#151;foo bar<br>
>          ^

The validator is correct. Please have a look at
http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html

This says:

     CHARSET
           BASESET   "ISO Registration Number 177//CHARSET
                      ISO/IEC 10646-1:1993 UCS-4 with
                      implementation level 3//ESC 2/5 2/15 4/6"
          DESCSET 0       9       UNUSED
                  9       2       9
                  11      2       UNUSED
                  13      1       13
                  14      18      UNUSED
                  32      95      32
                  127     1       UNUSED
                  128     32      UNUSED
                  160     55136   160
                  55296   2048    UNUSED  -- SURROGATES --
                  57344   1056768 57344

Please note the line

                  128     32      UNUSED

This says that 32 characters, starting from character number 128,
are unused. The next usable character is 160.

This is because the numbers in numeric character references are
taken from Unicode, and in Unicode, the characters from 128 to 159
are control characters, which don't belong in an HTML document.

What you probably wanted was the character EM DASH, represented
with byte 0x97 in windows-1252. For this, please use the hexadecimal
NCR &#x2014;, or its decimal equivalent, &#8212;.

Regards,    Martin.

>Error: reference to non-SGML character
>
>Thanks,
>-Brian
>
>
>--
>  Brian Maher                                         CS Major WWU
>  BrimWorks.com                                   >> Glory to God <<
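As an illustration of the DESCSET ranges quoted above, here is a minimal Python sketch of the check. It is not the validator's code. The names DESCSET, is_sgml_character and suggest_ncr are hypothetical, and reinterpreting the numbers 128 to 159 as windows-1252 bytes is only an assumption about what the author meant.

```python
# Ranges copied from the HTML 4 SGML declaration's DESCSET:
# (first code point, number of code points, usable in a document?)
DESCSET = [
    (0, 9, False),
    (9, 2, True),
    (11, 2, False),
    (13, 1, True),
    (14, 18, False),
    (32, 95, True),
    (127, 1, False),
    (128, 32, False),        # the "128 32 UNUSED" line
    (160, 55136, True),
    (55296, 2048, False),    # surrogates
    (57344, 1056768, True),
]


def is_sgml_character(code_point):
    """Return True if the declaration allows this number in an NCR."""
    for start, count, usable in DESCSET:
        if start <= code_point < start + count:
            return usable
    return False  # outside every declared range


def suggest_ncr(code_point):
    """Return a hexadecimal NCR for code_point, or None if there is none.

    Assumption: a number in 128..159 was meant as a windows-1252 byte,
    so it is decoded with Python's cp1252 codec to find the intended
    Unicode character.
    """
    if is_sgml_character(code_point):
        return "&#x%X;" % code_point
    if 128 <= code_point <= 159:
        try:
            char = bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            return None  # byte is unassigned in windows-1252
        return "&#x%X;" % ord(char)
    return None


if __name__ == "__main__":
    print(is_sgml_character(151))  # the number from the failing reference
    print(suggest_ncr(151))        # suggested replacement NCR
```

With this sketch, 151 is rejected by the range check, and the windows-1252 reinterpretation yields the EM DASH reference recommended above.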
Received on Monday, 19 August 2002 23:30:42 UTC