- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Mon, 16 Feb 2015 21:19:23 +0200
- To: Jonathan Grant <jgrantwork@gmail.com>, www-validator@w3.org
2015-02-09, 14:11, Jonathan Grant wrote:

> I followed this example:
>
> http://www.w3.org/International/questions/qa-validator-charset-check.en
>
> but it didn't catch the corrupt characters in the following page, any ideas?
>
> http://man7.org/linux/man-pages/man1/hostname.1.html

There are no corrupt characters there, as far as I can see. But some characters there can be problematic in terms of font support; that’s a completely different problem.

The page is declared as UTF-8 encoded, both in a <meta> tag and in an HTTP header. And it appears to be actually UTF-8 encoded.

> See text below with ??
>
> Information about the project can be found at
> ??http://net-tools.sourceforge.net/??. If you have a bug report for
> this manual page, see ??http://net-tools.sourceforge.net/??.
>
> The bytes seem to be some multi byte E2 9F A8

The character before the URL is “⟨” U+27E8 MATHEMATICAL LEFT ANGLE BRACKET, which is E2 9F A8 in UTF-8 encoding; see
http://www.fileformat.info/info/unicode/char/27e8/index.htm

And the character after the URL is “⟩” U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET.

At the level of character representation and use of characters in (X)HTML, everything is correct; there is no error to report. But font support is limited; the page
http://www.fileformat.info/info/unicode/char/27e8/fontsupport.htm
lists most of the fonts containing these characters (though it may lack some very new or specialized fonts). Browsers generally indicate lack of font support by displaying a small rectangle in place of the character.

Moreover, it is questionable whether these characters, designated as mathematical, should be used as URL delimiters. It is much safer, and much more common, to use the ASCII characters “<” and “>” as delimiters. In (X)HTML, you just need to remember to write the former as &lt; due to (X)HTML syntax rules.

> I'm not a member on this list, so please keep my email in replies.

OK.

Yucca
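
For anyone who wants to check the byte values discussed above, here is a minimal sketch, assuming Python 3 with only the standard library. The helper name `describe` is made up for illustration. It shows that U+27E8 and U+27E9 are ordinary, valid UTF-8 sequences rather than corruption, and that the bytes E2 9F A8 decode back to the left angle bracket.

```python
import unicodedata


def describe(ch):
    """Return the code point, Unicode name and UTF-8 bytes of one character.

    Hypothetical helper, written only for this illustration.
    """
    utf8_bytes = ch.encode("utf-8")
    return {
        "codepoint": "U+{:04X}".format(ord(ch)),
        "name": unicodedata.name(ch),
        # Format each byte as two uppercase hex digits, separated by spaces.
        "utf8": " ".join("{:02X}".format(b) for b in utf8_bytes),
    }


# The two delimiters found around the URLs on the man page.
left = describe("\u27E8")
right = describe("\u27E9")

# Decoding the reported bytes strictly raises UnicodeDecodeError
# if they are not well-formed UTF-8; here they decode cleanly.
round_trip = bytes.fromhex("E29FA8").decode("utf-8", errors="strict")

if __name__ == "__main__":
    for info in (left, right):
        print(info["codepoint"], info["name"], info["utf8"])
    print(round_trip == "\u27E8")
```

A clean strict decode only says the encoding is well-formed; whether the character is actually displayed still depends on the fonts installed on the reader's system.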
Received on Monday, 16 February 2015 19:19:55 UTC