- From: Jonathan Grant <jgrantwork@gmail.com>
- Date: Tue, 17 Feb 2015 16:42:30 +0000
- To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
- Cc: www-validator@w3.org
- Message-ID: <CAAVmPNdWZ6voHTK8b2xYkw8Y7F1eetoLrEX00_O16X5GchPuSw@mail.gmail.com>
Many thanks for your reply

Regards, Jon

On 16 February 2015 at 19:19, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
> 2015-02-09, 14:11, Jonathan Grant wrote:
>
>> I followed this example:
>>
>> http://www.w3.org/International/questions/qa-validator-charset-check.en
>>
>> but it didn't catch the corrupt characters in the following page, any
>> ideas?
>>
>> http://man7.org/linux/man-pages/man1/hostname.1.html
>
> There are no corrupt characters there, as far as I can see. But some
> characters there can be problematic in terms of font support; that’s a
> completely different problem.
>
> The page is declared as UTF-8 encoded, both in a <meta> tag and in an HTTP
> header. And it appears to be actually UTF-8 encoded.
>
>> See text below with ??
>>
>> Information about the project can be found at
>> ??http://net-tools.sourceforge.net/??. If you have a bug report
>> for
>> this manual page, see ??http://net-tools.sourceforge.net/??.
>>
>> The bytes seem to be some multi byte E2 9F A8
>
> The character before the URL is “⟨” U+27E8 MATHEMATICAL LEFT ANGLE BRACKET,
> which is E2 9F A8 in UTF-8 encoding; see
> http://www.fileformat.info/info/unicode/char/27e8/index.htm
> And the character after the URL is “⟩” U+27E9 MATHEMATICAL RIGHT ANGLE
> BRACKET.
>
> At the level of character representation and use of characters in (X)HTML,
> everything is correct; there is no error to report.
>
> But font support is limited; the page
> http://www.fileformat.info/info/unicode/char/27e8/fontsupport.htm
> lists most of the fonts containing these characters (though it may lack some
> very new or specialized fonts). Browsers generally indicate lack of font
> support by displaying a small rectangle instead.
>
> Moreover, it is questionable whether these characters, designated as
> mathematical, should be used as URL delimiters.
>
> It is much safer, and much more common, to use the Ascii characters “<” and
> “>” as delimiters. In (X)HTML, you just need to remember to write the former
> as &lt; due to (X)HTML syntax rules.
>
>> I'm not a member on this list, so please keep my email in replies.
>
> OK.
>
> Yucca
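
The byte values discussed above can be checked with a short script. This is only a minimal sketch, not something taken from the thread: the helper name describe is hypothetical, and it uses nothing beyond the Python standard library (3.8 or later for the hex separator). It shows that U+27E8 and U+27E9 encode to E2 9F A8 and E2 9F A9 in UTF-8, that the reported bytes decode cleanly, and how the Ascii delimiters would be escaped in (X)HTML.

```python
import html
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        ch.encode("utf-8").hex(" ").upper(),
    )


# The two delimiters discussed in the thread.
left = describe("\u27E8")
right = describe("\u27E9")

# Decoding the reported bytes as UTF-8 succeeds, so they are not corrupt.
decoded = bytes.fromhex("E2 9F A8").decode("utf-8")

# Ascii delimiters around the URL, escaped for use in (X)HTML content.
url = "http://net-tools.sourceforge.net/"
escaped = html.escape(f"<{url}>")

print(left, right, decoded == "\u27E8", escaped)
```

If the decode step raised UnicodeDecodeError, that would point to a real encoding problem; a clean decode followed by a small rectangle in the browser points to missing font support instead.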
Received on Tuesday, 17 February 2015 16:43:02 UTC