Re: apparent unicode error

From: Drew Perttula <drewp@bigasterisk.com>
Date: Sat, 17 Oct 2009 23:29:15 -0700
Message-ID: <4ADAB5BB.8050906@bigasterisk.com>
To: Ian Jacobs <ij@w3.org>
CC: site-comments@w3.org

Ian Jacobs wrote:
> 
> On 13 Oct 2009, at 10:46 PM, Drew Perttula wrote:
> 
>> Ian Jacobs wrote:
>>> On 13 Oct 2009, at 9:55 PM, Drew Perttula wrote:
>>>> http://bigasterisk.com/post/w3page.png is what 
>>>> http://www.w3.org/standards/semanticweb/ looks like on Chromium on 


> I checked the file source and I'm using &#8212;  which is Unicode mdash:
>  http://www.fileformat.info/info/unicode/char/2014/index.htm
> 
> I checked the generated output using this checker:
>  http://rishida.net/tools/conversion/
> 
> And when I paste in the character, it tells me it's mdash (&#x2014; in 
> hex).
> 
> Not sure what to do at this point...

I'm pretty sure we're talking about two different chars on the page.
Here is what I ran:

% curl http://www.w3.org/standards/semanticweb/ | od -ah | less

There's this one:

0016420   k  sp   o   n   e  sp   f   i   n   d   s  sp   i   n   f   e
         206b 6e6f 2065 6966 646e 2073 6e69 6566
0016440   r   e   n   c   e  nl  sp  sp  sp  sp  sp  sp  sp  sp   b soh
         6572 636e 0a65 2020 2020 2020 2020 81e2
0016460  em  sp   r   e   a   s   o   n   i   n   g  sp   o   v   e   r
         2099 6572 7361 6e6f 6e69 2067 766f 7265

which is definitely the five-dot symbol (U+2059), encoded in UTF-8 as
0xe2 0x81 0x99.
http://www.fileformat.info/info/unicode/char/2059/index.htm
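
For anyone reading the dump above: here is a minimal sketch of how the hex rows map back to bytes, assuming od ran on a little-endian host, so each 16-bit word in the -h rows shows its two bytes swapped. The helper name words_to_bytes is made up for illustration.

```python
import struct
import unicodedata

def words_to_bytes(hex_words):
    """Turn a row of od -h words (assumed little-endian) back into the byte stream."""
    return b"".join(struct.pack("<H", int(word, 16)) for word in hex_words.split())

# The hex row at offset 0016440, plus the first word of the row at 0016460.
row = words_to_bytes("6572 636e 0a65 2020 2020 2020 2020 81e2")
row += words_to_bytes("2099")

# The last four bytes are e2 81 99 20; drop the trailing space.
seq = row[-4:-1]
char = seq.decode("utf-8")

print(seq.hex(), "U+%04X" % ord(char), unicodedata.name(char))
```

Because the character straddles two words (81e2 at the end of one row, 2099 at the start of the next), the three bytes are easy to misread straight off the dump.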

And there's this one:

0017540  sp   w   i   t   h  sp   d   i   f   f   e   r   e   n   t  sp
         7720 7469 2068 6964 6666 7265 6e65 2074
0017560   i   n   d   u   s   t   r   i   e   s  sp   b nul dc4  sp   f
         6e69 7564 7473 6972 7365 e220 9480 6620
0017600   o   r  sp   e   x   a   m   p   l   e  sp   i   n  nl  sp  sp
         726f 6520 6178 706d 656c 6920 0a6e 2020

which is a proper em dash (U+2014) like you describe, 0xe2 0x80 0x94.
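
As a cross-check on both byte sequences, here is a rough sketch that decodes a three-byte UTF-8 sequence by hand. It also shows why the od -a rows print these bytes as "b soh em" and "b nul dc4", on the assumption that od -a ignores the high bit of each byte. The function name decode_utf8_3byte is illustrative only.

```python
def decode_utf8_3byte(b0, b1, b2):
    """Decode one three-byte UTF-8 sequence (1110xxxx 10yyyyyy 10zzzzzz) to a code point."""
    return ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F)

five_dot = decode_utf8_3byte(0xE2, 0x81, 0x99)
em_dash = decode_utf8_3byte(0xE2, 0x80, 0x94)
print("U+%04X" % five_dot, "U+%04X" % em_dash)

# Clearing the high bit gives the 7-bit values that od -a names:
# 0x62 is 'b', 0x01 is soh, 0x19 is em, 0x00 is nul, 0x14 is dc4.
five_dot_7bit = [b & 0x7F for b in (0xE2, 0x81, 0x99)]
em_dash_7bit = [b & 0x7F for b in (0xE2, 0x80, 0x94)]
print(five_dot_7bit, em_dash_7bit)
```

The two sequences differ only in the second and third bytes, which is why they look so similar in the dump but render as different characters.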
Received on Sunday, 18 October 2009 06:29:57 GMT