The confusion between U+2014 and U+2015.

On 9/28/2011 11:22 AM, Koji Ishii wrote:
> I don't know how it was determined, but EM DASH in Japanese and Korean legacy encodings is mapped to U+2015, not to U+2014.


Here is my theory:

The character 1-1-29 is listed on page 69 of JIS X 0213:2000 as 
corresponding to “U+2015 EM DASH”. However, U+2015 is actually 
HORIZONTAL BAR, and EM DASH is U+2014. This is clearly a typo for 
“U+2014 EM DASH”: the correct code point is used on page 458, and the 
correspondence with the name EM DASH is also visible on page 321. The 
connection between 1-1-29 and the name EM DASH also appears in 
JIS X 0208:1997, page 45. Finally, JIS X 0213:2004 gives the proper 
correction (page 21).
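
For anyone who wants to check the name/code point pairing without the 
printed standards, here is a small sketch using Python's standard 
unicodedata module. The helper name describe is mine, purely for 
illustration; it only reports what the Unicode Character Database says 
and tells us nothing about the JIS tables themselves.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using the Unicode Character Database."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


if __name__ == "__main__":
    # The two code points being confused in this thread.
    for cp in (0x2014, 0x2015):
        print(describe(cp))
```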

I suspect that somebody built a mapping by reading JIS X 0213:2000, 
looking only at the code points and not at the character names.

The practical bottom line, in any case, is that we need to treat U+2014 
and U+2015 as equivalent in Japanese contexts.
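
One possible way to get that equivalence, as a rough sketch only: fold 
one of the two characters into the other before comparing or matching 
text. Folding U+2015 into U+2014 (rather than the reverse) is my 
assumption here, and the names fold_dashes and 
same_in_japanese_context are hypothetical, not taken from any spec or 
library.

```python
EM_DASH = "\u2014"
HORIZONTAL_BAR = "\u2015"

# Assumption: U+2014 is chosen as the canonical form; the reverse would work equally well.
_FOLD_TABLE = str.maketrans({HORIZONTAL_BAR: EM_DASH})


def fold_dashes(text: str) -> str:
    """Replace every U+2015 with U+2014 so the two compare as equal."""
    return text.translate(_FOLD_TABLE)


def same_in_japanese_context(a: str, b: str) -> bool:
    """Compare two strings, treating U+2014 and U+2015 as the same character."""
    return fold_dashes(a) == fold_dashes(b)


if __name__ == "__main__":
    legacy = "東京\u2015大阪"   # as produced by a mapping that uses U+2015
    modern = "東京\u2014大阪"   # as typed with U+2014
    print(same_in_japanese_context(legacy, modern))
```

Whether the folding should happen at decode time, at comparison time, 
or in font selection is a separate question; the sketch only shows the 
comparison case.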

Eric.

Received on Wednesday, 28 September 2011 23:06:44 UTC