[csswg-drafts] Backslash & Yen sign behavior (#6848)

litherum has just created a new issue for https://github.com/w3c/csswg-drafts:

== Backslash & Yen sign behavior ==
In WebKit we have a bunch of code that replaces the U+005C REVERSE SOLIDUS (commonly known as "backslash") character with U+00A5 YEN SIGN, using the same mechanism that `text-transform` uses. We do this when either of the following conditions holds:

1. The font name is one of:
    - MS PGothic
    - MS PMincho
    - MS Gothic
    - MS Mincho
    - Meiryo
2. OR the encoding is one of:
    - x-mac-japanese
    - ISO-2022-JP
    - EUC-JP
    - Shift_JIS
    - Shift_JIS_X0213-2000
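
The two conditions above can be sketched roughly as follows. This is a minimal illustration, not WebKit's actual code: the function names, the case-insensitive matching, and the plain string replacement are all assumptions made for the example.

```python
# Hypothetical sketch of the "font name OR encoding" check described above.
# Names and matching rules are assumptions, not WebKit's implementation.
BACKSLASH = "\u005c"  # U+005C REVERSE SOLIDUS
YEN_SIGN = "\u00a5"   # U+00A5 YEN SIGN

YEN_FONT_FAMILIES = {
    "ms pgothic", "ms pmincho", "ms gothic", "ms mincho", "meiryo",
}
YEN_ENCODINGS = {
    "x-mac-japanese", "iso-2022-jp", "euc-jp", "shift_jis",
    "shift_jis_x0213-2000",
}


def should_show_backslash_as_yen(font_family: str, encoding: str) -> bool:
    """Return True if either the font name or the page encoding is listed."""
    return (
        font_family.strip().lower() in YEN_FONT_FAMILIES
        or encoding.strip().lower() in YEN_ENCODINGS
    )


def transform_for_display(text: str, font_family: str, encoding: str) -> str:
    """Swap U+005C for U+00A5 when the conditions hold; otherwise pass through."""
    if should_show_backslash_as_yen(font_family, encoding):
        return text.replace(BACKSLASH, YEN_SIGN)
    return text
```

For example, `transform_for_display("\\100", "Meiryo", "UTF-8")` swaps the backslash because the font matches, while the same text in an unlisted font on a UTF-8 page is left alone.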

We appear to be the only browser on Mac that does this. On Windows, browsers don't appear to do anything special, because those fonts have glyphs for the U+005C character that look like the yen sign; the behavior appears to be implemented in the fonts themselves.

Some background reading:
- http://archives.miloush.net/michkap/archive/2005/09/17/469941.html
- https://en.wikipedia.org/wiki/Backslash#Confusion_with_¥_and_other_characters
- http://archives.miloush.net/michkap/archive/2005/10/12/479561.html
- https://en.wikipedia.org/wiki/Yen_and_yuan_sign#Code_points
- https://twitter.com/UINT_MIN/status/1458309391711551489

There appear to be two similar yet distinct problems:
1. Shift JIS says the byte 0x5C is how the yen sign is represented, but for some reason text decoders don't turn that into U+00A5 in-memory. Fixing this might be tricky, because the backslash has semantic meaning within JavaScript sources.
2. Some fonts on Windows seem to think that U+005C visually looks like a yen sign.

These two problems sort of cancel each other out - if a text decoder turns the yen sign into the wrong in-memory character, but then a font is used which renders that character like a yen sign, the user gets what they expect. (Copy/paste doesn't work, a11y doesn't work, and find-in-page (probably) doesn't work, though.)
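
As a small illustration of the first problem and its side effects, the sketch below decodes the Shift JIS byte 0x5C with Python's built-in `shift_jis` codec, which (as far as I know) passes bytes below 0x80 through as ASCII, much like the decoders described above. The sample price string is made up for the example.

```python
# Illustration only: what a decoder that maps byte 0x5C to U+005C leaves in memory.
BACKSLASH = "\u005c"  # U+005C REVERSE SOLIDUS
YEN_SIGN = "\u00a5"   # U+00A5 YEN SIGN

# Bytes an author might have saved intending the price "¥100" (hypothetical sample).
raw = bytes([0x5C]) + b"100"

decoded = raw.decode("shift_jis")
first_code_point = f"U+{ord(decoded[0]):04X}"

# A search for the yen sign in the decoded text misses, even though a font with a
# yen-shaped glyph for U+005C would make the text *look* like it contains one.
yen_found = YEN_SIGN in decoded

print(first_code_point, yen_found)
```

The in-memory character is the backslash, so anything that works on the text rather than the glyphs (copying, searching, accessibility APIs) sees a backslash regardless of what the font draws.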

The reason WebKit has this special handling is that the fonts listed above don't exist on the Mac. If an author is writing their page on Windows and wants to type the yen sign in their source code, they might type the backslash character, and the page would appear to work on Windows. Then, when someone else visits the page on a Mac, font fallback occurs, we use a different font to render the content, and their character shows up as a backslash, which isn't what they wanted. So, we "fixed this" by just magically turning the backslash into a yen sign in-memory, because we're trying to be helpful.

We (the web platform) should determine what to do here. WebKit appears to be the only browser which tries to be helpful like this. We added this handling before 2007, way before the Blink fork, and Blink no longer seems to have this behavior, so presumably they intentionally deleted it.

Should other browsers try to be helpful like WebKit? Should WebKit stop trying to be helpful? Should we ask Microsoft to change these glyphs in their fonts? Maybe the problem has mostly alleviated itself in the at least 14 years since it was last investigated, and WebKit can just delete its special handling?

(This issue might belong better in different standards groups, but I don't know which ones, so I'm starting it here and I can migrate it as necessary.)

Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/6848 using your GitHub account



Received on Wednesday, 1 December 2021 01:27:17 UTC