RE: [Bug 27851] Add MS932 as a label of Shift_JIS

After reviewing our behavior, I don't like this change.  The new label wouldn't be recognized by Windows, certainly not by existing versions, so in a large number of cases adding it is unlikely to have much immediate impact.

The better fix would be to correctly tag the data as shift_jis.
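As a rough illustration of fixing the tag on the producing side, here is a minimal sketch in Python. The names PRODUCER_FIXUPS and content_type_header are hypothetical, and the single table entry is an assumption about what such a producer would need; it is not taken from any real server.

```python
# Hypothetical producer-side fix-up table: maps a platform-specific
# charset name to a label that consumers already recognize.
PRODUCER_FIXUPS = {
    "ms932": "Shift_JIS",
}


def content_type_header(mime_type, internal_charset):
    """Build a Content-Type value, replacing known nonstandard labels."""
    label = PRODUCER_FIXUPS.get(internal_charset.lower(), internal_charset)
    return f"{mime_type}; charset={label}"
```

With this in place the page is tagged with a label that already works everywhere, and no consumer needs a new alias.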

To get into this state, the developer may as well have used "MSCP932".  So would we add MSCP932 if someone starts misusing that?

Adding MS932 implies that it is a valid encoding label.  Whoever is currently using it probably doesn't see any errors, for whatever reason, but it won't work on millions of machines all over the world.  Right now their page is "broken" and they can fix it by choosing a better label.  If the label were added, the page would no longer look "broken" to them, yet it still wouldn't work elsewhere, and they'd have a harder time troubleshooting it.

We've been trying to encourage use of Unicode and are very reluctant to add new aliases.

-Shawn 

-----Original Message-----
From: Shawn Steele [mailto:Shawn.Steele@microsoft.com] 
Sent: Monday, January 19, 2015 1:21 PM
To: bugzilla@jessica.w3.org; www-international@w3.org
Subject: RE: [Bug 27851] Add MS932 as a label of Shift_JIS

I'd like to check to see if we'd recognize it.  I'm concerned that if we add aliases that we don't recognize, then if the label leaks into some other context it won't work.

-Shawn

-----Original Message-----
From: bugzilla@jessica.w3.org [mailto:bugzilla@jessica.w3.org] 
Sent: Monday, January 19, 2015 8:15 AM
To: www-international@w3.org
Subject: [Bug 27851] Add MS932 as a label of Shift_JIS

https://www.w3.org/Bugs/Public/show_bug.cgi?id=27851


Anne <annevk@annevk.nl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jsbell@google.com,
                   |                            |jshin@chromium.org

--- Comment #4 from Anne <annevk@annevk.nl> ---
(I misremembered the issue in comment 1 as shift-jis is clearly a known label. It was euc_jp getting recognized as euc-jp I think.)

Those pages that have Content-Type: text/html;charset=MS932 would actually be slightly better off, as we would know the encoding for certain and would no longer have to scan for it in the HTML.
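The difference can be sketched in Python as below. This is only an illustration under stated assumptions: LABELS is a small hand-picked subset of Shift_JIS and EUC-JP labels rather than the full table, and charset_param and resolve are hypothetical helper names, not any browser's actual code. A result of None stands for "label unknown, fall back to scanning the HTML".

```python
# Illustrative subset of a label table (assumption: not the complete list).
LABELS = {
    "shift_jis": "Shift_JIS",
    "shift-jis": "Shift_JIS",
    "sjis": "Shift_JIS",
    "windows-31j": "Shift_JIS",
    "x-sjis": "Shift_JIS",
    "euc-jp": "EUC-JP",
}


def charset_param(content_type):
    """Return the raw charset parameter of a Content-Type value, or None."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"')
    return None


def resolve(content_type, labels):
    """Map the header's charset label to an encoding name.

    Returns None when there is no label or the label is not in the table,
    in which case the caller would have to scan the document instead.
    """
    label = charset_param(content_type)
    if label is None:
        return None
    return labels.get(label.strip().lower())


header = "text/html;charset=MS932"
before = resolve(header, LABELS)                            # label unknown
after = resolve(header, {**LABELS, "ms932": "Shift_JIS"})   # label added
```

Here `before` is None, so the encoding has to be found by scanning, while `after` resolves directly from the header.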

Thanks, I guess we should add it. Anyone see any good reason not to do it?


Received on Monday, 19 January 2015 22:09:25 UTC