RE: [Bug 27851] Add MS932 as a label of Shift_JIS from Shawn Steele on 2015-01-20 (www-international@w3.org from January to March 2015)

From: Shawn Steele <Shawn.Steele@microsoft.com>
Date: Tue, 20 Jan 2015 20:07:33 +0000
To: Anne van Kesteren <annevk@annevk.nl>
CC: "bugzilla@jessica.w3.org" <bugzilla@jessica.w3.org>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <CY1PR0301MB07318BDC4EF1743A607C84F9824B0@CY1PR0301MB0731.namprd03.prod.outlook.>

I'm pretty torn on the MS932 thing.  To me, making the bad behavior "standard" feels like encouraging it.  These pages would likely be better served by fixing the pages themselves.  (Yeah, I know that's hard.)  Otherwise, if it's made legal, the authors will see that it works (for them) and is now "standard", so they'll have no incentive to fix it, and it'll remain broken on a huge long tail of legacy un-updated machines.  The best case would be to have their tooling fix it for them, and/or notify them that it was wrong so they can fix the rest of their content.
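To make concrete what "adding a label" means, here is a minimal sketch of a label lookup.  The table and the function name `encoding_for_label` are made up for illustration and are not the normative list from any specification; "ms932" is included only to show the effect of the proposal, and "mscp932" stands in for a label nobody has registered.

```python
# Illustrative label table (an assumption for this sketch, not a normative list).
# Each key is a label a page might declare; each value is the encoding it selects.
LABELS = {
    "shift_jis": "Shift_JIS",
    "sjis": "Shift_JIS",
    "windows-31j": "Shift_JIS",
    "ms932": "Shift_JIS",  # the label under discussion in this bug
    "utf-8": "UTF-8",
}


def encoding_for_label(label):
    """Return the encoding name for a declared label, or None if it is unknown.

    Labels are matched case-insensitively after trimming surrounding whitespace.
    """
    return LABELS.get(label.strip().lower())


if __name__ == "__main__":
    for declared in ("MS932", " Shift_JIS ", "mscp932"):
        print(repr(declared), "->", encoding_for_label(declared))
```

With a table like this, a page declaring "MS932" decodes as Shift_JIS, while one declaring "mscp932" gets no match and falls back to whatever the client does for unknown labels.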

On the unrelated Japanese question, I'm not sure, and I'm pretty sure it isn't well documented :)  IE uses our (deprecated) MLang code page detection stuff: http://msdn.microsoft.com/en-us/library/ie/aa740986(v=vs.85).aspx - however, I think they're moving away from that and toward "just" using what the page says it is (because detection can't guess perfectly).  I'm not directly involved though, so my information may be out of date.

To further randomize, what are other folks doing about Zawgyi-encoded Burmese text?  It's mangled Unicode that is typically incorrectly declared as Unicode, so it only works with the fonts that expect the non-Unicode encoding.

-Shawn

-----Original Message-----
From: Anne van Kesteren [mailto:annevk@annevk.nl] 
Sent: Tuesday, 20 January, 2015 02:08 a.m.
To: Shawn Steele
Cc: bugzilla@jessica.w3.org; www-international@w3.org
Subject: Re: [Bug 27851] Add MS932 as a label of Shift_JIS

On Mon, Jan 19, 2015 at 11:08 PM, Shawn Steele <Shawn.Steele@microsoft.com> wrote:
> So would we add MSCP932 if someone starts misusing that?

Depends on how widespread the usage is and I would expect new pages to use utf-8. Any usage of "mscp932" should have turned up by now as it would be a competitive disadvantage for other browsers not to support that in case of widespread usage.

Somewhat related question while we're here: how much sniffing does Internet Explorer do for Japanese? And are the fine details documented someplace? (More general sniffing information would also be welcome. It seems we can probably get away with most heuristics, but there are still a few that browsers employ that seem hard to kill.)


--
https://annevankesteren.nl/

Received on Tuesday, 20 January 2015 20:08:14 UTC