[Bug 11540] The willful violation clause is most unwise. A standard should not violate another standard for any reason. This would lead to 2 things: 1) Correctly encoded content would never be displayed correctly. 2) All future standards would need to includ

http://www.w3.org/Bugs/Public/show_bug.cgi?id=11540

--- Comment #10 from Shelley Powers <shelleyp@burningbird.net> 2010-12-14 03:55:14 UTC ---
(In reply to comment #9)
> (In reply to comment #8)
> > I'm not sure it is appropriate for any of us to tell each other we're off-topic
> > or not.
> 
> "Only one issue—please use separate bugs for separate issues."
> 
> http://dev.w3.org/html5/decision-policy/decision-policy.html
> 
> > If Laura is concerned about the phrase "willful violation", then
> > hearing more details about what drives the use of this phrase in this
> > bug could then lead her to decide against posting another bug, or to
> > post a bug that is more likely to generate a useful response.
> 
> Optimising some theoretical other bug is a poor rationale for swamping
> discussion in _this_ bug.
> 
> > The original bug is fairly generic. The example seems to be more of
> > an example of one specific mention of the "willful violation".
> 
> The rationale might be potentially applicable to other willful
> violations, but the report applied it to /a/ clause (singular), not
> multiple clauses. It doesn't say anything about it being a mere example.

Again, we can't tell for sure, since the beginning of the bug implies
generality while the rest of the bug provides a specific example. This could
be as much a problem with the type of bug reporting used, which is based on
reporting a problem at a specific point in the document.


> 
> > So, erring on the side of question, the bug could be broken into two parts:
> > 
> > Is the use of willful violation justified? 
> 
> The bug report posits /a priori/ that willful violations can never be
> justified. I think that's an indefensible position since, while it's
> reasonable to expect groups working on different standards to try to
> work together:
>

Confused. You rejected the idea that this bug is generalized around willful
violations, but then proceed to defend generalized willful violations as a
design and decision paradigm.


>     1. It's ultimately unrealistic to expect a group in charge of
> formulating Standard X to be able to force a group in charge of
> formulating Standard Y to reformulate Standard Y as required for the
> target audience of Standard X.
>

But it is realistic to expect the group in charge of formulating standard X to
cooperate to every extent possible in ensuring there are no incompatibilities
between it and Standard Y, because said incompatibilities will eventually, most
likely, cause problems. 

>     2. It's inhuman to expect the group in charge of formulating
> Standard X to sacrifice the human needs of its target audience (e.g.
> access to information and services over the world wide web,
> protection of their privacy and security) on the altar of technical
> consistency with Standard Y.
>

Inhuman? Odd phrase. Puppy mills are inhuman. 

Technology sometimes requires us to compromise: sometimes you have to adapt in
the short term, to benefit in the long run. 

I've found over the years that inconsistencies generate more security problems,
and, overall, generate more of every other kind of problem. I don't easily
disregard inconsistencies. 


> To put it another way: free agents are free agents. :)
> 
> Do you have any arguments or information to add on this?
> 

Free agents are free agents? No, nothing to add to this.

> > Is this specific use of willful justification justified?
> 
> This is always a good question to ask. :)
>

Well, it should have read "willful violation justified"...

> I claim no expertise in the subject of character encodings, so take the
> following with a pinch of salt.
>

I'll take only a fraction of your pinch of salt. I also know very little on the
topic.

> HTML5 character mappings need to enable access to the deployed web
> corpus interoperably with major user agents.
> 
> Not least of the advantages of standardizing such mappings is to help
> protect users from security problems like:
> 
> http://shiflett.org/blog/2005/dec/google-xss-example
> 
> http://code.google.com/p/chromium/issues/detail?id=15701
> 
> For general background see:
> 
> Web encodings page on the WHATWG wiki:
> 
> http://wiki.whatwg.org/wiki/Web_Encodings
> 
> "Internal character encoding declaration" thread at WHATWG
> 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html
> 
> "Superset encodings" thread at WHATWG
> 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
> 
> "charset name matching rules" thread at W3C:
> 
> http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
> 
> Some test cases:
> 
> http://hsivonen.iki.fi/test/wa10/encoding-detection/
> 
> http://www.hixie.ch/tests/adhoc/html/parsing/encoding/all.html
> 
> http://coq.no/character-tables/en
> 
> I've taken the trouble to search the archives for rationales specific
> to each violation. I make no guarantee that this information is complete
> or accurate; read the links and make up your own minds.
> 
> "Popular browsers" here is shorthand for the big four engines (Trident,
> Gecko, WebKit, Presto).
> 
>    * Popular browsers and Google Web Search map EUC-KR to Windows-949.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://code.google.com/p/chromium/issues/detail?id=15701
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers map EUC-JP to CP51932.
>      http://www.w3.org/Bugs/Public/show_bug.cgi?id=7444
>      http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-September/023208.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
> 
>    * Popular browsers (but not Google Web Search) map GB2312 and GB_2312-80 to
>      the superset GBK.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-March/014219.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/020846.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html 
> 
>    * Popular browsers and Google Web Search map ISO-8859-1 to the superset
>      windows-1252.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-March/006000.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007737.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2006-November/007882.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011650.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01835.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
> 
>    * WebKit and Google Web Search map ISO-8859-9 to the superset
>      windows-1254. Adopting this behavior has support from an Opera rep.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011648.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0047.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Aug/0041.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map ISO-8859-11 to the
>      superset windows-874.
>      http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map TIS-620 to the
>      superset windows-874.
>      http://lists.w3.org/Archives/Public/public-html/2008Mar/0183.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
> 
>    * Popular browsers and Google Web Search map KS_C_5601-1987 to windows-949.
>      http://lists.w3.org/Archives/Public/ietf-charsets/2001AprJun/0030.html
>      http://lists.w3.org/Archives/Public/www-archive/2008Jun/0155.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-July/021207.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
> 
>    * Popular browsers and Google Web Search map Shift_JIS to its superset
>      Windows-31J.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-April/019322.html
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      Note ongoing discussion at the IETF:
>      http://mail.apps.ietf.org/ietf/charsets/msg01942.html
> 
>    * Popular browsers and Google Web Search map TIS-620 to its superset
>      windows-874. WebKit S60 made this change back in 2006 because of a
>      bug report.
>      http://trac.webkit.org/changeset/15974
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2007-June/011651.html
>      http://lists.w3.org/Archives/Public/public-html-comments/2009Sep/0050.html
>      http://wiki.whatwg.org/wiki/Web_Encodings
>      http://mail.apps.ietf.org/ietf/charsets/msg01834.html
>      http://trac.webkit.org/browser/trunk/WebCore/platform/text/TextCodecICU.cpp
>      http://www.opera.com/docs/specs/presto27/encodings/
> 
>    * Opera, Firefox, Safari, and Google Web Search map US-ASCII to its
>      superset windows-1252, while IE7 drops the high bit. Ian judged the
>      latter behavior to be a security risk.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-July/015455.html
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-September/016170.html
> 
>    * Popular browsers map UTF-16 without BOM to LE. Content found in the wild
>      depends on this behavior.
>      http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-June/020552.html
> 
> Fixing these willful violations by pushing them upstream into the IANA
> registry is non-trivial.
> 
> Consider the problem of globally mapping Shift_JIS to windows-31J at the
> registry level, as expressed by a Microsoft rep:
> 
> "Problem is that there are 4+ implementations of shift_jis in 'common'
> use, and none of them are likely to change, since it'd break their
> customers. :(
> 
> "So I don't see a perfect solution here.  HTML5 is fairly clear about
> browser behavior, but in other environments, I think the best we can do
> is point to the variants and allow the clients to decide which version
> they'd like to use."
> 
> http://mail.apps.ietf.org/ietf/charsets/msg01966.html
> 
> Once the principle of munging encodings is accepted, there's clearly
> room for updating the details based on new data. Do you have any
> new data to add?
> 
> Can you persuade major user agent vendors to commit to a different
> implementation strategy than the one described in the spec?

There's a difference between accepting a technical decision because it is the
best, and accepting one because some vendors hold us hostage. 

See, now this is why further discussion is good, as you've provided a great
deal of information; much more so than the original quick rejection of the bug. 
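
As a rough illustration of what the quoted list describes, the label
overrides can be sketched as a lookup table applied before decoding. This is
a minimal sketch in Python, not how any browser implements it. The helper
names (OVERRIDES, effective_codec, decode_like_browser, strip_high_bit) are
hypothetical, and the Python codec names on the right-hand side (cp949 for
Windows-949, cp932 for Windows-31J, and so on) are assumed stand-ins for the
encodings named in the list. The EUC-JP and BOM-less UTF-16 entries are left
out.

```python
# Hypothetical override table built from the quoted list. Keys are
# lowercased labels; values are Python codec names assumed to be
# reasonable stand-ins for the superset encodings.
OVERRIDES = {
    "euc-kr": "cp949",
    "ks_c_5601-1987": "cp949",
    "gb2312": "gbk",
    "gb_2312-80": "gbk",
    "iso-8859-1": "cp1252",
    "iso-8859-9": "cp1254",
    "iso-8859-11": "cp874",
    "tis-620": "cp874",
    "shift_jis": "cp932",
    "us-ascii": "cp1252",
}


def effective_codec(label):
    """Return the codec to use for a declared label, applying overrides."""
    key = label.strip().lower()
    return OVERRIDES.get(key, key)


def decode_like_browser(data, label):
    """Decode bytes using the overridden codec; bad bytes become U+FFFD."""
    return data.decode(effective_codec(label), errors="replace")


def strip_high_bit(data):
    """Model the 'drop the high bit' handling of US-ASCII described above."""
    return bytes(b & 0x7F for b in data).decode("ascii")


if __name__ == "__main__":
    quoted = b"\x93hi\x94"
    # Strict ISO-8859-1 yields C1 control characters for 0x93 and 0x94.
    print(repr(quoted.decode("latin-1")))
    # The windows-1252 override yields curly quotation marks instead.
    print(repr(decode_like_browser(quoted, "ISO-8859-1")))

    risky = b"\xbcscript\xbe"
    # Dropping the high bit turns 0xBC and 0xBE into '<' and '>'.
    print(repr(strip_high_bit(risky)))
    # The windows-1252 override keeps them as harmless fraction signs.
    print(repr(decode_like_browser(risky, "US-ASCII")))
```

The last two calls suggest why dropping the high bit was judged a security
risk: bytes that pass an ASCII-only filter for "<" and ">" can turn into
markup after the high bit is removed.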

I have started to access some of the material, but stopped when the Hixie test
case link killed my browser. If I have time to go through the information and
feel I have anything further to add on this specific instance of willful
violation, I will do so.

Whether in general "willful violation" is a good principle on which to build a
sound standard, though, is still an issue.

Received on Tuesday, 14 December 2010 03:55:17 UTC