- From: <bugzilla@jessica.w3.org>
- Date: Tue, 01 Jul 2014 10:51:48 +0000
- To: www-international@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=23646

--- Comment #40 from Henri Sivonen <hsivonen@hsivonen.fi> ---
(In reply to Jirka Kosek from comment #39)
> But are there any pages using us-ascii encoding in a wild?

It would be extremely surprising if there weren't.

> If no, then there
> is no problem with having different aliases for decoding/encoding.

As noted in my previous comment, when e.g. submitting a form, browsers use the
encoding of the submitting document. The document stores the identity of the
encoding. It doesn't store the original label, so you don't have a chance to
re-resolve the label according to a different mapping.

As for the TextEncoding API, it doesn't support non-UTF-* encodings anyway, so
the issue of "us-ascii" is moot.

> > > In ideal world yes, but when you have other constraints and you know that
> > > receiver can handle us-ascii then why it should be broken?
> >
> > What "other constraints"?
>
> For example 15 years old POS terminal with no UTF-8 support.

Without UTF-8 support, they can't have conforming XML support. It's not the
Encoding Standard's problem to accommodate XML interchange with fundamentally
XML-non-conforming legacy systems.

> > If you know what the receiver can handle, you don't need specs to bless your
> > bilateral arrangement.
>
> If I'm asking encoder to produce us-ascii output I'm not expecting getting
> bytes with value larger then 127 in my output.

The point where things go wrong is asking an encoder to produce something other
than UTF-8. :-)

> > > Please note that the
> > > Encoding Standard changes how us-ascii encoding behaved in the past, so this
> > > change must be justified and well reasoned.
> >
> > Citation-needed for the Encoding Standard describing a change compared to
> > pre-Encoding Standard browser behavior.
>
> I think that definition of US-ASCII is pretty clear, it's 7-bit encoding.

I said "browser behavior"--not (de jure) "definition".
> I'm talking about us-ascii in general not only in browsers because the
> Encoding Standard seems to apply to everything, not only to browsers. If the
> scope is narrowed to browsers only, then do as you wish. But it would be
> silly to have two different definitions of us-ascii -- one for browsers and
> second for other environments.

I think we should focus the spec on the Web Platform--i.e. browsers. As other
systems find the need to consume Web content, they'll eventually grow Encoding
Standard-compliant encoding subsystems.

It's clear that there exist encoding libraries whose label handling is
IANA-oriented. Those will probably stick around for a long time for
compatibility with their old selves. It's unfortunate that the Web behavior and
e.g. the IANA-oriented JDK behavior differ, but we should just admit the
existence of two different legacies and not try to mix e.g. the JDK legacy into
Web specs.
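To make the two label legacies concrete, here is a minimal Python sketch. It rests on several assumptions: `WEB_LABELS`, `resolve_web_label` and `Document` are hypothetical names invented for illustration, the label table is a tiny excerpt of the Encoding Standard's list, and Python's `codecs` registry stands in for an IANA-oriented library (the comment above names the JDK, not Python). It is not how any browser is implemented.

```python
import codecs

# Hypothetical, heavily abbreviated label table. The entries shown match the
# Encoding Standard's label list, which has many more labels than this.
WEB_LABELS = {
    "us-ascii": "windows-1252",
    "ascii": "windows-1252",
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "windows-1252": "windows-1252",
    "utf-8": "utf-8",
    "utf8": "utf-8",
}

ASCII_WHITESPACE = "\t\n\x0c\r "


def resolve_web_label(label):
    """Resolve a label Web-style; return None for labels not in the table."""
    return WEB_LABELS.get(label.strip(ASCII_WHITESPACE).lower())


class Document:
    """Hypothetical stand-in for a browser document.

    It keeps only the resolved encoding identity; the original label is gone.
    """

    def __init__(self, declared_label):
        resolved = resolve_web_label(declared_label)
        if resolved is None:
            raise ValueError("label not in this sketch's table: " + declared_label)
        self.encoding = resolved

    def encode_form_value(self, value):
        # Form submission uses the document's encoding, not the declared label.
        # Python's cp1252 codec only approximates the Encoding Standard's
        # windows-1252, but the two agree for the character used below.
        return value.encode(self.encoding)


doc = Document("US-ASCII")
print(doc.encoding)                          # windows-1252
print(doc.encode_form_value("\u00e9"))       # b'\xe9', a byte above 127

# IANA-oriented legacy: Python's codec registry treats us-ascii as 7-bit ASCII.
print(codecs.lookup("us-ascii").name)        # ascii
try:
    "\u00e9".encode("us-ascii")
except UnicodeEncodeError as err:
    print("strict 7-bit encoder refuses:", err.reason)
```

Because `Document` keeps only `encoding`, nothing at submission time can tell whether the page was declared as "us-ascii" or "windows-1252", so the label cannot be re-resolved with a different mapping for encoding.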
Received on Tuesday, 1 July 2014 10:51:50 UTC