- From: Andrew Cunningham <andrewc@vicnet.net.au>
- Date: Mon, 12 Oct 2009 12:59:23 +1100
- To: Larry Masinter <masinter@adobe.com>
- CC: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Ian Hickson <ian@hixie.ch>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Phillips, Addison" <addison@amazon.com>, Richard Ishida <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
- Message-ID: <4AD28D7B.7060602@vicnet.net.au>
*shrugs* As far as I can tell, it's something that shouldn't be defined by the developers, but rather by the localisation teams, who choose a suitable default encoding for the particular UI locale they are developing.

Larry Masinter wrote:
> Can someone please explain, again, why the discussion of default
> configurations of a particular category of user agent in various
> regions belongs in the definition of the HyperText Markup Language?
>
> What benefit can any author of a web page derive, please, from
> knowing what the default settings of various browsers in products
> sold into various language environments?
>
> What benefits to the Internet, the Web, to anyone else, is there
> in specifying what the default configuration should be for various
> "demographics", independent of the actual user's language and
> preference? Does it help a Kenyan who brings a laptop for use
> by his Egyptian wife living in Finland?
>
> What is going on here?
>
> Thanks,
>
> Larry
> --
> http://larry.masinter.net
>
>
> -----Original Message-----
> From: public-html-request@w3.org [mailto:public-html-request@w3.org] On Behalf Of Leif Halvard Silli
> Sent: Sunday, October 11, 2009 4:57 PM
> To: Ian Hickson
> Cc: "Martin J. Dürst"; Phillips, Addison; Andrew Cunningham; Richard Ishida; public-html@w3.org; public-i18n-core@w3.org
> Subject: Re: HTML5 Issue 11 (encoding detection): I18N WG response...
>
> Ian Hickson On 09-10-11 21.23:
>
>
>> On Sun, 11 Oct 2009, Leif Halvard Silli wrote (reordered):
>>
>>> The choice of character set - alphabet - for instance, has always been a
>>> political matter, and still is.
>>>
>> Ok, then it seems sensible to use a political way of speaking to refer to
>> the choice of alphabet.
>>
>
>
> We do not choose alphabet every day. Day to day, the right to use
> the alphabet that your language requires is what matters. And
> ditto language is required to express that.
>
>
>>> "Western this-and-that" is predominantly a political way of speaking.
>>>
>> Good, then it is appropriate terminology.
>>
>
>
> Appropriate for what? Diplomatic language is political and
> accurate, yet tries to avoid contested political phrasings.
>
> "Western European Language [environments]" as Addison suggested is
> a reasonable neutral term, btw, despite use of "Western". It also
> gives the reader much more hints about what the politics involved ...
>
> Western demographics, OTOH ... You mentioned Africa: Egypt was a
> colony once. So was Kenya. Why does Kenya have a Western
> demographic, but Egypt not?
>
>
>>> Therefore it is wrong to use a wording that causes readers to think in
>>> political terms.
>>>
>> But you agree that it _is_ a political matter.
>>
>
>
> Which "it" are you referring to now?
>
>
>>> It is wrong to nourish the thought that if some population changes to
>>> use an alphabet which is covered by Win1252, that they then will start
>>> to belong to the "Western demographics".
>>>
>> It doesn't matter if a population _changes_ to use an alphabet which is
>> covered by 1252, because that will only affect future pages, not legacy
>> pages, and it is only legacy pages we are concerned about.
>>
>
> I see the logic, but I wonder how you can take any outcome for granted.
> I don't know what is default in Azerbaijan today ...
>
>
>> What phrase best approximates the areas of the world where _today_ UAs are
>> shipping with a 1252 default encoding?
>>
>
>
> "Western demographics" is a term that leaves the job of finding
> out which those areas are to the reader, anyhow.
>
> If you want to give better hints, then you could speak about "the
> British commonwealth, predominantly English, French, Spanish and
> Portuguese speaking demographics, demographics that were
> alphabetized as Western colonies, earlier colonies of France,
> Belgium, England, Spain, Portugal" - etc. You should of course add
> that "the list is not exhaustive".
>
> You could also say "demographics using the Latin alphabet covered
> by ASCII plus the letters ŠŒŽÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÚÛÜÝÞß". You
> may say that this is circular. But at least it can help
> implementors find the answer.
>
> You could also list the names of the different Latin alphabets
> that are considered covered by Win1252: the ASCII alphabet, German
> alphabet(s), Scandinavian, etc. See Wikipedia:
>
> http://en.wikipedia.org/wiki/Latin-derived_alphabet
> http://en.wikipedia.org/wiki/Basic_modern_Latin_alphabet
>
> You could also say "demographics covered by the Latin alphabet,
> except the following and other countries, which use letters that
> are not covered by Win1252: Turkey, Croatia, Azerbaijan etc etc"
>
>
>>> Does Croatia belong to "Western demographics", for instance? Why? And why
>>> not? The Croatian alphabet is not covered by Win1252. What about Serbia?
>>> Serbia uses both Cyrillic and Latin side by side.
>>>
>> What default encodings do browsers use in those areas?
>>
>
>
> I don't know. I just know that Win1252 doesn't cover the Croatian
> alphabet. And I have also gotten the impression that it is a
> problem that - if using one's own alphabet is seen as the normal
> thing - software may not default to a charset using the local
> alphabet.
>
>
>>> As you can see, "Western demographics" is a wording that - depending on
>>> how you define "Western" - covers both narrower and wider than e.g.
>>> "writing systems covered by Win1252".
>>>
>> Is there a better term that would more accurately refer to the areas of
>> the world where a UA needs to ship with a Win1252 default encoding?
>>
>
>
> See above. And below.
>
>
>>> For example you could say "For demographics that are covered by what in
>>> user agents and e-mail applications are typically known as "Western" or
>>> "West European" encodings, then Win1252 is the best default".
>>>
>> That's circular logic ("Use Win1252 as a default for demographics where
>> Win1252 is the default").
>>
>
>
> To say that "Win1252" is the default for those areas which are
> covered by what is referred to as "Western encodings", is not a
> circular argument.
>
> But your focus appears to be *areas*. And from that point of view
> I can see why you think it is circular.
>
> But I thought that it was more relevant for implementors to know
> that Win1252 is considered the default for wherever "Western
> Encodings" are useful, than it is for them to know that there
> apparently exists a secret Union of Windows 1252 Countries ...
>
> However, I just now looked in Firefox to see what it meant by
> Western, and found, under "West European", both Greek and
> "Western" encodings ...
>
> I suppose that Win1252 isn't the default encoding in Greece?
>
> Proves that "Western" is a very imprecise term.
>
>
>> The point is to be able to give implementation
>> advice that is useful independent of the implementor performing any
>> reverse engineering, studying of other user agents, etc.
>>
>
> It doesn't require "reverse engineering" to find out the language
> of a population, does it? What's really needed, if you want to do
> a good job, is to visit that country and observe and judge.
>
> The issue of reverse engineering is, however, connected to what I
> said above about "Win1252" being the default for areas
> covered by "Western encodings".
>

--
Andrew Cunningham
Senior Manager, Research and Development
Vicnet
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000

Ph: +61-3-8664-7430
Fax: +61-3-9639-2175

Email: andrewc@vicnet.net.au
Alt email: lang.support@gmail.com

http://home.vicnet.net.au/~andrewc/
http://www.openroad.net.au
http://www.vicnet.net.au
http://www.slv.vic.gov.au
Received on Monday, 12 October 2009 02:00:12 UTC