- From: Andrew Cunningham <andrewc@vicnet.net.au>
- Date: Mon, 12 Oct 2009 12:59:23 +1100
- To: Larry Masinter <masinter@adobe.com>
- CC: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>, Ian Hickson <ian@hixie.ch>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, "Phillips, Addison" <addison@amazon.com>, Richard Ishida <ishida@w3.org>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
- Message-ID: <4AD28D7B.7060602@vicnet.net.au>
*shrugs*
As far as I can tell, it's something that shouldn't be defined by the
developers, but rather by the localisation teams, who choose a
suitable default encoding for the particular UI locale they are developing.
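
One rough way a localisation team could sanity-check a candidate default is to test whether the encoding actually covers the letters of the locale's alphabet. The following is only a minimal sketch in Python, using the Croatian letters raised further down the thread. The helper name `uncovered` and the constant `CROATIAN_EXTRA` are made up for illustration, and `cp1252` / `cp1250` are Python's codec names for Win1252 and Win1250. Coverage alone doesn't decide what a locale's default should be; it only rules candidates out.

```python
def uncovered(letters: str, encoding: str) -> list[str]:
    """Return the characters in `letters` that `encoding` cannot represent."""
    missing = []
    for ch in letters:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# Croatian letters beyond basic ASCII: č ć đ š ž Č Ć Đ Š Ž
# (written as escapes so that Đ, U+0110, is not confused with Ð, U+00D0)
CROATIAN_EXTRA = "\u010d\u0107\u0111\u0161\u017e\u010c\u0106\u0110\u0160\u017d"

if __name__ == "__main__":
    for codec in ("cp1252", "cp1250"):
        print(codec, "cannot encode:", uncovered(CROATIAN_EXTRA, codec))
```

Run against these letters, the check reports š and ž (and their capitals) as present in Win1252 but č, ć and đ as missing, while Win1250 covers all of them.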
Larry Masinter wrote:
> Can someone please explain, again, why the discussion of default
> configurations of a particular category of user agent in various
> regions belongs in the definition of the HyperText Markup Language?
>
> What benefit can any author of a web page derive, please, from
> knowing what the default settings of various browsers are in products
> sold into various language environments?
>
> What benefit to the Internet, the Web, or to anyone else is there
> in specifying what the default configuration should be for various
> "demographics", independent of the actual user's language and
> preference? Does it help a Kenyan who brings a laptop for use
> by his Egyptian wife living in Finland?
>
> What is going on here?
>
> Thanks,
>
> Larry
> --
> http://larry.masinter.net
>
>
> -----Original Message-----
> From: public-html-request@w3.org [mailto:public-html-request@w3.org] On Behalf Of Leif Halvard Silli
> Sent: Sunday, October 11, 2009 4:57 PM
> To: Ian Hickson
> Cc: "Martin J. Dürst"; Phillips, Addison; Andrew Cunningham; Richard Ishida; public-html@w3.org; public-i18n-core@w3.org
> Subject: Re: HTML5 Issue 11 (encoding detection): I18N WG response...
>
> Ian Hickson On 09-10-11 21.23:
>
>
>> On Sun, 11 Oct 2009, Leif Halvard Silli wrote (reordered):
>>
>>> The choice of character set - alphabet - for instance, has always been a
>>> political matter, and still is.
>>>
>> Ok, then it seems sensible to use a political way of speaking to refer to
>> the choice of alphabet.
>>
>
>
> We do not choose an alphabet every day. Day to day, the right to use
> the alphabet that your language requires is what matters. And
> ditto language is required to express that.
>
>
>>> "Western this-and-that" is predominantly a political way of speaking.
>>>
>> Good, then it is appropriate terminology.
>>
>
>
> Appropriate for what? Diplomatic language is political and
> accurate, yet tries to avoid contested political phrasings.
>
> "Western European Language [environments]" as Addison suggested is
> a reasonable neutral term, btw, despite use of "Western". It also
> gives the reader many more hints about the politics involved ...
>
> Western demographics, OTOH ... You mentioned Africa: Egypt was a
> colony once. So was Kenya. Why does Kenya have a Western
> demographic, but Egypt not?
>
>
>>> Therefore it is wrong to use a wording that causes readers to think in
>>> political terms.
>>>
>> But you agree that it _is_ a political matter.
>>
>
>
> Which "it" are you referring to now?
>
>
>>> It is wrong to nourish the thought that if some population changes to
>>> use an alphabet which is covered by Win1252, that they then will start
>>> to belong to the "Western demographics".
>>>
>> It doesn't matter if a population _changes_ to use an alphabet which is
>> covered by 1252, because that will only affect future pages, not legacy
>> pages, and it is only legacy pages we are concerned about.
>>
>
> I see the logic, but I wonder how you can take any outcome for granted.
> I don't know what the default is in Azerbaijan today ...
>
>
>> What phrase best approximates the areas of the world where _today_ UAs are
>> shipping with a 1252 default encoding?
>>
>
>
> "Western demographics" is a term that leaves the job of finding
> out which those areas are to the reader, anyhow.
>
> If you want to give better hints, then you could speak about "the
> British Commonwealth; predominantly English, French, Spanish and
> Portuguese speaking demographics; demographics that were
> alphabetized as Western colonies, i.e. former colonies of France,
> Belgium, England, Spain, Portugal" - etc. You should of course add
> that "the list is not exhaustive".
>
> You could also say "demographics using the Latin alphabet covered
> by ASCII plus the letters ŠŒŽÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÚÛÜÝÞß". You
> may say that this is circular. But at least it can help
> implementors find the answer.
>
> You could also list the names of the different Latin alphabets
> that are considered covered by Win1252: the ASCII alphabet, German
> alphabet(s), Scandinavian, etc. See Wikipedia:
>
> http://en.wikipedia.org/wiki/Latin-derived_alphabet
> http://en.wikipedia.org/wiki/Basic_modern_Latin_alphabet
>
> You could also say "demographics covered by the Latin alphabet,
> except the following and other countries, which use letters that
> are not covered by Win1252: Turkey, Croatia, Azerbaijan etc etc"
>
>
>>> Does Croatia belong to "Western demographics", for instance? Why? And why
>>> not? The Croatian alphabet is not covered by Win1252. What about Serbia?
>>> Serbia uses both Cyrillic and Latin side by side.
>>>
>> What default encodings do browsers use in those areas?
>>
>
>
> I don't know. I just know that Win1252 doesn't cover the Croatian
> alphabet. And I have also gotten the impression that it is a
> problem that - if using one's own alphabet is seen as the normal
> thing - software may not default to a charset using the local
> alphabet.
>
>
>>> As you can see, "Western demographics" is a wording that - depending on
>>> how you define "Western" - covers both narrower and wider ground than e.g.
>>> "writing systems covered by Win1252".
>>>
>> Is there a better term that would more accurately refer to the areas of
>> the world where a UA needs to ship with a Win1252 default encoding?
>>
>
>
> See above. And below.
>
>
>>> For example you could say "For demographics that are covered by what in
>>> user agents and e-mail applications are typically known as "Western" or
>>> "West European" encodings, then Win1252 is the best default".
>>>
>> That's circular logic ("Use Win1252 as a default for demographics where
>> Win1252 is the default").
>>
>
>
> To say that "Win1252" is the default for those areas which are
> covered by what is referred to as "Western encodings", is not a
> circular argument.
>
> But your focus appears to be *areas*. And from that point of view
> I can see why you think it is circular.
>
> But I thought that it was more relevant for implementors to know
> that Win1252 is considered the default for wherever "Western
> Encodings" are useful, than it is for them to know that there
> apparently exists a secret Union of Windows 1252 Countries ...
>
> However, I just now looked in Firefox to see what it meant by
> Western, and found, under "West European", both Greek and
> "Western" encodings ...
>
> I suppose that Win1252 isn't the default encoding in Greece?
>
> Proves that "Western" is a very imprecise term.
>
>
>> The point is to be able to give implementation
>> advice that is useful independent of the implementor performing any
>> reverse engineering, studying of other user agents, etc.
>>
>
> It doesn't require "reverse engineering" to find out the language
> of a population, does it? What's really needed, if you want to do
> a good job, is to visit that country and observe and judge.
>
> The issue of reverse engineering is, however, connected to what I
> said above about "Win1252" being the default for areas
> covered by "Western encodings".
>
--
Andrew Cunningham
Senior Manager, Research and Development
Vicnet
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Ph: +61-3-8664-7430
Fax: +61-3-9639-2175
Email: andrewc@vicnet.net.au
Alt email: lang.support@gmail.com
http://home.vicnet.net.au/~andrewc/
http://www.openroad.net.au
http://www.vicnet.net.au
http://www.slv.vic.gov.au
Received on Monday, 12 October 2009 02:00:07 UTC