On Mon, 12 Oct 2009, Leif Halvard Silli wrote:
> Ian Hickson On 09-10-11 21.23:
> > On Sun, 11 Oct 2009, Leif Halvard Silli wrote (reordered):
> > >
> > > The choice of character set - alphabet - for instance, has always
> > > been a political matter, and still is.
> >
> > Ok, then it seems sensible to use a political way of speaking to refer
> > to the choice of alphabet.
> >
> > > "Western this-and-that" is predominantly a political way of
> > > speaking.
> >
> > Good, then it is appropriate terminology.
>
> Appropriate for what?

For the spec. Using political ways of speaking to talk about political
matters.

> "Western European Language [environments]" as Addison suggested is a
> reasonable neutral term, btw, despite use of "Western". It also gives
> the reader much more hints about what the politics involved ...

"European" has no place in this term, as far as I can tell.

> > > Therefore it is wrong to use a wording that causes readers to think in
> > > political terms.
> >
> > But you agree that it _is_ a political matter.
>
> Which "it" are you referring to now?

The choice of character set - alphabet.

> "Western demographics" is a term that leaves the job of finding out
> which those areas are to the reader, anyhow.

If we can have instead a table of languages to default encodings, I would
much rather have that. Is the data for such a table available?

On Mon, 12 Oct 2009, Henri Sivonen wrote:
>
> It probably wouldn't make sense to build an exhaustive list of locales
> where browsers default to Windows-1252, but wouldn't it be feasible to
> build an exhaustive list of the locales where browsers *don't* default
> to Windows-1252 (e.g. by grepping Firefox localization files)?

If such data is available, I'd be happy to include it instead of the
current text.

On Sun, 11 Oct 2009, Mark Davis ☕ wrote:
>
> But focusing on advice to developers, I'd suggest replacing 6 and 7 in
> http://dev.w3.org/html5/spec/Overview.html#determining-the-character-encoding,
> by the following 3 numbered items.
>
> - Test if the bytes are valid UTF-8. If they are, return that
>   encoding, with the
>   confidence<http://dev.w3.org/html5/spec/Overview.html#concept-encoding-confidence>
>   *tentative*, and abort these steps.
> - *[include note about UTF-8 patterns, maybe reworded a bit.]*
> - The user agent may attempt to autodetect the character encoding *[include
>   rest of #5]*
> - Otherwise, return an implementation-defined or user-specified default
>   character encoding, with the
>   confidence<http://dev.w3.org/html5/spec/Overview.html#concept-encoding-confidence>
>   *tentative*. Due to its widespread use as a default in legacy content,
>   windows-1252 is recommended as a default in the absence of other
>   information.

On Mon, 12 Oct 2009, Henri Sivonen wrote:
>
> So you are suggesting making UTF-8 autodetect mandatory while leaving
> the rest of chardet optional? Does any one of the 5 top browsers do
> that?

Mark, could you elaborate on your reasoning for this proposal and on the
intent of browser vendors to follow those requirements?

On Mon, 12 Oct 2009, Maciej Stachowiak wrote:
> On Oct 11, 2009, at 12:23 PM, Ian Hickson wrote:
> >
> > What phrase best approximates the areas of the world where _today_ UAs
> > are shipping with a 1252 default encoding?
>
> "locales that predominantly use the Latin script"

Given that 1252 is the Latin script, that seems circular.

> Or you could say:
>
> "locales that predominantly use the Latin script, and whose primary
> languages are completely or almost completely covered by Windows-1252."

I'd rather just have an explicit table, if we can.

> Note: in the browsers that vary this, it is always determined by
> "locale", not "demographic" (which is not a computing concept). I don't
> think using the term "demographic" makes sense in this context.

Fair enough. Changed to "locale".

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 12 October 2009 11:34:54 GMT
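
Editorial sketch (not part of the original message): the three steps Mark
Davis proposes above could look roughly like the following in Python. This
is only an illustration under stated assumptions. The function name
sniff_encoding, the autodetect callback, and the (encoding, confidence)
return shape are hypothetical and are not taken from the spec or from any
browser. It also treats the input as complete, so a buffer cut in the
middle of a multi-byte character would wrongly fail the UTF-8 test.

```python
from typing import Callable, Optional, Tuple

# Hypothetical callback type: takes the raw bytes and returns an encoding
# name, or None if the detector has no opinion.
Detector = Callable[[bytes], Optional[str]]


def sniff_encoding(
    data: bytes,
    autodetect: Optional[Detector] = None,
    default: str = "windows-1252",
) -> Tuple[str, str]:
    """Return (encoding, confidence) following the three proposed steps."""
    # Step 1: if the bytes are valid UTF-8, return UTF-8 as tentative.
    # Note that pure ASCII input is also valid UTF-8 and lands here.
    try:
        data.decode("utf-8")  # strict decoding raises on invalid bytes
        return ("utf-8", "tentative")
    except UnicodeDecodeError:
        pass

    # Step 2: the user agent may (optionally) attempt to autodetect.
    if autodetect is not None:
        guess = autodetect(data)
        if guess:
            return (guess, "tentative")

    # Step 3: fall back to an implementation-defined or user-specified
    # default; windows-1252 is the default suggested in the proposal.
    return (default, "tentative")


if __name__ == "__main__":
    print(sniff_encoding("caf\u00e9".encode("utf-8")))
    print(sniff_encoding(b"caf\xe9"))
```

Making the detector an optional argument mirrors the proposal's split
between a mandatory UTF-8 check and an optional autodetect step, which is
the point Henri Sivonen questions above.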