[Bug 17859] Mechanism to enable localisation of form controls and other locale-specific data

https://www.w3.org/Bugs/Public/show_bug.cgi?id=17859

--- Comment #31 from Cameron Jones <cmhjones@gmail.com> ---
(In reply to Ian 'Hixie' Hickson from comment #30)
> (In reply to Cameron Jones from comment #29)
> > I think it might be better to start with some appraisal of whether
> > auto-localization is a feature or a bug.
> 
> Not sure exactly what you mean by "auto-localisation", but from context it
> looks like you mean the feature that exists today that causes form controls
> to render according to the user's local platform conventions rather than
> having the same UI for everyone.

Almost. That should be:

"...the feature that exists today that causes form controls
to render according to the user's local platform conventions rather than
the language negotiation of HTTP and the HTML language resolution algorithm"

> 
> If that is what you mean, then it clearly seems like a feature. The
> alternative would be for type=date to show a Chinese calendar (since that's
> the most-widely used calendar in terms of users), and I, for one, have no
> idea how to read Chinese.
> 

No, it wouldn't. Nothing would show the Chinese calendar unless it had been set
somewhere in the chain of locale settings, by default:

element -> parent -> html -> meta -> content-language -> user locale

The base-line default always falls back to the "user's local platform
conventions". 

So, unless you had installed a Chinese operating system or otherwise set your
environment to Chinese you would never see a Chinese calendar, unless you
visited a Chinese web site.
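
The fallback chain above can be sketched in a few lines. This is only an illustration under stated assumptions: `Element` and `resolve_locale` are hypothetical names rather than any real DOM or browser API, and the sketch treats a missing or empty value as "not set", which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element; not a real DOM API."""
    lang: Optional[str] = None
    parent: Optional["Element"] = None


def resolve_locale(element, meta_language, http_content_language, user_locale):
    # Walk element -> parent -> ... -> html and take the first lang that is set.
    node = element
    while node is not None:
        if node.lang:
            return node.lang
        node = node.parent
    # Document-level declarations: <meta> first, then the Content-Language header.
    for declared in (meta_language, http_content_language):
        if declared:
            return declared
    # Baseline default: the user's local platform conventions.
    return user_locale


# A date input on a page that declares no language anywhere.
html = Element()
date_input = Element(parent=html)
print(resolve_locale(date_input, None, None, "en-US"))
```

With nothing set anywhere in the chain the user's own locale comes back, so a Chinese calendar would only appear if the page (or the user's environment) asked for one.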

> 
> > What benefit is auto-localization providing today such that it warrants the
> > necessity for an escape hatch?
> 
> The benefits are not what warrant an "escape hatch". This bug is just a
> feature request from authors to be able to control the localisation more
> specifically.
> 

No. The bug is that it is impossible to show anything *other* than the "user's
local platform conventions".

> 
> > We must consider the eventuality that if an escape hatch is provided, will
> > it be used by default? Does this not render the default behavior a bug?
> 
> I don't understand what you mean. We can't change the defaults. Maintaining
> backwards-compatibility is paramount.

There is no backwards compatibility to support.

Since any data-point which would require localization is new to HTML5, and the
implementations of such features are patchy at best, there is no precedent of
existing uses to preserve.

Given that browsers have implemented some patchy prototypes which don't take
locale resolution into account, it has to be questioned whether they have
considered it at all. Ergo a bug.

For a tangential discussion, what scope is there within a "living standard" for
the appropriate review and refinement of new features? If the first draft is
set in stone with the first implementation, this doesn't provide the
necessary environment for global standards development impacting disparate
users. 

> 
> 
> > In lieu of some specific syntax to consider, i think this could be
> > problematic as the locale will be defined through the same place as it is
> > needed to be used.
> 
> I don't understand what you mean.

I thought that you were implying that the locale could be set through CSS
properties, in which case there is scope for infinite loops when you consider
the additional requirement of needing to style based on locale selectors:

:locale("fr") {
    locale: "en";    
}

:locale("en") {
    locale: "fr";    
}
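
A rough way to see the loop is to simulate repeated application of the two rules. This is only a sketch: the `:locale()` selector and `locale` property are hypothetical syntax, and `apply_rules` is an illustrative helper, not a model of how any real CSS engine resolves styles.

```python
def apply_rules(locale, rules, max_passes=10):
    """Re-apply locale-selector rules until the value stops changing."""
    seen = [locale]
    for _ in range(max_passes):
        next_locale = rules.get(locale, locale)
        if next_locale == locale:
            return locale, seen  # fixed point: styling has settled
        if next_locale in seen:
            raise RuntimeError("cycle: " + " -> ".join(seen + [next_locale]))
        seen.append(next_locale)
        locale = next_locale
    raise RuntimeError("no fixed point within max_passes")


# The two rules from the example: :locale("fr") sets "en" and vice versa.
rules = {"fr": "en", "en": "fr"}

try:
    apply_rules("fr", rules)
except RuntimeError as err:
    print(err)
```

Starting from "fr" the value flips to "en" and then back to "fr", so there is no stable answer for which selector matches.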

> 
> 
> > As a declarative language, HTML by definition is a description of *what* the
> > document means.
> 
> Yes.
> 
> > There are no useless or unimportant definitions.
> 
> That's clearly false. There's lots of ways of including useless or
> unimportant HTML markup. For example:
> 
>    <span class=""></span>
> 
> ...is semantically moot.

Nope. Still means something. That it has no content is moot.

I can still style it. I can still JS some content into it. Removing it from the
page might break it in unknown or unexpected ways. We cannot make that
judgment.

> 
> > Therefore, supporting a model of copy/paste which is not
> > simply a manifestation of referential transparency would violate the
> > essential nature of a declarative language.
> 
> I'm not sure what you mean by "supporting". The simple fact of the matter is
> that significant volumes of Web content are generated by authors who don't
> understand the nuances of HTML yet, and they get their documents working by
> copying and pasting something that works nearly as they want, and then
> mutating it until it works well enough for them to deploy. I don't pass a
> value judgment on this matter, it's just how it is.
> 

Within the specific context of <html lang="">, I think that is a throwback to
the bygone days of (X)HTML 4(.01) (Strict|Transitional|Frameset) when it was
too confusing and difficult to remember what anyone should put at the top of
their shiny new HTML page. That technological requirement forced people to
look up and use the closest DOCTYPE to hand, with little consideration for what
else they were copying (lang, xml:lang, dir, xmlns).

There is no problem with copy/paste; the problem is attempting to negate the
semantics of the document because people have used it incorrectly.

What is the problem with users getting "their documents working by copying and
pasting something that works nearly as they want" and then mutating the <html
lang=""> until it works well enough for them?

> 
> > You can not say something has meaning, and then ignore it at will (or when
> > *some* people use it incorrectly/without consideration for its effects). To
> > do so would be to render valid uses invalid and "break things across the
> > web"(tm).
> 
> I'm not sure to what you are referring here.
> 
> 
> > If people have copy/pasted that their page is within the Inuit locale then
> > lo(!) forever more it shall be.
> 
> It's not the semantic meaning we have to preserve, it's the user-visible end
> result.

But we have no standard user-visible behavior at present. All we have are
partial implementations of a draft specification.

Looking again at Addison's test page in the various browsers, I'm struggling
to find any localization happening at all any more. It appears that browsers
have backed out of this functionality.

I think they are looking for some clear direction over the currently
non-normative and ambiguous advice in section 4.10.5.2:

http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#input-impl-notes

One way to simplify the problem space is to avoid baking in auto-localization
and stick with the HTML language resolution algorithm as the sole means to
derive a locale. This will avoid any knock-on effects to CSS and at least allow
the new data points to be implemented consistently.

The notion of a browser implementing a default localization policy across all
pages is something to consider, but the question is how important - or useful -
is this when considered as a distinct and separate function from translation.

If an English monoglot visits a French website, what use is there in
'localizing' the page data for them? Either they understand the content or they
do not.

The real scope for auto-localization is when viewed between 'en-US' and 'en-GB'
or any other variation on the 'en' base. In this case a website could produce a
generic English document (with the exclusion of spelling differences) and yet
support through default client-side localization the translation of data points
within the local cultural conventions.

The exclusion of spelling differences highlights that this is essentially a
futile exercise as it will be impossible to provide a completely generic base
document which maintains integrity across all cultural variations.

So, I conclude that the notion of client-side automated semi-translation of
localizable data points is a bogus concept. 

I think instead we should just be looking at non-html derived localization
being the same as translation and being catered for in the same manner.

As such, the translate="" attribute can be used to semantically denote
intrinsically localized data points.

The main downside to all of this is 'en-US' users visiting 'en-GB' web sites
needing to cope with the strange month/day variations in date widgets. This is
already the de facto standard for proprietary localized form controls stuffed
into type="text".
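
To make the month/day variation concrete, here is a small sketch of the same typed string read under the two conventions. The two-entry format table and `parse_numeric_date` are assumptions for illustration only, not how any particular browser or widget parses dates.

```python
from datetime import datetime


def parse_numeric_date(text, locale):
    # Hypothetical two-entry table: only the day/month ordering differs.
    formats = {"en-US": "%m/%d/%Y", "en-GB": "%d/%m/%Y"}
    return datetime.strptime(text, formats[locale]).date()


typed = "03/04/2014"
us = parse_numeric_date(typed, "en-US")  # read as month/day: 4 March 2014
gb = parse_numeric_date(typed, "en-GB")  # read as day/month: 3 April 2014
print(us.isoformat(), gb.isoformat())
```

The string is valid under both conventions and silently yields two different dates, which is the ambiguity such users already have to cope with.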


Received on Wednesday, 30 July 2014 15:23:44 UTC