RE: ISSUE-88 / Re: what's the language of a document ?

Ian responded:

> It was not an oversight.

The Internationalization working group maintains that, for compatibility with existing documents, authoring practice, and non-browser tools and user-agents, the existing syntax of HTML <meta> Content-Language really MUST be preserved. We do thank you for the other changes, but herewith request that the remainder of our Change Proposal be accepted by the HTML WG.
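
For illustration only, the two readings at issue can be sketched as follows. This is a minimal sketch, not the HTML5 algorithm text: the function names are hypothetical, and the first-value behaviour is taken from Ian's description in the quoted message below.

```python
from typing import List


def content_language_tags(value: str) -> List[str]:
    """Split a Content-Language value into its comma-separated language tags.

    Assumption: tags are separated by commas with optional surrounding
    whitespace, as in the existing <meta> Content-Language syntax.
    """
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def first_language_only(value: str) -> str:
    """Return only the first tag, ignoring every value after it.

    This mirrors the user-agent behaviour described in the quoted message,
    not any normative processing rule.
    """
    tags = content_language_tags(value)
    return tags[0] if tags else ""
```

Given content="en, fr", the first function yields both tags (the audience reading), while the second yields only "en" (the default-language reading).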

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Ian Hickson
> Sent: Thursday, April 01, 2010 2:20 PM
> To: Maciej Stachowiak
> Cc: Richard Ishida; www-international@w3.org; public-html@w3.org
> Subject: Re: ISSUE-88 / Re: what's the language of a document ?
> 
> On Thu, 1 Apr 2010, Maciej Stachowiak wrote:
> >
> > Ian, comments on the two points below would be appreciated.
> 
> My position hasn't changed since this was last proposed:
> 
>    http://lists.w3.org/Archives/Public/public-html/2010Feb/0729.html
>    http://lists.w3.org/Archives/Public/public-html/2010Feb/0734.html
> 
> 
> > > [[
> > > [3] Change:
> > > "For meta elements with an http-equiv attribute in the Content Language
> > > state, the content attribute must have a value consisting of a valid BCP 47
> > > language code. [BCP47]"
> > > to
> > > "For meta elements with an http-equiv attribute in the Content Language
> > > state, the content attribute must have a value consisting of one or more
> > > valid BCP 47 language codes, separated by commas. [BCP47]"
> > > ]]
> > >
> > > Since the algorithm just above this text now allows for treatment of a
> > > comma-separated list of values in determining the pragma-set default
> > > language, we suspect that it might be an oversight that this text wasn't
> > > changed.
> 
> It was not an oversight.
> 
> I do not think allowing multiple values is a good idea, because it doesn't
> match reality. User agents do not pay any attention to values after the
> first. The right way to mark that a document _uses_ multiple languages is
> to use the lang="" attribute in the document. There is no reason to have a
> standard way to say who the target audience of the document is, since in
> practice few people use that information on the Web. Even if there was
> such a need, this feature would be a bad way to provide that information,
> since it is used in an incompatible way by user agents (the first
> language, and only the first language, is used to determine processing
> behaviour). For controlled environments, there are a multitude of options
> available to authors, such as <meta name> with custom names, microdata,
> RDFa, out-of-band data, <script> blocks, etc. We don't need to use this
> mechanism for that purpose. Doing so would just confuse authors further.
> 
> 
> > > [[
> > > [2] Add an additional note just before the numbered list in the section
> > > about Content language state, with the following text:
> > >
> > > "Note: Declarations in the HTTP header and the Content Language pragma are
> > > metadata, referring to the document as a whole and expressing the expected
> > > language or languages of the audience of the document. On the other hand, a
> > > language attribute on an element describes the actual language used in the
> > > range of content bounded by that element (and so values are limited to a
> > > single language at a time)."
> > >
> > > Rationale: To clarify why the HTTP and pragma declarations are different
> > > when it comes to values, and how they should be used. This is a constant
> > > source of confusion.
> > > ]]
> > >
> > > On balance, we would still prefer to see a note of this kind in the spec, if
> > > the editor agrees.
> 
> The above note is wrong in practice. The pragma doesn't give metadata
> about the document. The original intent of the <meta http-equiv> feature
> was to provide a way for _servers_ to include data in their HTTP headers
> on a per-file basis. This isn't document-wide metadata for user agents,
> it's for servers. This original intent doesn't match reality; reality is
> that this pragma sets the default language for lang="". That also isn't
> document-wide metadata for user agents.
>
> If there is a "constant source of confusion", then what we need is
> pointers to this confusion, so that text intended specifically to address
> that confusion is included in the spec. I do not believe the text above
> would reduce confusion; I believe it would cause it.
>
> (Note that the proposed note above doesn't actually even match the stated
> rationale, as far as I can tell.)
> 
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Friday, 2 April 2010 07:47:15 UTC