Re: Comment on working draft "Specifying Language in XHTML and HTML Content"

From: Mark Davis <mark.davis@icu-project.org>
Date: Tue, 13 Mar 2007 13:26:41 -0700
Message-ID: <30b660a20703131326j32d0b783r43c2d5836423fba@mail.gmail.com>
To: "Richard Ishida" <ishida@w3.org>
Cc: "Jonathan Rosenne" <rosennej@qsm.co.il>, www-international@w3.org

Some comments:

Since the meta element puts few limits on what you can say, it would also be
possible, though not very common, to express language information using
Dublin Core notation.
Example 3:

<meta name="dc.language" content="en" />

This is extremely obscure. If you really need something this obscure, it
should go in an appendix and be explained (what is "Dublin Core notation"?).

In the meantime, we recommend that you use HTTP headers and meta elements to
provide document metadata about the language of the intended audience(s),
and language attributes on the html tag to indicate the default
text-processing language. Furthermore, we recommend that you always declare
the default text-processing language.

Getting people to do one thing -- correctly -- is hard enough. Asking them
to distinguish between the language of the document and that of the intended
audience is going way too far. Best practices should aim at getting the 99%
case right, then point to a different document for the edge cases. Those
won't matter anyway: nobody will mark them up correctly, so they can't be
relied on.

In general, for each of the items you list, unless you can show a substantial
functional difference, a way in which users of the page will see a
substantial positive or negative impact, you should consider withdrawing the
advice.

Best Practice 13: Using Hans and Hant codes
<http://www.w3.org/International/geo/html-tech/tech-lang.html#ri20040429.113217290>

This also goes for some other cases like uz_Arab. You might list the common
cases where a language uses multiple scripts.
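
As a rough illustration of what such a list could look like, here is a
minimal Python sketch. The table is illustrative and far from exhaustive,
and the helper names (split_tag, needs_script) are hypothetical, not taken
from the draft. It accepts both the underscore form used above (uz_Arab)
and the hyphenated form (uz-Arab), and it ignores extended language
subtags for simplicity.

```python
# Illustrative table only: a few languages commonly written in more than
# one script, with the ISO 15924 script codes involved.
MULTI_SCRIPT = {
    "zh": ["Hans", "Hant"],          # Chinese: Simplified, Traditional
    "uz": ["Arab", "Cyrl", "Latn"],  # Uzbek
    "sr": ["Cyrl", "Latn"],          # Serbian
    "az": ["Arab", "Cyrl", "Latn"],  # Azerbaijani
}


def split_tag(tag):
    """Split a tag such as 'uz_Arab' or 'zh-Hant-TW' into (language, script).

    The script is None when the tag carries no script subtag. This is a
    simplification: a four-letter subtag is only recognized when it comes
    directly after the language subtag.
    """
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    script = None
    if len(parts) > 1 and len(parts[1]) == 4 and parts[1].isalpha():
        script = parts[1].title()
    return language, script


def needs_script(tag):
    """True if the language is in the table but the tag names no script."""
    language, script = split_tag(tag)
    return language in MULTI_SCRIPT and script is None
```

With this table, needs_script("zh") is true, while needs_script("uz_Arab")
and needs_script("en") are false.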

Best Practice 15: Using hreflang with CSS
<http://www.w3.org/International/geo/html-tech/tech-lang.html#ri20030112.224458239>
<http://www.w3.org/International/geo/html-tech/tech-lang.html#ri20050128.152033553>

I'm really leery about this one. It is extremely fragile. If you really
wanted to mark the language of the HTML document a link points to, the
browser would do a much better job of it, since it could fetch the start
of the page (working in the background) and pick up the actual language
used on the page. So if you are evangelizing anyone, I'd think it should be
browser vendors.
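
To make the idea concrete, here is a minimal Python sketch of the "pick up
the language from the start of the page" step. It assumes the target page
declares its language with a lang or xml:lang attribute on the html tag,
and that the first chunk of the page has already been fetched as a string;
the names LangSniffer and sniff_language are hypothetical. It is a sketch
of one possible approach, not a description of what any browser does.

```python
from html.parser import HTMLParser


class LangSniffer(HTMLParser):
    """Records the lang (or xml:lang) attribute of the first <html> tag."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            attributes = dict(attrs)
            self.lang = attributes.get("lang") or attributes.get("xml:lang")


def sniff_language(page_prefix):
    """Return the declared language of a page given its first few bytes
    decoded to text, or None if the html tag declares none."""
    sniffer = LangSniffer()
    sniffer.feed(page_prefix)  # a truncated page is fine; no close() needed
    return sniffer.lang
```

For example, sniff_language('<!DOCTYPE html><html lang="fr"><head>')
returns "fr", and a prefix with no declaration returns None.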

Saw a couple of spelling errors also, so you might spell-check.


On 3/13/07, Richard Ishida <ishida@w3.org> wrote:
>
>
> Hello Jony,
>
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of
> > Jonathan Rosenne
> > Sent: 09 March 2007 20:46
> > To: www-international@w3.org
> > Subject: Comment on working draft "Specifying Language in
> > XHTML and HTML Content"
> >
> >
> > Clause 3.3 Relationships between language, character encoding
> > and directionality
> >
> > The 4th paragraph is misleading. One might get the impression
> > that bidi tags are required for numbers. I suggest that the
> > second sentence be deleted.
>
> See another email for a response about this.
>
>
> >
> > "Similarly" in the 5th paragraph is not very clear. Similar
>
> Note that the latest edit version no longer contains that word. See
>
> http://www.w3.org/International/geo/html-tech/tech-lang.html#ri20050208.093646470
>
>
> > to what? And shouldn't it be "Azeri"?
>
> No, I believe not. I looked into this some time ago. See for example
> Wikipedia articles on Azerbaijan, that refer to the Azerbaijani language
> but the Azeri people. Also, the language subtag registry entry is
> Type: language
> Subtag: az
> Description: Azerbaijani
> Added: 2005-10-16
>
>
> >
> > Clause 4.2 Attributes or metadata?
> >
> > I would like to add that often the author is not able to
> > control the metadata. It is handled by the server, and in any
> > large organization the bureaucratic obstacles make it too
> > difficult for most authors to manage, even if they are aware
> > of it, which they may not be.
>
> I modified bullet 4 to say:
> It is important to always declare the default text-processing language
> for the document, but if the document is not read from a server, or the
> author is unable to apply the necessary server settings, the HTTP content
> header will not be available.
>
> There is also mention of this issue two paras down, and in relevant best
> practice sections.
>
> Cheers,
> RI
>
>
> >
> > Jony
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>
>
>


-- 
Mark
Received on Tuesday, 13 March 2007 20:26:50 GMT
