Re: Language label from M.T. Carrasco Benitez on 1997-02-26 (www-international@w3.org from January to March 1997)

From: M.T. Carrasco Benitez <carrasco@innet.lu>
Date: Wed, 26 Feb 1997 22:14:27 +0100 (MET)
To: Misha Wolf <misha.wolf@reuters.com>
cc: www-international <www-international@w3.org>, Unicode <unicode@unicode.org>
Message-ID: <Pine.LNX.3.95.970226220153.11072J-100000@localhost>

> There are a number of quite separate requirements for language 
> identification, such as:
> 
> 1.  Language tagging of portions of text, so that:
>     -  search engines can carry out appropriate stemming etc,
>     -  browsers can:
>        -  select the most appropriate fonts,
>        -  do the most appropriate hyphenation,
>        -  use the most appropriate voice synthesis,
>        -  ...

The proposal is about a language label for monolingual HTML docs.  Small
portions of text in another language within a monolingual doc should be
identified using the LANG attribute on the pertinent tag.  Once the
language is known, it can be used for many functions, in particular the
ones mentioned above.
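
As a rough illustration only (a sketch, not the proposal's definitive
syntax): assuming the document-level label takes the <HTML LANG=xx> form
quoted further down, and assuming SPAN is the "pertinent tag" around the
foreign phrase, a monolingual English doc with a short French passage
might look like this.  The title and sentence are made up.

```html
<!-- Hypothetical example: one label for the whole doc, assumed here
     to be the LANG attribute on the HTML element. -->
<HTML LANG="en">
<HEAD>
<TITLE>Example</TITLE>
</HEAD>
<BODY>
<!-- The short French portion is labelled on its own tag. -->
<P>As they say, <SPAN LANG="fr">c'est la vie</SPAN>.</P>
</BODY>
</HTML>
```

Anything without its own LANG would be treated as being in the
document's language.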

> 2.  Language tagging of a Web object (eg an image or an HTML file), so that 
>     HTTP can be used to negotiate between client and server, in order to 
>     obtain the objects which best suit the user.

This could also be needed, but the proposal is for HTML docs.

> It is not obvious that the information for item 2 above should come from 
> inside the object.  Indeed, in the case of an image, you might have serious 
> difficulties if you tried to insert "<HTML LANG=xx>" or "<META HTTP-EQUIV 
> ...>" :-).  The best solution is, probably, to place this information 
> outside the object, eg in the object's name within a filestore.

The problem should also be discussed for other objects.

But:
 There should be *one* language label for monolingual HTML docs *inside* 
 the doc.

Regards
Tomas

Received on Wednesday, 26 February 1997 16:13:10 UTC