Re: Text that's not in any language

> Maybe LANG should be extended to cover
> 
>   - computer languages (Pascal, C, HTML, CSS,...)
>   - proper names (language "none"?)
>   - "unknown" and "any" languages
> 
> The last two would be useful, resp., for a text that is in some
> language, but the author doesn't know which

The _author_ doesn't know which? I know some of these people .. :-)

>, and for a text that is the
> same in every language. An example would be the SI units mm, s, etc.
>
 
Tagging a unit as "any" would imply that the unit is the same whatever the context. This might well be the case, but in an ideal world it should be possible to choose whether units are translated into the cultural environment of the reader. In an electronic commerce context, for example, European shoe sizes might not mean a lot to a potential British buyer. In a scientific text, on the other hand, translation would be neither necessary nor desirable. Units would need to be tagged as either invariant or culturally dependent. (Does unit usage map directly onto language usage anyway?)
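To make the invariant/culturally dependent distinction concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the Quantity record, the invariant flag, the render function and the converter table keyed by (unit, locale) are hypothetical names, not part of any LANG proposal.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# A converter maps a value to (converted value, target unit).
Converter = Callable[[float], Tuple[float, str]]


@dataclass(frozen=True)
class Quantity:
    """A value with a unit, tagged as invariant or culturally dependent."""
    value: float
    unit: str
    invariant: bool  # True: show exactly as written, whoever the reader is


def render(q: Quantity, locale: str,
           converters: Dict[Tuple[str, str], Converter]) -> str:
    """Render q for a reader in `locale` (hypothetical locale strings)."""
    if q.invariant:
        # Scientific usage: never translate the unit.
        return f"{q.value:g} {q.unit}"
    convert = converters.get((q.unit, locale))
    if convert is None:
        # No known mapping for this reader: fall back to the original.
        return f"{q.value:g} {q.unit}"
    value, unit = convert(q.value)
    return f"{value:g} {unit}"


# Illustrative table: millimetres to inches (1 in = 25.4 mm) for "en-US".
converters = {("mm", "en-US"): lambda v: (v / 25.4, "in")}

scientific = Quantity(254, "mm", invariant=True)
catalogue = Quantity(254, "mm", invariant=False)

print(render(scientific, "en-US", converters))  # unit left untouched
print(render(catalogue, "en-US", converters))   # unit converted for the reader
print(render(catalogue, "fr-FR", converters))   # no mapping, original kept
```

The point of the sketch is only that the tag lives on the quantity, not on the language of the surrounding text, so the same "mm" can be left alone in a paper and converted in a catalogue.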

Proper names might also need something richer than "none". London the city is the same thing as Londres, but London the surname isn't. This could be relevant in multilingual retrieval systems - a French speaker looking for information on "Londres", who might be willing to accept output in English, doesn't want to know about Jack.

Iain

Received on Wednesday, 8 January 1997 14:26:07 UTC