- From: Albert Lunde <albert-lunde@nwu.edu>
- Date: Wed, 21 Oct 1998 14:17:08 CDT
- To: www-international@w3.org
> >I still think that these efforts to describe with "language", what
> >seems to me, to be a case of a general transformation of various things
> >including script and language, are a bad idea.
>
> There are documents that are transliteration of language for another
> language. Tagging a Greek doc as Greek when it has been
> transliterated into the Latin alphabet for French speakers would
> be wrong. Neither can it be marked as French. Hence there is
> a real need for this type of tagging.

I'd describe that as Greek transliterated into a roman (or latin) script. One could most likely be more precise. But I don't think there are big differences in the way Greek (or Japanese or Korean or Hebrew) is romanized for French-speaking or English-speaking readers.

Romanization is _not_ a change of language; it's the use of a different script. There's clearly more than one way to romanize many languages. (And you can write English in other scripts, e.g. Japanese Katakana, though it may get a bit distorted.)

If you romanize Greek, it's still Greek to me ;)

I _agree_ that there's a use for tagging this general sort of transformation. I _disagree_ that it should be done by overloading the existing language headers/language tags as used in HTTP and HTML.

If it needs to be marked up on portions of a document, then I'd say we should propose a new attribute for an HTML DTD. The arguments for having a standard markup are pretty much the same as for LANG: it helps spell-checkers, hyphenation software, and so forth to do the right thing.

Right now, I expect multilingual HTML processing software is using heuristics to associate script with language, but making this explicit could be a good thing. I'd love to be able to spell-check romanized Japanese when I write it (which illustrates that these issues appear in original as well as transformed texts).

If we could live with indicating it for the whole document, then use a META tag. But it sounds like that would be too weak for the general multilingual case.

I'm not an expert in these matters, so I'm not sure I've got all the right terminology either; I just see pitfalls in overloading "language" in HTTP/HTML.

--
Albert Lunde                      Albert-Lunde@nwu.edu
Received on Wednesday, 21 October 1998 15:17:10 UTC