RE: Transliteration

> >I still think that these efforts to describe with "language", what
> >seems to me, to be a case of a general transformation of various
> >things including script and language, are a bad idea.
> There are documents that are transliteration of language for another
> language. Tagging a Greek doc as Greek when it has been
> transliterated into the Latin alphabet for French speakers would
> be wrong.  Neither can it be marked as French.  Hence there is
> a real need for this type of tagging.

I'd describe that as Greek transliterated into a Roman (or Latin)
script. One could most likely be more precise.

But I don't think there are big differences in the way Greek
(or Japanese or Korean or Hebrew) is romanized for French-speaking
or English-speaking readers.

Romanization is _not_ a change of language; it's the use of
a different script.  There's clearly more than one way
to romanize many languages. (And you can write English
in other scripts, e.g. Japanese Katakana, though
it may get a bit distorted.)
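
To make "more than one way to romanize" concrete, here is a small
sketch in Python. The tables and the romanize() helper are purely
illustrative (my own names, covering only five kana); they contrast
Hepburn-style and Kunrei-style spellings of the same Japanese text.

```python
# Illustrative only: tiny, incomplete tables for two romanization schemes.
HEPBURN = {"し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu", "じ": "ji"}
KUNREI = {"し": "si", "ち": "ti", "つ": "tu", "ふ": "hu", "じ": "zi"}


def romanize(text, table):
    """Replace each kana found in the table; pass other characters through."""
    return "".join(table.get(ch, ch) for ch in text)


print(romanize("ふじ", HEPBURN))  # fuji
print(romanize("ふじ", KUNREI))   # huzi
```

Both outputs are the same Japanese word in the same script; only the
romanization scheme differs, which a language tag alone can't express.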

If you romanize Greek, it's still Greek to me ;)

I _agree_ that there's a use for tagging this general sort
of transformation.

I _disagree_ that it should be done by over-loading the existing
language headers/language-tags as used in HTTP and HTML.

If it needs to be marked up on portions of a document,
then I'd say we should propose a new attribute for an
HTML DTD. The arguments for having a standard markup
are pretty much the same as for LANG: it helps spell-checkers,
hyphenation software, and so forth to do the right thing.

Right now, I expect multi-lingual HTML processing
software is using heuristics to associate script with
language, but making this explicit could be a good thing.
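
I don't know what heuristics real software uses, but a crude guess at
the script of a piece of text might look like the Python sketch below.
guess_script() is a hypothetical helper of mine; it just takes the
first word of each letter's Unicode character name as a rough script
label and returns the most common one.

```python
import unicodedata
from collections import Counter


def guess_script(text):
    """Return a rough script label for text, or None if it has no letters.

    Assumption: the first word of a letter's Unicode name ("LATIN",
    "GREEK", "KATAKANA", ...) is a good enough stand-in for its script.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(guess_script("καλημέρα"))  # GREEK
print(guess_script("kalimera"))  # LATIN
```

Note that the romanized Greek comes out as LATIN: the script can be
guessed from the characters, but the language can't, which is why an
explicit marking would help.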

I'd love to be able to spell-check romanized Japanese
when I write it (which illustrates that these issues
appear in original as well as transformed texts).

If we could live with indicating it for the whole document,
then a META tag would do. But it sounds like that would be too weak
for the general multi-lingual case.

I'm not an expert in these matters, so I'm not sure I've
got all the right terminology either; I just see pitfalls
in over-loading "language" in HTTP/HTML.

--
    Albert Lunde                      Albert-Lunde@nwu.edu

Received on Wednesday, 21 October 1998 15:17:10 UTC