Re: xml:lang question, markup for things like 'kursee', 'arigato'?

Misha Wolf scripsit:

> In my case, I was trying to decide what xml:lang values to use for
> brief Turkish phrases which have been degraded to the Latin alphabet
> as used for English.  Both the Turkish writing system and the English
> writing system use the Latin script.  It would surely not be helpful
> to mark both the original phrase and the degraded version as "tr-Latn"?

Under the RFC 3066bis regime (which is not yet in effect) you could
use "tr-x-misspelled".  Today, I think "tr" is the best you can do.
There is plenty of English text properly tagged but orthographically
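
As a rough illustration of the "tr" option, here is a minimal Python
sketch that tags a degraded Turkish phrase inside English text and
reads the xml:lang values back.  The sample document, the element
names (p, q), the Turkish phrase, and the helper tagged_spans are all
assumptions made up for this example; none of them come from the
original question.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this namespace by the XML specification,
# so ElementTree exposes xml:lang under this expanded attribute name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: English text containing a Turkish phrase written
# without its diacritics, still tagged simply as "tr".
SAMPLE = (
    '<p xml:lang="en">The stallholder said '
    '<q xml:lang="tr">tesekkur ederim</q> and waved.</p>'
)


def tagged_spans(xml_text):
    """Return (language tag, leading text) pairs for elements carrying xml:lang."""
    root = ET.fromstring(xml_text)
    return [
        (elem.get(XML_LANG), (elem.text or "").strip())
        for elem in root.iter()
        if elem.get(XML_LANG) is not None
    ]


spans = tagged_spans(SAMPLE)
for lang, text in spans:
    print(lang, "->", text)
```

The sketch only reports the tag written on each element; it does not
model inheritance of xml:lang by child elements, which a real consumer
would also need to handle.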

John Cowan
We want more school houses and less jails; more books and less arsenals;
more learning and less vice; more constant work and less crime; more
leisure and less greed; more justice and less revenge; in fact, more of
the opportunities to cultivate our better natures.  --Samuel Gompers

Received on Wednesday, 16 June 2004 11:14:34 UTC