Re: What to do with Gaulish ?

From: John Cowan <cowan@ccil.org>
Date: Mon, 13 Nov 2006 18:29:37 -0500
To: CE Whitehead <cewcathar@hotmail.com>
Cc: www-international@w3.org
Message-ID: <20061113232937.GC3781@ccil.org>

CE Whitehead scripsit:

> Hi, I am troubled by tags like frc, fro, and frm because I am wondering
> what happens when a person using a search engine asks for pages
> in French?  Will the frc, fro, frm pages turn up too?

In practice, search engines tend to ignore language tagging in favor
of statistical analysis of text, since language tags are so often
missing or incorrect.

> It's quite possible that a person interested in French will be
> interested in moyen Francais/Middle French (frm) and in Old French
> (fro) if the search is for someone studying French.

There are various contexts where multiple language tags can be specified,
indicating that you will accept content in any of these, usually with a
priority order.
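
The HTTP Accept-Language header is one such context. A rough sketch of
reading it into a priority order, assuming the usual comma-separated
"tag;q=weight" syntax; the function name and the sample header are made
up for illustration:

```python
def parse_accept_language(header):
    """Return (tag, q) pairs, most preferred first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag with no q parameter gets the default weight 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed weight as "not wanted"
        prefs.append((tag.strip().lower(), q, position))
    # Highest weight first; ties keep the order in which they were listed.
    prefs.sort(key=lambda p: (-p[1], p[2]))
    return [(tag, q) for tag, q, _ in prefs]


# Hypothetical header from a reader who wants modern, Middle, or Old French.
print(parse_accept_language("fro;q=0.6, fr, frm;q=0.8"))
```

Here "fr" sorts first because it carries the default weight, followed
by "frm" and then "fro".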

> The trouble with you all is you assume that people are just searching
> for pages in their first language and that they have only one real
> primary language they can accept pages in; clearly this cannot be the
> case for fro (Old French) and frm (Moyen Francais).

Not at all.

> It's also conceivable that a person might want documents that are
> written in either a French-based creole or Standard French.

The same feature (the language-tag search list) can be used in that case
as well.
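
A minimal sketch of such a search list, assuming prefix matching in the
style of RFC 4647 basic filtering (a range matches a tag that is equal
to it, or that begins with it followed by "-"). The document titles and
the tags attached to them are hypothetical:

```python
def matches(language_range, tag):
    """Basic filtering: exact match, or range is a prefix ending at a '-'."""
    language_range = language_range.lower()
    tag = tag.lower()
    return (language_range == "*"
            or tag == language_range
            or tag.startswith(language_range + "-"))


def filter_documents(priority_list, documents):
    """Return titles of matching documents, ordered by the priority list."""
    selected = []
    for language_range in priority_list:
        for title, tag in documents:
            if title not in selected and matches(language_range, tag):
                selected.append(title)
    return selected


# Hypothetical collection of (title, language tag) pairs.
DOCUMENTS = [
    ("Old French epic", "fro"),
    ("Haitian folktale", "ht"),
    ("Grammar handbook", "fr-FR"),
    ("English notes", "en"),
]

print(filter_documents(["fr", "ht"], DOCUMENTS))
```

Under this kind of matching, "fr" picks up "fr-FR" but not "fro" or
"frm", since those are separate primary language subtags; a reader who
wants them has to list them in the search list explicitly.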

> One could of course list all of these in the meta content tags; for
> example for my "Moyen francais" document I could list: lang=en, fr, frm


> but some applications used to put up pages at some web hosts embed
> one's document into the body of a page they create; that's the case
> with teacher web (http://teacherweb.com), as I pointed out once before.

That's known to be a problem, yes.

> Also, as I noted, some of the 17th Century new world documents were
> in Middle French although you all have set the dates as 1400-1600
> (those dates can vary a bit; you'd be surprised also at the amount of
> variation you can get in any given language at any given time before
> literacy was so widespread)

We didn't pick the dates; the ISO 639-2 Registration Authority
(the Library of Congress) did.  You can go to
and request a change.

> I note that for Arabic (which has, as far as I know, and I am no expert,
> the following main subdivisions in its dialects: [...]) you just have
> to use the country codes; at least this is all I saw?

As of ISO 639-2 and RFC 4646, yes.  In 639-3 and 4646bis, about
30 different Arabic language tags will be available:

aao 	Algerian Saharan Arabic
abh 	Tajiki Arabic
abv 	Baharna Arabic
acm 	Mesopotamian Arabic
acq 	Ta'izzi-Adeni Arabic
acw 	Hijazi Arabic
acx 	Omani Arabic
acy 	Cypriot Arabic
adf 	Dhofari Arabic
aeb 	Tunisian Arabic
aec 	Saidi Arabic
afb 	Gulf Arabic
ajp 	South Levantine Arabic
apc 	North Levantine Arabic
apd 	Sudanese Arabic
arb 	Standard Arabic
arq 	Algerian Arabic
ars 	Najdi Arabic
ary 	Moroccan Arabic
arz 	Egyptian Arabic
auz 	Uzbeki Arabic
avl 	Eastern Egyptian Bedawi Arabic
ayh 	Hadrami Arabic
ayl 	Libyan Arabic
ayn 	Sanaani Arabic
ayp 	North Mesopotamian Arabic
bbz 	Babalia Creole Arabic
pga 	Sudanese Creole Arabic
shu 	Chadian Arabic
ssh 	Shihhi Arabic

> Why not also have variants for dates, such as two digits plus the letter
> c, with the two digits indicating the century (01-20; I assume that the
> century would be redundant for the 21st century variant of a language)?

Those forms are too short, and there are problems with generic tags, as
centuries are not the appropriate units for many languages.
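
To make "too short" concrete: the RFC 4646 grammar requires a variant
subtag to be either 5 to 8 alphanumeric characters, or exactly 4
characters beginning with a digit. A small sketch of that length check
follows; it tests syntax only, not whether a subtag is registered, and
"12c" and "12cent" are made-up examples rather than real variants:

```python
import re

# RFC 4646 ABNF: variant = 5*8alphanum / (DIGIT 3alphanum)
VARIANT = re.compile(r"(?:[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3})")


def is_wellformed_variant(subtag):
    """True if the subtag has the shape RFC 4646 allows for a variant."""
    return VARIANT.fullmatch(subtag) is not None


for candidate in ["12c", "12cent", "1901", "nedis"]:
    print(candidate, is_wellformed_variant(candidate))
```

A three-character form such as "12c" fails the check, while "1901" and
"nedis" (both existing registered variants) pass it. The hypothetical
"12cent" would be long enough, but the objection to generic
century-based tags would still apply to it.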

Is a chair finely made tragic or comic? Is the          John Cowan
portrait of Mona Lisa good if I desire to see           cowan@ccil.org
it? Is the bust of Sir Philip Crampton lyrical,         http://ccil.org/~cowan
epical or dramatic?  If a man hacking in fury
at a block of wood make there an image of a cow,
is that image a work of art? If not, why not?               --Stephen Dedalus
Received on Monday, 13 November 2006 23:29:55 UTC
