Re: What to do with Gaulish ?

From: CE Whitehead <cewcathar@hotmail.com>
Date: Mon, 13 Nov 2006 19:47:21 -0500
Message-ID: <BAY114-F230B46FDF7BC34E5C4ED2EB3EB0@phx.gbl>
To: cowan@ccil.org
Cc: www-international@w3.org



Hi, and thanks for your clarifications; they are quite helpful!

I had not thought that search engines made much use of the tags; I get the 
language I want from a search engine by entering a text string in that 
language.
But suppose some search engines did decide to use the tags: would a person 
who requested French (fr) then be unable to access documents in Middle 
French (frm)?
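To make the question concrete, here is a minimal sketch of what I understand 
prefix matching of language ranges to look like (the "basic filtering" 
scheme of RFC 4647).  The function names and the list of document tags are 
my own, made up for illustration, and I am only assuming that a search 
engine would match tags this way; nothing in this thread confirms that any 
of them do.

```python
def matches(language_range: str, tag: str) -> bool:
    """Basic filtering: a range matches a tag if the two are equal
    (ignoring case) or the range is a prefix of the tag followed by '-'."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def filter_tags(priority_list, tags):
    """Return the tags matched by any requested range, grouped in the
    priority order of the ranges and without duplicates."""
    result = []
    for language_range in priority_list:
        for tag in tags:
            if matches(language_range, tag) and tag not in result:
                result.append(tag)
    return result


# Hypothetical tags on a set of documents.
documents = ["fr", "fr-CA", "frm", "fro", "frc", "en"]

# Asking for French alone: 'frm', 'fro' and 'frc' are separate primary
# subtags, so the range 'fr' does not reach them.
print(filter_tags(["fr"], documents))                # ['fr', 'fr-CA']

# Listing several ranges in priority order picks them up as well.
print(filter_tags(["fr", "frm", "fro"], documents))  # ['fr', 'fr-CA', 'frm', 'fro']
```

Under that assumption, a request for fr alone would miss frm documents, 
and the requester would have to list frm explicitly.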
I am also curious: I had expected search engines to make some use of the 
tags; otherwise it is scarcely worth my trouble to include much more than 
the text-processing tags.  What, besides browsers, uses these tags at the 
moment?
Finally, can we request registration of at least a variant subtag, rather 
than a separate language subtag?  Just one, say 17c, so that we could then 
have the tag fr-US-17c.
I'd love to have that, as 17th-century French still has a few 
peculiarities, though it is generally more modern.
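As a quick check of the proposed subtag, the variant syntax in the RFC 4646 
ABNF (variant = 5*8alphanum / (DIGIT 3alphanum)) can be sketched as a 
pattern match.  The helper name and the alternative candidates below are 
hypothetical, chosen only to show the length rule; being well-formed is 
not the same as being registered.

```python
import re

# Variant subtag syntax per the RFC 4646 ABNF:
#   variant = 5*8alphanum / (DIGIT 3alphanum)
# i.e. five to eight alphanumerics, or four characters starting with a digit.
VARIANT_RE = re.compile(r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})")


def is_wellformed_variant(subtag: str) -> bool:
    """Check only the syntax of a variant subtag, not its registration."""
    return VARIANT_RE.fullmatch(subtag) is not None


# '17c' is the form proposed above; the other two are made-up candidates.
for candidate in ["17c", "17th", "17thc"]:
    print(candidate, is_wellformed_variant(candidate))
# 17c False
# 17th True
# 17thc True
```

So a three-character form such as 17c would not fit the variant syntax as 
it stands; a longer form would at least be well-formed.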
I see no reason to change the dates you have (1200-1400 = Old French; 
1400-1600 = Middle French); those should be o.k.  It is just that in some 
instances the dates vary, depending on the author, etc.
Thanks!

--C. E. Whitehead
cewcathar@hotmail.com


>From: John Cowan <cowan@ccil.org>
>To: CE Whitehead <cewcathar@hotmail.com>
>CC: www-international@w3.org
>Subject: Re: What to do with Gaulish ?
>Date: Mon, 13 Nov 2006 18:29:37 -0500
>
>
>CE Whitehead scripsit:
>
> > Hi, I am troubled by tags like frc, fro, and frm because I am wondering
> > what happens when a person using a search engine asks for pages
> > in French?  Will the frc, fro, frm pages turn up too?
>
>In practice, search engines tend to ignore language tagging in favor
>of statistical analysis of text, since language tags are so often
>missing or incorrect.
>
> > It's quite possible that a person interested in French will be
> > interested in moyen Francais/Middle French (frm) and in Old French
> > (fro) if the search is for someone studying French.
>
>There are various contexts where multiple language tags can be specified,
>indicating that you will accept content in any of these, usually with a
>priority order.
>
> > The trouble with you all is you assume that people are just searching
> > for pages in their first language and that they have only one real
> > primary language they can accept pages in; clearly this cannot be the
> > case for fro (Old  French) and frm (Moyen Francais).
>
>Not at all.
>
> > It's also conceivable that a person might want documents that are
> > written in either a Creole of French or Standard French.
>
>The same feature (the language-tag search list) can be used in that case
>as well.
>
> > One could of course list all of these in the meta content tags; for
> > example for my "Moyen francais" document I could list: lang=en, fr, frm
>
>Indeed.
>
> > but some applications used to put up pages at some web hosts embed
> > one's document into the body of a page they create; that's the case
> > with teacher web (http://teacherweb.com), as I pointed out once before.
>
>That's known to be a problem, yes.
>
> > Also, as I noted, some of the 17th Century new world documents were
> > in Middle French although you all have set the dates as 1400-1600
> > (those dates can vary a bit; you'd be surprised also at the amount of
> > variation you can get in any given language at any given time before
> > literacy was so widespread)
>
>We didn't pick the dates, the ISO 639-2 Registration Authority
>(the Library of Congress) did.  You can go to
>http://www.loc.gov/standards/iso639-2/php/iso639-2chform.php
>and request a change.
>
> > I note that for Arabic (which has, as far as I know, and I am no expert,
> > the following main subdivisions in its dialects: [...]) you just have
> > to use the country codes; at least this is all I saw?
>
>As of ISO 639-2 and RFC 4646, yes.  In 639-3 and 4646bis, about
>30 different Arabic language tags will be available:
>
>aao  	Algerian Saharan Arabic
>abh 	Tajiki Arabic
>abv 	Baharna Arabic
>acm 	Mesopotamian Arabic
>acq 	Ta'izzi-Adeni Arabic
>acw 	Hijazi Arabic
>acx 	Omani Arabic
>acy 	Cypriot Arabic
>adf 	Dhofari Arabic
>aeb 	Tunisian Arabic
>aec 	Saidi Arabic
>afb 	Gulf Arabic
>ajp 	South Levantine Arabic
>apc 	North Levantine Arabic
>apd 	Sudanese Arabic
>arb 	Standard Arabic
>arq 	Algerian Arabic
>ars 	Najdi Arabic
>ary 	Moroccan Arabic
>arz 	Egyptian Arabic
>auz 	Uzbeki Arabic
>avl 	Eastern Egyptian Bedawi Arabic
>ayh 	Hadrami Arabic
>ayl 	Libyan Arabic
>ayn 	Sanaani Arabic
>ayp 	North Mesopotamian Arabic
>bbz 	Babalia Creole Arabic
>pga 	Sudanese Creole Arabic
>shu 	Chadian Arabic
>ssh 	Shihhi Arabic
>
> > Why not also have variants for dates, such as two digits plus the letter
> > c, with the two digits indicating the century (01-20; I assume that the
> > century would be redundant for the 21st century variant of a language)?
>
>Those forms are too short, and there are problems with generic tags, as
>centuries are not the appropriate units for many languages.
>
>--
>Is a chair finely made tragic or comic? Is the          John Cowan
>portrait of Mona Lisa good if I desire to see           cowan@ccil.org
>it? Is the bust of Sir Philip Crampton lyrical,         http://ccil.org/~cowan
>epical or dramatic?  If a man hacking in fury
>at a block of wood make there an image of a cow,
>is that image a work of art? If not, why not?               --Stephen Dedalus
>

Received on Tuesday, 14 November 2006 00:48:23 GMT
