- From: CE Whitehead <cewcathar@hotmail.com>
- Date: Mon, 13 Nov 2006 19:47:21 -0500
- To: cowan@ccil.org
- Cc: www-international@w3.org
Hi, thanks! Your clarifications are quite helpful.

I did not really think search engines used the tags much; I get the language I want by entering a text string in that language in the search engine. But suppose the search engines, or some of them, did decide to use the tags: would a person who requested French (fr) then be unable to access documents in Middle French (frm)?

Also, I'm curious: I expected that the search engines would use the tags somewhat, or it would scarcely be worth my trouble to include much more than the text-processing tags. What besides browsers uses these tags at the moment?

Finally, can we request to register a variant subtag rather than a separate language subtag, just one, say 17c, so that we can then have the tag fr-US-17c? I'd love to have that, as 17th-century French still has a few peculiarities, though it is generally more modern.

I see no reason to change the dates you have (1200-1400 = Old French; 1400-1600 = Middle French); they should be o.k. It is just that in some instances the dates vary, depending on the author, etc.

Thanks!
--C. E. Whitehead
cewcathar@hotmail.com

>From: John Cowan <cowan@ccil.org>
>To: CE Whitehead <cewcathar@hotmail.com>
>CC: www-international@w3.org
>Subject: Re: What to do with Gaulish ?
>Date: Mon, 13 Nov 2006 18:29:37 -0500
>
>CE Whitehead scripsit:
>
> > Hi, I am troubled by tags
> > like frc, fro, and frm because I am wondering
> > what happens when a person using a search engine asks for pages
> > in French? Will the frc, fro, frm pages turn up too?
>
>In practice, search engines tend to ignore language tagging in favor
>of statistical analysis of text, since language tags are so often
>missing or incorrect.
>
> > It's quite possible that a person interested in French will be
> > interested in moyen français/Middle French (frm) and in Old French
> > (fro) if the search is for someone studying French.
>
>There are various contexts where multiple language tags can be specified,
>indicating that you will accept content in any of these, usually with a
>priority order.
>
> > The trouble with you all is you assume that people are just searching
> > for pages in their first language and that they have only one real
> > primary language they can accept pages in; clearly this cannot be the
> > case for fro (Old French) and frm (moyen français).
>
>Not at all.
>
> > It's also conceivable that a person might want documents that are
> > written in either a Creole of French or Standard French.
>
>The same feature (the language-tag search list) can be used in that case
>as well.
>
> > One could of course list all of these in the meta content tags; for
> > example for my "Moyen français" document I could list: lang=en, fr, frm
>
>Indeed.
>
> > but some applications used to put up pages at some web hosts embed
> > one's document into the body of a page they create; that's the case
> > with teacher web (http://teacherweb.com), as I pointed out once before.
>
>That's known to be a problem, yes.
>
> > Also, as I noted, some of the 17th Century new world documents were
> > in Middle French although you all have set the dates as 1400-1600
> > (those dates can vary a bit; you'd be surprised also at the amount of
> > variation you can get in any given language at any given time before
> > literacy was so widespread)
>
>We didn't pick the dates, the ISO 639-2 Registration Authority
>(the Library of Congress) did. You can go to
>http://www.loc.gov/standards/iso639-2/php/iso639-2chform.php
>and request a change.
>
> > I note that for Arabic (which has, as far as I know, and I am no expert)
> > the following main subdivisions in its dialects, [...] you just have
> > to use the country codes--at least this is all I saw?
>
>As of ISO 639-2 and RFC 4646, yes. In 639-3 and 4646bis, about
>30 different Arabic language tags will be available:
>
>aao Algerian Saharan Arabic
>abh Tajiki Arabic
>abv Baharna Arabic
>acm Mesopotamian Arabic
>acq Ta'izzi-Adeni Arabic
>acw Hijazi Arabic
>acx Omani Arabic
>acy Cypriot Arabic
>adf Dhofari Arabic
>aeb Tunisian Arabic
>aec Saidi Arabic
>afb Gulf Arabic
>ajp South Levantine Arabic
>apc North Levantine Arabic
>apd Sudanese Arabic
>arb Standard Arabic
>arq Algerian Arabic
>ars Najdi Arabic
>ary Moroccan Arabic
>arz Egyptian Arabic
>auz Uzbeki Arabic
>avl Eastern Egyptian Bedawi Arabic
>ayh Hadrami Arabic
>ayl Libyan Arabic
>ayn Sanaani Arabic
>ayp North Mesopotamian Arabic
>bbz Babalia Creole Arabic
>pga Sudanese Creole Arabic
>shu Chadian Arabic
>ssh Shihhi Arabic
>
> > Why not also have variants for dates, such as two digits plus the letter
> > c, with the two digits indicating the century (01-20; I assume that the
> > century would be redundant for the 21st century variant of a language)?
>
>Those forms are too short, and there are problems with generic tags, as
>centuries are not the appropriate units for many languages.
>
>--
>Is a chair finely made tragic or comic?
>Is the portrait of Mona Lisa good if I desire to see it?
>Is the bust of Sir Philip Crampton lyrical, epical or dramatic?
>If a man hacking in fury at a block of wood make there an image
>of a cow, is that image a work of art? If not, why not?
>        --Stephen Dedalus
>
>John Cowan  cowan@ccil.org  http://ccil.org/~cowan
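
The question of whether a request for fr would also reach frm documents, and the reply that several tags can be listed in priority order, can be illustrated with a small sketch. It assumes prefix-based "basic filtering" as described in RFC 4647 and an Accept-Language style list with q-values; the function names, file names, and document tags below are hypothetical and are not taken from the thread or from any particular search engine or server.

```python
def parse_priority_list(header):
    """Parse an Accept-Language style list into (range, q) pairs, best first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a range without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unreadable q-value as "not acceptable"
        ranges.append((tag.strip().lower(), q))
    # sorted() is stable, so ranges with equal q keep their listed order
    return sorted(ranges, key=lambda pair: -pair[1])


def basic_filter_match(language_range, tag):
    """Basic filtering: the tag equals the range or extends it after a '-'."""
    language_range = language_range.lower()
    tag = tag.lower()
    if language_range == "*":
        return True
    return tag == language_range or tag.startswith(language_range + "-")


def filter_documents(header, documents):
    """Return names of documents matching any range, in priority order."""
    results = []
    for language_range, q in parse_priority_list(header):
        if q <= 0:
            continue
        for name, tag in documents:
            if basic_filter_match(language_range, tag) and name not in results:
                results.append(name)
    return results


# Hypothetical documents and their language tags.
documents = [
    ("modern.html", "fr-CA"),
    ("moyen.html", "frm"),
    ("ancien.html", "fro"),
    ("english.html", "en"),
]

print(filter_documents("fr", documents))
print(filter_documents("fr, frm;q=0.8, fro;q=0.5", documents))
```

Under these assumptions, a bare fr range matches fr-CA but not frm or fro, because matching works on whole subtags rather than on shared letters. Listing frm and fro explicitly in the priority list is what brings the older texts in.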
Received on Tuesday, 14 November 2006 00:48:23 UTC
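
The remark that forms like 17c are "too short" presumably refers to the syntax RFC 4646 gives for variant subtags: five to eight alphanumeric characters, or exactly four characters when the first is a digit. The following is a minimal sketch of that well-formedness check only. It does not consult the subtag registry, and "17cent" is a hypothetical form used purely for illustration.

```python
import re

# Variant syntax per the RFC 4646 ABNF: 5 to 8 alphanumerics,
# or exactly 4 characters when the first one is a digit.
VARIANT_RE = re.compile(r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})")


def is_wellformed_variant(subtag):
    """Check syntax only; this says nothing about whether the subtag is registered."""
    return VARIANT_RE.fullmatch(subtag) is not None


# "17c" is the form proposed in the thread, "1996" is an existing
# registered variant, and "17cent" is a hypothetical longer form.
for candidate in ["17c", "1996", "17cent"]:
    print(candidate, is_wellformed_variant(candidate))
```

A three-character form such as 17c fails this check, so a century-style variant would need a longer spelling before it could even be proposed for registration.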