Re: What to do with Gaulish ?

From: Elizabeth J. Pyatt <ejp10@psu.edu>
Date: Fri, 10 Nov 2006 11:18:38 -0500
Message-Id: <p06230909c17a3f96bf9a@[]>
To: www-international@w3.org

First, let me put on the flame-retardant suit...here goes!

The language tag system works well for 
well-established modern languages, but I don't 
think the current process of language tag 
registration is at all well designed for 
colloquial forms or obscure ancient languages.

My main objection is that I am not seeing a 
systematic process where the appropriate 
linguistic community is ever consulted. There 
probably does need to be a "fr-cajun" tag 
(because you might have to use the "roa" Generic 
Romance tag otherwise), but in the current 
scenario, the following will likely happen:

* An innocent researcher will submit the "fr-cajun" tag.
* The list may discuss whether it should be "fr-us", 
  "fr-us-LA", "fr-us-Cajun", "fr-caj" or "fr-cajun".
  However, I will not see any messages from any linguist identifying
  the dialect classifications that French dialect 
  researchers use (unless I missed that step...).

Basically tags will be created haphazardly, and I 
suspect duplications will occur (e.g. fr-caj vs 
fr-LA).  There is also no mechanism in place to 
ensure that all French dialects (or langue d'oïl 
languages) get consistent tags.

Without a consistent set of tags, any research 
that compares closely related French forms is 
pretty hopeless. I need a single set of tags if I 
want to compare French forms (e.g. Cajun vs. 
Walloon vs. Jèrriais (Jersey) vs. 17th cent US 

Even worse, the French linguistic community may 
ignore these ad hoc tags unless it was part of the 
original consultation. One project may use the ad 
hoc tag it registered, but not all dialect 
projects will (or they'll be using an alternate 
system developed by the dialectologists together).

Without the systematicity...what's the goal of 
registering these tags other than as a 
"feel-good" measure?

You could argue implementation standards, but do 
I seriously expect there will ever be a Gaulish 
Google? Will support for Gaulish in a search 
engine actually exist? Will a Gaulish speech 
synthesizer/text parser ever be built? Will a 
Gaulish collation sort for SQL be developed? And 
I think I will be waiting a while for the Gaulish 
grammar checker from Microsoft.

Note that these tools do exist for the MODERN 
Celtic languages because there are more texts and 
live speakers to worry about. The modern Celtic 
languages also have reasonably accurate tags.

What are the actual consequences if I create my 
own tag for a very obscure form (and maybe tell the 
other linguists) but do not register it for 
worldwide use?

You can't say "Other researchers will miss your 
materials" because support for these ad hoc tags 
will probably not be systematically implemented 
by the developers anyway.

What will actually happen is what happens 
now...the language name is just entered into a 
keyword field and that's how researchers find 
each other.
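
To make that contrast concrete, here is a minimal sketch, not a real 
catalogue system. The records, field names and helper functions are all 
hypothetical, and "fr-caj" and "fr-LA" are just the ad hoc examples from 
this message, not registered tags. It assumes two projects tagged the same 
variety differently but both typed the language name into a keyword field.

```python
# Hypothetical catalogue records. Two projects describe the same variety
# but chose different ad hoc language tags for it.
records = [
    {"title": "Cajun folk tales", "lang": "fr-caj", "keywords": ["Cajun French"]},
    {"title": "Louisiana field recordings", "lang": "fr-LA", "keywords": ["Cajun French"]},
    {"title": "Walloon word list", "lang": "wa", "keywords": ["Walloon"]},
]


def find_by_tag(records, tag):
    """Return titles whose language tag matches exactly (case-insensitive)."""
    wanted = tag.lower()
    return [r["title"] for r in records if r["lang"].lower() == wanted]


def find_by_keyword(records, keyword):
    """Return titles that list the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [
        r["title"]
        for r in records
        if any(k.lower() == wanted for k in r["keywords"])
    ]


# The tag lookup only sees one project's material...
print(find_by_tag(records, "fr-caj"))
# ...while the keyword lookup finds both.
print(find_by_keyword(records, "Cajun French"))
```

With duplicate tags in circulation, the tag lookup silently misses half of 
the material, while the keyword lookup does not depend on which tag each 
project happened to pick.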

My thoughts as a lurking linguist.


P.S. If you need a Cajun tag, I would recommend 
checking whether there are pre-existing electronic 
archives of Cajun documents, then finding out 
how THEY tagged it.

You may find the identification is actually in a 
keyword field in the metadata. I'm pretty sure 
that's where the "Gaulish" metadata is.

Elizabeth J. Pyatt, Ph.D.
Instructional Designer
Education Technology Services, TLT/ITS
Penn State University
ejp10@psu.edu, (814) 865-0805 or (814) 865-2030 (Main Office)

210 Rider Building II
227 W. Beaver Avenue
State College, PA   16801-4819
Received on Monday, 13 November 2006 01:39:30 UTC