
Re: What to do with Gaulish ?

From: CE Whitehead <cewcathar@hotmail.com>
Date: Tue, 14 Nov 2006 17:53:00 -0500
Message-ID: <BAY114-F24DA524E01E2115CB48FD1B3EB0@phx.gbl>
To: ietf-languages@iana.org, iso639@dkuug.dk
Cc: www-international@w3.org

Hi, I am troubled by tags like frc, fro, and frm, because I wonder what
happens when a person using a search engine asks for pages in French.
Will the frc, fro, and frm pages turn up too?
It's quite possible that a person interested in French, for example someone
studying it, will also be interested in moyen français/Middle French (frm)
and in Old French (fro).
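
To make the concern concrete, here is a minimal sketch of simple prefix matching between a requested language range and a document's tag, in the style of the "basic filtering" described in RFC 4647. The function name is my own, and I am only assuming that a search tool might match this way; the point is that a request for fr would pick up fr-CA but not frc, fro, or frm.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Return True if `tag` matches `language_range` by prefix matching.

    A range matches a tag when the two are equal (ignoring case) or when
    the range is a prefix of the tag and the next character is a hyphen.
    """
    r = language_range.lower()
    t = tag.lower()
    if r == "*":  # wildcard range matches every tag
        return True
    return t == r or t.startswith(r + "-")


# A request for "fr" against a few document tags.
for doc_tag in ["fr", "fr-CA", "frc", "fro", "frm"]:
    print(doc_tag, basic_filter_match("fr", doc_tag))
```

Under this kind of matching the three-letter codes are simply different languages from fr, which is why the historical pages would not turn up.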

Also, as I noted, some of the 17th Century new world documents were in 
Middle French although you all have set the dates as 1400-1600 (those dates 
can vary a bit; you'd be surprised also at the amount of variation you can 
get in any given language at any given time before literacy was so 
widespread).

It's also conceivable that a person might want documents that are written in
either a French Creole or Standard French.

One could of course list all of the languages related to a particular page 
using the meta content tags; for example for my "Moyen francais" document I 
could list:
lang=en, fr, frm

but some applications that web hosts provide for putting up pages embed
one's document into the body of a page that they create, so the author
cannot set these tags; that's the case with TeacherWeb
(http://teacherweb.com), as I pointed out to the internationalization
mailing list; they concur that this is a problem.

John Cowan, of that list, notes, however, the following,

"In practice, search engines tend to ignore language tagging in favor
of statistical analysis of text, since language tags are so often
missing or incorrect.

"There are various contexts where multiple language tags can be specified,
indicating that you will accept content in any of these, usually with a
priority order."
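
One such context is the HTTP Accept-Language header, where each language range can carry a q weight. A minimal sketch of reading such a list in priority order, assuming a simplified header syntax (the function name and the example header value are illustrative only, not taken from any particular application):

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Split a simplified Accept-Language value into (range, weight) pairs,
    sorted from most to least preferred."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        weight = 1.0  # a range without an explicit q value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat an unreadable weight as "not wanted"
        ranges.append((tag.strip(), weight))
    # sorted() is stable, so ranges with equal weight keep header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


# Hypothetical request: Modern French first, then Middle, then Old French.
print(parse_accept_language("fr, frm;q=0.8, fro;q=0.5"))
```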

Nevertheless, I am still certain that we are using the tags for something
besides the browser, and I assume that search engines will ultimately
incorporate these tags if internationalization truly proceeds as
recommended in the W3C document "Internationalization Best Practices:
Specifying Language in XHTML & HTML Content"
(http://www.w3.org/TR/i18n-html-tech-lang/).

I note that using the country codes is optional, and that the user
chooses when these are relevant!

Why not also have optional variant tags indicating the century in which a 
dialect/language was used, for example

12c (12th century, 1100-1199 A.D.)
13c (13th century, 1200-1299 A.D.)
14c
15c
16c
17c

and so forth.
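
As a rough sketch of how an application could interpret such subtags, the following turns a century subtag into a range of years, following the convention above (12c = 1100-1199). The subtags themselves are my proposal and are not registered anywhere, and the function name is made up for the illustration.

```python
import re


def century_range(subtag: str) -> tuple[int, int]:
    """Map a proposed century subtag such as '13c' to (first_year, last_year),
    using the convention that 12c covers 1100-1199 A.D."""
    match = re.fullmatch(r"(\d{1,2})c", subtag.lower())
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"not a century subtag: {subtag!r}")
    century = int(match.group(1))
    first_year = (century - 1) * 100
    return first_year, first_year + 99


print(century_range("12c"))
print(century_range("17c"))
```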

These become quite relevant for 17th century European languages, which are
more or less 'modern' but sometimes vary quite a bit from the modern version
of the language. I found this to be the case when dealing with 17th century
French in a report coming from the U.S.: some of the features I noted in the
1683 report were reminiscent of Old French and many of Middle French, and
spellings were sometimes irregular and phonetic. It might be understandable
to a speaker of Modern French, but so might 16th century French, which does
get the Middle French tag. Elsewhere, in some texts from France, 17th
century French appears more like the modern variety. Likewise, Shakespeare's
16th-17th century English is modern (in fact, as I understand things, his
use of English based on Scots dialect made Modern English what it is), but
it does vary a bit from English used today.

The tags can help encode modern slang too.

Like the country codes, they would sometimes be redundant; then it would be 
recommended to drop them, to specify the language as succinctly as possible.

The main problem I see in adding these now is that a variety of forms for 
say Middle French and Old French become possible

fro-13c  (Or fro-14c)
or
fro
or
fr-13c (or fr-14c)

Obviously one would have to specify which is preferred for Old French, say:
fro [Old French], with or without the century tag, or fr [French] with the
century tag. The former is probably the best option for Old French, as it
really is a bit different from Modern French and is studied as a separate
language, though generally under the auspices of a French department. I am
not sure about Middle French; to me, both ways of writing it would be fine:
frm with or without a century tag, or fr with a century tag. I do not know
if having two ways to write the tag for Middle French would cause problems,
but having only frm does not work either: speakers of French can generally
read Middle French without many problems, but may not know what variety to
ask for, and so may ask just for French!
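
One way an application might live with both forms is to file each document under every equivalent tag. The sketch below is only an illustration under my own assumptions: the table relating fro and frm to fr is hypothetical (nothing in the registry says this), the century subtags are the proposed ones, and the function name is made up.

```python
# Hypothetical table: historical language codes and the modern language a
# reader might ask for instead. This is an assumption for illustration.
HISTORICAL_PARENT = {"fro": "fr", "frm": "fr"}


def index_keys(tag: str) -> list[str]:
    """Return the tags a document could be filed under: its own tag, plus
    the same tag rewritten with the modern language code if one is known."""
    subtags = tag.lower().split("-")
    keys = ["-".join(subtags)]
    parent = HISTORICAL_PARENT.get(subtags[0])
    if parent is not None:
        keys.append("-".join([parent] + subtags[1:]))
    return keys


print(index_keys("frm-17c"))  # filed under both frm-17c and fr-17c
print(index_keys("fro"))      # filed under both fro and fr
print(index_keys("fr-13c"))   # already uses fr, so only one key
```

With something like this, a reader who asks just for French could still be shown the Middle French document, while a reader who asks for frm gets only the historical material.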

On the same issue, what is going to happen with Arabic when you get the new
subtags? Will people still be able to use ar with a country code to indicate
the language, or is the new subtag to be the only option? I am not sure
which should be the case myself, as the dialects are quite different; but
most written Arabic is in the standard language rather than a spoken
dialect, so these new codes should probably apply, for the most part, only
to spoken materials or phonetic transcriptions.

Of course, having a century tag would not solve everything for languages 
that vary over time:

For example,
in English prior to Shakespeare, there were many competing dialects; thus
we have the very modern-sounding English of the 12th-13th-century Scots
dialect of "Sir Patrick Spence," without the Chaucerian -en
endings on verb plurals and noun plurals
("The King sits in Dunferling toune
drinking the blood-reid wine,
'O whar can ee get good sailor to sail this ship o' mine?'
Up and spek an eldern knight, set at the king's recht knee,
'Sir Patrick Spence is the best sailor,
that sails upon the sea,'"
and so on; my apologies for my spelling of this; I put it down from auditory 
memory in a hurry.)
As I hear it, the dialect in "Sir Patrick" is actually almost identical to 
modern Scottish English dialect (English as spoken by the Scots) but I am 
not an expert.

Then we get the English of Chaucer, with a French (the English and French 
were fighting back and forth over their borders) and Germanic lilt,
which is not used today (it has, as you all say, been obsoleted):
("and smalle foules maken melodien
that clepen all the night with open-eyen,
so priketh him nature in her couragen,
then loongan folk to goon on pilgramagen;"
again my apologies for the spellings; I stuck these in quickly).

But the addition of variants indicating the century could clarify many 
things, and simply, as I noted, not be used when redundant.


On this note, I'd like to know how to apply for a variant (not a language) 
subtag, 17c, if I may do so.
Hope I may.
Thanks.

Sincerely,

C. E. Whitehead
cewcathar@hotmail.com



P.S. I've included below a description of the French in the 17th century 
U.S. document (for which the frm-US tag is of course clear enough to 
indicate the date if that is all one wants to do since there was as far as 
we know no 16th century U.S. French) for those who may not be convinced that 
it is not in Modern French.

Thanks!

* * * NOTE--below are the peculiarities of the French document I am dealing 
with * * *

Grammar Changes

Singular nouns in the nominative may end in "s" as may their adjectives (in
Old French, the nominative endings for the plural and singular were the 
reverse of today's endings; it is thus the oblique endings for the plural 
and singular from Old French that today's endings, with -s for the plural, 
are based on):

un/uns? isles
    (Fr. Moderne: une île)
semblables
    (Fr. Moderne: semblable)

Spelling/Misspelling
trouver
    'to find,' might be spelled trouve, 'found' or trouvez, 'you find' 
(trouver and trouve  [with an accent aigu on the e] and trouvez are all 
pronounced identically of course, so the spelling variation seems to be 
grounded in phonetics)

Spelling Changes
As in Middle French,
ai sometimes becomes oi; ait sometimes becomes oist; êt (and also et and
ét) sometimes becomes est; ot sometimes becomes ost; îl sometimes becomes
isl; ui sometimes becomes uy; and oin becomes oing. Occasionally, v may be
realized as b, while both s and c may be realized as sc as in "scavoir" (for
"savoir") and "escrasent" (for "écrasent"); also diphthongs with i may be
spelled with y as in "celuy" (for "celui").
Additionally, archaic nominative forms ending in "s" (from Old French) might
occasionally be used!

alesne
    (Fr. Moderne "alène," 'awl;' see 
http://portail.atilf.fr/cgi-bin/getobject_?p.0:45./var/artfla/dicos/ACAD_1694/IMAGE/ 
[in Le dictionnaire de l'académie françoise, 1694; this reference was 
supplied by Gardefeu at http://www.wordreference.com])
allast
    (Fr. Moderne "allât," 'go,' imparfait du subjonctif/imperfect of the 
subjunctive.)
avoit, alternately aboit
    (Fr. Moderne "avait," 'he, she, it had')
avoient
    (Fr. Moderne avaient, 'they had')
cassetestes
    (Fr. Moderne "casse-têtes" 'war clubs,' perhaps 'tomahawks')
celuy
    (Fr. Moderne "celui" 'that one,' 'which one')
charioit
    (Fr. Moderne "chariait"?)
connoistre
    (Fr. Moderne "connaitre," 'to be acquainted with')
costé
    (Fr. Moderne "côté," 'coast,' 'side')
disoit
    (Fr. Moderne "disait," 'he, she, it said,' 'he, she, it was saying')
escrasent
    (Fr. Moderne "écrasent," 'they crush' or 'mash')
escrit
    (Fr. Moderne "écrit," past participle of "écrire," 'write')
esté
    (Fr. Moderne "été," past participle of "être," 'been')
estoit, étoit
    (Fr. Moderne "était," 'he, she, it was')
estoient, étoient
    (Fr. Moderne "étaient," 'they were')
fasoit
    (Fr. Moderne "faisait," 'he, she, it was doing')
fenestres
    (Fr. Moderne "fenêtres," 'windows')
feste
    (Fr. Moderne "fête," 'feast,' 'celebration')
francois
    (Fr. Moderne "Français")
froterisont
    (probably Fr. Moderne "fraternisèrent," the simple past tense of 
"fraterniser," to 'fraternize;' in addition to substituting an 'o' for the 
'a' in "fraterniser," de la Salle le jeune seems to have invented some of 
the word's spelling.)
iroit
    (Fr. Moderne "irait," 'would go' [conditional of "aller," 'go'])
isles
    (Fr. Moderne "île," 'island;' the -s ending on "isle" is from the Old 
French nominative form)
loing
    (Fr. Moderne "loin," 'far')
luy
    (Fr. Moderne "lui", 'him,' 'it')
nommoient
    (Fr. Moderne "nommaient," 'they were named')
paroist
    (Fr. Moderne "parait," imperfect of "paraitre," 'it seemed')
pluye
    (Fr. Moderne "pluie," 'rain')
peschoient
    (Fr. Moderne "pêchaient," 'they fished,' 'they were fishing')
pourroit
    (Fr. Moderne "pourrait," 'he, she, it could') [I misspelled!]
scavoir
    (Fr. Moderne "savoir," 'to know')
sçavoit
    (Fr. Moderne "savait," 'he, she, it knew,''he, she, it could tell')
sise
    (Archaic French [feminine? not in this case] form of Fr. Moderne "six," 
'six')
soi
    (Fr. Moderne "soi," 'self;' or "soi-même," 'oneself')
tirois or tiroit
    (Fr. Moderne "tirait," 'drew' as in drew a bow--to shoot an arrow)

There are also particular words such as canot (English 'canoe') that come 
from the Americas.

Received on Tuesday, 14 November 2006 22:53:25 GMT
