
(wrong string) ‘this is not in any language’ in XHTML/HTML

From: CE Whitehead <cewcathar@hotmail.com>
Date: Sun, 18 Mar 2007 15:09:32 -0400
Message-ID: <BAY114-F63F499584F311BB82DDF4B3770@phx.gbl>
To: asmusf@ix.netcom.com
Cc: www-international@w3.org

>But perhaps such a systematic list already exists elsewhere.

I'm not going to address this; it's out of my knowledge.

My other comments are below (thanks for explaining the rationale of this 
message to me, I guess it makes sense)

>>>1) you have content that has not been classified
>>>2) you have content for which classification has failed
>>>3) you have content that is known to not fit  any of the classifications  
>>>(would not this be 2?)
>>>4) you have content to which the classification cannot apply
>>>5) you have content that fits multiple classifications

>>>in the case of tagging natural language content, the label "zxx" is 
>>>clearly the correct one for case 4. When there is no linguistic content, 
>>>the classification cannot apply.
>>>"und" seems  a fine label when you want to convey that tagging has not 
>>>happened (case 1 or 2 - the distinction between these is not necessarily 
>>>of sufficient interest to carry it forward). But so would the empty tag 
>>>if it had been allowed.
>>und would be o.k. if there were some language but it has not been 
>>determined which or maybe even how many  (I think Addison's comment, that 
>>und was not recommended by the rfc when there was no real language, was 
>>helpful for und)
>I read this as implying that we are in agreement.
I guess so.

>>>Case 3 could be handled with any form or label that says "no tag assigned 
>>>yet", but failing that, if available, a private tag might be useful.
>>>A single string like "OK" is an example that could fit category 5.

I would normally tag "o.k." as en, but as another language if the term had 
really seeped into that language. I might also tag it as slang, or maybe as 
mul. I kind of think this will end up being ad hoc.
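
To make the ad hoc part concrete, here is a minimal Python sketch of how the five cases quoted above might map to tags. It follows the thread only where the thread is explicit (und for cases 1 and 2, zxx for case 4). The private-use tag "x-nofit" for case 3 and the choice of mul for case 5 are assumptions made for illustration, as are all the names in the code.

```python
from enum import Enum


class Case(Enum):
    """The five situations listed in the quoted message."""
    NOT_CLASSIFIED = 1          # content has not been classified
    CLASSIFICATION_FAILED = 2   # classification was tried and failed
    FITS_NO_CLASSIFICATION = 3  # known not to fit any classification
    NOT_APPLICABLE = 4          # classification cannot apply
    FITS_MULTIPLE = 5           # fits several classifications


# Hypothetical private-use tag; "x-nofit" is made up for this sketch.
HYPOTHETICAL_PRIVATE_TAG = "x-nofit"


def tag_for(case):
    """Return one possible language tag for the given case."""
    if case in (Case.NOT_CLASSIFIED, Case.CLASSIFICATION_FAILED):
        return "und"  # some language, not determined which
    if case is Case.NOT_APPLICABLE:
        return "zxx"  # no linguistic content
    if case is Case.FITS_NO_CLASSIFICATION:
        return HYPOTHETICAL_PRIVATE_TAG
    return "mul"      # assumption: treat case 5 as multiple languages
```

A string like "OK" would go through tag_for(Case.FITS_MULTIPLE) here, but tagging it en (or whatever language the surrounding text is in) is just as defensible.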

If you share my maybe off-base theory that o.k. comes from Langue d'oc, I 
guess you could encode that it is from that language too, though o.k. is by 
now pretty English.

If you have another theory, then you would not encode "o.k." as Langue d'oc.
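
In XHTML/HTML terms, whichever tag is chosen ends up as a lang (and, for XHTML, xml:lang) attribute on the element holding the word. A small Python sketch of that, where the helper name span_with_lang is hypothetical and the choice of a span element is only an assumption:

```python
from html import escape


def span_with_lang(text, tag):
    """Wrap text in a span carrying the given language tag.

    Sets both lang and xml:lang so the markup suits HTML and XHTML.
    """
    safe_tag = escape(tag, quote=True)
    return '<span lang="{0}" xml:lang="{0}">{1}</span>'.format(
        safe_tag, escape(text)
    )


# "o.k." tagged as English, or as Occitan ("oc") under the other theory.
english = span_with_lang("o.k.", "en")
occitan = span_with_lang("o.k.", "oc")
```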

>>John Cowan did address the off-topic remarks on the word I'd chosen as an 
>>example sufficiently,

Here are some links on this word; there seems to be an accent grave over the 
o in oc; otherwise it might get pronounced something like an open u  is that


or u


lenga d'òc


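 - "se prononce comme le o français de "pomme", "dort", etc. Jamais fermé 
comme dans le français "métro", "boulot", "dodo"."
(that is: "pronounced like the French o in 'pomme', 'dort', etc. Never closed 
as in the French 'métro', 'boulot', 'dodo'.")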

Is that  the backwards C or whatever
or an open o??

(here's the IPA vowel chart
http://en.wikipedia.org/wiki/International_Phonetic_Alphabet#Vowels )

(I do not speak this language; I read it in its medieval form; when I was in 
France I maybe saw a few modern words/heard a few in Oc, but really used 
French there and heard a dialect of French I think;
my original source of the /ok/ pronunciation is a course in Old French where 
we studied the trobadors some and then a lecture, then this was an interest 
sort of a lot once)

--C. E. Whitehead

Received on Sunday, 18 March 2007 19:09:40 UTC
