- From: Mark Davis <mark.davis@icu-project.org>
- Date: Thu, 12 Apr 2007 09:13:09 -0700
- To: "Richard Ishida" <ishida@w3.org>
- Cc: www-international@w3.org, "LTRU Working Group" <ltru@ietf.org>
- Message-ID: <30b660a20704120913g2924f63dr2ca0d13f7b27a2ed@mail.gmail.com>
The summary looks good. This discussion raises two items for the LTRU group.

Q1. What tag should be used where the content is definitely a language, but
no code is available yet? (This is an area where ISO 15924 is ahead of
ISO 639 (and 3166), since it has Zzzz: Code for uncoded script.)

Q2. Clarify the wording around "und" vs "".

Mark

On 4/12/07, Richard Ishida <ishida@w3.org> wrote:
>
> Ok, I just found some additional emails between John and Mark that had been
> scooped into my LTRU folder by my mail client because [LTRU] was prepended
> to the subject. They shift the baseline for discussion and agreement.
>
> What I now see as the summary of where we are is:
>
> [a] Determined, and a language (for which a subtag exists): <subtag(s)>
>
> [b] Determined, and not a language: zxx
>
> [c] Determined, but not a language for which a subtag exists: ???
>
> [d] Undetermined, and not sure whether it is a language or not:
>     xml:lang="" if available, otherwise und
>
> I have revised the article at [1] to make it clearer that whether text is
> linguistic or not is unimportant wrt use of '' and und.
>
> I'm still troubled, however, by the passage in RFC 4646, so I'll
> repeat those comments here and copy the LTRU folks:
>
> I'm also a little worried about the wording in section 4.1 of RFC 4646[2]
> about und, which quite clearly says that you shouldn't use und unless the
> *protocol* demands it, or sometimes when matching tags. This doesn't make
> any distinction between specifying the language of a resource and turning
> off language declarations for a range of embedded text. It seems that this
> suggests a way in which xml:lang='' and xml:lang="und" are not equivalent,
> since there are no such restrictions on xml:lang="". In my opinion, the text
> of RFC 4646 needs some work, both to relax the use of und in scenarios where
> 'undefined text' occurs in a context with a defined language, and to clarify
> the relationship of und to xml:lang=''.
>
> RI
>
> [1] http://www.w3.org/International/questions/qa-no-language#answer
>
> [2] http://www.rfc-editor.org/rfc/rfc4646.txt
>
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
>
> http://www.w3.org/People/Ishida/
> http://www.w3.org/International/
> http://people.w3.org/rishida/blog/
> http://www.flickr.com/photos/ishida/
>
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of Richard Ishida
> > Sent: 12 April 2007 14:56
> > To: www-international@w3.org
> > Subject: RE: For review: Tagging text with no language
> >
> > So it seems the alternatives that John is suggesting are:
> >
> > Determined, and a language (for which a subtag exists): <subtag(s)>
> > Determined, and not a language: zxx
> > Determined, but not a language for which a subtag exists: ???
> >
> > Undetermined, and not sure whether it is a language or not:
> >     xml:lang="" (if available)
> > Undetermined, but sure that it's a language: und
> >
> > The implications of this for X/HTML are that there is no way to say that
> > text is undetermined if you are not sure whether it's a language or not.
> >
> > This is very different from Jon Hanna's proposal at
> > http://lists.w3.org/Archives/Public/www-international/2007JanMar/0178.html
> >
> > Can we please discuss this. I'm particularly hoping for contributions from
> > John, Jon, Mark, Martin and Addison (though he's on vacation at the moment).
> >
> > For my part, having experienced, even when trying to write this email, how
> > difficult it is to succinctly talk about the difference between something
> > that is unidentified and may or may not be a language, I'm a little leery
> > about accepting the evidence in the mail below, John. Can we be sure that
> > the people who drafted that text were consciously making the distinction you
> > mention rather than just being a little imprecise in wording?
> >
> > I'm also a little worried about the wording in section 4.1 of RFC 4646[1]
> > about und, which quite clearly says that you shouldn't use und unless the
> > *protocol* demands it, or sometimes when matching tags. This doesn't make
> > any distinction between specifying the language of a resource and turning
> > off language declarations for a range of embedded text. It seems that this
> > suggests another way in which xml:lang='' and xml:lang="und" are not
> > equivalent. In my opinion, the text of RFC 4646 needs some work,
> > either to relax the use of und in scenarios where undefined text occurs in a
> > context that is defined, or to clarify the relationship of und to
> > xml:lang=''.
> >
> > RI
> >
> > [1] http://www.rfc-editor.org/rfc/rfc4646.txt
> >
> > ============
> > Richard Ishida
> > Internationalization Lead
> > W3C (World Wide Web Consortium)
> >
> > http://www.w3.org/People/Ishida/
> > http://www.w3.org/International/
> > http://people.w3.org/rishida/blog/
> > http://www.flickr.com/photos/ishida/
> >
> > > -----Original Message-----
> > > From: www-international-request@w3.org
> > > [mailto:www-international-request@w3.org] On Behalf Of John Cowan
> > > Sent: 11 April 2007 21:24
> > > To: Mark Davis
> > > Cc: John Cowan; CE Whitehead; www-international@w3.org
> > > Subject: Re: For review: Tagging text with no language
> > >
> > > Mark Davis scripsit:
> > >
> > > > I believe that that is adding an interpretation to "und" which is not
> > > > borne out by either the source standards, nor in common usage.
> > >
> > > ISO 639-2 says merely "Undetermined", but this is placed in a
> > > column labeled "English name of language", so I think it's
> > > fair to read it as "Undetermined language". But ISO 639-3
> > > is, I think, definitive.
> > > http://www.sil.org/iso639-3/scope.asp#S says (in part):
> > >
> > >     The identifier [und] (undetermined) is provided for those
> > >     situations in which a language or languages must be indicated
> > >     but the *language* cannot be identified [emphasis added].
> > >
> > > By contrast, "zxx" is explained in the next sentence thus:
> > >
> > >     The identifier [zxx] (no linguistic content) may be applied in a
> > >     situation in which a language identifier is required by system
> > >     definition, but the item being described does not actually
> > >     contain linguistic content.
> > >
> > > In any case, the document I'm commenting on says that "zxx"
> > > is non-linguistic content, and that "und" and "" are
> > > synonymous and represent linguistic content. Whatever "und"
> > > may or may not mean, I think there's no doubt that "" can be
> > > applied to both linguistic and non-linguistic content.
> > >
> > > --
> > > You escaped them by the will-death      John Cowan
> > > and the Way of the Black Wheel.         cowan@ccil.org
> > > I could not.  --Great-Souled Sam        http://www.ccil.org/~cowan
> > >
>
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru
>

--
Mark
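
For readers who want the [a]-[d] summary above in executable form, here is a minimal sketch in Python. It is an illustration only, not part of RFC 4646 or any W3C document: the function name `choose_lang_tag` and its parameters are hypothetical, and case [c] deliberately returns `None` because the thread leaves that case open ("???").

```python
from typing import Optional


def choose_lang_tag(
    determined: bool,
    is_language: bool = False,
    subtag: Optional[str] = None,
    empty_tag_allowed: bool = True,
) -> Optional[str]:
    """Pick a language tag value following the [a]-[d] summary in this thread.

    All names here are hypothetical and exist only for illustration.
    """
    if not determined:
        # [d] Undetermined: use the empty value (e.g. xml:lang="") where the
        # format allows it, otherwise fall back to "und".
        return "" if empty_tag_allowed else "und"
    if not is_language:
        # [b] Determined, and not a language.
        return "zxx"
    if subtag:
        # [a] Determined, and a language for which a subtag exists.
        return subtag
    # [c] Determined to be a language, but no subtag exists. The thread
    # marks this case "???", so no value is guessed here.
    return None


if __name__ == "__main__":
    print(repr(choose_lang_tag(True, is_language=True, subtag="fr")))
    print(repr(choose_lang_tag(True, is_language=False)))
    print(repr(choose_lang_tag(False)))
    print(repr(choose_lang_tag(False, empty_tag_allowed=False)))
```

The `empty_tag_allowed` flag stands in for "if available" in case [d]: formats that cannot carry an empty language value would pass `False` and get `und`.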
Received on Thursday, 12 April 2007 16:13:16 UTC