RE: Language X within scope of language Y

From: Jon Hanna <jon@hackcraft.net>
Date: Wed, 19 Jan 2005 18:41:39 -0000
To: "'Misha Wolf'" <Misha.Wolf@reuters.com>, <www-rdf-interest@w3.org>, <www-international@w3.org>
Cc: <ietf-languages@iana.org>
Message-Id: <20050119184140.22AC66EDA177@postie2.hosting365.ie>

> I agree that "en-IT" expresses "English as written/spoken in Italy", 
> but that wasn't, I think, the problem that Reto was writing about in:
> http://lists.w3.org/Archives/Public/www-rdf-interest/2005Jan/0125.html

I don't think so either. It's more a case of "English as used in an
otherwise Italian context", which is not what en-IT means (or rather, in the
original example, a case of Latin in an otherwise German context).

I think the meaning of

<p xml:lang="en">Aw, well, <span xml:lang="fr">c'est la vie</span>, I suppose</p>

is clear. I think the meaning of

<p xml:lang="en">Aw, well, <span title="that's life"><span xml:lang="fr">c'est la vie</span></span>, I suppose</p>

is clearer still.
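
As a rough illustration of why the first example is unambiguous, here is a
minimal Python sketch of xml:lang inheritance using only the standard
library. The helper name text_runs_with_lang is my own invention for this
example, not part of any specification or library; it simply walks the tree
and reports which language is in scope for each run of text.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml" prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def text_runs_with_lang(elem, inherited=""):
    """Yield (language, text) pairs, applying xml:lang inheritance.

    Hypothetical helper for illustration only.
    """
    # An element's own xml:lang overrides whatever it inherited.
    lang = elem.get(XML_LANG, inherited)
    if elem.text and elem.text.strip():
        yield lang, elem.text.strip()
    for child in elem:
        yield from text_runs_with_lang(child, lang)
        # Tail text follows the child but belongs to this element.
        if child.tail and child.tail.strip():
            yield lang, child.tail.strip()

doc = ET.fromstring(
    '<p xml:lang="en">Aw, well, '
    '<span xml:lang="fr">c\'est la vie</span>, I suppose</p>'
)
runs = list(text_runs_with_lang(doc))
for lang, text in runs:
    print(lang, repr(text))
```

The French run is tagged plain "fr" and the surrounding text "en"; nothing
in the markup suggests a regional variant of either.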

The French here isn't fr-GB or anything like that. Nor do I think its not
being fr-GB is controversial, so perhaps we no longer need to include
ietf-languages in the cross-posting list.

The original example is a special case of this, where the only text remaining
is that which is "foreign" to the general context. Book or document titles
are a common case of this (_Das Kapital_, _Tao Te Ching_, _Humanae Vitae_)
and we can expect it to be a reasonably common case as regards metadata.

There are two issues here. One is linguistic and almost philosophical, viz.:
is "Tao Te Ching", when used in English, English or Chinese?

The other regards the lack of a mechanism to indicate a base xml:lang
attribute for XMLLiterals in RDF/XML. When blocking the inheritance of
xml:lang in the case of typed literals was first being debated, it was
suggested that this would make sense from an XML perspective if one
considered the element containing the literal value to have an implied
xml:lang attribute with a null value.

By this logic it would seem reasonable that one could do:

<dc:title rdf:parseType="Literal" xml:lang="de"><html:span xml:lang="la">Carpe diem</html:span></dc:title>

This would override that implied xml:lang="" and indicate a Latin title used
in an otherwise German context.
I think this would require an amendment to both RDF and RDF/XML (and hence
will probably never happen, unless I find that the above is already allowed
in RDF at least), but I think it would be justified. Well-formed XML
fragments, such as are allowed as values for XMLLiterals in RDF, can have a
natural language, especially if they are extracted from documents which
applied one by having an xml:lang attribute at a higher level (this is not
a million miles away from certain issues with XML C14N). In this way they
are not comparable with other typed literals such as xsd:integer.
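
To make the proposed reading concrete, here is a small Python sketch that
treats the Carpe diem fragment as ordinary XML and pulls out a "base"
language from the property element alongside the language of each direct
child of the literal. This is not how any RDF parser behaves today; the
function base_and_content_langs is hypothetical, the namespace declarations
are added only so the fragment parses on its own, and only direct children
are inspected to keep the example short.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The fragment from above, with namespace declarations added so that it
# is well-formed when parsed in isolation.
fragment = (
    '<dc:title xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:html="http://www.w3.org/1999/xhtml"'
    ' rdf:parseType="Literal" xml:lang="de">'
    '<html:span xml:lang="la">Carpe diem</html:span></dc:title>'
)

def base_and_content_langs(prop):
    """Return (base language, [(language, text), ...]) for a property element.

    Hypothetical reading: xml:lang on the property element is taken as the
    context language instead of the implied xml:lang="".
    """
    base = prop.get(XML_LANG, "")
    content = [
        # A child without its own xml:lang falls back to the base language.
        (child.get(XML_LANG, base), "".join(child.itertext()))
        for child in prop
    ]
    return base, content

base, content = base_and_content_langs(ET.fromstring(fragment))
print("context language:", base)
print("literal content:", content)
```

Under this reading the title is reported as Latin text in a German context,
which is the distinction the current rules cannot express.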

Regards,
Jon Hanna
Work: <http://www.selkieweb.com/>
Play: <http://www.hackcraft.net/>
Chat: <irc://irc.freenode.net/selkie> 
Received on Wednesday, 19 January 2005 18:41:47 GMT
