- From: <Patrick.Stickler@nokia.com>
- Date: Tue, 3 Sep 2002 09:02:53 +0300
- To: <jjc@hpl.hp.com>, <w3c-rdfcore-wg@w3.org>
> -----Original Message-----
> From: ext Jeremy Carroll [mailto:jjc@hpl.hp.com]
> Sent: 02 September, 2002 21:15
> To: w3c-rdfcore-wg@w3.org
> Subject: Re: Datatyping: moving away from "literal as 3-part thing" to
> "literal as dt+opaque bit"
>
>
> >[Patrick said, at the telecon, "xml:lang infects everything" as an
> >example of this view]
>
> >There should be no "infection"
> >of new types by stuff like language properties,
>
> The unicode string in an XML document which gives the lexical form of a
> datatype literal may well be in scope of an xml:lang declaration.
>
> But the current proposals expect the parser to know whether it is
> parsing an old-style literal (in which case xml:lang is significant) or
> a new style literal, in which case it is not.

Hmmmm, interesting. I've yet to see any proposal that spells this out,
that xml:lang information is discarded entirely.

Now, I agree that xml:lang does not affect the L2V mapping in any way,
but if specified, it still must be presumed to be information about
the literal that is relevant to applications and must not be discarded
(even though it is known that xml:lang can overgenerate such information).

> Thus
>
> <a:prop xml:lang="en" rdf:ltype="&xsd;string">banana</a:prop>
>
> would deliver the value <xsd:string>"banana" and the language
> declaration has no effect. (If you want an xsd:string, you don't
> get a langstring.
>
> Jeremy

Unfortunately, this precludes being able to use xml:lang with
explicitly typed xsd:string values, which I consider unacceptable.

Consider the following use case:

   <rdf:Description rdf:about="#TheEnglishLanguage">
      <rdfs:label xml:lang="en" rdfd:type="&xsd;string">English</rdfs:label>
      <rdfs:label xml:lang="fi" rdfd:type="&xsd;string">Englanti</rdfs:label>
      <rdfs:label xml:lang="sp" rdfd:type="&xsd;string">Ingles</rdfs:label>
   </rdf:Description>

which I would expect to produce

   <#TheEnglishLanguage> rdfs:label xsd:string"English"-en .
   <#TheEnglishLanguage> rdfs:label xsd:string"Englanti"-fi .
   <#TheEnglishLanguage> rdfs:label xsd:string"Ingles"-sp .

so that my RDF application can choose which label is most
appropriate, per the intentionally specified language.

If all I get is

   <#TheEnglishLanguage> rdfs:label xsd:string"English" .
   <#TheEnglishLanguage> rdfs:label xsd:string"Englanti" .
   <#TheEnglishLanguage> rdfs:label xsd:string"Ingles" .

then I have lost some crucial information needed for e.g.
autogeneration of GUIs, etc.

And if literals cannot be subjects, nor are tidy, then how can I
assert the language of the particular string values otherwise?

To be quite honest, I find hiding the xml:lang information in the
structure of the literal, rather than generating triples, to be
highly distasteful, but that's what we've got at the moment, so ...

Ideally, we'd just let literals be subjects and have a trivially
simple solution to all this mess, for both datatyping and xml:lang
attribution:

   <rdf:Description rdf:about="#TheEnglishLanguage">
      <rdfs:label xml:lang="en" rdf:type="&xsd;string">English</rdfs:label>
   </rdf:Description>

would give us

   <#TheEnglishLanguage> rdfs:label _:a"English" .
   _:a"English" rdf:type xsd:string .
   _:a"English" xml:lang _:b"en" .

etc... even if at this time we don't extend RDF/XML yet to make
explicit statements about literal subjects but just leave such
statements as output from the parser...

but I guess we shouldn't go there anymore... maybe in RDF 2.x ... maybe ...

--

At the very least, for now, xml:lang codes must be part of the typed
literal node structure just as they are for the untyped literal
node structure, and just as it is specified in the restructured
document.

Regards,

Patrick
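
A minimal sketch of the parser behaviour requested above, assuming a typed literal is modelled as a (datatype, lexical form, language) triple so that xml:lang survives alongside the datatype. The `TypedLiteral` class, the `label_triples` helper, and the `http://example.org/rdfd#` namespace bound to `rdfd:` are hypothetical names introduced for illustration only. The datatype URI is written out in full instead of using the `&xsd;` entity so that the fragment parses without a DTD, and the language codes are copied verbatim from the use case.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace for the rdfd:type attribute used in the use case.
RDFD_TYPE = "{http://example.org/rdfd#}type"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class TypedLiteral(NamedTuple):
    """Literal node that keeps the language tag next to the datatype."""
    datatype: Optional[str]
    lexical: str
    lang: Optional[str]


DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:rdfd="http://example.org/rdfd#">
  <rdf:Description rdf:about="#TheEnglishLanguage">
    <rdfs:label xml:lang="en" rdfd:type="{XSD_STRING}">English</rdfs:label>
    <rdfs:label xml:lang="fi" rdfd:type="{XSD_STRING}">Englanti</rdfs:label>
    <rdfs:label xml:lang="sp" rdfd:type="{XSD_STRING}">Ingles</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def label_triples(xml_text):
    """Return (subject, predicate, TypedLiteral) for each property element.

    Simplification: xml:lang is inherited only from the enclosing
    rdf:Description or the document root, not from arbitrary ancestors.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        inherited = desc.get(XML_LANG) or root.get(XML_LANG)
        for prop in desc:
            lang = prop.get(XML_LANG, inherited)
            literal = TypedLiteral(prop.get(RDFD_TYPE), prop.text or "", lang)
            triples.append((subject, prop.tag, literal))
    return triples


triples = label_triples(DOC)
for subject, predicate, literal in triples:
    print(subject, predicate, literal)
```

With the language kept on the node, an application can pick a label by filtering on `literal.lang`; dropping that field reproduces the lossy three-identical-type output described in the message.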
Received on Tuesday, 3 September 2002 02:02:56 UTC