- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Wed, 11 Sep 2002 10:04:16 +0300
- To: "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "Graham Klyne" <GK@NineByNine.org>
- Cc: "RDF core WG" <w3c-rdfcore-wg@w3.org>
[Patrick Stickler, Nokia/Finland, (+358 50) 483 9453, patrick.stickler@nokia.com]

----- Original Message -----
From: "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>
To: "Graham Klyne" <GK@NineByNine.org>; "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: "RDF core WG" <w3c-rdfcore-wg@w3.org>
Sent: 10 September, 2002 16:35
Subject: RE: Literals: language and xml (was: Comments on new datatyping document, part 1)

> (agreeing with Patrick I think)

Mostly, I think, but a few questions/comments below to test that ;-)

> My view is that the abstract syntax will say something like:
>
> A Literal Node is labelled with one of:
> (a) - A datatype value

It cannot be labeled by a datatype value. It can only be labeled with a URIref denoting the datatype and a lexical form -- which together denote a datatype value.

URIref nodes are not labeled with the resources they denote, and neither are typed literal nodes. There are no native datatype values in the RDF graph, only labeled nodes which denote datatype values.

Perhaps we are in agreement on this, and it's just a matter of getting the wording right (though I think you are suggesting something different).

> (b) - An rdf string literal

It may be useful to say "a non-explicitly typed string literal".

> (c) - An rdf xml literal

I would rephrase the above list as

  (a) an explicitly typed string literal      (<xsd:string>, "xyz")
  (b) a non-explicitly typed string literal   (_:a, "xyz")
  (c) an XML literal                          (xml"xyz")

and if XML literals can be typed (and I don't see why they couldn't):

  (a) an explicitly typed string literal      (<xsd:string>, "xyz")
  (b) a non-explicitly typed string literal   (_:a, "xyz")
  (c) an explicitly typed XML literal         (<xhtml:h1>, xml"<h1>Foo</h1>")
  (d) a non-explicitly typed XML literal      (_:b, xml"<blarg>belch</blarg>")

> Typical RDF/XML giving rise to (a) is:
>
> <rdf:Description>
>   <eg:prop rdf:datatype="&xsd;string">val</eg:prop>
> </rdf:Description>
>
> (Label is <xsd:string>"val")

OK.

> (b)
>
> <rdf:Description>
>   <eg:prop>val</eg:prop>
> </rdf:Description>
>
> (Label is "val")

I thought it should be

  _:x"val"

Isn't that what you meant by syntactically untidy?

> (c)
> <rdf:Description>
>   <eg:prop rdf:parseType="Literal">val</eg:prop>
> </rdf:Description>
>
> (Label is xml"val")

Well, perhaps it should be

  _:yxml"val"

or such. Of course, we have the problem with maintaining an explicit partition between _:y and xml, as I've pointed out before.

And then also (d)

  <rdf:Description>
    <eg:prop rdf:datatype="&ex;someComplexType" rdf:parseType="Literal">val</eg:prop>
  </rdf:Description>

  (Label is <ex:someComplexType>xml"val")

> Adding an xml:lang we get:
> (a)
> <rdf:Description xml:lang="en">
>   <eg:prop rdf:datatype="&xsd;string">val</eg:prop>
> </rdf:Description>
>
> (Label is "val"
> It has to be an xsd:string, and so the language tag must be lost)

No. If the primary mechanism for specifying language for literal content is xml:lang, then that information must not be lost from the literal node. The label here should be

  <xsd:string>"val"-en

We *have* to have a mechanism for attributing language qualification to literals. Since literals can't be subjects, I see no other mechanism than to attach it to the literal node label itself, as was decided at the Bristol f2f.

Here, the fact that a datatype is specified does not mean the language is not considered valid. I may wish to say *both* that the property value is a string, *and* that the string contains e.g. Finnish content.

Granted, the semantics of xsd:string do not care about the language qualification, and the xml:lang value does not affect the L2V (lexical-to-value) mapping, but applications will likely want to have that information.

> (b)
>
> <rdf:Description xml:lang="en">
>   <eg:prop>val</eg:prop>
> </rdf:Description>
>
> Label is "val"-en

Or rather

  _:x"val"-en

> (c)
> <rdf:Description xml:lang="en">
>   <eg:prop rdf:parseType="Literal">val</eg:prop>
> </rdf:Description>
>
> Label is xml"val"-en

OK.

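The labels discussed above can be sketched as a small record, purely as an illustration. The class name `LiteralLabel`, its field names, and the helper `datatype_key` are hypothetical and come from neither the thread nor any RDF Core draft; the sketch also assumes the prefixed name `xsd:string` stands in for the full datatype URIref, and it omits the `_:x` placeholder used above for non-explicitly typed literals.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LiteralLabel:
    """Hypothetical model of a literal node label (not an RDF Core structure)."""
    lexical_form: str
    datatype: Optional[str] = None  # datatype URIref, None if not explicitly typed
    lang: Optional[str] = None      # xml:lang value in scope, if any
    xml: bool = False               # True when rdf:parseType="Literal" was used

    def notation(self) -> str:
        """Render the label in the ad hoc notation used in this thread."""
        prefix = f"<{self.datatype}>" if self.datatype else ""
        flag = "xml" if self.xml else ""
        suffix = f"-{self.lang}" if self.lang else ""
        return f'{prefix}{flag}"{self.lexical_form}"{suffix}'


def datatype_key(label: LiteralLabel) -> Tuple[Optional[str], str]:
    """The only parts of the label a lexical-to-value mapping would consult."""
    return (label.datatype, label.lexical_form)


typed_en = LiteralLabel("val", datatype="xsd:string", lang="en")
typed_fi = LiteralLabel("val", datatype="xsd:string", lang="fi")

print(typed_en.notation())                                # <xsd:string>"val"-en
print(typed_en == typed_fi)                               # False: the labels differ
print(datatype_key(typed_en) == datatype_key(typed_fi))   # True: same datatype input
```

Under these assumptions the language tag stays on the label, so an application can still select or display by language, while the datatype interpretation sees only the datatype and the lexical form.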
> The only choice is whether we allow:
>
> <rdf:Description xml:lang="en">
>   <eg:prop rdf:parseType="Literal" rdf:datatype="&xsd;string">val</eg:prop>
> </rdf:Description>

In which case, we'd have

  <xsd:string>xml"val"-en

Fine.

> If we did then the following would be problematic
>
> <rdf:Description xml:lang="en">
>   <eg:prop rdf:parseType="Literal"
>            rdf:datatype="&xsd;string"><b>val</b></eg:prop>
> </rdf:Description>
>
> My take is that it is a syntax error.

I would say it correlates to

  <xsd:string>xml"<b>val</b>"-en

Where's the problem?

At the RDF level, there is no distinction between simple and complex datatypes, as is made by XML Schema. For RDF, a datatype simply has a lexical space, a value space, and a mapping from the former to the latter; RDF neither cares nor can say what resides in the value space.

If the datatype in question is a complex datatype, and the lexical space contains XML serializations (fragments), there is no problem with applying the RDF datatyping mechanisms to associate any datatype (complex or otherwise) with an XML literal. It is up to the definition of the datatype itself whether or not the XML literal is an acceptable representation of some member of its value space (whatever that is -- and RDF doesn't have to say).

So whether or not either

  <xsd:string>xml"<b>val</b>"-en

or

  <xsd:string>xml"val"-en

validly denotes a datatype value of xsd:string is not RDF's concern. That's up to the definition of xsd:string.

Taking a similar example to the above, but with a known complex type, again, it is not RDF's concern whether either of

  <xhtml:h1>xml"<h1>val</h1>"-en

or

  <xhtml:h1>xml"val"-en

validly denotes a member of the value space of xhtml:h1 (and we can presume that the latter example does not).

What matters here is what is being *asserted*. And the mechanisms for making assertions about the datatype by which the above literals are to be interpreted are all being used correctly, even if some of the assertions turn out to be false.

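To make the division of labour concrete, here is a minimal sketch of a datatype as a lexical space plus a lexical-to-value mapping. Everything in it is an assumption made for illustration: the names `Datatype`, `denoted_value`, and `toy_h1` are hypothetical, and `toy_h1` is a toy stand-in, not the real definition of xhtml:h1.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Datatype:
    """Hypothetical datatype: a lexical space plus a lexical-to-value mapping."""
    uri: str
    in_lexical_space: Callable[[str], bool]
    l2v: Callable[[str], Any]


def denoted_value(datatype: Datatype, lexical_form: str) -> Optional[Any]:
    """Return the value the pairing denotes, or None if it denotes nothing.

    Either way the pairing is a well-formed assertion; whether it is true
    is decided by the datatype's own definition, not by the caller.
    """
    if datatype.in_lexical_space(lexical_form):
        return datatype.l2v(lexical_form)
    return None


# Toy stand-in for a complex type whose lexical forms are <h1> fragments.
toy_h1 = Datatype(
    uri="xhtml:h1",
    in_lexical_space=lambda s: s.startswith("<h1>") and s.endswith("</h1>"),
    l2v=lambda s: s[len("<h1>"):-len("</h1>")],
)

print(denoted_value(toy_h1, "<h1>val</h1>"))  # val
print(denoted_value(toy_h1, "val"))           # None
```

In this sketch the generic code never inspects the value space; it only asks the datatype whether the lexical form is acceptable, which mirrors the point that a typed literal can be asserted correctly and still turn out to denote nothing.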
And most importantly, the XML bit/flag and the xml:lang value are totally *irrelevant* to the datatyping semantics, but are nevertheless necessary for RDF applications in many contexts where those typed literals are to be used.

In the case of the xml:lang value, that is not relevant to the L2V mapping, but is relevant to applications in how the value might be selected or displayed -- i.e. the xml:lang value is a "hidden statement" about the value denoted *by* the literal, not about the literal itself, so it's no surprise that it does not affect the interpretation of the literal.

In the case of the XML bit/flag, it is saying something about the legacy RDF/XML representation of the lexical form -- and one can very well express complex typed values without it:

  <rdf:Description>
    <ex:prop rdf:datatype="&xhtml;h1"><h1>val</h1></ex:prop>
  </rdf:Description>

providing

  <xhtml:h1>"<h1>val</h1>"

which is every bit as valid a typed literal representation as

  <xhtml:h1>xml"<h1>val</h1>"

the only difference being that in the latter case one instead specified parseType="Literal" for convenience's sake. But that's all just syntactic sugar, no?

And it's no surprise that variation in the RDF/XML representation of the literal does not affect the interpretation of that literal.

Patrick

Received on Wednesday, 11 September 2002 03:04:43 UTC