- From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- Date: Mon, 01 Oct 2001 11:18:51 +0100
- To: RDFCore WG <w3c-rdfcore-wg@w3.org>
With reference to...

[1] Sergey's message:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0444.html

[2] Some concerns expressed about DLs and literals-as-resources:
http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0077.html
Specifically:
[[[
Peter F. Patel-Schneider:
> >DAML+OIL depends somewhat on the separation between resources and
> >literals. Some Description Logics may break severely if their separation
> >between abstract (resources) and concrete (literals) domains is breached.
>
> Right, that is what worries me. I recall this being a sticking point
> in the DAML discussions for some people, so I presume it is fairly
> critical there also, no?

Right now, it is probably the case that the theory of XML Schema datatypes is weak enough and the constructs that use them in DAML+OIL are also weak enough that no undecidabilities would arise if literals were also resources. (Implementation headaches do arise, however!) If you want to have a stronger theory for datatypes or more DAML+OIL constructs that use them, you can easily introduce undecidabilities. Combining two formalisms requires great care!
]]]

[3] DanC's thoughts on literal values:
http://www.w3.org/2001/01/ct24

[4] A comment by Peter Patel-Schneider about literals:
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0135.html

[5] My exchange with Brian about literals and strings:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0445.html
and
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0001.html

[6] The currently-published model theory:
http://www.w3.org/TR/2001/WD-rdf-mt-20010925/

I am concerned that Sergey's approach may be introducing more problems than it solves... I'm having a hard time getting my head around the implications, so, instead, I'm going to stand back and try another tack, taking a somewhat different view from Sergey's.

1. Inspired by [5], distinguish between "strings" and "literals":

- a string is a sequence of UCS/Unicode codepoints.
- a literal is, informally, that kind of RDF object value which is specified by a string and possibly some additional information (such as a language tag).

I think that a "literal" in this sense exists only in the context of some concrete syntax, and its nature is somewhat dependent on that syntax.

2. The model theory [6] presumes:

   XL : literals -> LV
   -- (fixed mapping from literals to literal values in the domain of interpretation)

   IS : V -> IR
   -- (mapping from the vocabulary of URIs used to resources in the domain of interpretation)

but does not make presumptions about the nature of LV, or whether there is any overlap between LV and IR.

Exhibit [2] suggests that there might be problems if LV and IR are not disjoint, but that such problems don't arise if the data structuring primitives are weak enough and/or the constructs that use them are weak enough.

I'm not enough of a logician to know what might constitute "weak enough", but I have an intuition that one source of problems might be if the same structure that is expressed within the data type of a literal can also be expressed using "simpler" literal values related by RDF properties. That would, I think, require the subsumption computation to examine the internal structure of literals.

It seems to me, then, that the structure of literals should be, in some sense, atomic or opaque, and composite structures should be expressed using RDF relations. Any value (in the domain of interpretation) that can be expressed in terms of relationships between other values should not be admissible as a literal value. This rules out having an LV which is a composition of a string and a language tag.

[[[Trouble is, it also seems to rule out anything but individual characters, as a string of length >1 can be expressed as a concatenation of other strings. I think this is a purely lexical/syntactic issue, but I'm on shaky ground here.]]]

3. Inspired partly by [3], I suggest that literal attributes (xml:lang, maybe others in future) are handled by some kind of syntactic transformation when constructing the RDF graph, rather than being represented somehow within graph literal nodes. Thus, within the RDF graph syntax, "literals" are simply "strings".

Example:

   <Subject>
     <property xml:lang="en-US">Property string</property>
   </Subject>

might yield a graph like this:

   [Subject] --property--> [ ] --xml:lang--> "en-US"
                           [ ] --property--> "Property string"

or, following DanC's lead [3], figure 1:

   [Subject] --???--> [ ] --xml:lang---> "en-US"
                      [ ]
                      [ ] --rdf:value--> "Property string"
                      [ ]
                      [ ] --property--> "Property string"

The details of the transformation aren't fixed; the key idea is that the transformation to graph form reduces all literals to "string" form.

4. Wrapping up

The upshot of this is that a literal value (in LV) is always a string without additional adornment. For RDF graph syntax, the XL mapping can be an identity mapping. Any deeper interpretation of a literal (a string in a given language, a number, etc.) lies in the interpretation of some resource for which that literal is an rdf:value.

Then:

- Do LV and IR overlap? It seems to me unclear how one would exclude a mapping in IS from some URI to a Unicode string in LV; e.g. <data:,text/plain;charset=utf-8,Property string>. I think this could be resolved either way. If disjointness of IR and LV is required, then the above example might map to something like:

   [ ] --rdf:value----------> "Property string"
   [ ] --meta:content-type--> [Content-type:text/plain]

- Does overlapping resources with the very simple domain of Unicode strings for literals cause problems for description logics? I don't know.

- Does it make sense for literals to have properties; e.g.

   "Property string" --length--> "15"

  I think any such properties would be trivial, in the sense that they can always be determined by examination of the literal itself. So, if they are prohibited, no expressive power is lost.
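To make the transformation suggested in section 3 a little more concrete, here is a rough sketch in Python. It is only one possible reading, not a proposal for the actual parser behaviour: the function name literal_triples, the BlankNode class and the choice of rdf:value for the string arc are all my own assumptions for illustration, and the input is a bare XML fragment rather than real RDF/XML. The point it tries to show is that after the transformation every literal in the graph is a plain string, with the language tag hanging off an intermediate node.

```python
import itertools
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# ElementTree exposes the predeclared xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


@dataclass(frozen=True)
class BlankNode:
    """Hypothetical stand-in for an anonymous graph node, distinct from str literals."""
    ident: int


def literal_triples(xml_text):
    """Return (subject, predicate, object) triples where every literal is a plain str.

    Assumption: a property element carrying xml:lang is rewritten to an
    intermediate blank node with xml:lang and rdf:value arcs; a property
    without xml:lang keeps its string as the direct object.
    """
    root = ET.fromstring(xml_text)
    blank_ids = itertools.count()
    triples = []
    for prop in root:
        text = prop.text or ""
        lang = prop.get(XML_LANG)
        if lang is None:
            triples.append((root.tag, prop.tag, text))
            continue
        node = BlankNode(next(blank_ids))
        triples.append((root.tag, prop.tag, node))
        triples.append((node, "xml:lang", lang))   # the tag is itself just a string
        triples.append((node, "rdf:value", text))  # the unadorned string value
    return triples


# Language tag written in language-REGION order.
EXAMPLE = '<Subject><property xml:lang="en-US">Property string</property></Subject>'

for triple in literal_triples(EXAMPLE):
    print(triple)
```

With this shape, anything like a "length" property never needs to be stored in the graph: it can be recomputed from the string itself, which is the sense in which such properties are trivial.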
#g
Received on Monday, 1 October 2001 06:40:46 UTC