- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Mon, 01 Oct 2001 15:48:39 -0700
- To: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- CC: RDFCore WG <w3c-rdfcore-wg@w3.org>
Graham,
thanks a lot for the pointers! Making literals composite may indeed
impact the MT in some non-trivial ways.
Compared to an approach like the one in 3) below, datatypes expressed
using composite literals simplify a developer's life a lot. So maybe
some of the problems that you anticipate aren't too crucial given the
benefit? (I'd need to consult Pat on this, I guess.)
To reiterate, in [1] I did not mean to say anything about the
relationship between IR and LV, but just about the pool of constants.
Sergey
Graham Klyne wrote:
>
> With reference to...
>
> [1] Sergey's message:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0444.html
>
> [2] Some concerns expressed about DLs and literals-as-resources:
> http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0077.html
> Specifically:
> [[[
> Peter F. Patel-Schneider:
> > >DAML+OIL depends somewhat on the separation between resources and
> > >literals. Some Description Logics may break severely if their separation
> > >between abstract (resources) and concrete (literals) domains is breached.
> >
> > Right, that is what worries me. I recall this being a sticking point
> > in the DAML discussions for some people, so I presume it is fairly
> > critical there also, no?
>
> Right now, it is probably the case that the theory of XML Schema datatypes
> is weak enough and the constructs that use them in DAML+OIL are also weak
> enough that no undecidabilities would arise if literals were also
> resources. (Implementation headaches do arise, however!) If you want to
> have a stronger theory for datatypes or more DAML+OIL constructs that use
> them, you can easily introduce undecidabilities. Combining two formalisms
> requires great care!
> ]]]
>
> [3] DanC's thoughts on literal values:
> http://www.w3.org/2001/01/ct24
>
> [4] A comment by Peter Patel-Schneider about literals:
> http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0135.html
>
> [5] My exchange with Brian about literals and strings:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0445.html
> and
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0001.html
>
> [6] The currently-published model theory:
> http://www.w3.org/TR/2001/WD-rdf-mt-20010925/
>
> I am concerned that Sergey's approach may be introducing more problems than
> it solves... I'm having a hard time getting my head around the
> implications, so, instead, I'm going to stand back and try another tack,
> taking a somewhat different view than Sergey.
>
> 1. Inspired by [5], distinguish between "strings" and "literals":
>
> - a string is a sequence of UCS/Unicode codepoints.
>
> - a literal is, informally, that kind of RDF object value which is
> specified by a string and possibly some additional information (such as a
> language tag).
>
> I think that a "literal" in this sense exists only in the context of some
> concrete syntax, and its nature is somewhat dependent on that syntax.
>
> 2. The model theory [6] presumes:
> XL : literals -> LV
> -- (fixed mapping for literals to literal values in the domain of
> interpretation)
> IS : V -> IR
> -- (mapping from the vocabulary of URIs used to resources in the
> domain of interpretation)
> but does not make presumptions about the nature of LV, or whether there is
> any overlap between LV and IR. Exhibit [2] suggests that there might be
> problems if LV and IR are not disjoint, but that such problems don't arise
> if the data structuring primitives are weak enough and/or constructs that
> use them are weak enough.
>
> I'm not enough of a logician to know what might constitute "weak enough",
> but I have an intuition that one source of problems might be if the same
> structure that is expressed within the data type of a literal can also be
> expressed using "simpler" literal values related by RDF properties. That
> would, I think, require the subsumption computation to examine the internal
> structure of literals.
>
> It seems to me, then, that the structure of literals should be, in some
> sense, atomic or opaque, and composite structures should be expressed using
> RDF relations. Any value (in the domain of interpretation) that can be
> expressed in terms of relationships between other values should not be
> admissible as a literal value.
>
> This rules out having an LV which is a composition of a string and a
> language tag.
>
> [[[Trouble is, it also seems to rule out anything but individual
> characters, as a string of length >1 can be expressed as a concatenation of
> other strings. I think this is a purely lexical/syntactic issue, but I'm
> on shaky ground here.]]]
>
> 3. Inspired partly by [3], I suggest that literal attributes (xml:lang,
> maybe others in future) are handled by some kind of syntactic
> transformation when constructing the RDF graph, rather than being
> represented somehow within graph literal nodes. Thus, within the RDF graph
> syntax, "literals" are simply "strings".
>
> Example:
>
> <Subject>
> <property xml:lang="us-EN">Property string</property>
> </Subject>
>
> might yield a graph like this:
>
> [Subject] --property--> [ ] --xml:lang--> "us-EN"
>                         [ ] --property--> "Property string"
>
> or, following DanC's lead [3], figure 1:
>
> [Subject] --???--> [ ] --xml:lang---> "us-EN"
>    [ ]             [ ] --rdf:value--> "Property string"
>    [ ]
>    [ ] --property--> "Property string"
>
> The details of the transformation aren't fixed; the key idea is the
> transformation to graph form reduces all literals to "string" form.
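A minimal sketch of the kind of syntactic transformation point 3 describes, assuming a graph held as a list of (subject, property, object) tuples. The function names, the `_:bN` blank-node labels, and the choice of rdf:value for the string arc are illustrative assumptions; the message itself leaves the details of the transformation open.

```python
from itertools import count

_blank_ids = count(1)


def new_blank():
    """Return a fresh blank-node label (hypothetical naming scheme)."""
    return f"_:b{next(_blank_ids)}"


def literal_to_triples(subject, prop, text, lang=None):
    """Reduce a possibly language-tagged literal to plain-string triples.

    Without a language tag the string is attached directly. With one, an
    intermediate blank node carries the string (rdf:value) and the tag
    (xml:lang), so every literal in the graph is a bare string.
    """
    if lang is None:
        return [(subject, prop, text)]
    node = new_blank()
    return [
        (subject, prop, node),
        (node, "rdf:value", text),
        (node, "xml:lang", lang),
    ]


# The example element from point 3, with the tag copied as written there.
for triple in literal_to_triples("Subject", "property", "Property string", lang="us-EN"):
    print(triple)
```

The language tag ends up as an ordinary string-valued arc, so nothing in the graph needs a composite literal node.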
>
> 4. Wrapping up
>
> The upshot of this is that a literal value (in LV) is always a string
> without additional adornment. For RDF graph syntax, the XL mapping can be
> an identity mapping. Any deeper interpretation of a literal (a string in a
> given language, a number, etc) is in the interpretation of some resource
> for which that literal is an rdf:value.
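One way to picture this, as a sketch only: the literal mapping is the identity on strings, and any richer reading is computed from the node that holds the string as its rdf:value. The tuple-based graph, the `_:b1` label, and the `interpret_node` helper are assumptions made for illustration, not part of the model theory draft.

```python
def XL(literal: str) -> str:
    """Literal mapping, taken here to be the identity on strings."""
    return literal


def interpret_node(node, triples):
    """Derive a node's richer meaning from its rdf:value and sibling arcs.

    Only xml:lang is handled; other adornments (numbers, dates, ...) would
    need their own hypothetical properties.
    """
    props = {p: o for s, p, o in triples if s == node}
    value = XL(props["rdf:value"])
    if "xml:lang" in props:
        return (value, props["xml:lang"])
    return value


triples = [
    ("Subject", "property", "_:b1"),
    ("_:b1", "rdf:value", "Property string"),
    ("_:b1", "xml:lang", "us-EN"),
]
print(interpret_node("_:b1", triples))
```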
>
> Then:
>
> - Do LV and IR overlap? It seems to me unclear how one would exclude a
> mapping in IS from some URI to a Unicode string in LV; e.g.
> <data:,text/plain;charset=utf-8,Property string>. I think this could be
> resolved either way. If disjointness of IR and LV is required, then the
> above example might map to something like:
>
> [ ] --rdf:value----------> "Property string"
> [ ] --meta:content-type--> [Content-type:text/plain]
>
> - Does overlapping resources with the very simple domain of Unicode strings
> for literals cause problems for description logics? I don't know.
>
> - Does it make sense for literals to have properties; e.g.
> "Property string" --length--> "15"
> I think any such properties would be trivial, in the sense that they always
> can be determined by examination of the literal itself. So, if prohibited,
> no expressive power is lost.
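The first and last questions above can be made concrete with a small sketch. It assumes LV is a plain set of strings and IS is a dictionary from URIs to denotations; both helper functions are hypothetical, and the URI is copied as written in the message.

```python
def lv_ir_overlap(IS: dict, LV: set) -> set:
    """Return the values that are both URI denotations (in IR) and literal values (in LV)."""
    return set(IS.values()) & LV


def length_property(literal: str) -> str:
    """A 'trivial' property: fully determined by inspecting the literal itself."""
    return str(len(literal))


LV = {"Property string", "us-EN"}

# An interpretation in which a URI denotes a Unicode string, so IR and LV overlap.
IS = {
    "data:,text/plain;charset=utf-8,Property string": "Property string",
    "http://example.org/Subject": "some-resource",  # placeholder denotation
}

print(lv_ir_overlap(IS, LV))
print(length_property("Property string"))
```

Requiring disjointness would amount to insisting that `lv_ir_overlap` is empty for every admissible interpretation.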
>
> #g
Received on Monday, 1 October 2001 18:23:18 UTC