Re: big issue (2001-09-28#13) from Sergey Melnik on 2001-10-01 (w3c-rdfcore-wg@w3.org from October 2001)

From: Sergey Melnik <melnik@db.stanford.edu>
Date: Mon, 01 Oct 2001 15:48:39 -0700
To: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
CC: RDFCore WG <w3c-rdfcore-wg@w3.org>
Message-ID: <3BB8F2C7.64D6822E@db.stanford.edu>

Graham,

thanks a lot for the pointers! Making literals composite might well
impact the MT in some non-trivial ways. However, compared to an approach
like the one in 3) below, datatypes expressed using composite literals
simplify a developer's life a lot. So maybe some of the problems that you
anticipate aren't too crucial given the benefit? (I'd need to consult Pat
on this, I guess.)

To reiterate, in [1] I did not mean to say anything about the
relationship between IR and LV, but just about the pool of constants.

Sergey


Graham Klyne wrote:
> 
> With reference to...
> 
> [1] Sergey's message:
>    http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0444.html
> 
> [2] Some concerns expressed about DLs and literals-as-resources:
>    http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0077.html
> Specifically:
> [[[
> Peter F. Patel-Schneider:
>  > >DAML+OIL depends somewhat on the separation between resources and
>  > >literals.  Some Description Logics may break severely if their separation
>  > >between abstract (resources) and concrete (literals) domains is breached.
>  >
>  > Right, that is what worries me. I recall this being a sticking point
>  > in the DAML discussions for some people, so I presume it is fairly
>  > critical there also, no?
> 
> Right now, it is probably the case that the theory of XML Schema datatypes
> is weak enough and the constructs that use them in DAML+OIL are also weak
> enough that no undecidabilities would arise if literals were also
> resources.  (Implementation headaches do arise, however!)  If you want to
> have a stronger theory for datatypes or more DAML+OIL constructs that use
> them, you can easily introduce undecidabilites.  Combining two formalisms
> requires great care!
> ]]]
> 
> [3] DanC's thoughts on literal values:
>    http://www.w3.org/2001/01/ct24
> 
> [4] A comment by Peter Patel-Schneider about literals:
>    http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0135.html
> 
> [5] My exchange with Brian about literals and strings:
>    http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0445.html
> and
>    http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0001.html
> 
> [6] The currently-published model theory:
>    http://www.w3.org/TR/2001/WD-rdf-mt-20010925/
> 
> I am concerned that Sergey's approach may be introducing more problems than
> it solves... I'm having a hard time getting my head around the
> implications, so, instead, I'm going to stand back and try another tack,
> taking a somewhat different view than Sergey.
> 
> 1.  Inspired by [5], distinguish between "strings" and "literals":
> 
> - a string is a sequence of UCS/Unicode codepoints.
> 
> - a literal is, informally, that kind of RDF object value that is
> specified by a string and possibly some additional information (such as a
> language tag).
> 
> I think that a "literal" in this sense exists only in the context of some
> concrete syntax, and its nature is somewhat dependent on that syntax.
> 
> 2.  The model theory [6] presumes:
>      XL : literals -> LV
>           -- (fixed mapping from literals to literal values in the
>           -- domain of interpretation)
>      IS : V -> IR
>           -- (mapping from the vocabulary of URIs used to resources in
>           -- the domain of interpretation)
> but does not make presumptions about the nature of LV, or whether there is
> any overlap between LV and IR.  Exhibit [2] suggests that there might be
> problems if LV and IR are not disjoint, but that such problems don't arise
> if the data structuring primitives are weak enough and/or constructs that
> use them are weak enough.
> 
> I'm not enough of a logician to know what might constitute "weak enough",
> but I have an intuition that one source of problems might be if the same
> structure that is expressed within the data type of a literal can also be
> expressed using "simpler" literal values related by RDF properties.  That
> would, I think, require the subsumption computation to examine the internal
> structure of literals.
> 
> It seems to me, then, that the structure of literals should be, in some
> sense, atomic or opaque, and composite structures should be expressed using
> RDF relations.  Any value (in the domain of interpretation) that can be
> expressed in terms of relationships between other values should not be
> admissible as a literal value.
> 
> This rules out having an LV which is a composition of a string and a
> language tag.
> 
> [[[Trouble is, it also seems to rule out anything but individual
> characters, as a string of length >1 can be expressed as a concatenation of
> other strings.  I think this is a purely lexical/syntactic issue, but I'm
> on shaky ground here.]]]
> 
> 3. Inspired partly by [3], I suggest that literal attributes (xml:lang,
> maybe others in future) are handled by some kind of syntactic
> transformation when constructing the RDF graph, rather than being
> represented somehow within graph literal nodes.  Thus, within the RDF graph
> syntax, "literals" are simply "strings".
> 
> Example:
> 
>      <Subject>
>         <property xml:lang="en-US">Property string</property>
>      </Subject>
> 
> might yield a graph like this:
> 
>      [Subject] --property--> [  ] --xml:lang--> "en-US"
>                              [  ] --property--> "Property string"
> 
> or, following DanC's lead [3], figure 1:
> 
>      [Subject] --???--> [  ] --xml:lang---> "en-US"
>      [       ]          [  ] --rdf:value--> "Property string"
>      [       ]
>      [       ] --property--> "Property string"
> 
> The details of the transformation aren't fixed;  the key idea is the
> transformation to graph form reduces all literals to "string" form.
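The transformation in point 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a proposal for the actual mapping: the function names, the `_:bN` blank-node labels, and the choice of `rdf:value` for the arc from the blank node to the string are all hypothetical, since the message explicitly leaves the details open. The only point it shows is that every object left in the graph is either a node or a plain string.

```python
from itertools import count

# Hypothetical blank-node labels; the message does not prescribe any scheme.
_blank_ids = count(1)


def new_blank_node():
    """Return a fresh blank-node label such as '_:b1'."""
    return f"_:b{next(_blank_ids)}"


def literal_to_triples(subject, prop, text, lang=None):
    """Expand one property element into (subject, predicate, object) triples.

    Without a language tag the literal is already a plain string.
    With one, the tag is moved onto an intermediate blank node, so that
    no literal in the resulting graph carries any adornment.
    """
    if lang is None:
        return [(subject, prop, text)]
    node = new_blank_node()
    return [
        (subject, prop, node),        # arc from the subject to the blank node
        (node, "xml:lang", lang),     # language tag as an ordinary string
        (node, "rdf:value", text),    # the string content itself
    ]


if __name__ == "__main__":
    for triple in literal_to_triples(
        "ex:Subject", "ex:property", "Property string", lang="en"
    ):
        print(triple)
```

A consumer that ignores language tags has to follow one extra arc to reach the string, which is the developer cost weighed against composite literals at the top of this message.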
> 
> 4. Wrapping up
> 
> The upshot of this is that a literal value (in LV) is always a string
> without additional adornment.  For RDF graph syntax, the XL mapping can be
> an identity mapping.  Any deeper interpretation of a literal (a string in a
> given language, a number, etc) is in the interpretation of some resource
> for which that literal is an rdf:value.
> 
> Then:
> 
> - Do LV and IR overlap?  It seems to me unclear how one would exclude a
> mapping in IS from some URI to a Unicode string in LV;  e.g.
> <data:text/plain;charset=utf-8,Property%20string>.  I think this could be
> resolved either way.  If disjointness of IR and LV is required, then the
> above example might map to something like:
> 
>     [ ] --rdf:value----------> "Property string"
>     [ ] --meta:content-type--> [Content-type:text/plain]
> 
> - Does overlapping resources with the very simple domain of Unicode strings
> for literals cause problems for description logics?  I don't know.
> 
> - Does it make sense for literals to have properties; e.g.
>    "Property string" --length--> "15"
> I think any such properties would be trivial, in the sense that they always
> can be determined by examination of the literal itself.  So, if prohibited,
> no expressive power is lost.
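The last point can be made concrete with a small Python sketch. It assumes, as the wrap-up suggests, that the fixed literal mapping (written XL in point 2) is the identity on strings; the function names `xl` and `derived_length` are hypothetical. Under that assumption a property such as length is a pure function of the literal, so asserting it as a triple adds nothing that could not be recomputed.

```python
def xl(literal: str) -> str:
    """Literal-to-value mapping, assumed here to be the identity on strings."""
    return literal


def derived_length(literal: str) -> str:
    """Compute the 'length' property from the literal value alone.

    Python's len() on a str counts Unicode code points, which matches the
    definition of a string given in point 1. The result is returned as a
    string because literal values are themselves plain strings here.
    """
    return str(len(xl(literal)))


if __name__ == "__main__":
    print(derived_length("Property string"))  # prints 15
```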
> 
> #g
Received on Monday, 1 October 2001 18:23:18 UTC