- From: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- Date: Mon, 01 Oct 2001 11:18:51 +0100
- To: RDFCore WG <w3c-rdfcore-wg@w3.org>
With reference to...
[1] Sergey's message:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0444.html
[2] Some concerns expressed about DLs and literals-as-resources:
http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0077.html
Specifically:
[[[
Peter F. Patel-Schneider:
> >DAML+OIL depends somewhat on the separation between resources and
> >literals. Some Description Logics may break severely if their separation
> >between abstract (resources) and concrete (literals) domains is breached.
>
> Right, that is what worries me. I recall this being a sticking point
> in the DAML discussions for some people, so I presume it is fairly
> critical there also, no?
Right now, it is probably the case that the theory of XML Schema datatypes
is weak enough and the constructs that use them in DAML+OIL are also weak
enough that no undecidabilities would arise if literals were also
resources. (Implementation headaches do arise, however!) If you want to
have a stronger theory for datatypes or more DAML+OIL constructs that use
them, you can easily introduce undecidabilities. Combining two formalisms
requires great care!
]]]
[3] DanC's thoughts on literal values:
http://www.w3.org/2001/01/ct24
[4] A comment by Peter Patel-Schneider about literals:
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0135.html
[5] My exchange with Brian about literals and strings:
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0445.html
and
http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0001.html
[6] The currently-published model theory:
http://www.w3.org/TR/2001/WD-rdf-mt-20010925/
I am concerned that Sergey's approach may be introducing more problems than
it solves... I'm having a hard time getting my head around the
implications, so, instead, I'm going to stand back and try another tack,
taking a somewhat different view than Sergey.
1. Inspired by [5], distinguish between "strings" and "literals":
- a string is a sequence of UCS/Unicode codepoints.
- a literal is, informally, that kind of RDF object value which is
specified by a string and possibly some additional information (such as a
language tag).
I think that a "literal" in this sense exists only in the context of some
concrete syntax, and its nature is somewhat dependent on that syntax.
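As a minimal sketch of this distinction (the class name, field names and the "en" tag are purely illustrative assumptions, not taken from any RDF specification), a literal can be modelled as a string plus optional syntax-level information:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """A literal as it appears in some concrete syntax (hypothetical model)."""
    string: str                 # a sequence of Unicode code points
    lang: Optional[str] = None  # additional information, e.g. a language tag


plain = Literal("Property string")
tagged = Literal("Property string", lang="en")

# The two literals carry the same string but are different literals.
same_string = plain.string == tagged.string
same_literal = plain == tagged
```

Here the string is the only part that does not depend on the syntax; the language tag is extra information attached to it.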
2. The model theory [6] presumes:
XL : literals -> LV
-- (fixed mapping from literals to literal values in the domain of
interpretation)
IS : V -> IR
-- (mapping from the vocabulary of URIs used to resources in the
domain of interpretation)
but does not make presumptions about the nature of LV, or whether there is
any overlap between LV and IR. Exhibit [2] suggests that there might be
problems if LV and IR are not disjoint, but that such problems don't arise
if the data structuring primitives are weak enough and/or constructs that
use them are weak enough.
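To make the question of overlap concrete, here is a toy sketch of the two mappings. Everything in it is an assumption made for illustration: the example.org URIs, the value "resource-a", the use of an identity function for XL, and the representation of resources as Python strings. The model theory [6] itself fixes none of these.

```python
# Toy interpretation; all names and values are illustrative assumptions.

def XL(literal: str) -> str:
    """Fixed mapping from literals to literal values (identity in this toy)."""
    return literal


# IS: maps URIs in the vocabulary V to resources in IR.
IS = {
    "http://example.org/a": "resource-a",
    "http://example.org/b": "Property string",  # a resource that is itself a string
}

literals = ["Property string", "15"]
LV = {XL(lit) for lit in literals}  # literal values reached through XL
IR = set(IS.values())               # resources reached through IS

overlap = LV & IR
disjoint = not overlap
```

In this toy, nothing in the definitions of XL or IS prevents IS from sending a URI to a value that is also in LV, so disjointness would have to be imposed as a separate condition.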
I'm not enough of a logician to know what might constitute "weak enough",
but I have an intuition that one source of problems might be if the same
structure that is expressed within the data type of a literal can also be
expressed using "simpler" literal values related by RDF properties. That
would, I think, require the subsumption computation to examine the internal
structure of literals.
It seems to me, then, that the structure of literals should be, in some
sense, atomic or opaque, and composite structures should be expressed using
RDF relations. Any value (in the domain of interpretation) that can be
expressed in terms of relationships between other values should not be
admissible as a literal value.
This rules out having an LV which is a composition of a string and a
language tag.
[[[Trouble is, it also seems to rule out anything but individual
characters, as a string of length >1 can be expressed as a concatenation of
other strings. I think this is a purely lexical/syntactic issue, but I'm
on shaky ground here.]]]
3. Inspired partly by [3], I suggest that literal attributes (xml:lang,
maybe others in future) are handled by some kind of syntactic
transformation when constructing the RDF graph, rather than being
represented somehow within graph literal nodes. Thus, within the RDF graph
syntax, "literals" are simply "strings".
Example:
<Subject>
<property xml:lang="us-EN">Property string</property>
</Subject>
might yield a graph like this:
[Subject] --property--> [ ] --xml:lang--> "us-EN"
                        [ ] --property--> "Property string"
or, following DanC's lead [3], figure 1:
[Subject] --???--> [ ] --xml:lang---> "us-EN"
[ ] [ ] --rdf:value--> "Property string"
[ ]
[ ] --property--> "Property string"
The details of the transformation aren't fixed; the key idea is that the
transformation to graph form reduces all literals to "string" form.
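The first graph shape above could be produced by something like the following sketch. The function name, the "_:bN" blank-node labels and the tuple representation of triples are assumptions made for illustration; the language tag is copied as written in the example, and no particular RDF parser behaves this way.

```python
from itertools import count
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]
_blank_ids: Iterator[int] = count(1)


def quote(s: str) -> str:
    """Render a plain string object in the notation used in the graphs above."""
    return f'"{s}"'


def literal_to_triples(subject: str, prop: str, text: str,
                       lang: Optional[str] = None) -> List[Triple]:
    """Reduce a (string, xml:lang) literal to triples whose objects are plain strings."""
    if lang is None:
        # No attributes: the literal is already just a string.
        return [(subject, prop, quote(text))]
    node = f"_:b{next(_blank_ids)}"  # fresh blank node standing for [ ]
    return [
        (subject, prop, node),
        (node, "xml:lang", quote(lang)),
        (node, prop, quote(text)),
    ]


triples = literal_to_triples("Subject", "property", "Property string", lang="us-EN")
for s, p, o in triples:
    print(f"{s} --{p}--> {o}")
```

After this step every object that is not a node is a bare string, which is the only property of the transformation that the argument relies on.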
4. Wrapping up
The upshot of this is that a literal value (in LV) is always a string
without additional adornment. For RDF graph syntax, the XL mapping can be
an identity mapping. Any deeper interpretation of a literal (a string in a
given language, a number, etc.) is in the interpretation of some resource
for which that literal is an rdf:value.
Then:
- Do LV and IR overlap? It seems to me unclear how one would exclude a
mapping in IS from some URI to a Unicode string in LV; e.g.
<data:text/plain;charset=utf-8,Property%20string>. I think this could be
resolved either way. If disjointness of IR and LV is required, then the
above example might map to something like:
[ ] --rdf:value----------> "Property string"
[ ] --meta:content-type--> [Content-type:text/plain]
- Does overlapping resources with the very simple domain of Unicode strings
for literals cause problems for description logics? I don't know.
- Does it make sense for literals to have properties? For example:
"Property string" --length--> "15"
I think any such properties would be trivial, in the sense that they can
always be determined by examining the literal itself. So, if they are
prohibited, no expressive power is lost.
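A small sketch of why such a property is trivial: the asserted value can be recomputed from the string alone. The function name and the triple representation are illustrative assumptions, and "length" is taken here to mean the number of Unicode code points.

```python
def derived_length(literal: str) -> str:
    """Length in code points, returned as a string like any other literal."""
    return str(len(literal))


asserted = ("Property string", "length", "15")
recomputed = derived_length(asserted[0])

# The asserted triple adds nothing that cannot be computed from its subject.
redundant = recomputed == asserted[2]
```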
#g
Received on Monday, 1 October 2001 06:40:46 UTC