- From: Sergey Melnik <melnik@db.stanford.edu>
- Date: Mon, 01 Oct 2001 15:48:39 -0700
- To: Graham Klyne <Graham.Klyne@MIMEsweeper.com>
- CC: RDFCore WG <w3c-rdfcore-wg@w3.org>
Graham,

thanks a lot for the pointers! There may be a possibility that making
literals composite might impact the MT in some non-trivial ways.
Compared to an approach like the one in 3) below, datatypes expressed
using composite literals simplify a developer's life a lot. So maybe
some of the problems that you anticipate aren't too crucial given the
benefit? (I'd need to consult Pat on this, I guess).

To reiterate, in [1] I did not mean to say anything about the
relationship between IR and LV, but just about the pool of constants.

Sergey

Graham Klyne wrote:
>
> With reference to...
>
> [1] Sergey's message:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0444.html
>
> [2] Some concerns expressed about DLs and literals-as-resources:
> http://lists.w3.org/Archives/Public/www-rdf-logic/2001Sep/0077.html
> Specifically:
> [[[
> Peter F. Patel-Schneider:
> > >DAML+OIL depends somewhat on the separation between resources and
> > >literals. Some Description Logics may break severely if their separation
> > >between abstract (resources) and concrete (literals) domains is breached.
> >
> > Right, that is what worries me. I recall this being a sticking point
> > in the DAML discussions for some people, so I presume it is fairly
> > critical there also, no?
>
> Right now, it is probably the case that the theory of XML Schema datatypes
> is weak enough and the constructs that use them in DAML+OIL are also weak
> enough that no undecidabilities would arise if literals were also
> resources. (Implementation headaches do arise, however!) If you want to
> have a stronger theory for datatypes or more DAML+OIL constructs that use
> them, you can easily introduce undecidabilities. Combining two formalisms
> requires great care!
> ]]]
>
> [3] DanC's thoughts on literal values:
> http://www.w3.org/2001/01/ct24
>
> [4] A comment by Peter Patel-Schneider about literals:
> http://lists.w3.org/Archives/Public/www-rdf-interest/2001Sep/0135.html
>
> [5] My exchange with Brian about literals and strings:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0445.html
> and
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0001.html
>
> [6] The currently-published model theory:
> http://www.w3.org/TR/2001/WD-rdf-mt-20010925/
>
> I am concerned that Sergey's approach may be introducing more problems than
> it solves... I'm having a hard time getting my head around the
> implications, so, instead, I'm going to stand back and try another tack,
> taking a somewhat different view than Sergey.
>
> 1. Inspired by [5], distinguish between "strings" and "literals":
>
> - a string is a sequence of UCS/Unicode codepoints.
>
> - a literal is, informally, that kind of RDF object value whose value is
> specified by a string and possibly some additional information (such as a
> language tag).
>
> I think that a "literal" in this sense exists only in the context of some
> concrete syntax, and its nature is somewhat dependent on that syntax.
>
> 2. The model theory [6] presumes:
>     XL : literals -> LV
>       -- (fixed mapping for literals to literal values in the domain of
>          interpretation)
>     IS : V -> IR
>       -- (mapping for vocabulary of URIs used to resources in the
>          domain of interpretation)
> but does not make presumptions about the nature of LV, or whether there is
> any overlap between LV and IR. Exhibit [2] suggests that there might be
> problems if LV and IR are not disjoint, but that such problems don't arise
> if the data structuring primitives are weak enough and/or constructs that
> use them are weak enough.
>
> I'm not enough of a logician to know what might constitute "weak enough",
> but I have an intuition that one source of problems might be if the same
> structure that is expressed within the data type of a literal can also be
> expressed using "simpler" literal values related by RDF properties. That
> would, I think, require the subsumption computation to examine the internal
> structure of literals.
>
> It seems to me, then, that the structure of literals should be, in some
> sense, atomic or opaque, and composite structures should be expressed using
> RDF relations. Any value (in the domain of interpretation) that can be
> expressed in terms of relationships between other values should not be
> admissible as a literal value.
>
> This rules out having an LV which is a composition of a string and a
> language tag.
>
> [[[Trouble is, it also seems to rule out anything but individual
> characters, as a string of length >1 can be expressed as a concatenation of
> other strings. I think this is a purely lexical/syntactic issue, but I'm
> on shaky ground here.]]]
>
> 3. Inspired partly by [3], I suggest that literal attributes (xml:lang,
> maybe others in future) are handled by some kind of syntactic
> transformation when constructing the RDF graph, rather than being
> represented somehow within graph literal nodes. Thus, within the RDF graph
> syntax, "literals" are simply "strings".
>
> Example:
>
>     <Subject>
>       <property xml:lang="us-EN">Property string</property>
>     </Subject>
>
> might yield a graph like this:
>
>     [Subject] --property--> [ ] --xml:lang--> "us-EN"
>                             [ ] --property--> "Property string"
>
> or, following DanC's lead [3], figure 1:
>
>     [Subject] --???--> [ ] --xml:lang---> "us-EN"
>     [ ]                [ ] --rdf:value--> "Property string"
>     [ ]
>     [ ] --property--> "Property string"
>
> The details of the transformation aren't fixed; the key idea is the
> transformation to graph form reduces all literals to "string" form.
>
> 4. Wrapping up
>
> The upshot of this is that a literal value (in LV) is always a string
> without additional adornment. For RDF graph syntax, the XL mapping can be
> a unity mapping. Any deeper interpretation of a literal (a string in a
> given language, a number, etc) is in the interpretation of some resource
> for which that literal is an rdf:value.
>
> Then:
>
> - Do LV and IR overlap? It seems to me unclear how one would exclude a
> mapping in IS from some URI to a Unicode string in LV; e.g.
> <data:,text/plain;charset=utf-8,Property string>. I think this could be
> resolved either way. If disjointness of IR and LV is required, then the
> above example might map to something like:
>
>     [ ] --rdf:value----------> "Property string"
>     [ ] --meta:content-type--> [Content-type:text/plain]
>
> - Does overlapping resources with the very simple domain of Unicode strings
> for literals cause problems for description logics? I don't know.
>
> - Does it make sense for literals to have properties; e.g.
>     "Property string" --length--> "15"
> I think any such properties would be trivial, in the sense that they always
> can be determined by examination of the literal itself. So, if prohibited,
> no expressive power is lost.
>
> #g
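
The syntactic transformation proposed in point 3 of the quoted message can be
sketched in Python. This is only an illustration under stated assumptions: the
names Triple and expand_literal and the "_:bN" blank-node labels are
hypothetical, and hanging the string off a blank node via rdf:value is just
one of the shapes the message leaves open ("the details of the transformation
aren't fixed"). What the sketch tries to show is the key idea that every
literal left in the graph is a bare string, with xml:lang turned into an
ordinary property.

```python
from itertools import count
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One arc of the graph (hypothetical representation)."""
    subject: str
    predicate: str
    obj: str
    obj_is_string: bool  # True when obj is a plain string literal


# Assumption: blank nodes are labelled _:b1, _:b2, ... for illustration only.
_blank_ids = count(1)


def expand_literal(subject: str, prop: str, text: str,
                   lang: Optional[str] = None) -> List[Triple]:
    """Reduce a (text, xml:lang) literal to arcs whose literals are bare strings."""
    if lang is None:
        # Nothing to carry besides the string itself.
        return [Triple(subject, prop, text, True)]
    node = f"_:b{next(_blank_ids)}"
    return [
        Triple(subject, prop, node, False),     # subject points at a blank node
        Triple(node, "rdf:value", text, True),  # the string itself
        Triple(node, "xml:lang", lang, True),   # the language tag, also a string
    ]


for arc in expand_literal("Subject", "property", "Property string", "us-EN"):
    print(arc)
```

With this shape, no object in the output is a composite of string and
language tag. The "length" property from the last bullet stays derivable from
the string alone, e.g. len("Property string").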
Received on Monday, 1 October 2001 18:23:18 UTC