- From: Dan Connolly <connolly@w3.org>
- Date: 28 Jun 2002 01:53:12 -0500
- To: pat hayes <phayes@ai.uwf.edu>
- Cc: w3c-rdfcore-wg@w3.org
In sum: I don't see any new information that motivates
re-opening the issue, though I do agree with pretty much
all of this. I do have a suggestion for how to handle
some specifics of literals in the model theory.

With apologies for piping up in this thread but not in

  Bad job on literals?
  Sergey Melnik (Tue, May 21 2002)
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002May/thread.html#71

on to the details:

On Thu, 2002-06-27 at 13:23, pat hayes wrote:
> Apologies to all for reopening this business, but Ive been descending
> into a black hole on this issue since the F2F. I really, really do
> not like the situation we seem to be in here regarding literals, for
> a whole lot of reasons. This message tries to enumerate them and to
> suggest a possible solution. I know this has been done before and
> apologize in advance for missing the obvious objections that have
> probably already been raised.
>
> Summary: literals ought to be strings, not 3-tuples. To achieve this,
> literals should be allowed in subject position. If we do this, a
> number of issues are clarified and the overall design of the language
> is rationalized and simplified, at almost no cost. Also,
> datastructuring suddenly gets easier.

That's my preference as well... I elaborated in:

  parseType="Literal" as syntactic sugar for infoset description
  (#rdfms-literal-is-xml-structure)
  From: Dan Connolly (connolly@w3.org)
  Date: Wed, Oct 10 2001
  http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0153.html

But I have come to grudgingly accept the status quo;
let me explain how I look at it; maybe you'll find
it acceptable too, then...

> --------
>
> First let me summarize my understanding of the current state of play.
>
> 1. Literals are not strings.

You keep saying that, but what if you say:
Literals are, sometimes, strings. Sometimes they're not.

Think of Literal as a union of
String, String X Lang, XML-thing, XML-thing X Lang.
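
A minimal sketch of that union view, in Python. The class and function names are made up for illustration; nothing here comes from an RDF spec or library, and holding an XML-thing by its string serialization is an assumption of the sketch:

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class PlainLiteral:
    """A String, optionally paired with a language tag (String X Lang)."""
    text: str
    lang: Optional[str] = None  # just an optional label on the side


@dataclass(frozen=True)
class XMLLiteral:
    """An XML-thing, held here by its serialization (assumption), optionally tagged."""
    xml: str
    lang: Optional[str] = None  # same meaning as on PlainLiteral


# Literal is the union of the four cases named above.
Literal = Union[PlainLiteral, XMLLiteral]


def describe(lit: Literal) -> str:
    """Name which of the four cases a literal falls into."""
    kind = "XML-thing" if isinstance(lit, XMLLiteral) else "String"
    return kind if lit.lang is None else f"{kind} X Lang"
```

In this sketch a plain string that happens to look like XML stays distinct from an XML-thing with the same serialization, which is the job the "bit" does in the 3-tuple formulation.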
(XML-things happen to have string serializations)

Or think of String as a subsort of Literal, if sorted
logics is an appealing analogy...

> 2. Literals are 3-tuples, consisting of a bit and two strings, the
> second of which is an XML lang tag.
>
> There are some constraints on these three parts of a literal.
>
> 3. If the bit is set (Im not sure if that is a one or a zero) then
> the second string is required to be well-formed XML.
> 4. If the bit is set, then the lang tag is understood to be the lang
> tag of the second string.
> 5. If the bit is not set, then the lang tag is understood to indicate
> that the second string is an expression of that language.

The lang tag works the same way whether the bit is set
or not: it's just an optional language tag.

> This has some curious consequences.
>
> First, (3) has the consequence that any RDF engine must include an
> XML parser, even if it is not expected to parse RDF/XML.

Yes, this is very unaesthetic, to me.

> The graph
> syntax has all of XML syntax incorporated into it. This seems to me
> to be the most serious problem. Under these circumstances it hardly
> seems worth having the graph syntax: it would be better to just
> identify RDF with RDF/XML and stop pretending.

No, for the purposes that the graph serves, the fact
that literals can be XML-things is sorta like the
fact that strings are over the unicode alphabet; it's
pretty opaque; you don't need to consider the model-theoretic
impact of each part of the XML grammar any more than
you need to consider each unicode character separately.

> Second, (4) and (5) interact in odd ways. Notice that the meaning of
> the lang tag changes according to the bit setting. Suppose that the
> second string is, in fact, well-formed XML and the lang tag is "FR".
> If the bit is set, then tout est bien; but if the bit is not set,
> then this combination is a disaster, since that well-formed XML is a
> mere string, its resemblance to XML a mere accident, and then the
> lang tag is claiming that it is a French text, which of course (being
> well-formed XML) it is not, in fact. I am not sure what an RDF engine
> is supposed to do at this point: would it be expected to reject this
> as an incoherent literal?

No. Don't think of it as "french text"; think of it as
"unicode text with a french-language label hanging off the side".

> Third, since we have decided that literals denote themselves, this
> construction means that these ad-hoc 3-tuples must be in the semantic
> domain of any RDF interpretation, and hence of any interpretation of
> any language which extends RDF.

Ah; good point; I thought the difference between
3-tuples and the union notion above was stylistic,
so I didn't make much noise about it previously,
but here it has teeth.

I suggest we say that IR includes
String, String X Lang, XML-thing, and XML-thing X Lang.

(and yes, I find it barely acceptable to
include String X Lang and XML-thing X Lang. Ugh!
we can use RDF properties for all sorts of
things, except language tags. Go figure.)

> This is at best an unattractive
> consequence, and it might be disastrous if other languages expect to
> handle literals differently (as I am sure they will). It seems
> particularly weird to incorporate XML *syntax* into the semantic
> domain of all web ontology languages.

Hmm... no more than including lists in all KIF domains,
to me. (well... not *much* more; lists are so much
more obviously minimal...)

As I said in my Oct 10 message, I'd prefer to
build XML-things out of RDF properties, but
I can live with just assuming they're in the domain.

> Notice that we do not have the
> option of making these 3-tuples 'optional' once they are in the
> semantic domain.
>
> Further to the previous point, datatyping simply does not work.
> All the datatyping proposals so far considered - ALL of them - have been
> predicated on the assumption that literals are members of the lexical
> space of XML datatypes. But 3-tuples consisting of a bit and two
> strings are not in the lexical space of any XML datatype. So we
> simply cannot do datatyping in RDF at present, seems to me, with
> literals the way they are.

Seems easy enough to gloss over, by looking at the
Literal part of the domain as including Strings.

> Finally, there is a purely aesthetic reason which one might reject,
> but it is worth mentioning: there is an obvious analogy between the
> lang tag and a datatype, and it would be nice if the overall RDF
> scheme of things preserved this analogy.

Quite. (i.e. no, this is not new information.)

> For all these reasons, I would suggest that the decision to make
> literals (in the graph syntax) into things other than strings was a
> very bad one, and should be reconsidered. (I would have said this a
> long time ago if I had realized that this was in fact the decision
> that had been taken, and Im sorry that I missed that in the weeks
> following the Cannes meeting.)
>
> ------
>
> I gather that the motivation for this decision was twofold: the bit
> is there to record the parsetype being XML, to allow accurate XML
> round-tripping; and the lang tag is needed by some RDF users. (I find
> myself sceptical that the bit is actually needed: would it really be
> a disaster if a string which just happened to be legal XML, but
> hadn't been parsed from XML input, was accidentally misreported as
> being true XML?

yes. Absolutely.

> This seems to me like worrying about the possibility
> that monkeys might accidentally type out some Milton sonnets. But
> never mind.)
>
> Now, it seems to me that both of these are examples of information
> *about* a literal string which needs to be recorded in an unambiguous
> form.
> Since RDF is itself a language for asserting things about
> things, the obvious way to record information in an RDF graph is to
> use triples to make such assertions; but the obvious problem in doing
> that is that the literal in question would naturally be the subject
> of the triple, and literals cannot be subjects. Damn.

Again, I agree; but I don't see any new information
that should cause folks to change their minds.

[...]

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Friday, 28 June 2002 02:52:41 UTC