- From: Geoff Chappell <geoff@sover.net>
- Date: Thu, 1 Aug 2002 23:40:04 -0400
- To: "Sampo Syreeni" <decoy@iki.fi>
- Cc: <www-rdf-logic@w3.org>
----- Original Message -----
From: "Sampo Syreeni" <decoy@iki.fi>
To: "Geoff Chappell" <geoff@sover.net>
Cc: <www-rdf-logic@w3.org>
Sent: Thursday, August 01, 2002 7:51 PM
Subject: Re: Dataypes, literals, syntax

>
> On 2002-08-02, Geoff Chappell uttered to Sampo Syreeni:
>
> >It strikes me that it is legitimate to pack langid into literals because
> >the langid is really a statement about the string/label and not the
> >thing that it denotes.
>
> Huh? But that's *exactly* what it is. The literal string is by no means an
> unambiguous label for a given literal, but precisely an extra attribute
> which is necessary in order to both disambiguate which literal we are
> talking about

If we're talking about (as opposed to with) literals, I guess we're taking
the position that literals are tidy, i.e. that they denote themselves? Sure,
in that case an associated langid is making a statement about the thing the
literal denotes, since the thing it denotes is itself. (Or did I
misunderstand you?)

>*and* to interpret the string value coherently. Consider:
>
> <s,p,o1>
> <s,p,o2>
>
> where
>
> o1==("aho","fi",false)
> o2==("aho","ja",false) .

As names, these two things (rdf literals) are clearly different. But
_taking the untidy position_, absent other information, how do I know that
these two names don't refer to the same object? Or how about the case:

 o1==("aho","fi",false)
 o2==("aho","fi",false) .

Can I assume that the two names do refer to the same object, given that
many (most?) words have multiple senses? I guess we'd have to assume that
if langids were put on equal footing with datatypes, since datatypes are
assumed to functionally bind a lexical representation to a value.

>
> You have two strings which are precisely equivalent in the literal sense,
> but which clearly mean two entirely different things in the languages
> denoted. (Assume away the trouble with hiragana vs. romaji for Japanese,
> for the sake of an example.)
> I would contend such a difference constitutes
> what is properly called a semantic distinction. The situation wouldn't
> really be different if we substituted identical languages and parse types
> "xsd:decimal" and "xsd:string".
>
> AFAICT, the part having to do with subtyping relations within XSD is well
> beyond basic RDF, just as rdfs:subPropertyOf isn't supposed to be
> understood by RDF-only parsers. I would tend to think that two lexically
> equal literal strings should be treated as RDF-inequal if they had
> separate language and/or separate parse type

I'd agree if tidy literals are the rule, disagree otherwise (assuming that
RDF-inequal is a measure of the inequality of the things the literals
denote, not the literals themselves).

>(even given that parse types
> include all XSD data types), and only be treated as equal at the higher
> level handled by XSD aware API's.
>After all, that's what's being done to
> anonymous nodes with daml:UniqueProperty's and the like, now, or with
> identical string values with different parse types and/or languages.
>
> >By the same token, it seems to make some sense to pack a datatype into a
> >literal as long as it is only saying something about the string (i.e.
> >"10" is in the lexical space of xsd:integer) but seems odd for that
> >packed statement to be saying anything about the value denoted by that
> >string
>
> On the contrary. "aho" is both in the lexical space of (romanized)
> Japanese and Finnish, yet the difference needs to be made in order to
> express both values for a single property on a given subject. There is a
> clear difference, both in the semantic and RDF-equality terms, here, as
> there would be if we were talking about xsd:integer"1001" and
> xsd:string"1001". Kind of a special case, I grant that, but it's elegance
> I'm after.
>
> >(assuming of course that literals can denote things other than
> >themselves).
>
> They can, of course.
> Otherwise textual encodings of anything other than
> literal strings would be meaningless.

I guess I'm a bit confused whether you're arguing for or against tidy
literals. Most places you seem to take a tidy stance, but here it sounds
otherwise. Is it fair to say that you want a literal to be able to be an
unambiguous referrer by definition (by always affixing a
datatype/context)? If so, why not just use a uri scheme?

>
> >Otherwise what's the distinction between statements packed inside
> >literals, and statements represented in the graph?
>
> A derivative of the one that is currently being made between resources and
> literals, of course. Literals are an artifact of us wanting to represent
> attributes separately from relations. They call for extra data, like
> language and parse type, which aren't present in the case of normal
> resources because *every* distinguishing feature of a resource can be
> assumed to be represented by its name. The same doesn't hold for literals
> which may very well represent anything at all. That's why we get language
> and parse type, but also quite a number of extra features we might want to
> talk about.
>
> >I guess if rdf evolves some sort of quoting mechanism, we wouldn't need
> >to pack things within literals at all (at least not as a way of making
> >statements about the string).
>
> The trouble is, language and parse type are part of the identity of a
> string. ("aho","fi",0)!=("aho","ja",0), so you cannot represent "aho" in
> the graph and just talk about it separately from its other attributes.
> IOW, you cannot name the Finnish "aho" separately from the Japanese one
> without referring to the language. That is also a distinction which arises
> solely out of the semantic difference between the two strings, much like
> the difference between xsd:integer"1001" and xsd:string"1001".
>
> If there were no literals, we could always assume that any difference in
> identity would be encapsulated by the name of the object (that's pretty
> much the definition of a "name", after all),

That seems to me a better definition of uriref under rdf than of name
(i.e. urirefs are assumed to be unambiguous though not necessarily unique
names).

>but when we refer to objects
> by their content (like we do with literals), any distinctive attribute
> whatsoever will have to be represented. Granting an open type mechanism is
> one way to accomplish precisely that. (If you want a distinction, you make
> it by allocating a new type.) Without it, there's Inelegance and Badness.
> (I.e. a literal might very well share all the currently defined
> attributes, but might *still* be different because of a characteristic
> not defined. Currently one example of such a characteristic is the fact
> that one literal might be an xsd:integer and another an xsd:string.)
> --
> Sampo Syreeni, aka decoy - mailto:decoy@iki.fi, tel:+358-50-5756111
> student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
> openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

--geoff
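
For illustration only, here is a minimal Python sketch of the distinction argued over in this message: equality of literals as names versus equality of the things they denote. Nothing here comes from the thread or from any RDF library. The `Literal` class, the `LEXICAL_TO_VALUE` table, and the `value_of` and `same_value` helpers are all hypothetical, and the sketch assumes that only a datatype (not a langid) functionally binds a lexical form to a value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    """A literal treated as a name: lexical string plus packed attributes."""
    lexical: str
    lang: Optional[str] = None       # e.g. "fi" or "ja"
    datatype: Optional[str] = None   # e.g. "xsd:integer" (hypothetical key)


# Assumed lexical-to-value mappings; a real XSD processor does far more.
LEXICAL_TO_VALUE = {
    "xsd:integer": int,
    "xsd:string": str,
}


def value_of(lit):
    """Return the denoted value, or None when no datatype fixes one."""
    parse = LEXICAL_TO_VALUE.get(lit.datatype)
    return None if parse is None else parse(lit.lexical)


def same_value(a, b):
    """True/False when both values are known, None when undetermined."""
    va, vb = value_of(a), value_of(b)
    if va is None or vb is None:
        return None
    return type(va) is type(vb) and va == vb


fi = Literal("aho", lang="fi")
ja = Literal("aho", lang="ja")
print(fi == ja)              # False: as names, the two literals differ
print(same_value(fi, ja))    # None: a langid alone does not fix a value

ten = Literal("10", datatype="xsd:integer")
also_ten = Literal("010", datatype="xsd:integer")
print(ten == also_ten)             # False: different names
print(same_value(ten, also_ten))   # True: same denoted integer
```

Under the tidy reading only the `==` comparison matters, since a literal denotes itself. Under the untidy reading `same_value` is the relevant question, and for language-tagged strings this sketch deliberately leaves it unanswered.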
Received on Thursday, 1 August 2002 23:10:37 UTC