Re: A basis for convergence and closure? from Sergey Melnik on 2002-02-07 (w3c-rdfcore-wg@w3.org from February 2002)

From: Sergey Melnik <melnik@db.stanford.edu>
Date: Wed, 06 Feb 2002 16:30:20 -0800
To: Pat Hayes <phayes@ai.uwf.edu>
CC: Patrick Stickler <patrick.stickler@nokia.com>, w3c-rdfcore-wg@w3.org, "McBride, Brian" <bwm@hplb.hpl.hp.com>
Message-ID: <3C61CA9C.D1B0F6DA@db.stanford.edu>
Pat Hayes wrote:
> 
> >
> >
> >It seems that, from all the past couple of weeks discussions,
> >there are the following characteristics on folks wish lists
> >(this is a partial recap of some of the desiderada):
> >
> >1. A working MT (duh ;-)
> >2. Tidy literals
> >3. A global/implicit idiom
> >4. A local/explicit idiom
> >5. Same vocabulary valid for both local and global idioms
> >6. Free combination of local and global idioms without conversion
> >7. The ability to conduct queries by value
> >8. The ability to conduct queries by literal
> >9. Datatype URIs denote the entire datatype, as defined by
> >    the datatype "owner", not only one of its components

Pat, I think your summary is sharp enough so that I would suggest to use
it as a starting point for a convergence write-up. We could produce it
in a quick and streamlined fashion by simply referring to the relevant
sections of the TDL, S, and  Pat's documents where appropriate, without
replicating existing content. I think the main focus should be on
capturing two things: a list of items we managed to agree on, and a list
of options where we still have to make a final choice. What do you think
guys?

> I think I can summarize a way to have all the above with a minimal
> imposition of particular idioms. Consider the following story, which
> I think is very similar to  Patrick's, but expressed slightly
> differently. This is a summary of the 'simple' version of the 'Oh my
> GOD' proposal. To hell with the subtle version.
> 
> I. Literal nodes are tidy, and literals denote themselves (ie we can
> treat literals as syntactic labels).

I think (I) is for the "agreed" list.
 
> II. rdf:value means that its subject can be represented textually by
> its object (somehow), so
> _:s rdf:value "10" .
> means that '10' is one possible way to 'write' whatever it is that _:s denotes.

(II) is stated flexibly enough so I'd also put it on the "agreed" list.
 
> Call that a 'value triple'. It is the basic (weakest) way of linking
> a value to a literal's lexical form.
> 
> III. There are two basic ways to say more about the relationship
> between the subject of a value triple and the lexical form of the
> literal.
> 
> IV.  One is to use a 'tighter' property, ie a subPropertyOf
> rdf:value. So for example (Graham's F case) one could assert
> ex:ISO8601 rdf:subPropertyOf rdf:value .
> and then
> _:s ex:ISO8601 "10" .
> would say that the subject node denotes something that could be
> written as '10' *using the conventions associated with that URI*.
> 
> This allows for 'multiple' such assertions, where the meaning would
> be that the literal/value pair had to conform to all the named
> constraints, ie a conjunctive reading, as usual. It also allows for
> alternative lexical forms for the same value, as in
> _:s xsd:realnumber "10.3" .
> _:s ex:germannumber "10,3" .

(IV) would be on the "options" list, right?
 
> V. The other is to assert a special datatyping triple along with the
> value triple, using rdf:dtype (or rdfd:type, whatever), giving the
> doublet case:
> 
> _:s rdf:value "10" .
> _:s rdf:dtype xsd:integer .
> 
> This has exactly the same meaning as
> _:s xsd:integer "10" .
> when xsd:integer is known to be a datatype, and the two forms can be
> used interchangeably or together.
> 
> rdf:dtype is always a subPropertyOf rdf:type.

(V) is another "option"
 
> VI. Each of these two cases IV and V requires a special semantic
> condition, and those conditions refer to externally defined datatype
> mappings. In order for an engine to make use of these mappings,  the
> relevant uris must be declared to be datatypes  by an assertion like
> xsd:integer rdf:type rdf:Datatype .
> or
> xsd:integer rdfs:subPropertyOf rdf:value .
> 
> It is acceptable for the graph to entail these assertions, but I
> suggest that we recommend that they be made explicit.

(VI) is also an option, because I believe that there are further
solutions besides (IV) and (V), see below.
 
> VII. Any graph that entails one of the forms above has the same
> meaning, as far as datatyping is concerned.  There are several ways
> to take advantage of this.
>
> One is to make <dtype> a subPropertyOf <type>, which makes them
> equivalent, so that the datatyping effect of the doublet works for
> all type assertions applied to a datatype. That is the dangerous way
> to do it, since it will break if there are two datatypes with
> overlapping value spaces but incompatible lexical spaces. (The
> symptom of 'breaking' will arise when the datatyping machinery is
> invoked; exactly what happens in the bad case is not fully
> determined. One possibility would be that the datatyping constraint
> machinery will barf. Another is that it will work, but produce
> inconsistent or misleading answers.) Nevertheless, this is a user
> option that may be useful in situations where it is known that no
> such datatype clashes will arise, or to handle legacy RDF code
> written using rdf:type. We note that it is always safe to do this in
> the untyped case, ie where no datatypes have been declared.
> 
> Another way is to add a further constraint on rdf:dtype, which allows
> particular kinds of assertion to entail rdf:dtype triples. To do this
> with full generality would require the ability to write RDF rules
> (implications), but the only case that seems to be of widespread
> interest can be captured by one simple 'ranging' rule that could be
> incorporated into the general semantic conditions, viz:
> 
> ddd rdf:type rdf:Datatype .
> aaa rdf:range ddd .
> bbb aaa ccc .
> --->
> ccc rdfd:type ddd .

Alternatively, why not just pretend there is a 'rule'

  xxx rdf:type ddd .
  xxx rdf:value vvv .
  --->
  xxx ddd vvv .

? (or additionally with   ddd rdf:type rdf:Datatype)
 
As a further alternative to using rules, I could imagine that we
consistently use pairs for typed values throughout the idioms. Thus, in

  _:1 prop _:2
  _:2 rdf:type xsd:int
  _:2 rdf:value "10"

_:2 would denote a pair, just as in TDL and S-P, but then in

  _:1 prop _:2
  _:2 xsd:int "10"

_:2 could also denote the same pair (instead of element of a value space
as it is defined now in S-A).

The above works just fine without introducing rdf:dtype. Of course, this
suggestion would make datatypes tightly coupled with their lexical
representations (so Patrick would/should like it ;) My guess is that the
question of semantic adequacy raised by Pat might not be very critical;
if the current use assumes that typed values go together with their
lexical representations (XSD also suggests that), then we could just
call such pairs "typed values" and that's it.

To summarize, I'm suggesting additional two "options". First, a
different kind of a rule-based approach suggested by Pat, second, a
consistent pair-oriented view throughout idioms.

Sergey

-- 
E-Mail:      melnik@db.stanford.edu (Sergey Melnik)
WWW:         http://www-db.stanford.edu/~melnik
Tel:         OFFICE: 1-650-725-4312 (USA)
Address:     Room 438, Gates, Stanford University, CA 94305, USA


> With this rule, we can infer the doublet case from an assertion that
> the range of a property is a datatype.  This is 'safe', even though
> it refers to rdf:range, because the rdfs closure rules do not allow
> us to infer that a subclass of a range is a range; so even if the
> ranges overlap, this rule will not produce any unwanted datatyping
> clashes.
> --
> 
> Thats the full story. We don't need to say this idiom is OK and that
> idiom isn't; we can just let people use rdf:value and rdf:dtype
> pretty freely, and things will work out as they ought to, once you
> get used to what they are supposed to mean. rdf:value means 'this
> *can* be written out like this:<literal>', and rdfd:type means 'this
> *must* be written out using the conventions found here:<datatype
> URI>' So rdf:dtype is a constraint on the ways of writing a value,
> which is why it has this odd semantic constraint with rdf:value,
> right?
> 
> Oh, I forgot; it also follows that (Dan C., are you reading this?)
> 
> VIII. The in-line use of literals, as in
> <mary> <age> "10" .
> has a fixed meaning which is absolutely unchangeable, which is that
> Mary's age is the *actual literal*, ie the character string '10'.  So
> if you want to write things like that and have them mean Mary is ten,
> then either <age> has to have an odd extension, or else, tough. This
> is where Sergey's idea about XML styles might be a point worth
> making, however, to keep out the noisy townfolk; and as Patrick says,
> you can always interpret <age> to *mean*
> (lambda (x y)
>     x aged _:s
>     _:s rdf:value y  )
> and then put range constraints on <aged>.
> 
> Pat
> 
> --
> ---------------------------------------------------------------------
> IHMC                                    (850)434 8903   home
> 40 South Alcaniz St.                    (850)202 4416   office
> Pensacola,  FL 32501                    (850)202 4440   fax
> phayes@ai.uwf.edu
> http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 6 February 2002 19:00:12 UTC