- From: Pat Hayes <phayes@ai.uwf.edu>
- Date: Tue, 23 Oct 2001 23:11:19 -0500
- To: Brian McBride <bwm@hplb.hpl.hp.com>
- Cc: w3c-rdfcore-wg@w3.org
>Let's imagine we have a simple example where we have resources of
>type Foobar that have a unique property size which is an integer.

Help already. Do you mean that Foobar is a datatype? Which datatypes
are you going to use to refer to integers?

>The use case is to:
>
>  o merge two graphs using different lexical representations of an int

That doesn't make sense to me. A datatyping specifies the value for a
lexical representation. Do you mean that there are two different
datatypes with the same value space, or one datatype which maps two
different lexical forms (eg 0023 and 23) into the same value?

>Approach 1: concrete types are represented by literals which are a
>pair consisting of a type and a lexical representation, the result
>of a dumb merge is:
>
>_:foobar rdf:type util:Foobar .
>_:foobar util:size xsd:int-"10" .
>_:foobar util:size xsd:int-"010" .
>
>Unless the software is smart enough to understand that "05" and "5"
>denote the same integer, this is pretty unsatisfactory, since there
>really is only one size property.
>
>Conclusion: require a canonical representation of integers, or sw
>has to understand how to process the xsd:int type.

??? Surely the point of using xsd: is that it refers one to a
datatyping spec and *that spec* establishes that 10 and 010 have the
same value. If RDF has to do this itself, what's the point of using
XSD?

>Approach 2: the DAML+OIL approach - a dumb merge results in:
>
>_:foobar rdf:type util:Foobar .
>_:foobar rdf:size _:size1 .
>_:size1 rdf:type xsd:int .
>_:size1 rdf:value "10" .
>_:size2 rdf:type xsd:int .
>_:size2 rdf:value "010" .
>
>Smart sw can recognise that size1 and size2 are really the same
>thing. I'm worried though, that doing that will call for a
>compare against all anon resources in the graph. If I'm adding
>_:size2 to the graph, would I be able to restrict the sw attention
>to checking only whether it was equal to size1.
>Yes, it could; it
>need only consider arcs with the same blunt end and property.
>
>Conclusion: Same as 1; to do the merge properly require either a
>canonical representation of ints, or the software has to understand
>ints and their lexical representation. Takes more triples this way.
>Assert without proof, that this is harder to implement than approach
>1, since it involves multiple triples.
>
>Approach 3: DanC's approach
>
>_:foobar rdf:type util:Foobar .
>_:foobar util:size "10" .
>_:foobar util:intSize _:size1 .
>_:size1 rdf:type xsd:int .
>_:size1 rdf:value "10" .
>_:foobar util:intSize _:size2 .
>_:size2 rdf:type xsd:int .
>_:size2 rdf:value "010" .
>
>Conclusion: same as approach 2.
>
>Approach 4: Pat's approach (?)
>
>_:foobar rdf:type util:Foobar .
>_:foobar util:size size1:"10" .
>_:size1:"10" rdf:type xsd:int .
>_:foobar util:size size2:"010" .
>_:size2:"010" rdf:type xsd:int .
>
>Software that's aware of how to process lexical representations of
>ints reduces this to three triples. Less triple bloat than above.
>Debatably extends M&S's model, but compatible with current RDF
>in that types added as an extension.
>Worry about DAML+OIL requirement that concrete types and resources
>are disjoint. Specifically, rdf:type in above is illegal in
>DAML+OIL; no property can have both a resource and a concrete type
>in its domain.

They would use [rdfs:range xsd:integer] to convey the required typing
information. We could do that also, by the way. The point is that
however the information is conveyed, the literals get interpreted
according to the datatype conventions that are required by the
rdf:type info, whether that is expressed explicitly (why not?) or
implicitly (eg via rdfs:range).

>I guess what's bothering me here is RDF's ability to separate out
>individual statements and if the original set was true, they should
>be true independently. So
>
>_:foobar util:size size1:"10" .
>
>Is this really true? is it the same as:
>
>_:foobar util:size "10" .

Yes.
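To make the point about datatype conventions concrete, here is a rough
sketch (not a proposal for any particular implementation) of a merge
that interprets each literal through its datatype's lexical-to-value
mapping. All names here (LEXICAL_TO_VALUE, merge_typed_literals) are
hypothetical, Python's int() is only a stand-in for the real xsd:int
mapping, and the sketch assumes, as the example above does, that
_:foobar names the same node in both graphs.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Hypothetical datatype table: datatype name -> lexical-to-value mapping.
# Assumption: int() approximates the xsd:int mapping for plain decimal forms.
LEXICAL_TO_VALUE: Dict[str, Callable[[str], object]] = {
    "xsd:int": int,
}

# A triple whose object is a (datatype, lexical form) pair, as in Approach 1.
LexicalTriple = Tuple[str, str, Tuple[str, str]]
# The same triple after the literal has been mapped to its value.
ValueTriple = Tuple[str, str, Tuple[str, object]]


def merge_typed_literals(*graphs: Iterable[LexicalTriple]) -> Set[ValueTriple]:
    """Merge graphs, identifying literals by datatype value, not lexical form."""
    merged: Set[ValueTriple] = set()
    for graph in graphs:
        for subject, prop, (datatype, lexical) in graph:
            # The datatype, not the merge code, decides what the literal denotes.
            value = LEXICAL_TO_VALUE[datatype](lexical)
            merged.add((subject, prop, (datatype, value)))
    return merged


g1 = [("_:foobar", "util:size", ("xsd:int", "10"))]
g2 = [("_:foobar", "util:size", ("xsd:int", "010"))]
merged = merge_typed_literals(g1, g2)
```

Under these assumptions the two size statements collapse into one,
because the comparison is made on values that the datatype assigns and
not on the strings "10" and "010".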
>The value of the size should be an integer, shouldn't it.

It is. (?? I'm not following your worry here.)

>There may be other approaches I should cover; sorry I'm running out
>of time and patience (as probably are you).
>
>  o query a graph where the graph and the query contain different
>lexical representations of an int.
>
>It seems to me that it's likely that implementations will want to
>store a canonical representation of an int, and will convert the
>query to that canonical representation. So this isn't a problem.
>What bothers me though, is, if the graph implementation reads in
>a representation of an int as say from "010" but changes it
>internally to some canonical representation which loses the fact it
>was originally represented as "010", have we lost anything?

Not if we know that it was in a datatype which maps 010 and 10 to the
same value; and if we didn't know that, then we wouldn't be justified
in making the inference in the first place.

> Is there any distinction amongst the different approaches.
>
>So if I've got:
>
>_:foobar rdf:type util:Foobar .
>_:foobar rdf:size _:size1 .
>_:size1 rdf:type xsd:int .
>_:size1 rdf:value "10" .
>_:size2 rdf:type xsd:int .
>_:size2 rdf:value "010" .
>
>  o query a graph for all Foobar's whose size is less than 12.
>
>In a large scale database, it would be crazy, from an implementation
>point of view, not to store the underlying data in a way that
>represented the ordering of integers so that the query can be done
>efficiently. This implies using an internal canonical
>representation.
>
>I'm left with the feeling that representing a concrete type as a
>pair will be easier to implement.

It probably would be, but it would ruffle a lot of XML feathers. I
think we can allow it if people want to use it, and also allow more
flexible but expensive schemes if people want to use those.

Pat
-- 
---------------------------------------------------------------------
IHMC                    (850)434 8903   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola, FL 32501     (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 24 October 2001 00:11:25 UTC