Merging: Resource vs. Thing [was: Re: n3/n-triples syntax question]

[...]
> In RDF, suppose you could know that two nodes in two
> graphs represented one and the same resource.  Would it
> not be allowable, even desirable, to merge them?  If not,
> why not?

This is actually a rather intriguing question. If you decided that you
*wanted* to merge the nodes, then you could of course do so, but you
would have to come up with a strategy for getting rid of the URI-views
that you do not want. For example, say you had:-

   <#p> = <#q> .

and then references to p and q throughout your store. Which URI-ref
would you use? Some processors should have enough power to let you
filter out the ones that you don't want, perhaps based upon string
length, and then alphabetical order. Indeed, there is a filter for CWM
that can do just that [1].

Processors such as CWM take an identifier to be the "building block"
that gets shunted around, and I think that that is the best behaviour,
since from a parser's point of view, it can never tell if two
URI-views denote the same resource. Indeed, this view of things may
actually help us come to some conclusions as to what URIs with fragids
(I'll call them fragURIs) identify.

Let's create two classes: one that is the class of all things that can
be identified with URIs (x:Resource), and one that is the class of
things that can be identified with fragURIs (x:Thing). We should be
able to conclude that:-

   x:Resource rdfs:subClassOf x:Thing .
   x:Thing rdfs:subClassOf x:Resource .

i.e., they have the same members. That does not necessarily mean that
the two classes are identical, just that any member of one class is
also a member of the other (and vice versa). Also, both of these
classes will be subclasses of rdfs:Resource.

   rdfs:Resource is rdfs:subClassOf of x:Thing, x:Resource .

This view of things is not all that necessary, unless you have some
built-in predicates such as log:racine. log:racine is a built-in to
CWM which means "the root form of this fragURI", viz.:-

   log:racine rdfs:domain x:Thing .
   log:racine rdfs:range x:Resource .
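
To make "root form" concrete, a small Python sketch of the idea follows. This is only an approximation of what log:racine computes, not CWM's implementation; the function name racine and the choice to return a URI without a fragid unchanged are assumptions.

```python
def racine(uri):
    """Return the part of a URI-ref before the first '#'.

    A URI with no fragid is returned unchanged (an assumption; the
    domain given above suggests log:racine is meant for fragURIs).
    """
    root, _sep, _fragid = uri.partition("#")
    return root
```

So racine("http://example.org/#blargh") gives "http://example.org/", a member of x:Resource in the terms used here.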

And so x:Thing and x:Resource correspond with the Thing and Resource
classes in thing.py [2] respectively. Here's a bit of N3 that can let
you distinguish:-

   { ?x log:uri [ string:contains "#" ] }
      log:implies { ?x a x:Thing } .
   { ?x log:uri [ string:doesNotContain "#" ] }
      log:implies { ?x a x:Resource } .

Which should work, although when you start applying rules [3], you'll
get things like:-

   <http://example.org/#blargh> a x:Thing, x:Resource .
   <http://example.org/blargh> a x:Resource, x:Thing .

Which is of course true, and exposes the fact that the finite amount
of data that a processor knows is significant. Does anyone else have
some thoughts on this?
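
For anyone who wants to play with this outside CWM, the two N3 rules and the effect of the mutual subclass statements can be mimicked in a few lines of Python. This is a hedged sketch only: classify, SUBCLASS_OF and types_after_rules are names invented for the example, and the closure loop assumes that the rules applied include the usual rdfs:subClassOf inference.

```python
def classify(uri):
    """Mirror the two N3 rules: a '#' in the URI means x:Thing."""
    return "x:Thing" if "#" in uri else "x:Resource"


# The two mutual subclass statements from above.
SUBCLASS_OF = {
    "x:Thing": {"x:Resource"},
    "x:Resource": {"x:Thing"},
}


def types_after_rules(uri):
    """Start from the syntactic class, then follow rdfs:subClassOf."""
    types = {classify(uri)}
    frontier = list(types)
    while frontier:
        cls = frontier.pop()
        for superclass in SUBCLASS_OF.get(cls, ()):
            if superclass not in types:
                types.add(superclass)
                frontier.append(superclass)
    return types


with_fragid = types_after_rules("http://example.org/#blargh")
without_fragid = types_after_rules("http://example.org/blargh")
```

Both calls end up with the same two classes, which matches the output shown above: the purely syntactic distinction is lost once the subclass statements are applied.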

Cheers,

[1] http://www.w3.org/2000/10/swap/test/forgetDups.n3
[2] http://www.w3.org/2000/10/swap/thing.py
[3] http://infomesh.net/2001/05/rdflint/rules.n3

--
Kindest Regards,
Sean B. Palmer
@prefix : <http://webns.net/roughterms/> .
:Sean :hasHomepage <http://purl.org/net/sbp/> .

Received on Monday, 3 December 2001 08:59:57 UTC