Anonymous resource names -versus- variables from Dan Brickley on 2000-05-03 (www-rdf-interest@w3.org from May 2000)

From: Dan Brickley <danbri@w3.org>
Date: Wed, 3 May 2000 13:16:36 -0400 (EDT)
To: www-rdf-interest@w3.org
Message-ID: <Pine.LNX.4.20.0005031216570.30944-100000@tux.w3.org>
Jan W writes...
> 	http://www.swi.psy.uva.nl/projects/SWI-Prolog/packages/sgml/online.html
> 
> P.s.s.	What about a standard or at least reserved naming schema for
>         anonymous resources?  I'm now using <Type>__<N>, which is nice
> 	for debugging. We also see genid<N>, but none of this is
> 	reserved!

I've been thinking about this a fair bit lately. Contrary to (my reading
of) Sergey's perspective on this, I don't believe we should generate URIs
for anonymously mentioned resources since they may well already have 'proper'
URIs that we're unaware of. So instead of assigning a URI, I believe we
need some way of tracking the fact that we don't really know the name of
the resource.

This can slip easily into the URN/URL/URI rathole of discussing "what's in
a name", what it is to name something, what it is for a resource to
"have" a URI name, whether the resource/URI relationship is 1:1, 1:many   
etc etc. While it would be good to make some progress on
those larger issues, I suspect there are options we can explore right now
that don't depend on resolving these. 

Specifically, I'm thinking of something like an implementors convention to
use a certain subset of URIs when naming resources whose "proper" URI we
don't know. This could be done in a number of ways: my preference is for a
URI scheme such as "var:" whose IDs were either huge random numbers or
generated from context. I call the scheme "var:" because these feel more
like variables than identifiers, which is either interesting or a bit of
a hack, depending on your perspective.
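
As a rough illustration of the two ways of minting such IDs, here is a
minimal Python sketch. The "var:" scheme is only the strawman proposed
here, not a registered URI scheme, and the function names, the 128-bit
size, and the use of SHA-256 over the source URI plus a node position are
all assumptions made up for the example.

```python
import hashlib
import secrets

# Hypothetical, unregistered scheme: the strawman from this message.
VAR_SCHEME = "var:"


def random_var() -> str:
    """Placeholder named by a huge random number (128 bits is an assumption)."""
    return VAR_SCHEME + str(secrets.randbits(128))


def contextual_var(source_uri: str, position: int) -> str:
    """Placeholder generated from context.

    Here "context" is assumed to mean the source document's URI plus the
    position of the anonymous node within it, hashed into a large number.
    """
    digest = hashlib.sha256(f"{source_uri}#{position}".encode("utf-8")).hexdigest()
    return VAR_SCHEME + str(int(digest, 16))


def is_placeholder(name: str) -> bool:
    """True if the name only stands in for a resource whose URI we don't know."""
    return name.startswith(VAR_SCHEME)
```

The random form never collides in practice but differs on every parse; the
contextual form gives the same placeholder each time the same document is
parsed. Either way, is_placeholder() lets a consumer tell invented names
from real ones.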


So I guess I should unpack this claim a bit, ie. that anonymous nodes in
RDF and variables (for RDF query, inference etc) are two sides of the same
coin...

If I write (assume some namespace declarations with the RDF namespace
imported as 'web:', and that top level is a typedNode)...

<Wordnet:Photo>
  <image:depicts>
    <Wordnet:Person>
      <abc:personalMbox web:resource="mailto:danbri@w3.org"/>
    </Wordnet:Person>
  </image:depicts>
  <ecom:buyOnline web:resource="http://example.com/shoppingcart/checkout?items=photo323423"/>
  <dc:description>...etc</dc:description>
</Wordnet:Photo>

In this serialized RDF, I'm mentioning some resource of rdf:type
Wordnet:Photo, and I'm claiming that it stands in an image:depicts
relationship to another resource of rdf:type Wordnet:Person, and that this
same person has an abc:personalMbox of some web-identified mailbox. Then
(to give some substantive content to this vacuous description) we can
supply more data, eg. where the photo can be bought online, Dublin Core
descriptions etc. In that syntax, the XML nesting serves to do the
"hooking together", so no explicit IDs get used for the different
resources. However when an RDF parser outputs the extracted RDF data, some
other convention is needed to hook together the graph. Where we have
public URIs for the resources, they do fine; where we don't, we have
kludges. The current kludge in many RDF parsers seems to be to use a
made-up URI based on the source of the data with "genid_n" tacked on the
end.




A partial graph representation of this might be sketched as:

[anon1]--rdf:type-->[Wordnet:Photo]
   \--image:depicts-->[anon2]
                         \--abc:personalMbox-->[mailto:danbri@w3.org]
                         \--rdf:type-->[Wordnet:Person]
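
To make the "hooking together" concrete, here is a minimal Python sketch of
what a parser might emit for this graph: a flat list of triples in which
the two anonymous nodes carry "var:" placeholders rather than generated
URIs. The make_namer() helper and the short counter-based IDs are
assumptions chosen for readability; they are not taken from any existing
parser.

```python
from itertools import count


def make_namer(prefix="var:"):
    """Return a function that hands out fresh placeholder names.

    Short sequential IDs are used only to keep the example readable.
    """
    counter = count(1)
    return lambda: f"{prefix}{next(counter)}"


new_var = make_namer()
photo = new_var()   # stands in for [anon1]
person = new_var()  # stands in for [anon2]

# (subject, predicate, object) triples for the partial graph
triples = [
    (photo, "rdf:type", "Wordnet:Photo"),
    (photo, "image:depicts", person),
    (person, "rdf:type", "Wordnet:Person"),
    (person, "abc:personalMbox", "mailto:danbri@w3.org"),
]

# A consumer can still see which names were invented by the parser.
placeholders = {
    term
    for subject, _, obj in triples
    for term in (subject, obj)
    if term.startswith("var:")
}
```

The point is that the triples stay connected through the placeholders
while the output still records that nobody actually knows the names of
those two resources.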


Jan's concern, echoing previous discussion of "generated IDs" for
anonymous resources, is that the output of an RDF parser shouldn't lose
track of the fact that we invented identifiers for these nodes.

My strawman suggestion is (a) that we adopt a convention of using URIs of
the form var:2342534647647476456456 for such situations, and (b) that such
identifiers are serving as named placeholders much like variables, and
that this analogy might be worth exploring further.

In pseudo-Prolog, and using short variable names X, Y in place of long
var: identifiers, the analogy should be clear.

	type(X, wordnetPhoto)
	depicts(X, Y)
	type(Y, wordnetPerson)
	abcMbox(Y, 'mailto:danbri@w3.org')


If this analogy persuades anyone, a more ambitious step would be to
consider this URI/variable scheme as a component of an RDF query
protocol, ie. using the same data structure in a different context. While
a parser will output something which basically says "there is some
resource X that is a photo and depicts some resource Y which...", a query
system might consume analogous descriptions and return matches against an
RDF database.
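
As a hedged sketch of that idea, the Python below treats any term starting
with "var:" in a pattern as a variable and matches a list of such patterns
against a small triple list. The database contents, the example.com URIs
for the photo and the person, and the function names are all hypothetical;
this is a naive nested-loop matcher, not a proposal for the actual
protocol.

```python
def is_var(term):
    """Terms in the (hypothetical) var: scheme act as variables."""
    return term.startswith("var:")


def match(pattern, triple, bindings):
    """Try to extend bindings so that pattern equals triple; None on failure."""
    extended = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return extended


def query(patterns, database):
    """Return every set of bindings that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in database:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return solutions


# Hypothetical database in which the resources do have public URIs.
database = [
    ("http://example.com/photos/323423", "rdf:type", "Wordnet:Photo"),
    ("http://example.com/photos/323423", "image:depicts", "http://example.com/people/dan"),
    ("http://example.com/people/dan", "rdf:type", "Wordnet:Person"),
    ("http://example.com/people/dan", "abc:personalMbox", "mailto:danbri@w3.org"),
    ("http://example.com/photos/1", "rdf:type", "Wordnet:Photo"),
]

# The same description a parser would output, now used as a query.
patterns = [
    ("var:X", "rdf:type", "Wordnet:Photo"),
    ("var:X", "image:depicts", "var:Y"),
    ("var:Y", "rdf:type", "Wordnet:Person"),
    ("var:Y", "abc:personalMbox", "mailto:danbri@w3.org"),
]

results = query(patterns, database)
```

Here results holds a single solution, binding var:X to the photo URI and
var:Y to the person URI; the second photo drops out because it depicts
nobody.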

Dan

--
mailto:danbri@w3.org
Received on Wednesday, 3 May 2000 13:16:43 UTC