RE: anonymous nodes

Just for the record, I'm not on any crusade to get rid of
anonymous nodes (at the moment at least ;-)

If folks find them useful, fine. *BUT* I am very much
interested in seeing that the core, fundamental, basic 
representation of knowledge (such as resource identity) is
*not* defined in terms of anonymous nodes. Specifically, I
am opposed to the idea of using complex graph structures to
represent resources based on their serialized QName representation
in lieu of globally valid URIs (as opposed to system internal
identifiers of "anonymous" nodes).

(cf. the separate posting 'QNames are not URIs')

> -----Original Message-----
> From: ext Seth Russell [mailto:seth@robustai.net]
> Sent: 14 August, 2001 16:48
> To: Stickler Patrick (NRC/Tampere); sean@mysterylights.com;
> scranefield@infoscience.otago.ac.nz; www-rdf-interest@w3.org;
> www-rdf-logic@w3.org
> Subject: anonymous nodes
> 
> 
> From: <Patrick.Stickler@nokia.com>
> 
> > Well, I'm probably going to get grilled for this comment, but personally
> > I don't like anonymous nodes. After all, just what *is* an anonymous
> > node. Every application that I've seen that uses them has had to give
> > them some form of identity, and yet that identity is system dependent.
> 
> Yep, any implemented system that has nodes must have some way to identify
> them internally (whether they are termed anonymous or not).  Point is that
> the anonymous ones just can't be accessed by name from outside that system.
> The only way to address an anonymous node from outside the implemented
> system is by criteria ... for example I'm talking about the nodes in your
> system for which [foo bloop; bar goop] and of those nodes I wish to say [gar
> poop] ... oh ... and I will be expecting that you have only one such node.

Forgive me if I'm being a bit naive here, but isn't there a
difference between triples, representing knowledge, and queries
representing patterns or templates needing to be satisfied
against a given knowledge base?

While I certainly subscribe to the need to be able to define
axioms based on queries, and accept that such queries may not
care what the actual identity of the resources is, operating
only on the basis of their properties -- eventually, if either
the explicit or inferred knowledge is to be useful outside the
scope of a given system (such as to e.g. some other agent
or a human needing information) then those resources will
ultimately have to have some recognizable and globally
unique identity, no?

How does it help if, after your above axiom is applied, I
do a query such as 'get me all resources X where [X gar poop]'
and I get back X = [] or X = G28998_2898193282.28281? Not
particularly informative, eh?
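
To make that concrete, here is a minimal sketch in Python. Everything
in it is an assumption for illustration only: the toy triple list, the
`new_anonymous_node` and `query_subjects` helpers, and the
`http://example.org/` URI are hypothetical and do not come from any
real RDF toolkit.

```python
import itertools

# Counter behind the system-internal labels handed to anonymous nodes.
_counter = itertools.count(1)

def new_anonymous_node():
    """Return a label that only means something inside this process."""
    return f"_:G{next(_counter)}"

def query_subjects(triples, predicate, obj):
    """Return every subject X for which (X, predicate, obj) is in the store."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]

# Hypothetical data: one anonymous subject and one subject named by a URI.
anon = new_anonymous_node()
triples = [
    (anon, "foo", "bloop"),
    (anon, "bar", "goop"),
    (anon, "gar", "poop"),
    ("http://example.org/thing/42", "gar", "poop"),
]

# 'get me all resources X where [X gar poop]'
results = query_subjects(triples, "gar", "poop")
for x in results:
    print(x)
```

The first answer is a generated label that another system cannot
dereference or reuse; the second is a name that any agent can use in
statements of its own.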

Again, it's not that one can't encode knowledge using
anonymous nodes (e.g. per the traditional examples such as
[somebook author [X firstname John; X lastname Doe]] etc.) but
is that *really* necessary and useful when SW agents need
an identity for an author? Eventually, we're going to need
an identity for that author to say anything about them. In
all such examples I've seen, (a) I would expect that an
identity already exists for them that is suitable to use as
the object of the first triple, and (b) I would argue that
such knowledge is ambiguous, as it could in fact be construed
as a query which is satisfied by *all* authors whose
firstnames are John and lastnames are Doe!
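
A rough Python sketch of point (b), reading the anonymous description
as a pattern. The data, the `match_name` helper and the
`http://example.org/people/` URIs are all made up for illustration.

```python
def match_name(triples, first, last):
    """Return all subjects having both the given firstname and lastname."""
    firsts = {s for (s, p, o) in triples if p == "firstname" and o == first}
    lasts = {s for (s, p, o) in triples if p == "lastname" and o == last}
    return sorted(firsts & lasts)

# Hypothetical knowledge base holding two different people named John Doe.
triples = [
    ("http://example.org/people/jdoe-1", "firstname", "John"),
    ("http://example.org/people/jdoe-1", "lastname", "Doe"),
    ("http://example.org/people/jdoe-2", "firstname", "John"),
    ("http://example.org/people/jdoe-2", "lastname", "Doe"),
    ("http://example.org/people/jroe", "firstname", "John"),
    ("http://example.org/people/jroe", "lastname", "Roe"),
]

# [X firstname John; X lastname Doe] treated as a query over the store.
candidates = match_name(triples, "John", "Doe")
for uri in candidates:
    print(uri)
```

Both John Does satisfy the description, so a statement such as
[somebook author [...]] does not by itself say which of them wrote
the book. Using either URI as the object of the author triple would.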

Again, I assert that anonymous nodes were primarily a side effect of
syntactic conveniences based on the syntactic serialization of
RDF statements, either to avoid having to explicitly name an
object resource and then define additional separate statements
about that resource, or to facilitate collection structures.
The latter produce graph structures which prevent efficient
syndication of knowledge from multiple, individually serialized,
sources.
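
Here is a small Python sketch of the syndication problem, under the
assumption that a merge has to keep anonymous nodes from different
sources apart (their labels are only meaningful within the
serialization they came from). The `merge` and `subjects` helpers, the
"_:" label convention and the example URI are hypothetical.

```python
def rename(term, index):
    """Make an anonymous label unique to its source; leave other terms alone."""
    return f"{term}@src{index}" if term.startswith("_:") else term

def merge(sources):
    """Merge triple lists, keeping each source's anonymous nodes distinct."""
    merged = set()
    for index, triples in enumerate(sources):
        for s, p, o in triples:
            merged.add((rename(s, index), p, rename(o, index)))
    return merged

def subjects(triples):
    """Return the set of distinct subjects in a collection of triples."""
    return {s for (s, _, _) in triples}

# Two sources describing (perhaps) the same person with anonymous nodes.
anon_sources = [
    [("_:a", "firstname", "John"), ("_:a", "lastname", "Doe")],
    [("_:a", "email", "jdoe@example.org")],
]

# The same statements with a shared, globally valid (made-up) URI.
person = "http://example.org/people/jdoe"
uri_sources = [
    [(person, "firstname", "John"), (person, "lastname", "Doe")],
    [(person, "email", "jdoe@example.org")],
]

anon_count = len(subjects(merge(anon_sources)))
uri_count = len(subjects(merge(uri_sources)))
print(anon_count, uri_count)
```

With anonymous nodes the merged graph holds two unrelated subjects,
and some extra inference is needed to decide whether they are the same
resource. With the URI, the three statements simply land on one node.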

> > IMO, anonymous nodes were a hack to allow collection structures as
> > Objects,
> 
> Well maybe that is what some people use them for.   The primary reason,
> imho, is that we got so many nodes we can't name them all ....  tis
> impossible .... twill always be impossible.   But that doesn't mean that we
> can't start talking about things which are impossible to name or for which
> (in a distributed system) it is infeasible to name.   So next time you go to
> the beach select the first grain of sand that is touched by your left pinkie
> and send it to me ...
> 
> .... along with a dollar bill of course.

Fair enough. As I said above, there is a need to be able to define
patterns or templates of facts in order to infer new knowledge. Duh.
Like that's the whole point of the SW ;-)  But I don't buy the
argument that naming is infeasible in the general case: nearly all
explicit statements about resources would be able to specify a known,
globally unique URI for those resources if they had to.

Unfortunately, folks will take the path of least resistance. If
they see some serialization model that requires fewer keystrokes and
doesn't require them to look up what the URI of a person is, and
they think they can just toss the person's name in there instead, they
will use it, and the result will be a SW that is *extremely* cumbersome
to process because of the gross proliferation of implicit knowledge.

It is true that we will never escape implicit knowledge or the need
to make statements about things where we do not know their precise
identity, but those should be the exceptions to the rule, and the
general machinery of RDF should discourage (or at least not encourage)
the definition of inexplicit knowledge.

Regards,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
 

Received on Wednesday, 15 August 2001 06:24:08 UTC