- From: David Allsopp <dallsopp@signal.dera.gov.uk>
- Date: Thu, 16 Aug 2001 11:24:14 +0100
- To: Patrick.Stickler@nokia.com
- CC: www-rdf-interest@w3.org
Patrick.Stickler@nokia.com wrote:
>
> > > rather than a single opaque URI identifier.
> >
> > But this is just querying - you have to do that anyway to find out what
> > the "opaque URI" actually is.
>
> Why would you need to find out what a URI "is". Do you
> mean dereferencing it? Surely dereferencing of URIs is not
> required for any kind of RDF based inferencing.

Ok, I'm confused as to what you meant - you appeared to be saying that
it was difficult to refer to the anonymous resource because you would
have to use its 'surrounding' related nodes to identify it.

My point was that even if the node does have an opaque URL, if the data
are at a remote site or agent you have no idea what that URL string is,
and have to form a query in order to find out. This query would of
course use the related nodes, so the situation is no different.

If on the other hand you have accessed and parsed that RDF locally, you
will have generated a local ID for that resource and can refer to it
using that ID.

> Even if some application may wish to dereference a URI for
> some purpose, that URI is not a "URI" per se to RDF, it is
> simply an opaque universal identifier, no?

Yes; I wasn't suggesting dereferencing.

> > John --hasFather--> [] --age--> 84
> >
> > John --hasFather--> [] --age--> 84
> >
> > compared with
> >
> > John --hasFather--> randomgenid0123456789 --age--> 84
> >
> > John --hasFather--> randomgenid9876543210 --age--> 84
> >
> > where [] represents an anonymous node.
> >
> > The point is that we don't know the name of John's father, so assigning
> > him a random name makes our life harder, not easier, since everybody
> > necessarily assigns him a _different_ random name.
>
> But this is exactly my point. There is no such thing as an anonymous
> node! It always gets a randomly generated system identifier!

So what? In principle, the system can keep track of which nodes are in
fact anonymous and distinguish them from the others.
If the RDF is then re-serialized, the anonymity of the nodes should then
be preserved, so no system identifiers are exported, and the recipient
understands that these are anonymous nodes. I just don't see the problem.

Let's say that I implement a system where the anonymous resource is NOT
given a system name in the form of a URI string, but is only stored as a
distinct object in memory. Would that be any different? Or we could
randomly change the name of all anonymous nodes every second; it
wouldn't make any difference.

[Aside: perhaps this is rather like the Robinson Crusoe story, where he
meets a foreigner on his island; not knowing his name, and having met
him on a Friday, he calls him "Man Friday". The man presumably has a
real name, but we don't know it - we have to call him _something_, but
we acknowledge it isn't really his name.]

> So if I get the same statement twice (e.g. it happens to be defined
> redundantly in two disparate sources) then a given system will
> assign *different* system identities to each anonymous node
> for each essentially equivalent statement.

Not necessarily - we have the option of keeping track of which resources
are anonymous and handling them specially if we want.

> Would it not be far better to have a "variable" for an anonymous
> node which is based on the fusion of the subject and predicate
> identities. Thus rather than the current practice where
>
> John --hasFather--> [] --age--> 84
> John --hasFather--> [] --age--> 84
>
> results in
>
> [John, hasFather, gen123]
> [gen123, age, 84]
> [John, hasFather, gen456]
> [gen456, age, 84]

This is what tends to happen, but we can in principle detect the
anonymous nodes and do more intelligent merging.
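
A minimal sketch of what such "more intelligent merging" might look like, in Python. Everything here is an illustrative assumption rather than anything taken from an existing RDF toolkit: the Anon class, the add_father and merge_anonymous helpers, and above all the merge rule itself, which treats two anonymous nodes as the same resource when they have identical arcs to and from named nodes. That rule is only safe if duplicated statements really do describe one resource (plausible for hasFather, not for hasBrother).

```python
class Anon:
    """Hypothetical anonymous node: a distinct object with no exported name."""
    __slots__ = ()


def add_father(store, child, age):
    """Record 'child --hasFather--> [] --age--> age' with a fresh anonymous node."""
    node = Anon()
    store.append((child, "hasFather", node))
    store.append((node, "age", age))


def merge_anonymous(triples):
    """Merge anonymous nodes whose arcs to/from named nodes are identical.

    Assumption: such nodes denote the same resource. Arcs between two
    anonymous nodes are ignored when comparing, so this is only a sketch.
    """
    signature = {}
    for s, p, o in triples:
        if isinstance(s, Anon) and not isinstance(o, Anon):
            signature.setdefault(s, set()).add(("out", p, o))
        if isinstance(o, Anon) and not isinstance(s, Anon):
            signature.setdefault(o, set()).add(("in", s, p))

    # The first node seen with a given signature stands in for all the others.
    representative = {}
    replace = {}
    for node, sig in signature.items():
        replace[node] = representative.setdefault(frozenset(sig), node)

    merged = []
    for s, p, o in triples:
        triple = (replace.get(s, s), p, replace.get(o, o))
        if triple not in merged:  # drop statements that are now duplicates
            merged.append(triple)
    return merged


store = []
add_father(store, "John", 84)  # the same statement arriving from two sources
add_father(store, "John", 84)
merged = merge_anonymous(store)
print(len(store), len(merged))  # prints: 4 2
```

Note that no identifier is ever generated for the anonymous node here; the object's identity in memory is enough to keep it distinct.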
> which is *not* what was intended; we instead could get
>
> [John, hasFather, rdf:anonymous:(John)(hasFather)]
> [rdf:anonymous:(John)(hasFather), age, 84]
>
> with neither redundancy nor irreconcilable equivalence, and
> where the implicit but regular (not system dependent) identity of
> an anonymous node is defined in terms of a special RDF specific
> URI scheme and sub-type for anonymous nodes.

I have no objection to explicit identification of anonymous nodes, but I
don't think your suggested scheme solves the problem yet (nice idea
though...):

John --hasFather--
                  |
                  [] --age--> 84
                  |
Jim --hasBrother--

What's the URI of the anonymous node here? If I add more triples
pointing to it, then what?

[Actually there may be a wider problem here as I don't think that graph
can be serialized in XML RDF with an anonymous node 8-) So some
explicit identification scheme may be needed...]

[neat encoding of statements]

> Thus, the issue is not really so much about anonymous nodes but
> that they are in fact *not* anonymous within a given system, being
> given unique and disjunct identities -- nor are they really anonymous
> in the conceptual graph, as they represent a single actual resource
> having an implicit identity based on their context within a statement
> (which all nodes have, even if given an explicit URI identity).

They are anonymous in the syntax, and have a temporary name in
implementations (although one could probably come up with an
implementation where they were treated specially and so only really had
a memory address or something).

Does something have to have a name in order to be distinct? I don't see
that it does - as we said before, it can be identified by its
surroundings. Generating a name such as "thingNextToFoo" is just a
convenience for this identification.
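
To illustrate the idea that an anonymous node's name is only a convenience, here is a small Python sketch that compares two graphs while ignoring how their anonymous nodes happen to be labelled. The "_:" prefix for anonymous labels and the helper names (is_anon, anon_labels, rename, same_graph) are assumptions made for the example; the brute-force search over relabellings is only practical for tiny graphs.

```python
from itertools import permutations


def is_anon(term):
    """Assumption for this sketch: anonymous labels are strings starting with '_:'."""
    return isinstance(term, str) and term.startswith("_:")


def anon_labels(graph):
    """All anonymous labels used as subject or object, in a fixed order."""
    return sorted({t for s, _, o in graph for t in (s, o) if is_anon(t)})


def rename(graph, mapping):
    """Relabel subjects and objects simultaneously according to mapping."""
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}


def same_graph(g1, g2):
    """True if g1 equals g2 under some one-to-one relabelling of anonymous nodes."""
    a1, a2 = anon_labels(g1), anon_labels(g2)
    if len(a1) != len(a2):
        return False
    return any(
        rename(g1, dict(zip(a1, perm))) == set(g2)
        for perm in permutations(a2)
    )


g = {("John", "hasFather", "_:gen123"), ("_:gen123", "age", 84)}
h = {("John", "hasFather", "_:gen456"), ("_:gen456", "age", 84)}
k = {("Jon", "hasFather", "_:gen123"), ("_:gen123", "age", 84)}

print(same_graph(g, h))  # True: only an anonymous label differs
print(same_graph(g, k))  # False: a real name changed, so the graph changed
```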
I do believe that 'anonymous' nodes are different to others in that the
name is _only_ a convenience, and could be changed at random without
affecting anything (in principle - provided the change is distributed
appropriately!).

I guess the difference is that the name can be removed from any given
graph WITHOUT LOSS OF INFORMATION (only loss of convenience). Removing
any other name changes the graph, by removing information.

> > I don't see how removing anonymous nodes assists here - the data can
> > always be structured in different ways, and you have to know that in
> > advance, or perform cleverness to deduce the structure.
>
> In this particular case, which is essentially talking about removing
> collections as distinct structures within the graph, it greatly simplifies
> processing, since the set of values for a given query will be a flat/shallow
> list of URIs, not a possible list of mixed URIs and anonymous nodes.

OK.

Regards,

David.

--
/d{def}def/u{dup}d[0 -185 u 0 300 u]concat/q 5e-3 d/m{mul}d/z{A u m B u m}d/r{rlineto}d/X -2 q 1{d/Y -2 q 2{d/A 0 d/B 0 d 64 -1 1{/f exch d/B A/A z sub X add d B 2 m m Y add d z add 4 gt{exit}if/f 64 d}for f 64 div setgray X Y moveto 0 q neg u 0 0 q u 0 r r r r fill/Y}for/X}for showpage
Received on Thursday, 16 August 2001 06:24:26 UTC