- From: <Patrick.Stickler@nokia.com>
- Date: Thu, 16 Aug 2001 12:24:10 +0300
- To: dallsopp@signal.dera.gov.uk, www-rdf-logic@w3.org
- Cc: www-rdf-logic@w3.org, www-rdf-interest@w3.org
> > rather than a single opaque URI identifier.
>
> But this is just querying - you have to do that anyway to find out
> what the "opaque URI" actually is.

Why would you need to find out what a URI "is"? Do you mean
dereferencing it? Surely dereferencing of URIs is not required for
any kind of RDF-based inferencing. Even if some application may wish
to dereference a URI for some purpose, that URI is not a "URI" per se
to RDF, it is simply an opaque universal identifier, no?

> John --hasFather--> [] --age--> 84
>
> John --hasFather--> [] --age--> 84
>
> compared with
>
> John --hasFather--> randomgenid0123456789 --age--> 84
>
> John --hasFather--> randomgenid9876543210 --age--> 84
>
> where [] represents an anonymous node.
>
> The point is that we don't know the name of John's father, so
> assigning him a random name makes our life harder, not easier,
> since everybody necessarily assigns him a _different_ random name.

But this is exactly my point. There is no such thing as an anonymous
node! It always gets a randomly generated system identifier!

So if I get the same statement twice (e.g. it happens to be defined
redundantly in two disparate sources) then a given system will assign
*different* system identities to each anonymous node for each
essentially equivalent statement.

Would it not be far better to have a "variable" for an anonymous node
which is based on the fusion of the subject and predicate identities?
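
As a rough illustration of the difference, here is a minimal Python sketch. It is not any real RDF parser's behaviour: the function names (genid_node, fused_node, expand, load) and the four-field input tuples are hypothetical, and the identifier format simply borrows the rdf:anonymous:(subject)(predicate) form that appears further on in this message.

```python
import itertools

# Hypothetical counter standing in for a system's internal id generator.
_counter = itertools.count(1)


def genid_node(subject, predicate):
    """A fresh system-generated id for every anonymous node, regardless of context."""
    return f"gen{next(_counter)}"


def fused_node(subject, predicate):
    """An id derived only from the subject and predicate of the enclosing statement."""
    return f"rdf:anonymous:({subject})({predicate})"


def expand(subject, predicate, prop, value, make_node):
    """Expand 'subject --predicate--> [] --prop--> value' into two triples."""
    node = make_node(subject, predicate)
    return [(subject, predicate, node), (node, prop, value)]


def load(statements, make_node):
    """Collect the triples for all statements into a set, so exact duplicates merge."""
    store = set()
    for statement in statements:
        store.update(expand(*statement, make_node))
    return store


# The same fact arriving from two disparate sources.
same_fact_twice = [
    ("John", "hasFather", "age", 84),
    ("John", "hasFather", "age", 84),
]

generated = load(same_fact_twice, genid_node)
fused = load(same_fact_twice, fused_node)
```

With generated ids the store ends up holding four triples for the one fact, whereas with fused ids the second copy collapses into the first and only two triples remain. In this sketch every anonymous value of a given subject and predicate is treated as the same node.
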
Thus rather than the current practice where

   John --hasFather--> [] --age--> 84
   John --hasFather--> [] --age--> 84

results in

   [John, hasFather, gen123]
   [gen123, age, 84]
   [John, hasFather, gen456]
   [gen456, age, 84]

which is *not* what was intended; we instead could get

   [John, hasFather, rdf:anonymous:(John)(hasFather)]
   [rdf:anonymous:(John)(hasFather), age, 84]

with neither redundancy nor irreconcilable equivalence, and where
the implicit but regular (not system dependent) identity of an
anonymous node is defined in terms of a special RDF specific URI
scheme and sub-type for anonymous nodes.

The very same approach provides for system-independent and portable
reification of statements based on the statements themselves, without
the need to assert those statements in a given knowledge base unless
an application specifically chooses to do so. E.g.

   rdf:statement:(subject)(predicate)(object)

   <rdf:Description about="http://some.org.com/some/url/path/personnel_data.html">
      <foo:asserts resource="rdf:statement:(John)(age)(32)" />
   </rdf:Description>

Thus, the issue is not really so much about anonymous nodes but that
they are in fact *not* anonymous within a given system, being given
unique and disjunct identities -- nor are they really anonymous in
the conceptual graph, as they represent a single actual resource
having an implicit identity based on their context within a statement
(which all nodes have, even if given an explicit URI identity).

Interestingly, the same RDF specific URI scheme approach could be
used for the QName to URI mapping problem, with

   rdf:qname:(namespace)(name)

But these are just ideas...
(and I'm not sure I fully like them myself ;-)

> > Another is not knowing whether I will get back from a
> > query an anonymous node constituting the root of a collection,
> > containing resource nodes (or other collections) rather than
> > an actual resource node -- or possibly getting a set of results
> > having both resource nodes *and* collection root nodes -- because
> > in one case in the *serialization* the values of a property were
> > defined as a bag in the "same" statement and in another case
> > each was defined as a separate statement! Yuck!
>
> I don't see how removing anonymous nodes assists here - the data can
> always be structured in different ways, and you have to know that in
> advance, or perform cleverness to deduce the structure.

In this particular case, which is essentially talking about removing
collections as distinct structures within the graph, it greatly
simplifies processing, since the set of values for a given query will
be a flat/shallow list of URIs, not a possible list of mixed URIs and
anonymous nodes.

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Software Technology Laboratory        Fax:    +358 7180 35409
Nokia Research Center                 Video:  +358 3 356 0209 / 4227
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Thursday, 16 August 2001 05:24:18 UTC