- From: Sergey Melnik <melnik@DB.Stanford.EDU>
- Date: Mon, 06 Dec 1999 14:31:39 -0800
- To: Gabe Beged-Dov <begeddov@jfinity.com>
- CC: RDF Interest Group <www-rdf-interest@w3.org>
Gabe Beged-Dov wrote:
>
> In general, it looks really great. The use of a signature as the URI is
> very powerful :-! I have several short questions/comments about this.

In fact, the original motivation behind signatures was that one needed a
way to generate interoperable hashCode()'s that are independent of the
implementation. Content-based URIs for (reified) triples have already
been discussed on the list. I refined the algorithm so that

  subj --pred--> obj
  subj --pred--> "obj"

and all possible permutations like

  pred --subj--> obj

etc. yield different digests, i.e. different URIs.

> The first question is related to the interaction between the model being
> "closed" and the generation of the signature. Should the concept of the
> model being open vs. closed be part of the API?

Could you reformulate this question? If this is what you are asking
about: the model URI is recomputed whenever triples are added to or
removed from the model. ...Uups, just found a bug, this is currently not
being done in the code...

> The second comment is related to noname resources. The sample
> implementation uses the incrementing genid which is dependent on the XML
> serialization. I was wondering if there couldn't be a step at model
> signature generation time that also generated the signatures for the
> nonames. The idea would be that once the model is stable you can
> generate a signature for the noname using something like the set of
> triples for which the noname is the subject using the same algorithm as
> for models (kind of like mini-models [forgive the Austin Powers
> reference, I saw a scary mini-me doll shopping last night and it stuck
> in my mind]:)

That's another tough issue, you are absolutely right. First, on the
model level there are no "proper" noname resources, since every resource
must have a URI. org.w3c.rdf.util.RDFUtil has a static method noname()
that generates a cryptographically strong unique identifier for a noname
resource.
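To make the digest property described above concrete (a resource object
vs. a literal object, and permuted positions, all yielding different
URIs), here is a minimal sketch. It is not the algorithm from the actual
API: the class name TripleDigest, the "urn:example:sha1:" prefix, the
length-prefixed field encoding and the 'L'/'R' type tag are all
assumptions made for illustration. Only java.security.MessageDigest is a
real library call.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: derives a content-based URI for one triple.
class TripleDigest {
    // Assumed URI scheme, chosen only for this example.
    static final String PREFIX = "urn:example:sha1:";

    static String uri(String subj, String pred, String obj, boolean objIsLiteral) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            // Fields are hashed in a fixed order, so swapping subject and
            // predicate changes the input and therefore the digest.
            feed(md, subj);
            feed(md, pred);
            // A one-byte tag separates the resource obj from the literal "obj".
            md.update((byte) (objIsLiteral ? 'L' : 'R'));
            feed(md, obj);
            StringBuilder hex = new StringBuilder(PREFIX);
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // A decimal length prefix keeps ("ab", "c") distinct from ("a", "bc").
    private static void feed(MessageDigest md, String field) {
        byte[] bytes = field.getBytes(StandardCharsets.UTF_8);
        md.update((bytes.length + ":").getBytes(StandardCharsets.UTF_8));
        md.update(bytes);
    }
}
```

Because the digest depends only on the field values, any implementation
that follows the same encoding would arrive at the same URI, which is
the interoperability point made above.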
On the syntax level, the problem becomes how to make sure that *every*
compliant RDF parser generates the same URI for a given noname resource.
Your idea to somehow bind noname URI generation to the content is very
tempting, great idea! The same algorithm as for models would not work,
though, since the noname URI would be recursively dependent on the
"mini-model" URI.

In the current RDF syntax, a noname resource can be the object of at
most one statement and can have a bunch of properties. This information
fully determines the "context" of a noname resource. Thus, the URI for

<rdf:Description>
  <fn>John</fn>
  <ln>Smith</ln>
</rdf:Description>

could be computed using the data

  --fn--> "John"
  --ln--> "Smith"

There are at least three further issues to consider:

(1) duplicates: if I repeat the same RDF/XML content like the
    description above, I'd prefer to fuse both of them. Can that be a
    problem from a semantic point of view?

(2) recursion: in case of nested RDF descriptions, we have to postpone
    triple generation until the descendant nodes are fully processed.
    Furthermore, if we use the fact that the noname resource is an
    object of someone else's statement, mutual dependency becomes ugly
    again.

(3) changes: the generated URIs will only make sense if the "context"
    of the URI remains intact. As soon as I add another property to the
    resource, or modify a property's value, the URI breaks. On the other
    hand, we can move the descriptions around in the document.

I think this content-based approach works better than an XPointer-based
one. Noname URI generation is a syntax-related issue. However, it will
arise no matter what kind of XML-based syntax we take.

So what do you think about (1)-(3)?

Sergey
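The content-based noname URI idea from the message can be sketched as
follows. This is an illustration under stated assumptions, not code from
the API: the class NonameUri, the "urn:example:noname:" prefix and the
choice to sort per-property digests are all hypothetical. The sketch
covers only flat descriptions with literal values; it ignores the
inbound statement and nested descriptions, i.e. issue (2).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: derives a URI for a noname resource from its properties.
class NonameUri {
    // Assumed URI scheme, chosen only for this example.
    static final String PREFIX = "urn:example:noname:";

    // SHA-1 over length-prefixed fields, returned as lowercase hex.
    static String sha1Hex(String... fields) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            for (String f : fields) {
                byte[] b = f.getBytes(StandardCharsets.UTF_8);
                md.update((b.length + ":").getBytes(StandardCharsets.UTF_8));
                md.update(b);
            }
            StringBuilder hex = new StringBuilder();
            for (byte x : md.digest()) {
                hex.append(String.format("%02x", x));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Each entry is {predicate, literalValue}, e.g. {"fn", "John"}.
    static String uriFor(List<String[]> properties) {
        List<String> digests = new ArrayList<>();
        for (String[] p : properties) {
            digests.add(sha1Hex(p[0], p[1]));
        }
        // Sorting makes the result independent of property order in the XML.
        Collections.sort(digests);
        return PREFIX + sha1Hex(digests.toArray(new String[0]));
    }
}
```

With this sketch, two identical descriptions receive the same URI and
are therefore fused, which is the behaviour asked about in (1). Adding
or changing a property produces a different URI, which is the breakage
described in (3).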
Received on Monday, 6 December 1999 17:26:20 UTC