- From: Graham Klyne <GK@Dial.pipex.com>
- Date: Tue, 02 May 2000 10:57:12 +0100
- To: Sergey Melnik <melnik@db.stanford.edu>
- Cc: www-rdf-interest@w3.org, CC/PP WG list <w3c-ccpp-wg@w3.org>
Sergey,

Thank you again for your comments. I think I now understand what you
are suggesting.

At this stage, I think the discussion has raised two distinct issues,
which I'd like to take separately:

(1) what is actually protected (model or syntax)?
(2) how are assurances to be represented?


(1) what is actually protected (model or syntax)?
-------------------------------------------------

At 12:37 PM 4/30/00 -0700, Sergey Melnik wrote:
>I agree that signing RDF statements/models is a promising way of
>building the Web of Trust. However, I'm skeptical of using serialization
>syntaxes for that. As I pointed out earlier, I believe we have to sign
>*content* rather than one of its syntactic representations.

Although we agree on some points, I think we have very different views
of this issue. (It probably comes down to what we mean by "content".)

In my view, a signature can be applied _only_ to some syntactic
representation of an RDF statement. A signature is, by its very nature,
calculated over some specific sequence of bits. These bits _represent_
the content, but are not the content itself. The content is an
abstraction that is independent of any particular representation or
sequence of bits.

(An English banknote carries the statement "I promise to pay the bearer
on demand...", with a rendering of the Bank of England's chief cashier's
signature: the paper is not the money, merely a promise to pay the
money. It is not the money or the promise itself that is signed, but a
written representation of the promise. In practice, people may treat
these scraps of paper as if they were actual money (and a similar
evolution could happen with RDF), but I think it's important to maintain
at some level the distinction. For example, the Bank of England's
promise remains good long after the form of note itself has gone out of
circulation and is hence no longer regarded as "money".)
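
A minimal sketch of the point that a digest is calculated over a specific
sequence of bits and not over the abstract content. Everything here is an
assumption made for illustration: the example triples, the line-per-statement
serialization (which is not a real RDF syntax), the choice of SHA-256, and the
helper names. Two serializations of the same statements yield different
digests, and fixing an ordering before hashing is one possible way to make the
digest independent of statement order.

```python
import hashlib

# Hypothetical statements as (subject, predicate, object) tuples.
triples = [
    ("http://example.org/doc", "http://example.org/terms#author", "Alice"),
    ("http://example.org/doc", "http://example.org/terms#title", "Report"),
]


def serialize(statements):
    """Naive one-line-per-statement serialization (illustrative only)."""
    return "\n".join(" ".join(t) for t in statements).encode("utf-8")


def digest(data):
    """Hex digest of a byte string; SHA-256 is an arbitrary choice here."""
    return hashlib.sha256(data).hexdigest()


def canonical_digest(statements):
    """Digest of the statements after sorting them into a fixed order."""
    return digest(serialize(sorted(statements)))


# Same abstract content, two different byte sequences.
forward = serialize(triples)
backward = serialize(list(reversed(triples)))

print(digest(forward) == digest(backward))
print(canonical_digest(triples) == canonical_digest(list(reversed(triples))))
```

The first comparison shows that the digest follows the bytes; the second shows
that agreeing on a canonical form is what lets two parties compute the same
digest for the same set of statements.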
[[If the promise were in a different language, and similarly signed by
BofE's cashier, it would still be a promise to pay the same money. I'm
reminded of TimBL's comments about "interpretation properties".]]

I see the process you describe as effectively defining a "canonical"
representation of an RDF model that is used as a basis for calculating
digests. As such, the process is fine, but I'm not yet convinced that it
is really needed. I think that the assurance (like the BofE's cashier's
promise) is something that can exist separately from any representation.
The representation is merely a way to prove to someone else that the
assurance was indeed given.

So, when you say:
>Imagine you got the following three statements (over an insecure link):
>[...]

You talk about receiving an RDF model as if it can be transferred
independently of any serialization syntax. But I say that in order to
communicate the model it must be serialized. And it is the serialized
form (rather than the abstract model) that is subject to tampering and
other corruptions that a signature protects against. Once we are dealing
with an abstract model, an assurance either exists or does not exist.


(2) how are assurances to be represented?
-----------------------------------------

At first pass, I rather like your suggestion:
>The recipient gets:
>
>T --rdf:type--> SignedStatements
>T --principal--> Alice
>T --algorithm--> RSA
>T --statement--> <hash1>
>...
>T --statement--> <hashN>
><statement 1>
>...
><statement N>

In which I take the hash to be a "stand in" for the reified statement.

This seems to have a strong link to the underlying theory of
reification, while also maintaining a reasonably intuitive
representation for the assurance.

I note that the fundamental difference from the RDFM&S approach to such
statements is that it adopts an "active voice" approach of the form
"Alice assures <statements>", rather than a "passive voice" approach of
the form "<statements> are assured by Alice".
That is, the RDF graph arcs run from assurance to statement, rather than
from statement to assurance, and hence provide a useful grouping of the
statements covered by the assurance. This in turn requires that the
reification of the statements be identifiable, hence your introduction
of the hashes.

It occurs to me that if properties in an RDF graph could be tagged with
unique identifiers then the uncertainties of hashing and birthday
paradox effects could be avoided. (I can conceive the possibility of a
semantic web containing sufficiently large numbers of RDF statements
that the hash may not offer sufficient assurance of uniqueness.
Appendix A of <draft-ietf-conneg-feature-hash-04.txt> summarizes the
results of an analysis of this issue for 128-bit MD5 hashes.)

While I like this approach, I think implications of the differences from
RDF M&S should be carefully considered.

#g
--

------------
Graham Klyne
(GK@ACM.ORG)
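
A rough sketch of the birthday-paradox concern raised above, using the
standard approximation p ~= 1 - exp(-n(n-1) / 2^(b+1)) for n items and a
b-bit hash. This is not the analysis from the cited draft; it assumes the
hash values are uniformly distributed, and the function name and the sample
statement counts are made up for illustration.

```python
import math


def collision_probability(n_items, hash_bits=128):
    """Approximate chance that at least two of n_items share a hash value.

    Assumes an ideal, uniformly distributed hash of hash_bits bits.
    """
    exponent = n_items * (n_items - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


# Hypothetical sizes for a collection of RDF statements.
for n in (10**9, 10**12, 2**64):
    p = collision_probability(n)
    print(f"{n:.3e} statements -> p(collision) ~ {p:.3e}")
```

Under these assumptions the collision probability stays negligible until the
number of hashed statements approaches 2^64, the square root of the 2^128
possible hash values.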
Received on Tuesday, 2 May 2000 07:38:49 UTC