- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Fri, 12 Mar 2004 14:43:29 +0000
- To: Patrick Stickler <patrick.stickler@nokia.com>
- Cc: ext Pat Hayes <phayes@ihmc.us>, www-archive@w3.org, chris@bizer.de
I am having a lot of difficulty understanding the problems that Pat and Patrick see with bootstrapping, and the need to distinguish asserting (by the publisher) from affirming (by anyone). I don't see why bootstrapping stands outside the normal MT.

Here is my understanding of a SemWeb agent, A007.

A007 has access to a set of named graphs (e.g. from a SemWeb crawler where each name is a URL and each graph is given by the RDF/XML at that URL, or from some TriX files, or whatever), e.g. using some of Patrick's example, or Chris's example, ...

> :X
> {
>    :X trix:assertedBy ex:Bob .
>    :X trix:signature "..." .   -> verifiable signature for :X+ex:Bob
> }
>
> :Y
> {
>    :X trix:assertedBy ex:Jane .
>    :Y trix:asserted "true"^^xsd:boolean .   -> authoritative assertion of :Y
>    :Y trix:assertedBy ex:Jane .
>    :Y trix:signature "..." .   -> verifiable signature for :Y+ex:Jane
> }
>
> :Z
> {
>    :X trix:asserted "false"^^xsd:boolean .   -> third-party non-assertion of :X
>    :Z trix:asserted "true"^^xsd:boolean .   -> authoritative assertion of :Z
>    :Z trix:assertedBy ex:Bill .   -> authority for :Z
>    :Z trix:signature "..." .   -> verifiable signature for :Z+ex:Bill
> }
> etc.

I would ideally want the trix:asserted predicate to be, say, trix:affirmed, and have an agent URL as its object.

We also want to have signatures in there, which, as Patrick points out, are functions of agents+graphs.

We may also want to have verification chains of agent identities (including their public keys, as already available as part of public key infrastructure).

We may also want to have some verifiable relationships between URLs and agents, indicating ability to publish. To some extent this information is already public: registries of who owns which domain name, linked with public key registries. To some extent this is simply more meta-information.

In keeping with Chris's view, just how much of this A007 chooses to use is A007's business, and not fundamental.

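As a rough illustration of the named graphs quoted above, here is a minimal Python sketch that holds them in memory and lists who claims to assert each graph. The dict-of-triples layout and the `asserters` helper are my own assumptions for illustration; they are not part of TriX or of any RDF toolkit, and typed literals are kept as plain strings.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: graph name -> set of (subject, predicate, object) triples.
Triple = Tuple[str, str, str]

NAMED_GRAPHS: Dict[str, Set[Triple]] = {
    ":X": {
        (":X", "trix:assertedBy", "ex:Bob"),
        (":X", "trix:signature", "..."),
    },
    ":Y": {
        (":X", "trix:assertedBy", "ex:Jane"),
        (":Y", "trix:asserted", '"true"^^xsd:boolean'),
        (":Y", "trix:assertedBy", "ex:Jane"),
        (":Y", "trix:signature", "..."),
    },
    ":Z": {
        (":X", "trix:asserted", '"false"^^xsd:boolean'),
        (":Z", "trix:asserted", '"true"^^xsd:boolean'),
        (":Z", "trix:assertedBy", "ex:Bill"),
        (":Z", "trix:signature", "..."),
    },
}


def asserters(graphs: Dict[str, Set[Triple]]) -> Dict[str, Set[Tuple[str, str]]]:
    """Map each graph name to the set of (agent, graph in which the claim appears)."""
    result: Dict[str, Set[Tuple[str, str]]] = {name: set() for name in graphs}
    for source, triples in graphs.items():
        for subject, predicate, obj in triples:
            # Only claims about graphs we actually hold are recorded.
            if predicate == "trix:assertedBy" and subject in result:
                result[subject].add((obj, source))
    return result


claims = asserters(NAMED_GRAPHS)
for name in sorted(claims):
    print(name, sorted(claims[name]))
```

The point of recording the source graph alongside the agent is that a claim about :X made inside :Y is only as good as :Y itself, which is exactly what an agent has to decide.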
So assume A007 has a policy of trusting anything for which he can identify a party to sue. (I think this is implicit in Pat's view of publication.) The algorithm used by A007 may work like this:

1) Nondeterministically choose a named graph g from the input.
2) Hypothesise g, provisionally adding it to A007's knowledge base KB.
3) If "g trix:affirmedBy UUU" is a consequence of KB, where UUU is an identifiable party, and all the signatures are good (a signature by UUU affirming g, and a trusted chain of signatures, from some root body such as VeriSign or Microsoft, affirming the public key and identity of UUU), then the hypothesis is good and we confirm g in the knowledge base. Otherwise fail, and go back to 1 for a different choice of g.
4) If the knowledge base is contradictory then someone is lying and A007 engages lawyers; otherwise repeat from 1 to add more graphs to the knowledge base.
5) Terminate when no more graphs can be added.

(Sorry, the algorithm is somewhat unpolished.)

Note: the only way that graphs have any meaning is in step 3, which uses RDF MT. Also note: actually having signed graphs in there is not going to happen by accident.

A more conservative A008 can require at step 3 that the agent UUU has publication rights over the URL naming g (and that these publication rights are known, signed and verified).

The whole thing gets off the ground by the usual public key trick of having some well-known facts, like the public key of VeriSign.

I also believe that, in practice, most SemWeb applications can be less paranoid, and could use a policy of, say, believe anything your friends say, where your friends are as defined in your own local FOAF file. Also the dig sig stuff is only relevant when handling financially relevant material, and even then not very: just knowing that the URL was appropriate is typically enough (e.g.
I just spent £100 at www.ryanair.com, with only DNS to convince me that I was really dealing with the same people who have previously carried me on an aeroplane: a possible target of fraud, but in practice fraudsters find easier or bigger targets).

In this framework, asserting some RDF is simply about how much trouble you are prepared to go to in order to convince your reader. If the RDF is not intended as commercially relevant, the answer is probably very little, so just adding a single triple saying you affirm it is enough for a minimalist trusting algorithm. If the RDF is advertising a web service with Ts&Cs then we wheel out the PKI machinery, and make sure that a paranoid customer can check everything.

So every act of publishing is an assertion, but we can make more forceful assertions or less forceful ones depending on how we do it.

I guess it is useful to have some way of explicitly marking a graph as false (seems a bit strong) or unaffirmed by the author (my preference). Notice that the algorithm above works fine to permit third-party affirmation even with author denial, and the third party is liable, not the author.

Jeremy
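
A minimal Python sketch of the A007 algorithm above, under several loud assumptions: graphs are plain sets of triples, "is a consequence of KB" is approximated by simple triple membership rather than real RDF entailment, the nondeterministic choice is replaced by a repeated sorted scan, and `signature_ok` and `is_contradictory` are hypothetical callbacks standing in for the PKI checks and the consistency test. None of these names come from an existing library.

```python
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]
AFFIRMED_BY = "trix:affirmedBy"  # predicate name as used in step 3


class Contradiction(Exception):
    """Step 4: the knowledge base is contradictory, so someone is lying."""


def build_kb(
    graphs: Dict[str, Set[Triple]],
    signature_ok: Callable[[str, str], bool],       # (agent, graph name) -> bool, assumed
    is_contradictory: Callable[[Set[Triple]], bool],  # consistency check, assumed
) -> Tuple[Set[Triple], Set[str]]:
    kb: Set[Triple] = set()
    accepted: Set[str] = set()
    changed = True
    while changed:  # step 5: stop once a full pass adds nothing
        changed = False
        for name in sorted(graphs):  # step 1: deterministic stand-in for the choice
            if name in accepted:
                continue
            trial = kb | graphs[name]  # step 2: provisional hypothesis
            # Step 3: agents that the hypothesised KB says affirm this graph.
            agents = {o for (s, p, o) in trial if s == name and p == AFFIRMED_BY}
            if not any(signature_ok(agent, name) for agent in agents):
                continue  # hypothesis fails; try another graph
            if is_contradictory(trial):
                raise Contradiction(name)  # step 4: time to engage lawyers
            kb, accepted, changed = trial, accepted | {name}, True
    return kb, accepted


# Third-party affirmation: :X carries no affirmation of its own, but :Z,
# affirmed and signed by ex:Bill, also affirms :X on Bill's behalf.
example = {
    ":X": {(":X", "ex:says", "something")},
    ":Z": {(":Z", AFFIRMED_BY, "ex:Bill"), (":X", AFFIRMED_BY, "ex:Bill")},
}
good_signatures = {("ex:Bill", ":Z"), ("ex:Bill", ":X")}
kb, accepted = build_kb(
    example,
    signature_ok=lambda agent, graph: (agent, graph) in good_signatures,
    is_contradictory=lambda triples: False,
)
print(sorted(accepted))
```

In the example, :X is rejected on the first pass and accepted on the second, once :Z is in the knowledge base; that is why the sketch loops until nothing changes rather than visiting each graph once. A more conservative A008 would only need a stricter `signature_ok`.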
Received on Friday, 12 March 2004 09:44:29 UTC