- From: Phil Dawes <pdawes@users.sourceforge.net>
- Date: Mon, 9 Aug 2004 15:43:39 +0000
- To: Leo Sauermann <leo@gnowsis.com>
- Cc: Phil Dawes <pdawes@users.sourceforge.net>, www-rdf-interest@w3.org
Hi Leo,

Leo Sauermann writes:
> I had a discussion with Joe Geldart about exactly this thing on friday.
>
> I think smushing is a good idea to integrate data-stores and based on
> IFPs, it should be a good technique.
> But I would base it on OWL rules that say something like
> "owl:SameIndividualAs" when IFP's are the same.
>

Do you mean leave the triples as-is, but add SIA statements to the store?

> sure, the OWL approach would be harder to program as you would have to
> do the evaluation during read access of the database and not write
> access. Smushing on write is easier.
>

Do you have an algorithm for doing an SIA-inclusive query? I'd be eager
to see if it would meet my performance requirements.
Note that I'm merely exploring the :uname option - I'm not emotionally
attached to it in any way.

> about properties and classes:
> you cannot dump all URIs, that's not wise. you will lose properties and
> classes. It is impossible to differentiate between identification URIs
> and vocabulary uris.

The plan is not to lose the URIs. Just make them an IFP via the :uname
property.

> so I would suggest not to trash the uris but to add new data to existing
> URIs, when you have a match by IFP (although that may also have easter
> eggs, the most elegant way would be the OWL way)

The problem with this approach is that you either nominate a 'primary
uri', or duplicate all the statements. I am exploring the indirection
between resource and URI that allows you to have 1 resource with
multiple URIs. Representing the URIs as literals to the :uname property
neatly side-steps the triple-bloat problem.

> also, I am in the "religious group" of URI resoluters, thereby i like to
> parse urls to check what server they are hosted on and to query more
> information about the resource by contacting the server (f.e. over
> URIQA). From this view, I do not like the identification of resources by
> IFP.

True, and I agree that QAable URIs seem the most adequate solution to
this problem.
Unfortunately (from this standpoint) there is already a large number of
resources that don't have URIs, and I think this number is likely to
grow massively. I need a way to work with this data.

Why grow massively? Because in a decentralized world it's easier to
reference resources using IFPs than to agree on URIs.

E.g. I know a person that I call 'fred' and who has an email address
fred@example.com. Unfortunately that person doesn't have a foaf file, or
a URI to identify himself. I can add to my data:

..
<foaf:knows foaf:nick="Fred" foaf:mbox="fred@example.com"/>
..

If fred (or someone else) ever publishes a foaf file, my data will
automatically link with his via IFP. This disconnection simply isn't
possible using URIs - either fred has to have come up with a URI by the
time I need it, and then subsequently use it in his foaf file, or I make
up a URI, and he has to know about it and use it later when he authors
his foaf file.

> I like to write URLs on things and give them to people, so that they can
> get curious, and get more information by entering the url somewhere.
>

You still can with the :uname approach - the URI is not lost, merely
relegated to an IFP.

Cheers,

Phil

>
> And it came to pass at the time 06.08.2004 14:01 that Phil Dawes wrote:
>
> >Hi RDF Interest,
> >
> >Have been thinking a lot recently about techniques for smushing IFP
> >based data (inc. foaf, doap etc..), and Sandro's uname paper[1] got me
> >thinking about the optimisation possibilities of an extra layer of
> >indirection.
> >
> >I'm putting together a prototype store (derived from the design of
> >Steve Harris' excellent 3store) that doesn't expose URIs to the user
> >except through explicit queries containing (?foo :uname ?uri). This
> >disconnection between resource and URI offers some reasonably compact
> >strategies for IFP and owl:sameAs smushing (since the resource -> URI
> >can be 1:N without duplicating triples).
> >It also provides optimisation
> >possibilities for cases where the client isn't interested in URIs, but
> >just the structure of the data and its literals. (I've found this to
> >be the case when obtaining RDF information for e.g. displaying in a
> >UI).
> >
> >Instead of BNodes, I'm using generated internal ids with a limited
> >lifespan (remain constant between smushing passes). This is mainly
> >because you can't use bnodes for properties, but also because it
> >enables a client to efficiently submit multiple queries using the
> >short-lived internal URIs for speed.
> >
> >The downside to this approach is that I can't think of a way to
> >efficiently undo a smushing pass, which you'd want to do if e.g. you
> >unasserted an IFP.
> >
> >Has anybody else explored these possibilities? Anything I ought to
> >consider?
> >
> >Cheers,
> >
> >Phil
> >
> >[1] http://www.w3.org/2001/12/uname/
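
For illustration only, the indirection discussed in this thread (internal resource ids, URIs held as literals on :uname, and merging on IFP matches) might be sketched roughly as below. This is a toy in-memory model and not the prototype store described above: the `Store` class, its method names, the integer ids, and the example URIs are all assumptions made for the sketch, and it makes no attempt to undo a smushing pass.

```python
from itertools import count

UNAME = ":uname"  # URIs are stored as literal values of this property


class Store:
    """Toy triple store (hypothetical): subjects are internal integer ids."""

    def __init__(self, ifps):
        # :uname itself is treated as an IFP, as suggested in the thread.
        self.ifps = set(ifps) | {UNAME}
        self.triples = set()  # (subject_id, property, value)
        self._ids = count(1)

    def new_resource(self, uri=None):
        rid = next(self._ids)
        if uri is not None:
            self.triples.add((rid, UNAME, uri))
        return rid

    def add(self, subject, prop, value):
        self.triples.add((subject, prop, value))

    def unames(self, rid):
        return sorted(v for s, p, v in self.triples if s == rid and p == UNAME)

    def _smush_pass(self):
        parent = {}  # merged id -> id it was merged into

        def find(x):
            while x in parent:
                x = parent[x]
            return x

        first = {}  # (ifp, value) -> first subject seen with that pair
        for s, p, v in self.triples:
            if p not in self.ifps:
                continue
            a, b = find(first.setdefault((p, v), s)), find(s)
            if a != b:
                parent[max(a, b)] = min(a, b)  # the lower id survives
        if not parent:
            return False
        # Rewrite subjects, and objects that are resource ids (ints here).
        self.triples = {
            (find(s), p, find(v) if isinstance(v, int) else v)
            for s, p, v in self.triples
        }
        return True

    def smush(self):
        # Repeat until a pass merges nothing, since a merge can expose new matches.
        while self._smush_pass():
            pass


store = Store(ifps={"foaf:mbox"})
me = store.new_resource("http://example.org/phil#me")  # made-up URI
fred_a = store.new_resource()  # no URI, known only by mbox
store.add(me, "foaf:knows", fred_a)
store.add(fred_a, "foaf:nick", "Fred")
store.add(fred_a, "foaf:mbox", "fred@example.com")

# Later, a separately published description of the same person turns up.
fred_b = store.new_resource("http://example.com/fred#me")  # made-up URI
store.add(fred_b, "foaf:mbox", "fred@example.com")
store.add(fred_b, "foaf:name", "Fred Example")

store.smush()
print(store.unames(fred_a))
```

In this sketch the two descriptions of fred collapse into one internal resource; the URI survives only as a :uname literal on it, and a second URI for the same resource would cost one extra triple rather than a copy of every statement.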
Received on Monday, 9 August 2004 19:15:49 UTC