- From: Henry Story <henry.story@bblfish.net>
- Date: Mon, 13 Mar 2006 12:29:34 +0100
- To: atom-owl@googlegroups.com, Semantic Web <semantic-web@w3.org>
Here is a little puzzle regarding Atom on which it would be interesting to have some feedback from the larger Semantic Web community. We are wondering if there are best-practice guidelines for updating semantic web data found on the web.

We have an ontology for the Atom (RFC 4287) spec called AtomOwl [1], that would allow us to GRDDL Atom documents into graphs.

This thread started off with the question as to whether one should map

    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="http://example.org/2003/12/13/entry"/>
      <id>tag:example.com,2003/blog/entry1</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>

to

    <> a :Entry;
       :title [ :value "Atom-Powered Robots Run Amok";
                :type "text/plain" ];
       iana:alternate <http://example.org/blog/entry.html>;
       :id <tag:example.com,2003/blog/entry1>;
       :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
       :summary [ :value "some text";
                  :type "text/plain" ] .

or to

    [] a :Entry;
       :title [ :value "Atom-Powered Robots Run Amok";
                :type "text/plain" ];
       iana:alternate <http://example.org/blog/entry.html>;
       :id <tag:example.com,2003/blog/entry1>;
       :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
       :summary [ :value "some text";
                  :type "text/plain" ] .

On 12 Mar 2006, at 21:03, Reto Bachmann-Gmür wrote:
> Aren't id and updated together a cifp, so that in your examples we are
> unambiguously talking about the same resource whether it is named
> or not?

Yes. Though David Powell had some good arguments against using the CIFP

    @prefix cifp: <http://eulersharp.sourceforge.net/2004/04test/rogier#>.

    [] cifp:productProperty ( :updated :id );
       a owl:InverseFunctionalProperty .

in some earlier mails to the atom-owl list (misleadingly) entitled by me "Feed or Document" [2]. But perhaps the following reasoning can help resolve that issue...

> I do however agree in the fundamental question whether atom-owl should
> be good to describe thing over time or just at a specific moment in
> time.
> If the second design goal is chosen then aggregators may rely on
> some more generic graph versioning systems, of which - as you mention -
> a possible implementation would be quad-based.

Clearly AtomOwl has to be able to describe entries (as identified by their id) evolving over time, since there can be more than one entry with the same id in a feed. This is a great feature of Atom, as it allows the history of certain types of resources to be described over time.

But in the discussion with David Powell it came up that people may want to update an entry without modifying the time stamp. So perhaps the publisher will decide that

    <entry>
      <title>Atom-Powered Robots Run Amok</title>
      <link href="http://example.org/2003/12/13/entry"/>
      <id>tag:example.com,2003/blog/entry1</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>

published at <http://example.org/coll/> using HTTP POST as specified by the Atom Publishing Protocol [3] and resulting in an entry being placed at <http://example.org/coll/entry1.atom> needs a change that he considers insignificant. So he will PUT the following XML at that location:

    <entry>
      <title>Atom-Powered Robots Run Amok in France</title>
      <link href="http://example.org/2003/12/13/entry"/>
      <id>tag:example.com,2003/blog/entry1</id>
      <updated>2003-12-13T18:30:02Z</updated>
      <summary>Some text.</summary>
    </entry>

Let us assume that this is acceptable behavior.

After that PUT operation, the feed representing the collection will be updated too. There will of course be only one entry with the 2003-12-13T18:30:02Z time stamp, as required by the spec. This entry will have the new title "Atom-Powered Robots Run Amok in France".

An Atom-OWL based GRDDL tool that refetched the entry1.atom document would create a new set of triples. And if we were to just add these to our triple store (which the [] notation is more favorable to) we would end up with 2 anonymous entries in our triple store with the same time stamp.
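A minimal sketch of that situation, assuming a toy triple store held as a Python list of tuples and assuming that :title is treated as single-valued per entry. The helper names (`entry_triples`, `cifp_conflicts`) and the blank-node labels are hypothetical and not part of AtomOwl or any GRDDL tool. It appends the triples from both fetches under fresh blank nodes and then groups the nodes by the (:id, :updated) pair:

```python
from collections import defaultdict
from itertools import count

_labels = count(1)  # hypothetical blank-node label generator


def entry_triples(entry_id, updated, title):
    """Return the triples for one entry, rooted at a fresh blank node."""
    node = f"_:b{next(_labels)}"
    return [
        (node, ":id", entry_id),
        (node, ":updated", updated),
        (node, ":title", title),
    ]


def cifp_conflicts(triples):
    """Group nodes by (:id, :updated); return keys whose nodes disagree on :title."""
    props = defaultdict(dict)
    for subject, predicate, obj in triples:
        props[subject][predicate] = obj
    titles_by_key = defaultdict(set)
    for po in props.values():
        titles_by_key[(po[":id"], po[":updated"])].add(po[":title"])
    return {key: titles for key, titles in titles_by_key.items() if len(titles) > 1}


ENTRY_ID = "tag:example.com,2003/blog/entry1"
UPDATED = "2003-12-13T18:30:02Z"

store = []
# First GET of entry1.atom.
store += entry_triples(ENTRY_ID, UPDATED, "Atom-Powered Robots Run Amok")
# Second GET, after the PUT, naively appended to the same store.
store += entry_triples(ENTRY_ID, UPDATED, "Atom-Powered Robots Run Amok in France")

conflicts = cifp_conflicts(store)
print(conflicts)
```

Under the stated assumption, the two blank nodes share the same (:id, :updated) key but carry different titles, so a CIFP over that pair would identify them while their descriptions disagree.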
With the CIFP rule we would end up with a contradiction.

So we could of course, as suggested by David Powell, add an extra "fetched-at" relation on each blank node entry (and remove the CIFP rule) and then base our idea of the actual state of the feed on that relation. But from what I understand of the way Tim Berners-Lee is thinking about the SemWeb, the correct thing to do might in fact be to remove the triples generated by the initial GET from your working graph (you can always relegate them to an archive graph of course), and replace them with the new triples.

So you would replace graph G1

    <http://example.org/coll/entry1.atom> a :Entry;
       :title [ :value "Atom-Powered Robots Run Amok";
                :type "text/plain" ];
       iana:alternate <http://example.org/blog/entry.html>;
       :id <tag:example.com,2003/blog/entry1>;
       :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
       :summary [ :value "some text";
                  :type "text/plain" ] .

with graph G2

    <http://example.org/coll/entry1.atom> a :Entry;
       :title [ :value "Atom-Powered Robots Run Amok in France";
                :type "text/plain" ];
       iana:alternate <http://example.org/2003/12/13/entry.html>;
       :id <tag:example.com,2003/blog/entry1>;
       :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
       :summary [ :value "some text";
                  :type "text/plain" ] .

One should of course write ontologies that are monotonic, but we also have to allow people to fix errors they make when publishing statements to the Semantic Web, and a PUT overwriting a document does just that. So it makes sense.

Now this leaves us with a problem of asynchronous graph updates.
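The replacement of G1 by G2 can be sketched as follows. This is only an illustration, assuming the working store is a dictionary keyed by the URL each graph was fetched from, with superseded graphs moved to a separate archive dictionary; `replace_graph` and both dictionaries are hypothetical names, and the triples are reduced to two properties for brevity:

```python
def replace_graph(working, archive, source_url, new_triples):
    """Replace the graph fetched from source_url, archiving the old version."""
    previous = working.get(source_url)
    if previous is not None:
        # Relegate the superseded triples to the archive instead of dropping them.
        archive.setdefault(source_url, []).append(previous)
    working[source_url] = frozenset(new_triples)


URL = "http://example.org/coll/entry1.atom"

g1 = {
    (URL, ":title", "Atom-Powered Robots Run Amok"),
    (URL, ":updated", "2003-12-13T18:30:02Z"),
}
g2 = {
    (URL, ":title", "Atom-Powered Robots Run Amok in France"),
    (URL, ":updated", "2003-12-13T18:30:02Z"),
}

working, archive = {}, {}
replace_graph(working, archive, URL, g1)  # initial GET
replace_graph(working, archive, URL, g2)  # GET after the PUT

print(sorted(working[URL]))
print(len(archive[URL]))
```

Keying the store by source document is what makes the update non-additive: the working graph for entry1.atom only ever holds the triples from the latest fetch, so no two descriptions with the same time stamp coexist there.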
A client may for example update the graph at http://example.org/coll/entry1.atom, resulting in graph G2, without having yet had time to update the feed at http://example.org/coll/ which (after GRDDL transform) contains graph G3

    [] a :Entry;
       :title [ :value "Atom-Powered Robots Run Amok";
                :type "text/plain" ];
       iana:alternate <http://example.org/2003/12/13/atom03.html>;
       :id <urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a>;
       :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
       :summary [ :value "some text";
                  :type "text/plain" ] .

which is compatible with G1 but not G2 (given our CIFP). So what should such an aggregator do?

- a client using APP would presumably know that it will need to update the feed too, and so it could refetch that and replace its graph. That's feasible.
- an aggregator that was not involved in the process, and so did not know about the PUT operation that had just happened, could notice the contradiction and try to resolve it by refetching the feed, noticing that the version it had was older than the entry.

Does anyone have experience in dealing with updates across RDF documents on the web? And how to deal with contradictions?

Henry Story

> reto
>
> Henry Story wrote:
>> I have been reading the "Reaching out onto the Web" document at
>> <http://www.w3.org/2000/10/swap/doc/Reach> a little and am trying to
>> see how keeping this in mind would affect the ontology.
[snip]

[1] http://bblfish.net/work/atom-owl/2005-10-23/
[2] http://groups.google.com/group/atom-owl/browse_frm/thread/357e36c4ee9cd31b
[3] http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-08.html
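The second strategy in the list above (an aggregator that notices the contradiction and refetches whichever document it holds the older copy of) might look roughly like this. It is a sketch under several assumptions: each fetched graph is reduced to a mapping from an (id, updated) pair to a title, the feed entry carries the same id as the entry document, the aggregator records a fetch time per source, and `fetch` is a hypothetical callable standing in for a GET plus GRDDL transform:

```python
def conflicting_keys(graph_a, graph_b):
    """Return the (id, updated) keys on which two fetched graphs disagree."""
    return {key for key in graph_a.keys() & graph_b.keys()
            if graph_a[key] != graph_b[key]}


def refetch_stale(graphs, fetched_at, url_a, url_b, fetch, now):
    """If the two graphs contradict each other, refetch the older copy."""
    if not conflicting_keys(graphs[url_a], graphs[url_b]):
        return None
    stale = url_a if fetched_at[url_a] < fetched_at[url_b] else url_b
    graphs[stale] = fetch(stale)  # replace, do not merge
    fetched_at[stale] = now
    return stale


ENTRY = "http://example.org/coll/entry1.atom"
FEED = "http://example.org/coll/"
KEY = ("tag:example.com,2003/blog/entry1", "2003-12-13T18:30:02Z")

graphs = {
    ENTRY: {KEY: "Atom-Powered Robots Run Amok in France"},  # fetched after the PUT
    FEED: {KEY: "Atom-Powered Robots Run Amok"},             # fetched before the PUT
}
fetched_at = {ENTRY: 2, FEED: 1}

# Stand-in for the network: what the server would return now.
server = {FEED: {KEY: "Atom-Powered Robots Run Amok in France"}}

stale = refetch_stale(graphs, fetched_at, ENTRY, FEED, server.__getitem__, now=3)
print(stale)
```

Comparing fetch times only tells the aggregator which copy to refresh first; if the two graphs still disagree after the refetch, it would need some further policy, which this sketch does not attempt.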
Received on Monday, 13 March 2006 11:29:52 UTC