Re: [] or <> as root ? from Henry Story on 2006-03-15 (semantic-web@w3.org from March 2006)

From: Henry Story <henry.story@bblfish.net>
Date: Wed, 15 Mar 2006 12:28:49 +0100
To: atom-owl@googlegroups.com
Cc: Semantic Web <semantic-web@w3.org>, Hawke Sandro <sandro@w3.org>
Message-Id: <167C0D75-1F3C-4F84-946D-245129D10951@bblfish.net>
After a good conversation on #swig with Sandro Hawke [1] I have come  
to the conclusion that using an anonymous node [] is better. It makes  
it easier to merge graphs, and I have a feeling that should be the  
primary principle when creating ontologies: making graph merges easy.
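
To make the merge point a bit more concrete, here is a rough sketch of what
I take "merging" two graphs with anonymous nodes to involve. It is only an
illustration: the tuple representation, the "_:" label convention and the
names standardize_apart and merge are assumptions of this sketch and come
from no particular toolkit. Any string starting with "_:" is treated as a
blank node, which is a simplification.

```python
from itertools import count

_fresh = count()  # source of fresh blank node labels (assumption of this sketch)


def is_bnode(term):
    # Simplification: any string starting with "_:" is treated as a blank node.
    return isinstance(term, str) and term.startswith("_:")


def standardize_apart(graph):
    """Return a copy of graph in which every blank node has a fresh label."""
    renamed = {}

    def rename(term):
        if is_bnode(term):
            if term not in renamed:
                renamed[term] = f"_:b{next(_fresh)}"
            return renamed[term]
        return term

    return {(rename(s), p, rename(o)) for (s, p, o) in graph}


def merge(*graphs):
    """Union the graphs after renaming blank nodes so they cannot collide."""
    merged = set()
    for g in graphs:
        merged |= standardize_apart(g)
    return merged


# Two documents that both happen to use the local label _:e for their entry.
feed_a = {("_:e", "rdf:type", ":Entry"),
          ("_:e", ":id", "<tag:example.com,2003/blog/entry1>")}
feed_b = {("_:e", "rdf:type", ":Entry"),
          ("_:e", ":id", "<tag:example.com,2003/blog/entry2>")}

naive = feed_a | feed_b        # plain set union conflates the two entries
merged = merge(feed_a, feed_b)  # the merge keeps them as two distinct nodes
```

With the plain union the two entries collapse into one node carrying two
ids; after renaming they stay separate, and it is then up to identity
criteria such as the CIFP to say which nodes denote the same thing.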

The CIFP I mention below is a problem when people update an entry  
without changing the updated time stamp, and exists because the Atom  
working group did not want to be (or could not be) clear about what good  
identity criteria are. By doing graph updates as explained below we  
should be able to deal with that problem. We could also drop the  
CIFP. That should in part be decided by seeing what advantages it  
provides...

Tricky question though...

Henry

[1] http://chatlogs.planetrdf.com/swig/2006-03-15.html#T09-38-04

On 13 Mar 2006, at 12:29, Henry Story wrote:
> Here is a little puzzle regarding Atom on which it would be interesting
> to have some feedback from the larger Semantic Web community. We are
> wondering if there are best practices guidelines for updating
> semantic web data found on the web. We have an ontology for the Atom
> (rfc4287) spec called AtomOwl [1], that would allow us to GRDDL atom
> documents into graphs.
>
> This thread started off with the question as to whether one should map
>
>   <entry>
>         <title>Atom-Powered Robots Run Amok</title>
>         <link href="http://example.org/2003/12/13/entry"/>
>         <id>tag:example.com,2003/blog/entry1</id>
>         <updated>2003-12-13T18:30:02Z</updated>
>         <summary>Some text.</summary>
>   </entry>
>
> to
>
>            <> a :Entry;
>               :title [ :value "Atom-Powered Robots Run Amok";
>                        :type "text/plain" ];
>               iana:alternate <http://example.org/blog/entry.html>;
>               :id <tag:example.com,2003/blog/entry1>;
>               :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
>               :summary [  :value "some text";
>                           :type "text/plain" ] .
>
> or to
>
>            [] a :Entry;
>               :title [ :value "Atom-Powered Robots Run Amok";
>                        :type "text/plain" ];
>               iana:alternate <http://example.org/blog/entry.html>;
>               :id <tag:example.com,2003/blog/entry1>;
>               :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
>               :summary [  :value "some text";
>                           :type "text/plain" ] .
>
>
>
> On 12 Mar 2006, at 21:03, Reto Bachmann-Gmür wrote:
>> Aren't id and updated together a cifp, so that in your examples we are
>> unambiguously talking about the same resource whether it is named
>> or not?
>
> Yes. Though David Powell had some good arguments against using the CIFP
>
>    @prefix cifp: <http://eulersharp.sourceforge.net/2004/04test/rogier#>.
>    [] cifp:productProperty ( :updated :id );
>           a owl:InverseFunctionalProperty .
>
> in some earlier mails to the atom-owl list (misleadingly) entitled by
> me "Feed or Document" [2]. But perhaps the following reasoning can
> help resolve that issue...
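
For readers who have not met a CIFP before, the rule above says that two
nodes agreeing on both :id and :updated denote the same entry. A small
sketch of what a store could do with that rule follows. The function names
are invented here, titles are flattened to plain literals, and treating
:title as single-valued is an assumption of the sketch and not something
the ontology is known to state.

```python
from collections import defaultdict

KEY = (":id", ":updated")  # the property pair used by the CIFP above


def cifp_groups(triples, key_props=KEY):
    """Return {key: subjects} for subjects that share a complete key."""
    values = defaultdict(dict)
    for s, p, o in triples:
        if p in key_props:
            values[s][p] = o
    groups = defaultdict(set)
    for s, props in values.items():
        if len(props) == len(key_props):
            groups[tuple(props[p] for p in key_props)].add(s)
    return {k: v for k, v in groups.items() if len(v) > 1}


def title_conflicts(triples, groups):
    """For each identified group, report titles that disagree.

    Assumption: an entry has at most one title, so two different
    titles on nodes identified by the CIFP count as a contradiction.
    """
    titles = defaultdict(set)
    for s, p, o in triples:
        if p == ":title":
            titles[s].add(o)
    found = {}
    for key, subjects in groups.items():
        seen = set().union(*(titles[s] for s in subjects))
        if len(seen) > 1:
            found[key] = seen
    return found


ID = "<tag:example.com,2003/blog/entry1>"
UPDATED = '"2003-12-13T18:30:02Z"^^xsd:dateTime'
store = [
    ("_:old", ":id", ID), ("_:old", ":updated", UPDATED),
    ("_:old", ":title", '"Atom-Powered Robots Run Amok"'),
    ("_:new", ":id", ID), ("_:new", ":updated", UPDATED),
    ("_:new", ":title", '"Atom-Powered Robots Run Amok in France"'),
]
groups = cifp_groups(store)
conflicts = title_conflicts(store, groups)
```

Here the two blank nodes are identified by the CIFP and then disagree on
the title, which is the situation discussed in the rest of this message.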
>
>> I do however agree in the fundamental question whether atom-owl should
>> be good to describe things over time or just at a specific moment in
>> time. If the second design goal is chosen then aggregators may rely on
>> some more generic graph versioning systems, of which - as you mention -
>> a possible implementation would be quad-based.
>
> Clearly AtomOwl has to be able to describe entries (as identified by
> their id) evolving over time, since there can be more than one entry
> with the same id in a feed. This is a great feature of Atom as it
> does allow the description of the history of certain types of
> resources over time. But in the discussion with David Powell it came
> up that people may want to update an entry without modifying the
> time stamp. So perhaps the publisher will decide that
>
>   <entry>
>         <title>Atom-Powered Robots Run Amok</title>
>         <link href="http://example.org/2003/12/13/entry"/>
>         <id>tag:example.com,2003/blog/entry1</id>
>         <updated>2003-12-13T18:30:02Z</updated>
>         <summary>Some text.</summary>
>   </entry>
>
> published at <http://example.org/coll/> using HTTP POST as specified
> by the Atom Publishing Protocol [3] and resulting in an entry being
> placed at
> <http://example.org/coll/entry1.atom> needs a change that he
> considers insignificant. So he will PUT the following xml at that
> location:
>
> <entry>
>         <title>Atom-Powered Robots Run Amok in France</title>
>         <link href="http://example.org/2003/12/13/entry"/>
>         <id>tag:example.com,2003/blog/entry1</id>
>         <updated>2003-12-13T18:30:02Z</updated>
>         <summary>Some text.</summary>
> </entry>
>
> Let us assume that this is acceptable behavior.
>
> After that PUT operation, the feed representing the collection will
> be updated too. There will of course only be one entry with the
> 2003-12-13T18:30:02Z time stamp as required by the spec. This entry
> will have the new title "Atom-Powered Robots Run Amok in France".
>
> An Atom-OWL based GRDDL tool that would refetch the entry1.atom
> document would create a new set of triples. And if we were to just
> add these to our triple store (which the [] notation is more
> favorable to) we would end up with 2 anonymous entries in our triple
> store with the same time stamp. With the CIFP rule we would end up
> with a contradiction. So we could of course as suggested by David
> Powell add an extra "fetched-at" relation on each blank node entry
> (and remove the CIFP rule) and then base our idea of the actual state
> of the feed on that relation.
>
> But from what I understand of the way Tim Berners Lee is thinking
> about the SemWeb the correct thing to do might in fact be to remove
> the triples generated by the initial GET from your working graph (you
> can always relegate it to an archive graph of course), and replace
> them with the new triples.
>
> So you would replace graph G1
>
> <http://example.org/coll/entry1.atom> a :Entry;
>               :title [ :value "Atom-Powered Robots Run Amok";
>                        :type "text/plain" ];
>               iana:alternate <http://example.org/blog/entry.html>;
>               :id <tag:example.com,2003/blog/entry1>;
>               :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
>               :summary [  :value "some text";
>                           :type "text/plain" ] .
>
> with graph G2
>
> <http://example.org/coll/entry1.atom> a :Entry;
>               :title [ :value "Atom-Powered Robots Run Amok in France";
>                        :type "text/plain" ];
>               iana:alternate <http://example.org/2003/12/13/entry.html>;
>               :id <tag:example.com,2003/blog/entry1>;
>               :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
>               :summary [  :value "some text";
>                           :type "text/plain" ] .
>
> One should of course write ontologies that are monotonic but we
> also have to allow people to fix errors they make when publishing
> statements to the Semantic Web, and a PUT overwriting a document does
> just that. So it makes sense.
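
The replace-on-refetch idea above (G1 swapped for G2, with G1 moved to an
archive) could be sketched roughly as follows. The GraphStore class, its
method names and the fetched_at values are hypothetical and only serve the
illustration. Triples are kept per source document, which is one way of
getting the quad-like bookkeeping mentioned earlier.

```python
class GraphStore:
    """Keeps the latest triples per source document and archives older ones."""

    def __init__(self):
        self.working = {}     # source URL -> set of triples from the latest fetch
        self.fetched_at = {}  # source URL -> time of the latest fetch
        self.archive = []     # (source URL, fetched_at, triples) of superseded fetches

    def replace(self, source, triples, fetched_at):
        # Move whatever we previously held for this source to the archive,
        # then let the new fetch take its place in the working graph.
        if source in self.working:
            self.archive.append(
                (source, self.fetched_at[source], self.working[source]))
        self.working[source] = set(triples)
        self.fetched_at[source] = fetched_at

    def union(self):
        """The working graph: all triples from the latest fetch of each source."""
        out = set()
        for graph in self.working.values():
            out |= graph
        return out


ENTRY = "<http://example.org/coll/entry1.atom>"
SOURCE = "http://example.org/coll/entry1.atom"
UPDATED = '"2003-12-13T18:30:02Z"^^xsd:dateTime'

G1 = {(ENTRY, ":title", '"Atom-Powered Robots Run Amok"'),
      (ENTRY, ":updated", UPDATED)}
G2 = {(ENTRY, ":title", '"Atom-Powered Robots Run Amok in France"'),
      (ENTRY, ":updated", UPDATED)}

store = GraphStore()
store.replace(SOURCE, G1, "2006-03-13T10:00:00Z")  # initial GET (made-up time)
store.replace(SOURCE, G2, "2006-03-13T10:05:00Z")  # refetch after the PUT
```

After the second call the working graph contains only G2 and the archive
holds G1 with its fetch time, so the old title never sits next to the new
one in the graph that is actually queried.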
>
> Now this leaves us with a problem of asynchronous graph updates. A
> client may for example update the graph at
> http://example.org/coll/entry1.atom resulting in graph G2 without
> having yet had time to update the feed at http://example.org/coll/
> which (after GRDDL transform) contains graph G3
>
>            [] a :Entry;
>               :title [ :value "Atom-Powered Robots Run Amok";
>                        :type "text/plain" ];
>               iana:alternate <http://example.org/2003/12/13/atom03.html>;
>               :id <urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a>;
>               :updated "2003-12-13T18:30:02Z"^^xsd:dateTime;
>               :summary [  :value "some text";
>                           :type "text/plain" ] .
>
>
> which is compatible with G1 but not G2 (given our CIFP). So what
> should such an aggregator do?
>    - a client using the APP protocol would presumably know that it
> will need to update the feed too, and so it could refetch that and
> replace its graph. That's feasible.
>    - an aggregator that was not involved in the process, and so did
> not know about the PUT operation that had just happened, could notice
> the contradiction and try to resolve it by refetching the feed,
> noticing that the version it had was older than the entry.
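
The second option, an aggregator that notices the contradiction and
refetches the older document, might look roughly like this. It is a sketch
under several assumptions: each source is taken to hold a single entry,
titles are flattened to literals, the fetch times are invented, and the
example feed uses the tag: id from G1 and G2 so that the CIFP key matches.
The names entry_facts and stale_sources are made up for the illustration.

```python
KEY = (":id", ":updated")


def entry_facts(triples):
    """Collapse the triples about a single entry into {property: value}."""
    return {p: o for _, p, o in triples}


def stale_sources(graphs, fetched_at):
    """Return the sources that should be refetched.

    graphs maps a source URL to the triples for its (single) entry.
    Two sources conflict when their entries share the CIFP key but
    disagree on the title; the one fetched earlier is assumed stale.
    """
    stale = set()
    sources = list(graphs)
    for i, a in enumerate(sources):
        for b in sources[i + 1:]:
            fa, fb = entry_facts(graphs[a]), entry_facts(graphs[b])
            same_key = all(
                fa.get(p) is not None and fa.get(p) == fb.get(p) for p in KEY)
            if same_key and fa.get(":title") != fb.get(":title"):
                stale.add(a if fetched_at[a] < fetched_at[b] else b)
    return stale


ID = "<tag:example.com,2003/blog/entry1>"
UPDATED = '"2003-12-13T18:30:02Z"^^xsd:dateTime'
ENTRY_DOC = "http://example.org/coll/entry1.atom"
FEED_DOC = "http://example.org/coll/"

graphs = {
    ENTRY_DOC: [("_:e", ":id", ID), ("_:e", ":updated", UPDATED),
                ("_:e", ":title", '"Atom-Powered Robots Run Amok in France"')],
    FEED_DOC: [("_:f", ":id", ID), ("_:f", ":updated", UPDATED),
               ("_:f", ":title", '"Atom-Powered Robots Run Amok"')],
}
# ISO timestamps in the same format compare correctly as strings.
fetched_at = {ENTRY_DOC: "2006-03-13T10:05:00Z",
              FEED_DOC: "2006-03-13T10:00:00Z"}

to_refetch = stale_sources(graphs, fetched_at)
```

In this example the feed was fetched before the entry document, so the
feed is the one flagged for refetching. Whether "older fetch loses" is the
right policy in general is exactly the open question.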
>
>
> Does anyone have experience in dealing with updates across RDF documents
> on the web? And in how to deal with contradictions?
>
>     Henry Story
>
>
>
>> reto
>>
>> Henry Story wrote:
>>> I have been reading the "Reaching out onto the Web" document at
>>> <http://www.w3.org/2000/10/swap/doc/Reach> a little and am trying to
>>> see how keeping this in mind would affect the ontology.
> [snip]
>
>
> [1] http://bblfish.net/work/atom-owl/2005-10-23/
> [2] http://groups.google.com/group/atom-owl/browse_frm/thread/357e36c4ee9cd31b
> [3] http://bitworking.org/projects/atom/draft-ietf-atompub-protocol-08.html
Received on Wednesday, 15 March 2006 11:29:08 UTC