Re: Atom Triples Internet Draft from Story Henry on 2008-07-02 (semantic-web@w3.org from July 2008)

From: Story Henry <henry.story@bblfish.net>
Date: Wed, 2 Jul 2008 16:47:43 +0200
To: Beckett Dave <dave@dajobe.org>, "semantic-web@w3.org Web" <semantic-web@w3.org>
Cc: atom-owl@googlegroups.com, Mark Nottingham <mnot@pobox.com>, Atom-Protocol Protocol <atom-protocol@imc.org>
Message-Id: <100C0D02-F23C-41C4-B118-0A36F3EE139D@bblfish.net>

Thanks Dave for this proposal to link atom and rdf.

A few remarks:

1. one can already embed rdf in atom
------------------------------------

Just as a matter of interest for those who do not know the atom  
format, one can already embed rdf in atom quite simply.

<entry>
       <title>syndeocms Project</title>
       <link href="http://doapspace.org/doap/sf/syndeocms"/>
       <id>http://doapspace.org/doap/sf/syndeocms</id>
       <updated>2007-12-13T18:30:02Z</updated>
       <summary>Some text..</summary>
       <content type="text/rdf+n3">
            @prefix doap: &lt;http://usefulinc.com/ns/doap#&gt; .
            &lt;http://projects.com/1&gt; a doap:Project;
                                          doap:name "Project 1" .
       </content>
</entry>

It is interesting to think about what the Atom Triples proposal adds  
to this. More below.
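
For readers who want to see this mechanically, here is a minimal Python
sketch (standard library only) that pulls the N3 text out of an entry like
the one above. The namespace declaration on the entry and the helper name
extract_n3 are assumptions added for illustration. The point is only that
the content arrives as an opaque string, not as triples merged into anything.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# The entry from above, with an Atom namespace declaration added (assumption).
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>syndeocms Project</title>
  <link href="http://doapspace.org/doap/sf/syndeocms"/>
  <id>http://doapspace.org/doap/sf/syndeocms</id>
  <updated>2007-12-13T18:30:02Z</updated>
  <summary>Some text..</summary>
  <content type="text/rdf+n3">
    @prefix doap: &lt;http://usefulinc.com/ns/doap#&gt; .
    &lt;http://projects.com/1&gt; a doap:Project;
        doap:name "Project 1" .
  </content>
</entry>"""


def extract_n3(entry_xml):
    """Return the N3 text carried in atom:content, or None if there is none."""
    entry = ET.fromstring(entry_xml)
    content = entry.find(ATOM + "content")
    if content is None or content.get("type") != "text/rdf+n3":
        return None
    # The XML parser has already unescaped &lt; and &gt; for us.
    return (content.text or "").strip()


n3_text = extract_n3(ENTRY_XML)
print(n3_text)
```

What a consumer does with that string (parse it, trust it, merge it) is a
separate decision, which is the subject of the next sections.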

2. Mappings from Atom to RDF already exist
------------------------------------------

As another point of reference one has to point out that links between  
rdf and atom are being worked on.
Projects such as the atom-owl group have started looking at designing
an ontology and a mapping for this.
	http://groups.google.com/group/atom-owl
The latest spec is available here:
         http://bblfish.net/work/atom-owl/2006-06-06/AtomOwl.html
with XSLT and XQuery transforms. David Powell has also worked on what
turns out to be a nearly isomorphic ontology, atom-rdf.

Clearly what is in the content element has to be interpreted as a
literal, otherwise a feed with a number of entries saying contradictory
things could produce a nonsensical graph on GRDDL extraction. That is,
the content is close to a quoted N3 graph. We could nearly translate the
example above into the following N3:

@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a :Entry;
    :title "syndeocms Project";
    :alternate <http://doapspace.org/doap/sf/syndeocms>;
    :id "http://doapspace.org/doap/sf/syndeocms"^^xsd:anyURI;
    :updated "2007-12-13T18:30:02Z"^^xsd:dateTime;
    :summary "Some text..";
    :content {
        @prefix doap: <http://usefulinc.com/ns/doap#> .

        <http://projects.com/1> a doap:Project;
                                doap:name "Project 1" .
    } .

There is no general rule that one has to merge any two graphs one  
finds on the internet. How, and when to merge two graphs is a matter  
of trust and choice of lifting rules. RDF semantics does state rules  
about merging two graphs one believes to be true.
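
As a rough illustration of what the RDF merge does once one has decided to
trust two graphs, here is a small Python sketch. Triples are modelled as
plain tuples of strings and blank nodes as strings starting with "_:".
That representation and the name rdf_merge are assumptions for this
example, not part of any spec. The merge is the union of the graphs after
the blank nodes of each graph have been renamed apart.

```python
def rdf_merge(*graphs):
    """Union of the graphs, with each graph's blank nodes renamed apart.

    Simplification: every term is a string, and a term is treated as a
    blank node exactly when it starts with "_:".
    """
    merged = set()
    for index, graph in enumerate(graphs):
        def rename(term, index=index):
            if term.startswith("_:"):
                return "_:g%d_%s" % (index, term[2:])
            return term

        for s, p, o in graph:
            merged.add((rename(s), p, rename(o)))
    return merged


# Two graphs that happen to reuse the same blank node label.
entry_one = {("_:project", "doap:name", '"Project 1"')}
entry_two = {("_:project", "doap:name", '"Project 2"')}

naive_union = entry_one | entry_two        # wrongly identifies the two nodes
merged = rdf_merge(entry_one, entry_two)   # keeps them distinct

print(sorted(naive_union))
print(sorted(merged))
```

URIs pass through unchanged, so two graphs that speak about the same URI do
end up describing one resource, which is exactly why the choice of subject
matters in what follows.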

The way atom is used, though, it is quite possible for two entries in
the same feed with the same id to have contradictory content, the second
entry being an update of the first. Perhaps a mistake was made on first
publication. It follows that the content above need not be merged with
the surrounding context, or with the content of other entries in the
same feed, let alone other feeds. As a result, it is true that the
content element would not be the place to add new metadata about the
feed or entry objects themselves.

This points to the need for something like what is being proposed by
the AtomTriples specification.


3. embedding rdf in atom
------------------------

It is clearly stated in the introduction that the aim of this format
is to embed rdf in atom:

[[
    This specification describes AtomTriples, a set of Atom [RFC4287]
    extension elements for embedding RDF
    [W3C.WD-rdf-syntax-grammar-20031010] statements in Atom documents
    (both element and feed), as well as declaring how they can be derived
    from existing content.
]]

The at:md element does in fact allow the embedding of any rdf in the  
atom feed or entry elements. The following statement is very odd though:

[[
    Likewise, the mechanics of combining metadata from multiple instances
    of the same entry, or from multiple feed documents, is out of the
    scope of this specification.
]]

This cannot be right. You cannot use rdf in a format and yet say  
nothing about how to use that RDF. RDF semantics makes it very clear  
how to merge relations from different graphs. If you are embedding RDF  
in atom, there has to be a way to make sense of what that embedding  
means, or else how can one know that it is rdf that you have embedded  
in atom, and not something completely different that just happens to  
look very much like a well known serialisation of rdf?

That is, one has to have a story of what the merged graph looks like if
one believes all the information to be correct. Someone who publishes
a feed clearly makes a statement about the content of the feed, and
the statement should be taken at face value to say something true.

If one really does not want such a merge, then would the right place
to put this metadata not be the <content> element, where clearly we
have a literal, and there is no obligation to merge the log:semantics
of literals?

Or are you saying that the rdf in the at:md elements is really a
literal? How then does it differ from the content?


4. problems with at:md element
------------------------------

The at:md element allows one to place rdf anywhere in a feed or entry  
element, and allows one to speak of anything.

[[
The subject of these statements is, by default, the value of the
    atom:id element in the same context (atom:element or atom:feed).
    However, this behaviour MAY be overridden by specifying the subject
    attribute.
]]

Since the rdf one can place inside the at:md element can be about
anything, one wonders what the point of the default behavior in the
spec is. By default the subject of the statements in the at:md element
is the resource identified by the atom:id of the feed or entry, or, if
that is overridden, some resource related by a link relation.
Furthermore the way to find the URI for that link relation is extremely
contorted:

[[
    It MUST contain a URI which MUST be interpreted as a link relation;
    the first such occurrence of an atom:link element in the same context
    as its parent element with that relation (in lexical order) will
    indicate the URI to use as the subject.
]]

Since the order of link relations in atom is insignificant, this breaks
what little semantics atom does define.
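
To make the order dependence concrete, here is a minimal Python sketch of
the subject rule as I read the two quoted paragraphs. The function names
and the sample entries are hypothetical, and link relations are compared as
plain strings (no expansion of registered names to IANA URIs), which is a
simplification.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def resolve_subject(entry, rel=None):
    """Subject of the statements in an at:md block, per the quoted rule.

    rel=None models a missing subject attribute: fall back to atom:id.
    Otherwise use the first atom:link, in document order, with that rel.
    """
    if rel is None:
        return entry.findtext(ATOM + "id")
    for link in entry.findall(ATOM + "link"):  # findall keeps document order
        if link.get("rel", "alternate") == rel:
            return link.get("href")
    return None


def entry_with_links(hrefs):
    """Hypothetical entry whose rel="related" links appear in the given order."""
    links = "".join('<link rel="related" href="%s"/>' % h for h in hrefs)
    return ET.fromstring(
        '<entry xmlns="http://www.w3.org/2005/Atom">'
        "<id>http://example.com/a</id>" + links + "</entry>"
    )


first = entry_with_links(["http://example.com/x", "http://example.com/y"])
second = entry_with_links(["http://example.com/y", "http://example.com/x"])

print(resolve_subject(first, "related"))
print(resolve_subject(second, "related"))
print(resolve_subject(first))
```

The two entries contain the same links and differ only in their order, yet
under this rule the embedded statements end up with different subjects.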

Furthermore both the id and the link relations MUST have URIs! So  
there is absolutely no need to have these default behaviors since RDF  
has many constructs to make speaking about anything with a URI very  
easy.

And even worse than all of the above, the default behavior makes it
difficult to speak about the one thing that it may be important to
speak about, namely THE ENTRY (or the feed) ITSELF!!!!

In Atom Owl every entry is identified by a blank node, which has a  
functional relation awol:id to an id, and a functional relation  
awol:updated to a time stamp. It is not easy to speak about individual  
entries in atom since they don't have identifiers. So the obvious  
location to put information about them is as children of that element  
as hinted at by the atom spec in section 6.4.1, which admittedly is
about simple extensions, but nevertheless makes the point well:

[[
The element can be interpreted as a simple property (or name/value  
pair) of the parent element that encloses it. The pair consisting of  
the namespace-URI of the element and the local name of the element can  
be interpreted as the name of the property. The character data content  
of the element can be interpreted as the value of the property. If the  
element is empty, then the property value can be interpreted as an  
empty string.
]]


5. the mapping section
----------------------

The mapping section that allows one to make ad hoc mappings from atom  
elements to rdf relations is broken in two ways.

Take the examples:

[[
    <at:map
     property="http://purl.org/dc/elements/1.1/title">atom:title</at:map>

    indicates that the atom:title element's content should be mapped to
    the http://purl.org/dc/elements/1.1/title property.  Given the entry

    <atom:entry>
     <atom:id>http://example.com/a</atom:id>
     <atom:title>Test</atom:title>
    </atom:entry>

    and the map above as a child of at:entrymap, the following triple
    would be implied;

  <http://example.com/a> <http://purl.org/dc/elements/1.1/title> "Test" .

]]


     5.1 Wrong place to put the mapping
     - - - - - - - - - - - - - - - - - -

  This is the wrong place to put these mappings. Much better would be to
create a general semantics of atom, and then use RDF semantics to
create these mappings. So for example it would be easy to add the
following to atom owl:

@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix awol: <http://bblfish.net/work/atom-owl/2006-06-06/#> .

awol:title rdfs:subPropertyOf dc:title .

Why add it every time to the atom document? Is this something that you  
think is going to be changing from one atom feed to another?
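
A small Python sketch of why one statement in the vocabulary is enough.
Triples are plain tuples of prefixed names, and the function name and the
sample entry nodes are hypothetical. The function applies the RDFS
subproperty rule (rdfs7: if p is a subproperty of q and s p o holds, then
s q o holds) until nothing new appears.

```python
SUBPROP = "rdfs:subPropertyOf"


def subproperty_closure(triples):
    """Apply RDFS rule rdfs7 repeatedly until no new triple is derived."""
    closed = set(triples)
    while True:
        pairs = {(s, o) for (s, p, o) in closed if p == SUBPROP}
        derived = {
            (s, q, o)
            for (s, p, o) in closed
            for (sub, q) in pairs
            if p == sub
        }
        if derived <= closed:
            return closed
        closed |= derived


# Stated once, in the ontology, not in every feed.
vocabulary = {("awol:title", SUBPROP, "dc:title")}

# Two unrelated feeds, neither of which carries any mapping of its own.
feed_one = {("_:e1", "awol:title", "Gold increases")}
feed_two = {("_:e2", "awol:title", "Test")}

closure = subproperty_closure(vocabulary | feed_one | feed_two)
print(sorted(t for t in closure if t[1] == "dc:title"))
```

Both entries gain a dc:title without either feed document saying anything
about Dublin Core.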


      5.2 the semantics are wrong
      - - - - - - - - - - - - - -


After a lot of work the AtomOwl group has developed a semantics for
atom expressed in RDF. David Powell independently developed an ontology
that was shown to be mostly isomorphic. Both would agree on the
following: since atom allows two entries to have the same id, one
should *not* make the id URI the subject of the title relations. The
following feed is valid atom:


<atom:feed>
   ...
   <atom:entry>
    <atom:id>http://example.com/a</atom:id>
    <atom:title>Gold increases</atom:title>
    <atom:updated>2008-06-13T18:30:02Z</atom:updated>
    <atom:content>The price of gold has just gone up</atom:content>
   </atom:entry>

   <atom:entry>
    <atom:id>http://example.com/a</atom:id>
    <atom:title>Gold value rises</atom:title>
    <atom:updated>2008-06-14T02:30:02Z</atom:updated>
    <atom:content>The price of gold has just gone up by 20%</atom:content>
   </atom:entry>
  ...
</atom:feed>

If the suggested mapping were right then the meaning of this would be:

[] a awol:Feed;
   awol:entry <http://example.com/a> .

<http://example.com/a> dc:title "Gold increases", "Gold value rises" ;
            awol:content "The price of gold has just gone up by 20%",
                         "The price of gold has just gone up" ;
            awol:updated "2008-06-13T18:30:02Z"^^xsd:dateTime,
                         "2008-06-14T02:30:02Z"^^xsd:dateTime .


which would make it impossible to work out:
     - which title goes with which content
     - which title goes with which time stamp
     - which time stamp goes with which content


yet that information is very clear in the atom document, and can
furthermore be expressed in rdf without ambiguity (though with some
simplification) as:


@prefix : <http://bblfish.net/work/atom-owl/2006-06-06/#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[] a :Feed;
    :entry [
        :id "http://example.com/a"^^xsd:anyURI;
        :title "Gold increases";
        :updated "2008-06-13T18:30:02Z"^^xsd:dateTime;
        :content "The price of gold has just gone up";
        ];
    :entry [
        :id "http://example.com/a"^^xsd:anyURI;
        :title "Gold value rises";
        :updated "2008-06-14T02:30:02Z"^^xsd:dateTime;
        :content "The price of gold has just gone up by 20%";
        ]
.

Notice how we can tell exactly which title goes with which entry, which
updated time stamp, and which content.
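
The difference between the two readings of the two-entry gold feed above
can be checked with a short Python sketch. The dictionaries, function names
and bare property names are hypothetical stand-ins: one mapping uses the
atom:id as subject, in the style of the at:map example, and the other gives
each entry its own blank node, roughly as Atom Owl does.

```python
ENTRIES = [
    {"id": "http://example.com/a",
     "title": "Gold increases",
     "updated": "2008-06-13T18:30:02Z",
     "content": "The price of gold has just gone up"},
    {"id": "http://example.com/a",
     "title": "Gold value rises",
     "updated": "2008-06-14T02:30:02Z",
     "content": "The price of gold has just gone up by 20%"},
]


def map_with_id_subject(entries):
    """The atom:id is the subject of every statement about the entry."""
    return {(e["id"], prop, e[prop])
            for e in entries
            for prop in ("title", "updated", "content")}


def map_with_entry_nodes(entries):
    """A fresh blank node per entry; the id is just another property."""
    triples = set()
    for number, e in enumerate(entries):
        node = "_:entry%d" % number
        for prop in ("id", "title", "updated", "content"):
            triples.add((node, prop, e[prop]))
    return triples


def contents_for_title(triples, title):
    """Content values attached to any subject that carries the given title."""
    subjects = {s for (s, p, o) in triples if p == "title" and o == title}
    return sorted(o for (s, p, o) in triples
                  if p == "content" and s in subjects)


print(contents_for_title(map_with_id_subject(ENTRIES), "Gold increases"))
print(contents_for_title(map_with_entry_nodes(ENTRIES), "Gold increases"))
```

With the id as subject the query returns both contents, so the pairing is
lost. With one node per entry it returns only the content that was
published with that title.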

The current Atom Triples spec would force the wrong default
interpretation of atom.


Given the above I have to conclude that the spec is badly broken. I
would suggest first working on an official semantics for atom, then
working on a better way to add general extensions to it that work well
with that semantics. This would be useful both in getting a general
semantics together and in making sure the extensions are meaningful.


Yours sincerely,

	Henry Story




On 1 Jul 2008, at 21:29, Dave Beckett wrote:
> Mark Nottingham and myself have co-authored an internet draft for
> transporting RDF in Atom.  We've called it "Atom Triples" since
> the focus is on Atom, annotating/adding the triples to the existing
> format.  Where we had a choice of the atom way or the rdf way, we
> picked the atom way.
>
> So the purpose of this format is to allow adding of triples for
> descriptions of the resources in an atom feed, using the URI
> of one atom:link as the main resource.  The body of the at:md
> is typically the blank-node-closure of the graph associated with
> the main resource.  Or at least, that's how I've done it so far.
>
> This is Version 0 and we know there are some things in the example
> that need clarifying and expanding and other questions, but here it is:
>
> AtomTriples: Embedding RDF Statements in Atom
> http://www.ietf.org/internet-drafts/draft-nottingham-atomtriples-00.txt
>
> Usual IETF I-D caveat: this URL will expire.
>
> Dave
> & Mark
Received on Wednesday, 2 July 2008 14:50:33 UTC