Re: GRDDL extraction *to* RDFa from Chimezie Ogbuji on 2006-09-11 (public-grddl-wg@w3.org from September 2006)

From: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
Date: Mon, 11 Sep 2006 10:13:27 -0400 (EDT)
To: public-grddl-wg <public-grddl-wg@w3.org>
Message-ID: <Pine.GSO.4.60.0609110918200.28793@joplin.bio.ri.ccf.org>
On Sat, 9 Sep 2006, Ben Adida wrote:

> Here's another twist on it: some people think N3 is a better way of
> expressing RDF than RDF/XML. In that world, it would make sense to
> provide a GRDDL transform from RDF/XML to N3, in which case RDF/XML is
> now a GRDDL input. That would be awfully recursive, but it all depends
> on what we think the "pure" expression of RDF is. I wonder if we really
> want to be setting in stone such an opinion, or if we should instead say
> "GRDDL outputs some standard carrier of RDF triples, e.g. RDF/XML."

To be honest, I've gone back and forth on this. The argument that RDF/XML
is 'the' standard seems less like a sound one to make about what is truly
an 'abstract' syntax, and not only because of RDF/XML's inability to
uniformly (and concisely) express RDF.

> In my mind, this second step is not always necessary. In fact, the
> transformation to 'raw' RDF discards important contextual information,
> which means you wouldn't want this to be the only way to "read" RDFa.
> Consider the way the RDFa bookmarklet works: it doesn't GRDDL the
> XHTML+RDFa to triples, it actually navigates the DOM tree, looking for
> attached triples along the way.
>
> I see this "DOM ornaments" approach as a major way of "reading the
> triples" from an RDFa document.
>
>> The only problem with the first step of course is that <?xml-stylesheet
>> ?> is more of a suggestion than a 'standard' (even though it is
>> supported by most major browsers).
>
> Yes, there is that, plus the fact that it doesn't indicate anything
> about *what* this transformation is meant for.

Agreed.  That is definitely one advantage that GRDDL has over
<?xml-stylesheet?>: it expresses a clear intent for the result of the
transformation, namely an RDF description of the original source document.
It's worth noting, however, that one advantage <?xml-stylesheet?> has over
GRDDL is the ability to clearly express what the expected output of the
transformation is (by mime-type), via its @type attribute.  It's no
coincidence that XSLT also includes a way to express the expected output
of a transformation: xsl:output/@media-type [1].  More on the significance
of this later.
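
As a rough illustration of the xsl:output/@media-type point, here is a minimal Python sketch that reads the declared output mime-type from a stylesheet. The sample stylesheet and the function name declared_media_type are made up for this example; only the XSLT namespace and the media-type attribute come from the XSLT specification.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Hypothetical stylesheet, used only to exercise the function below.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" media-type="application/rdf+xml"/>
</xsl:stylesheet>"""


def declared_media_type(xslt_source):
    """Return the media-type declared on a top-level xsl:output, or None."""
    root = ET.fromstring(xslt_source)
    # xsl:output is a top-level element, so a direct-child lookup is enough.
    output = root.find("{%s}output" % XSL_NS)
    if output is None:
        return None
    return output.get("media-type")


print(declared_media_type(STYLESHEET))
```

A consumer could use such a lookup to learn what a transform claims to produce before running it, instead of assuming RDF/XML.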

>> I've actually had a real world need to do something like your usecase
>> suggests.  I've been working on an Atom-driven Python Weblog tool [2]
>> which uses XSLT for its templating in conjunction with Python WSGI for
>> the web server stack.  I added a few presentation templates and modified
>> the XSLT (which takes an Atom feed as source ) to output XHTML with RDFa
>> markup for Atom metadata (author, label, date of creation, etc..).
>
> Very cool. Would it be okay if we linked this work from rdfa.info?

Sure. It's still a work in progress, though :)

> What you're describing here fits very well within the use case I
> described, but I think you and I may have slightly different
> interpretations of why.

> In my mind, whether your munging together of ATOM and templating
> produces RDF/XML or XHTML+RDFa shouldn't make a difference. In the first
> case, it's only machine-readable, in the second case it's both machine
> and human readable. If you had, instead, produced some homegrown HTML
> with no RDFa or anything else, then you could transform that homegrown
> HTML to a carrier of RDF, which could be either RDF/XML or XHTML+RDFa.
>
> In other words, I see the GRDDL-able gap between homegrown-HTML and
> XHTML+RDFa. If I'm understanding you correctly, you see the gap between
> XHTML+RDFa and RDF/XML.

Yes, and my scoping in this way really is only because the current state
of the specification doesn't allow anything beyond the second
interpretation, which I'm starting to believe is an unfounded
architectural restriction.

>> I think the categories of possible output for GRDDL are:
>>
>> - RDF/XML alone
>> - Stand-alone RDF syntaxes (NTriples, N3, TriX, etc..)
>> - Embedded RDF syntaxes for XML (eRDF, RDFa, etc..)
>
> So the central question is whether you conceive of RDFa as a
> serialization of RDF or as something that can be transformed into a
> valid serialization of RDF. I think it is a first-class serialization,
> but I know not everyone agrees.
>
> The second question is whether we should be deciding, in this working
> group, what constitutes "pure RDF," or whether we should say that a
> GRDDL output should be "some accepted, mime-typed, serialization of RDF,
> e.g. RDF/XML." I am clearly in favor of this latter approach.

I am as well.  Full disclosure: my original email was written in the hope
of initiating further conversation about the current limitation on what is
expected from GRDDL.  I think the important adjective in your statement
above is 'mime-typed'.  XSLT, the 'suggested' transformation language of
GRDDL, is perfectly capable of supporting an output that is completely
specified by mime-type (rather than by a restriction imposed by GRDDL), and
using the mime-type of the output opens the door for other serializations
of RDF (including Turtle, which XSLT is also quite capable of generating).
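
To make the 'mime-typed output' idea concrete, here is a minimal Python sketch of a consumer that accepts a transformation result based on its declared mime-type rather than assuming RDF/XML. The table of media types and the function names are assumptions for illustration; apart from application/rdf+xml, the exact type strings in use may differ.

```python
# Assumed mapping from mime-type to RDF serialization (illustrative only).
RDF_MEDIA_TYPES = {
    "application/rdf+xml": "RDF/XML",
    "text/turtle": "Turtle",
    "text/n3": "N3",
    "application/n-triples": "N-Triples",
}


def serialization_for(media_type):
    """Map a mime-type (parameters ignored) to a serialization name, or None."""
    base = media_type.split(";", 1)[0].strip().lower()
    return RDF_MEDIA_TYPES.get(base)


def accept_grddl_output(media_type):
    """Accept any output whose mime-type names a known RDF serialization."""
    name = serialization_for(media_type)
    if name is None:
        raise ValueError("not a recognized RDF serialization: %r" % media_type)
    return name


print(accept_grddl_output("text/turtle; charset=utf-8"))
```

Under this reading, supporting a new serialization means adding an entry to the table, not changing what GRDDL itself mandates.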

The only caveat would be that the use of mime-types would be required of
the transformation algorithm (no problem with XSLT, but what of others?).

Suppose there were an RDF vocabulary for Microsummaries (per
Dominique's recent email) with a registered mime-type
'application/x.microsummary+rdf'. In a scenario where GRDDL only required
the output to be an RDF syntax described by a mime-type, microsummaries
would then fit quite nicely as a GRDDL transform for consumption by
browsers that understood microsummary RDF.

I'd imagine any transformation pipeline would also rely on mime-types to 
indicate the output format at each stage of the transformation.
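
A minimal Python sketch of such a pipeline, assuming each stage declares the mime-type it accepts and the one it produces. The Stage class, the stage names, and the check_pipeline function are hypothetical and only illustrate the chaining.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str      # hypothetical label for the transformation
    accepts: str   # mime-type the stage expects as input
    produces: str  # mime-type the stage declares for its output


def check_pipeline(stages):
    """Verify that each stage's output type matches the next stage's input.

    Returns the mime-type produced by the final stage, or None if empty.
    """
    for current, following in zip(stages, stages[1:]):
        if current.produces != following.accepts:
            raise ValueError(
                "%s produces %s but %s accepts %s"
                % (current.name, current.produces,
                   following.name, following.accepts)
            )
    return stages[-1].produces if stages else None


pipeline = [
    Stage("atom-to-xhtml-rdfa", "application/atom+xml", "application/xhtml+xml"),
    Stage("rdfa-to-rdfxml", "application/xhtml+xml", "application/rdf+xml"),
]
print(check_pipeline(pipeline))
```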

>
>> The current spec only seems to support the first
>
> Yes, I agree that the spec currently reads this way, and I'm proposing
> that this be made more flexible (though I don't think I'm the first to
> suggest this.)

Definitely not..

[1] http://www.w3.org/TR/xslt#output

Chimezie Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org
Received on Monday, 11 September 2006 14:13:54 UTC