Re: GRDDL extraction *to* RDFa from Chimezie Ogbuji on 2006-09-09 (public-grddl-wg@w3.org from September 2006)

From: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
Date: Sat, 9 Sep 2006 13:18:31 -0400 (EDT)
To: public-grddl-wg <public-grddl-wg@w3.org>
Message-ID: <Pine.GSO.4.60.0609091317590.24977@joplin.bio.ri.ccf.org>

Hi Ben.

I have a concern that such a transform isn't really a GRDDL transform but 
a generic one, which *results* in a GRDDL Source Document (which could 
refer to a standard [1] XHTML/RDFa -> RDF GRDDL transform).

I've always thought of the GRDDL process as a black box where an XML 
dialect goes in and an RDF syntax comes out (RDF/XML currently, though I 
think there is an outstanding issue in the author's version to consider 
other serializations of RDF).  The mechanisms by which the transforms are 
registered and the transformation itself constitute the black box.

The problem with your scenario, as I see it, is that it requires 
expanding the current definition to include (as output) not only 
'non-standard' RDF serialization syntaxes (Turtle, N3, TriX, etc.) but 
also embedded syntaxes, which I'd argue are better served as the *input* 
to a GRDDL transformation that results in a 'stand-alone' RDF syntax.  It 
seems to me that your scenario is really a two-phase process (where the 
first phase is a layout transformation and the second is a GRDDL 
transformation), especially since the primary motivation is "preserving 
the style and layout of the page":

1) Transformation from homegrown HTML to XHTML+RDFa (I'd argue by the 
current scope this is not a GRDDL process and could easily be served with 
a <?xml-stylesheet?> instruction *within* the original HTML)
2) Transformation from the XHTML+RDFa to 'raw' RDF (used by the RDFa 
browser which itself could be a GRDDL Processor - using an existing [1] 
transform for this process)
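
A rough sketch of the two phases, using only the Python standard library. 
Everything here is illustrative: the class-to-property mapping, the sample 
markup, and the subject URI are made up, and the second function is a toy 
stand-in for a real XHTML+RDFa -> RDF transform such as [1], not an RDFa 
parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from homegrown class names to RDFa property values.
CLASS_TO_PROPERTY = {"title": "dc:title", "author": "dc:creator"}


def add_rdfa(html_text):
    """Phase 1: layout-preserving rewrite that only adds RDFa attributes."""
    root = ET.fromstring(html_text)
    for el in root.iter():
        prop = CLASS_TO_PROPERTY.get(el.get("class", ""))
        if prop:
            el.set("property", prop)
    return ET.tostring(root, encoding="unicode")


def extract_triples(xhtml_rdfa_text, subject):
    """Phase 2: toy extraction of (subject, property, literal) triples."""
    root = ET.fromstring(xhtml_rdfa_text)
    return [
        (subject, el.get("property"), "".join(el.itertext()).strip())
        for el in root.iter()
        if el.get("property")
    ]


# Made-up, well-formed "homegrown" markup.
HOMEGROWN = """<html><body>
<h1 class="title">GRDDL notes</h1>
<p>By <span class="author">Example Author</span></p>
</body></html>"""

annotated = add_rdfa(HOMEGROWN)                                  # phase 1
triples = extract_triples(annotated, "http://example.org/page")  # phase 2
print(triples)
```

Only the second function plays the role of the black box described above; 
the first never produces RDF at all, which is why I'd keep it outside the 
GRDDL definition.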

The only problem with the first step of course is that <?xml-stylesheet ?> 
is more of a suggestion than a 'standard' (even though it is supported by 
most major browsers).

I've actually had a real-world need to do something like what your use 
case suggests.  I've been working on an Atom-driven Python Weblog tool [2] 
which uses XSLT for its templating in conjunction with Python WSGI for the 
web server stack.  I added a few presentation templates and modified the 
XSLT (which takes an Atom feed as source) to output XHTML with RDFa 
markup for Atom metadata (author, label, date of creation, etc.).  In 
order to test the template visually, I installed your RDFa Highlight 
bookmarklet [3].  I also modified the XSLT such that the output XHTML 
document was also a GRDDL Source Document (by adding the appropriate 
profile and link[@rel=transformation] element).  This way the RDFa 
bookmarklet could understand the RDFa directly and a generic GRDDL 
Processor could as well.
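
To make the "profile and link[@rel=transformation]" part concrete, here is 
a minimal sketch of how a GRDDL-aware client might discover the declared 
transforms in such an XHTML document. The function name and the sample 
document are made up for illustration; the profile URI is the GRDDL 
data-view profile, and the transform URL is the one from [1]. It does not 
fetch or run the transform.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def grddl_transformations(xhtml_text):
    """Return the transformation URLs an XHTML document declares via GRDDL."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # head/@profile is a space-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    hrefs = []
    for link in head.iter(f"{{{XHTML_NS}}}link"):
        # link/@rel is a space-separated list of tokens as well.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


# Made-up sample of the kind of output the modified XSLT template would emit.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Weblog entry</title>
    <link rel="transformation"
          href="http://www-sop.inria.fr/acacia/soft/RDFa2RDFXML.xsl" />
  </head>
  <body><p>Entry text</p></body>
</html>"""

print(grddl_transformations(SAMPLE))
```

Without the profile on the head element the function returns an empty 
list, so the same page degrades to plain XHTML+RDFa for clients that only 
read the RDFa directly.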

I think the categories of possible output for GRDDL are:

- RDF/XML alone
- Stand-alone RDF syntaxes (NTriples, N3, TriX, etc.)
- Embedded RDF syntaxes for XML (eRDF, RDFa, etc.)

The current spec only seems to support the first.

[1] http://www-sop.inria.fr/acacia/soft/RDFa2RDFXML.xsl
[2] http://cheeseshop.python.org/pypi/BrightContent/0.1 (a work in progress)
[3] http://www.w3.org/2001/sw/BestPractices/HTML/rdfa-bookmarklet/

Chimezie Ogbuji
Lead Systems Analyst
Thoracic and Cardiovascular Surgery
Cleveland Clinic Foundation
9500 Euclid Avenue/ W26
Cleveland, Ohio 44195
Office: (216)444-8593
ogbujic@ccf.org

Received on Saturday, 9 September 2006 17:18:44 UTC