Experimenting with pseudo-hGRDDL (was Re: Clarifying CURIEs [was Re: RDFa - Dublin Core Metadata])

I have restructured the pyRdfa implementation, and I think the
experience is worth sharing for the discussion on hGRDDL and Dublin Core.
(Unfortunately, the cvs site where I store the code seems to have a
temporary glitch with the public URI, so the code is not updated there
yet. But the RDFa extractor works with that code already.)

What I did was to completely separate the 'core' RDFa processing from
something I called 'transformers'. Each transformer modifies the
incoming DOM tree in place, and the core RDFa processing is then done
on the transformed tree only.
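
A minimal sketch of that separation, not the actual pyRdfa code: it
assumes xml.dom.minidom as the DOM, and the names process(), core_rdfa()
and meta_name_transformer() are hypothetical. The example transformer
follows the <meta> @name rule from the list further down, and
core_rdfa() is only a stand-in that reports which elements carry a
@property.

```python
from xml.dom import minidom


def meta_name_transformer(dom):
    """Copy @name into @property on every <meta> that has no @property yet."""
    for meta in dom.getElementsByTagName("meta"):
        if meta.hasAttribute("name") and not meta.hasAttribute("property"):
            meta.setAttribute("property", meta.getAttribute("name"))


def core_rdfa(dom):
    """Stand-in for the core step: list (tag name, @property) pairs."""
    return [(el.tagName, el.getAttribute("property"))
            for el in dom.getElementsByTagName("*")
            if el.hasAttribute("property")]


def process(source, transformers):
    """Parse the source, run each transformer on the DOM, then the core step."""
    dom = minidom.parseString(source)
    for transform in transformers:
        transform(dom)  # each transformer changes the tree in place
    return core_rdfa(dom)
```

The core step never sees @name; it only sees the @property that the
transformer added, so legacy conventions stay out of the core.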

Here are the transformers that I added (with a description, in DOM
terms, of what they do):

1. handling @name in <meta> (add a @property with the value of @name)
2. handling the predefined rels and properties of XHTML (add an
@xmlns:xhtml to the <html> element, and change all relevant attributes
by adding the xhtml: prefix)
3. handle the <ul> and <ol> to generate rdf collections and containers
(I have described that change before; this is slightly more complex
because it requires a 'postprocessing' step on the graph, too)
4. handle the Dublin Core approach (add an xmlns to the <head> element
for each @rel="schema.XXX" in a <link> and change all XXX.YYY to
XXX:YYY in the attributes within the <head>)
5. handle the OpenID entries (add a
xmlns:openid="http://xmlns.openid.net/auth#" to the <head> and replace
all openid.XXX with openid:XXX).
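
As an illustration of entry 4, here is a rough sketch of a Dublin Core
transformer, again assuming xml.dom.minidom. The function name and the
choice of attributes to rewrite (@rel, @rev, @name, @property) are
assumptions, not something taken from pyRdfa. It keeps the prefix
exactly as written in the schema.XXX declaration.

```python
from xml.dom import minidom

# Assumption: these are the attributes that may carry XXX.YYY tokens.
ATTRS = ("rel", "rev", "name", "property")


def dublin_core_transformer(dom):
    """Declare a namespace per schema.XXX link and rewrite XXX.YYY to XXX:YYY."""
    for head in dom.getElementsByTagName("head"):
        # Collect the prefixes declared via <link rel="schema.XXX" href="...">.
        prefixes = {}
        for link in head.getElementsByTagName("link"):
            rel = link.getAttribute("rel")
            if rel.startswith("schema."):
                prefixes[rel[len("schema."):]] = link.getAttribute("href")
        # Add one xmlns declaration per collected prefix to the <head>.
        for prefix, uri in prefixes.items():
            head.setAttribute("xmlns:" + prefix, uri)
        # Rewrite only tokens whose prefix was declared; leave the rest alone.
        for el in head.getElementsByTagName("*"):
            for attr in ATTRS:
                if not el.hasAttribute(attr):
                    continue
                tokens = []
                for token in el.getAttribute(attr).split():
                    prefix, dot, local = token.partition(".")
                    if dot and prefix in prefixes:
                        token = prefix + ":" + local
                    tokens.append(token)
                el.setAttribute(attr, " ".join(tokens))
```

The schema.XXX declarations themselves are left untouched, because
"schema" is never one of the declared prefixes.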

Entries 1, 2, and 3 are standard, and required by RDFa (I am not sure we
have finally decided on #3, but I believe we should). #4 and #5 are
optional.

So yes, this is *almost* the same as hGRDDL except that it is simulated
within pyRdfa. The question is how to define this in general so that an
implementation can 'follow its nose' as we like to say these days.

Ben, you say:

>> ======
>> <html xmlns="http://www.w3.org/1999/xhtml">
>>   <head profile="http://dublincore.org/documents/2007/07/27/dc-html/">
>>     <title>Services to Government</title>
>>     <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
>>     <link rel="DCTERMS.subject" href="http://example.org/topics/archives" />
>>   </head>
>> ======
>>
>> The dc-html profile should specify both a GRDDL and an hGRDDL transform.

Do we expect an RDFa implementation to implement the general (h)GRDDL
mechanism for all possible profiles? Or would we have a mechanism
whereby the implementation can disclose which profiles it understands
and which it does not? I must admit, I am a bit concerned about the
first alternative (I am not sure how I would implement it in full
generality without, essentially, re-implementing a full GRDDL service!)
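
To make the second alternative concrete, one possible shape is a fixed
registry of profile URIs mapped to built-in transformers, with the
unrecognized profiles reported back to the caller. This is only a
sketch under assumptions: KNOWN_PROFILES, apply_profiles() and the
placeholder transformer are hypothetical names, and only the dc-html
profile URI comes from the example above.

```python
from xml.dom import minidom


def placeholder_transformer(dom):
    """Stands in for a real transformer; it leaves the tree unchanged."""


# Profiles this (hypothetical) implementation understands.
KNOWN_PROFILES = {
    "http://dublincore.org/documents/2007/07/27/dc-html/": placeholder_transformer,
}


def apply_profiles(dom, registry=KNOWN_PROFILES):
    """Run the transformer of each known profile found in <head profile="...">.

    Returns two lists: the profiles that were applied and the ones that
    were not understood, so the caller can disclose both.
    """
    applied, unknown = [], []
    for head in dom.getElementsByTagName("head"):
        for profile in head.getAttribute("profile").split():
            transform = registry.get(profile)
            if transform is None:
                unknown.append(profile)
            else:
                transform(dom)
                applied.append(profile)
    return applied, unknown
```

Nothing is dereferenced or fetched here, which is what keeps this
short of a full GRDDL service.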

Ivan




Ivan Herman wrote:
> Hi Ben,
> 
> from an abstract, specification point of view I like the approach. Yes,
> it makes the core RDFa approach much cleaner and that is a great plus.
> 
> It does raise some questions on the implementation side, though. The
> problem is that GRDDL transforms are usually XSLT based, and I would
> expect that so would hGRDDL. I am not sure how this would/should work
> well with an implementation like, for example, pyRdfa.
> 
> I am back at home for a week (go to Singapore coming week end) but I
> might have some time to experiment with some architecture on pyRdfa that
> would be based on such pre-processor transformations first. But it looks
> pretty complicated to use pyRdfa to spawn off an arbitrary XSLT
> transform and pick up the results easily. Of course, it is easier to
> write transformations in Python and have an architecture that makes such
> transformations 'inside' (I will definitely do that to experiment with).
> Which means that one would have to re-implement those transformations in
> Python.
> 
> Can you point me to your hCard and hCalendar transforms, so that I can
> re-simulate them?
> 
> Ivan
> 
> Ben Adida wrote:
>> Ivan wrote, regarding the CURIE spec:
>>> However. We clearly would need proper TF resolution on these issues.
>> We have agreed (though not quite resolved) to include a complete CURIE
>> description in the RDFa Syntax document. See [1].
>>
>> Mark wrote:
>>> All of this could be done in a preprocessing step which might just be
>>> Ben's hGRDDL proposal from before.
>> Exactly, and in fact *all* of this can and should be handled with
>> hGRDDL. For those who don't know what I'm talking about, the point of
>> the hGRDDL proposal is:
>>
>> GRDDL takes XHTML and outputs RDF/XML
>>
>> while
>>
>> hGRDDL takes XHTML and outputs XHTML+RDFa.
>>
>> So hGRDDL would take XHTML with DC markup, and, using a transform
>> specified by the DC profile, sprinkle in the proper RDFa with the right
>> syntax. For example, DC's example #1 would be processed as follows
>>
>> (note that I moved @profile from the HTML element to HEAD... I think
>> they made a mistake?)
>>
>> ======
>> <html xmlns="http://www.w3.org/1999/xhtml">
>>   <head profile="http://dublincore.org/documents/2007/07/27/dc-html/">
>>     <title>Services to Government</title>
>>     <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
>>     <link rel="DCTERMS.subject" href="http://example.org/topics/archives" />
>>   </head>
>> ======
>>
>> The dc-html profile should specify both a GRDDL and an hGRDDL transform.
>>
>> The hGRDDL transform modifies the DOM as follows:
>>
>> ======
>> <html xmlns="http://www.w3.org/1999/xhtml">
>>   <head xmlns:dcterms="http://purl.org/dc/terms/">
>>     <title>Services to Government</title>
>>     <link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
>>     <link rel="DCTERMS.subject dcterms:subject"
>> href="http://example.org/topics/archives" />
>>   </head>
>> ======
>>
>> (Note specifically the addition of the namespace and the @rel value of
>> dcterms:subject.)
>>
>> I've written up a paper about this which should see the light of day
>> soon, and which I think will be crucial to link other syntaxes with RDFa
>> without making RDFa a hodge-podge of different syntaxes.
>>
>> (And it's worth noting that I've implemented hGRDDL for hCard and
>> hCalendar, and it works surprisingly well with very little code.)
>>
>> In other words, I think we can ignore this for now: we'll have a proper
>> solution by transforming legacy syntaxes, rather than supporting them in
>> RDFa core.
>>
>> -Ben
>>
>>
>> [1] http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Jul/0116
>>
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Thursday, 23 August 2007 11:38:34 UTC