
Re: Cascading hGRDDL (was Re: hGRRDL for hCard: test case of a GRDDL transformation to produce RDFa from hCards)

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Fri, 20 Apr 2007 14:07:40 +0100
Message-ID: <4628BB1C.4000809@hpl.hp.com>
To: Fabien Gandon <Fabien.Gandon@sophia.inria.fr>
CC: GRDDL Working Group <public-grddl-wg@w3.org>, public-rdf-in-xhtml-tf@w3.org


I've checked these files into the test area, but I am sceptical that 
this is the right way to do it.

The problem that I see is as follows:


Initial doc:

link: hCal2RDFa
link: hCard2RDFa

The GRDDL processor sees this and applies both transforms, following two 
different computation routes:

Route 1 hCal2RDFa
becomes
link: RDFa2RDFXML
link: hCard2RDFa


Route 2 hCard2RDFa
becomes
link: hCal2RDFa
link: RDFa2RDFXML

For each, the MIME type is text/html, so we repeat, getting:

Route 1.1 hCal2RDFa/RDFa2RDFXML
becomes
hCal.rdf

Route 1.2 hCal2RDFa/hCard2RDFa
becomes
link: RDFa2RDFXML
link: RDFa2RDFXML


Route 2.1 hCard2RDFa/hCal2RDFa
becomes
link: RDFa2RDFXML
link: RDFa2RDFXML


Route 2.2 hCard2RDFa/RDFa2RDFXML
becomes
hCard.rdf

We then need to repeat 1.2 and 2.1 to get

1.2.1 hCal2RDFa/hCard2RDFa/RDFa2RDFXML
1.2.2 hCal2RDFa/hCard2RDFa/RDFa2RDFXML
2.1.1 hCard2RDFa/hCal2RDFa/RDFa2RDFXML
2.1.2 hCard2RDFa/hCal2RDFa/RDFa2RDFXML

giving us four copies of
hCalhCard.rdf

So, in the end we've computed 6 GRDDL results, 4 of which are hopefully 
the same, and the other two together should presumably amount to one of 
the other four.
So we get each triple 5 times over.
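
The counting above can be reproduced with a small simulation. This is only 
a sketch under stated assumptions: documents are modelled as a list of 
pending transformation links plus the set of microformats already 
converted, each hGRDDL transform is assumed to replace its own link with a 
link to RDFa2RDFXML, and the function names (apply_transform, grddl) are 
hypothetical, not part of any GRDDL implementation.

```python
from collections import Counter

# Assumed model: each hGRDDL transform converts one microformat to RDFa.
MICROFORMAT_TRANSFORMS = {"hCal2RDFa": "hCal", "hCard2RDFa": "hCard"}
FINAL = "RDFa2RDFXML"


def apply_transform(transform, links, converted):
    """Apply one transform to a modelled document.

    Returns ("rdf", result_name) for the final step, or
    ("html", new_links, new_converted) for an hGRDDL step.
    """
    if transform == FINAL:
        # Only microformats already converted to RDFa end up in the RDF.
        return ("rdf", "".join(sorted(converted)) + ".rdf")
    remaining = list(links)
    remaining.remove(transform)
    # Assumption: the transform swaps its own link for one to RDFa2RDFXML.
    new_links = remaining + [FINAL]
    return ("html", new_links, converted | {MICROFORMAT_TRANSFORMS[transform]})


def grddl(links, converted=frozenset(), route=()):
    """Follow every link down its own route, recursing on HTML output."""
    results = []
    for transform in links:
        out = apply_transform(transform, links, converted)
        if out[0] == "rdf":
            results.append((route + (transform,), out[1]))
        else:
            _, new_links, new_converted = out
            results.extend(grddl(new_links, new_converted, route + (transform,)))
    return results


results = grddl(["hCal2RDFa", "hCard2RDFa"])
for route, name in results:
    print("/".join(route), "->", name)
# Tally how many times each RDF result is produced.
print(Counter(name for _, name in results))
```

Under this model the processor produces six results: four copies of 
hCalhCard.rdf plus one each of hCal.rdf and hCard.rdf.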

And if we have blank nodes in the output, we have multiple copies that 
do not get merged away. (Deleting the multiple copies is fairly hard, 
equivalent to the subgraph isomorphism problem, and is not required of a 
GRDDL processor.)
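
To illustrate why the duplicates survive, here is a minimal sketch of an 
RDF-style merge, assuming triples are plain tuples of strings and blank 
nodes are any term starting with "_:". The merge keeps blank nodes from 
different results apart by renaming them per source graph. The triples and 
the merge function are made up for illustration only.

```python
def merge(graphs):
    """Merge graphs into one set of triples, renaming blank nodes per graph."""
    merged = set()
    for index, graph in enumerate(graphs):
        for triple in graph:
            renamed = tuple(
                f"{term}_g{index}" if term.startswith("_:") else term
                for term in triple
            )
            merged.add(renamed)
    return merged


# Hypothetical data: one ground triple, and two triples sharing a blank node.
ground = {("ex:event", "ex:summary", "Example summary")}
with_bnode = {
    ("ex:event", "ex:organizer", "_:b0"),
    ("_:b0", "ex:name", "Example name"),
}

copies = 5  # each triple arrives 5 times over, as counted above
ground_merged = merge([ground] * copies)
bnode_merged = merge([with_bnode] * copies)

print(len(ground_merged))  # identical ground triples collapse in the set
print(len(bnode_merged))   # blank-node triples stay distinct per copy
```

Ground triples collapse to a single copy because they are identical as set 
members, whereas the two blank-node triples remain as five separate copies 
each, ten triples in all. Removing them would need a graph-matching step 
that this sketch does not attempt.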


I think it would be better to have a single transform that somehow 
pipelined the two together. I suspect this is quite easy in XSLT 2.0 and 
difficult in XSLT 1.0.

It may be possible to have a single transform, let's call it pipeline, 
that rewrites the link headers to execute one after the other e.g.

initial doc:

link rel="transformation" href="pipeline"

output

link rel="pipeline transformation" href="mf1"
link rel="pipeline" href="mf2 mf3 mf4"

GRDDL applies mf1 for first microformat,
and gets html output

link rel="pipeline transformation" href="pipeline"
link rel="pipeline" href="mf2 mf3 mf4"

(with mf1 knowing to hand back to pipeline rather than RDFa2RDFXML, 
perhaps keyed by the pipeline word in the rel attribute)

GRDDL applies pipeline which sees the current state and rewrites the 
link headers as

link rel="pipeline transformation" href="mf2"
link rel="pipeline" href="mf3 mf4"

etc.

at the end we have

link rel="pipeline transformation" href="pipeline"
link rel="pipeline" href=""

and GRDDL calls pipeline, which, seeing that the pipeline is empty, simply 
invokes RDFa2RDFXML by rewriting headers as

link rel="transformation" href="RDFa2RDFXML"
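
The header rewriting above behaves like a small state machine, sketched 
below. This is an assumption-laden model rather than a GRDDL 
implementation: the queue stands in for the rel="pipeline" link, 
current stands in for the rel="transformation" link, the initial list of 
microformat transforms is passed in as a parameter (the message does not 
say where pipeline gets it), and the function names are hypothetical.

```python
FINAL = "RDFa2RDFXML"


def pipeline(queue):
    """Model of the 'pipeline' transform rewriting the link headers.

    Returns (href of the next transformation, remaining queue).
    """
    if queue:
        # Move the first queued transform into the transformation link.
        return queue[0], queue[1:]
    # Empty queue: hand over to the final RDFa-to-RDF/XML transform.
    return FINAL, []


def run(microformats):
    """Return the transforms applied, in order, along the single route."""
    applied = []
    current, queue = "pipeline", list(microformats)
    while True:
        applied.append(current)
        if current == FINAL:
            return applied
        if current == "pipeline":
            current, queue = pipeline(queue)
        else:
            # A microformat transform hands back to pipeline, not RDFa2RDFXML.
            current = "pipeline"


trace = run(["mf1", "mf2", "mf3", "mf4"])
print(" -> ".join(trace))
```

In this model there is only ever one transformation link at a time, so the 
processor follows one route and produces one RDF result, at the cost of an 
extra pipeline step before each transform.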

I'm unclear as to the value of this though.
It seems like quite a lot of work, in order to have an intermediate step 
that conforms with RDFa; admittedly that is publishable, but only if the 
GRDDL processor knows to stop and give you this intermediate file.

Jeremy

Fabien Gandon wrote:
> 
> Jeremy Carroll:
>> (...) Then, a GRDDL processor that dispatches on the media-type of the 
>> XSLT output; on seeing HTML output, may choose to invoke itself 
>> (again) to turn the HTML into RDF, as you desire.
>> In my view this could be an informative test, since it pushes the spec 
>> to its limit, invoking a recursion that was neither explicitly 
>> permitted nor explicitly excluded.
> Yes, and now that we have two hGRDDL transformations (one for hCard and 
> one for hCalendar) I built a case with three GRDDL transforms in a row ;-)
> 
> 1, I have a source with both hCalendar and hCard data:
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCardhCalendar.html 
> 
> 
> 2, I call the hGRDDL hCard transformation to augment the hCard with its 
> RDFa equivalent:
> transformation: 
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCard2RDFa.xsl 
> 
> result: 
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCardhCalendar_RDFa_hCardOnly.html 
> 
> 
> 3, on this result, I call the hGRDDL hCalendar transformation to augment 
> the hCalendar with its RDFa equivalent:
> transformation: 
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCalendar2RDFa.xsl 
> 
> result: 
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCardhCalendar_RDFa_hCardhCalendar.html 
> 
> 
> 4, on this last result I call the GRDDL transformation to get the RDF/XML:
> transformation: http://www-sop.inria.fr/acacia/soft/RDFa2RDFXML_v_0_7.xsl
> result: 
> http://www-sop.inria.fr/acacia/personnel/Fabien.Gandon/tmp/grddl/hGRDDL/hCardhCalendarRDFXML.rdf 
> 
> 
> So three transformations
> [XHTML] --(hCard)--> [XHTML+RDFa] --(hCalendar)--> [XHTML+RDFa*2] 
> --(RDFa2RDFXML) --> [RDF/XML]
> 
> Cheers,
> 

-- 
Hewlett-Packard Limited
registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Friday, 20 April 2007 13:10:32 GMT
