Re: RDF-EASE: RDFa in CSS from Dan Brickley on 2009-01-02 (semantic-web@w3.org from January 2009)

From: Dan Brickley <danbri@danbri.org>
Date: Fri, 02 Jan 2009 13:43:02 +0100
To: Manos Batsis <manos_lists@geekologue.com>
Cc: Toby A Inkster <tai@g5n.co.uk>, Ivan Herman <ivan@w3.org>, Giovanni Tummarello <g.tummarello@gmail.com>, Semantic Web <semantic-web@w3.org>, Benjamin Nowack <bnowack@semsol.com>
Message-ID: <495E0BD6.9000802@danbri.org>

On 2/1/09 12:29, Manos Batsis wrote:
> Toby A Inkster wrote:
>>
>> Dan Brickley wrote:
>>
>>> Another thought (hmm maybe I mentioned this before) - does the idea
>>> of an RDF-EASE-to-XSLT convertor make sense, so that EASE could
>>> effectively serve as an authoring tool for GRDDL XSLT documents?
>>
>>
>> I think you might have suggested this to me before. My XSLT skills
>> would be nowhere near adequate for such a task - wouldn't even know
>> where to start. But it certainly seems an interesting idea.
>
> Would be happy to work on a java-based RDF-EASE-to-XSLT converter; it
> would be very easy for me to do the XSLT generation if someone could
> help with the CSS to objects parsing step.

That would be great! Can anyone here help with the CSS-to-objects piece? 
(Toby? others?).

Manos, if you're happy in Perl maybe you could build something based on 
http://buzzword.org.uk/2008/rdf-ease/implementation.pl ? If not, at 
least it gives an idea for using CSS APIs, and some algorithms to look at.
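
As a rough illustration of the CSS-to-objects step (this is not taken from
Toby's Perl code), here is a minimal Python sketch that turns simple
"selector { property: value; }" blocks into objects. The names Rule and
parse_rules are made up for this example. It deliberately ignores @-rules,
nested blocks and semicolons inside quoted values, all of which a real CSS
parser would need to handle.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One parsed block: a selector plus its property/value declarations."""
    selector: str
    declarations: dict = field(default_factory=dict)


# Matches "selector { body }" where neither part contains braces.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_rules(css_text):
    """Parse flat CSS-style rule blocks into a list of Rule objects."""
    # Drop /* ... */ comments before matching.
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.S)
    rules = []
    for selector, body in RULE_RE.findall(css_text):
        declarations = {}
        for decl in body.split(";"):
            if ":" not in decl:
                continue  # skip empty fragments
            # Split on the first colon only, so values like "foaf:name" survive.
            name, value = decl.split(":", 1)
            declarations[name.strip()] = value.strip().strip('"')
        rules.append(Rule(selector.strip(), declarations))
    return rules


# Illustrative input; check property names against the RDF-EASE spec.
for rule in parse_rules('.fn { -rdf-property: "foaf:name"; }'):
    print(rule.selector, rule.declarations)
```

The resulting Rule objects are the sort of thing an XSLT generator could
consume, whatever language it ends up being written in.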

BTW, thinking about RDF-EASE a bit more, I've been wondering whether it 
could serve as a nice abstraction layer for writing screenscrapers in 
general. The sales pitch in 
http://buzzword.org.uk/2008/rdf-ease/spec#sec-intro follows GRDDL's 
emphasis on self-describing documents: "CSS is an external file that 
specifies how your document should look; RDF-EASE is an external file 
that specifies what your document means." I suspect there are probably 
significant use cases where the RDF-EASE document is an annotation 
against someone else's content structures. I'm thinking of something 
like a declarative version of Greasemonkey, where people write add-ons 
for other sites that can be used either client-side or in search 
indexers, proxies etc., to map from 'plain but mysterious' HTML into 
more explicit data structures.
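
To make the "declarative screenscraper" idea a bit more concrete, here is a
small Python sketch, assuming a much simpler rule language than RDF-EASE
itself: a plain mapping from class selectors to property names. The class
names in RULES, the ClassScraper class and the scrape function are all
hypothetical. Only ".classname" selectors are supported, and elements that
never get an end tag are not handled.

```python
from html.parser import HTMLParser

# Hypothetical mapping: class selector -> RDF property (as a CURIE).
RULES = {".artist-name": "foaf:name", ".bio": "dc:description"}


class ClassScraper(HTMLParser):
    """Collect (property, text) pairs for elements whose class matches a rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = {sel.lstrip("."): prop for sel, prop in rules.items()}
        self.open = []   # active captures: [tag, property, depth, text parts]
        self.pairs = []  # finished (property, text) results

    def handle_starttag(self, tag, attrs):
        # Track nesting of same-named tags inside each active capture.
        for cap in self.open:
            if cap[0] == tag:
                cap[2] += 1
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.rules:
                self.open.append([tag, self.rules[cls], 1, []])

    def handle_data(self, data):
        for cap in self.open:
            cap[3].append(data)

    def handle_endtag(self, tag):
        for cap in list(self.open):
            if cap[0] == tag:
                cap[2] -= 1
                if cap[2] == 0:
                    self.open.remove(cap)
                    text = " ".join("".join(cap[3]).split())
                    self.pairs.append((cap[1], text))


def scrape(html, rules=RULES):
    scraper = ClassScraper(rules)
    scraper.feed(html)
    return scraper.pairs


print(scrape('<h1 class="artist-name">The Rumble Strips</h1>'))
```

The point is only that the site-specific knowledge lives in the RULES table
rather than in procedural code, so the same table could in principle be used
client-side, in a proxy, or in an indexer.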

For example, one might write an RDF-EASE thingy for 
http://www.last.fm/music/The+Rumble+Strips or 
http://www.myspace.com/rumblestripsuk that pulled data into the same RDF 
idioms used at 
http://www.bbc.co.uk/music/artists/0ca53fff-3b07-49eb-bcb9-bbe84f1ec768 
(which already btw has an RDF version, 
http://www.bbc.co.uk/music/artists/0ca53fff-3b07-49eb-bcb9-bbe84f1ec768.rdf 
). For that matter it would be interesting to see how much of 
http://dbpedia.org/page/The_Rumble_Strips could be reconstructed from 
applying RDF-EASE to http://en.wikipedia.org/wiki/The_Rumble_Strips - I 
know the DBpedia folks do quite a lot of tidying, but I'm not sure how much 
is on the page-to-page level, versus larger contextual structures.

Does anyone here care to try writing RDF-EASE for Myspace, Last.fm, BBC 
Music or Wikipedia music entries? If that turns out well, the next piece 
of the puzzle is discovery/applicability: figuring out which RDF-EASE 
documents are usefully applied to which pages. POWDER is somehow 
relevant here (since it has vocabulary for associating URI patterns with 
RDF labels), but alone probably wouldn't be enough for something like 
Wikipedia, since we'd want different RDF-EASE views of different parts 
of the site. Not to mention that different apps might care about 
generating different flavours of RDF.
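
A very rough sketch of the discovery problem, in Python, assuming nothing
more than a hand-maintained registry that maps URI patterns to RDF-EASE
documents. This is not POWDER, and the example.org URLs, the "flavour"
labels and the applicable_documents function are all invented for
illustration.

```python
import re

# Hypothetical registry: (URI pattern, flavour label, RDF-EASE document URL).
REGISTRY = [
    (re.compile(r"^http://www\.last\.fm/music/[^/]+$"), "music",
     "http://example.org/ease/lastfm-artist.css"),
    (re.compile(r"^http://www\.myspace\.com/[^/]+$"), "music",
     "http://example.org/ease/myspace-band.css"),
    (re.compile(r"^http://en\.wikipedia\.org/wiki/.+$"), "music",
     "http://example.org/ease/wikipedia-band.css"),
    (re.compile(r"^http://en\.wikipedia\.org/wiki/.+$"), "geo",
     "http://example.org/ease/wikipedia-place.css"),
]


def applicable_documents(page_uri, flavour=None):
    """Return RDF-EASE documents whose pattern matches page_uri.

    If flavour is given, keep only entries registered under that label.
    """
    return [
        doc
        for pattern, label, doc in REGISTRY
        if pattern.match(page_uri) and (flavour is None or label == flavour)
    ]


print(applicable_documents("http://en.wikipedia.org/wiki/The_Rumble_Strips"))
```

Note that both Wikipedia entries match every article URI, which is exactly
the limitation mentioned above: a URI pattern alone cannot tell a band page
from a place page, so something beyond pattern matching would be needed.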

Anyway, having an XSLT generator would be great, I think, since there are 
already a few GRDDL interpreters around now. Ultimately a pure RDF-EASE 
parser would probably be faster (but who knows). For now, generating 
XSLT for GRDDL seems a nice way of getting compatibility with tools such 
as Redland, Jena etc.
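
For what it's worth, here is a hedged Python sketch of what the generation
step might look like for the simplest possible case: rules that map a single
class name to a single property. The functions class_to_pattern and
build_stylesheet are hypothetical, real RDF-EASE selectors and properties are
much richer than this, and the output has only been checked for
well-formedness, not run through a GRDDL processor.

```python
from xml.sax.saxutils import quoteattr

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def class_to_pattern(class_name):
    """XPath 1.0 idiom for matching one token of a space-separated class."""
    return ("*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]"
            % class_name)


def build_stylesheet(rules, namespaces):
    """Build XSLT 1.0 text from {class name: property QName} rules.

    namespaces maps each prefix used in the property QNames to its URI.
    """
    ns_decls = "".join(
        " xmlns:%s=%s" % (prefix, quoteattr(uri))
        for prefix, uri in namespaces.items()
    )
    # One template per rule: emit the property with the element's text.
    templates = "".join(
        "  <xsl:template match=%s>\n"
        "    <%s><xsl:value-of select=\"normalize-space(.)\"/></%s>\n"
        "  </xsl:template>\n" % (quoteattr(class_to_pattern(cls)), prop, prop)
        for cls, prop in rules.items()
    )
    return (
        '<xsl:stylesheet version="1.0" xmlns:xsl="%s" xmlns:rdf="%s"%s>\n'
        '  <xsl:template match="/">\n'
        '    <rdf:RDF><rdf:Description rdf:about="">\n'
        '      <xsl:apply-templates/>\n'
        '    </rdf:Description></rdf:RDF>\n'
        '  </xsl:template>\n'
        '  <!-- Suppress text that no rule claims. -->\n'
        '  <xsl:template match="text()"/>\n'
        '%s'
        '</xsl:stylesheet>\n' % (XSL_NS, RDF_NS, ns_decls, templates)
    )


print(build_stylesheet({"fn": "foaf:name"},
                       {"foaf": "http://xmlns.com/foaf/0.1/"}))
```

Every matched element becomes a property of the document itself
(rdf:about=""), which is a simplification; subjects, types and nested
resources would all need more thought.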

cheers,

Dan

--
http://danbri.org/

Received on Friday, 2 January 2009 12:43:47 UTC