- From: Dan Brickley <danbri@danbri.org>
- Date: Fri, 02 Jan 2009 13:43:02 +0100
- To: Manos Batsis <manos_lists@geekologue.com>
- Cc: Toby A Inkster <tai@g5n.co.uk>, Ivan Herman <ivan@w3.org>, Giovanni Tummarello <g.tummarello@gmail.com>, Semantic Web <semantic-web@w3.org>, Benjamin Nowack <bnowack@semsol.com>
On 2/1/09 12:29, Manos Batsis wrote:
> Toby A Inkster wrote:
>>
>> Dan Brickley wrote:
>>
>>> Another thought (hmm maybe I mentioned this before) - does the idea
>>> of an RDF-EASE-to-XSLT convertor make sense, so that EASE could
>>> effectively serve as an authoring tool for GRDDL XSLT documents?
>>
>>
>> I think you might have suggested this to me before. My XSLT skills
>> would be nowhere near adequate for such a task - wouldn't even know
>> where to start. But it certainly seems an interesting idea.
>
> Would be happy to work on a java-based RDF-EASE-to-XSLT converter; it
> would be very easy for me to do the XSLT generation if someone could
> help with the CSS to objects parsing step.

That would be great! Can anyone here help with the CSS-to-objects piece? (Toby? others?).

Manos, if you're happy in Perl maybe you could build something based on http://buzzword.org.uk/2008/rdf-ease/implementation.pl ? If not, at least it gives an idea for using CSS APIs, and some algorithms to look at.

BTW, thinking about RDF-EASE a bit more, I've been wondering whether it could serve as a nice abstraction layer for writing screenscrapers in general. The sales pitch in http://buzzword.org.uk/2008/rdf-ease/spec#sec-intro follows GRDDL's emphasis on self-describing documents: "CSS is an external file that specifies how your document should look; RDF-EASE is an external file that specifies what your document means.". I suspect there are probably significant use cases where the RDF-EASE document is an annotation against someone else's content structures. I'm thinking of something like a declarative version of 'greasemonkey', where people write addons for other sites that can be used either client-side or in search indexers, proxies etc., to map from 'plain but mysterious' HTML into more explicit data structures.
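To make the "declarative scraper" idea a little more concrete, here is a minimal Python sketch. It is not RDF-EASE and does not use RDF-EASE syntax: the rule table, the class names (`band-name`, `band-bio`), and the names `RULES`, `ClassScraper` and `scrape` are all made up for illustration. It also matches on class names only rather than full CSS selectors, and assumes reasonably well-nested markup. The point is just that the mapping from markup to properties lives in a small external table, separate from the page being annotated.

```python
from html.parser import HTMLParser

# Hypothetical rule table: HTML class name -> RDF predicate URI.
# This is NOT RDF-EASE syntax, only the general shape of the idea.
RULES = {
    "band-name": "http://xmlns.com/foaf/0.1/name",
    "band-bio": "http://purl.org/dc/terms/description",
}

# Elements that never get an end tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassScraper(HTMLParser):
    """Emit (subject, predicate, text) triples for elements whose class matches a rule."""

    def __init__(self, subject, rules):
        super().__init__()
        self.subject = subject
        self.rules = rules
        self.triples = []
        self._open = []  # stack of (predicate or None, collected text parts)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        predicate = next((self.rules[c] for c in classes if c in self.rules), None)
        self._open.append((predicate, []))

    def handle_data(self, data):
        # Text belongs to every open element that matched a rule.
        for predicate, parts in self._open:
            if predicate is not None:
                parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        predicate, parts = self._open.pop()
        if predicate is not None:
            text = " ".join("".join(parts).split())  # normalise whitespace
            self.triples.append((self.subject, predicate, text))


def scrape(html, subject, rules=RULES):
    """Apply the rule table to an HTML string and return a list of triples."""
    parser = ClassScraper(subject, rules)
    parser.feed(html)
    parser.close()
    return parser.triples
```

The same rule table could in principle be applied client-side, in a proxy, or in an indexer, since nothing in it depends on the page's author having published it.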

For example, one might write an RDF-EASE thingy for http://www.last.fm/music/The+Rumble+Strips or http://www.myspace.com/rumblestripsuk that pulled data into the same RDF idioms used at http://www.bbc.co.uk/music/artists/0ca53fff-3b07-49eb-bcb9-bbe84f1ec768 (which already btw has an RDF version, http://www.bbc.co.uk/music/artists/0ca53fff-3b07-49eb-bcb9-bbe84f1ec768.rdf ). For that matter it would be interesting to see how much of http://dbpedia.org/page/The_Rumble_Strips could be reconstructed from applying RDF-EASE to http://en.wikipedia.org/wiki/The_Rumble_Strips - I know the DBPedia folks do quite a lot of tidying, but not sure how much is on the page-to-page level, versus larger contextual structures.

Does anyone here care to try writing RDF-EASE for Myspace, Last.fm, BBC Music or Wikipedia music entries?

If that turns out well, the next piece of the puzzle is discovery/applicability: figuring out which RDF-EASE documents are usefully applied to which pages. POWDER is somehow relevant here (since it has vocabulary for associating URI patterns with RDF labels), but alone probably wouldn't be enough for something like Wikipedia, since we'd want different RDF-EASE views of different parts of the site. Not to mention that different apps might care about generating different flavours of RDF.

Anyway, having an XSLT-generator would be great I think, since there are already a few GRDDL interpreters around now. Ultimately a pure RDF-EASE parser would probably be faster (but who knows). For now, generating XSLT for GRDDL seems a nice way of getting compatibility from tools such as Redland, Jena etc.

cheers,

Dan

--
http://danbri.org/
Received on Friday, 2 January 2009 12:43:47 UTC