Dublin Core HTML->RDF (more semantic screen scraping with XSLT)

Share and enjoy...


Dublin Core Extraction Service

XSL file: 

XML data: 

How does it work?

The form invokes a generic XSLT service that takes

an XSLT transformation
     The default transformation for this form, dc-extract.xsl, converts
     from the format given in "Encoding Dublin Core Metadata in HTML"
     (December 1999, by J. Kunze) to RDF.
some XML data
     Try the tidy service if you have HTML that isn't well-formed.

     For example, the ADAM page isn't well-formed (i.e. it isn't
     XHTML), but the result of running the ADAM page through tidy is.

and returns the result.
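
To make the mapping concrete, here is a rough sketch of the kind of
conversion described above. It is not dc-extract.xsl and it is not XSLT:
it is a Python stand-in using only the standard library, and it assumes
well-formed XHTML input (e.g. tidy output). The function name dc_to_rdf,
the sample page, and the choice to emit a single rdf:Description are all
assumptions made for illustration; the real stylesheet may structure its
output differently.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical input: a well-formed XHTML page using the <link>/<meta>
# convention for embedding Dublin Core metadata.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example page</title>
    <link rel="schema.DC" href="http://purl.org/dc/elements/1.1/"/>
    <meta name="DC.title" content="Example page"/>
    <meta name="DC.creator" content="A. N. Author"/>
    <meta name="keywords" content="ignored, not Dublin Core"/>
  </head>
  <body/>
</html>"""


def dc_to_rdf(xhtml_text, about=""):
    """Return an rdf:RDF element describing `about` with the page's DC metadata."""
    rdf = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about})

    head = ET.fromstring(xhtml_text).find(f"{{{XHTML}}}head")
    if head is None:
        return rdf  # no <head>, so no metadata to extract

    # <link rel="schema.PREFIX" href="NAMESPACE"> declares each metadata prefix.
    schemas = {}
    for link in head.findall(f"{{{XHTML}}}link"):
        rel = link.get("rel", "")
        if rel.lower().startswith("schema.") and link.get("href"):
            schemas[rel[len("schema."):]] = link.get("href")

    # <meta name="PREFIX.element" content="VALUE"> becomes one property each;
    # names without a declared prefix (like "keywords") are skipped.
    for meta in head.findall(f"{{{XHTML}}}meta"):
        prefix, dot, element = meta.get("name", "").partition(".")
        if dot and prefix in schemas and element:
            prop = ET.SubElement(desc, f"{{{schemas[prefix]}}}{element}")
            prop.text = meta.get("content", "")
    return rdf


if __name__ == "__main__":
    # Register readable prefixes for serialization only.
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("dc", "http://purl.org/dc/elements/1.1/")
    result = dc_to_rdf(SAMPLE, "http://example.org/page")
    print(ET.tostring(result, encoding="unicode"))
```

The point of the sketch is the same as the stylesheet's: once the HTML
is well-formed XML, pulling out the metadata is a simple tree walk over
the link and meta elements in the head.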


I wrote the guts of dc-extract.xsl on my Palm Pilot, over drinks with
Eric Miller and Dan Brickley in Amsterdam after WWW9, in an effort to
show them how easy it is to use XSLT to extract RDF from real-world

Dan Connolly
$Revision: 1.4 $ of $Date: 2000/06/09 18:52:10 $ by $Author: connolly $ 

I copy dc@oclc.org per:

	"Additions, deletions and changes to
	this list are welcomed. Please submit all
	information to dc@oclc.org"

	-- http://purl.org/dc/tools/index.htm

Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Friday, 9 June 2000 14:53:16 UTC