- From: Toby A Inkster <usenet200801@tobyinkster.co.uk>
- Date: Sat, 1 Mar 2008 13:30:24 +0000
- To: semantic-web@w3.org
Hello,

I'm (planning on) writing a browser (reusing a rendering engine though!) with a focus on metadata. I've not started the GUI yet, but have so far written a parser to extract semantics from (X)HTML pages, and I'd appreciate feedback on it.

It currently supports:

* RDFa
* eRDF
* <title> element
* <meta> element
* <link rel> / <a rel> / <link rev> / <a rev>
* The "role" attribute
* Several microformats
* Document structure (headings, etc)

This data is parsed into an RDF-like data structure and can be dumped out in RDF, or as a Perl object dump.

I'd appreciate examples where it fails. I'm aware that it occasionally fails due to encoding and/or entity problems -- I'd prefer examples where it simply fails to find some piece of metadata. I'd also like to know of any places where you think the RDF output could be improved.

Thanks in advance for your feedback,

-- 
Toby A Inkster BSc (Hons) ARCS
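
[Illustrative sketch, not part of the original message.] The parser described above is not shown in the post and is written in Perl. To make the idea of turning page metadata into an RDF-like structure concrete, here is a minimal Python sketch covering only three of the listed sources: <title>, <meta name>, and rel/rev on <link> and <a>. The class name, the function name, and the predicate labels ("title", "meta:...", "rel:...") are all assumptions made for this illustration. They are not the vocabulary or API of the actual tool.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect (subject, predicate, object) triples from a few HTML metadata sources.

    Hypothetical illustration only: predicate labels are made up, and
    relative URLs are left unresolved.
    """

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri      # the page itself is the default subject
        self.triples = []
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            # <meta name="x" content="y">  ->  (page, meta:x, y)
            self.triples.append((self.base_uri, "meta:" + attrs["name"], attrs["content"]))
        elif tag in ("link", "a") and attrs.get("href"):
            # rel: the page is the subject, the link target is the object
            for rel in (attrs.get("rel") or "").split():
                self.triples.append((self.base_uri, "rel:" + rel, attrs["href"]))
            # rev: the same relation read in the reverse direction
            for rev in (attrs.get("rev") or "").split():
                self.triples.append((attrs["href"], "rel:" + rev, self.base_uri))

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            title = "".join(self._title_parts).strip()
            self.triples.append((self.base_uri, "title", title))

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def extract_triples(html, base_uri):
    """Parse an HTML string and return the list of collected triples."""
    parser = MetadataExtractor(base_uri)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Calling extract_triples(page_source, "http://example.org/") returns a plain list of 3-tuples. Note that a rev value is stored as the same relation with subject and object swapped, which is one common way to read rev, and that a real implementation would also need to resolve relative URLs and handle RDFa, eRDF, "role", and microformats.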
Received on Saturday, 1 March 2008 15:50:19 UTC