- From: Reece Dunn <msclrhd@googlemail.com>
- Date: Thu, 7 Jul 2011 19:16:39 +0100
- To: public-rdfa-wg@w3.org
Hi,

I am not sure what the best place for this is, but given the recent discussions I thought I'd raise my thoughts.

I am a user (author) of RDFa metadata on a website and am writing a C++ application (document reader) that consumes metadata from multiple document types (X/HTML, ePub, ODF, DocBook, etc.) and stores that metadata internally as an RDF graph.

AS AN AUTHOR ...

1 ... I want to express RDF metadata triples in a webpage and take advantage of HTML5 markup.

I am using RDFa because I find the CURIE syntax easier to read and more compact than the Microdata format, especially as I have already been exposed to it in XML and RDF documents. I also don't fully understand how the Microdata format maps to an RDF graph: the algorithm in the Microdata specification for generating RDF is difficult to grok, whereas the RDFa representation is easier to understand. The Microdata format also reads more verbosely, both because the namespaces are written out in full and because of its general approach.

2 ... I want to express basic document information about the page (title, author, etc.) in Dublin Core metadata triples.

But I don't want to mix and match metadata syntaxes (e.g. using the DC schema syntax in the meta element), and I want to keep data repetition to a minimum.

3 ... I want to express Creative Commons attribution and W3C validation links as given.

That is: I don't want to work out how to convert the markup between formats (RDFa <-> Microdata).

4 ... I want to express bibliographical references accurately.

For example:

    <li id="ref1" rel="dct:references">
      <span typeof="foaf:Document" about="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">
        <span property="dc:creator">Dave Beckett</span>,
        <a property="dc:title" href="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">RDF/XML Syntax Specification (Revised)</a>.
        <span rel="dc:publisher" resource="_:W3C"/>
        <span typeof="foaf:Organisation" about="_:W3C">
          <span rel="foaf:homepage" resource="http://www.w3.org"/>
          <span property="foaf:name">World Wide Web Consortium</span>
          (<span property="foaf:name">W3C</span>),
        </span>
        <span property="dc:date" datatype="xsd:date" content="2004-02-10">2004</span>.
      </span>
    </li>

I use the _:W3C blank node so that I can point other dc:publisher elements at that single description:

    <span rel="dc:publisher" resource="_:W3C">World Wide Web Consortium (W3C)</span>,

This means I inherently want to be able to express graphs, even in a single document.

My only gripe here with the RDFa syntax is that I cannot say:

    <span rel="dc:publisher" about="_:W3C" typeof="foaf:Organisation">
      <span rel="foaf:homepage" resource="http://www.w3.org"/>
      <span property="foaf:name">World Wide Web Consortium</span>
      (<span property="foaf:name">W3C</span>),
    </span>

to avoid the empty span tag.

AS AN IMPLEMENTER ...

1 ... I want to process the metadata in a single pass.

This is because I want to keep the implementation simple and efficient.

2 ... I don't want to generate a DOM for the HTML document in order to extract metadata.

This is related to point 1: keeping the implementation efficient (especially when extracting metadata from a large document such as Anna Karenina on Project Gutenberg, 2.1MB).

3 ... I don't want to duplicate the implementation for processing HTML documents.

Due to the nature of HTML documents, I am using a relaxed parser for HTML and am using that parser to handle XHTML documents as well (to avoid having two parsers).

- Reece
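
As a rough illustration of the three implementer points above (single pass, no DOM, one relaxed parser for both HTML and XHTML), here is a minimal sketch. It is written in Python for brevity rather than in the C++ of the application described, and it is not the implementation referred to in this message. It uses the standard library's event-based `html.parser.HTMLParser` and keeps only a stack of open elements. The names `RdfaSketch`, `extract_triples`, `VOID` and `DOC` are hypothetical, and the subset handled is an assumption: `about`, `typeof`, `rel` with `resource`, and `property` with either `content` or element text. CURIE expansion, `href`, `datatype`, chaining through a parent `rel`, and the rest of the RDFa processing rules are deliberately left out.

```python
from html.parser import HTMLParser

# Void elements never get an end tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class RdfaSketch(HTMLParser):
    """Single-pass collector for a small, simplified subset of RDFa."""

    def __init__(self, base):
        super().__init__(convert_charrefs=True)
        self.triples = []
        # One frame per open element; frame 0 is the document itself.
        self.stack = [{"tag": None, "subject": base, "prop": None, "text": []}]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # @about sets a new subject; otherwise inherit the parent's subject.
        subject = a.get("about", self.stack[-1]["subject"])
        if "typeof" in a:
            self.triples.append((subject, "rdf:type", a["typeof"]))
        if "rel" in a and "resource" in a:
            self.triples.append((subject, a["rel"], a["resource"]))
        prop = a.get("property")
        if prop and "content" in a:
            # @content overrides the element text, so emit immediately.
            self.triples.append((subject, prop, a["content"]))
            prop = None
        if tag not in VOID:
            self.stack.append(
                {"tag": tag, "subject": subject, "prop": prop, "text": []})

    def handle_data(self, data):
        # Buffer text only for open elements that are waiting on a literal.
        for frame in self.stack:
            if frame["prop"]:
                frame["text"].append(data)

    def handle_endtag(self, tag):
        # Relaxed handling: close the nearest matching open element and
        # anything left open inside it; ignore stray end tags.
        for depth in range(len(self.stack) - 1, 0, -1):
            if self.stack[depth]["tag"] == tag:
                break
        else:
            return
        while len(self.stack) > depth:
            frame = self.stack.pop()
            if frame["prop"]:
                literal = "".join(frame["text"]).strip()
                self.triples.append((frame["subject"], frame["prop"], literal))


def extract_triples(chunks, base):
    """Feed the document chunk by chunk; no tree is ever built."""
    parser = RdfaSketch(base)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return parser.triples


DOC = """\
<span typeof="foaf:Document" about="http://www.w3.org/TR/2004/REC-rdf-syntax-grammar-20040210/">
<span property="dc:creator">Dave Beckett</span>,
<span rel="dc:publisher" resource="_:W3C"/>
<span property="dc:date" datatype="xsd:date" content="2004-02-10">2004</span>.
</span>
"""

triples = extract_triples(DOC.splitlines(keepends=True), "http://example.org/")
for triple in triples:
    print(triple)
```

Memory use in this sketch is bounded by the nesting depth plus the text of any `property` elements that are still open, not by the size of the document, which is the property that matters for a multi-megabyte book. The same class is fed XHTML and HTML alike, since the parser does not require well-formed input.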
Received on Friday, 8 July 2011 13:40:27 UTC