Myself, RDF and HTML

Hello all,

First, to quickly introduce myself: I'm Toby Inkster. I'm an in-house 
developer for a UK charity, though my interest in the semantic web is not 
really related to my day job.

I'm working on a GPL'ed "semantics extractor" (for want of a better 
description) called Cognition. The ultimate aim is to make it into a 
desktop tool with integrated browser, but for now I'm happy to just parse 
stuff and export it in interesting formats. As input it supports various 
microformats (native, non-GRDDL support), RDFa, eRDF, RDF/XML (linked to 
with <link rel=meta>, or embedded in XHTML directly using namespaces or 
within HTML comments) and has partial support for GRDDL.

Cognition: http://buzzword.org.uk/cognition/

(Should be releasing a new version soon with s/instanceof/typeof/ amongst 
other changes.)

My interests lie within what I like to call the "mixed case semantic web": 
unifying POSH/microformat data models with the more formal side of the 
Semantic Web, and making sure that data extracted from one side is 
available to the other. As an example of the kind of thing I mean, take a 
look at <http://examples.tobyinkster.co.uk/hcard>, which is a hybrid 
example of hCard and RDFa. It is correctly parsed by Cognition, and can be 
output as vCard or RDF/XML.

Anyway, I popped my head into the beginning of the task force meeting on 
IRC this afternoon, to check if it was open to the public, as I had an 
idea I wanted to contribute for supporting RDFa in HTML (i.e. as opposed 
to XHTML).

The problem, as I understand it, is that xmlns:foo attributes are unusable 
in HTML because they won't validate. Strictly, they won't validate against 
the XHTML DOCTYPE either, but we cough and mumble and ignore that because 
the W3C validator pretends that they're allowed.

Anyway, my idea is: RFC 2731 to the rescue! RFC 2731 describes a technique 
proposed by the Dublin Core lot that lets CURIE-like prefixes such as "dc" 
be used in HTML <meta> elements.

For example, to define the prefix "dc" to point to the current Dublin Core 
RDF vocab, you could use:

 <link rel="schema.dc" href="http://purl.org/dc/terms/">

And then the prefix could be used in <meta> elements like:

 <meta name="dc.creator" content="Toby Inkster">
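
As a rough illustration of how a parser might consume the two elements 
above, here is a minimal Python sketch. The class and function names 
(SchemaPrefixParser, expand_meta) are hypothetical, not part of Cognition 
or of RFC 2731, and the sketch assumes that the <link> appears before the 
<meta> elements that use its prefix and that prefixes are matched 
case-insensitively.

```python
from html.parser import HTMLParser


class SchemaPrefixParser(HTMLParser):
    """Collect RFC 2731-style prefixes and expand dotted <meta> names.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.prefixes = {}  # prefix -> namespace URI
        self.expanded = []  # (full URI, content) pairs

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link":
            # <link rel="schema.dc" href="..."> declares the prefix "dc".
            rel = a.get("rel") or ""
            if rel.lower().startswith("schema.") and a.get("href"):
                prefix = rel[len("schema."):].lower()
                self.prefixes[prefix] = a["href"]
        elif tag == "meta":
            # <meta name="dc.creator" ...> uses the prefix before the dot.
            name = a.get("name") or ""
            prefix, sep, local = name.partition(".")
            if sep and prefix.lower() in self.prefixes:
                uri = self.prefixes[prefix.lower()] + local
                self.expanded.append((uri, a.get("content")))


def expand_meta(html):
    """Return (full URI, content) pairs for prefixed <meta> elements."""
    parser = SchemaPrefixParser()
    parser.feed(html)
    parser.close()
    return parser.expanded
```

Fed the <link> and <meta> lines above, expand_meta should pair the URI 
http://purl.org/dc/terms/creator with the content "Toby Inkster".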

If this technique for defining prefixes were to be allowed in RDFa (though 
I'd recommend replacing the dot separator with a colon) then RDFa in HTML 
becomes feasible.

With RFC 2731 these prefixes are valid document-wide, but it would be 
theoretically possible to extend RFC 2731 to allow prefixes to have a 
scope (i.e. equivalent to xmlns attributes on non-root elements) by simply 
following the general rules of RDFa:

 <div about="#thisSection">
   <span rel="schema:dc" href="http://purl.org/dc/terms/"></span>
   <h2 property="dc:title">Title of this Section</h2>
   <!-- ... etc ... -->
 </div>

if desired. However, that might be impractical to implement, because of 
cases like this:

 <div about="#thisSection">
   <h2 property="dc:title">Title of this Section</h2>
   <!-- ... etc ... -->
   <span rel="schema:dc" href="http://purl.org/dc/terms/"></span>
 </div>

so I'd probably suggest restricting this technique to just allow prefixes 
to be defined through <link> elements in the document head.
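
To make the ordering problem concrete, here is a small Python sketch of a 
naive one-pass resolver. It is purely illustrative: the names 
(OnePassCurieResolver, resolve) are made up, it assumes the colon form 
rel="schema:dc" on any element, and it ignores scoping altogether, 
treating a prefix as known from the point where it is first declared.

```python
from html.parser import HTMLParser


class OnePassCurieResolver(HTMLParser):
    """Resolve property CURIEs using only the prefixes seen so far.

    Hypothetical helper for illustration only; it does no scoping.
    """

    def __init__(self):
        super().__init__()
        self.prefixes = {}    # prefix -> namespace URI
        self.resolved = []    # full URIs for properties we could expand
        self.unresolved = []  # CURIEs whose prefix was not yet declared

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # Any element with rel="schema:foo" and an href declares a prefix.
        rel = a.get("rel") or ""
        if rel.startswith("schema:") and a.get("href"):
            self.prefixes[rel[len("schema:"):]] = a["href"]
        # Try to expand a property CURIE with what we know right now.
        prop = a.get("property")
        if prop:
            prefix, sep, local = prop.partition(":")
            if sep and prefix in self.prefixes:
                self.resolved.append(self.prefixes[prefix] + local)
            else:
                self.unresolved.append(prop)


def resolve(html):
    """Return (resolved URIs, unresolved CURIEs) for one pass over html."""
    resolver = OnePassCurieResolver()
    resolver.feed(html)
    resolver.close()
    return resolver.resolved, resolver.unresolved
```

With the second example above, where the <span> follows the <h2>, this 
resolver reaches "dc:title" before it knows what "dc" means, so it would 
have to buffer or make a second pass. With the declaration in a <link> in 
the document head, a single pass is enough.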

Anyway, those are my ideas with regard to RDFa in HTML. If anyone has any 
queries, then let me know either on or off list.

By the way, according to the list archives there are mumbles about 
changing the algorithm for parsing RDFa, particularly with regard to 
"dangling rels". If this has been decided, could the rdfa-syntax document 
be updated so that I can catch up?

regards

-- 
Toby A Inkster BSc (Hons) ARCS

Received on Thursday, 10 April 2008 17:35:38 UTC