A general approach to embed/extract metadata in XHTML

Hi all,

I wrote up yet another document on RDF extraction from XHTML [1].

This approach is an extension of RFC 2731, which defines a way to map a
schema URI onto a prefix (used in attribute values), mainly for Dublin Core
encoding in (X)HTML. In this experiment, I extended these prefixed attribute
values to class attributes and rel attributes, so that we can embed
metadata in the body as well as in the head element of XHTML.

This allows us to use arbitrary vocabularies in XHTML attributes, and to
extract RDF triples from them without prior knowledge of those vocabularies.


For example, one may want to write a creation date and a modification date
in an address element for human readers. These data can also be marked up
with prefixed class attribute values:

<address>Original: <em class="dcterms.created">2003-12-15</em>;
Last-modified: <em class="dcterms.modified">2004-03-14</em></address>

with a link element that maps 'dcterms' to the namespace URI:

<link rel="schema.dcterms" href="http://purl.org/dc/terms/" />

An XSLT stylesheet can then generate RDF from this information:

<rdf:Description rdf:about="">
 <dcterms:created xmlns:dcterms="...">2003-12-15</dcterms:created>
 <dcterms:modified xmlns:dcterms="...">2004-03-14</dcterms:modified>
</rdf:Description>

(I use a dot (.) to delimit the prefix, following RFC 2731, but it's also
possible to use a colon (:) and make the values more QName-like.)
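
For illustration, the same extraction can be sketched in Python instead of
XSLT. This is only a minimal sketch, not the stylesheet from [1]: the
function name extract_triples and the sample page are made up for this
example, it looks at class attributes only (rel attributes are ignored),
and it assumes well-formed XHTML in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page, assembled from the fragments in this message.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample</title>
<link rel="schema.dcterms" href="http://purl.org/dc/terms/" />
</head>
<body>
<address>Original: <em class="dcterms.created">2003-12-15</em>;
Last-modified: <em class="dcterms.modified">2004-03-14</em></address>
</body>
</html>"""


def extract_triples(xhtml, subject=""):
    """Return (subject, property URI, literal) tuples found in class attributes."""
    root = ET.fromstring(xhtml)

    # Step 1: collect prefix -> namespace URI from <link rel="schema.PREFIX">.
    prefixes = {}
    for link in root.iter("{%s}link" % XHTML_NS):
        rel = link.get("rel", "")
        if rel.startswith("schema.") and link.get("href"):
            prefixes[rel[len("schema."):]] = link.get("href")

    # Step 2: treat each class token of the form PREFIX.NAME as a property,
    # but only when PREFIX was declared in step 1.
    triples = []
    for elem in root.iter():
        for token in elem.get("class", "").split():
            prefix, dot, name = token.partition(".")
            if dot and name and prefix in prefixes:
                value = "".join(elem.itertext()).strip()
                triples.append((subject, prefixes[prefix] + name, value))
    return triples


if __name__ == "__main__":
    for triple in extract_triples(SAMPLE):
        print(triple)
```

Because only declared prefixes are resolved, ordinary class values used for
styling (including ones that happen to contain a dot) are left alone, and
the script needs no built-in knowledge of any particular vocabulary.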

More variations, with examples and (partly English) explanations, can be
found in the above-mentioned document [1].

This method works well with GRDDL (the document itself is an example of
GRDDL [2]), and no extension to the existing XHTML specification is
required.

cheers,


[1] http://kanzaki.com/docs/sw/xh2rdf.html
[2] http://www.w3.org/2000/06/webdata/xslt?xslfile=http://www.w3.org/2004/01/rdxh/grddl-xml-processor&xmlfile=http://kanzaki.com/docs/sw/xh2rdf.html

Received on Monday, 15 March 2004 07:33:31 UTC