Re: HTML in RDF

On 25/10/2007, Mark Birbeck <mark.birbeck@formsplayer.com> wrote:
> Karl/Shane,
>
> This is all true. :) But I think the question was about whether HTML
> could be serialised in its entirety to RDF.

I've vague recollections of wrapping an XHTML doc in <rdf:RDF/>,
applying an RDF parser, tweaking the markup and repeating... the level
of structural mismatch between HTML and RDF/XML is really quite
impressive given the small number of structural elements available.

> This would allow documents
> to be analysed at a completely different level to simply processing
> the mark-up, and is something--like Bent--I'm very interested in for
> the future.

Hmm, with all due respect to recent trends around semantic HTML [1],
I'm not really sure how much could be taken to another level. A few
constructs like <title> would probably have direct counterparts in the
RDF way of thinking, but it's not obvious how one might RDFify
apparently simple elements like <code> without having a framework to
cover the whole structured text business. Given that lists often seem
to cause problems around RDF, I wonder what advantages an RDF
representation would have over existing tools for processing
structured text - has anyone done a (BNF?) grammar vocabulary..? Oh
yes - I find DanC has [2,3].
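
To make the <title> case concrete, here is a minimal sketch using only the
Python standard library. It is an illustration, not an established mapping:
the choice of Dublin Core's dc:title as the property, the helper name
title_triple and the example.org URI are all my assumptions. Note that it
has nothing useful to say about <code>, which is rather the point.

```python
from html.parser import HTMLParser

# Assumption: dc:title is a reasonable RDF counterpart for HTML <title>.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


class TitleGrabber(HTMLParser):
    """Collects the text content of the <title> element, ignoring the rest."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_triple(doc_uri, html):
    """Return one N-Triples line stating the document's title (hypothetical helper)."""
    parser = TitleGrabber()
    parser.feed(html)
    parser.close()
    # Escape backslashes and quotes so the literal stays valid N-Triples.
    literal = parser.title.strip().replace("\\", "\\\\").replace('"', '\\"')
    return f'<{doc_uri}> <{DC_TITLE}> "{literal}" .'


if __name__ == "__main__":
    page = "<html><head><title>HTML in RDF</title></head><body><code>x</code></body></html>"
    print(title_triple("http://example.org/doc", page))
```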

Coming at the modelling situation from the RDF side, a while ago Reto
Bachmann-Gmuerr did some interesting work around representing what we
now call information resources. His DiscoBits (discourse bits) vocab
[4] was at least in part prompted by looking for a way of better
representing the kind of (human-oriented) document data wrapped up in
syndication formats, while exploring options for Atom in RDF. While
seriously coarse-grained compared to HTML, even just the division of a
doc into title/content chunks did seem a lot more useful in this
context than treating the stuff as opaque literals (the vocab also
supports lists & nesting, so I guess it could be called a framework
for this business). More recently Reto put together an online Ajaxian
editor demo [5] using the DiscoBits model with the content stored in
his graph versioning system.

At the extreme, a kind of desert island project I've always wanted to
try is to write an RDF-based text editor, going down as far as
representing individual characters as nodes in the graph. I doubt very
much whether it would have much use on the Semantic Web, but as an
exercise (and maybe API benchmark) it could be informative fun.
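
As a rough idea of what the character-level end of that would look like,
here is a sketch in plain Python with no RDF library, with triples as
tuples. It assumes the text is modelled as a standard rdf:first/rdf:rest
chain with one blank-node cell per character; the function names and the
_:cN blank node labels are made up for the example. A real editor would
need rather more than this (insertion, deletion, some notion of cursor).

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def text_to_triples(text):
    """Return (head, triples): one list cell per character, chained with rdf:rest."""
    if not text:
        return NIL, []
    cells = [f"_:c{i}" for i in range(len(text))]  # hypothetical blank node labels
    triples = []
    for i, ch in enumerate(text):
        next_cell = cells[i + 1] if i + 1 < len(text) else NIL
        triples.append((cells[i], FIRST, ch))        # the character itself
        triples.append((cells[i], REST, next_cell))  # link to the following cell
    return cells[0], triples


def triples_to_text(head, triples):
    """Walk the chain from head to rdf:nil and rebuild the string."""
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    chars = []
    node = head
    while node != NIL:
        chars.append(first[node])
        node = rest[node]
    return "".join(chars)


if __name__ == "__main__":
    head, triples = text_to_triples("RDF")
    print(len(triples), triples_to_text(head, triples))
```

Two triples per character, so even a short paragraph becomes a sizeable
graph, which is why I'd see it more as a benchmark than anything practical.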

> RDFa provides a good basis for this.

Not sure I understand quite how..?

Cheers,
Danny.

[1] http://microformats.org/wiki/posh
[2] http://lists.w3.org/Archives/Public/public-cwm-talk/2006JanMar/0017.html
[3] http://dig.csail.mit.edu/breadcrumbs/node/85
[4] http://discobits.org/ontology
[5] http://discobits.org/editor/


-- 

http://dannyayers.com

Received on Thursday, 25 October 2007 08:32:33 UTC