Re: Discussion with Ian and Henri about HTML5+RDFa (part 2/2)

Philip Taylor wrote:
> One of those uses <a href="www.invaligia.com"
> property="cc:attributionName" rel="cc:attributionURL"> which means it's
> talking about the nonexistent page
> http://www.invaligia.com/www.invaligia.com

Interesting, because that will also create a clickable link that is
broken. In other words, with RDFa, if you've got broken triples, you
often have broken rendered HTML, too (not always, but when it happens
the visible breakage helps the author notice).
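
A quick sketch of why the href ends up pointing there, assuming the page
is served from http://www.invaligia.com/ (the base URL and the helper
name `resolve` are my assumptions, not something taken from the page).
An href without a scheme is treated as a relative reference and resolved
against the page's base, both for the link and for the triple:

```python
from urllib.parse import urljoin

# Assumed base URL of the page carrying the markup (hypothetical).
PAGE_BASE = "http://www.invaligia.com/"


def resolve(base, href):
    """Resolve an href against a base URL, as a browser or RDFa parser would."""
    return urljoin(base, href)


# No scheme: treated as a relative path under the base.
print(resolve(PAGE_BASE, "www.invaligia.com"))
# → http://www.invaligia.com/www.invaligia.com

# With a scheme: taken as an absolute URL, which is what was intended.
print(resolve(PAGE_BASE, "http://www.invaligia.com"))
# → http://www.invaligia.com
```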

> Two are not well-formed XML within the Creative Commons block of markup;
> the other two are not well-formed XML in the rest of the page. So it's
> not possible to extract the RDFa with an XML parser -- you would have to
> use an HTML parser instead (and presumably add hacks to emulate XML
> Namespace processing).

Most RDFa parsers are able to handle broken XHTML, either by running it
through tidy first or by using whatever DOM the browser generates.
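
As a rough illustration of the lenient approach (not how any particular
RDFa parser is implemented), here is a sketch using Python's standard
`html.parser`, which does not require well-formed input. The class name,
the function name, the attribute list and the sample markup are all
assumptions made for the example:

```python
from html.parser import HTMLParser

# RDFa-specific attributes to look for (an illustrative subset).
RDFA_ATTRS = {"about", "property", "rel", "rev", "resource", "typeof",
              "content", "datatype"}


class RDFaAttributeCollector(HTMLParser):
    """Collect RDFa attributes from markup that need not be well-formed."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        rdfa = {name: value for name, value in attrs if name in RDFA_ATTRS}
        if rdfa:
            self.found.append((tag, rdfa))


def collect_rdfa_attributes(markup):
    """Return a list of (tag, {attribute: value}) pairs found in markup."""
    collector = RDFaAttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


# Hypothetical sample: the <p> is never closed and <br> is not self-closed,
# so an XML parser would reject it.
broken = ('<div><p>Unclosed paragraph '
          '<a href="www.invaligia.com" property="cc:attributionName" '
          'rel="cc:attributionURL">Invaligia</a><br></div>')

print(collect_rdfa_attributes(broken))
```

This only gathers the attributes; turning them into triples would still
need prefix resolution and subject tracking on top.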

> Somewhat relatedly, there's another four pages that use rel="dc:type".
> One of those (http://bytestrike.blogspot.com/) has it near a CC license
> link and does not have an xmlns:dc declaration anywhere, suggesting a
> copy-and-paste error.

Looks like it's broken in a more subtle way:

====
<span dc="http://purl.org/dc/elements/1.1/"
href="http://purl.org/dc/dcmitype/Text" rel="dc:type">
====

It says "dc" instead of "xmlns:dc". I wonder how that happened.
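
This kind of mistake is easy to check for mechanically. A minimal sketch,
assuming it is good enough to treat xmlns declarations as document-wide
rather than scoped to the element they appear on (a simplification), and
with the class and function names being hypothetical:

```python
from html.parser import HTMLParser

# Attributes whose values are whitespace-separated CURIEs or plain keywords.
CURIE_ATTRS = {"property", "rel", "rev", "typeof", "datatype"}


class PrefixChecker(HTMLParser):
    """Record prefixes declared via xmlns:* and prefixes used in CURIEs."""

    def __init__(self):
        super().__init__()
        self.declared = set()
        self.used = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.declared.add(name[len("xmlns:"):])
            elif name in CURIE_ATTRS and value:
                for token in value.split():
                    if ":" in token:
                        self.used.add(token.split(":", 1)[0])


def undeclared_prefixes(markup):
    """Return prefixes used in CURIE attributes but never declared."""
    checker = PrefixChecker()
    checker.feed(markup)
    checker.close()
    return checker.used - checker.declared


snippet = ('<span dc="http://purl.org/dc/elements/1.1/" '
           'href="http://purl.org/dc/dcmitype/Text" rel="dc:type">')
print(undeclared_prefixes(snippet))
# → {'dc'}
```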

Good to see this info!

> I should probably try downloading some more recent pages, to see if
> CC/RDFa usage is more common now...

If you have time and resources to do so, that would be very useful!

-Ben

Received on Tuesday, 27 January 2009 17:59:52 UTC