Re: Preserving markup when distilling @property values in xhtml

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Fri, 21 Dec 2012 23:54:31 -0500
Message-ID: <50D53D07.9090902@digitalbazaar.com>
To: Sebastian Heath <sebastian.heath@gmail.com>
CC: RDFa WG <public-rdfa-wg@w3.org>

On 12/20/2012 05:39 PM, Sebastian Heath wrote:
> My issue is that the '<i>' element has been dropped out. I guess
> this is because the original XMLLiteral is being coerced into a
> plain string. If that's the explanation, I think that is the
> incorrect default behavior. I understand that I can add an @datatype,
> but that will make my markup very messy. Particularly as I've chosen
> a simple case. There are lots of places where I want to preserve the
> markup in @property as that markup communicates important aspects of
> the data. Again, the underlying data is an XML literal and I suggest
> that the default behavior should be to preserve that when distilling
> RDFa in XHTML contexts.
> It is possible that such preservation of markup should only be 
> defined for RDFa in (X)HTML(5). Again, why destroy good structured 
> information in a host-language context?

Hi Sebastian,

Yes, we debated this for a very long time in the XHTML+RDFa 1.0 days
(between 2006 and 2008) and came to the same conclusion you did - that any
markup should be preserved if found.

As it turns out, that was exactly the wrong decision to make. When we
did a post-REC analysis on how XHTML+RDFa 1.0 was being used in the
wild, we found many, many examples on the Web where people were
expressing strings with markup that they never intended to express. That
is, simple things like dc:title contained a slew of XHTML markup. Even
worse, something simple like "Foo <i>Bar</i>" would expand into a
gigantic string if there were lots of RDFa prefix declarations in the
document (because all xmlns: definitions need to be preserved in
XMLLiterals in RDFa to ensure the snippet stays well-formed).
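
As a rough illustration (the prefixes and the serialized literal below are
made up for this example, and the exact output varies between processors),
take a title like this in an XHTML+RDFa 1.0 document:

```html
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/terms/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      version="XHTML+RDFa 1.0">
  <head><title>Example</title></head>
  <body>
    <!-- No @datatype, so a 1.0 processor yields an XMLLiteral -->
    <h1 property="dc:title">Foo <i>Bar</i></h1>
    <!-- The literal value comes out roughly as:
         Foo <i xmlns="http://www.w3.org/1999/xhtml"
                xmlns:dc="http://purl.org/dc/terms/"
                xmlns:foaf="http://xmlns.com/foaf/0.1/">Bar</i>
    -->
  </body>
</html>
```

Every additional prefix declared on the document adds another xmlns:
attribute to that literal.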

So, we reversed the decision for RDFa 1.1 and made the processor strip
out markup if found.

As you stated, in XHTML+RDFa 1.1 you can continue to preserve markup if
you add datatype="rdf:XMLLiteral" to the element containing the
@property attribute and the markup you want to preserve.
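
A minimal sketch of that, assuming a dc:title property and prefix
declarations chosen just for this example:

```html
<div prefix="dc: http://purl.org/dc/terms/
             rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <!-- Default in RDFa 1.1: plain literal "Foo Bar", the <i> is dropped -->
  <h1 property="dc:title">Foo <i>Bar</i></h1>

  <!-- With @datatype: XMLLiteral, the <i> element is kept -->
  <h1 property="dc:title" datatype="rdf:XMLLiteral">Foo <i>Bar</i></h1>
</div>
```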

If you want the XHTML+RDFa 1.0 default behavior of preserving markup,
you can force the processor into XHTML+RDFa 1.0 mode by adding a
version="XHTML+RDFa 1.0" on the HTML element of the document. You can
also set it by declaring the XHTML+RDFa 1.0 DTD at the top of the document.
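
For reference, a sketch of what those two switches look like at the top of
a document (either one on its own should be enough, per the above; they are
shown together here only for brevity):

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
    "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML+RDFa 1.0">
  ...
</html>
```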

So, there are 3 ways to achieve what you want, but as you say, it might
make your markup a bit more verbose. Generating XMLLiterals by default
was creating too much garbage data on the Web, which is why we do it the
other way now: plain literals by default, and XMLLiterals (or HTML literals
if you're in HTML mode) only when you explicitly ask for them.

Does that make sense?

-- manu

Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: The Problem with RDF and Nuclear Power
Received on Saturday, 22 December 2012 04:55:14 UTC