Re: presenting GRDDL, an update on RDFinXHTML-35 and input on namespaceDocument-8 from Chris Lilley on 2004-02-06 (www-tag@w3.org from February 2004)

From: Chris Lilley <chris@w3.org>
Date: Sat, 7 Feb 2004 00:45:55 +0100
To: Dan Connolly <connolly@w3.org>
Cc: www-tag@w3.org
Message-ID: <1359059387.20040207004555@w3.org>
On Thursday, February 5, 2004, 11:36:26 PM, Dan wrote:


DC> This GRDDL stuff is starting to come together; I've
DC> written up the design history and rationale...
DC>   http://www.w3.org/2004/01/rdxh/specbg.html
DC> I'm starting to think that might make a nice TAG
DC> finding on issue 35.

Yes, it would. The historical background makes the argument easy to
read, and the case against stuffing RDF in XML comments is also well
made.

However, I believe this document misses the point in one area. It
gives the impression that lack of DTD validation was the main reason
to avoid putting RDF in (X)HTML. I don't think it was.

HTML 4 had a notion of 'strip the tags and present the content' for
dealing with unknown elements: an unhappy mixture of must-ignore and
must-present that caused problems when including RDF (or indeed
MathML, SVG, or other namespaces) inline in the document.

This limitation made some sense in the early days when the HTML spec
was based on SGML and browsers were not expected by the spec to have
an actual SGML parser; they used a stream-based formatting approach
instead and could not be expected to keep track of such things as
element nesting. The hack has long since outgrown its usefulness,
however.

In consequence, an attribute-only syntax of RDF came into use which
(while just as invalid, to a DTD validator, as the element-content
syntax) avoided the presentation of RDF content as text. This
indicates, to me, that it was a desire to hide the RDF from being
displayed by an HTML browser that was the main driving force.
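
To make the difference concrete, here is a minimal sketch in Python of a
toy 'strip the tags, present the content' renderer. It is an
illustration only, not any browser's actual code; the class name, the
helper function and the sample markup (including the "Jane Doe" value)
are assumptions made up for the example.

```python
from html.parser import HTMLParser


class StripTagsRenderer(HTMLParser):
    """Toy renderer: every tag is ignored, all character data is shown."""

    def __init__(self):
        super().__init__()
        self.shown = []

    def handle_data(self, data):
        # Character data is "presented" regardless of the enclosing tag.
        text = data.strip()
        if text:
            self.shown.append(text)


def render(markup):
    """Return the text such a renderer would put on screen."""
    renderer = StripTagsRenderer()
    renderer.feed(markup)
    renderer.close()
    return " ".join(renderer.shown)


# Hypothetical sample: RDF using element content for the property value.
element_syntax = (
    '<rdf:RDF><rdf:Description rdf:about="">'
    '<dc:creator>Jane Doe</dc:creator>'
    '</rdf:Description></rdf:RDF><p>Hello</p>'
)

# The same statement in the attribute-only syntax.
attribute_syntax = (
    '<rdf:RDF><rdf:Description rdf:about="" dc:creator="Jane Doe"/>'
    '</rdf:RDF><p>Hello</p>'
)

print(render(element_syntax))    # Jane Doe Hello
print(render(attribute_syntax))  # Hello
```

With element content the metadata leaks into the page text; with the
attribute-only form there is no character data left to present.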

The Creative Commons and weblog advocacy of XML comments achieves the
same end, avoidance of the content being presented (and incidentally
makes the content validate to a DTD). Similar ruses were used in the
early days to hide CSS inside a style element from being displayed as
text by HTML browsers. Eventually, the HTML browsers learned to not
display the contents of a style element. Formally, the UA default
stylesheet had a rule:
style {display:none}

In contrast, the SVG specification encourages the use of RDF metadata
and has an element, metadata, for this purpose. SVG (and MathML, SMIL,
SOAP, VoiceXML etc.) do not have this stream-based mixture of
must-ignore tags and must-present content.

SVG has just as much of an issue with validation of RDF-containing SVG
- it's not valid to the DTD. This is something that a move to schema is
fixing (and of course the same move to Schema could fix the validation
issue for XHTML, too. But validation, as I said, is not the root cause
of the problem).

But the contents of a metadata element are not to be displayed,
according to the SVG spec (they have to be well formed, of course).
There is a test of the metadata element in the SVG test suite. To
pass, implementations (including implementations on cellphones and
PDAs) must merely parse according to the rules of XML and render
according to the rules of SVG, i.e. not get flustered by a mass of
elements and attributes not in the SVG namespace, but draw the graphic
as if the RDF were not present. In consequence, placing metadata in
RDF into SVG is simple, natural, encouraged, and works on real
implementations.
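
As a rough sketch of what tree-based must-ignore means in practice, the
following Python walks a parsed SVG document and collects only the
elements a renderer would consider, skipping the metadata subtree and
anything outside the SVG namespace. The sample document and the
function name drawable_elements are assumptions for illustration; a
real SVG implementation does far more than this.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample document: a circle plus RDF inside <metadata>.
doc = """<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <metadata>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="">
        <dc:title>A red circle</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </metadata>
  <circle cx="50" cy="50" r="40" fill="red"/>
</svg>"""


def drawable_elements(elem):
    """Yield local names of SVG elements, ignoring whole foreign subtrees."""
    # ElementTree spells qualified names as "{namespace}local".
    ns, _, local = elem.tag[1:].partition("}")
    if ns != SVG_NS or local == "metadata":
        return  # skip this node and everything below it
    yield local
    for child in elem:
        yield from drawable_elements(child)


root = ET.fromstring(doc)
print(list(drawable_elements(root)))  # ['svg', 'circle']
```

The unit being ignored here is a subtree of the parsed document, not a
run of tags in the source text, so the RDF can use element content
freely without anything being drawn.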

Regrettably, in XHTML 1.x - which being based on XML would be expected
to be more comfortable with a true tree-based approach where the level
of 'must-ignore' would be subtrees or nodes, not strings in the
document source - what was merely a suggestion in HTML 4.x was
codified into a conformance requirement.

This is why the issue is called RDFinXHTML (because the misfeature is
a conformance requirement) and not RDFinHTML (because the misfeature
is merely suggested behavior in HTML) and not RDFinXML (because SMIL,
MathML, SVG, SOAP, VoiceXML, etc. do not have the problem).

And this is where I have an issue with the GRDDL approach - it takes
the problem of RDFinXHTML and tries to solve the non-existent problem
RDFinXML. While I agree the latent RDF is extractable via XSLT, it's no
more available to XPath than RDF-in-comments is.
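
A small sketch of that point, using the limited XPath subset in
Python's xml.etree.ElementTree. The three sample documents are
assumptions for illustration: the link element with
rel="transformation" and the example.org stylesheet URL are a
simplified, hypothetical stand-in for a document that points at an
extracting transform, not a complete GRDDL document.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XHTML_NS = "http://www.w3.org/1999/xhtml"
NS = {"rdf": RDF_NS}

# RDF present as real elements in the tree.
inline = (
    f'<html xmlns="{XHTML_NS}"><head><title>t</title>'
    f'<rdf:RDF xmlns:rdf="{RDF_NS}"/></head><body/></html>'
)

# RDF hidden inside an XML comment.
commented = (
    f'<html xmlns="{XHTML_NS}"><head><title>t</title>'
    f'<!-- <rdf:RDF xmlns:rdf="{RDF_NS}"/> --></head><body/></html>'
)

# RDF only latent: it exists once a transform (hypothetical URL) is run.
latent = (
    f'<html xmlns="{XHTML_NS}"><head><title>t</title>'
    '<link rel="transformation" href="http://example.org/extract.xsl"/>'
    '</head><body/></html>'
)


def count_rdf(doc):
    """Count rdf:RDF elements that a path query over the tree can see."""
    # The default ElementTree parser drops comments; a full XPath engine
    # would expose a comment only as one opaque string, not as elements.
    return len(ET.fromstring(doc).findall(".//rdf:RDF", NS))


for name, doc in [("inline", inline), ("commented", commented),
                  ("latent", latent)]:
    print(name, count_rdf(doc))
```

Only the inline form yields a match; for a path query over the source
document, the commented and latent forms both come back empty.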

The scope of this new mechanism of hiding RDF from the HTML renderer
should be clearly limited to those HTML-based languages (XHTML, RDDL
1) which display the problem.


-- 
 Chris Lilley                    mailto:chris@w3.org
 Chair, W3C SVG Working Group
 Member, W3C Technical Architecture Group
Received on Friday, 6 February 2004 18:45:56 UTC