Validating XHTML with embedded RDF (was Re: Scenario: Trackbacks)

Joseph Reagle <reagle@w3.org> wrote:

> You mean the -- ironically enough -- comment about escaping ampersands?
> 
>   <p>Phil, since this is slightly on topic: EntryEditLink now escapes 
>   the &'s to &amp;'s.  Hope this helps.</p>

Exactly.

> The wdg validator didn't complain about that either, but Amaya did.

This is one of the known limitations of SP-derived SGML/XML parsers.
"Real" XML processors can easily catch this kind of fatal error;
the CSS Validator, for example, does catch it.
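
To illustrate the point, here is a minimal sketch using Python's
standard-library XML parser (expat-based), which I am assuming as a
stand-in for a "real" XML processor.  The helper name
well_formedness_error and the two sample strings are made up for this
example; the first string is modeled on the comment quoted above.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(markup):
    """Return the parser's error message, or None if the markup is well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# A bare "&" that does not start an entity reference is a fatal error in XML.
unescaped = "<p>EntryEditLink now escapes the &'s to &amp;'s.</p>"
# The same sentence with every ampersand properly escaped.
escaped = "<p>EntryEditLink now escapes the &amp;'s to &amp;amp;'s.</p>"

print(well_formedness_error(unescaped))  # an error message string
print(well_formedness_error(escaped))    # None
```

Any conforming XML processor must reject the first string, so a check
like this catches the problem even where an SP-derived validator stays
silent.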

> > I'd like to see more sample instances together with scenarios so that
> > I can test.  Depends on what level of validation you want, I have
> > several schemata to validate XHTML part without bothering foreign
> > namespaces.  We need to clarify what kind of validity we are talking
> > here, though.
> 
> One of my concerns is that this discussion can get a little weird (e.g., 
> removing RDF just for the sake of the validator). Isn't the validator a 
> means to an end and not an end in and of itself?

Well, the validator is just a starting point.  There are quite a few
constraints that a DTD cannot express.  And when it comes to embedding
RDF/XML in XHTML, I have no hope of working out a DTD-based solution.

> But that then begs your 
> question of "what kind of validity" we are talking about.... We want 
> conformance to specifications for the sake of interoperability and 
> consistent/good user experiences across environments.

In the case of XHTML 1.0 as currently written, for example, I'm afraid
you have no chance.  As for document conformance, XHTML 1.0 defines only
a "*strictly* conforming XHTML document", which is "an XML document that
requires only the facilities described as mandatory in this
specification" [2].  I won't go into details here, but by definition
XHTML with embedded RDF is not a strictly conforming XHTML document.

But here's a trick: we intentionally called it a "*strictly* conforming
XHTML document", which implies that there could be not-so-strictly
conforming XHTML documents.  Section 3.1.2 of the XHTML 1.0 spec
illustrates how you MAY use XHTML with other namespaces [3], but
it didn't define conformance for that, as we didn't have a good
technology to ensure such conformance at that time.  It's based
on 20th century technology, for better or worse.

For future versions, we are considering taking advantage of 21st
century technologies.  I posted my personal thoughts on XHTML 2
conformance to www-html last month [4], with the RDF-in-XHTML issue
in mind.

> Without tipping into 
> philosophical questions about conformance, a pragmatic take is to at least 
> improve the present situation (e.g., permitting rdf-in-XHTML, making it 
> easy to find bugs in one's html, app interop, and ensuring consistent 
> experiences for the user).

Setting aside the XHTML conformance issue for now, and assuming that
the RDF part doesn't necessarily have to be validated in this process,
I have several things to offer for better validation.

I wrote an experimental XML Schema for "extensible" XHTML 1.0
Transitional [5], which is an extension of the XHTML 1.0 Schema
published as a W3C Note [6].  The Schema included in that Note
was designed to be a "closed" schema, i.e. it doesn't allow
foreign namespaces.  This schema, on the other hand, is an "open"
schema, i.e. it allows foreign elements and attributes in most
places, with processContents="lax" rather than the default value
of "strict".  So you may embed RDF or arbitrary XML almost
anywhere inside XHTML.  If it's at all possible to define some sort
of XML Schema for your RDF bit, you can validate that too;
otherwise your RDF is only laxly assessed.  Either way, the
schema-validity of the XHTML part is strictly assessed, so
you'll be able to find bugs in your XHTML part.

Another approach is to use Modular Namespaces (MNS) [7].  I wrote
MNS schemata for XHTML 1.0 Strict [8], Transitional [9], and
Frameset [10].  These schemata use the corresponding RELAX NG
schemata to check the validity of the XHTML part, but foreign
elements and attributes (such as RDF) are ignored by pruning them
before validation.  If you care to try, an XHTML 2 version is also
available [11].
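
The pruning idea can be sketched roughly as follows.  This is not MNS
itself and not how any particular MNS implementation works; it is a
hand-rolled Python approximation that strips foreign-namespace
elements and attributes so that the remainder could be handed to an
XHTML-only schema.  The names ns_of and prune_foreign, the sample
document, and the example.org URI are all invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def ns_of(name):
    """Namespace URI of a Clark-notation name ({uri}local), or "" if none."""
    if isinstance(name, str) and name.startswith("{"):
        return name[1:].split("}", 1)[0]
    return ""


def prune_foreign(elem, keep=(XHTML_NS, XML_NS)):
    """Remove, in place, elements and attributes whose namespace is not in `keep`."""
    # Attributes without a namespace belong to their element and are kept.
    for attr in [a for a in elem.attrib if ns_of(a) and ns_of(a) not in keep]:
        del elem.attrib[attr]
    prev = None
    for child in list(elem):
        if not isinstance(child.tag, str):  # comment or processing instruction
            prev = child
        elif ns_of(child.tag) not in keep:
            # Keep the text that followed the pruned element.
            tail = child.tail or ""
            if prev is None:
                elem.text = (elem.text or "") + tail
            else:
                prev.tail = (prev.tail or "") + tail
            elem.remove(child)
        else:
            prune_foreign(child, keep)
            prev = child
    return elem


SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <head>
    <title>Entry</title>
    <rdf:RDF>
      <rdf:Description rdf:about="http://example.org/entry/1" dc:title="Entry"/>
    </rdf:RDF>
  </head>
  <body><p dc:creator="Phil">Hello <rdf:RDF/>world</p></body>
</html>"""

pruned = prune_foreign(ET.fromstring(SAMPLE))
remaining = sorted({ns_of(e.tag) for e in pruned.iter()})
print(remaining)
```

After pruning, only XHTML-namespace elements remain, and the result
can be validated against an ordinary XHTML schema by whatever
validator you prefer.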

In both cases you may check the strict validity of the XHTML part
(actually much better than with a DTD) while embedding RDF/XML almost
anywhere you like.  Of course, neither of them provides a way
to define the semantics of such a mixed document, but at least
they offer some way to overcome the validation issue.
In the future, I hope ISO/IEC DSDL VCSL will provide a better solution.

> Regardless, in [1] we have some requirements, and a few scenarios, and I'm 
> also hoping to hear what sort of potential solutions the HTML community is 
> thinking about. What's the status of Steven's proposal,

If you look at the latest XHTML 2.0 draft, you'll notice that
the Metainformation Module now allows nesting of the meta element [12].
This is a step toward allowing RDF/XML-like encoding of metadata
through the meta element, as Micah/Steven proposed.

> or your non-DTD 
> work?

No official standing whatsoever.

> Any chance that would be adopted?

For XHTML 2, I have hope.  For older versions, I'm not so optimistic.

> If so, why not? (I think at the 
> plenary Steven mentioned something about a lack of internal user defined 
> entity support?) 

Well, that's a long story, and I'd love to separate the entity problem
from the validation issue.  Unless RDF folks are willing to provide
a DTD for RDF, that point is rather moot, and I don't think this task
force is an appropriate place to solve that problem.

> [1] http://www.w3.org/2003/03/rdf-in-xml.html

[2] http://www.w3.org/TR/xhtml1/#strict
[3] http://www.w3.org/TR/xhtml1/#well-formed
[4] http://lists.w3.org/Archives/Public/www-html/2003May/0297
[5] http://www.w3.org/People/mimasa/test/schemas/SCHEMA/xhtml1-loose.xsd
[6] http://www.w3.org/TR/xhtml1-schema/#xhtml1-transitional
[7] http://www.thaiopensource.com/relaxng/mns.html
[8] http://www.w3.org/People/mimasa/test/schemas/rng/xhtml1-strict.mns
[9] http://www.w3.org/People/mimasa/test/schemas/rng/xhtml1-transitional.mns
[10] http://www.w3.org/People/mimasa/test/schemas/rng/xhtml1-frameset.mns
[11] http://www.w3.org/People/mimasa/test/schemas/rng/xhtml2.mns
[12] http://www.w3.org/TR/2003/WD-xhtml2-20030506/mod-meta.html#s_metamodule

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Thursday, 12 June 2003 15:11:04 UTC