Re: On some existing work on compound documents

Michael Pediaditakis <mp49@kent.ac.uk> wrote:

> I've just noticed some existing work by James Clark
> on compound documents, mainly focusing on validation.
> 
> In the workshop we mostly focused on how to present
> compound documents but I think that for any processing
> model for compound documents, the first step has to
> be the syntax and validation integration.
> 
> So, there is the Namespace Routing Language (NRL)[1]
> and Part 4 of ISO's
> Document Schema Definition Languages (DSDL), the
> Namespace-based Validation Dispatching Language (NVDL)[2].

I've been watching the development of DSDL with *great* interest,
in particular Part 4.  I posted some opinions related to this
topic on some public mailing lists more than a year ago [4,5,6],
mainly from my viewpoint as a maintainer of the XHTML 2.0 schema.

> The main focus seems to be to split a compound document
> into separate subtrees, each of a single namespace, and then validate
> them separately... (please correct me if I'm wrong)

Yes, it is the "Divide and Validate" approach, which originated in
RELAX Namespace.

I wanted to mix XHTML 2.0 with MathML, SVG, EGIX, ContactXML, HLink,
RDF/XML and XML Character Entities, for example, and still validate
the resulting compound document.  So I played with Modular Namespaces
(MNS), NRL and co., and for the purpose of validation, their approach
seems quite promising.
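
To make the "divide" step concrete, here is a rough sketch in Python
(standard library only).  It is not NRL's actual sectioning algorithm,
just a simplified illustration: the function names are made up for this
example, attributes in foreign namespaces are ignored, and the sample
document and its namespace URIs are only illustrative.

```python
import xml.etree.ElementTree as ET

def namespace_of(elem):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = elem.tag
    if isinstance(tag, str) and tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""

def split_sections(elem):
    """Split a tree into single-namespace sections.

    Returns a list of (namespace, section_root) pairs.  Each section is a
    copy of the subtree with child elements from other namespaces cut out;
    every cut-out child starts a section of its own.
    """
    ns = namespace_of(elem)
    section = ET.Element(elem.tag, dict(elem.attrib))
    section.text = elem.text
    sections = [(ns, section)]
    _fill(elem, section, ns, sections)
    return sections

def _fill(src, dst, ns, sections):
    """Copy same-namespace children into dst; split off the others."""
    for child in src:
        if namespace_of(child) == ns:
            copy = ET.SubElement(dst, child.tag, dict(child.attrib))
            copy.text = child.text
            copy.tail = child.tail
            _fill(child, copy, ns, sections)
        else:
            # Foreign child: it becomes a separate section.
            # (Text following it in the parent is dropped in this sketch.)
            sections.extend(split_sections(child))

# Illustrative compound document: an XHTML 2.0 host with MathML and SVG islands.
DOC = """<html xmlns="http://www.w3.org/2002/06/xhtml2/">
  <body>
    <p>Area: <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math></p>
    <svg xmlns="http://www.w3.org/2000/svg"><circle r="1"/></svg>
  </body>
</html>"""

sections = split_sections(ET.fromstring(DOC))
for ns, root in sections:
    print(ns, "->", len(list(root.iter())), "element(s)")
```

Each section could then be handed to whatever validator is responsible
for its namespace, which is the "validate" half of the approach.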

NRL's "concurrent validation" feature is also quite nice: it allows
me to validate a single namespace with multiple schemata, e.g. I can
validate the XHTML part with XML Schema, RELAX NG and Schematron
concurrently, and each schema may check different aspects of validity.
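
As a rough illustration of that idea (not of NRL syntax), the sketch
below applies several independent checks to the same section and merges
their reports.  The two check functions are hypothetical stand-ins
written in plain Python, not real XML Schema, RELAX NG or Schematron
validators, and the namespace URI is only illustrative.  "Concurrently"
here means that every check sees the same section independently, not
that threads are involved.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/1999/xhtml"  # illustrative namespace URI

def check_root_is_html(root):
    """Structural check, the kind of rule a grammar-based schema expresses."""
    expected = "{%s}html" % NS
    return [] if root.tag == expected else ["root element is not html"]

def check_images_have_alt(root):
    """Rule-based check, the kind of assertion Schematron expresses."""
    return ["img without alt attribute"
            for img in root.iter("{%s}img" % NS)
            if "alt" not in img.attrib]

# Hypothetical registry: one namespace, several validators.
VALIDATORS = {NS: [check_root_is_html, check_images_have_alt]}

def validate_concurrently(root, validators):
    """Run every validator registered for the root's namespace.

    Each validator checks a different aspect of validity; all messages
    are collected rather than stopping at the first failure.
    """
    ns = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    errors = []
    for check in validators.get(ns, []):
        errors.extend(check(root))
    return errors

doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><img src="a.png"/></body></html>'
)
print(validate_concurrently(doc, VALIDATORS))
```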

At the last W3C Technical Plenary in Cannes-Mandelieu [7], I also
pointed out that when you mix multiple vocabularies together and
want to validate the result, not all schemata will be under your
control: some vocabularies may use XML Schema, others RELAX NG, and
yet others something else.  Whatever schema language you prefer, you
can't simply assume that everyone else will agree with you.  So when
we deal with compound documents, we should be prepared to acknowledge
that more than one schema language exists in the real world, and
should let them work together (that's why DSDL stands for "Document
Schema Definition Language*s*", not a single language).

> Are there any opinions on the importance of these approaches
> for the processing of compound documents???

For the purpose of validation, I think these approaches are quite
important and useful.  For other processing, such as rendering, they
may not always be appropriate.  They may work in some cases, but
sometimes you'll need to know how the different "islands" interoperate.

> [1] http://www.thaiopensource.com/relaxng/nrl.html
> [2] http://www.dsdl.org/0525.pdf
> [3] http://www2003.org/cdrom/papers/poster/p108/p108-Pediaditakis.html

[4] http://lists.w3.org/Archives/Public/www-html/2003Mar/0150
[5] http://lists.w3.org/Archives/Public/www-html/2003May/0297
[6] http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2003Jun/0014
[7] http://www.w3.org/2004/03/plenary-minutes#Session4

Regards,
-- 
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium

Received on Friday, 9 July 2004 08:13:03 UTC