Re: Namespace dispatching from Patrick Stickler on 2002-02-20 (www-tag@w3.org from February 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Wed, 20 Feb 2002 12:28:18 +0200
To: ext Paul Prescod <paul@prescod.net>, WWW TAG <www-tag@w3.org>
Message-ID: <B89946E2.F0D0%patrick.stickler@nokia.com>

On 2002-02-19 16:02, "ext Paul Prescod" <paul@prescod.net> wrote:

> "Which of the following are appropriate triggers for determining the
> document type of an XML document when metadata is unavailable:
> 
> 1 DOCTYPE statement
> 2 top-level namespace
> ...
> 
> I have not seen a compelling use-case for anything other than 1 and 2.

#2 is ruled out by cases where the same root element may
occur in multiple document models with differing content model
definitions (having only consistent semantics, but not
identical syntax).

I.e. when there is not a 1:1 correspondence between namespace
and document model.

Such as XHTML 1.0 Strict, Transitional, and Frameset document
types, all of which have the same namespace and nearly identical
vocabulary but not identical content models.

One is unable to determine from a top-level element

  <html xmlns="http://www.w3.org/1999/xhtml">
  ...
  </html>

which document model is being used.
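
To make the ambiguity concrete, here is a minimal Python sketch. The
function names are mine, purely for illustration; the three PUBLIC
identifiers are those of the XHTML 1.0 DTDs. Given nothing but the
root namespace, none of the three candidate models can be ruled out.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# PUBLIC identifiers of the three XHTML 1.0 DTDs. All three document
# types share the single namespace XHTML_NS.
XHTML_1_0_MODELS = [
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
]


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None."""
    tag = ET.fromstring(xml_text).tag
    # ElementTree spells qualified names as "{namespace}localname".
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


def candidate_models(xml_text):
    """Document models that the root namespace alone cannot rule out."""
    if root_namespace(xml_text) == XHTML_NS:
        return list(XHTML_1_0_MODELS)
    return []


doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)
for model in candidate_models(doc):
    print(model)
```

Namespace dispatching narrows the choice to the XHTML family, but
goes no further than that.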

The only way to know which document model applies to a particular
XML instance is to specify the actual document model explicitly.
In the absence of other metadata, the DOCTYPE declaration appears to
be the only (and most reasonable) choice. After all, that's what it's
for, eh?

In order to address modularity, an alternative to an instance-wide
DOCTYPE declaration might be an XML attribute such as xml:doctype
which takes a PUBLIC URI value, allowing folks to specify, even for
modular fragments, which document models apply to which sub-trees
of the XML instance. E.g.

  <html xmlns="http://www.w3.org/1999/xhtml"
        xml:doctype="urn:publicid:-//W3C//DTD+XHTML+1.0+Transitional//EN">
  ...
  </html>
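
A rough Python sketch of how a processor might read such an
attribute. Note the assumptions: xml:doctype is only a proposal, not
part of any specification; the rule that an element without the
attribute inherits the model of its nearest ancestor is my guess at
sensible behaviour; and the "note" vocabulary and its identifier are
made up for the example.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is always bound to this namespace.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Hypothetical attribute, as proposed above; not defined by any spec.
DOCTYPE_ATTR = "{%s}doctype" % XML_NS


def assign_models(xml_text):
    """Return (tag, document model) pairs in document order.

    Assumption: an element without xml:doctype inherits the model
    declared on its nearest ancestor, or None if there is none.
    """
    result = []

    def walk(element, inherited):
        model = element.get(DOCTYPE_ATTR, inherited)
        result.append((element.tag, model))
        for child in element:
            walk(child, model)

    walk(ET.fromstring(xml_text), None)
    return result


doc = (
    '<html xmlns="http://www.w3.org/1999/xhtml"'
    ' xml:doctype="urn:publicid:-//W3C//DTD+XHTML+1.0+Transitional//EN">'
    "<body>"
    # Made-up vocabulary and identifier, for illustration only.
    '<note xmlns="http://example.org/note"'
    ' xml:doctype="urn:publicid:-//Example//Note+1.0//EN">hi</note>'
    "</body></html>"
)
for tag, model in assign_models(doc):
    print(tag, "->", model)
```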

The only remaining issue to be resolved is to make PUBLIC identifiers
schema-formalism independent, so that ideally, we'd have something
akin to the more general PUBLIC identifier (sans 'DTD+')

  <html xmlns="http://www.w3.org/1999/xhtml"
        xml:doctype="urn:publicid:-//W3C//XHTML+1.0+Transitional//EN">
  ...
  </html>

to which one might have a choice of resolving to DTD, XML Schema,
RELAX NG, etc. all of which define the document model in question.
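
The resolution step could be sketched as a simple catalog lookup. The
catalog layout, the resolve() function and the example.org URLs are
all hypothetical; only the DTD URL is the real XHTML 1.0 Transitional
DTD location.

```python
# Hypothetical catalog: one formalism-independent identifier maps to
# several equivalent definitions of the same document model.
CATALOG = {
    "urn:publicid:-//W3C//XHTML+1.0+Transitional//EN": {
        "dtd": "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
        # Placeholder locations, for illustration only.
        "xsd": "http://example.org/schemas/xhtml1-transitional.xsd",
        "rng": "http://example.org/schemas/xhtml1-transitional.rng",
    },
}


def resolve(doctype, preferred=("rng", "xsd", "dtd")):
    """Return (formalism, location) for the first preferred formalism
    the catalog offers for this identifier, or None if none matches."""
    entries = CATALOG.get(doctype, {})
    for formalism in preferred:
        if formalism in entries:
            return formalism, entries[formalism]
    return None


ident = "urn:publicid:-//W3C//XHTML+1.0+Transitional//EN"
print(resolve(ident))
print(resolve(ident, preferred=("dtd",)))
```

A DTD-only validator and a RELAX NG validator would then pass
different preference lists and still agree on which document model
the instance claims to follow.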

Declaring the doctype on the element itself allows for modular
"islands" that may be validated independently of their context,
and makes it clear which document model applies to each.

Regards,

Patrick

--

Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com

Received on Wednesday, 20 February 2002 05:26:50 UTC