- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Wed, 20 Feb 2002 12:28:18 +0200
- To: ext Paul Prescod <paul@prescod.net>, WWW TAG <www-tag@w3.org>
On 2002-02-19 16:02, "ext Paul Prescod" <paul@prescod.net> wrote:

> "Which of the following are appropriate triggers for determining the
> document type of an XML document when metadata is unavailable:
>
> 1 DOCTYPE statement
> 2 top-level namespace
> ...
>
> I have not seen a compelling use-case for anything other than 1 and 2.

#2 is ruled out by cases where the same root element may occur in
multiple document models with differing content model definitions
(having consistent semantics, but not identical syntax), i.e. when
there is not a 1:1 correspondence between namespace and document model.

The XHTML 1.0 Strict, Transitional, and Frameset document types are
such a case: all three have the same namespace and nearly identical
vocabulary, but not identical content models. One is unable to
determine from a top-level element

   <html xmlns="http://www.w3.org/1999/xhtml">
   ...
   </html>

which document model is being used.

The only way to know which document model applies to a particular
XML instance is to specify the actual document model explicitly. In
the absence of other metadata, the DOCTYPE declaration appears to be
the only (and most reasonable) choice. After all, that's what it's
for, eh?

In order to address modularity, an alternative to an instance-wide
DOCTYPE declaration might be an XML attribute such as xml:doctype
which takes a PUBLIC URI value, allowing folks to specify, even for
modular fragments, which document models apply to which sub-trees
of the XML instance. E.g.

   <html xmlns="http://www.w3.org/1999/xhtml"
         xml:doctype="urn:publicid:-//W3C//DTD+XHTML+1.0+Transitional//EN">
   ...
   </html>

The only remaining issue to be resolved is to make PUBLIC identifiers
schema-formalism independent, so that ideally we'd have something
akin to the more general PUBLIC identifier (sans 'DTD+')

   <html xmlns="http://www.w3.org/1999/xhtml"
         xml:doctype="urn:publicid:-//W3C//XHTML+1.0+Transitional//EN">
   ...
   </html>

which one might have a choice of resolving to a DTD, XML Schema,
RELAX NG, etc., all of which define the document model in question.

Declaring the doctype on the actual element allows for modular
"islands" that may be validated independently of their context, and
makes it clear specifically which document model applies to each.

Regards,

Patrick

--
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
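
The per-element doctype idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: xml:doctype is the attribute proposed in the message and is not part of any XML specification; the function name effective_doctypes, the sample document, and the "-//EXAMPLE//..." public identifier are hypothetical; and the rule that descendants inherit the nearest enclosing xml:doctype is one plausible reading of "which document models apply to which sub-trees", not something the message spells out.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Attribute proposed in the message; hypothetical, not defined by any spec.
DOCTYPE_ATTR = f"{{{XML_NS}}}doctype"


def effective_doctypes(elem, inherited=None):
    """Yield (element, doctype) pairs in document order.

    The doctype is the xml:doctype value on the element itself or,
    failing that, the value inherited from its nearest ancestor
    (assumed inheritance rule). It is None if no ancestor declares one.
    """
    current = elem.get(DOCTYPE_ATTR, inherited)
    yield elem, current
    for child in elem:
        yield from effective_doctypes(child, current)


# Hypothetical instance: an XHTML document containing one nested "island"
# that declares its own (made-up) document model.
SAMPLE = """\
<html xmlns="http://www.w3.org/1999/xhtml"
      xml:doctype="urn:publicid:-//W3C//XHTML+1.0+Transitional//EN">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML"
          xml:doctype="urn:publicid:-//EXAMPLE//Hypothetical+Island//EN"><mi>x</mi></math>
  </body>
</html>
"""

if __name__ == "__main__":
    for elem, doctype in effective_doctypes(ET.fromstring(SAMPLE)):
        print(elem.tag, "->", doctype)
```

A validator built along these lines would still need a catalog that resolves each public identifier to a DTD, XML Schema, or RELAX NG grammar; that resolution step is left out here.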
Received on Wednesday, 20 February 2002 05:26:50 UTC