- From: Eliot Kimber <ekimber@innodata-isogen.com>
- Date: Wed, 04 May 2005 10:11:21 -0500
- To: John.Hockaday@ga.gov.au, xmlschema-dev@w3.org
- CC: www-xml-schema-comments@w3.org
John.Hockaday@ga.gov.au wrote:

> I expect that document instances using W3C XML Schemas will use a "namespace"
> declaration to identify which XML Schema should be used to validate that
> document instance. The problem that I see with the namespace is that a URI
> is the unique identifier. There is no PUBLIC identifier. As we have all
> probably experienced with old bookmarks, the content at URLs change a lot.
> If an XML Schema's version is not part of the URI and a new version of that
> XML Schema is made then it is likely that this will *not* be reflected in the
> URI and hence the namespace.

I think you're confusing two similar but distinct functions here: namespaces and schema locations.

DTD declarations with external identifiers are equivalent to schemaLocation specifications, not namespace declarations--SGML (and XML without namespaces) had no equivalent of namespaces [except for HyTime architectures].

For DTDs, which are a syntactic part of the document that references them, the PUBLIC identifier is nothing more than an alias for the external DTD subset's storage location (i.e., its filename). Thus it is completely appropriate that it include a version identifier, since if the external declaration subset is changed it's a new object and should be identified as such.

By contrast, a document's namespace *does not* directly identify a schema. It identifies (or rather, can be exclusively associated with) an (abstract) "application" that might have any number of schemas associated with it. That is, for a given application, with a single associated namespace, there might be different schemas *at the same time*, reflecting different profiles or uses of the application, or there might be different schema versions over time, reflecting changes to the details of the application. But the namespace itself is unchanged because the namespace identifies the application independent of its various implementations over time.
[For example, the XSLT namespace is invariant across versions of the XSLT spec because, as an abstract application, XSLT is XSLT regardless of its currently-defined details.]

In DTD-based documents that use external declaration subsets you always have to have an external identifier for the subset, so you always have something you can resolve or use in a catalog.

For non-DTD-based documents, there are two possible cases (assuming namespaced documents--the no-namespace case is degenerate and not worth considering because it allows no good general solution):

1. The document uses the schemaLocation= hint to say which specific schema it wants you to use.

2. The document specifies only a namespace.

In the first case, the schema location can be either a local, relative URI or an absolute URI. In the case of the absolute URI, the URI functions essentially as a PUBLIC ID does: it essentially demands mapping to a local resource via some sort of catalog method (for the simple practical reason that processing environments aren't always net connected, or because the schema is not in fact served on a publicly-available server). If the absolute URI includes some sort of version value, then you have *exactly the same* functionality and semantics as with PUBLIC IDs for external DTD subsets.

In the second case, the implication is that the system must determine which version of the schema to use. That would typically be done using a catalog, and probably implies that in most cases you want the latest or most general version of the schema. But a processor could use other heuristics to decide which version to use, for example, looking in the document for other clues or using some outside information, such as metadata held in a document management system.

Thus, I think the appropriate approach in your case is to require the use of schemaLocation= with absolute URIs that include version information--that gives you the same control you had before.
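As a rough illustration of the two cases above, here is a minimal sketch in Python of how a processor might pick a schema. Everything specific in it is an assumption made for the example: the example.org namespace and schema URIs, the local file paths, and the two dictionaries standing in for a catalog. A real system would more likely use an actual catalog mechanism than hard-coded tables.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical catalog: absolute, versioned schema URIs -> local copies.
SCHEMA_CATALOG = {
    "http://example.org/schemas/metadata/1.0/metadata.xsd": "schemas/metadata-1.0.xsd",
    "http://example.org/schemas/metadata/2.0/metadata.xsd": "schemas/metadata-2.0.xsd",
}

# Hypothetical per-namespace default, used when the document gives no hint.
NAMESPACE_DEFAULTS = {
    "http://example.org/ns/metadata": "schemas/metadata-2.0.xsd",
}


def resolve_schema(xml_text):
    """Return a schema location for the document's root element, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced tag as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else None

    # xsi:schemaLocation holds whitespace-separated (namespace, URI) pairs.
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    hints = dict(zip(tokens[0::2], tokens[1::2]))

    location = hints.get(namespace)
    if location is not None:
        # Case 1: the document names a specific schema. Map the URI to a
        # local copy if the catalog knows it, otherwise return it unchanged.
        return SCHEMA_CATALOG.get(location, location)

    # Case 2: namespace only. The system chooses the version.
    return NAMESPACE_DEFAULTS.get(namespace)


# Same namespace in both documents; only the first pins a schema version.
DOC_WITH_HINT = """<m:record xmlns:m="http://example.org/ns/metadata"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/ns/metadata
                        http://example.org/schemas/metadata/1.0/metadata.xsd"/>"""

DOC_NAMESPACE_ONLY = """<m:record xmlns:m="http://example.org/ns/metadata"/>"""

if __name__ == "__main__":
    print(resolve_schema(DOC_WITH_HINT))
    print(resolve_schema(DOC_NAMESPACE_ONLY))
```

Note that both sample documents carry the same namespace; only the schemaLocation= hint distinguishes the version requested, which is the control a versioned PUBLIC ID used to provide.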
It's important to remember that there is not (and never was) any particular magic to PUBLIC identifiers--they are just magic strings that require indirection to be resolved. In that respect they are indistinguishable from URIs that also require indirection to be resolved to real resources.

Cheers,

Eliot
--
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8155

ekimber@innodata-isogen.com
www.innodata-isogen.com
Received on Wednesday, 4 May 2005 15:10:14 UTC