- From: Kurt Riede <kurt.riede@web.de>
- Date: Wed, 11 May 2005 17:24:59 +0200
- To: <xmlschema-dev@w3.org>
> An infinite set cannot meaningfully be versioned because you cannot
> distinguish one version from another (because you can never enumerate
> all its members in order to prove equality or difference).

Yes you can. That's more or less what Kurt Goedel did in his
incompleteness theorem. You *can* enumerate all its members, even if the
set has an infinite size. And you *can* also check whether two such sets
are different or not. I don't see your philosophical reason for not
versioning namespaces.

> This does require that when there are different schema versions for a
> given namespace that documents specify the correct schemaLocation
> value, otherwise John has no choice but to retrieve an arbitrary
> (presumably the latest) version of the schema for that namespace.

No, I think the author of the instance should declare which *version* of
the namespace the instance conforms to. And the consumer of the instance
*must know* that there are different versions of the schema.

If the instance author provides a schemaLocation, the consumer should
*never* rely on it, because it could be a fake used to smuggle invalid
documents to the consumer.

If the author of the instance doesn't provide a version, the behaviour
of the consumer is application dependent. One application might, e.g.,
assume the latest version known to the consumer; another might just
reject the instance.

Cheers
Kurt

----- Original Message -----
From: "Eliot Kimber" <ekimber@innodata-isogen.com>
To: <xmlschema-dev@w3.org>
Cc: "Fraser Crichton" <fraser.crichton@solnetsolutions.co.nz>; "Dan Vint" <dvint@dvint.com>; <John.Hockaday@ga.gov.au>
Sent: Wednesday, May 11, 2005 4:50 PM
Subject: Re: Versioning of XML Schema and namespaces

> Fraser Crichton wrote:
> > Hi,
> >
> > I'm very interested in the reasons behind this -
> >
> > > Putting a version in the namespace is definitely not the right thing
> > > to do.
> >
> > I ask because I've seen that as a possible approach to versioning
> > (http://www.xfront.com/Versioning.pdf) and it seems a number of
> > practitioners have adopted this e.g. the US Dept of Navy, xCIL, etc.
>
> Per the W3C namespace spec, a namespace identifies an abstraction, an
> infinite set of names distinguished from all other possible names by
> having a unique prefix (the namespace URI).
>
> Thus a namespace URI identifies an abstraction--there is no particular
> mechanism defined within the namespace spec for defining what names are
> actually in the namespace. That is, a namespace URI identifies an
> unbounded set of names, that is, an infinite set.
>
> An infinite set cannot meaningfully be versioned because you cannot
> distinguish one version from another (because you can never enumerate
> all its members in order to prove equality or difference).
>
> This is the philosophical reason for not versioning namespaces.
>
> The practical reason derives from this idea of namespaces naming
> unversionable abstractions:
>
> In practice, namespaces are bound to XML "applications" [I put
> "application" in quotes because it's not a precisely-defined term and to
> distinguish it from the narrow usage of _application_ to mean a specific
> software program.] For example, XSLT is an XML application, as are
> DocBook and XHTML. This binding is done in application specifications.
>
> As an abstraction, the XSLT application is invariant over time: its
> basic purpose and usage will always be what it is now, regardless of the
> details of how it is implemented.
>
> Thus, in this use case, namespace URIs represent the abstract idea of
> the application (that is, the concept of XSLT or DocBook or XHTML) and
> that abstract idea cannot be versioned and doesn't change over time.
>
> That is, as long as the fundamental nature of a given application
> doesn't change, it would be inappropriate and unnecessary to change its
> namespace URI simply because some implementation detail of the
> application changed.
>
> Or said another way, if you change the namespace URI, in any way, you
> are identifying a fundamentally *different* application.
>
> Or said another way, the namespace URI names *all current and future
> versions* of the concrete expressions of the application.
>
> What *does* change are the concrete implementation artifacts that make
> up the application at any point in time. As concrete objects, they are
> versionable and will likely have different versions in time. Thus it is
> appropriate (in fact essential) that the resource locators for those
> concrete objects reflect the versions of them, otherwise you could only
> locate a single version of any one of them, which would be very limiting
> in most cases (for example, if I have two versions of the schema for a
> given application and documents that validate against one version or the
> other).
>
> Thus, while the namespace URI for a given application should be
> invariant, the resource URLs for the concrete implementation components
> (schemas, transforms, java classes, documentation, etc.) will be variant
> as new versions are created. Of course, you might also offer URLs that
> represent the "latest" version--resources may have any number of URLs
> associated with them. But, in the general case, there should always be
> version-specific URLs for the resources.
>
> How can this work in practice?
>
> The best solution, in the abstract, I think, is what Mike suggests,
> namely an attribute that specifies the schema version, which the
> processor then uses to determine the correct schema instance to apply.
> This suggests that it might be useful for the XSD spec (or perhaps a
> separate, more general spec, since this requirement isn't XSD-specific)
> to define a "schema-version" attribute that can be used independently
> from the schemaLocation attribute.
>
> But, given that current software (and certainly the Xerces processor,
> which provides schema-awareness in many tool chains) depends primarily
> on schemaLocation and/or catalogs, I think that a productive approach
> would be as described below.
>
> John Hockaday writes:
>
> > If I don't already have a copy of the
> > XSDs referred to in the XML document instances then I need to download those
> > XSDs and validate them.
> >
> > If the XSDs are not valid then I report my findings to my clients and reject
> > the relevant XML document instances. If the XSDs are valid then I validate
> > the XML document instances against those XSDs and report my findings to my
> > clients. Again only valid XML document instances are accepted.
>
> > If I do have a copy of the XSDs then I will have already validated them and I
> > hope to use OASIS Catalogue files to refer to local copies of those XSDs when
> > validating related XML document instances. This will of course reduce
> > bandwidth, time and costs and is essential when validating 40,000+ metadata
> > records at a time.
>
> Here there are two key and common requirements:
>
> 1. Validate documents against whatever schema they say they conform to
> (and, as a side effect, validate the schemas themselves).
>
> 2. Provide local copies of schemas to reduce processing time and network
> overhead.
>
> John knows that there may be different versions of schemas for the same
> namespace.
>
> I think the solution here is to use the catalogs as follows:
>
> 1. Require that incoming documents use absolute URIs for all
> schemaLocation specifications (not sure if this is currently the case in
> John's case).
>
> 2. Use the catalog to map these absolute URIs to the local copy of the
> schema (if there is one--if there's not one, fetch it and update the
> catalog).
>
> 3. As a fallback, map namespace URIs to schema URIs, where the
> appropriate schema for that namespace is known.
>
> This does require that when there are different schema versions for a
> given namespace that documents specify the correct schemaLocation
> value, otherwise John has no choice but to retrieve an arbitrary
> (presumably the latest) version of the schema for that namespace.
>
> In the case where the version has been used in the namespace and there
> is no schemaLocation, the problem is the same: either there's exactly
> one schema for that namespace or John has to arbitrarily pick one.
>
> This all puts the onus on document authors to specify correctly which
> version of a namespace's schema they want to use. There is no way around
> this--it's simply an unavoidable consequence of the fact that there can
> be different versions of a schema for a given namespace.
>
> Note too, that this basic approach can be used to prevent authors from
> using schemaLocation= to nefarious ends where you have the requirement
> that documents conform only to a known, and controlled, set of schemas.
> Because you are remapping the schemaLocation URIs to local files, if
> authors specify a schemaLocation URI that you don't recognize (meaning
> that it's not mapped in the catalog), you can fall back to pointing to
> some local schema that will cause the document in question to fail its
> validation check. This is the functional equivalent of ignoring
> schemaLocation=.
>
> Cheers,
>
> Eliot
>
> --
> W. Eliot Kimber
> Professional Services
> Innodata Isogen
> 9390 Research Blvd, #410
> Austin, TX 78759
> (512) 372-8155
>
> ekimber@innodata-isogen.com
> www.innodata-isogen.com
>
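The catalog-based lookup described in the quoted message (map absolute
schemaLocation URIs to local copies, fall back to a namespace mapping, and
treat unrecognised locations as untrusted) could be sketched roughly as
follows. This is only an illustration under stated assumptions: the two
dictionaries stand in for a real OASIS catalog, and every URI, file path and
the `resolve_schema` function itself are hypothetical names, not part of any
existing tool.

```python
from typing import Optional

# Hypothetical catalog entries: absolute schemaLocation URI -> local copy.
LOCATION_CATALOG = {
    "http://example.org/schemas/metadata/1.0/metadata.xsd":
        "/var/schemas/metadata-1.0.xsd",
    "http://example.org/schemas/metadata/1.1/metadata.xsd":
        "/var/schemas/metadata-1.1.xsd",
}

# Hypothetical fallback: namespace URI -> the schema version chosen by the
# consumer when the instance does not say which version it wants.
NAMESPACE_CATALOG = {
    "http://example.org/ns/metadata": "/var/schemas/metadata-1.1.xsd",
}


def resolve_schema(namespace_uri: str,
                   schema_location: Optional[str] = None,
                   strict: bool = True) -> Optional[str]:
    """Return the schema to validate against, or None to reject.

    strict=True  : an unrecognised schemaLocation is rejected, which has the
                   same effect as never trusting author-supplied locations.
    strict=False : an unrecognised schemaLocation is returned unchanged so
                   the caller can fetch it and add it to the catalog.
    """
    if schema_location is not None:
        local_copy = LOCATION_CATALOG.get(schema_location)
        if local_copy is not None:
            return local_copy
        return None if strict else schema_location
    # No schemaLocation given: fall back to the namespace mapping, if any.
    return NAMESPACE_CATALOG.get(namespace_uri)
```

Whether to run with `strict=True` is the point under dispute in this thread:
a consumer that accepts only a known, controlled set of schemas would use the
strict mode, while a consumer that validates against whatever schema a
document names would use the non-strict mode and fetch the missing schema.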
Received on Friday, 13 May 2005 06:02:59 UTC