- From: Simon Cox <simon.cox@jrc.ec.europa.eu>
- Date: Wed, 2 Sep 2009 20:08:59 +0200
- To: <xmlschema-dev@w3.org>
Also forwarded to the list as this is likely to be of general interest.

-----Original Message-----
From: Simon Cox [mailto:simon.cox@jrc.ec.europa.eu]
Sent: Wednesday, 2 September 2009 20:06
To: 'noah_mendelsohn@us.ibm.com'
Cc: 'Andrew Welch'; 'ekimber'; 'G. Ken Holman'; 'Henry S. Thompson'; 'Tsao, Scott'
Subject: RE: Best Practices for Establishing Namespace Name

Not in general, but sometimes, and often enough to matter for certain use-cases.

A review of processing engines (including those built in to enterprise tools like Oracle) a few years ago led us to the conclusion that the real tools used in real organizations had diverse behaviours. OGC is in the business of publishing schemas for widespread use by many organizations, used to build loosely coupled systems where it isn't feasible to enforce any particular one of the caching and processing models allowed by the spec, so we had to assume the worst case. That requirement inexorably led to the conclusion that new namespaces were required every time.

This won't apply in every use case, but if you are publishing schemas that you expect lots of people to use, and you have limited control over them, it is a scenario that must be considered.

However, there is a nuance to this: where the new schema only adds stuff to an existing schema, you do this in a new namespace, but <import> the existing schema, so existing components do not change namespaces; it's just that the ones added after the original publication have a new namespace. (Effectively this is the strategy used by Google when upgrading KML.)

--------------------------------------------------------
Simon Cox
European Commission, Joint Research Centre,
Institute for Environment and Sustainability,
Spatial Data Infrastructures Unit, TP 262 Via E.
Fermi, 2749, I-21027 Ispra (VA), Italy
Tel: +39 0332 78 3652
Fax: +39 0332 78 6325
mailto:simon.cox@jrc.ec.europa.eu
http://ies.jrc.ec.europa.eu/simon-cox
SDI Unit: http://sdi.jrc.ec.europa.eu/
IES Institute: http://ies.jrc.ec.europa.eu/
JRC: http://www.jrc.ec.europa.eu/
--------------------------------------------------------

-----Original Message-----
From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com]
Sent: Wednesday, 2 September 2009 19:56
To: Simon Cox
Cc: 'Andrew Welch'; 'ekimber'; 'G. Ken Holman'; 'Henry S. Thompson'; 'Tsao, Scott'
Subject: RE: Best Practices for Establishing Namespace Name

Simon Cox writes:

> A processor will maintain a cache of schema component definitions and
> declarations and associate it with a namespace.

Not in general. That's neither required nor encouraged by the Recommendation, though it is allowed, and some implementations do so.

> The processing rules in the XML Schema spec do not require that a
> processor load the schema fresh if a new document comes in with the
> same namespace.

That's true, but neither do they forbid reloading. Quoting from the Rec [1]:

"Processors have the option to assemble (and perhaps to optimize or pre-compile) the entire schema prior to the start of an assessment episode, or to gather the schema lazily as individual components are required."

I think your reasoning is somewhat backwards: I believe the intention is that processors may implement a variety of strategies, and that >users should choose processors (or processor switches) that are appropriate for the particular purpose<.

So, rather than saying: "be sure to use a new namespace when the processing rules for your markup change between versions, because your processor will surely cache the old content models", I would say:

"There are many tradeoffs in deciding whether to use the same markup from the same namespaces when content models and/or the interpretation of content changes from version to version.
One complexity to consider is that some schema processors maintain caches of pre-assembled schemas, and those processors may not behave well if the same markup is to be interpreted differently according to the version of the language. Other processors do provide either for just-in-time assembly of schemas, or for the necessary level of control over schema document caching."

BTW: there are real downsides to using a new namespace when minor changes are made to a language. There can be many other artifacts that will need revision that would otherwise be unnecessary, e.g. XPaths in stylesheets. Furthermore, it can happen that a new version is created to revise just one small feature of a language, and then the question arises whether to republish the whole language in a new namespace, or only the new features. If the former is done, then even documents otherwise unaffected by the language revision may wind up having two expressions, one using the old and one using the new namespace; conversely, if only changed markup is in the new namespace, then users have to remember which feature was revised when, and deal with many namespace prefixes when languages are revised many times.

Overall, my observation has been that it's usually easier to use namespaces for more functional decomposition (e.g. one namespace for personnel-related vocabulary, one for inventory, etc.) and to use namespace changes sparingly when implementing revisions to a language specification. So, mostly, successive versions of a language should use the same namespaces, except maybe for qualitatively different features.
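
To make the point about XPaths concrete, here is a minimal sketch in Python (standard library only). The namespace URIs, the element names and the `item_names` helper are all invented for the illustration and do not come from any real vocabulary. It shows a path expression written against the old namespace silently matching nothing once the same markup is republished in a new namespace, until the prefix binding is revised.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace names, made up for this illustration.
NS_V1 = "http://example.org/inventory/1.0"
NS_V2 = "http://example.org/inventory/2.0"

# The same markup, published once in each namespace.
DOC_V1 = f'<inv:stock xmlns:inv="{NS_V1}"><inv:item>bolt</inv:item></inv:stock>'
DOC_V2 = f'<inv:stock xmlns:inv="{NS_V2}"><inv:item>bolt</inv:item></inv:stock>'

# A path expression of the kind a stylesheet might contain.
ITEM_PATH = "inv:item"


def item_names(xml_text, bindings):
    """Return the text of every child matched by ITEM_PATH under the root."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(ITEM_PATH, bindings)]


# The path was written with the prefix bound to the version 1 namespace.
v1_hits = item_names(DOC_V1, {"inv": NS_V1})

# Unchanged path and binding against the republished markup: no match.
v2_hits = item_names(DOC_V2, {"inv": NS_V1})

# Only after the binding is revised does the path match again.
v2_hits_rebound = item_names(DOC_V2, {"inv": NS_V2})

print(v1_hits, v2_hits, v2_hits_rebound)
```

The failure is quiet: nothing raises, the query just returns an empty result, which is part of why such artifacts are easy to overlook when a namespace changes.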
Noah

[1] http://www.w3.org/TR/xmlschema-1/#layer1

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

"Simon Cox" <simon.cox@jrc.ec.europa.eu>
09/02/2009 01:34 PM

To: "'Andrew Welch'" <andrew.j.welch@gmail.com>
cc: <noah_mendelsohn@us.ibm.com>, "'Tsao, Scott'" <scott.tsao@boeing.com>, "'G. Ken Holman'" <gkholman@cranesoftwrights.com>, "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, <xmlschema-dev@w3.org>, "'ekimber'" <ekimber@reallysi.com>
Subject: RE: Best Practices for Establishing Namespace Name

No - it's not merely the identity issue. It's the processing issue.

A processor will maintain a cache of schema component definitions and declarations and associate it with a namespace. The processing rules in the XML Schema spec do not require that a processor load the schema fresh if a new document comes in with the same namespace. So if the new document is actually using a different schema (even if the namespace is the same) then processing will fail.

The only way to ensure safe processing (i.e. that respects *all* of the processing strategies allowed for in the XML Schema spec) is to be scrupulous about changing namespace if the schema changes. In many cases that is most easily handled by including a version identifier in the namespace.

Because of all this, the XML Schema processing rules effectively imply that the target namespace is the schema identifier.

--------------------------------------------------------
Simon Cox
European Commission, Joint Research Centre,
Institute for Environment and Sustainability,
Spatial Data Infrastructures Unit, TP 262 Via E.
Fermi, 2749, I-21027 Ispra (VA), Italy
Tel: +39 0332 78 3652
Fax: +39 0332 78 6325
mailto:simon.cox@jrc.ec.europa.eu
http://ies.jrc.ec.europa.eu/simon-cox
SDI Unit: http://sdi.jrc.ec.europa.eu/
IES Institute: http://ies.jrc.ec.europa.eu/
JRC: http://www.jrc.ec.europa.eu/
--------------------------------------------------------

-----Original Message-----
From: Andrew Welch [mailto:andrew.j.welch@gmail.com]
Sent: Wednesday, 2 September 2009 19:18
To: Simon Cox
Cc: noah_mendelsohn@us.ibm.com; Tsao, Scott; G. Ken Holman; Henry S. Thompson; xmlschema-dev@w3.org; ekimber
Subject: Re: Best Practices for Establishing Namespace Name

2009/9/2 Simon Cox <simon.cox@jrc.ec.europa.eu>:
> Andrew Welch wrote
>> use a version attribute to distinguish the versions
>
> Where?

Typically on the root element, but it could go anywhere that's suitable.

> The issue was that elements with the same name were defined
> differently in both GML 2.0 and GML 3.0, but they had the same target
> namespace. The differences were subtle - technical rather than
> conceptual - but real as far as a validating processor is concerned.
> The XML namespace is to all practical intents and purposes the
> designated identifier for 'the schema' and we had the same identifier
> for different things. Chaos ensues.

Between versions the content model of elements will change, but that doesn't mean you need a different namespace... Incompatible changes are actually easier than supporting backwards compatibility: instead of detecting the version and using the right xsd and corresponding parsing code, you simply reject anything that fails validation for that version.

Anyway, it's interesting that you say the namespace is (to all intents and purposes) the identifier for the schema, perhaps that's where the problem is... the namespace value itself has started to mean something, when it's meant to mean nothing.
Everyone seems to have different opinions on this, and I think I've asked in the past if anyone has a best practices guide, which didn't attract too many confident replies, but at the moment for me it's simply "namespace that won't ever change, version attribute" : )

--
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
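
The "namespace that won't ever change, version attribute" approach can be sketched roughly as follows, in Python with the standard library only. Everything named here is an assumption made for the illustration: the namespace URI, the `orders` root element, the `version` attribute values and the schema file names are hypothetical, and the sketch only selects a schema document; it does not perform validation itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace that stays fixed across all versions.
NS = "http://example.org/orders"

# Hypothetical mapping from the root element's version attribute
# to the schema document that should be used for validation.
SCHEMA_FOR_VERSION = {
    "1.0": "orders-1.0.xsd",
    "2.0": "orders-2.0.xsd",
}


def pick_schema(xml_text):
    """Choose a schema document from the version attribute on the root."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}localname".
    if root.tag != f"{{{NS}}}orders":
        raise ValueError(f"unexpected root element: {root.tag}")
    version = root.get("version")
    if version not in SCHEMA_FOR_VERSION:
        # Unknown or missing version: reject rather than guess.
        raise ValueError(f"unsupported version: {version!r}")
    return SCHEMA_FOR_VERSION[version]


doc_v1 = f'<orders xmlns="{NS}" version="1.0"/>'
doc_v2 = f'<orders xmlns="{NS}" version="2.0"/>'
print(pick_schema(doc_v1), pick_schema(doc_v2))
```

The chosen file would then be handed to whatever validating processor is in use. A processor that caches assembled schemas by namespace alone would still need to be told to load the selected document, which is the concern raised earlier in the thread.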
Received on Wednesday, 2 September 2009 18:09:40 UTC