- From: Rick JELLIFFE <ricko@geotempo.com>
- Date: Thu, 21 Sep 2000 22:55:45 +0800
- To: xml-dist-app@w3.org

Henrik Frystyk Nielsen wrote:

> David Orchard wrote:
> > It's careless to make an assumption - namespaces are URIs for
> > the purpose of fetching schemas - and then claim it as fact.
> > It has never been the intent
> > that applications can do a GET on the namespace URI to fetch a schema.
>
> I didn't claim that you are guaranteed a schema - I said might - just as
> well as you might get HTML back when you go to some website. This is
> what the NS spec states - you might or you might not. The same thing
> goes with schemaLocation - you are not guaranteed a schema - that's just
> life. It certainly does not state that "it has never been the
> intent...".

I followed the debate on XML Namespaces with great interest on the XML IG at the time. Henrik is trying to rewrite history: the statement in the spec that it is "not a goal that" the URI reference "be directly usable for retrieval of a schema" clearly states that it is not the model or expectation in XML Namespaces that applications will do a GET on the namespace URI to fetch a schema. (Perhaps Henrik can point to the part of the Namespaces spec that we have missed, where it says that URIs are used so we might retrieve something using them.)

There are many reasons why this should be so. The proponents of namespaces=schema have never tried to answer those reasons. Instead we get this treat-namespaces-as-lucky-dip-then-everything-will-be-fine guff. The XML Schema WG have, after long consideration, tried to make a workable approach with the schemaLocation attribute.

The issue comes down to the purpose of namespaces. The XML Namespaces spec makes it very clear in the opening paragraph of its motivation: where "a single XML document may contain elements and attributes ... that are defined for and used by multiple software modules. ...if such a markup vocabulary exists which is well-understood and for which there is useful software available, it is better to re-use this markup than reinvent it."

So the key here is to maximize how much "well-understood" markup vocabularies can be re-used. Re-use does not require or imply fine-grained schema consistency: quite the reverse. The vocabulary is well understood and should be usable in different schemas even if they impose additional criteria. Identifying schema with namespace reduces re-use: it encourages duplicate names for things that are the same.

For example, say I have an HTML element p. The HTML content model for p does not allow my element rick:dog. But I want to have

    <rick:pet><html:p>hello <rick:dog>Rover</rick:dog></html:p></rick:pet>

where I disallow any contents of html:p apart from PCDATA and rick:dog elements. This is still an html:p element: I give it the well-understood name html:p because there is useful software available that can use it. It has a very different content model from the content model in any of the HTML DTDs, because content models do not adequately express the actual semantics of the element: they miss out a step. A paragraph can contain allowed text and inlined and embedded objects; in typical HTML these include many well-known elements; however, restricting away all of them in a particular case is no reason for a different namespace name.

Having to change the namespace name defeats the purpose of allowing reuse. Instead of having robust processing of well-understood names, we get software that has to understand zillions of names: every version change or content model change to add something that the particular schema language was incapable of modeling (or to try to redress some constraint introduced as an artifact of the limitations of the schema language) would require a new namespace.

The idea that somehow our nice XML Schema software can trace through the type derivation hierarchy and eventually come to well-understood underlying names (and then figure out whether the derived type is compatible or not) is bogus. First, because of performance/download issues.
Second, because it is far more complex for a namespace-using system to have to trace through XML Schemas than if the namespace signified general semantics and the schemaLocation directly indicated the particular schema applicable. Third, because only the application itself knows which information items are essential to its operation and must be preserved: the schema that an application is built to may be much simpler or more complex than the schema that the data has. As we have no way of matching data schemas to application schemas, the "be generous in what you accept" rule is wise. A system which immediately barfs when unimportant schema violations occur is fragile.

So what would it be better for the namespace URI reference to ultimately locate? Either a semantic schema or, better, a directory of related resources discoverable by some conventions. Namespace=schema blocks the use of the namespace URI for more systematic and extensible purposes.

This issue could be defused if SOAP provided some convention to prevent this blocking. For example, it could say that the query "?request=schema" should be appended to the namespace URI reference when attempting to dereference it to get a schema. This prevents hogging of the URL by structural schemas, allows other queries by other specs which want other resources based on dereferencing the namespace URI, and the query will be ignored by servers which just have a file (at least, the servers I quickly tried ignored this).

Rick Jelliffe
(Not speaking for employer)
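
The "?request=schema" convention suggested above can be illustrated with a minimal sketch. Only the query name "request" and the value "schema" come from the message; the function name schema_request_uri, the example.org namespace URIs, and the handling of URIs that already carry a query or fragment are assumptions made for illustration. The derived URL is meant only for retrieval: the namespace name itself stays unchanged for comparison purposes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def schema_request_uri(namespace_uri: str, request: str = "schema") -> str:
    """Build a retrieval URL by adding 'request=<kind>' to a namespace URI.

    Hypothetical helper: the message only proposes appending
    "?request=schema"; merging with an existing query is an assumption.
    """
    parts = urlsplit(namespace_uri)
    # Keep query parameters already present, replacing any old 'request'.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "request"
    ]
    params.append(("request", request))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


# Hypothetical namespace URIs, used only to show the shape of the result.
print(schema_request_uri("http://example.org/ns/pets"))
print(schema_request_uri("http://example.org/ns/pets", request="directory"))
```

Because the kind of resource is carried in the query, other specifications could ask for other resources (the "directory" value here is invented) without competing for the bare namespace URL.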
Received on Thursday, 21 September 2000 10:41:21 UTC