- From: Al Gilman <asgilman@iamdigex.net>
- Date: Fri, 16 Jun 2000 10:42:56 -0500
- To: xml-uri@w3.org
**Summary

The problem is that both sides are assuming there is a 1:1 relationship and are arguing over how to define it. There is no answer in that space. There is no 1:1 relationship between namespaces and languages, or between Qnames and element types.

The actual operational requirement is for the lower layers to distinguish by namespace within a document and for the upper layers to associate by language across documents and between documents and processors. In particular note that the namespace does not necessarily uniquely identify the language nor definitively identify the types and attributes so named.

A Qname does not completely identify an element type or attribute for all XML processing. Qnames suffice to keep lower-level processing from conflating types occurring in one document that should be distinguished. But they do not suffice to identify the element type or attribute for all purposes, i.e. across documents and in the matching of documents to processors. For this, the full language definition is in general required. The Qname is sufficient when used as an index into the language definition, but not by itself, because it is legal (and widely done) to reuse Qnames in related dialects, viz: HTML.

**Details

At 10:45 AM 2000-06-16 +0100, John Aldridge wrote:
>Dan Connolly and David Carlisle were having an exchange a few days ago
>which seemed to grind to a halt with...
>
>At 01:21 14/06/00 +0100, David Carlisle wrote:
> > Dan Connolly wrote:
>
>> > put a schema for MathML at
>> > http://www.w3.org/1998/Math/MathML
> > (snip)
>
>> > (one that integrates MathML
>> > into XHTML by saying, e.g. that <mathml> can be used as
>> > an HTML block element),
>>
>>
>> > This seems straightforward; am I misunderstanding your question?
>>
>>XHTML Basic? XHTML 1.1? There was a reason for that previous massive
>>row over "three namespaces for html" to allow multiple schema for the
>>html namespace.
>
>I don't think I missed an answer, but I'd really like to hear what Dan
>Connolly sees as the resolution of this sort of problem. It seems to get
>to the heart of the question of whether a namespace is or is not a
>"language". Can I encourage him (or someone else sharing his vision) to
>respond?

Would you consider asking if a language is a namespace?

The issue of whether the leaf-level element types and attributes in this document are the same as those in another document is not a question of syntax, but of usage. It is the question "is the language in use here and there the same?" To compare across documents, you have to compare languages, not namespaces.

Element and attribute names, uttered within markup, are not atoms, but indices into some language schema. This schema may or may not be represented in a document, but the case where there is such a document, and where constraints as well as tokens are associated with the nodes in the InfoSet for that language, has to be included a priori.

Namespaces are OK for sorting things out locally, but namespace processing does not yield a conclusive answer to the cross-document comparison of the markup. The upper layers need to know and care about what language is being used where those names are being used. The lower layers just need to build a compliant infoset structure.

Assuming that an element type Qname, or a namespace of them, is an ontological atom, in a space with a discrete topology, breaks the orderly allocation of functions between these two layers.

The type-name of an element, even when qualified as to namespace, does not fully identify its type. It merely indexes _which type in the language_ is indicated. Without knowing the language context, the type is undefined.
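To make the "index, not atom" point concrete, here is a minimal sketch in Python. The two dictionaries are hypothetical, heavily simplified stand-ins for language definitions (real DTDs or schemas carry far more than a set of attribute names), and the names QName, STRICT, TRANSITIONAL and resolve_type are invented for illustration only.

```python
from typing import NamedTuple


class QName(NamedTuple):
    """A namespace-qualified name: (namespace URI, local name)."""
    namespace: str
    local: str


XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, simplified "language definitions": each maps a QName to the
# attribute names permitted on that element type in that dialect.
STRICT = {QName(XHTML_NS, "body"): frozenset({"id", "class"})}
TRANSITIONAL = {QName(XHTML_NS, "body"): frozenset({"id", "class", "bgcolor"})}


def resolve_type(language, qname):
    """Use the QName as an index into a given language definition."""
    return language[qname]


body = QName(XHTML_NS, "body")

# The same QName indexes different type definitions in different dialects.
strict_body = resolve_type(STRICT, body)
transitional_body = resolve_type(TRANSITIONAL, body)
same_type = strict_body == transitional_body  # False
```

The QName alone is enough to tell body apart from other names inside one document, but whether bgcolor is legal on it cannot be answered until a language definition is supplied.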
In the upper-layer processing, the same set of InfoSet nodes that has been segregated "by namespace" in the lower layers needs to be handled as bound to a particular language definition, a distinction finer than the namespacing done by the lower layers. It's the same filter of the InfoSet, only the identification is refined. The upper layers refine the identification of what that filter of nodes is associated with. "A namespace" is just the starting point.

The lower layers should neither need nor presume to recognize the namespace. They need only distinguish the different namespaces appearing in one parse or one document.

Match patterns in stylesheets refer to names in the space of the document that is being style-processed. They are name acceptors, not name creators.

Common processing of "the same names in different documents" should not be automatic. Only common processing of "the same language in different documents" should be. That is to say, common processing above the layer that builds the InfoSet.

There is no reason why an identifier of the language could not be used as the discriminator in lower-layer processing of a namespaced filter of markup within a document. Conversely, there is also no reason why the language in that filter should not be identified incrementally by separate namespace and schema location indications.

[Wave DRUMS flag - strict in what you transmit, loose in what you accept]

In the "how many namespaces for XHTML" debate we realized that it was useful to have two characterizations of the language in a document: a general characterization and a precise characterization. The analogy to MIME type/subtype nomenclature is strong. The casual processor only needs to know that the document is HTML; a validating parser needs to know what technical definition, down to the jot and tittle, you are using as a reference for this HTML. Different processors need to know different levels of precision in identifying the language that they are processing.
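The coarse and precise characterizations can be sketched as two separate lookups on the same document. This is only an illustration under stated assumptions: it assumes the precise hint is carried in an xsi:schemaLocation attribute using the XML Schema instance namespace URI shown, the schema URL under example.org is made up, and the helper names split_tag, coarse_language and fine_language are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the schema-instance attributes.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Sample document; the schema URL is a made-up placeholder.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/1999/xhtml http://example.org/schemas/xhtml-basic.xsd">
  <body><p>hello</p></body>
</html>"""


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def coarse_language(root):
    """Coarse identification: just the namespace name of the root element."""
    return split_tag(root.tag)[0]


def fine_language(root):
    """Precise identification: the schema hint paired with that namespace,
    or None when the document gives no hint."""
    pairs = root.get(f"{{{XSI}}}schemaLocation", "").split()
    hints = dict(zip(pairs[0::2], pairs[1::2]))
    return hints.get(coarse_language(root))


root = ET.fromstring(DOC)
coarse = coarse_language(root)  # enough for a casual processor
fine = fine_language(root)      # what a validating processor would consult
```

A casual processor can stop at coarse; a validating one goes on to fine. Two documents can agree on the first and differ on the second, much as two MIME bodies can share a type and differ in subtype.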
Language identification is not atomic. It is at least as rich as Boolean. Given the rich lattice of sublanguages it is impractical to assume that the coarse and fine descriptions of the language in use in a particular namespaced filter of markup in a particular document are the same.

So the atomic solution where the language identification is atomic and is used as the discriminant for the namespace (a.k.a. namespace name) is not practical. It is a bad fit to the actual need, as the HTML example demonstrates.

The ns-attr and schema location attribute give us a mechanism to indicate both a coarse and fine description of the language in use within a local namespaced filter of markup. [not necessarily canonical, but workable]

The actual operational requirement is for the lower layers to distinguish by namespace within a document and for the upper layers to associate by language across documents and between documents and processors. In particular note that the namespace does not necessarily uniquely identify the language. The layering of processing needs to provide for this progressive refinement in the identification of the types used in the markup.

Al

>--
>Cheers,
>John
>
Received on Friday, 16 June 2000 10:26:03 UTC