Re: How namespace names might be used from Al Gilman on 2000-06-16 (xml-uri@w3.org from June 2000)

From: Al Gilman <asgilman@iamdigex.net>
Date: Fri, 16 Jun 2000 10:42:56 -0500
To: xml-uri@w3.org
Message-Id: <Version.32.20000616085400.04199400@pop.iamdigex.net>

**Summary

The problem is that both sides are assuming there is a 1:1 relationship and
are arguing over how to define it.  There is no answer in that space.  There
is no 1:1 relationship between namespaces and languages, nor between Qnames
and element types.

The actual operational requirement is for the lower layers to distinguish
by namespace within a document and for the upper layers to associate by
language across documents and between documents and processors.  In
particular note that the namespace does not necessarily uniquely identify
the language nor definitively identify the types and attributes so named.

A Qname does not completely identify an element type or attribute for all
XML processing.  Qnames suffice to keep lower-level processing from
conflating types occurring in one document which should be distinguished.
But they do not suffice to identify the element type or attribute for all
purposes, i.e. across documents and in the matching of documents to
processors.  For this, the full language definition is in general required.
The Qname is sufficient when used as an index into the language
definition, but not by itself, because it is legal (and widely done) to
reuse Qnames in related dialects, viz. HTML.

**Details

At 10:45 AM 2000-06-16 +0100, John Aldridge wrote:
>Dan Connolly and David Carlisle were having an exchange a few days ago 
>which seemed to grind to a halt with...
>
>At 01:21 14/06/00 +0100, David Carlisle wrote:
> > Dan Connolly wrote:
>
>> > put a schema for MathML at
>> > http://www.w3.org/1998/Math/MathML
>
>   (snip)
>
>> > (one that integrates MathML
>> > into XHTML by saying, e.g. that <mathml> can be used as
>> > an HTML block element),
>>
>>
>> > This seems straightforward; am I misunderstanding your question?
>>
>>XHTML Basic? XHTML 1.1? There was a reason for that previous massive
>>row over "three namespaces for html" to allow multiple schema for the
>>html namespace.
>
>I don't think I missed an answer, but I'd really like to hear what Dan 
>Connolly sees as the resolution of this sort of problem.  It seems to get 
>to the heart of the question of whether a namespace is or is not a 
>"language".  Can I encourage him (or someone else sharing his vision) to 
>respond?

Would you consider asking if a language is a namespace?

The issue of whether the leaf-level element types and attributes in this
document are the same as those in another document is not a question of
syntax, but of usage.  It is a question "is the language in use here and
there the same?"  To compare across documents, you have to compare
languages, not namespaces.

Element and attribute names, uttered within markup, are not atoms, but
indices into some language schema.  This schema may or may not be
represented in a document, but the case where there is such a document and
there are constraints as well as tokens associated with the nodes in the
InfoSet for that language has to be included a priori.

Namespaces are OK for sorting things out locally, but namespace processing
does not yield a conclusive answer to the cross-document comparison of the
markup.  

The upper layers need to know and care about what language is being used
where those names are being used.  The lower layers just need to build a
compliant infoset structure.

Assuming that an element type Qname, or a namespace of them, is an
ontological atom, in a space with a discrete topology, breaks the orderly
allocation of functions between these two layers.  The type-name of an
element, even when qualified as to namespace, does not fully identify its
type.  It merely indexes _which type in the language_ is indicated.
Without knowing the language context, the type is undefined.
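
A minimal sketch of the "Qname as index" idea, in Python.  The language
identifiers ("xhtml-strict", "xhtml-with-math") and the content models are
hypothetical, made up for illustration, and local names stand in for full
content models to keep it short.  The point is only that the same
(namespace, local name) pair resolves to a different type depending on
which language definition it is looked up in:

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical language definitions: each maps a qualified name to a
# (much simplified) type definition.  Both reuse the same Qname.
LANGUAGES = {
    "xhtml-strict": {
        (XHTML_NS, "body"): {"allowed_children": {"p", "div", "h1"}},
    },
    "xhtml-with-math": {
        (XHTML_NS, "body"): {"allowed_children": {"p", "div", "h1", "math"}},
    },
}


def resolve_type(language, qname):
    """Return the type that qname indexes within language, or None."""
    return LANGUAGES.get(language, {}).get(qname)


qname = (XHTML_NS, "body")
strict = resolve_type("xhtml-strict", qname)
extended = resolve_type("xhtml-with-math", qname)

# Same Qname, different types: the name alone does not settle the type.
print(strict == extended)
# Without a known language context there is nothing to resolve against.
print(resolve_type("unknown-language", qname))
```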


In the upper-layer processing, the same set of InfoSet nodes that has been
segregated "by namespace" in the lower layers needs to be handled as bound
to a particular language definition, a distinction finer than the
namespacing done by the lower layers.  It's the same filter of the InfoSet,
only the identification is refined.

The upper layers refine the identification of what that filter of nodes is
associated with.  "A namespace" is just the starting point.

The lower layers should neither need nor presume to recognize the
namespace.  They need only distinguish the different namespaces appearing
in one parse or one document.
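
As a rough sketch of that lower-layer job, the Python below uses the
standard library's xml.etree.ElementTree to sort the elements of one
document by namespace name without interpreting the names at all.  The
"urn:example:..." namespace names and the split_by_namespace helper are
invented for the example:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two made-up namespaces that both define an element called "title".
DOC = """<doc xmlns="urn:example:a" xmlns:b="urn:example:b">
  <title>t</title>
  <b:title>u</b:title>
</doc>"""


def split_by_namespace(xml_text):
    """Group element local names by namespace name, as opaque strings."""
    groups = defaultdict(list)
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        if tag.startswith("{"):
            # ElementTree spells qualified names as "{namespace}local".
            uri, local = tag[1:].split("}", 1)
        else:
            uri, local = "", tag
        groups[uri].append(local)
    return dict(groups)


groups = split_by_namespace(DOC)
print(groups)
```

The two "title" elements land in different groups, which is all this layer
has to guarantee.  Nothing here knows, or needs to know, what language
either namespace name stands for.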

Match patterns in stylesheets refer to names in the space of the document
that is being style-processed.  They are name acceptors, not name creators.

Common processing of "the same names in different documents" should not be
automatic; only common processing of "the same language in different
documents" should be.  That is to say, common processing above the layer
that builds the InfoSet.

There is no reason why an identifier of the language could not be used as
the discriminator in lower-layer processing of a namespaced filter of
markup within a document.  Conversely, there is also no reason why the
language in that filter should not be identified incrementally by separate
namespace and schema location indications.

[Wave DRUMS flag - strict in what you transmit, loose in what you accept]

In the "how many namespaces for XHTML" debate we realized that it was
useful to have two characterizations of the language in a document: a
general characterization and a precise characterization.  The analogy to
MIME type/subtype nomenclature is strong.  The casual processor only needs
to know that the document is HTML; a validating parser needs to know what
technical definition down to the jot and tittle you are using as a
reference for this HTML.

Different processors need to know different levels of precision in
identifying the language that they are processing.

Language identification is not atomic.  It is at least as rich as Boolean.

Given the rich lattice of sublanguages it is impractical to assume that the
coarse and fine descriptions of the language in use in a particular
namespaced filter of markup in a particular document are the same.  So the
atomic solution where the language identification is atomic and is used as
the discriminant for the namespace (a.k.a. namespace name) is not
practical.  It is a bad fit to the actual need, as the HTML example
demonstrates.

The ns-attr and schema location attribute give us a mechanism to indicate
both a coarse and fine description of the language in use within a local
namespaced filter of markup.   [not necessarily canonical, but workable]
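
One way this could look in practice, sketched in Python.  The assumptions
are loud ones: the xsi:schemaLocation spelling and its namespace name
follow the later XML Schema Recommendation rather than anything fixed at
the time of writing, the schema URL at example.org is invented, and
describe_language is a hypothetical helper:

```python
import xml.etree.ElementTree as ET

# Assumption: namespace name of the schema-instance attributes as later
# standardized; the draft in circulation may have used another.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Assumption: the schema location URL below is made up for illustration.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3.org/1999/xhtml http://example.org/schemas/xhtml-basic.xsd">
  <body/>
</html>"""


def describe_language(xml_text):
    """Return (coarse, fine): namespace name and schema location hint."""
    root = ET.fromstring(xml_text)
    coarse = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    # schemaLocation holds whitespace-separated (namespace, location) pairs.
    pairs = root.get("{%s}schemaLocation" % XSI, "").split()
    hints = dict(zip(pairs[0::2], pairs[1::2]))
    return coarse, hints.get(coarse)


coarse, fine = describe_language(DOC)
print(coarse)
print(fine)
```

A casual processor could stop at the coarse value; a validating one would
go on to the fine value, and would get None where no hint was given.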

The actual operational requirement is for the lower layers to distinguish
by namespace within a document and for the upper layers to associate by
language across documents and between documents and processors.  In
particular note that the namespace does not necessarily uniquely identify
the language.

The layering of processing needs to provide for this progressive refinement
in the identification of the types used in the markup.

Al

>--
>Cheers,
>John
>
Received on Friday, 16 June 2000 10:26:03 UTC