Re: How namespace names might be used

Point of procedure:

John Aldridge asked for some help understanding what people were thinking.
That is the request to which I responded.  I gave him a brain dump on how I
view the relationship between namespaces and XML language/languages.  He
thanked me [off the list].

I didn't ask anyone else to salute my vision or bless it as the consensus
of this group.  

I am concerned that Simon mis-read the intent of my remarks.  I agree with
Simon that the following quoted points of rough agreement are quite enough
agreement to separate the upper and lower layer problems and deal
separately with the lower layer issue of how a parser should distinguish
among prefixed names in a given context.

--
[Simon]
>I think it's reasonable to assert that 'distinguish' is the only part we're
>arguing over in the 'absolutize' | 'forbid' | 'literal' discussion.

[Al] That's what I want people to agree to.  The namespace Rec. seems to
have been read by some to say something else.
--
[Al]
>>The lower layers should not need nor presume to recognize the namespace.
>>Only distinguish the different namespaces appearing in one parse or one
>>document.
[Simon]
>We agree, for once!  (There's still the ugly question of how to
>distinguish...)
[Al]
>>Match patterns in stylesheets refer to names in the space of the document
>>that is being style-processed.  They are name acceptors, not name creators.
[Simon]
>Okay.
>

Al

PS: there are a few attempts at clarification below, but they pertain to
Simon's reading of what I said, more than to issues immediately before this
list; read if you wish.

At 02:11 PM 2000-06-16 -0400, Simon St.Laurent wrote:
>At 10:42 AM 6/16/00 -0500, Al Gilman wrote:
>>The problem is both sides are assuming there is a 1:1 relationship and are
>>arguing over how to define it.  There is no answer in that space.  There is
>>no 1:1 relationship between namespaces and languages, between Qnames and
>>element types.
>
>I'm afraid it's a little more complex than that - not everyone is aiming
>for a 1:1 relationship - many-to-one and one-to-many are important
>possibilities I think have been somewhat lost in previous discussions of
>architectural planning.
>

Good.  We agree there.

>>The actual operational requirement is for the lower layers to distinguish
>>by namespace within a document and for the upper layers to associate by
>>language across documents and between documents and processors.  In
>>particular note that the namespace does not necessarily uniquely identify
>>the language nor definitively identify the types and attributes so named.
>
>Where does this 'operational requirement' come from?
>

Optimizing the communication effectiveness of XML language.

I see XML dialects and applications as being designed under a tradeoff of
concerns for ease of processing vs. effectiveness of communication.

>I think it's reasonable to assert that 'distinguish' is the only part we're
>arguing over in the 'absolutize' | 'forbid' | 'literal' discussion.
>
>>A Qname does not completely identify an element type or attribute for all
>>XML processing.  Qnames suffice to keep lower level processing from
>>identifying types occurring in one document which should be distinguished.

>>But they do not suffice to identify the element type or attribute for all
>>purposes, i.e. across documents and in the matching of documents to
>>processors.  For this, the full language definition is in general required.
>> The Qname is sufficient when used as an index into the language
>>definition, but not by itself because it is legal (and widely done) to
>>reuse Qnames in related dialects, viz: HTML.
>
>I'm not sure the Qname should be taken as 'an index into the language
>definition' unless you take a very very very broad understanding of
>languages, one that allows for synonyms, loan words, and overlaps between
>languages.
>

a) such a broad understanding is presumed here.

b) note that when I say "an index _into_ the language definition" I mean a
key isolating some identifiable item inside the language definition, not
the pointer to or identification of the language definition context that is
indexed into.

>>Would you consider asking if a language is a namespace?
>
>I'm not sure if that's even a relevant question when it's clear that the
>parties involved have different understandings of 'language' in addition to
>'namespace'.
>
>>The issue of whether the leaf-level element types and attributes in this
>>document are the same as those in another document is not a question of
>>syntax, but of usage.  It is a question "is the language in use here and
>>there the same?"  To compare across documents, you have to compare
>>languages, not namespaces.
>
>In my understanding of language, that doesn't make sense.  To compare
>across documents, I'd compare context and understanding, not necessarily
>'language'.  There's too much contingency involved here for me to accept
>any overarching 'language' concept.
>

How would you identify context, and how would you guide understanding?  I
think we may be in flaming agreement, here.  

If you would say that two elements with identical Qnames for their
element-type-names appearing in different contexts, where the different
contexts bind to different schemata, may need to be processed somewhat
differently depending on what is asserted in the schemata; then I think we
are agreed and done.
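
To make that point concrete, here is a minimal sketch in Python using the standard library's ElementTree. Everything in it is hypothetical: the namespace http://example.org/ns/pub, the schema file names strict.xsd and loose.xsd, and the use of an xsi:schemaLocation attribute (with the namespace URI from the later XML Schema Recommendation) as the way a context binds to a schema. It only illustrates that namespace processing yields identical expanded names in both documents, while the schema each context points at still differs.

```python
import xml.etree.ElementTree as ET

# Assumption: the schema binding is expressed with xsi:schemaLocation.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Two hypothetical documents: different prefixes, same namespace name,
# but each context points at a different schema.
DOC_A = """<p:title xmlns:p="http://example.org/ns/pub"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://example.org/ns/pub strict.xsd">Report</p:title>"""

DOC_B = """<x:title xmlns:x="http://example.org/ns/pub"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://example.org/ns/pub loose.xsd">Report</x:title>"""


def describe(xml_text):
    """Return (expanded element name, schema hint found in this context)."""
    root = ET.fromstring(xml_text)
    schema_hint = root.get(f"{{{XSI}}}schemaLocation")
    return root.tag, schema_hint


name_a, schema_a = describe(DOC_A)
name_b, schema_b = describe(DOC_B)

# The lower layer sees the same name in both documents...
print(name_a == name_b)
# ...but the contexts assert different schemata.
print(schema_a == schema_b)
```

Whether the two title elements should then be processed alike is a decision for whatever sits above the parser; the expanded name alone does not settle it.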

>>Element and attribute names, uttered within markup, are not atoms, but
>>indices into some language schema.  This schema may or may not be
>>represented in a document, but the case where there is such a document and
>>there are constraints as well as tokens associated with the nodes in the
>>InfoSet for that language has to be included a_priori.
>
>Great.  So we're completely and utterly on different tracks.  To me,
>elements and attribute names are atoms, and the 'language schemas' are
>occasionally useful tools.
>

The 'occasional relationships' mentioned below, and the 'occasionally
useful' here indicate that we're not on such cleanly divergent tracks.

The overall framework has to provide for how one is to understand and
process the occasional cross-element relationship, or occasionally-useful
schema.

I'm just trying to make sure we don't slam the door on our foot as we
encapsulate lower-level processing to make it safe from upper-level meddling.


>>Namespaces are OK for sorting things out locally, but namespace processing
>>does not yield a conclusive answer to the cross-document comparison of the
>>markup.  
>
>All it should do is provide a full name for the element or attribute.
>
>>The upper layers need to know and care about what language is being used
>>where those names are being used.  The lower layers just need to build a
>>compliant infoset structure.
>
>Agreed on the lower layers, but I'm not sure the upper layer dream is
>viable, for reasons stated above.
>
>>Assuming that an element type Qname, or a namespace of them, is an
>>ontological atom, in a space with a discrete topology, breaks the orderly
>>allocation of functions between these two layers.  The type-name of an
>>element, even when qualified as to namespace, does not fully identify its
>>type.  It merely indexes _which type in the language_ is indicated.
>>Without knowing the language context, the type is undefined.
>
>Er... why are you so concerned with defining types on that broad a scale?

Because I understand that people have been saying on xml-dev that the
Namespaces Rec. accomplishes the same.  I want to get them to stop saying
that.

>I'm not sure it's necessary, and suspect the underlying approach is
>severely compromised by this dream of 1:1 mapping you described above.
>

There was no dream of a 1:1 mapping in my head.  I am sorry you found one
in my words.  My dream of an information model contains no 1:1 arcs.  1:1
relationships are tautologies, not information.  If there is a 1:1 arc in
your draft information model, you need to combine the nodes so linked and
merge the attributes and relationships attached to the formerly
distinguished entities.  The 1:1 cardinality tells you that there aren't
really two independent entities there at all; just two clusters of
properties that actually go together all the time and are the properties
of one entity.

>>In the upper-layer processing, the same set of InfoSet nodes that has been
>>segregated "by namespace" in the lower layers needs to be handled as bound
>>to a particular language definition, a distinction finer than the
>>namespacing done by the lower layers.  It's the same filter of the InfoSet,
>>only the identification is refined.
>
>At this point, I think we've got almost nothing in common in our upper
>layers, so maybe we'd better focus strictly on the lower layers.
>(Something like respecting the 'signifier/signified' distinction, and
>talking only about signifiers.)
>
>>The upper layers refine the identification of what that filter of nodes is
>>associated with.  "A namespace" is just the starting point.
>
>I don't see a namespace - I see lots of elements and attributes marked with
>namespace identifiers.  Seems to me like a much more practical starting
>point.
>

But you still have cached somewhere in your stack the context in which you
find them.  If you step back to look into different contexts, you would
know context as well as Qname for each element and attribute, no?

That's what I'm trying to say: you haven't removed the necessity of knowing
the context by normalizing the prefixed name.  In addition to the
'globalized' Qname, you still need to be aware of the context.  Including
being aware of links to applicable schemata asserted in some contexts.


>>The lower layers should not need nor presume to recognize the namespace.
>>Only distinguish the different namespaces appearing in one parse or one
>>document.
>
>We agree, for once!  (There's still the ugly question of how to
>distinguish...)
>
>>Match patterns in stylesheets refer to names in the space of the document
>>that is being style-processed.  They are name acceptors, not name creators.
>
>Okay.
>
>>Common processing of "the same names in different documents" should not be
>>automatic.  Only common processing of "the same language in different
>>documents."  That is to say common processing above the layer that builds
>>the InfoSet.
>
>I'm not sure "common processing of 'the same language in different
>documents"' is coherent.  I'm much happier with common processing of the
>same names in different document contexts.
>

Well, we don't have an agreed public definition for either 'document' or
'context,' but I am equally happy with 'context' as with 'document.'  I only
said 'document' to use a particular context that I thought would be
concrete and clear; I get accused of being too abstract at times.

>>There is no reason why an identifier of the language could not be used as
>>the discriminator in lower-layer processing of a namespaced filter of
>>markup within a document.  Conversely, there is also no reason why the
>>language in that filter should not be identified incrementally by separate
>>namespace and schema location indications.
>
>I'd argue that the reuse of the identifier by the upper layers shouldn't
>bind the lower layer to additional processing that goes well beyond its
>needs or the needs of alternative upper layers.
>

Did I argue anything else?

You said it very well.  The lower layer reduces the element type and
attribute identification to standard Qnames.

The upper layers process the elements and attributes based on these
Qnames and any other clauses (such as a schema location link) governing
understanding of these types and attributes articulated in the context
where the Qnames are used.
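
One way to picture this division of labor is the following rough Python sketch. The names lower_layer, upper_layer and HANDLERS, the urn:example:doc namespace and the strict.xsd schema hint are all invented for illustration; this is not a description of any real processor. The lower layer only resolves prefixes into expanded names. The upper layer chooses an action from the expanded name plus an optional schema hint, and falls back to the name alone when the hint is unknown.

```python
import xml.etree.ElementTree as ET


def lower_layer(xml_text):
    """Resolve prefixes; return (expanded name, attributes) per element."""
    return [(el.tag, dict(el.attrib)) for el in ET.fromstring(xml_text).iter()]


# Hypothetical dispatch table keyed on (expanded name, schema hint).
# A hint of None means "no schema known for this context".
HANDLERS = {
    ("{urn:example:doc}para", "strict.xsd"): "validate-then-render",
    ("{urn:example:doc}para", None): "render",
}


def upper_layer(items, schema_hint=None):
    """Pick an action per element from its name and the context's schema."""
    actions = []
    for name, _attrs in items:
        action = (HANDLERS.get((name, schema_hint))
                  or HANDLERS.get((name, None), "ignore"))
        actions.append(action)
    return actions


items = lower_layer('<d:para xmlns:d="urn:example:doc"/>')
print(upper_layer(items))                # no schema hint given
print(upper_layer(items, "strict.xsd"))  # context names a schema
```

The lower layer never looks at the schema hint, so a different upper layer could be placed on top of it without changing the parse.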

There is a conflict between this comment, where you say you want the lower
layer to be free to do its thing without being burdened with inappropriate
functional demands, and later where you say:

--
[Al]
>>Different processors need to know different levels of precision in
>>identifying the language that they are processing.
[Simon]
>But those levels of precision aren't bound to particular layers of
>processing.  

The way to make lower layer processing something that can be put in the can
without fear of contradiction is to do a good job of rationalizing the
processing layers.  That is to say there has to be a reasonably clear
rationale for the scope of some processing that gets done
first-and-finally, in terms of the dependency graph for processing concerns.

The processing layering and the understanding layering don't have to be
identical, but they have to be interoperable.  This does somewhat constrain
the relationship between the two layerings.

>>[Wave DRUMS flag - strict in what you transmit, loose in what you accept]
>>

>>In the "how many namespaces for XHTML" debate we realized that it was
>>useful to have two characterizations of the language in a document: a
>>general characterization and a precise characterization.  The analogy to
>>MIME type/subtype nomenclature is strong.  The casual processor only needs
>>to know that the document is HTML; a validating parser needs to know what
>>technical definition down to the jot and tittle you are using as a
>>reference for this HTML.
>
>I think the way XHTML/HTML are actually being processed in XML contexts in
>browsers is on an element-by-element basis, without any strong attachment
>to 'this is the HTML language'.  Even Microsoft's heavy-duty binding of the
>html: prefix to HTML elements and its apparent preference for HTML as
>documents and XML as data retains enough flexibility to let XML developers
>get a lot of work done using HTML atoms, without grave concerns for the
>integrity of the HTML 'language'.
>
>>Different processors need to know different levels of precision in
>>identifying the language that they are processing.
>
>But those levels of precision aren't bound to particular layers of
>processing.  My personal vision of XML processing doesn't talk about
>'language'.  It talks about atoms and their occasional relations.  I know
>I'm not alone in that.
>
>>Language identification is not atomic.  It is at least as rich as Boolean.
>>
>>Given the rich lattice of sublanguages it is impractical to assume that the
>>coarse and fine descriptions of the language in use in a particular
>>namespaced filter of markup in a particular document are the same.  So the
>>atomic solution where the language identification is atomic and is used as
>>the discriminant for the namespace (a.k.a. namespace name) is not
>>practical.  It is a bad fit to the actual need, as the HTML example
>>demonstrates.
>>
>>The ns-attr and schema location attribute give us a mechanism to indicate
>>both a coarse and fine description of the language in use within a local
>>namespaced filter of markup.   [not necessarily canonical, but workable]
>
>Give who a mechanism?  Lots of people aren't looking for language
>descriptions per se.  A cluster of atomic descriptions might be handy, but
>there's no reason for it to be limited to descriptions of atoms sharing a
>single namespace.
>
>>The actual operational requirement is for the lower layers to distinguish
>>by namespace within a document and for the upper layers to associate by
>>language across documents and between documents and processors.  In
>>particular note that the namespace does not necessarily uniquely identify
>>the language.
>
>As noted above, I don't think 'requirement' is necessary.
>
>>The layering of processing needs to provide for this progressive refinement
>>in the identification of the types used in the markup.
>
>There is no 'progressive refinement' except in certain systems which like
>to think of themselves as 'progressive' and/or 'refined'.  I'd suggest we
>stop applying such value judgments to different processing models, and get
>over this obsession with 'language'.
>

The sense of "progressive refinement" as used here is the following.  The
type is defined by its proper knowledge: a collection of applicable
assertions.  The lower layer understands a supertype, i.e. a subset of this
collection of assertions.  Higher layers understand greater subsets of the
proper knowledge pertaining to the type.  This is the "progressive
refinement": successive views resolve items in terms of more and more
refined ontologies.  Each higher layer makes finer distinctions concerning
the types of things.
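
A toy model of this reading, sketched in Python: each type is a small collection of assertions, and each layer sees a growing subset of them. The assertion keys (name, family, definition), their values and the three-layer split are assumptions made up for the example. Two types that look identical to the lower layers become distinguishable only at the layer that sees the full definition.

```python
# Hypothetical "proper knowledge" for two types that share a Qname.
TYPE_A = {"name": "{urn:example:html}table", "family": "html",
          "definition": "strict"}
TYPE_B = {"name": "{urn:example:html}table", "family": "html",
          "definition": "transitional"}

# Each layer understands a larger subset of the assertions than the last.
LAYERS = [
    ("name",),                         # lower layer: expanded name only
    ("name", "family"),                # casual processor: general language
    ("name", "family", "definition"),  # validating processor: exact definition
]


def view(type_knowledge, layer):
    """The subset of assertions about a type that a given layer understands."""
    return {key: type_knowledge[key] for key in LAYERS[layer]}


def distinguishable(a, b, layer):
    """True if the given layer can tell the two types apart."""
    return view(a, layer) != view(b, layer)


for layer in range(len(LAYERS)):
    print(layer, distinguishable(TYPE_A, TYPE_B, layer))
```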


>Simon St.Laurent
>XML Elements of Style / XML: A Primer, 2nd Ed.
>http://www.simonstl.com - XML essays and books
> 

Received on Saturday, 17 June 2000 13:50:33 UTC