thoughts on namespaces

I've been trying to follow the namespace discussion, but I'm sure
I haven't completely understood the details of all of it.  In
the hopes that something here might be useful, I'd like to
outline my current thinking.

I start by treating the namespace problem as basically a 
problem of uniquifying names in an SGML instance and providing
a simple syntactic way to group an interconnected collection of
names.  (In this message, I'll concentrate on element names,
but namespace applies also to attr names, entity names, etc.)

Consider a 12083 article document that embeds a cals table
that contains one entry with paragraphs and lists from the
12083 doctype, another entry with an aap equation, and a third
entry with a small HTML document fragment.

(With no namespace, the document is well-formed (!), but there is
no idea that the <a> element in the HTML fragment and the <a> element
in the aap equation are different elements.  It is also impossible
to associate the different <a> elements properly with their respective
style sheets, and it is of course impossible to do any kind of checking
against any part of the DTD/schema model.)

We see we need all elements down to the table to be in the
12083-article namespace; we need the table element and all its
descendants down to the entry elements to be in the cals namespace;
we need the entry element *markup* to be in the cals namespace, but
we need all its contents in the 12083-article namespace; we need 
the outmost equation element (either f or fd) and all its descendants 
to be in the aap namespace; we need the outmost HTML element and all 
its descendants to be in the HTML namespace.

We need two specification capabilities:

I.  This element's markup is in namespace X
II. This element's content is in namespace Y

(Note for II, "content" not only includes subelements but also 
entity references that might be direct children of this element.)

We can package these specification capabilities into functions in
at least two ways:

A.  Element's markup and content namespaces are set independently
-----------------------------------------------------------------
1.  Set the namespace for this element's markup to X (without
    affecting the namespace of its contents/descendants)
2.  Set the namespace for this element's contents/desendants to Y
    (without affecting the namespace for this element's markup)

B.  Element's markup and content namespaces are usually dependent
-----------------------------------------------------------------
1.  Set the namespace for both this element's markup and its
    contents/descendants to X
2.  Set the namespace for this element's contents/desendants to Y
    (without affecting the namespace for this element's markup)

A2 and B2 are identical.  In the B case, if you want to change
only the namespace of the element's markup without changing the
namespace of its contents/dependents, you'd have to do B1(X)
followed by:
     B2(what the namespace was before you changed it to X).
In terms of clean orthogonality of functions, I prefer A.
Note that A1 has no "inheritance/defaulting" issues (that is,
A1 has no affect on any of its descendant element's namespaces);
A2, by definition, does set the namespace for the entire subtree.

In either case A or B, it's important to realize that you have
two functions you must be able to associate with an element instance.
It's the function that specifies the namespace for the element's markup
(whether or not is also specifies the namespace for the element's
content) where I have problems with attributes/archforms (since
I posted these points earlier, I'll just abbreviate here):

1.  you would need to parse the start tag for all attributes to
    determine if it has a namespace-specifying attribute before
    you can determine what the namespace is for the very element gi
    and attribute specification list you just parsed;

2.  by using archforms, you have no way to allow different names in
    a start tag to come from different namespaces;

3.  if you go with something like a "namespaceid:" prefix, as long as
    you've added : to namechar, a standard XML parser will be able to
    parse the (fully namespace-qualified) document.  

I like the idea of using notation declarations to associate a namespaceid
prefix with a URI/FPI.  In fact, using FSI syntax (or equivalent), you
can associate a namespaceid with a sequence or or-group of URIs. 

Once you've decided to use namespace: to indicate the A1 (or B1) function,
I don't have nearly so much trouble with using some attribute assignment
to indicate the A2/B2 function, since this does not affect the namespace 
of the names in the start tag in which it appears, but only sets the 
namespace for that element's content.  The value of this XML-content-namespace
(or whatever) attribute be one of the namespaceid's that are associated
(via a notation declaration) with the full namespace URI or whatever.

paul

Received on Thursday, 19 June 1997 11:57:15 UTC