namespace node implementation

I've been thinking about how to efficiently implement namespace "nodes". 
(This is for serialization and get-in-scope-namespace only - I already 
have efficient code for namespace resolution when parsing XML, and for 
matching QNames.)

Does the following sound like it would work ok?  (The last point is the 
critical non-obvious one.)

* Each element has a "namespace mapping", which maps prefixes to URIs,
and which can be implemented as a hash table or a vector (property list).
  (Since the namespace mapping is primarily used for serialization, it
makes more sense to use a space-efficient vector.)
* Once created, a namespace mapping is immutable, and so can be shared
between element nodes.
* When parsing an XML document, if an element has no namespace 
attributes we re-use the namespace mapping of its parent.  If it has 
namespace attributes, we create a new namespace mapping which is the 
combination of the parent's namespace mapping with the new namespace 
attributes.
* When serializing an element, we print all the namespaces in the 
element's namespace mapping, except for ones that are redundant because 
they have already been serialized in an enclosing element.
* When an element is constructed, its namespace mapping includes all the
"active namespace nodes" (in the sense of the specification) plus any
of the prologue or predefined namespaces that are referenced in
the current element *or* (if this is a direct element constructor) in
any enclosed direct element constructors.  (This rule is meant to
minimize the number of distinct namespace mappings we have to create.
The implementation may need to be a little bit clever here.)
* When an element is (conceptually) copied (re-parented), we use its 
existing namespace mapping.  We do *not* create a new namespace mapping 
to incorporate any namespace in the parent.
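
To make the sharing rule concrete, here is a minimal Python sketch of an
immutable, vector-style namespace mapping.  The names `extend` and
`lookup` are hypothetical, chosen only for illustration; this is one
way the rules above might be realized, not a description of any
existing implementation.

```python
from typing import Dict, Optional, Tuple

# A namespace mapping is a flat, immutable tuple of (prefix, uri) pairs,
# i.e. the space-efficient "property list" form rather than a hash table.
NamespaceMapping = Tuple[Tuple[str, str], ...]

EMPTY: NamespaceMapping = ()


def extend(parent: NamespaceMapping, decls: Dict[str, str]) -> NamespaceMapping:
    """Combine a parent's mapping with an element's namespace attributes.

    With no namespace attributes the parent's mapping object itself is
    returned, so parent and child share it.  Otherwise a new tuple is
    built in which the new declarations override same-prefix entries.
    """
    if not decls:
        return parent
    kept = tuple((p, u) for (p, u) in parent if p not in decls)
    return tuple(decls.items()) + kept


def lookup(mapping: NamespaceMapping, prefix: str) -> Optional[str]:
    """Linear scan; acceptable because mappings are usually very small."""
    for p, u in mapping:
        if p == prefix:
            return u
    return None


root = extend(EMPTY, {"ns1": "NS1"})
child = extend(root, {})               # no xmlns attributes: shares root's mapping
grand = extend(child, {"ns2": "NS2"})  # new declaration: a new mapping is built
```

Because a mapping is never mutated after creation, `child is root`
holds, and a large document with few namespace declarations needs only
a handful of distinct mapping objects.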

The last point may cause some slightly surprising behavior.  Consider:

let $a := <a xmlns:ns1="NS1"><b/></a>
let $b := $a/b
let $c := <c xmlns:ns2="NS2">{$b}</c>
let $d := <d xmlns:ns3="NS3">{$c}</d>

Serializing gives us:
$a -> <a xmlns:ns1="NS1"><b/></a>
$b -> <b xmlns:ns1="NS1"/>
$c -> <c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
$c/b -> <b xmlns:ns1="NS1"/>
$d -> <d xmlns:ns3="NS3"><c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c></d>
$d/c -> <c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
$d/c/b -> <b xmlns:ns1="NS1"/>

I.e. $b "inherits" ns1 from <a> and keeps it even when "removed" from
<a>, but it does not "inherit" ns2 from <c> in the same way.  This may
be counter-intuitive.
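
The serializations listed above can be reproduced with a small Python
model of the copy and serialization rules.  This is only a sketch under
stated assumptions: the `Element` class and the `copy_node` and
`serialize` functions are hypothetical names, and the elements are
built by hand with the mappings that the parsing rule would give them
(<b/> shares the mapping of <a>).

```python
class Element:
    def __init__(self, name, namespaces, children=()):
        self.name = name
        self.namespaces = namespaces    # immutable tuple of (prefix, uri) pairs
        self.children = list(children)


def copy_node(elem):
    """Copy for re-parenting: a new node identity, but the existing
    namespace mapping object is reused; nothing from the new parent is
    merged in."""
    return Element(elem.name, elem.namespaces,
                   [copy_node(c) for c in elem.children])


def serialize(elem, in_effect=None):
    """Print the element's namespaces, skipping any binding that an
    enclosing element has already serialized.  Prefixes that are in
    effect in the output but absent from the element's own mapping are
    not undeclared in this sketch."""
    in_effect = in_effect or {}
    new = [(p, u) for p, u in elem.namespaces if in_effect.get(p) != u]
    attrs = "".join(f' xmlns:{p}="{u}"' for p, u in new)
    scope = {**in_effect, **dict(new)}
    body = "".join(serialize(c, scope) for c in elem.children)
    if body:
        return f"<{elem.name}{attrs}>{body}</{elem.name}>"
    return f"<{elem.name}{attrs}/>"


ns1 = (("ns1", "NS1"),)
a = Element("a", ns1, [Element("b", ns1)])   # <b/> shares <a>'s mapping
b = a.children[0]
c = Element("c", (("ns2", "NS2"),), [copy_node(b)])
d = Element("d", (("ns3", "NS3"),), [copy_node(c)])

for node in (a, b, c, d):
    print(serialize(node))
```

Note that the copy of <b/> inside <c> is a distinct node from $b, yet
both point at the same mapping object, which is why re-parenting costs
nothing in namespace bookkeeping.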

get-in-scope-namespaces($a) -> "ns1"
get-in-scope-namespaces($b) -> "ns1"
get-in-scope-namespaces($c) -> "ns2", "ns1"
get-in-scope-namespaces($c/b) -> "ns1"
get-in-scope-namespaces($d) -> "ns3", "ns2", "ns1"
get-in-scope-namespaces($d/c) -> "ns2", "ns1"
get-in-scope-namespaces($d/c/b) -> "ns1"
$b is $c/b -> false
deep-equal($b, $c/b) -> true

I think this produces correct and reasonable output for a modest 
implementation price, but perhaps I'm missing something.
-- 
	--Per Bothner
per@bothner.com   http://per.bothner.com/

Received on Tuesday, 21 October 2003 14:41:21 UTC