- From: Per Bothner <per@bothner.com>
- Date: Tue, 21 Oct 2003 11:35:45 -0700
- To: www-ql@w3.org
I've been thinking about how to efficiently implement namespace "nodes". (This is for serialization and get-in-scope-namespaces only - I already have efficient code for namespace resolution when parsing XML, and for matching QNames.)

Does the following sound like it would work ok? (The last point is the critical non-obvious one.)

* Each element has a "namespace mapping", which maps prefixes to URIs, and which can be implemented as a hash-table or a vector (property-list). (Since the namespace mapping is primarily used for serialization, it makes more sense to use a space-efficient vector.)

* Once created, a namespace mapping is immutable, and so can be shared between element nodes.

* When parsing an XML document, if an element has no namespace attributes we re-use the namespace mapping of its parent. If it has namespace attributes, we create a new namespace mapping which is the combination of the parent's namespace mapping with the new namespace attributes.

* When serializing an element, we print all the namespaces in the element's namespace mapping, except for ones that are redundant because they have already been serialized in an enclosing element.

* When an element is constructed, its namespace mapping includes all the "active namespaces nodes" (in the sense of the specification) plus any of the namespaces in the prologue or predefined that are referenced in the current element *or* (if this is a direct element constructor) in any enclosed direct element constructors. (This rule is meant to minimize the number of distinct namespace mappings we have to create. The implementation may need to be a little bit clever here.)

* When an element is (conceptually) copied (re-parented), we use its existing namespace mapping. We do *not* create a new namespace mapping to incorporate any namespace in the parent.

The last point may cause some slightly surprising behavior.
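As a rough illustration of the rules listed above, here is a minimal Python sketch. It is not the actual implementation: the names (`Element`, `extend`, `add_child`, `copy_into`, `serialize`) are hypothetical, the mapping is modeled as a tuple of (prefix, uri) pairs, and the constructor rule about prologue and predefined namespaces is left out entirely.

```python
from dataclasses import dataclass, field

# All names here are hypothetical, chosen for illustration only.
# A namespace mapping is an immutable tuple of (prefix, uri) pairs,
# so any number of element nodes can share one instance.

def extend(parent_ns, decls):
    """Combine a parent's mapping with new namespace attributes."""
    if not decls:
        return parent_ns          # no namespace attributes: re-use the parent's mapping
    merged = dict(parent_ns)
    merged.update(decls)
    return tuple(merged.items())

@dataclass
class Element:
    name: str
    namespaces: tuple = ()
    children: list = field(default_factory=list)

def add_child(parent, name, decls=None):
    """Parsed or directly constructed child: starts from the parent's mapping."""
    child = Element(name, extend(parent.namespaces, decls))
    parent.children.append(child)
    return child

def copy_into(parent, node):
    """Copied (re-parented) node: the copy keeps its existing mapping."""
    clone = Element(node.name, node.namespaces)
    for c in node.children:
        copy_into(clone, c)
    parent.children.append(clone)
    return clone

def serialize(elem, written=()):
    """Print only namespaces not already written by an enclosing element."""
    seen = dict(written)
    decls = "".join(f' xmlns:{p}="{u}"'
                    for p, u in elem.namespaces if seen.get(p) != u)
    seen.update(elem.namespaces)
    body = "".join(serialize(c, tuple(seen.items())) for c in elem.children)
    if body:
        return f"<{elem.name}{decls}>{body}</{elem.name}>"
    return f"<{elem.name}{decls}/>"

a = Element("a", extend((), {"ns1": "NS1"}))
b = add_child(a, "b")                      # shares a's mapping
c = Element("c", extend((), {"ns2": "NS2"}))
b_copy = copy_into(c, b)                   # keeps ns1, does not pick up ns2

print(serialize(a))  # <a xmlns:ns1="NS1"><b/></a>
print(serialize(b))  # <b xmlns:ns1="NS1"/>
print(serialize(c))  # <c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
```

Because `extend` returns the parent's tuple unchanged when there are no new declarations, `b.namespaces is a.namespaces` holds, which is the sharing the second and third points rely on.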
Consider:

  let $a := <a xmlns:ns1="NS1"><b/></a>
  let $b := $a/b
  let $c := <c xmlns:ns2="NS2">{$b}</c>
  let $d := <d xmlns:ns3="NS3">{$c}</d>

Serializing gives us:

  $a     -> <a xmlns:ns1="NS1"><b/></a>
  $b     -> <b xmlns:ns1="NS1"/>
  $c     -> <c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
  $c/b   -> <b xmlns:ns1="NS1"/>
  $d     -> <d xmlns:ns3="NS3"><c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c></d>
  $d/c   -> <c xmlns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
  $d/c/b -> <b xmlns:ns1="NS1"/>

I.e. $b "inherits" ns1 from <a> and keeps it even when "removed" from <a>, but it does not "inherit" ns2 from <c> in the same way. This may be counter-intuitive.

  get-in-scope-namespaces($a)     -> "ns1"
  get-in-scope-namespaces($b)     -> "ns1"
  get-in-scope-namespaces($c)     -> "ns2", "ns1"
  get-in-scope-namespaces($c/b)   -> "ns1"
  get-in-scope-namespaces($d)     -> "ns3", "ns2", "ns1"
  get-in-scope-namespaces($d/c)   -> "ns2", "ns1"
  get-in-scope-namespaces($d/c/b) -> "ns1"

  $b is $c/b             -> false
  deep-equals($b, $c/b)  -> true

I think this produces correct and reasonable output for a modest implementation price, but perhaps I'm missing something.

--
	--Per Bothner
per@bothner.com   http://per.bothner.com/
Received on Tuesday, 21 October 2003 14:41:21 UTC