Re: namespace node implementation

On Tuesday, Oct 21, 2003, at 21:15 Europe/Berlin, Per Bothner wrote:

> james anderson wrote:
>
>> some of them seem unnecessarily complex. i suspect you're not 
>> thinking about first-class names, which may be the cause of the 
>> problem.
>
> I don't know what "first-class names" refers to, so I don't know if 
> I'm not thinking about it ...

"first class" in the standard sense: data which can be passed, bound, 
and returned.

>  Perhaps I didn't make obvious the context, which is XQuery (and XPath 
> 2.0) implementation, without support for
> explicit namespace nodes (as in XPath 1.0).
>
>> is it a combination operator, a path construct, or a ?

the one which confused me was the "$b = $x/b", which one should well 
have presumed was meant to have been "$b = $a/b". in any case, i 
illustrate below the results from a processor which supports 
first-class names. if the data model uses first-class names, it is 
not necessary to include a prefix-namespace binding context in the 
processor state. the processor can generate in-scope bindings on 
demand, but it neither retains them nor uses them for processing. only 
the serializer uses them: it generates them incrementally for the 
purpose of reconciling or fabricating prefixes.
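
as a minimal sketch of what "generate on demand" can mean: the bindings 
in scope at a node can be recomputed from its ancestor chain whenever 
they are wanted, so nothing needs to be kept in the processor state. the 
node structure and the function in-scope-bindings below are hypothetical 
stand-ins, not the implementation demonstrated in this post.

```lisp
;; sketch only: NODE and IN-SCOPE-BINDINGS are hypothetical, simplified
;; stand-ins for the ELEM-NODE class and inscope-namespaces function
;; which appear in the transcript.
(defstruct node
  name             ; a first-class name
  parent           ; the containing node, or nil
  (bindings '()))  ; declarations on this element only: ((prefix . uri) ...)

(defun in-scope-bindings (node)
  "Compute the prefix bindings in scope at NODE by walking its ancestors.
Nearer declarations shadow more distant ones. Nothing is cached."
  (let ((result '()))
    (do ((n node (node-parent n)))
        ((null n) (nreverse result))
      (dolist (binding (node-bindings n))
        (unless (assoc (car binding) result :test #'string=)
          (push binding result))))))

(let* ((d (make-node :name 'd :bindings '(("ns3" . "NS3"))))
       (c (make-node :name 'c :parent d :bindings '(("ns2" . "NS2"))))
       (b (make-node :name 'b :parent c)))
  (in-scope-bindings b))
;; => (("ns2" . "NS2") ("ns3" . "NS3"))
```

the result depends only on where the node sits at the moment of the 
call, which is why re-parenting a node changes the answer without any 
map having to be copied or patched.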

while this is not expressed in xpath/xquery syntax, it does demonstrate 
interaction with a query data model engine which is the compilation 
target for processors of those sorts. the particular examples in the 
original post don't really do justice to the issue, as the bindings do 
not affect the interpretation of any names. for that reason, this post 
includes several additional examples which demonstrate that a repeat of 
thirty-year-old programming-language research is not necessary in order 
to achieve the effect of referentially transparent expressions. one 
need only constant-fold.

if the parentheses get to be too much, just skip to the last example 
and explain what namespace mappings accomplish which first class names 
do not, and describe how they apply to attribute names.

>
> let $a := <a xmlns:ns1="NS1"><b/></a>

? (defParameter $a
    (root (parse-document "<a xmlns:ns1='NS1'><b/></a>")))
$A
? (describe $a)
#<ELEM-NODE ||::\a 1 #x127556E>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: #<DOC-NODE <no uri> #x1273546>
ORDINALITY: 1
PARENT: NIL
DEF: NIL
NAME: ||::\a
CHILDREN: (#<ELEM-NODE ||::\b 2 #x127573E>)
ATTRIBUTES: NIL
NAMESPACES: (#<NS-NODE |xmlns|::|ns1| -> "NS1" #x1275466>)
VALID: #<Unbound>

note, the name of that element is "first class": it comprises all 
properties necessary to model a universal name without recourse to an 
ancillary namespace mapping.

? (describe (name $a))
Symbol: ||::\a
INTERNAL in package: #<Package "">
Print name: "a"
Value: #<Unbound>
Function: #<Unbound>
Plist: (:PREFIX "")
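
a minimal sketch of how such a name can be modelled in plain common 
lisp, assuming that each namespace name is used as a package name and 
each local part as a symbol name. the helpers ensure-namespace and 
universal-name are hypothetical, introduced only for illustration.

```lisp
;; sketch only: ENSURE-NAMESPACE and UNIVERSAL-NAME are hypothetical
;; helpers. a real implementation would also have to keep namespace
;; packages apart from ordinary lisp packages with the same name.
(defun ensure-namespace (uri)
  "Return the package which stands for namespace URI, creating it if needed."
  (or (find-package uri)
      (make-package uri :use '())))

(defun universal-name (uri local-part)
  "Return the symbol which models the universal name {URI}LOCAL-PART."
  (values (intern local-part (ensure-namespace uri))))

;; the same namespace and local part always yield the identical symbol,
;; whatever prefix the source document happened to use:
(eq (universal-name "NS1" "att") (universal-name "NS1" "att"))  ; true
;; the same local part in another namespace is a different symbol:
(eq (universal-name "NS1" "att") (universal-name "NS2" "att"))  ; false
;; the namespace is recoverable from the name itself:
(package-name (symbol-package (universal-name "NS1" "att")))    ; "NS1"
```

name comparison is then a pointer comparison, and no binding context 
has to accompany the name.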

> let $b := $x/b

? (defParameter $b (./ $a '{}b))
$B
? (describe $b)
#<ELEM-NODE ||::\b 2 #x127573E>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: #<DOC-NODE <no uri> #x1273546>
ORDINALITY: 2
PARENT: #<ELEM-NODE ||::\a 1 #x127556E>
DEF: NIL
NAME: ||::\b
CHILDREN: NIL
ATTRIBUTES: NIL
NAMESPACES: NIL
VALID: #<Unbound>
?

one will note that there are no artifactual namespace maps, neither 
decoded nor imputed. there are none because, if the names are 
first-class, the prefix binding environments are unnecessary.

> let $c := <c xmlsns:ns2="NS2">{$b}</c>

? (defParameter $c
     (make-elem-node :name '{}c
                     :namespaces (list (make-ns-node :name '{xmlns}ns2
                                                     :value "NS2"))
                     :children (list $b)))

$C
? (describe $c)
#<ELEM-NODE ||::\c #x12B19CE>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: NIL
ORDINALITY: NIL
PARENT: NIL
DEF: NIL
NAME: ||::\c
CHILDREN: (#<ELEM-NODE ||::\b 2 #x127573E>)
ATTRIBUTES: NIL
NAMESPACES: (#<NS-NODE |xmlns|::|ns2| -> "NS2" #x12B1956>)
VALID: #<Unbound>
?

> let $d := <d xmlsns:ns3="NS3">{$c}</d>

? (defParameter $d
   (make-elem-node :name '{}d
                   :namespaces (list (make-ns-node :name '{xmlns}ns3
                                                   :value "NS3"))
                   :children (list $c)))
$D
? (describe $d)
#<ELEM-NODE ||::\d #x12C3BA6>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: NIL
ORDINALITY: NIL
PARENT: NIL
DEF: NIL
NAME: ||::\d
CHILDREN: (#<ELEM-NODE ||::\c #x12B19CE>)
ATTRIBUTES: NIL
NAMESPACES: (#<NS-NODE |xmlns|::|ns3| -> "NS3" #x12C3B2E>)
VALID: #<Unbound>
?

>
> Serializing gives us:
> $a -> <a xmlns:ns1="NS1"><b/></a>
> $b -> <b xmlns:ns1="NS1"/>
> $c -> <c xmlsns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
> $c/b -> <b xmlns:ns1="NS1"/>
> $d -> <d xmlsns:ns3="NS3"><c xmlsns:ns2="NS2"><b 
> xmlns:ns1="NS1"/></c></d>
> $d/c -> <c xmlsns:ns2="NS2"><b xmlns:ns1="NS1"/></c>
> $d/c/b -> <b xmlns:ns1="NS1"/>
>

? (defun path-value (path)
   (if (symbolp path)
     (symbol-value path)
     (do ((value (symbol-value (pop path)) (./ value (pop path))))
         ((null path) value))))

PATH-VALUE
? (dolist (path '($a $b $c ($c {}b) $d ($d {}c) ($d {}c {}b)))
   (format *trace-output* "~%~:[~a~;~{~a~^/~}~] -> " (consp path) path)
   (write-node (path-value path) *trace-output*))


$A -> <a xmlns:ns1='NS1'><b /></a>
$B -> <b />
$C -> <c xmlns:ns2='NS2'><b /></c>
$C/b -> <b />
$D -> <d xmlns:ns3='NS3'><c xmlns:ns2='NS2'><b /></c></d>
$D/c -> <c xmlns:ns2='NS2'><b /></c>
$D/c/b -> <b />
NIL
?

> I.e. $b "inherits" ns1 from <a> and keeps it even "removed" from <a>,
> but it does not "inherit" ns2 from <c> in the same way.  This may
> be counter-intuitive.
>
> get-in-scope-namespaces($a) -> "ns1"
> get-in-scope-namespaces($b) -> "ns1"
> get-in-scope-namespaces($c) -> "ns2", "ns1"
> get-in-scope-namespaces($c/b) -> "ns1"
> get-in-scope-namespaces($d) -> "ns3", "ns2", "ns1"
> get-in-scope-namespaces($d/c) -> "ns2", "ns1"
> get-in-scope-namespaces($d/c/b) -> "ns1"
> [equality issues ignored for the moment]

? (dolist (path '($a $b $c ($c {}b) $d ($d {}c) ($d {}c {}b)))
   (format *trace-output* "~%inscope-namespaces(~:[~a~;~{~a~^/~}~]) -> "
           (consp path) path)
   (princ (xqdm::inscope-namespaces (path-value path)) *trace-output*))

inscope-namespaces($A) -> (#<NS-NODE ns1 -> NS1 #x1275466>)
inscope-namespaces($B) -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)
inscope-namespaces($C) -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)
inscope-namespaces($C/b) -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)
inscope-namespaces($D) -> (#<NS-NODE ns3 -> NS3 #x12C3B2E>)
inscope-namespaces($D/c) -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)
inscope-namespaces($D/c/b) -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)
NIL
?

these results controvert, in $B, the "namespace mapping" principle of 
the original post, but are closer to the infoset definition. the more 
important issue is: why does it matter whether the namespace bindings 
of the original context element are retained by a child node? it can 
only matter if some name in the scope of that element was resolved from 
a qualified to a universal name based on the binding which was 
introduced by the context element. so let's look at that case. for 
example:

? (defParameter $e
    (root (parse-document "<e xmlns:ns1='NS1'><f ns1:att='x'/></e>")))
$E
? (describe $e)
#<ELEM-NODE ||::\e 1 #x13689C6>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: #<DOC-NODE <no uri> #x136696E>
ORDINALITY: 1
PARENT: NIL
DEF: NIL
NAME: ||::\e
CHILDREN: (#<ELEM-NODE ||::\f 2 #x1368CCE>)
ATTRIBUTES: NIL
NAMESPACES: (#<NS-NODE |xmlns|::|ns1| -> "NS1" #x136888E>)
VALID: #<Unbound>
? (defParameter $f (./ $e '{}f))
$F
? (describe $f)
#<ELEM-NODE ||::\f 2 #x1368CCE>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: #<DOC-NODE <no uri> #x136696E>
ORDINALITY: 2
PARENT: #<ELEM-NODE ||::\e 1 #x13689C6>
DEF: NIL
NAME: ||::\f
CHILDREN: NIL
ATTRIBUTES: (#<STRING-ATTR-NODE NS1::|att| #x1368DC6>)
NAMESPACES: NIL
VALID: #<Unbound>
? (describe (first (attributes $f)))
#<STRING-ATTR-NODE NS1::|att| #x1368DC6>
Class: #<STANDARD-CLASS STRING-ATTR-NODE>
Wrapper: #<CCL::CLASS-WRAPPER STRING-ATTR-NODE #x135363E>
Instance slots
PARENT: #<ELEM-NODE ||::\f 2 #x1368CCE>
CHILDREN: ("x")
DEF: NIL
DOCUMENT: #<DOC-NODE <no uri> #x136696E>
NAME: NS1::|att|
VALUE: NIL
?

now it gets interesting:

? (defParameter $g
     (make-elem-node :name '{}g
                     :namespaces (list (make-ns-node :name '{xmlns}ns2
                                                     :value "NS2"))
                     :children (list $f)))
$G
? (dolist (path '($e $f $g ($g {}f)))
     (format *trace-output* "~%~:[~a~;~{~a~^/~}~] -> " (consp path) path)
     (write-node (path-value path) *trace-output*))


$E -> <e xmlns:ns1='NS1'><f ns1:att='x' /></e>
$F -> <f NS1:att='x' xmlns:NS1='NS1' />
$G -> <g xmlns:ns2='NS2'><f NS1:att='x' xmlns:NS1='NS1' /></g>
$G/f -> <f NS1:att='x' xmlns:NS1='NS1' />
NIL

whereby, to reiterate,

? (describe $f)
#<ELEM-NODE ||::\f 2 #x1368CCE>
Class: #<STANDARD-CLASS ELEM-NODE>
Wrapper: #<CCL::CLASS-WRAPPER ELEM-NODE #x1253E5E>
Instance slots
DOCUMENT: #<DOC-NODE <no uri> #x136696E>
ORDINALITY: 2
PARENT: #<ELEM-NODE ||::\g #x13DA4B6>
DEF: NIL
NAME: ||::\f
CHILDREN: NIL
ATTRIBUTES: (#<STRING-ATTR-NODE NS1::|att| #x1368DC6>)
NAMESPACES: NIL
VALID: #<Unbound>
?

no namespace maps required.
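
to make the serializer's part concrete, here is a minimal sketch which 
fabricates a prefix for each namespace it meets while writing a single 
empty element. it assumes that attribute names are symbols whose home 
package is named by the namespace, as in the describe output above. the 
function write-empty-element and the "p0", "p1", ... prefix scheme are 
hypothetical (the transcript's write-node evidently reuses the namespace 
name as the prefix), and attribute values are not escaped.

```lisp
;; sketch only: WRITE-EMPTY-ELEMENT is a hypothetical helper, not the
;; write-node used in the transcript. attribute values are written
;; without escaping.
(defun write-empty-element (tag attributes
                            &optional (stream *standard-output*))
  "Write <TAG ... /> to STREAM. ATTRIBUTES is a list of (name . value),
where each name is a symbol whose home package names its namespace."
  (let ((bindings '()))            ; ((prefix . uri) ...), built as needed
    (format stream "<~a" tag)
    (dolist (attribute attributes)
      (let* ((name (car attribute))
             (uri (package-name (symbol-package name)))
             (binding (rassoc uri bindings :test #'string=)))
        (unless binding
          ;; no prefix is bound to this namespace yet: fabricate one
          (setf binding (cons (format nil "p~d" (length bindings)) uri))
          (push binding bindings))
        (format stream " ~a:~a='~a'"
                (car binding) (symbol-name name) (cdr attribute))))
    ;; declare only the prefixes which were actually needed
    (dolist (binding (reverse bindings))
      (format stream " xmlns:~a='~a'" (car binding) (cdr binding)))
    (format stream " />")))

(let ((ns1 (or (find-package "NS1") (make-package "NS1" :use '()))))
  (write-empty-element "f" (list (cons (intern "att" ns1) "x"))))
;; prints: <f p0:att='x' xmlns:p0='NS1' />
```

the bindings exist only for the duration of the call. the node itself 
carries none.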

...

ps. if (as i gather from p.bothner's subsequent post) ip disclaimers 
are the order of the day in the forum, please note that the 
demonstrated implementation has been released under gpl since at least 
2001, derives from a publicly available implementation which was 
roughly contemporaneous with the namespaces-wd, and would, in terms of 
prior art, be traceable to mit ai lab memo 199, which dates from 1970.

Received on Tuesday, 21 October 2003 18:11:20 UTC