Re: namespace node implementation

On Wednesday, Oct 22, 2003, at 01:18 Europe/Berlin, Per Bothner wrote:

> james anderson wrote:
>> On Tuesday, Oct 21, 2003, at 21:15 Europe/Berlin, Per Bothner wrote:
>>> I don't know what "first-class names" refers to, so I don't know if 
>>> I'm not thinking about it ...
>> "first class" in the standard sense: data which can be passed, bound, 
>> and returned.
> Yes, but how are your "first class names" different from QNames, which 
> can be passed, bound, and returned in XQuery?

the xml query drafts read as if qnames are "qualified names", that is, 
a (prefix, localpart) pair. the so-called "expanded qname", that is, the 
universal name which is required for any given operation, is then 
determined at the point of the operation based on some set of 
prefix-namespace bindings.
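
a minimal sketch of that reading, in python for brevity. the names QName, ExpandedName and expand are hypothetical, not taken from the drafts or from any implementation; the point is only that the same (prefix, localpart) pair yields a different universal name depending on the bindings in force where the operation happens.

```python
from typing import Dict, NamedTuple


class QName(NamedTuple):
    """A lexical qualified name: just a (prefix, localpart) pair."""
    prefix: str
    local: str


class ExpandedName(NamedTuple):
    """A universal name: (namespace name, localpart)."""
    namespace: str
    local: str


def expand(qname: QName, bindings: Dict[str, str]) -> ExpandedName:
    """Resolve the prefix against whatever bindings apply at the point of use."""
    return ExpandedName(bindings[qname.prefix], qname.local)


name = QName("ns1", "bx")
# The same pair, resolved at two different points with different bindings.
resolved_here = expand(name, {"ns1": "NS1"})
resolved_there = expand(name, {"ns1": "OTHER"})
```

under this reading the pair has to travel together with a set of bindings until the moment it is used.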

> Btw, QNames in Qexo are implemented using a class called Symbol, which 
> is also used for the JEmacs (Emacs Lisp) and (embryonic) Common Lisp 
> implementations that are also part of Kawa.  So I am quite familiar 
> with first-class names.


> Your implementation is of course a very natural one when using Common 
> Lisp.

the implementation language is not material.

>  However, it is not particularly space-efficient one, since you appear 
> to use a CLOS object for each node.  (This is similar to using DOM, of 
> course.)  I'm aiming for a much more space-efficient implementation.

the examples were chosen to use a data model which is a direct 
reflection of the query data model. other data models are possible. the 
storage efficiency of the particular model illustrated is not material 
to the issue of namespace bindings.

> Also note that XQuery requires that $b and $c/b must be two 
> *different* nodes.  Specifically, the parent of $b is $a, while the 
> parent of $c/b is $c.  Conceptually you must copy the $b node when 
> evaluating <c>{$b}</c>.

there was a note at that point in the original examples, which 
indicated that i was ignoring the equality issue for the moment. i did 
not bother to illustrate the copy operation because, as i noted at the 
head of the message, this was not a direct xquery formulation. if the 
reader examines the description of the $f node, they will observe that 
its parent is the $g node. that is, for purposes of establishing 
in-scope namespaces, it is as if it had been copied.

>   That means that if $b and hence $c/b has a sub-element that uses the 
> ns1 prefix, you can't find it by looking up the parent chain to get to 
> $a.

a "first-class" name does not "use the ns1 prefix". that is what 
example e/f/g was about. it is not necessary to carry the bindings 
within which one would resolve the prefix if one folds constant names.
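
a minimal sketch of what folding a constant name could look like, in python for brevity. UName and fold are hypothetical names, not the cl-xml api. the prefix is resolved once, where the constant name is read, and the result no longer refers to the prefix or to the bindings.

```python
from typing import Dict, NamedTuple


class UName(NamedTuple):
    """A universal name: (namespace name, localpart). No prefix is kept."""
    namespace: str
    local: str


def fold(lexical: str, bindings: Dict[str, str]) -> UName:
    """Resolve a lexical name such as "ns1:bx" once, where it is read.

    An unprefixed name is looked up under the "" key; this sketch treats
    a missing entry as "no namespace".
    """
    prefix, _, local = lexical.rpartition(":")
    return UName(bindings.get(prefix, ""), local)


constructor_bindings = {"ns1": "NS1"}
bx = fold("ns1:bx", constructor_bindings)

# Rebinding the prefix afterwards has no effect on the folded name.
constructor_bindings["ns1"] = "SOMETHING-ELSE"
```

once folded, the element can be moved or copied under any parent without its name changing, since nothing in it needs to be resolved again.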

> > inscope-namespaces($B)
> >  -> (#<NS-NODE ns2 -> NS2 #x12B1956> #<NS-NODE ns3 -> NS3 #x12C3B2E>)

that is an artifact of having not copied the node.

> I don't think this is acceptable.
> What you've left out is what algorithm write-node uses to select where 
> to put the namespace attributes.  My guess is that when it prints an 
> element that uses a namespace prefix that it hasn't yet printed a 
> definition for then it searches up the parent links for a matching 
> binding in the namespaces slot.  Is that correct?

there is no need to guess. it's in the file 
xml:code;xparser;xml-printer.lisp of the cl-xml release and has been 
for years. it builds shallow binding environments based on those 
asserted in the encoded elements and augments them where a name appears 
for which no binding is present. only then does it search through the 
then lexically apparent elements to decide what prefix to use.

which means it would produce the second of the encodings illustrated 
below.
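
a rough python sketch of that kind of printer, not a transcription of xml-printer.lisp. Element, write_node, prefix_for and choose_prefix are hypothetical names, default namespaces and attributes are ignored, and the reading of "lexically apparent elements" as the parent chain of the data model is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    ns: str                      # universal namespace name, "" if none
    local: str
    namespaces: Dict[str, str] = field(default_factory=dict)  # prefix -> namespace asserted here
    children: List["Element"] = field(default_factory=list)
    parent: Optional["Element"] = field(default=None, repr=False, compare=False)

    def add(self, child: "Element") -> "Element":
        child.parent = self
        self.children.append(child)
        return child


def prefix_for(ns: str, env: Dict[str, str]) -> Optional[str]:
    """Return a prefix already bound to ns in the current environment, if any."""
    for prefix, bound in env.items():
        if bound == ns:
            return prefix
    return None


def choose_prefix(elem: Element, env: Dict[str, str]) -> str:
    """Prefer a prefix some ancestor asserted for this namespace, else invent one."""
    node = elem.parent
    while node is not None:
        for prefix, bound in node.namespaces.items():
            if bound == elem.ns and prefix not in env:
                return prefix
        node = node.parent
    n = 1
    while "ns-%d" % n in env:
        n += 1
    return "ns-%d" % n


def write_node(elem: Element, env: Optional[Dict[str, str]] = None) -> str:
    env = dict(env or {})               # shallow copy: one environment per element
    declared = dict(elem.namespaces)    # bindings to emit on this start tag
    env.update(declared)
    tag = elem.local
    if elem.ns:
        prefix = prefix_for(elem.ns, env)
        if prefix is None:              # no binding present: augment the environment
            prefix = choose_prefix(elem, env)
            env[prefix] = elem.ns
            declared[prefix] = elem.ns
        tag = prefix + ":" + elem.local
    attrs = "".join(' xmlns:%s="%s"' % item for item in sorted(declared.items()))
    if not elem.children:
        return "<%s%s/>" % (tag, attrs)
    body = "".join(write_node(child, env) for child in elem.children)
    return "<%s%s>%s</%s>" % (tag, attrs, body, tag)


# The $a/b example from the quoted message.
a = Element("", "a", {"ns1": "NS1"})
b = a.add(Element("", "b"))
b.add(Element("NS1", "bx"))
b.add(Element("NS1", "by"))
```

with these definitions write_node(a) reproduces the original document, while write_node(b) declares ns1 on each of bx and by, since b itself asserts no binding and the binding is added only where a name first needs it.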

> That works, but consider:
>   let $a := <a xmlns:ns1="NS1"><b><ns1:bx/><ns1:by/></b></a>
>   return $a/b
> This can print as either:
>   <b xmlns:ns1="NS1"><ns1:bx/><ns1:by/></b>
> or:
>   <b><ns1:bx xmlns:ns1="NS1"/><ns1:by xmlns:ns1="NS1"/></b>
> Both are valid, but I would much prefer the former.  My algorithm does 
> that.
>  I think it would be difficult for your algorithm to do that without 
> an extra pass.

why does it matter?

Received on Tuesday, 21 October 2003 20:14:45 UTC