Re: [xml-dev] [good] Question about NS 1.1 from james anderson on 2002-04-13 (www-xml-blueberry-comments@w3.org from April 2002)

From: james anderson <james.anderson@setf.de>
Date: Sun, 14 Apr 2002 01:01:34 +0200
To: "David G. Durand" <david@dynamicdiagrams.com>, www-xml-blueberry-comments@w3.org, xml-names-editor@w3.org
Message-ID: <3CB8B8CC.B79A258@setf.de>

i thought my examples did respond to the descriptions of cases. i regret
if they were unintelligible. it was suggested that expressive flexibility
precludes strict typing. in summary, the demonstrations all involved
instructions to the serializer as to which prefixes to
use for certain namespaces. the instructions were to the serializer
only. the document models themselves were strictly namespace conformant.

the issue behind the demonstrations is whether it makes sense to worry
about the correctness of transformations as now expressed or whether one
should just express the transformations correctly in order to permit an
implementation which precludes problems. i burden you with one
last example, below, which demonstrates, perhaps more intelligibly
than its predecessors, that the latter approach has much to recommend it.

perhaps i should just take the time, implement a namespace-aware (that
is, by my standard) version of xslt, and observe the effects of
restricting names to universal names. my prejudice is that any
transform which can now be expressed will still be expressible.

i am agnostic on changing xslt itself. i am concerned that, as is the
case with ns-1.1, one contemplates changing a standard to accomplish a
certain goal, when the proposed change cannot achieve that goal. this
discussion started with the issue of how in-scope namespaces should be
managed, ostensibly in order to accommodate operations like combining
stylesheet components which comprised qnames-in-content. the change to
namespaces is, in itself, innocuous, but it points out flaws inherent in
the way xml applications like xslt are conceived with respect to names
and calls into question the base standards, like xml and namespaces in xml.

the response has been that "these problems are not flaws": they are a
consequence of the "flexibility" which is necessary for expressive and
operative power. i can only continue to maintain: the power is in the
wrong place.



careful consideration of the origins of these problems in the nature of
qualified names leads to the conclusion that qualified names must be
incorporated into xml as primitive types.

since the qname domain is not, despite its definition as such, (prefix X
localPart), but in fact (prefix X localPart X (prefix -> namespaceName)),
a trivial combination of transform expressions must fail, because a
combination of the (prefix -> namespaceName) functions will fail unless
the prefixes are guaranteed unique. the alternative is to describe the
transformations in terms of universal names, in which case the problems
cannot occur. in order to express the transforms in terms of universal
names, it must be possible to guarantee that such expressions can be
correctly encoded to and decoded from xml. with the addition of only
QNAME and QNAMES tokenized attribute types, namespace-conformant
documents are an adequate encoding for such expressions. absent these
types, correct decoding and encoding cannot be guaranteed. the change to
tokenized types is the only required change in notation. the changes to
a parser implementation in order to accommodate this change in notation
amounted to two additional classes and two additional methods for the
respective value normalizations, in all about forty lines of code.
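
a minimal sketch of the argument above, in python rather than the lisp
of the transcript so that it stands alone. the function and variable
names (resolve, scope_a, scope_b) are hypothetical and are not part of
cl-xml; the treatment of unprefixed names is an assumption. it shows the
same prefix bound to two namespaces, and what is lost when qnames are
combined as strings rather than as universal names.

```python
# hypothetical names throughout; this is not the cl-xml api
XSL = "http://www.w3.org/1999/XSL/Transform"
XSD = "http://www.w3.org/2001/XMLSchema"

def resolve(qname, bindings):
    """map 'prefix:local' to a universal name (namespace, local)."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # assumption: an unprefixed name is treated as being in no namespace
        return ("", qname)
    return (bindings[prefix], local)

# the same prefix is bound differently in the two element scopes
scope_a = {"asdf": XSL}
scope_b = {"asdf": XSD}

value_a = "asdf:attribute"   # written in scope_a
value_b = "asdf:string"      # written in scope_b

# combining the strings discards scope_b's (prefix -> namespaceName) function;
# everything is then read through scope_a
combined_text = value_a + " " + value_b
as_strings = [resolve(q, scope_a) for q in combined_text.split()]

# combining universal names keeps each name's own namespace
as_unames = [resolve(value_a, scope_a), resolve(value_b, scope_b)]

print(as_strings[1])  # ('http://www.w3.org/1999/XSL/Transform', 'string')
print(as_unames[1])   # ('http://www.w3.org/2001/XMLSchema', 'string')
```

the string combination silently moves "string" into the xslt namespace;
the universal-name combination cannot, because no prefix is involved
until the value is serialized.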

below follow examples which demonstrate the problem as well as how to
eliminate it. some of the following is expressed in lisp because the
text is cut from the console/listener of the development environment of
cl-xml, which is lisp-based. those operator names which are not
self-explanatory can be best understood by analogy with xpath
expressions. 

the "?" is the prompt. the expression to evaluate follows it.
the output from the evaluation and its concrete result appear on the
next line or lines.





1. define a parameter, named "*D2*", parse a document string to produce
a document model and bind it to this parameter. in this version, the
document definition declares the element attributes to be character
data. note that the dtd uses a prefix which differs from those in the
root element:

? (defParameter *d2*
  (parse-document
   "<!DOCTYPE element [
     <!ELEMENT qwer:element EMPTY >
     <!ATTLIST qwer:element
               name CDATA 'qwer:stylesheet'
               use-attribute-sets CDATA #IMPLIED
               xmlns:qwer CDATA 'http://www.w3.org/1999/XSL/Transform' > ]>
    <asdf:stylesheet xmlns:asdf='http://www.w3.org/1999/XSL/Transform'>
      <asdf:element use-attribute-sets='asdf:attribute' />
      <element xmlns='http://www.w3.org/1999/XSL/Transform'
               xmlns:asdf='http://www.w3.org/2001/XMLSchema'
               name='asdf:int'
               use-attribute-sets='asdf:string' />
      </asdf:stylesheet>"))
*D2*


2. the bound document model:

? *D2*
#<DOC-NODE <no uri> #x95A7286>


3. the second node in the document is the first element named {xsl}element:

? (.// *D2* 2)
#<ELEM-NODE #<UNAME {xsl}element #x8B1F32E> 2 #x95A746E>


4. this element has an attribute, {}name, introduced from the dtd
default, with a string value:

? (value (./@ (.// *d2* 2) '{}name))
"qwer:stylesheet"


5. it also has an explicitly encoded attribute, {}use-attribute-sets,
also with a string value:

? (value (./@ (.// *d2* 2) '{}use-attribute-sets))
"asdf:attribute"


6. the third element also has two attributes, {}name and {}use-attribute-sets:

? (attributes (.// *d2* 3))
(#<STRING-ATTR-NODE #<UNAME {}name #x8B15A76>
                    "asdf:int" #x95A79E6>
 #<STRING-ATTR-NODE #<UNAME {}use-attribute-sets #x8B1F3CE>
                    "asdf:string" #x95A79B6>)


7. where the contents of the two {}use-attribute-sets attributes are
combined and recorded as the value of the first attribute, an apparently
legitimate string of qnames is produced:

? (setf (value (./@ (.// *d2* 2) '{}use-attribute-sets))
      (concatenate 'string
                   (value (./@ (.// *d2* 2) '{}use-attribute-sets))
                   " "
                   (value (./@ (.// *d2* 3) '{}use-attribute-sets))))
"asdf:attribute asdf:string"
? 


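8. the resulting document model, on the other hand, likely does not
reflect the intent. in the scope of the first element the prefix asdf is
bound to the xslt namespace, so "asdf:string" no longer denotes the
schema name it was copied from. this is evident when it is serialized: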

? (write-node *d2* *trace-output* :encoding :usascii)
<!DOCTYPE element [
 <!-- no root element definition present -->
 <!ELEMENT qwer:element EMPTY >
 <!ATTLIST qwer:element 
   xmlns:qwer CDATA  'http://www.w3.org/1999/XSL/Transform'
   use-attribute-sets CDATA #IMPLIED
   name CDATA  'qwer:stylesheet' >
 ]>

<asdf:stylesheet xmlns:asdf='http://www.w3.org/1999/XSL/Transform'>
      <qwer:element xmlns:qwer='http://www.w3.org/1999/XSL/Transform'
                    use-attribute-sets='asdf:attribute asdf:string'
                    name='qwer:stylesheet' />
      <element xmlns='http://www.w3.org/1999/XSL/Transform'
               xmlns:asdf='http://www.w3.org/2001/XMLSchema'
               xmlns:qwer='http://www.w3.org/1999/XSL/Transform'
               name='asdf:int' use-attribute-sets='asdf:string' />
      </asdf:stylesheet>
#<DOC-NODE <no uri> #x95A6366>
? 


9. the alternative is to correctly describe and implement the domains.
where that is done, the problem cannot occur. this second document is
analogous to the previous one, but declares the domain of the
{}name and {}use-attribute-sets attributes accurately:

? (defParameter *d3*
  (parse-document
   "<!DOCTYPE element [
     <!ELEMENT qwer:element EMPTY >
     <!ATTLIST qwer:element
               name QNAME 'qwer:stylesheet'
               use-attribute-sets QNAMES #IMPLIED
               xmlns:qwer CDATA 'http://www.w3.org/1999/XSL/Transform' > ]>
    <asdf:stylesheet xmlns:asdf='http://www.w3.org/1999/XSL/Transform'>
      <asdf:element name='asdf:stylesheet'
                    use-attribute-sets='asdf:attribute' />
      <element xmlns='http://www.w3.org/1999/XSL/Transform'
               xmlns:asdf='http://www.w3.org/2001/XMLSchema'
               name='asdf:int'
               use-attribute-sets='asdf:string' />
      </asdf:stylesheet>"))
*D3*


10. as a consequence of which, the values need no longer remain strings,
but can now be modelled more correctly as a universal name and a list of
universal names, respectively:

? (value (./@ (.// *D3* 2) '{}name))
#<UNAME {xsl}stylesheet #x8B15996>
? (value (./@ (.// *D3* 2) '{}use-attribute-sets))
(#<UNAME {xsl}attribute #x8B22316>)
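
the value normalization behind step 10 might look roughly like the
following python sketch. it is not the cl-xml code: UName, to_uname and
normalize are hypothetical names, and placing unprefixed tokens in no
namespace is an assumption. the declared attribute type selects whether
the raw text stays a string, becomes one universal name, or becomes a
list of universal names.

```python
from typing import NamedTuple

XSL = "http://www.w3.org/1999/XSL/Transform"

class UName(NamedTuple):
    """hypothetical universal name: namespace name plus local part."""
    namespace: str
    local: str

def to_uname(token, bindings):
    prefix, sep, local = token.partition(":")
    # assumption: an unprefixed token is in no namespace
    return UName(bindings[prefix], local) if sep else UName("", token)

def normalize(declared_type, raw, bindings):
    """normalize an attribute value according to its declared type."""
    if declared_type == "QNAME":
        return to_uname(raw.strip(), bindings)
    if declared_type == "QNAMES":
        return [to_uname(t, bindings) for t in raw.split()]
    return raw  # CDATA and any other type: the value stays a string

# bindings in scope at the first {xsl}element of the second document
in_scope = {"asdf": XSL}

name = normalize("QNAME", "asdf:stylesheet", in_scope)
sets = normalize("QNAMES", "asdf:attribute", in_scope)
text = normalize("CDATA", "asdf:attribute", in_scope)
```

with a CDATA declaration the prefix survives only as text; with QNAME or
QNAMES it is consumed at parse time and the model holds the namespace
name itself.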


11. in this case, combination operations can be performed safely:

? (setf (value (./@ (.// *D3* 2) '{}use-attribute-sets))
      (concatenate 'list
                   (value (./@ (.// *D3* 2) '{}use-attribute-sets))
                   (value (./@ (.// *D3* 3) '{}use-attribute-sets))))
(#<UNAME {xsl}attribute #x8B22316> #<UNAME {xsd}string #x875E4DE>)


12. furthermore, the meaning of the document remains intact when it is
serialized: it is possible to correctly encode the value of the
use-attribute-sets attribute and to recognize that an additional
namespace declaration is necessary.

? (write-node *D3* *trace-output* :encoding :usascii)
<!DOCTYPE element [
 <!-- no root element definition present -->
 <!ELEMENT qwer:element EMPTY >
 <!ATTLIST qwer:element 
   xmlns:qwer CDATA  'http://www.w3.org/1999/XSL/Transform'
   use-attribute-sets QNAMES #IMPLIED
   name QNAME  'qwer:stylesheet' >
 ]>

<asdf:stylesheet xmlns:asdf='http://www.w3.org/1999/XSL/Transform'>
  <qwer:element xmlns:qwer='http://www.w3.org/1999/XSL/Transform'
                name='qwer:stylesheet'
                use-attribute-sets='qwer:attribute nsp-1:string'
                xmlns:nsp-1='http://www.w3.org/2001/XMLSchema-datatypes' />
  <element xmlns='http://www.w3.org/1999/XSL/Transform'
           xmlns:asdf='http://www.w3.org/2001/XMLSchema-datatypes'
           xmlns:qwer='http://www.w3.org/1999/XSL/Transform'
           name='asdf:int'
           use-attribute-sets='asdf:string' />
  </asdf:stylesheet>
#<DOC-NODE <no uri> #x95BDCA6>
? 
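
the encoding side of step 12 can be sketched the same way. this is a
python illustration under stated assumptions, not the cl-xml serializer:
encode_qnames is a hypothetical name, the "nsp-N" pattern is copied from
the output above, and the rule "first in-scope prefix bound to the
namespace wins" is an assumption. the namespace names are the ones in
the document as parsed in step 9.

```python
XSL = "http://www.w3.org/1999/XSL/Transform"
XSD = "http://www.w3.org/2001/XMLSchema"

def encode_qnames(unames, in_scope):
    """return (attribute text, namespace declarations that must be added)."""
    bindings = dict(in_scope)  # copy, so the caller's map is left untouched
    added = {}
    tokens = []
    for namespace, local in unames:
        if namespace == "":
            tokens.append(local)  # no namespace: no prefix needed
            continue
        # reuse the first in-scope prefix bound to this namespace, if any
        prefix = next((p for p, ns in bindings.items() if ns == namespace), None)
        if prefix is None:
            # otherwise generate an unused prefix and record its declaration
            n = 1
            while f"nsp-{n}" in bindings:
                n += 1
            prefix = f"nsp-{n}"
            bindings[prefix] = namespace
            added[prefix] = namespace
        tokens.append(f"{prefix}:{local}")
    return " ".join(tokens), added

# the element's own declaration is listed first, then the inherited one
scope = {"qwer": XSL, "asdf": XSL}
text, added = encode_qnames([(XSL, "attribute"), (XSD, "string")], scope)
print(text)   # qwer:attribute nsp-1:string
print(added)  # {'nsp-1': 'http://www.w3.org/2001/XMLSchema'}
```

because the value is a list of universal names, the serializer can see
that no in-scope prefix covers the schema namespace and can emit the
extra declaration, which is not possible when the value is an opaque
string.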

my argument with xslt and with xslt-aware editors is minor. even without
this notational change, the implications of these QNAME/QNAMES
declarations are hardwired and should be implemented. absent that level
of implementation, the respective applications are not namespace-aware.

the more significant issue is that the base encoding form, that is xml,
cannot claim to serve as an encoding medium with which to express
operations on itself until it can describe all of its own types. with
the introduction of universal names, xml must include QNAME and QNAMES
as primitive tokenized types.

this change meets all requirements set out in
WD-xml-names11-req-20020403. in particular, contrary to my earlier
assessment with respect to requirements 3 and 5, the necessary
modifications constituted trivial changes to an existing namespace-aware processor.

...
Received on Saturday, 13 April 2002 18:54:15 UTC