Re: Comment on Canonical XML draft of 2000-06-01, clause A.3

John Boyer wrote:

> I apologize if I've offended you in some way [...].

By no means.

> W.R.T. javascript being in comments, they do appear to be in
> comments,

They *appear* to be in comments, but they are not, as I explained.
Inside a SCRIPT element in HTML, the strings "<!--" and
"-->" are not comment delimiters, but part of the character data content
of the SCRIPT element.  This magic is performed by the HTML 4.0 DTDs,
which declare the content of SCRIPT elements to be of type CDATA.
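
A minimal sketch of the distinction, assuming Python's standard html.parser
module as a stand-in for an HTML-aware parser.  It applies its own built-in
rules for SCRIPT content rather than reading the HTML 4.0 DTDs, and the names
ScriptProbe and probe are illustrative only:

```python
from html.parser import HTMLParser


class ScriptProbe(HTMLParser):
    """Record what the parser reports as comments versus character data."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.data = []

    def handle_comment(self, text):
        self.comments.append(text)

    def handle_data(self, text):
        self.data.append(text)


def probe(markup):
    """Return (list of comments, concatenated character data) for markup."""
    parser = ScriptProbe()
    parser.feed(markup)
    parser.close()
    return parser.comments, "".join(parser.data)


# "<!--" and "-->" inside SCRIPT: reported as character data, not a comment.
in_script = probe("<script><!--\nalert(1);\n// --></script>")

# The same delimiters inside an ordinary element: reported as a comment.
in_body = probe("<p><!-- a real comment --></p>")
```

Here in_script carries no comments at all, while in_body carries exactly one.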

> and when I feed an XML compliant piece of HTML to an XML processor
> I have on hand, it does seem to report them as comments, so I don't believe
> I am 'calling the dog's tail a leg' so to speak.

And if you had a Javascript-compliant browser that understood XHTML directly,
you would need to wrap the script (complete with "<!--" and "-->") in a
CDATA section, as clause 4.8 of XHTML 1.0 tells you to do.  HTML of the
kind you are describing is not XML-compliant.
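
As a rough illustration of why the wrapping matters, here is a sketch that
assumes Python's xml.dom.minidom as the XML processor (any conforming XML
parser should behave comparably; the variable names are made up for the
example).  Without the CDATA section the script text arrives as a comment
node; with it, the text arrives as character data:

```python
from xml.dom import Node, minidom

# Script hidden the HTML way: an XML parser sees nothing but a comment.
commented = minidom.parseString("<script><!-- alert(1); // --></script>")

# Script wrapped in a CDATA section: "<!--" and "-->" are plain characters.
wrapped = minidom.parseString(
    "<script><![CDATA[<!-- alert(1); // -->]]></script>"
)

comment_child = commented.documentElement.firstChild
cdata_child = wrapped.documentElement.firstChild

print(comment_child.nodeType == Node.COMMENT_NODE)
print(repr(cdata_child.data))
```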


> I seem to recall a SAX 1 technical write-up that claimed that SAX 1 would
> not send comment events because comments were for document authors, not
> document consumers.  The assumption seemed to be that XML would only be
> authored in a text editor.

I think the assumption is that SAX is not a suitable interface for an
XML editor's parser; it hides far too much.  An editor needs a far lower-level
representation, preserving entity references, DOCTYPE declarations, and all
quite intact.
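
To make "hides far too much" concrete, here is a small sketch assuming
Python's xml.sax with a plain ContentHandler (the Collector class and the
sample document are hypothetical).  The handler receives no comment event
and no trace of the DOCTYPE declaration, and the entity reference arrives
already expanded:

```python
import xml.sax


class Collector(xml.sax.ContentHandler):
    """Log the only events a basic content handler is told about."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        self.events.append(("chars", content))

    def endElement(self, name):
        self.events.append(("end", name))


SOURCE = (
    b'<!DOCTYPE doc [<!ENTITY who "world">]>'
    b"<doc><!-- note -->hello &who;</doc>"
)

handler = Collector()
xml.sax.parseString(SOURCE, handler)

# Character data may arrive in several chunks, so join before inspecting.
text = "".join(value for kind, value in handler.events if kind == "chars")
print(text)
```

An editor built on this view could not write the file back out with the
comment, the entity reference, or the DOCTYPE declaration as the author
typed them.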

>  If I am using a design tool, I don't want it to
> toss my comments.  The point being that it is generally better to keep the
> comments, then give specific applications the ability to opt out.

I agree in general.  But I'm not so sure that, from the point of view of
DSIG, which is the main (and so far the only) customer for Canonical XML,
the idea of signing material in comments really makes sense.  And I cannot
see an XML editor being willing to take a Canonical XML view of
the file being edited, for the same reason I gave above.

So in fact I would like to see viable use cases for keeping comments,
since I believe that neither the "Javascript" case nor the "editor" case
stands up to scrutiny.
 
> 3) Whitespace Text node descendants of the root.  The toolsets' elimination
> of this whitespace would seem to be in contradiction of section 2.10 of the
> XML spec.

You are right.  However, the Infoset at least has interpreted 2.10 to exclude
whitespace outside the document element, as has the DOM.

> However, the point is moot since the problem can be solved by the
> same idea as #2 above.  The XPath expression to opt out of this is given.

My point is that unless the XPath processor is founded on something other
than SAX or DOM, it won't be able to access such whitespace anyhow.
So omitting it makes it easier for applications, especially small-footprint
applications, to generate Canonical XML.
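
A quick way to check this for the SAX case, sketched with Python's xml.sax
(an assumption; other SAX implementations may differ, and TextLog is a
made-up name).  The handler logs everything delivered through either
characters() or ignorableWhitespace():

```python
import xml.sax


class TextLog(xml.sax.ContentHandler):
    """Collect every piece of text or whitespace the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)

    def ignorableWhitespace(self, whitespace):
        self.chunks.append(whitespace)


# Whitespace both before and after the document element.
padded = b"<?xml version='1.0'?>\n\n   <doc>inside</doc>\n\n\n"

log = TextLog()
xml.sax.parseString(padded, log)
reported = "".join(log.chunks)
print(repr(reported))
```

Only the text inside the document element reaches the handler, so anything
layered on top of this interface never sees the surrounding whitespace.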

It is also not clear that the XPath data model requires it to even be
represented: clause 5.1 of [XPath] speaks of an element node, comment nodes,
and processing instruction nodes as children of the root node, but no mention
is made of text nodes.

I have sent queries to XML Core and to xpath-comments
to ask what the intention is with respect to whitespace outside the
document element.

> Also, do the tools you have discard comments outside the main document
> element?  In other words, do we have the same toolset problem with comment
> node descendants of the root?

DOM at least does allow the representation of such comments.
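
For instance, a sketch using Python's xml.dom.minidom (assumed here as a
representative DOM implementation; the sample document is invented) shows
the comments on either side of the document element surviving as children
of the Document node:

```python
from xml.dom import Node, minidom

doc = minidom.parseString("<!-- before -->\n<doc/>\n<!-- after -->")

# Comments that are direct children of the Document node, in document order.
comments = [
    child.data
    for child in doc.childNodes
    if child.nodeType == Node.COMMENT_NODE
]
print(comments)
```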

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)

Received on Friday, 2 June 2000 17:19:05 UTC