- From: François Yergeau <francois@yergeau.com>
- Date: Wed, 06 Aug 2003 13:35:49 -0400
- To: www-dom@w3.org
Sorry about the delay, these are the comments from the XML Core WG about the DOM Level 3 Core and Load&Save Last Call specs. Core (http://www.w3.org/TR/2003/WD-DOM-Level-3-Core-20030609/) ==== Document interface, "xmlStandalone" and "xmlVersion" attributes: both descriptions say that there is a NOT_SUPPORTED_ERR if this document *does* support the "XML" feature. Shouldn't that be does *not* support? +++++++++++++++++++++++++++++++++++++++++ Document interface, "xmlStandalone" attribute: this is said to be from the XML declaration, but is boolean whereas the XML declaration can specify 3 values: "yes", "no" and not specified. Either the datatype should be changed or (my preference) it should be specified that this is true when the XML declaration says "yes", false otherwise. Appendix C.1.1 says that this comes from the [standalone] property, but does not address the case where the property has no value. +++++++++++++++++++++++++++++++++++++++++ Node interface, nodeName attribute: the description doesn't say that this is supposed to be the qualified name of the node. Nor do the descriptions of Element.tagName and Attr.name. One has to determine that from the description of Node.prefix! +++++++++++++++++++++++++++++++++++++++++ Attr interface, "value" attribute: the spec should clearly specify that when retrieving the value of an attribute that contains a reference to an entity for which no definition is available, the processor will treat this entity reference as an empty reference (see the reply to C15 in http://lists.w3.org/Archives/Member/w3c-dom-wg/2003AprJun/0075.html). Same comment for Element.getAttribute() and Element.getAttributeNS(). +++++++++++++++++++++++++++++++++++++++++ Entity interface: the 4th paragraph starts "XML does not mandate that a non-validating XML processor read and process entity declarations made in the external subset or declared in external parameter entities." The last occurrence of "external" is superfluous and somewhat misleading, since non-validating processors are not obligated to read even *internal* parameter entities. This latter point is pretty obscure and under-documented in the XML spec, but see the last sentence in http://www.w3.org/TR/2000/REC-xml-20001006#sec-rmd, as well as published erratum E8 at http://www.w3.org/XML/xml-V10-2e-errata#E8. +++++++++++++++++++++++++++++++++++++++++ Appendix B.1 void Element.normalizeNamespaces() { ... // Fixup element's namespace // if ( Element's namespaceURI != null ) { ... } else { // Element has no namespace URI: if ( Element's localName is null ) { ... } else { // Element has no namespace URI // Element has no pseudo-prefix if ( default Namespace in scope is "no namespace" ) { ==> do nothing, we're fine as we stand } else { if ( there's a conflicting local default namespace declaration already present ) { ==> change its value to use this empty namespace. } else { ==> Set the default namespace to "no namespace" by creating or changing a local declaration attribute: xmlns="". } There seems to be useless redundancy in the last "if". Paraphrasing: "If there's a declaration, change it, else create one or change an existing one". Either drop the "if" and keep the "else" part or keep the "if" and drop "or changing" from the "else" part. +++++++++++++++++++++++++++++++++++++++++ In appendix B.1.1, the first sentence (after the initial Note) is really hard to parse. Comments in the algorithms say things like "if the prefix/namespace pair is within scope...", using this wording would be clearer. It's not an element that's within scope, it's really the prefix/namespaceURI pair. Some rewording/tightening needed. +++++++++++++++++++++++++++++++++++++++++ In appendix C.1.1, Document.actualEncoding should be set to [character encoding scheme], which is "The name of the character encoding scheme in which the document entity is expressed." in the infoset spec, not necessarily the value of the encoding pseudo-att of the XML declaration. Document.xmlEncoding probably should not be set. In appendix C.1.2, [character encoding scheme] should be set from Document.actualEncoding. +++++++++++++++++++++++++++++++++++++++++ In appendix C.1.1, Document.xmlVersion should be "The [version] property or 1.0 if the latter has no value." Same, mutatis mutandis, for xmlStandalone (false if no value). +++++++++++++++++++++++++++++++++++++++++ In appendix C.1.2, [notations] should be "Document.doctype.notations", in order to point to the correct DocumentType object. +++++++++++++++++++++++++++++++++++++++++ In appendix C.2.2, the statement "Element nodes with no namespace URI (Node.namespaceURI equals to null) cannot be represented using the Infoset." is counterfactual. The infoset supports these, provided names do not contain colons. +++++++++++++++++++++++++++++++++++++++++ In appendix C.3.1, Node.previousSibling and Node.nextSibling should be set to null. +++++++++++++++++++++++++++++++++++++++++ In appendix C.6.1, it should be clarified that CDATASection nodes cannot occur from an infoset mapping, since the infoset doesn't contain CDATA section boundaries. Load&Save (http://www.w3.org/TR/2003/WD-DOM-Level-3-LS-20030619/) ========= In several places (1.2.3, 1.2.4, DOMInput, DOMOutput), it is said that UTF-16 is defined in [Unicode] and Amendment 1 of [ISO/IEC 10646]. That last part is obsolete, UTF-16 was defined in Amd 1 of 10646:1993, but integrated in an Appendix of 10646:2000. Just say "...in [Unicode] and in [ISO/IEC 10646]". +++++++++++++++++++++++++++++++++++++++++ In interface DOMImplementationLS, the createDOMOutput() method is missing from the IDL definition and from the Methods details. +++++++++++++++++++++++++++++++++++++++++ In interface DOMImplementationLS, method createDOMInput(), it says "Create a new empty input source." "Empty" is not defined. Does it mean that all attributes are null? This comment will also probably apply to createDOMOutput() when the latter is added (see previous comment). +++++++++++++++++++++++++++++++++++++++++ In interface DOMParser, 1st bullet after 3rd para, it is wrong to claim that CDATA sections are structure. It also seems wrong to set expectations that CDATA sections will show up after parsing when in fact parsers are not required to report them. +++++++++++++++++++++++++++++++++++++++++ In interface DOMParser, 2nd bullet after 3rd para, there is an instance of "datatype-normalization", within quotes and as a link to Core, and just below an instance of "data-type-normalization" without quotes, in fixed-width type, not linked and with an apparently extraneous hyphen. Aren't these the same? The spec certainly makes a nice effort to make them appear different :-) +++++++++++++++++++++++++++++++++++++++++ In interface DOMParser, in the description of the "unbound-namespace-in-entity" warning, how can an unbound prefix be found in an entity *declaration*? Perhaps you mean in an entity's replacement text? Also typo "encounterd". +++++++++++++++++++++++++++++++++++++++++ In interface DOMParser, in the description of the "namespaces" parameter, shouldn't there also be a reference to [XML Namespaces 1.1]? +++++++++++++++++++++++++++++++++++++++++ In interface DOMInput, it says "The DOMParser will use the DOMInput object to determine how to read data. The DOMParser will look at the different inputs specified in the DOMInput in the following order to know which one to read from, the first one through which data is available will be used: " It is not clear how the DOMParser does that, i.e. how it determines if data is available. Is there an expectation that, say, DOMInput.characterStream will be null if data is not available there? What about stringData? Null or empty? Is this binding-specific? Same comment for DOMOutput. +++++++++++++++++++++++++++++++++++++++++ In interface DOMSerializer, the statement "For all other types of nodes the serialized form is not specified, but should be something useful to a human for debugging or diagnostic purposes." seems a bit weak. It should be possible to specify more, especially for Element nodes. +++++++++++++++++++++++++++++++++++++++++ In interface DOMSerializer, method writeURI(), it would be desirable to specify more how to write to a URI, at least for very common schemes such as HTTP(S) and mailto. In HTTP, it would seem desirable to actually be able to choose which verb (POST or PUT) is used. POST is supposed to be used when posting forms, which XForms does with XML data. PUT is supposed to be used for uploading data, here an XML document. The DOM user should be able to specify which to use, perhaps using an additional parameter to the method. The spec should also specify to include a Content-Type header with a media type (which? need a parameter to the method?) and a charset parameter. Same comment for DOMOutput when the systemID ends up being used. +++++++++++++++++++++++++++++++++++++++++ In interface DOMOutput, the descriptions of encoding and systemID seem to have been more or less copy-pasted from DOMInput, not fully taking into account the fact that output is involved, not input. Setting encoding indicates an intention, not a knowledge of the encoding of some existing data. +++++++++++++++++++++++++++++++++++++++++ In interface DOMOutput, is it not possible to specify the behavior when the systemID is a relative URI? Wouldn't "relative to baseURI of Document" work? +++++++++++++++++++++++++++++++++++++++++ -- François Yergeau
Received on Wednesday, 6 August 2003 13:36:25 UTC