Last Call feedback on XML Infoset

Dear members of the XML Core working group

In response to the Last Call of the Infoset working draft [1], the XML
Query Working Group (or XML Query for short) would like to make the
following comments that are based on the expectations and the requirements
that XML Query has as a downstream consumer of an infoset provider and
as a provider of XML fragments and documents.

These comments were discussed in our most recent teleconference:
http://lists.w3.org/Archives/Member/w3c-xml-query-wg/2000Feb/0097.html

We understand that some of these comments might not be immediately
addressable for the version 1.0 PR and we are aware that we are submitting
these comments after the deadline. However, XML Query considers these
issues of importance to its own work and humbly requests that these items
be seriously considered and definitely addressed in the subsequent
versions of Infoset in cooperation with XML Query.

The XML Query WG supports the comments [2] by the XML Schema WG on the
document "XML Information Set W3C Working Draft 20-December-1999" [1] and
would like to add the following issues.

1. Addition to the following comment 1 of [2]:

"* The Infoset spec should make clearer that it provides a definition
   of one specific body of information, not an exhaustive or unified
   inventory of the various kinds information which specialized XML
   processors (e.g. schema-aware processors) may provide to
   downstream applications."

To meet requirements 3.3.1, 3.3.2, and 3.3.5 in the requirements document
[3] of the XML Query WG, it is necessary that a future version of Infoset
include type information derivable from a schema-aware XML processor. This
needs to include complex types (such as list-typed attributes NMTOKENS,
IDREFS) as well as all simple types of XML-Schema.

2. Namespace URI in the Namespace Information Item

The current draft [1] requires that the namespace information item provides
either an absolute namespace URI or the character list that forms the
namespace identifier. It is not clear what the term absolute URI means and
how it relates to the second, alternative representation. Does it mean that
a relative URI in a namespace declaration is extended by the base URI of the
document? This seems to contradict the spirit of the Namespaces
specification, which reads:

     [Definition:] URI references which identify namespaces are
     considered identical when they are exactly the same character-
     for-character. Note that URI references which are not identical
     in this sense may in fact be functionally equivalent. Examples
     include URI references which differ only in case, or which are
     in external entities which have different effective base URIs.

This means that two documents with the same relative namespace URI are
equivalent regardless of the (implied) absolute URI. This is certainly not
what the Infoset specification implies.
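
As an illustration of the difference, the following minimal Python sketch
compares one relative namespace name under both readings. The namespace
name and the two base URIs are hypothetical, and urljoin from the Python
standard library merely stands in for whatever resolution the Infoset
draft may intend.

```python
from urllib.parse import urljoin

# Hypothetical relative namespace name, written identically in two documents.
NS_AS_WRITTEN = "ns/orders"

# Hypothetical base URIs of the two documents.
BASE_A = "http://example.org/a/doc.xml"
BASE_B = "http://example.com/b/doc.xml"


def same_by_literal_comparison(ns1: str, ns2: str) -> bool:
    """Character-for-character comparison, as in the Namespaces definition."""
    return ns1 == ns2


def same_after_resolution(ns1: str, base1: str, ns2: str, base2: str) -> bool:
    """Comparison after resolving each name against its document's base URI."""
    return urljoin(base1, ns1) == urljoin(base2, ns2)


literal_match = same_by_literal_comparison(NS_AS_WRITTEN, NS_AS_WRITTEN)
resolved_match = same_after_resolution(
    NS_AS_WRITTEN, BASE_A, NS_AS_WRITTEN, BASE_B
)
print(literal_match, resolved_match)
```

Under these assumptions the literal comparison treats the two namespace
names as identical, while the resolved forms differ because the base URIs
differ.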

Also, the two properties are specified as alternative properties. Thus,
the same document can have two different infosets, where one provides the
absolute namespace URI in [namespace URI] form and the other presents the
same URI as written using the [children] property.  No explanation is given
for having the two properties as alternatives or when each should be used,
although it is implied in the property definition of [namespace URI] that
resolution of relative namespaces is taking place and that this may be the
distinguishing factor.

Comparison is important from a query point of view; thus, it is critical
that the conditions be specified for when to use each property.  Since
the Infoset specification cannot by itself redefine the Namespaces
specification, resolution of relative namespaces into absolute form
cannot yet be required.  Until the Namespaces specification is revised,
we propose that the [children] property -- possibly renamed [URI] to
avoid confusion -- be a core property for holding the namespace URI (the
exact string as supplied in the original document) and that the current
[namespace URI] property be removed.

3. Editorial issue: null string vs empty string

XML Query is concerned about the unnecessary and potentially misleading
use of the term "null" in the Infoset.

The phrases "null string" and "empty string" are used as though these are
interchangeable - which is not the case in XML Schema or in standard SQL -
and it is the "empty string" that would be consistent with usage in those
specifications.
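
The distinction can be seen in a short Python sketch. It uses the sqlite3
module from the standard library; using SQLite as a stand-in for standard
SQL behaviour is an assumption of this example, and the variable names are
illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# An empty string is a value: it is not NULL and it has length 0.
empty_is_null = cur.execute("SELECT '' IS NULL").fetchone()[0]
empty_length = cur.execute("SELECT length('')").fetchone()[0]

# NULL is the absence of a value: its length is NULL, and comparing it
# with the empty string yields NULL rather than true or false.
null_length = cur.execute("SELECT length(NULL)").fetchone()[0]
null_equals_empty = cur.execute("SELECT NULL = ''").fetchone()[0]

conn.close()

print(empty_is_null, empty_length, null_length, null_equals_empty)
```

Here the empty string behaves as an ordinary zero-length value, whereas
NULL propagates through both the length function and the comparison.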

Moreover, neither the Namespaces REC nor RFC2396 ever says "null" for an
unspecified or "" URI-reference - they always say "empty", e.g.

  Namespaces 5.2:

"If the URI reference in a default namespace declaration is empty, then
unprefixed elements in the scope of the declaration are not considered to
be in any namespace.

...The default namespace can be set to the empty string."

  RFC 2396 4.2:

"In other words, an empty URI reference within a document is interpreted as
a reference to the start of that document,..."

The XML Query WG requests that the use of "null" in Infoset be changed to
"empty", to be consistent with Schema, Namespaces, RFC2396, and SQL.


The XML Query Working Group kindly requests that these issues be addressed
and resolved as soon as possible and that the necessary liaisons between
Infoset/Core and the other working groups are established to resolve these
issues in a mutually acceptable way.

Best regards
The XML Query WG members

[1] http://www.w3.org/TR/xml-infoset
[2] http://lists.w3.org/Archives/Public/www-xml-infoset-comments/2000JanMar/0013.html
[3] http://www.w3.org/TR/2000/WD-xmlquery-req-20000131

Paul Cotton, DB2 Language Architecture & Standards
IBM Canada Ltd, 17 Eleanor Drive, Nepean, Ontario K2E 6A3
Phone: (613) 225-5445   Fax:  (613) 226-6913
email: cotton@ca.ibm.com

Received on Monday, 28 February 2000 09:29:44 UTC