Comments on WD-xml-media-types-20041102

Dear XML Protocol Working Group,
Dear Web Services Description Working Group,

  These are my comments on the WD "Assigning Media Types to Binary Data
in XML" at <http://www.w3.org/TR/2004/WD-xml-media-types-20041102/>.

Please change the document so that it is clear which sections and which
references are normative and which are informative.

The document has several sections labeled "Editorial note". These use
table markup but do not really contain tabular data; please change the
markup to something else, e.g. <dl> or just a <p> with a leading
<strong>Editorial note:</strong>. Also note that "Editorial note:" is
not a proper table summary (as specified in the summary attribute).

The xml-infoset reference refers to the first edition of the document;
please change it to refer to the second edition.

Section 1 has an example "text/xml; charset=utf-16". Please change this
example to something else: use of text/xml is discouraged, and for XML
documents the use of a charset parameter is claimed to be unnecessary
and harmful.

Section 1.1 states

  Namespace names of the general form "http://example.org/..." and
  "http://example.com/..." represent application or context-dependent
  URIs [IETF RFC 2396].

Please clarify in the document what you mean here; the statement does
not really make sense to me.

Section 1.1 notes that "http://www.w3.org/2004/11/xmlmime" is defined by
the document. The namespace URI is too long, and it generally makes
little sense to put dates into namespace URIs; please change this at
least to e.g. "http://www.w3.org/2004/xmlmime" to be consistent with a
wide range of W3C namespace URIs.

Section 2.2 states

  The value and the meaning of the expectedMediaType attribute is
  similar to the value allowed for the 'Accept' header defined by HTTP
  1.1 specification, Section 14.1 [IETF RFC 2616] and MUST follow the
  production rules defined in that section. The 'q' parameter defined by
  HTTP 1.1 specification, Section 3.9 [IETF RFC 2616] is allowed, but
  other accept-extensions are not allowed.

The production rule in RFC 2616 cannot be used here: it is defined in
terms of octets, while the information item is a sequence of characters.
In order to re-use the production rule you would need to define how to
map the characters to octets before attempting to match; the alternative
would be to create a new production rule that is defined in terms of
characters.
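
A minimal sketch of the second option, in Python, to illustrate what a
production defined over characters could look like. The names TOKEN,
MEDIA_RANGE and is_media_range are mine and purely illustrative; the
character class is my transcription of the RFC 2616 "token" rule, and
parameters are deliberately left out.

```python
import re

# RFC 2616 "token" characters, written as a class of characters
# rather than octets (assumption: my transcription of the rule).
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# type "/" subtype. Because "*" is itself a token character, "*/*"
# and "image/*" are covered by the same pattern.
MEDIA_RANGE = re.compile(TOKEN + "/" + TOKEN)


def is_media_range(value):
    """Return True if the whole character string is a bare media-range."""
    return MEDIA_RANGE.fullmatch(value) is not None
```

Since the pattern is applied to a character string directly, no mapping
to octets is involved; any character outside the class, including any
non-ASCII character, simply fails to match.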

Please change the section so that it is clear what the differences are.
You note these are "similar" and cite one difference, but it is not
clear to me whether this is the only difference.

Please change the section so that it is clear which production rule you
actually mean; section 14.1 of RFC 2616 has several. It would seem that
you mean the production rule for "Accept", but as that contains
"Accept:" you probably don't, so you might mean the Accept production
rule without the leading "Accept:", or media-range.

Whatever you do, please ensure that parameters can be used, e.g.

  x:expectedMediaType = 'application/xhtml+xml;profile=http://...'

It seems that re-using the Accept header syntax here makes it impossible
to state that any kind of "XML" is accepted. I think, however, that this
is an important use case for such an attribute. It is often not possible
to include XML documents in other XML documents (e.g. if the document
has a document type declaration). Using e.g.

  x:expectedMediaType = 'application/xml'

would likely not work, as e.g. image/svg+xml does not match it. I think
the syntax should be extended so that it is possible to express that XML
is expected, whatever specific type might be used. This would be useful
e.g. for a web service interface to the W3C Markup Validator.
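
To illustrate what "any kind of XML" could mean in practice, here is a
rough Python sketch. It assumes the "+xml" suffix convention of RFC 3023
plus the two generic types text/xml and application/xml; the function
name is_xml_media_type is hypothetical, and the check is not meant to be
exhaustive (it ignores e.g. the external parsed entity types).

```python
def is_xml_media_type(media_type):
    """Guess whether a media type denotes XML, ignoring any parameters."""
    # Drop parameters such as ";charset=..." and normalise case.
    essence = media_type.split(";", 1)[0].strip().lower()
    type_, _, subtype = essence.partition("/")
    # The two generic XML types.
    if (type_, subtype) in {("text", "xml"), ("application", "xml")}:
        return True
    # Assumption: RFC 3023 "+xml" suffix convention, e.g. image/svg+xml.
    return bool(type_) and len(subtype) > len("+xml") and subtype.endswith("+xml")
```

With this, is_xml_media_type("image/svg+xml") holds, whereas a plain
comparison of "image/svg+xml" against "application/xml" does not.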

The attribute could allow a special string like

  x:expectedMediaType = 'xml'

in place of a media-range. This would allow the W3C Markup Validator to
use it like

  x:expectedMediaType = 'xml, text/html, text/sgml'

If accept-extensions continues to be disallowed, please include a
rationale for the exclusion in the document.

Section 3.1 states

  When the expectedMediaType annotation attribute has a wildcard ("*")
  or a list of acceptable media types, the schema SHOULD require the
  contentType attribute to be present.

This seems to imply that

  x:expectedMediaType = '*'

would be allowed, which does not seem to be the case. If the intention
is to allow this syntax, please change the definition accordingly; if it
is not allowed, please re-word the paragraph to make it clear what you
mean.

Section 3.1 states

  The value of the contentType attribute, if present, SHOULD be within
  the range specified by the expectedMediaType annotation attribute, if
  specified in the schema.

It is not clear to me how it is determined whether this requirement has
been met. If there is an algorithm, please reference it normatively, or
provide your own.
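
As an illustration of the kind of algorithm I have in mind, here is a
rough Python sketch. It is not taken from the draft or from RFC 2616;
the names parse_ranges and within_range are hypothetical, and it makes
simplifying assumptions: parameters (including 'q') are ignored on both
sides, and the list is split naively on commas.

```python
def parse_ranges(expected):
    """Split an expectedMediaType-style value into (type, subtype) pairs."""
    ranges = []
    # Assumption: naive split; a comma inside a quoted parameter value
    # would be mishandled.
    for item in expected.split(","):
        essence = item.split(";", 1)[0].strip().lower()
        if not essence:
            continue
        type_, _, subtype = essence.partition("/")
        ranges.append((type_, subtype))
    return ranges


def within_range(content_type, expected):
    """Return True if content_type matches at least one listed range."""
    essence = content_type.split(";", 1)[0].strip().lower()
    type_, _, subtype = essence.partition("/")
    for r_type, r_subtype in parse_ranges(expected):
        if r_type == "*" and r_subtype == "*":
            return True  # "*/*" accepts everything
        if r_type == type_ and r_subtype in ("*", subtype):
            return True  # "type/*" or an exact "type/subtype"
    return False
```

For example, within_range("image/png", "image/*, text/html;q=0.5") holds,
while within_range("image/svg+xml", "application/xml") does not. Whether
parameters and 'q' values should influence the result is exactly the
kind of question the document would need to answer.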

Section 2.1 does not define the lexical or value space of the attribute.
It states it is xs:string, but it would seem you would rather want to
say that this relates to RFC 2616 Content-Type in some way. When fixing
this, please consider my remarks about the "Accept:" header reference
here, too.

The XML Schema for the attributes seems overly lax. For example, as
currently defined, x:expectedMediaType only requires that the string has
at least three characters. Please change the definition so that an XML
Schema validator can determine conformance of the attribute value as
much as possible.

Please add a namespace prefix to all occurrences of "expectedMediaType"
and "contentType" in the document where you do not specifically refer to
the local name of the attribute. This helps to avoid misunderstandings
by people who are not totally namespace-aware and makes the document
more readable.

regards.

Received on Tuesday, 2 November 2004 17:47:22 UTC