Fw: Revising RFC 3023

Forwarded by MURATA Makoto <murata@hokkaido.email.ne.jp>
----------------------- Original Message -----------------------
From:    MURATA Makoto <murata@hokkaido.email.ne.jp>
To:      ietf-xml-mime@imc.org
Date:    Sun, 25 Jul 2004 22:43:07 +0900
Subject: Revising RFC 3023
----

I am forwarding an e-mail from Noah Mendelsohn.  It contains some
suggestions I got from Noah in private discussion at IBM.  I have his
permission to circulate it publicly.

Cheers,

MURATA Makoto
--------------------------------------------
Murata Makoto writes:

>> It is true that RFC 3023 does not reference XML 1.1, although I
>> do not see anything specific to XML 1.0 (with the exception of
>> security concerns caused by C0 control functions).

The references I found in 3023 are:

"The World Wide Web Consortium has issued Extensible Markup Language
(XML) 1.0 (Second Edition)[XML]."

"Published specification: Extensible Markup Language (XML) 1.0
(Second Edition)[XML]."

and more specifically all the examples which show:

Content-type: text/xml; charset="utf-8"
<?xml version="1.0" encoding="utf-8"?>

and so on.  I now see that all of this may have just been intended as
the then-latest forms, but I read them as specific instructions for
use of XML 1.0 as the proper form of application/xml.  So, at the very
least, I think it would be helpful to be more explicit in future
replacements or updates to 3023.

>> I plan to reference XML 1.1 as well as XML 1.0 in the next I-D,
>> which is expected to supersede RFC 3023.

I'm torn about this.  In the specific case of SOAP, our recommendation
says that implementations of our HTTP bindings MUST be capable of
sending and receiving in the media-type "application/soap+xml", which
we normatively base on 3023 and application/xml.  In making this
statement, I doubt that most members of the working group realized
that they might have been introducing an open-ended need to conform to
all future versions of XML, including those that would allow
previously-illegal control character content.  Maybe if we'd known
such an XML 1.1 was coming we would have and should have put in such a
requirement to support both: my point is just that I don't think we
did so consciously.

I have opened a discussion of all this on the distApp mailing list
[1], and will in a minute send an update pointing out the tie-in to
RFC 3023.  I will cc: you on that update.  distApp is a public mailing
list, and you are most welcome to join in.

For what it's worth, SOAP puts a tremendous emphasis on
interoperability, so saying that two conforming implementations of this
binding can disagree about what's legal in the preferred form of
messages is at least a bit troubling.  The original design point was:
if you conform, you can communicate.  Even if 3023 is to support any
version of XML, we in SOAP may be able to fix things with a
clarification to our recommendation.  It so happens that we already
allow use of any other media type; it's just that we make clear that
application/soap+xml is the lingua franca.  Use something else and it
will only work if the recipient agrees.  I suspect we'll wind up
saying: "application/xml with xml version 1.0 is the lingua franca,
you can use other media types or application/soap+xml with later
versions of XML, but only if the recipient supports them."

The control character business is a bigger mess for us I think, as we
have a quite strong rule that the constraints on SOAP envelopes are
the same everywhere, and they are basically what you'd expect from
Infoset (and implicitly XML 1.0).  Note that such envelopes are by
definition synthetic Infosets.  XML 1.1 suggests that, regardless of
how you serialize the message on each hop, there is a question as to
whether in the abstract the new control characters are allowed in the
envelope infosets.  If so, then such messages cannot transit hops
implemented with already deployed software, and that's very troubling
to me.
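
To make the control character concern concrete, here is a minimal
sketch in Python that lists the characters in a piece of envelope
content which the XML 1.1 Char production admits but the XML 1.0 Char
production does not.  The function names are hypothetical and are not
taken from SOAP or from any XML library; the code-point ranges follow
the Char productions of the two XML Recommendations.

```python
from typing import List


def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """True if the code point matches the XML 1.1 Char production."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml11_only_chars(text: str) -> List[str]:
    """Return the characters representable in XML 1.1 but not in XML 1.0."""
    return [
        ch for ch in text
        if is_xml11_char(ord(ch)) and not is_xml10_char(ord(ch))
    ]
```

An envelope whose character data yields a non-empty list here could
not be carried across a hop that only understands XML 1.0, whatever
serialization the other hops use.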

>> If you have any comments about how these two specs should be
>> referenced, please let me know.  I even think that there should be a
>> W3C guideline about references to XML 1.0 and XML 1.1.  Does the XML
>> Core WG have some recommendations?

I'm not sure I have a carefully considered opinion, but what about the
following as a starting point for discussion?

Very Rough Draft of Proposed W3C policy:

"The publication of XML 1.1 highlights the fact that evolved versions
of XML may change both the serialized form of an XML document and the
content representable in such a document.  In the case of XML 1.1, for
example, certain control characters disallowed by XML 1.0 were added
to the Char production.  Accordingly, W3C recommends that:

* Every recommendation or other publication that makes normative
reference to XML as a serialized document format should make clear
which versions of XML it supports, and in particular whether it allows
for support of potential variants to be published in the future.

* When calling for serialization of data into an XML document, such
recommendations SHOULD appeal to the conventions of the supported
versions of XML regarding proper use of the XML declaration (for
example, the XML 1.1 recommendation suggests use of version="1.1" only
for those documents which would not be correctly represented as
version="1.0".)

* Recommendations should similarly clarify any dependencies on
particular versions of XML regarding the abstract content of data
modeled as XML.  For example, a recommendation based on the Infoset,
XPath data model, DOM, SAX, etc.  SHOULD indicate whether the control
characters introduced by XML 1.1 are allowed in element content,
SHOULD indicate whether potential future changes to XML constructs
(e.g. a purely hypothetical future change to the set of legal XML
name characters) would be supported, and so on.  "
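
The second bullet above (use version="1.1" only when the document
could not be represented as version="1.0") can be sketched roughly as
follows in Python.  This is an illustration under stated assumptions,
not a complete serializer: the function name is hypothetical, only
character data is inspected (names and line-end handling are ignored),
and the encoding is fixed at utf-8.

```python
import re

# C0 control characters that XML 1.1 admits (as character references)
# but that XML 1.0 does not allow at all.
_XML11_ONLY = re.compile("[\x01-\x08\x0b\x0c\x0e-\x1f]")

# Code points that neither XML 1.0 nor XML 1.1 allows.
_NEVER_LEGAL = re.compile("[\x00\ud800-\udfff\ufffe\uffff]")


def declaration_for(text: str) -> str:
    """Pick the lowest XML version able to represent the given content."""
    if _NEVER_LEGAL.search(text):
        raise ValueError("content is not representable in XML 1.0 or 1.1")
    # Assumption: only the characters decide the version in this sketch.
    version = "1.1" if _XML11_ONLY.search(text) else "1.0"
    return '<?xml version="%s" encoding="utf-8"?>' % version
```

With this rule, content that XML 1.0 can represent is always declared
as version="1.0", so already deployed XML 1.0 receivers are affected
only by messages that genuinely need XML 1.1.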

In the case of future versions of RFC 3023, I think the most
appropriate course might be to say:

"application/xml is to be used with any W3C Recommendation-level
version of XML as identified in the version specification of the XML
declaration.  When no such declaration is present, XML 1.0 is assumed.
In all examples herein where a specific version such as version="1.0"
is shown, it is understood that other versions may also be used,
providing the content does indeed conform to the specified version of
the XML Recommendation.

Specifications and recommendations based on or referring to this RFC
SHOULD indicate any limitations on the particular versions of XML to
be used.  For example, a particular specification might indicate:
'content MUST be represented using media-type application/xml, and the
document must either (a) carry an XML declaration specifying
version="1.0" or (b) omit the XML declaration, in which case per the
XML recommendation the version defaults to 1.0'."
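
As a rough illustration of the proposed wording, the Python sketch
below reads the version from the XML declaration, assumes 1.0 when no
declaration is present, and lets a dependent specification restrict
the versions it accepts.  The function names and the default "allowed"
tuple are hypothetical, the document is assumed to be already decoded
to a string, and the regular expression is a simplification of the
XMLDecl production rather than a full parser.

```python
import re

# Simplified match for '<?xml version="..."' at the very start of a document.
_XML_DECL = re.compile(r"""^<\?xml\s+version\s*=\s*(['"])([0-9.]+)\1""")


def declared_xml_version(document: str) -> str:
    """Return the declared XML version, or "1.0" if there is no declaration."""
    match = _XML_DECL.match(document.lstrip("\ufeff"))  # skip a leading BOM
    return match.group(2) if match else "1.0"


def check_version_allowed(document: str, allowed=("1.0",)) -> str:
    """Raise ValueError if the document uses a version outside 'allowed'."""
    version = declared_xml_version(document)
    if version not in allowed:
        raise ValueError("XML version %s is not supported here" % version)
    return version
```

A specification that wants the example restriction quoted above would
call check_version_allowed with allowed=("1.0",); one that accepts both
versions would pass allowed=("1.0", "1.1").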

Does that seem like a reasonable start?  

[1] http://lists.w3.org/Archives/Public/xml-dist-app/2004Feb/0006.html


--------------------- Original Message Ends --------------------

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>


Received on Monday, 26 July 2004 19:20:22 UTC