Re: Rough sketch for an I-D (a successor of RFC 3023)

On Wednesday, October 29, 2003, 5:24:57 PM, MURATA wrote:


MM> Here is a rough sketch.  Having presented this sketch, I ask the TAG to
MM> reconsider its decision to publish an I-D that updates RFC 3023.  It
MM> would be nice if somebody from W3C (probably some member of I18N WG or
MM> XML Core WG?) can help me.

Your suggestion to write a replacement was discussed by the TAG, in
the context of my action item to write an updater I-D. They suggested
we join forces here. So, if you are looking for co-authors, I would
like to suggest myself.

MM> I think that further discussion about the content of this I-D
MM> should be moved to the IETF-XML-MIME ML.

I prefer a list and a process that I understand well.

MM> By the way, I cannot find image/svg+xml in the IANA list

Correct. I have an action to write it, but we thought that the MIME
registration revision by Ned Freed would have been an RFC by now. We
want to register image/svg+xml in the standards tree, obviously.

MM>  and cannot find an I-D.
MM> I find an I-D for application/rdf+xml, but no RFC yet.

MM> 1) deprecate text/xml, text/xml-external-parsed-entity, and text/*+xml

Yes.

MM> - the MIME canonical form with short lines delimited by CR-LF, making
MM>   UTF-16 and UTF-32 impossible

MM> - Casual users will be embarrassed if XML is displayed as text, while
MM>   experts can certainly save and then browse XML documents.

In other words, it's text, but it does not meet the requirements for
text/*
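
As a rough illustration of the CR-LF point, here is a minimal Python
sketch. The helper name has_crlf_bytes is made up for this example; it
only checks whether the encoded form of a line break contains the byte
pair 0x0D 0x0A that the MIME canonical form for text/* expects.

```python
def has_crlf_bytes(text: str, charset: str) -> bool:
    """Return True if the encoded text contains the raw bytes CR LF (0x0D 0x0A)."""
    return b"\r\n" in text.encode(charset)


line_break = "\r\n"
for charset in ("utf-8", "utf-16-le", "utf-16-be"):
    # In UTF-16 each of CR and LF takes two bytes, so the CR LF byte pair
    # never appears next to each other.
    print(charset, line_break.encode(charset), has_crlf_bytes(line_break, charset))
```
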

MM> - Worries that the absence of the charset parameter of
MM>   text/xml and text/*+xml is particularly harmful, since the
MM>   default of that parameter is US-ASCII

Yes, that is particularly harmful. If MIME headers are given
precedence and treated as authoritative, then every text/xml document
that omits the charset parameter and uses any character outside ASCII
is not well-formed.
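
A minimal Python sketch of the problem, assuming a made-up sample
document and a hypothetical helper named decodes_as. It uses Python's
codecs as a stand-in for what an XML processor does when it is told
which charset to use:

```python
# Hypothetical sample: a UTF-8 encoded document with one non-ASCII character.
doc = '<?xml version="1.0" encoding="UTF-8"?><p>caf\u00e9</p>'.encode("utf-8")


def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if data is a legal byte sequence in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# The US-ASCII default of text/* applied to the bytes above.
print(decodes_as(doc, "us-ascii"))
# The encoding the document itself declares.
print(decodes_as(doc, "utf-8"))
```
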


MM> 2) the optional charset parameter is RECOMMENDED if and
MM>    only if the value is guaranteed to be correct

That is an improvement, but does not go far enough.

Firstly, the charset parameter and the XML encoding declaration should
never differ; otherwise the document is well-formed only in transit,
and not when it is processed on the server or saved on the client.
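
A sketch of how such a mismatch could be detected, in Python. The
function names are invented for this example. The regular expression
is a simplification that only works for ASCII-compatible encodings,
and charset aliases (such as latin1 versus iso-8859-1) are not
normalized:

```python
import re
from email.message import Message

# Simplified match for the encoding pseudo-attribute of an XML declaration.
_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def header_charset(content_type: str):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body: bytes):
    """Return the lowercased encoding named in the XML declaration, or None."""
    match = _DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def charsets_disagree(content_type: str, body: bytes) -> bool:
    """True only when both labels are present and name different charsets."""
    header, declared = header_charset(content_type), declared_encoding(body)
    return header is not None and declared is not None and header != declared


body = b'<?xml version="1.0" encoding="UTF-8"?><p/>'
print(charsets_disagree("text/xml; charset=iso-8859-1", body))
print(charsets_disagree("application/xml; charset=utf-8", body))
```
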

MM> - Server implementers or Server Managers SHOULD NOT specify the
MM>   default value of the charset
MM>   parameter of text/xml, application/xml,
MM>   text/xml-external-parsed-entity,
MM>   application/xml-external-parsed-entity, */*+xml, or
MM>   application/xml-dtd, unless they can guarantee that
MM>   that default value is correct for all MIME entities of these media
MM>   types.

Which, it is possible to show, they cannot do in the general case.

MM> 3) Fragment identifier

MM> At present, RFC 3023 says:

MM> 	As of today, no established specifications define identifiers
MM> 	for XML media types. However, a working draft published by
MM> 	W3C, namely "XML Pointer Language (XPointer)", attempts to
MM> 	define fragment identifiers for text/xml and
MM> 	application/xml. The current specification for XPointer is
MM> 	available at http://www.w3.org/TR/xptr.

MM> We have XPointer recommendations but are not ready to bless 
MM> XPointer.  We should say so.

Perhaps the framework and scheme should be pointed to?

MM> 4) Possible reasons for not providing the charset parameter for
MM> specialized media types

MM> I think that "This media type is utf-8 only and thus does not need any
MM> mechanism to identify the charset" is a perfectly good reason, since
MM> "UTF-8 only" is a generic principle.  This should be mentioned in the
MM> I-D.

It's one specific reason. It's not enough, though. Why should an XML
file that can be either UTF-8 or UTF-16 need a charset parameter? It
offers no useful or additional information. All XML processors handle
both charsets as a conformance requirement.
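
To show why the parameter adds nothing in that case, here is a minimal
Python sketch of telling the two apart from the bytes alone. It relies
on the XML 1.0 rules that a UTF-16 entity begins with a byte order
mark, and that an entity with neither a BOM nor an encoding
declaration is UTF-8. The function name is invented, and the sketch
deliberately ignores every other encoding (including UTF-32, whose
little-endian BOM begins with the same two bytes as UTF-16LE):

```python
import codecs


def sniff_utf8_or_utf16(data: bytes) -> str:
    """Guess between UTF-8 and UTF-16 from the leading bytes only."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"
    # No BOM: UTF-16 is ruled out, so fall back to the XML default.
    return "utf-8"


print(sniff_utf8_or_utf16("<p/>".encode("utf-16")))  # Python's utf-16 codec writes a BOM
print(sniff_utf8_or_utf16(b"<p/>"))
```
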

MM> 5) Needs a real example for the +xml convention.

MM> Application/soap+xml should be mentioned in Section 8 (Examples).

MM> 6) Update References 

MM> Reference to three XPointer recommendations without blessing them as
MM> fragment identifiers of XML media types.

MM> Reference to MathML Version 2 rather than MathML Version 1.1

MM> Reference to Scalable Vector Graphics (SVG) 1.1

MM> Although XML 1.1 is not a recommendation yet, I think that we should
MM> mention it and say "It is very likely that XML 1.1 will reference to
MM> this document".

MM> 7) New Appendix: Changes from RFC 3023

MM> We need a summary of these changes 

The rest of this sounds good; but I think we need fuller discussion on
the use of a charset parameter that disagrees with what the XML
encoding declaration says. This is clearly harmful, yet seems to be
encouraged.

-- 
 Chris                            mailto:chris@w3.org

Received on Wednesday, 29 October 2003 14:38:00 UTC