
Re: Rough sketch for an I-D (a successor of RFC 3023)

From: MURATA Makoto <murata@hokkaido.email.ne.jp>
Date: Thu, 30 Oct 2003 08:33:01 +0900
To: Chris Lilley <chris@w3.org>
Cc: www-tag@w3.org
Message-Id: <20031030074247.081F.MURATA@hokkaido.email.ne.jp>

> Your suggestion to write a replacement was discussed by the TAG, in
> the context of my action item to write an updater ID. They suggested we
> join forces here. So, if you are looking for co-authors, I would like
> to suggest myself.

That is fine with me.  I am happy to work with you.

> MM> I think that further discussion about the content of this I-D
> MM> should be moved to the IETF-XML-MIME ML.
> I prefer a list and a process that I understand well.

I prefer an IETF ML and an RFC at this stage of the game.  First, Ned's
update is not an RFC yet.  Second, if we are going to create an RFC
(as suggested in the TAG minutes), an IETF ML is a better place.

> MM> By the way, I cannot find image/svg+xml in the IANA list
> Correct. I have an action to write it, but we thought that the MIME
> registration revision by Ned Freed would have been an RFC by now. We
> want to register image/svg+xml in the standards tree, obviously.

Thanks for the clarification.

> MM> 2) the optional charset parameter is RECOMMENDED if and
> MM>    only if the value is guaranteed to be correct
> That is an improvement, but does not go far enough.
> Firstly, the charset parameter and the xml encoding declaration should
> never differ, because otherwise the document is only well formed in
> transit and not when processed on the server or when saved to the
> client.

To make progress, let's say that there is a concern about such differences 
and that, if the recipient saves the document in a file without rewriting 
the encoding declaration, the result is a broken XML document.
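
A minimal sketch of how a recipient could detect such a difference before
saving, assuming an ASCII-compatible encoding so that the declaration can
be read as bytes.  The function names are hypothetical, and charset names
are compared literally, so aliases such as "utf8" and "utf-8" are treated
as different:

```python
import re
from email.message import Message

# Matches an encoding declaration at the very start of an ASCII-compatible
# byte stream; UTF-16 bodies are out of scope for this sketch.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def charset_param(content_type):
    """Return the lowercased charset parameter, or None if it is absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def declared_encoding(body):
    """Return the lowercased name from the encoding declaration, or None."""
    match = _ENCODING_DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def labels_agree(content_type, body):
    """True unless both labels are present and name different encodings."""
    external = charset_param(content_type)
    internal = declared_encoding(body)
    if external is None or internal is None:
        return True
    return external == internal
```

If labels_agree() returns False, saving the body unchanged would leave a
file whose encoding declaration no longer matches its bytes.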

> MM> - Server implementers or Server Managers SHOULD NOT specify the
> MM>   default value of the charset parameter of text/xml,
> MM>   application/xml, text/xml-external-parsed-entity,
> MM>   application/xml-external-parsed-entity, */*+xml, or
> MM>   application/xml-dtd, unless they can guarantee that that
> MM>   default value is correct for all MIME entities of these media
> MM>   types.
> Which, it is possible to show, they cannot do in the general case.

Although WWW servers send many XML documents, protocol implementations
(e.g., SOAP) also send XML documents.  It is easy for them to specify the
charset parameter correctly.  I think that future implementations of
trackback should specify the charset parameter correctly.  (At present,
people unfortunately use "application/x-www-form-urlencoded" without
providing any charset information, which causes a lot of data corruption
in Japan.)
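
As a rough illustration of why this is easy for a protocol implementation:
the sender chooses the encoding when it serializes the document, so the
same value can go into both the encoding declaration and the charset
parameter.  This is only a sketch under that assumption; the function
name and the "ping" element are hypothetical, and it needs Python 3.8 or
later for the xml_declaration argument:

```python
import xml.etree.ElementTree as ET


def serialize_with_label(root, encoding="utf-8", media_type="application/xml"):
    """Serialize an element and return (Content-Type value, body bytes).

    The one encoding name chosen here is written into the XML encoding
    declaration and into the charset parameter, so the two cannot differ.
    """
    body = ET.tostring(root, encoding=encoding, xml_declaration=True)
    return f"{media_type}; charset={encoding}", body


# Hypothetical payload containing non-ASCII text.
root = ET.Element("ping")
root.text = "日本語"
content_type, body = serialize_with_label(root)
```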

> MM> 3) Fragment identifier
> Perhaps the framework and scheme should be pointed to?

I can imagine that this starts a heated discussion and a significant delay.  
I know that the XML Core WG would like to register XPointer as fragment 
identifiers, but has W3C agreed on this?  (I'm just asking.)

> MM> 4) Possible reasons for not providing the charset parameter for
> MM> specialized media types
> MM> I think that "This media type is utf-8 only and thus does not need any
> MM> mechanism to identify the charset" is a perfectly good reason, since
> MM> "UTF-8 only" is a generic principle.  This should be mentioned in the
> MM> I-D.
> It's one specific reason. It's not enough though. Why should an XML file
> that can be either UTF-8 or UTF-16 need a charset parameter? It offers
> no useful or additional information. All XML processors handle both
> charsets as a conformance requirement.

In general, I do not want to undermine the only generic mechanism (the 
charset parameter) without establishing an alternative.  Something limited 
to XML is not generic to me.  Furthermore, "UTF-16LE" or "UTF-16BE" are preferred
by RFC 2781 but they are not mandatory in XML.  However, to make progress, 
I am willing to mention "UTF-8 or UTF-16 only" as a possible reason 
together with my concern above.
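
For reference, a sketch of what "UTF-8 or UTF-16 only" lets a recipient
do without a charset parameter.  It loosely follows the non-normative
autodetection appendix of XML 1.0 and assumes the body can only be UTF-8
or UTF-16 (UTF-32 and other encodings are deliberately ignored).  The
function name is hypothetical and the return values are Python codec
names, not registered charset names:

```python
def sniff_xml_encoding(body):
    """Guess a Python codec name for a UTF-8 or UTF-16 XML byte stream."""
    if body.startswith(b"\xef\xbb\xbf"):
        # UTF-8 with a byte order mark; this codec strips the mark.
        return "utf-8-sig"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        # UTF-16 with a byte order mark; the codec reads the byte order.
        return "utf-16"
    if body.startswith(b"<\x00?\x00"):
        # "<?" in little-endian UTF-16 without a byte order mark.
        return "utf-16-le"
    if body.startswith(b"\x00<\x00?"):
        # "<?" in big-endian UTF-16 without a byte order mark.
        return "utf-16-be"
    # No mark and no UTF-16 pattern: fall back to UTF-8.
    return "utf-8"
```

This only works because the set of candidates is fixed in advance, which
is why it is not a generic replacement for the charset parameter.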

> The rest of this sounds good; but I think we need fuller discussion on
> the use of a charset parameter that disagrees with what the XML
> encoding declaration says. This is clearly harmful, yet seems to be
> encouraged.

As I suggested above, let's say that there is a concern about it 
and see how people feel about it.


MURATA Makoto <murata@hokkaido.email.ne.jp>
Received on Wednesday, 29 October 2003 18:36:36 UTC
