Re: WG Last Call: Section on Internationalization, XML

Another message caught by the spam filter.  I have now learned that it is 
possible to have a separate list of email addresses for people who can post 
to the list, but whose email address isn't part of the list (say, for 
people who receive list email through a spam filter address, but send mail 
from a different address).  I am adding valid email addresses caught by the 
spam filter to this "accept2" list, so hopefully this "caught by the spam 
filter" phenomenon will decrease to zero over time.

- Jim

Now for the message from Martin Duerst:

-----Original Message-----
From:	Martin J. Duerst [SMTP:duerst@w3.org]
Sent:	Wednesday, January 21, 1998 9:23 PM
To:	ejw@ics.uci.edu; 'WEBDAV WG'
Subject:	[Spam?] Re: WG Last Call: Section on Internationalization, XML

At 14:56 98/01/19 -0800, Jim Whitehead wrote:
>
> *** WORKING GROUP LAST CALL FOR COMMENTS ***

Hello Jim,

I had a look at the document, in particular the i18n section.
It is great to see such a section (it's the first i-d where
I see such a section), and it reads very nicely.

The only factual error I have found is the (implicit)
claim that UTF-8 does not support all of ISO 10646.
This is not true. UTF-8 can encode everything that UCS-4
can; otherwise the IETF/IESG would not support it as they do.

I would therefore propose to remove any reference to
UCS-4.
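
To make the point about UTF-8 concrete, here is a minimal sketch in Python
(my choice of language, not something the draft uses). The sample code
points are picked only for illustration. Note that Python's codec covers
the range up to U+10FFFF, which is the range Unicode and ISO 10646 use
today.

```python
# Illustrative code points: ASCII, Latin-1, a BMP CJK character,
# one character beyond the BMP, and the highest code point Python accepts.
SAMPLES = [0x41, 0xE9, 0x65E5, 0x10348, 0x10FFFF]


def utf8_roundtrip(code_point):
    """Encode one code point as UTF-8 and decode it again.

    Returns the encoded bytes and the code point recovered from them.
    """
    encoded = chr(code_point).encode("utf-8")
    return encoded, ord(encoded.decode("utf-8"))


if __name__ == "__main__":
    for cp in SAMPLES:
        utf8_bytes, recovered = utf8_roundtrip(cp)
        ucs4_bytes = chr(cp).encode("utf-32-be")  # fixed four bytes per character
        print(
            f"U+{cp:04X}: {len(utf8_bytes)} byte(s) as UTF-8, "
            f"{len(ucs4_bytes)} as UCS-4, round trip ok: {recovered == cp}"
        )
```

Every sample survives the round trip, including the ones outside the
Basic Multilingual Plane; the only difference from UCS-4 is the number
of bytes used.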

Also, I would propose to clearly state, here or in a
usage document or wherever, that in order to support
interoperability, it is strongly suggested that only
UTF-8 (and UTF-16) be used, and that the former be
preferred for ease of debugging, compactness, and so on.
You could even go further and say that in Webdav,
only UTF-8 is accepted. It will help a lot, because
otherwise, you will very quickly get two webdav
implementations that don't work together.
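
As a rough sketch of what "only UTF-8 is accepted" could mean for a
receiver, consider the following Python fragment. The function name
accept_utf8_only and the exact checks are my own assumptions for
illustration; nothing like this appears in the draft.

```python
import re

# Matches an XML declaration that carries an encoding pseudo-attribute.
_ENCODING_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def accept_utf8_only(body: bytes) -> str:
    """Return the decoded body, or raise ValueError if it is not UTF-8.

    Hypothetical helper: a receiver that follows a UTF-8-only policy.
    """
    # Tolerate (and strip) a UTF-8 byte order mark.
    if body.startswith(b"\xef\xbb\xbf"):
        body = body[3:]

    # Reject a declaration that names any other encoding.
    match = _ENCODING_DECL.match(body)
    if match and match.group(1).lower() != b"utf-8":
        raise ValueError("unsupported encoding: " + match.group(1).decode("ascii"))

    # Strict decoding raises UnicodeDecodeError (a ValueError subclass)
    # for bytes that are not well-formed UTF-8, e.g. a UTF-16 body.
    return body.decode("utf-8")
```

A sender that picks any other encoding is rejected up front, rather than
producing a subtle interoperability failure later.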

XML has to serve many masters, and in particular for
the document community, it seemed too audacious to
simply restrict it to one encoding only. But for a
specific application such as WebDAV, the best solution
is to impose this restriction itself.


There are a few other points where features of XML
seem to have gone unused. In particular, I didn't
find, in the examples or in general:

- Entities
- Inline DTD additions/extensions
- Attributes
- CDATA sections

The lack of attributes is somewhat unusual, but
easy to understand (you put everything into element
content). The lack of the three other things is
not very surprising. There is e.g. some folklore
that says that XML implementers typically implement
everything but entities.

With the definitions and the examples in the spec,
the following scenario is very probable:
- Receiver-side implementors provide the necessary
   features in their parser and test the examples.
- Because a receiver is usually also a sender, they
   won't send data with the above features.
- Everything goes fine for some time.
- Some new implementor comes along, has read all of
   the XML spec, and decides to be "clever".
- Stuff doesn't work, and everybody gets blamed for
   not conforming to XML.

So I think the spec should:

- Either be pepped up with some more complicated
   examples with the above features.
- Or say clearly that it only uses a subset of XML,
   and say which features are excluded.
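
For the first option, a more complicated example might look like the
sketch below, which packs all four features into one small document and
runs it through Python's standard parser. The element names, the
attribute, and the entity are invented for illustration and are not
taken from the WebDAV draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical document exercising an inline DTD addition, an entity,
# an attribute, and a CDATA section.
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE prop [
  <!ENTITY wg "WebDAV Working Group">
]>
<prop>
  <owner type="group">&wg;</owner>
  <note><![CDATA[a < b & c]]></note>
</prop>"""


def parse_example(text=DOC):
    """Parse the document and return the values a receiver should see."""
    root = ET.fromstring(text.encode("utf-8"))
    owner = root.find("owner")
    note = root.find("note")
    return {
        "owner": owner.text,          # entity reference expanded by the parser
        "type": owner.get("type"),    # attribute value
        "note": note.text,            # CDATA content, markup characters intact
    }


if __name__ == "__main__":
    print(parse_example())
```

A receiver whose parser was only ever tested against element-content
examples may get any of these three values wrong, which is exactly the
scenario described above.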


Please also note that while XML 1.0 is a W3C proposed
recommendation, and may become a recommendation in
the near future, namespaces are not part of XML 1.0.


Hope this helps.  With kind regards,   Martin.

Received on Thursday, 22 January 1998 12:09:04 UTC